Software Transactional Memory
Jaakko Järvi
University of Bergen
<2018-10-02 Tue>
Outline
1 Examples
Examples: Early Rollback is Important
Assume invariant x == 0 && y == 0, and two concurrently executing threads:
atomic { if (x != y) crash(); }
atomic { ++x; ++y; }
Both transactions maintain invariant, yet without early fail can “crash”
Examples: Deposit with lock synchronization (Java)
public void deposit(double amount) {
    System.out.println("Depositing " + amount);
    double nb = 0;
    balanceChangeLock.lock();
    try {
        nb = balance + amount;
        balance = nb;
    } finally {
        balanceChangeLock.unlock();
    }
    System.out.println("New balance is " + nb);
}
Examples: Deposit with STM (hypothetical Java)
public void deposit(double amount) {
    System.out.println("Depositing " + amount);
    double nb = 0;
    atomic {
        nb = balance + amount;
        balance = nb;
    }
    System.out.println("New balance is " + nb);
}
Examples: Composing critical sections (lock synchronization)
class Bank {
    Accounts accounts;
    ...
    void transfer(String name1, String name2, int amount) {
        synchronized(accounts) {
            try {
                accounts.put(name1, accounts.get(name1) - amount);
                accounts.put(name2, accounts.get(name2) + amount);
            } catch (Exception1) {..}
            catch (Exception2) {..}
        }
    }
    ...
}
- Lock all accounts
- Manually decide what needs to be undone after each kind of exception
- Side effects might be visible in other threads before being undone
Examples: Composing critical sections (STM)
class Bank {
    Accounts accounts;
    ...
    void transfer(String name1, String name2, int amount) {
        try {
            atomic {
                accounts.put(name1, accounts.get(name1) - amount);
                accounts.put(name2, accounts.get(name2) + amount);
            }
        } catch (Exception1) {..}
        catch (Exception2) {..}
    }
    ...
}
Examples: Motivating example: HashMap (thread-unsafe)
public Object get(Object key) {
    int idx = hash(key);           // Compute hash
    HashEntry e = buckets[idx];    // to find bucket
    while (e != null) {            // Find element in bucket
        if (key.equals(e.key))
            return e.value;
        e = e.next;
    }
    return null;
}
Examples: HashMap (thread-safe via lock synchronization)
public Object get(Object key) {
    synchronized(mutex) {  // mutex guards all accesses to map m
        return m.get(key);
    }
}
- Simple solution: add a synchronization layer
- Poor scalability: the entire map is locked at once
Examples: HashMap (thread-safe via STM)
public Object get(Object key) {
    atomic {  // System guarantees atomicity
        return m.get(key);
    }
}
- Equally simple
- Good scalability: only the impacted parts of the HashMap are locked (briefly)
Examples: HashMap (scalable thread-safe with fine-grained locking)
public Object get(Object key) {
    int hash = hash(key);
    // Try first without locking...
    Entry[] tab = table;
    int index = hash & (tab.length - 1);
    Entry first = tab[index];
    Entry e;
    for (e = first; e != null; e = e.next) {
        if (e.hash == hash && eq(key, e.key)) {
            Object value = e.value;
            if (value != null) return value;
            else break;
        }
    }
    // Recheck under synch if key not there or interference
    Segment seg = segments[hash & SEGMENT_MASK];
    synchronized(seg) {
        tab = table;
        index = hash & (tab.length - 1);
        Entry newFirst = tab[index];
        if (e != null || first != newFirst) {
            for (e = newFirst; e != null; e = e.next) {
                if (e.hash == hash && eq(key, e.key))
                    return e.value;
            }
        }
        return null;
    }
}
Concurrent Programming in Standard C++
Jaakko Järvi
University of Bergen
<2018-10-02 Tue>
Outline
1 C++ Standardization and concurrency
2 C++ memory model
3 Standard library offerings for concurrency
4 Threads
5 Synchronizing threads with mutexes
6 Condition variables
7 Thread local variables, call once functions
8 Atomics
9 Tasks
C++ Standardization and concurrency: C++11, C++14, C++17, C++2a, ... and concurrency
- C++03 had no support for concurrency: all concurrent programs relied on OS-specific services that could change from version to version
- C++11 made two significant additions:
  - a memory model for C++
  - the beginnings of the standard API for concurrency-related functionality: threads, locks, futures, etc.
- C++14 adds some tweaks to the API
- C++17 adds parallel STL algorithms
- C++2a will add more features: composable futures? coroutines? transactional memory? latches, barriers? atomic smart pointers?
C++ memory model: Memory model
Memory model defines the semantics of shared variables
For the programmer:
- a set of guarantees about the order in which memory reads and writes are observed by a thread
- conversely, a set of obligations that a programmer has to adhere to when writing concurrent code

For the compiler and hardware:
- a set of rules that defines valid code transformations
C++ memory model: Multi-processor system
[Diagram: threads 1..n performing reads (r) and writes (w) on a shared memory]
- Multiple threads can update and access the same shared variables: global variables, "static" class members, all data accessible through those variables or passed to the thread by other means
- Each thread also has its own local variables
C++ memory model: Consistency in multi-processor systems
Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs", 1979: definitions for what consistency means in a multi-processor system, and what kind of leeway can be given to compilers and hardware.
Sequential processor
The result of the execution is the same as if the operations had been executed in the order specified by the program.

Sequential consistency
The result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.
C++ memory model: Sequential consistency
Any execution in which operations from different processors overlap must be indistinguishable from some execution in which all operations occur in a single sequential order.
C++ memory model: Consistency in multi-processor systems
That each processor is sequential does not guarantee sequential consistency
Additional requirement 1
Each processor issues memory requests in the order specified by its program.

Additional requirement 2
Memory requests from all processors issued to an individual memory module are serviced from a single FIFO queue. Issuing a memory request consists of entering the request on this queue.
C++ memory model: Programming assuming sequential consistency
Assume shared variables x and y both have value 0:

// Thread 1        // Thread 2
x = 1;             y = 1;
r1 = y;            r2 = x;

Some interleavings:

x = 1;  // 1       y = 1;  // 2       x = 1;  // 1
r1 = y; // 1       r2 = x; // 2       y = 1;  // 2
y = 1;  // 2       x = 1;  // 1       r1 = y; // 1
r2 = x; // 2       r1 = y; // 1       r2 = x; // 2
// r1 == 0, r2 == 1   // r1 == 1, r2 == 0   // r1 == 1, r2 == 1
- Execute as if at each step one thread is selected and its next statement is executed; repeat until all threads are done
- Under sequential consistency, it is not possible that both r1 and r2 are 0 at the end of the execution
C++ memory model: Another example
pos = new Point(pos.x + 1, pos.y + 1);
- allocate memory for the new object
- initialize x
- initialize y
- assign to x
- assign to y
- assign to pos
C++ memory model: Relaxed memory
Sequential consistency would be a nice programming model, but...
- None of today's architectures is sequentially consistent
- This is because of performance optimizations in hardware: reordering of instructions, speculative execution, buffering writes
- It is also because of common compiler optimizations: common subexpression elimination, eliminating redundant reads, loop optimizations, ...
- These hardware or compiler optimizations, not observable in single-threaded code, can become observable in multi-threaded code
C++ memory model: An example not assuming sequential consistency
Assume shared variables x and y, both 0
// Thread 1        // Thread 2
x = 1;             y = 1;
r1 = y;            r2 = x;
- Processors typically do not wait for the first assignment to complete before executing the second: the value 1 is stored in a buffer, waiting to be written to memory, and is not visible in the other thread immediately
  ⇒ the result r1 == 0 and r2 == 0 is possible
- Compilers can rearrange code: the two assignments in each thread are independent, so the compiler is free to move r1 = y and r2 = x up
  ⇒ the result r1 == 0 and r2 == 0 is possible
C++ memory model: Another example of compiler effects
Assume sequential consistency, and ready == 0. Seemingly this program would then print nothing, or print 3.

// Thread 1             // Thread 2
data = 1;               if (ready == 1)
x = data + 2;               cout << data + 2;
ready = 1;
Common subexpression elimination will likely rewrite cout << data + 2 to cout << x, and thus the program could print 2
// Thread 1             // Thread 2
data = 1;               if (ready == 1)
x = data + 2;               cout << x;
ready = 1;
And of course, the SC assumption is unrealistic, so the assignments in thread 1 could be reordered
C++ memory model: Another example, revisited

pos = new Point(pos.x + 1, pos.y + 1);

- allocate memory for the new object
- initialize x
- initialize y
- assign to x
- assign to y
- assign to pos

Without sequential consistency these steps can be reordered: in particular, the assignment to pos may become visible before the new object's fields are initialized, so another thread could observe a partially constructed Point.
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 16 / 92 Data race A program allows a data race if there is a sequentially consistent execution, in which two conflicting operations can be executed simultaneously
C++ memory model: Data races
The above problems arise because
1. there are two threads that might access the same data simultaneously in a conflicting way (there is a data race), and
2. either the compiler or processor rewrites/rearranges code (or rearranges when memory reads and writes are visible)
Data race
A program allows a data race if there is a sequentially consistent execution in which two conflicting operations can be executed simultaneously.
C++ memory model: Definitions
Conflicting memory operations
Two memory operations conflict if they access the same memory location, and at least one of them is a write.

Simultaneous operations
Two operations are simultaneous if they are from different threads and they are adjacent in the interleaving.
If two memory operations from different threads occur adjacently in the interleaving, they could have occurred in the opposite order too; or simultaneously if true concurrency exists.
C++ memory model: C++ memory model guarantee
If a program has no data races, it is sequentially consistent
- The memory model guarantees that updates to different memory locations do not interfere with each other, and do not need to be synchronized
- Two different bitfields in the same contiguous sequence of bitfields are considered the same memory location
- In C++98, the following could leave x and y with either 0 or 1, because c and b could be allocated into the same word; this is disallowed in C++11

// thread 1:        // thread 2:
char c;             char b;
c = 1;              b = 1;
int x = c;          int y = b;
C++ memory model: C++ memory model guarantee, the flip side
If a program has a data race, its behavior is undefined
- The C++ memory model is called "sequentially consistent for data-race-free programs" (SC-DRF)
- The basic approach is similar to Java's memory model, except that for Java the semantics are more complex (even when races are present, memory safety must be preserved)
C++ memory model: Relaxed memory model is not a huge (extra) burden to the programmer
- In addition to sequential consistency, the above examples rely on atomicity of some operations: x = 1 etc. was assumed to happen atomically
- Depending on the type of x, this may not be true on all platforms; a thread could observe states where only "half" of x has been written
- Certainly one cannot expect atomicity from updates to arbitrary variables of user-defined types
⇒ Sequential consistency alone is not enough; the programmer should anyway use synchronization to ensure that there are no data races.
C++ memory model: Simple example
Consider this code: c++, where c is a shared variable.
- Most compilers would turn it into tmp = c; ++tmp; c = tmp;
- Or into a single instruction, which the processor might not execute atomically

With threads 1 and 2 both executing c++, we might get:

// Thread 1                   // Thread 2
tmp1 = c;  // reads n
                              tmp2 = c;  // reads n
++tmp1;
                              ++tmp2;
c = tmp1;  // writes n + 1
                              c = tmp2;  // writes n + 1

Hence, even with sequential consistency, one typically must synchronize:

l.lock(); c++; l.unlock();
Programs with data races almost never produce consistently correct results across various hardware and compiler platforms.
C++ memory model: Intermediate summary
- The C++ memory model was designed so that programmers should not have to think about the C++ memory model
- This works as long as all accesses to shared variables are ordered with locks, atomics, etc.
- happens-before and synchronizes-with relations (to be discussed)
C++ memory model: Reasoning is otherwise too difficult (and unreliable)
Assume x == 0, y == 0
// Thread 1:              // Thread 2:
r1 = x;                   r2 = y;
if (r1 == 42) y = r1;     if (r2 == 42) x = 42;
                          else x = 42;
r1 == r2 == 42 possible.
The compiler may collapse thread 2's branch into an unconditional store:

// Thread 1:              // Thread 2:
r1 = x;                   r2 = y;
if (r1 == 42) y = r1;     x = 42;
Standard library offerings for concurrency: C++11 concurrency toolbox
- Threads
- Mutexes, locks
- Condition variables
- Atomic variables
- Futures and promises
- async() function
- Abandoning processes
Threads
- Instances of the thread class represent threads of execution
- The code to execute is given as a function object and its arguments
- The only communication between threads is via shared variables: the return value of the function is ignored
- Unhandled exceptions from the function lead to program termination!
- Threads cannot be terminated from outside (more precisely, not portably without terminating the whole program)
- A thread is typically bound to an OS thread
Threads: Constructing a thread object
Constructor takes a function object (such as a lambda) and arguments:

void print(string s1, string s2) { cout << s1 << s2; }

std::thread t1([]() { cout << "Hello"; });
std::thread t2(print, ", ", "World!");
- A default-constructed thread is not bound to a thread of execution
- Threads are not copyable, but they are movable:

std::thread t1([](){});
std::thread t2(std::move(t1)); // now t1 is not joinable
Threads: Joinability of a thread
- A thread may be joinable or not joinable
- A joinable thread is potentially executing (running, waiting, currently not scheduled, finished): it was constructed with a function object, or acquired a thread of execution from another thread object via a move or a swap
- A not-joinable thread is not executing: it was not yet given a function to execute, was already joined, was detached, or was moved from
- Only not-joinable threads can be safely destroyed; std::terminate() is called otherwise
Threads: Example, launching and joining threads
#include <iostream>
#include <thread>
using namespace std;

void print(string s1, string s2) { cout << s1 << s2; }

int main() {
    std::thread t1([]() { cout << "Hello"; });
    std::thread t2(print, ", ", "World!");

    t1.join();
    t2.join();
    return 0;
}

Output (one possible interleaving): , HelloWorld!
Threads: Same example in alang
#include "alang.hpp"

void print(string s1, string s2) { cout << s1 << s2; }

int main() {
    processes ps;
    ps += []() { cout << "Hello"; };
    ps += []() { print(", ", "World!"); };
    return 0;
}
Threads: A helper function
void pause_thread_s(int n) {
    std::this_thread::sleep_for(std::chrono::seconds(n));
}
void pause_thread_ms(int n) {
    std::this_thread::sleep_for(std::chrono::milliseconds(n));
}
The this_thread namespace has:
- get_id — get the thread's id
- yield — hint to the scheduler to reschedule
- sleep_for, sleep_until
Threads: Example, detaching threads
detach() lets a thread "loose"; the thread object becomes not joinable.

void print(string s1, string s2) { cout << s1 << s2; }

int main() {
    std::thread([]() { pause_thread_s(2); std::cout << "Hello"; }).detach();
    std::thread(print, ", ", "World!").detach();
    pause_thread_s(1);
}

Output: , World!
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 33 / 92 acc=336: 1 times acc=369: 1 times acc=375: 1 times acc=376: 1 times acc=381: 3 times acc=385: 993 times
Threads A larger thread example
int acc; void acc_square(int x) { acc +=x* x; }
int main() { map
for(intj=0; j<1000; ++j) { acc=0; vector
for(inti=1; i <= 10; i++) ts.push_back(thread(acc_square, i)); for(auto&t : ts) t.join();
if (m.count(acc) ==0) m[acc]=1; else m[acc]++; }
for(auto kv : m) cout << "acc=" << kv.first <<":" << kv.second << " times\n"; }
Threads: Spawning and joining threads and memory model
- All memory operations in the thread that spawns a thread are visible in the spawned thread
- All memory operations in a thread are visible, after joining, in the thread that joins it
Threads: thread constructor's parameters
The thread constructor is defined as

template <class Fn, class... Args>
explicit thread(Fn&& fn, Args&&... args);

- The arguments are, however, copied when passed to fn
- Any exceptions thrown while copying are raised in the "parent" thread that is spawning the new thread
- To get move or reference semantics, the std::move, std::ref, or std::cref wrappers can be used
Threads: Sharing via thread constructor parameters
Now sharing is via a reference parameter passed to the thread constructor:

void psum(int& acc, const vector<int>& v) { ... }

int main() {
    vector<int> ...
    vector<thread> ...
    ...
}

Output: 100000
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 37 / 92 Exception object’s lifetime is an issue Assume thread A constructs an exception object and somehow passes thread B a reference to the exception object By the time thread B catches and handles the exception, thread A may already be finished Lifetime can of course be managed if the type of the exception is known, but a generic mechanism is very useful
Threads: About threads and exceptions
- We know that unhandled exceptions on a thread terminate the program
- What if there is no sensible way to handle an exception, but terminating is not acceptable either?
- How to transfer an exception to another thread?
The exception object's lifetime is an issue:
- Assume thread A constructs an exception object and somehow passes thread B a reference to the exception object
- By the time thread B catches and handles the exception, thread A may already be finished
- The lifetime can of course be managed if the type of the exception is known, but a generic mechanism is very useful
Threads: About threads and exceptions: Example
vector<exception_ptr> exceptions;

void worker() {
    try {
        vector<int> v;
        ...                  // something here throws
    } catch (...) {
        exceptions.push_back(std::current_exception());
    }
}

int main() {
    thread t(worker);
    t.join();
    for (auto& eptr : exceptions) {
        try {
            if (eptr != nullptr) std::rethrow_exception(eptr);
        } catch (const std::exception& e) {
            std::cout << "exception: " << e.what() << std::endl;
        }
    }
}

- Propagating exceptions from threads must be done manually
- current_exception obtains a shared-pointer-like exception_ptr to the caught exception; the pointer can be passed around by copy and the exception stays alive; rethrow_exception(eptr) rethrows the exception pointed to by eptr
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 39 / 92 Synchronizing threads with mutexes Outline
1 C++ Standardization and concurrency
2 C++ memory model
3 Standard library offerings for concurrency
4 Threads
5 Synchronizing threads with mutexes
6 Condition variables
7 Thread local variables, call once functions
8 Atomics
9 Tasks
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 40 / 92 Synchronizing threads with mutexes Protecting shared data
C++ offers a handful of ways to protect access to shared data: mutexes, locks (wrappers over mutexes), and atomics
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 41 / 92
Synchronizing threads with mutexes Mutex
MUTual EXclusion object A thread acquires ownership of a mutex by locking it and releases ownership by unlocking it Ownership is exclusive; no two threads can simultaneously own the same mutex
Member functions: lock() — block until the mutex is available, then acquire ownership (lock) and continue try_lock() — lock if available, return false if not unlock() — release ownership; undefined behavior if the mutex is not locked by the current thread
A thread must not own the mutex prior to calling lock or try_lock unless the mutex is a recursive_mutex Destroying a locked mutex leads to undefined behavior
Locking and data races: all lock and unlock operations on the same mutex are totally ordered a prior unlock on the same mutex synchronizes-with a subsequent lock locking introduces memory fences
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 42 / 92
Synchronizing threads with mutexes Mutex example
int acc = 0; mutex acc_mutex;
void acc_square(int x) {
  int tmp = x * x;
  acc_mutex.lock();
  acc += tmp;
  acc_mutex.unlock();
}
int main() {
  map<int, int> m;
  for(int j = 0; j < 1000; ++j) {
    acc = 0;
    vector<thread> ts;
    for(int i = 1; i <= 10; i++) ts.push_back(thread(acc_square, i));
    for(auto& t : ts) t.join();
    if (m.count(acc) == 0) m[acc] = 1; else m[acc]++;
  }
  for(auto kv : m) cout << "acc=" << kv.first << ": " << kv.second << " times\n";
}
Result: acc = 385 occurred 1000 times
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 43 / 92 Synchronizing threads with mutexes Kinds of mutexes
recursive_mutex can be locked repeatedly by the same thread; it is released when unlocked equally many times timed_mutex wait for a locked mutex only for a certain period of time, then give up shared_mutex (C++17), shared_timed_mutex (C++14) two kinds of access, shared and exclusive; useful when simultaneous read access can be granted to many threads, but writing must be exclusive
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 44 / 92
Synchronizing threads with mutexes try_lock, recursive_mutex example
class counter {
  recursive_mutex mut;
  int c = 0;
public:
  void tick() { mut.lock(); ++c; mut.unlock(); }
  void tickManyIfCan(int n) {
    if (mut.try_lock()) {
      while (n-- > 0) tick();
      mut.unlock();
    }
  }
  int value() { return c; }
};
void task(counter& ctr) {
  for(int i = 0; i < 100; i++) { pause_thread_ms(1); ctr.tickManyIfCan(10); }
}
int main() {
  counter ctr;
  thread t1(task, ref(ctr)), t2(task, ref(ctr)), t3(task, ref(ctr));
  t1.join(); t2.join(); t3.join();
  cout << "\nvalue = " << ctr.value();
}
Results of a few runs: value = 2350, value = 2090, value = 1600, value = 1990, value = 2240
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 45 / 92
Synchronizing threads with mutexes timed_mutex example
timed_mutex mut;
void attempt(int& successes) {
  if (mut.try_lock_for(chrono::milliseconds(50))) {
    // now we have the lock
    ++successes;
    pause_thread_ms(2);
    mut.unlock();
  }
}
void run() {
  thread ts[100]; int successes = 0;
  for(int i = 0; i < 100; ++i) ts[i] = thread(attempt, ref(successes));
  for(auto& t : ts) t.join();
  cout << "#successes = " << successes << endl;
}
int main() { run(); run(); run(); }
Output: #successes = 21, #successes = 21, #successes = 20
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 46 / 92
Synchronizing threads with mutexes shared_mutex example
std::shared_mutex s_mut; timer tm;
void reading(const string& data, int secs) {
  s_mut.lock_shared();
  pause_thread_s(secs);
  alang::logl("Reader ", data, "", tm.elapsed());
  s_mut.unlock_shared();
}
void writing(string& data, string d, int secs) {
  s_mut.lock();
  pause_thread_s(secs);
  data = d;
  s_mut.unlock();
  alang::logl("Writer ", d, "", tm.elapsed());
}
int main() {
  string data = "A";
  vector<thread> ts;
  ts.emplace_back(reading, cref(data), 3);
  ts.emplace_back(reading, cref(data), 4);
  ts.emplace_back(writing, ref(data), "B", 1);
  ts.emplace_back(writing, ref(data), "C", 2);
  ts.emplace_back(reading, cref(data), 0);
  for(auto& t : ts) t.join();
}
Result: Reader A 0.075988, Reader A 3005.2, Reader A 4005.2, Writer B 5009.25, Writer C 7014.45
Alternative result: Reader A 0.094351, Reader A 3005.2, Writer C 6008.22, Writer B 7008.53
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 47 / 92 Synchronizing threads with mutexes Locking and RAII
With lock() – unlock() pairs, some care is needed so that unlock happens even in the case of exceptions
int c; mutex cm; ... cm.lock(); foo(c); // exception? cm.unlock();
The standard library offers convenient “mutex wrappers” that unlock in the destructor
int c; mutex cm; ... { lock_guard<mutex> g(cm); foo(c); } // unlocked in g’s destructor, even if foo throws
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 48 / 92 Synchronizing threads with mutexes std::lock and lock_guard
It is possible to “adopt” an already locked mutex
container a, b; // assume both have a mutex member variable
{
  std::lock(a.mutex, b.mutex);
  lock_guard<mutex> g1(a.mutex, adopt_lock);
  lock_guard<mutex> g2(b.mutex, adopt_lock);
  a.put(b.get()); // maybe exceptions
} // locks released here
std::lock() takes any number of mutexes and locks them without deadlocking
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 49 / 92 Synchronizing threads with mutexes unique_lock, a more versatile mutex wrapper
unique_lock can be given more policies: adopt_lock_t, defer_lock_t, try_to_lock_t, or a duration or time point for how long or until what time to try to acquire the lock It is movable, transferring ownership of the mutex Unlocking and (re)locking is possible:
std::unique_lock g(mutex); ... g.unlock(); ... g.lock();
Condition variables wait on a unique_lock
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 50 / 92 Synchronizing threads with mutexes unique_lock example
container a, b; // assume both have a mutex member variable
{
  unique_lock<mutex> g1(a.mutex, defer_lock);
  unique_lock<mutex> g2(b.mutex, defer_lock);
  std::lock(g1, g2);
  a.put(b.get()); // maybe exceptions
} // locks released here
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 51 / 92 Synchronizing threads with mutexes C++17: scoped_lock wrapper
Easiest mutex wrapper Can wrap many mutexes; the constructor locks all of them without deadlock
std::mutex m1, m2, m3; ... { std::scoped_lock lock(m1, m2, m3); // critical section } // mutexes released here (in reverse order)
The mutexes can even be of different types Note: no need to specify the mutex types, because of C++17’s class template argument deduction; the same is true for lock_guard and unique_lock
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 52 / 92 Synchronizing threads with mutexes Locking/unlocking and the memory model
lock and unlock operations on the same mutex are (totally) ordered Unlocking a mutex synchronizes-with the next locking of the same mutex All memory operations performed before releasing a mutex (unlock) are visible to a thread after it acquires the same mutex (lock)
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 53 / 92 Condition variables Outline
1 C++ Standardization and concurrency
2 C++ memory model
3 Standard library offerings for concurrency
4 Threads
5 Synchronizing threads with mutexes
6 Condition variables
7 Thread local variables, call once functions
8 Atomics
9 Tasks
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 54 / 92 Condition variables Condition variables
Condition variables allow blocking a thread until notified Waiting methods (of the condition_variable class) put the current thread to sleep, to start waiting on a condition variable
void wait(unique_lock<mutex>& lock);
template<class Predicate> void wait(unique_lock<mutex>& lock, Predicate pred);
The notifying methods notify_one() and notify_all() wake up one or all waiting threads
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 55 / 92
Condition variables Condition variable example
#include <condition_variable>
mutex mut; condition_variable cv; int resource = 0;
void worker(int amount) {
  unique_lock<mutex> lck(mut);
  cv.wait(lck, [=]{ return resource >= amount; });
  resource -= amount;
  cout << "Handled " << amount << ", remains " << resource << "\n";
}
int main() {
  thread ts[10];
  for(int i = 0; i < 10; ++i) ts[i] = thread(worker, i);
  {
    unique_lock<mutex> lck(mut);
    resource = 100;
  }
  cv.notify_all();
  for(auto& t : ts) t.join();
}
Output:
Handled 0, remains 0
Handled 9, remains 91
Handled 4, remains 87
Handled 3, remains 84
Handled 5, remains 79
Handled 1, remains 78
Handled 6, remains 72
Handled 2, remains 70
Handled 7, remains 63
Handled 8, remains 55
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 56 / 92
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 56 / 92
Condition variables Condition variable example 2
#include <condition_variable>
mutex mut; condition_variable cv; int resource = 0;
void worker(int amount) {
  unique_lock<mutex> lck(mut);
  cv.wait(lck, [=]{ return resource >= amount; });
  resource -= amount;
  cout << "Handled " << amount << ", remains " << resource << "\n";
}
int main() {
  thread ts[10];
  for(int i = 0; i < 10; ++i) ts[i] = thread(worker, i);
  {
    unique_lock<mutex> lck(mut);
    resource = 100;
  }
  cv.notify_one();
  pause_thread_s(2);
  for(auto& t : ts) t.detach();
}
Output:
Handled 0, remains 0
Handled 8, remains 92
Handled 9, remains 83
Handled 1, remains 82
With notify_one, only some workers ever wake; main detaches the rest and exits
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 57 / 92
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 57 / 92 Thread local variables, call once functions Outline
1 C++ Standardization and concurrency
2 C++ memory model
3 Standard library offerings for concurrency
4 Threads
5 Synchronizing threads with mutexes
6 Condition variables
7 Thread local variables, call once functions
8 Atomics
9 Tasks
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 58 / 92 Thread local variables, call once functions Thread local variables
C++11 introduced a new storage class specifier: thread_local Each thread has a distinct instance of a thread local variable A thread local variable is allocated (and possibly initialized) when a thread begins and deallocated when it ends Any namespace scope, block scope, or static member variables can be declared thread_local
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 59 / 92
Thread local variables, call once functions Thread local variable example
mutex io_m;
string freshvar(string s) {
  thread_local unsigned int counter = 0;
  return s + to_string(counter++);
}
void work(string s) {
  for(int i = 0; i < 4; i++) {
    // do something that needs fresh variable names
    lock_guard<mutex> g(io_m);
    cout << freshvar(s) << "\n";
  }
}
int main() {
  thread t1(work, "A"); thread t2(work, "B"); thread t3(work, "C");
  t1.join(); t2.join(); t3.join();
}
Output: A0 B0 C0 A1 B1 C1 A2 B2 C2 A3 B3 C3
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 60 / 92 Thread local variables, call once functions Call once
Sometimes it is necessary to execute a piece of code in only one thread E.g., to initialize a resource For this, call_once() provides a solution
template<class Callable, class... Args> void call_once(std::once_flag& flag, Callable&& f, Args&&... args);
The Callable f need not be the same in all threads Each call_once invocation with the same once_flag defines a group f is invoked only once per group more precisely: until the flag is set, which takes place when f returns normally, not by throwing threads block on call_once so that only one thread at a time executes, until the flag is set (the returning invocation) each invocation synchronizes-with the next the returning invocation synchronizes-with all subsequent invocations
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 61 / 92
Thread local variables, call once functions call_once example
string* ptr; once_flag flag;
void work(string s) {
  call_once(flag, [&s]{ ptr = new string(s); cout << *ptr << " initialized\n"; });
  cout << *ptr;
}
int main() {
  std::thread t1(work, "A");
  std::thread t2(work, "B");
  std::thread t3(work, "C");
  t1.join(); t2.join(); t3.join();
}
Output: C initialized CCC
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 62 / 92 Atomics Outline
1 C++ Standardization and concurrency
2 C++ memory model
3 Standard library offerings for concurrency
4 Threads
5 Synchronizing threads with mutexes
6 Condition variables
7 Thread local variables, call once functions
8 Atomics
9 Tasks
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 63 / 92 Atomics std::atomic
atomic<T> provides atomic, data-race-free operations on a value of type T Specializations exist for integral, pointer, and boolean types is_lock_free() tells whether the operations are implemented with lock-free hardware primitives
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 64 / 92 Atomics Members of atomic objects
atomic assignment T operator=(T) void store(T, memory_order = std::memory_order_seq_cst) atomic read operator T() const T load(memory_order = std::memory_order_seq_cst) const atomic swap T exchange(T, memory_order = std::memory_order_seq_cst) compare-and-swap bool compare_exchange_weak(T& expected, T desired, ...) bool compare_exchange_strong(T& expected, T desired, ...) (the ... stands for memory order parameters) atomic bit manipulation and arithmetic, for those T where they make sense: fetch_and, fetch_or, fetch_xor, fetch_add, fetch_sub — in addition to and-ing, or-ing, etc. the value, they return the old value The operators ++, --, &=, |=, ^=, +=, and -= Most of the same operations exist as non-members as well
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 65 / 92
Atomics Example
#include <atomic>
atomic<int> acc(0);
void acc_square(int x) { acc += x * x; } // += is atomic
int main() {
  map<int, int> m;
  for(int j = 0; j < 1000; ++j) {
    acc = 0;
    vector<thread> ts;
    for(int i = 1; i <= 10; i++) ts.push_back(thread(acc_square, i));
    for(auto& t : ts) t.join();
    if (m.count(acc) == 0) m[acc] = 1; else m[acc]++; // atomic acc converted to regular int whenever needed
  }
  for(auto kv : m) cout << "acc=" << kv.first << ": " << kv.second << " times\n";
}
Output: acc=385: 1000 times
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 66 / 92 Atomics Compare-and-swap (CAS)
This usually hardware-supported primitive is a key operation for many lock-free algorithms In C++ it is spelled compare_exchange_strong and compare_exchange_weak:
bool atomic<T>::compare_exchange_strong(T& expected, T desired);
If the atomic’s value == expected (bitwise), then replace it with desired and return true; otherwise load the atomic’s value into expected and return false The canonical use is in a loop, retrying for as long as the value is not as expected
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 67 / 92 Atomics CAS example
#include <atomic>
// a simple linked list:
struct Node { int value; Node* next; };
std::atomic<Node*> list_head(nullptr);
void prepend(int val) {
  Node* oldHead = list_head;
  Node* newNode = new Node{val, oldHead};
  while(!list_head.compare_exchange_weak(oldHead, newNode))
    newNode->next = oldHead; // oldHead was refreshed by the failed CAS
}
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 68 / 92 Atomics Memory orders
Atomic loads and stores can be given a memory order flag:
enum memory_order { memory_order_relaxed, memory_order_consume, memory_order_acquire, memory_order_release, memory_order_acq_rel, memory_order_seq_cst };
Default is memory_order_seq_cst, which specifies a global total order of memory operations a memory location cannot have multiple simultaneous values all memory operations before an atomic store to m on thread A will be visible on thread B to all operations after a later atomic load of m (the synchronizes-with relation) It often suffices to have weaker reordering guarantees, or even none E.g., counters, such as shared pointers’ reference counters, can be implemented with memory_order_relaxed The weaker orders are not for casual concurrent programming
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 69 / 92
Atomics Relaxed memory order example
std::atomic<int> counter{0}; std::atomic<int> bad{0};
void worker() {
  for(int n = 0; n < 10000000; ++n)
    counter.fetch_add(1, std::memory_order_relaxed);
}
void observer() {
  int c = 0;
  while(true) {
    int cn = counter.load(std::memory_order_relaxed);
    if (c > cn) bad.fetch_add(1, std::memory_order_relaxed);
    c = cn;
  }
}
int main() {
  std::vector<std::thread> ts;
  for(int i = 0; i < 4; ++i) ts.emplace_back(worker);
  std::thread(observer).detach();
  for(auto& t : ts) t.join();
  std::cout << "counter " << counter << ", bad " << bad << "\n";
}
Output: counter 40000000, bad 0
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 70 / 92
Programming directly with threads is complex and low-level: computations are married to OS threads, and OS thread creation is typically rather expensive Task-based parallelism abstracts over OS threads Algorithms are decomposed into independent tasks, regardless of how much parallelism is available Allocating tasks to individual processors/cores is a separate problem Task-based parallelism separates what can be run in parallel from what is run in parallel A task-dependency graph specifies the data dependencies between tasks
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 71 / 92
Tasks Task-based parallelism implementation ideas
Thread pools Work stealing: tasks are initially queued on one thread’s queue, from which idle processors steal tasks that are on blocked threads should be moved back to work-stealing queues ⇒ tasks should freely migrate between threads
Ideally, futures/promises and async would construct task dependency graphs In current C++, these mechanisms are still somewhat married to threads (it is not specified whether thread pools or plain threads are used) The standards committee is actively working towards supporting task-based parallelism, e.g., composable futures along the lines of Boost.Future
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 72 / 92 Tasks Futures
A future is a “delayed value”: a promise that eventually there will be a value
future<int> f = async([]{ return 1; });
Binding an asynchronous computation, a promise of a value, to a future does not block When the promised value is needed, the future can be queried This may mean blocking, if the value is not yet available
assert(f.get() == 1);
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 73 / 92 Tasks Promises
A promise object can provide a future object, and can later give a value (or an exception) to that future A promise → future pair is a kind of one-off communication channel
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 74 / 92 1
Tasks Example (no concurrency)
long hailstone(long n) { if (n ==1) return1; if (n%2 ==0) return hailstone(n/2); return hailstone(3*n+1); }
int main() { std::promise
p.set_value(hailstone(9780657631));
std::cout << f.get(); }
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 75 / 92 Tasks Example (no concurrency)
long hailstone(long n) { if (n ==1) return1; if (n%2 ==0) return hailstone(n/2); return hailstone(3*n+1); }
int main() { std::promise
p.set_value(hailstone(9780657631)); 1 std::cout << f.get(); }
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 75 / 92 1
Tasks Example (concurrency)
long hailstone(long n) {
  if (n == 1) return 1;
  if (n % 2 == 0) return hailstone(n/2);
  return hailstone(3*n + 1);
}
int main() {
  std::promise<long> p;
  std::future<long> f = p.get_future();
  std::thread([&](){ p.set_value(hailstone(9780657631)); }).detach();
  std::cout << f.get();
}
Output: 1
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 76 / 92 Tasks Promise → future communication
A promise has a shared state (with a future): a ready flag an evaluated result or an exception, once ready set_value and set_exception make the promise ready: they set the ready flag and the value/exception atomically, then unblock threads waiting on the promise Destroying a promise releases its reference to the shared state: the state is deleted if no futures refer to it; if the state was never made ready and futures are waiting, a future_error with code broken_promise is stored, and manifests at the futures’ get Synchronization: set_value and set_exception synchronize-with functions waiting on the shared state, e.g., the futures’ get and wait functions
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 77 / 92
Tasks Reminder: sum of squares with mutexes
int acc = 0; mutex acc_mutex;
void acc_square(int x) {
  int tmp = x * x;
  acc_mutex.lock();
  acc += tmp;
  acc_mutex.unlock();
}
int main() {
  map<int, int> m;
  for(int j = 0; j < 1000; ++j) {
    acc = 0;
    vector<thread> ts;
    for(int i = 1; i <= 10; i++) ts.emplace_back(acc_square, i);
    for(auto& t : ts) t.join();
    if (m.count(acc) == 0) m[acc] = 1; else m[acc]++;
  }
  for(auto kv : m) cout << "acc=" << kv.first << ": " << kv.second << " times\n";
}
Output: acc=385: 1000 times
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 78 / 92
Tasks Sum of squares with futures
int square(int x) { return x * x; }
int main() {
  map<int, int> m;
  for(int j = 0; j < 1000; ++j) {
    int acc = 0;
    vector<future<int>> fs;
    for(int i = 1; i <= 10; i++) fs.push_back(async(square, i));
    for(auto& f : fs) acc += f.get();
    if (m.count(acc) == 0) m[acc] = 1; else m[acc]++;
  }
  for(auto kv : m) cout << "acc=" << kv.first << ": " << kv.second << " times\n";
}
Output: acc=385: 1000 times
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 79 / 92 Tasks The future
template<class T> class future;
Member functions: get() — wait for the value and retrieve it (or rethrow the stored exception); can be called only once wait() — block until the result is ready wait_for(duration), wait_until(time_point) — block, but only up to a time limit valid() — does the future refer to a shared state?
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 80 / 92 Tasks shared_future
the same as future<T>, except: copyable — several shared_futures can refer to the same shared state get() can be called more than once, and many threads can wait on the same shared_future obtained from a future with the share() member function
Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 81 / 92 Expecting 1 Expecting 2 Expecting 3 Expecting 4 Expecting 0 Expecting 5 Expecting 6 Found 7 Expecting 9 Expecting 8
Tasks Shared future example

  void detect(int x, shared_future<int> f) { ... }

  int main() {
    promise<int> p;
    shared_future<int> f = p.get_future().share();
    for (int j = 0; j < 10; ++j) thread(detect, j, f).detach();
    pause_thread_s(1);
    p.set_value(7);
    pause_thread_s(1);
  }

Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue>

Tasks Futures, promises and exceptions
A promise can pass a value, or an exception (concretely an exception_ptr), via the shared state to a future:

  try { p.set_value(foo()); } // foo might throw
  catch (...) { p.set_exception(current_exception()); }

The exception manifests at future::get():

  try { auto val = my_future.get(); }
  catch (std::exception& e) { cout << "Exception: " << e.what(); }

If get is never called, the exception does not manifest.

Tasks Futures, promises, packaged tasks, and async
Promise and future are the basic building blocks
A packaged task wraps any function object with a promise-future channel
  a future object is attached to the wrapped function object
  calling the wrapped function object resolves the promise with set_value
  an exception within the function object sets the exception with set_exception
async abstracts over a packaged task

Tasks Packaged task example

  #include <future>
  bool is_prime(int n) {
    if (n == 1) return false;
    if (n == 2) return true;
    throw std::logic_error("Too difficult");
  }
  int main() {
    packaged_task<bool(int)> task(&is_prime);
    auto f = task.get_future();
    thread(move(task), 3).detach();
    try { cout << f.get(); } catch (exception& e) { cout << e.what(); }
  }

Output: Too difficult

Tasks Same with async

  #include <future>
  bool is_prime(int n) {
    if (n == 1) return false;
    if (n == 2) return true;
    throw std::logic_error("Too difficult");
  }
  int main() {
    try { cout << async(&is_prime, 3).get(); }
    catch (exception& e) { cout << e.what(); }
  }

Output: Too difficult

Tasks async details
Two overloads:
  template <class Function, class... Args>
  std::future</* result of f */> async(Function&& f, Args&&... args);

  template <class Function, class... Args>
  std::future</* result of f */> async(std::launch policy, Function&& f, Args&&... args);

Launch policy:
  launch::async — execute on a new thread (with thread locals initialized)
  launch::deferred — execute lazily: in the same thread, but only at the get() call
The first form behaves as if the launch policy were async|deferred, which lets the implementation choose.
If async's result is bound to a temporary, execution blocks at that temporary's destructor:

  async(launch::async, &foo); // blocks for foo to finish here
  async(launch::async, &bar); // blocks for bar to finish here

Tasks Future of futures
Current C++ futures are rather cumbersome
  cannot block on one of several futures to become ready; must block on one particular future
  no composable tasks: not easy to set up data flows where one task waits for the completion and result of another
Key feature missing: future::then
  instead of blocking with get until a future is ready, register a continuation to be run when the future is ready
  no blocking
Not clear what goes into C++2a

Tasks Example: then
  ... cout << f2.get(); }
Tasks Future combinators when_any, when_all
Wait until (at least) one of a set of futures is ready:

  auto f_or_g = when_any(async(f), async(g));
  f_or_g.then([](future<...> r) { ... });

Wait until all futures in a given set are ready:

  future<...> all = when_all(...);

Jaakko Järvi (University of Bergen) Concurrent Programming in Standard C++ <2018-10-02 Tue> 92 / 92

Task System Implementations
Jaakko Järvi
University of Bergen
<2018-10-02 Tue>

Outline
1 About performance
2 Task System
3 Communication between tasks
4 Deadlock and task systems
5 Task stealing

About performance What kind of speedup can be obtained with threads?
GFLOPS in a typical computer:

  Unit               | GFLOPS | Prog. technology
  GPU                | 2250   | CUDA, OpenGL, OpenCL, DirectX
  Vectorization unit | 500    | Autovectorization, intrinsics, OpenCL
  Multi-threading    | 50     | C++, Intel's TBB, Apple's GCD
  Serial             | 7      | C++

From Sean Parent. Do not block!
About performance Amdahl's Law

  S(n) = 1 / ((1 − p) + p/n)

S is the theoretical speedup
n is the factor of increase in resources (the number of cores)
p is the portion of the program that benefits from the resources, measured as a fraction of execution time

About performance Amdahl's Law, Example
Example: a program fetches data (10%) and computes (90%); fetching is serial, computation can be parallelized. With p = 0.9:

  n        | S(n)
  1        | 1.00
  2        | 1.82
  4        | 3.08
  8        | 4.71
  16       | 6.40
  32       | 7.80
  64       | 8.77
  1000000  | 10.00
  10000000 | 10.00

Observation: diminishing returns
Note: any waiting because of thread synchronization adds to the serial part ⇒ minimize synchronization!
About performance Amdahl's law (graph)

Task System

Task System Task System?
We just learned about the use of async
Next: how to implement one
We closely follow Sean Parent's portable task system and his talks at various C++ forums; code almost verbatim

Task System Some includes and using declarations
A bunch of definitions that the examples assume:

  #include ...
  using std::forward; using std::move; using std::function;
  using std::thread; using std::string; using std::future;
  using lock_t = std::unique_lock<std::mutex>;

Task System We need something to compute, concurrently

  bool is_prime(long num) {
    long limit = sqrt(num);
    if (num < 2) return false;
    for (long i = 2; i <= limit; i++) {
      if (num % i == 0) return false;
    }
    return true;
  }

is_prime(2) == true
is_prime(4) == false

Task System A notification queue (implemented as a monitor)

  class notification_queue {
    std::deque<function<void()>> _q;
    std::mutex _mutex;
    std::condition_variable _ready;
    ...
  };

a queue for void() functions
pop blocks if the queue is empty

Task System Pop

  void pop(function<void()>& f) {
    lock_t lock{_mutex};
    while (_q.empty()) _ready.wait(lock);
    f = move(_q.front());
    _q.pop_front();
  }

notice that f is an "out" parameter
front accesses the first element, pop_front discards it
functions are moved, not copied — tasks will not get duplicated

Task System Push

  template <typename F>
  void push(F&& f) {
    { lock_t lock{_mutex}; _q.emplace_back(forward<F>(f)); }
    _ready.notify_one();
  }

the argument is perfect-forwarded into the queue

Task System Notification Queue 1
(the full listing combines the members, pop, and push above; reminder: we defined lock_t as std::unique_lock<std::mutex>)

Task System Some helpers
The task-system-utilities.hpp header provides the headers and using declarations above, plus other necessary headers:
  log, logl
  sleep functions sleep_s(int), sleep_ms(int), sleep_random_ms(int limit)
  timer class with void reset(); and double elapsed(); methods
  function time_ms(f) to time execution of f()
  number_of_threads() function retrieves an "optimal" number of threads for the current computer

Task System number_of_threads()
The standard function thread::hardware_concurrency may(!) provide the number of cores in the current system
thread::hardware_concurrency() is only a hint
We wrap it, in case it does not return a useful value:

  unsigned int number_of_threads() {
    return std::min(32u, std::max(1u, thread::hardware_concurrency()));
  }

Hello, World!
Task System Test notification queue 1: nq1-example1.cpp

  #include "task-system-utilities.hpp"
  int main() {
    notification_queue q;
    q.push([]{ std::cout << "Hello, "; });
    thread t([&]{ function<void()> f; q.pop(f); f(); q.pop(f); f(); });
    q.push([]{ std::cout << "World!"; });
    t.join();
  }

Task System Task system 1

  class task_system {
    const unsigned int _nthreads;
    std::vector<thread> _threads;
    notification_queue _q;
    void run() {
      while (true) {
        function<void()> f;
        _q.pop(f);
        f();
      }
    }
  public:
    ...
  };

Task System Constructor, destructor, and async
Constructor:

  task_system(int nthreads = 0)
    : _nthreads(nthreads > 0 ? nthreads : number_of_threads()) {
    for (unsigned int n = 0; n < _nthreads; ++n) {
      _threads.emplace_back([&]{ run(); });
    }
  }

Destructor:

  ~task_system() { for (thread& t : _threads) t.join(); }

async:

  template <typename F>
  void async(F&& f) { _q.push(forward<F>(f)); }

Task System Test task system 1: ts1-example1.cpp

  #include "task-system-utilities.hpp"
  int main() {
    task_system ts;
    ts.async([]{ std::cout << "Hello, " << std::endl; });
    ts.async([]{ std::cout << "World!" << std::endl; });
  }

One possible result: Hello, World!
Another possible result: World!Hello, — the two tasks may run in either order, and their output may interleave.

After the output, the program hangs. . .
Need a way to shut down the task system:
1 Destructor tells the notification queue to be done
2 All waiting pop calls are woken up
3 Queue's pop no longer blocks on empty, but instead returns false
4 Task system's run methods stop if pop returns false
Task System Notification queue 2

  class notification_queue {
    std::deque<function<void()>> _q;
    bool _done = false;
    std::mutex _mutex;
    std::condition_variable _ready;
  public:
    void done() {
      { lock_t lock(_mutex); _done = true; }
      _ready.notify_all();
    }
    bool pop(function<void()>& f) { ... }  // returns false when done and empty
    template <typename F> void push(F&& f) { ... }
  };

Task System Task system 2

  class task_system {
    const unsigned int _nthreads;
    std::vector<thread> _threads;
    notification_queue _q;
    void run() {
      while (true) {
        function<void()> f;
        if (!_q.pop(f)) break;
        f();
      }
    }
  public:
    task_system(int nthreads = 0)
      : _nthreads(nthreads > 0 ? nthreads : number_of_threads()) {
      for (unsigned int n = 0; n < _nthreads; ++n) {
        _threads.emplace_back([&]{ run(); });
      }
    }
    ~task_system() {
      _q.done();
      for (thread& t : _threads) t.join();
    }
    template <typename F> void async(F&& f) { _q.push(forward<F>(f)); }
  };

Task System Test task system 2: ts2-example1.cpp

  #include "task-system-utilities.hpp"
  int main() {
    task_system ts;
    ts.async([]{ std::cout << "Hello, "; });
    ts.async([]{ std::cout << "World!"; });
  }

(one possible result) Hello, World! — no longer hangs

Task System Choosing the number of threads
The number of threads allocated for the task system affects the speedup
What is the best number?
Too few threads ⇒ cores sit idle, while tasks sit on the queue
Too many threads ⇒ OS costs of opening, closing, and managing threads go up

Task System Example

  #include "task-system-utilities.hpp"
  const int ntasks = 4096;
  void test(int nthreads) {
    double time;
    timer tmr;
    {
      task_system ts(nthreads);
      for (int i = 0; i < ntasks; ++i) ...
    }
    ...
  }
  int main() {
    for (int n = 1; n <= 2048; n *= 2) test(n);
    test(thread::hardware_concurrency());
  }

time 185.78 using 1 threads
time 95.0063 using 2 threads
time 49.2618 using 4 threads
time 73.3565 using 8 threads
time 84.6876 using 16 threads
time 85.8173 using 32 threads
time 85.1905 using 64 threads
time 87.6551 using 128 threads
time 86.9235 using 256 threads
time 84.6692 using 512 threads
time 152.756 using 1024 threads
time 304.862 using 2048 threads
time 67.7195 using 8 threads

Communication between tasks

Communication between tasks Result of a task
Thus far we have had no communication between tasks; tasks did not return values
Tasks that do return a value afford a simple way of communication
  async schedules a packaged task, and returns a future
  the future becomes ready when the task completes
Reasoning with tasks that return futures is relatively simple
  if a task depends on some future, it cannot progress before the value is available
  there is no earlier (inconsistent) value of a shared variable that might be used by accident
Dataflow is explicit
Operations become non-blocking ⇒ (possibly) easier to exploit parallelism

Communication between tasks Aside: Correspondence between monitors and task parallelism
Parent's observation: a monitor, whose operations are synchronized and can therefore block, can be turned into non-blocking operations with a task queue

Communication between tasks Example

  class database {
    std::mutex _mutex;
    std::unordered_map<string, string> _map;
  public:
    void set(string key, string value) {
      lock_t lock(_mutex);
      _map[key] = value;
    }
    auto get(const string& key) -> string {
      lock_t lock(_mutex);
      try { return _map.at(key); }
      catch (...) { return string("not found"); }
    }
  };

Both set and get may block, making the client wait
The lock is held an unbounded amount of time; hash computation takes time proportional to the length of the key

Communication between tasks Testing blocking database

  #include "task-system-utilities.hpp"
  int main() {
    database db;
    thread t1([&]{ db.set("Bergen", "Norway"); }),
           t2([&]{ db.set("Turku", "Finland"); }),
           t3([&]{ db.set("College Station", "USA"); });
    logl(db.get("Turku"));
    logl(db.get("Turku"));
    t1.join(); t2.join(); t3.join();
  }

Possible output:
not found
Finland

Communication between tasks Nonblocking database

  class database {
    std::unordered_map<string, string> _map;
    task_system _ts{1};
  public:
    ...
  };

Caution! tasks store references to _map, and must not outlive _map. Here OK because _ts will be destructed before _map per C++ destruction order.
Note that the system is single threaded! A serial/sequential queue (no need for a lock on the database).

Communication between tasks Set many

  void set_many(std::vector<std::pair<string, string>> v) { ... }

Now set and get return (practically) immediately
There can still be congestion, but the work under the lock is constant
Even potentially expensive operations (set_many) return almost immediately
  here, possibly must copy the vector v, but the time depends only on the length of the vector, not on whether other tasks are accessing the database
The client can choose to wait for an issued task to finish, or do other things instead
Interactivity for "free"
But of course extra overhead: queue management; constructing thunks, temporary objects

Communication between tasks Testing non-blocking database

  #include "task-system-utilities.hpp"
  int main() {
    database db;
    thread t1([&db]{ std::vector ... });
    thread t2([&db]{
      for (int i = 0; i < 100; ++i) { sleep_ms(4); db.set("Bergen", "Independent"); }
    });
    thread t3([&db]{
      for (int i = 0; i < 100; ++i) { sleep_ms(1); db.set("Bergen", "Norway"); }
    });
    t1.join(); t2.join(); t3.join();
  }

Possible output:
not found
Norway
Norway
Norway
Norway
Norway
Independent
Norway
Norway
Norway

Communication between tasks Testing set_many

  #include "task-system-utilities.hpp"
  int main() {
    timer tmg, tml;
    double time1, time2, timeg;
    {
      std::vector<std::pair<string, string>> p, q;
      for (int i = 0; i < 100'000; ++i)
        p.emplace_back(std::to_string(i) + "P", std::to_string(i + 1));
      for (int i = 0; i < 100'000; ++i)
        q.emplace_back(std::to_string(i) + "Q", std::to_string(i + 1));
      {
        database db;
        tmg.reset(); tml.reset();
        db.set_many(move(p));
        time1 = tml.elapsed();
        db.set_many(move(q));
        time2 = tml.elapsed();
      }
      timeg = tmg.elapsed();
      logl("T1=", time1, ", T2=", time2, ", TG=", timeg);
    }
  }

T1=0.025144, T2=0.02587, TG=98.0445
The issuing calls return in a fraction of a millisecond; the large total comes from the database's destructor waiting for all queued work to finish.

Communication between tasks Back to
communicating between tasks: async that returns a future
The scheme is as follows:
1 async receives a function f
2 constructs a packaged task for f
3 obtains the future of the packaged task
4 pushes the packaged task to the task queue
5 returns the future

Communication between tasks Task system 3

  class task_system {
    ...
    template <typename F>
    auto async(F&& f) {
      using result_type = std::result_of_t<std::decay_t<F>()>;
      using packaged_type = std::packaged_task<result_type()>;
      auto p = new packaged_type(forward<F>(f));
      auto result = p->get_future();
      _q.push([p]{ (*p)(); delete p; }); // delete package after task executed
      return result;
    }
  };

Communication between tasks Test task system 3: ts3-example1.cpp

  #include "task-system-utilities.hpp"
  int f() { return 42; }
  int main() {
    task_system ts;
    auto a = ts.async(f);
    auto b = ts.async([]{ return "Hello, "; });
    auto c = ts.async([]{ return "World!"; });
    logl(a.get(), "", b.get(), c.get());
  }

Output: 42 Hello, World!

Communication between tasks Tasks with arguments
The tasks must be nullary functions
This leads to frequent wrapping with lambdas, and syntactic noise:

  int plus(int i, int j) { return i + j; }
  ts.async([]{ return plus(1,2); });

Next: allow functions with arguments, the same way as thread, emplace, etc.:

  ts.async(plus, 1, 2);

Communication between tasks Task system 4: async

  template <typename F, typename... Args>
  auto async(F&& f, Args&&... args) {
    ...
    auto p = new packaged_type(
      [f = forward<F>(f), ...]{ ... }); // bind the arguments into a nullary callable
    std::future<...> result = p->get_future();
    _q.push([p]{ (*p)(); delete p; });
    return result;
  }

Communication between tasks Test task system 4: ts4-example1.cpp

  #include "task-system-utilities.hpp"
  int f(int n, int m, int k) { return n + m + k; }
  int main() {
    task_system ts;
    auto a = ts.async(f, 10, 20, 12);
    auto b = ts.async([](string s1, string s2) { return s1 + s2; },
                      "Hello, ", "World!");
    logl(a.get(), "", b.get());
  }

Output: 42 Hello, World!

Deadlock and task systems

Deadlock and task systems Tasks, threads, deadlocks
Task system implementations can be prone to deadlock
Task systems/thread pools typically do not know when a task is blocked
A blocked task occupies a thread, and does not let go
With a fixed number of threads, running out of threads means deadlock if all threads are blocked waiting for computations that are either blocked or still in the queue

Deadlock and task systems Example (no deadlock)

  #include "task-system-4.hpp"
  int prime_finder(int a, int b) { std::vector ... }
  int main() {
    for (int i = 0; i < 10; ++i) {
      try { logl(prime_finder(i*1000, i*1000 + 4)); }
      catch (string e) { logl(e); }
    }
  }

Output:
2
not found
2003
3001
4001
5003
not found
7001
not found
9001

Deadlock and task systems Example (yes deadlock)

  #include "task-system-4.hpp"
  task_system ts;
  int prime_finder(int a, int b) { std::vector ... }
  int main() { std::vector ... }

Deadlock and task systems Deadlock with a serial queue

  #include "task-system-4.hpp"
  task_system ts2{2};
  task_system ts1{1};
  int main() {
    // OK
    ts2.async([]{ ts2.async([]{ return 0; }).get(); });
    // Deadlock
    ts1.async([]{ ts1.async([]{ return 0; }).get(); });
  }

Deadlock and task systems Conservative programming advice
A task should not wait/block for another task in the same queue
This can be relaxed, but it is difficult to formulate a modular, easy rule

Task stealing

Task stealing Using several notification queues
Pushing to and popping from a single queue causes congestion: there will be blocking even if threads are available
New idea: one queue for each thread; only need to block if there are no tasks
An atomic _index variable indicates the current queue/thread; _index is advanced after every enqueue operation
Järvi (University of Bergen) Task System Implementations <2018-10-02 Tue> 48 / 61

Task stealing: Task system 5
class task_system {
  const unsigned int _nthreads;
  std::vector<notification_queue> _qs;   // one queue per thread
  std::vector<std::thread> _threads;
  std::atomic<unsigned int> _index{0};
  ...
};

Task system 5: constructor and destructor
task_system(int nthreads = 0)
  : _nthreads(nthreads > 0 ? nthreads : number_of_threads()),
    _qs(_nthreads)   // note: brittle
{
  for (unsigned int n = 0; n < _nthreads; ++n) {
    _threads.emplace_back([this, n]{ run(n); });
  }
}
Each thread's run is invoked with a unique index.
Member initializers are evaluated in the order the members are declared in the class body, hence the comment about brittle code.

Task system 5: run
void run(unsigned int i) {
  while (true) {
    function<void()> f;
    if (!_qs[i].pop(f)) break;   // thread i pops from its own queue
    f();
  }
}
Each thread's run gets a different index i that identifies the thread's queue: thread i pops from _qs[i].

Task system 5: async
template <typename F, typename... Args>
auto async(F&& f, Args&&... args) {
  auto p = new packaged_type([f = forward<F>(f), ...]{ ... });
  auto result = p->get_future();
  auto i = _index++;
  _qs[i % _nthreads].push([p]{ (*p)(); delete p; });
  return result;
}
Note: _index overflowing is OK, since we use modulo arithmetic with unsigned integers.

Test task system 5: ts5-example1.cpp
#include "task-system-utilities.hpp"
int main() {
  task_system ts;
  auto p = ts.async([](long t) { sleep_ms(t); return "C"; }, 100);
  ts.async([]{ std::cout << "A"; });
  ts.async([]{ std::cout << "B"; });
  logl(p.get());
}
Output: BAC. Client API unchanged.

Task stealing: Notification queue 3
class notification_queue {
  std::deque<function<void()>> _q;
  ...
  void done() {
    { lock_t lock(_mutex); _done = true; }
    _ready.notify_all();
  }
  bool pop(function<void()>& f) { ... }
  ...
};

Notification queue 3: try_push
template <typename F>
bool try_push(F&& f) {
  {
    lock_t lock(_mutex, std::try_to_lock);
    if (!lock) return false;   // give up instead of blocking
    _q.emplace_back(std::forward<F>(f));
  }
  _ready.notify_one();
  return true;
}

Notification queue 3: try_pop
bool try_pop(function<void()>& f) {
  lock_t lock(_mutex, std::try_to_lock);
  if (!lock || _q.empty()) return false;
  f = std::move(_q.front());
  _q.pop_front();
  return true;
}

Task system 6
class task_system {
  ...
};

Task system 6: run
void run(unsigned int i) {
  while (true) {
    function<void()> f;
    for (unsigned int n = 0; n != _nthreads; ++n) {
      if (_qs[(i + n) % _nthreads].try_pop(f)) break;   // steal from another queue if needed
    }
    if (!f && !_qs[i].pop(f)) break;   // fall back to blocking on own queue
    f();
  }
}

Task system 6: async
template <typename F, typename... Args>
auto async(F&& f, Args&&... args) {
  auto p = new packaged_type([f = forward<F>(f), ...]{ ... });
  auto result = p->get_future();
  const int K = 56;
  auto i = _index++;
  auto pl = [p]{ (*p)(); delete p; };
  for (unsigned int n = 0; n < _nthreads * K; ++n) {
    if (_qs[(i + n) % _nthreads].try_push(pl)) return result;
  }
  _qs[i % _nthreads].push(pl);
  return result;
}

Test task system 6: ts6-example1.cpp
#include "task-system-utilities.hpp"
int main() {
  task_system ts;
  auto p = ts.async([](long t) { sleep_ms(t); return "C"; }, 100);
  ts.async([]{ std::cout << "A"; });
  ts.async([]{ std::cout << "B"; });
  logl(p.get());
}
Output: ABC. Client API unchanged.

Task stealing: Timings
#include "task-system-utilities.hpp"
const int ntasks = 40960;
const int maxprime = 10;
void test(int nthreads) {
  double time4;
  timer tmr4;
  {
    ts4::task_system t4(nthreads);
    for (int i = 0; i < ntasks; ++i) ...
  }
  ...
}
int main() { for (int n = 1; n <= 128; n *= 2) test(n); }

Results:
threads   time 4     time 5     time 6
1         268.635    246.731     74.2265
2         453.62     103.271     59.1793
4         563.441    102.584    100.315
8         562.981    119.062    116.617
16        561.763    112.588    118.333
32        568.246    112.912    118.695
64        588.09     113.778    124.002
128       623.733    121.989    134.022

Coroutines
Jaakko Järvi
University of Bergen
<2018-11-08 Thu>
Jaakko Järvi (University of Bergen) Coroutines <2018-11-08 Thu> 1 / 66

Outline
1 Coroutines
2 Emulating coroutines
3 Passing data with suspend and resume
4 Async/await (asynchronous coroutines)
5 Promises
6 Asynchronous Iterators and Iterables

Coroutines in a Nutshell
A generalization of subroutines that allows suspending the execution of the subroutine, and resuming a suspended execution.
Local variables and the program counter persist across suspension.
Coroutines are in progress simultaneously, but not executed simultaneously.
A control abstraction for co-operative,
or non-preemptive, multitasking.
Many applications: cooperative tasks, event loops, pipelining, iterators, generators.

Example (JavaScript)
function* primes() {
  let primes = [];
  let c = 2;
  while (true) {
    let composite = false;
    for (let p of primes) {
      if (c % p == 0) { composite = true; break; }
    }
    if (!composite) { primes.push(c); yield c; }
    ++c;
  }
}

let p = primes();
while (true) {
  let r = p.next();
  if (r.value > 20) break;
  console.log(r.value);
}
Output: 2 3 5 7 11 13 17 19

History
First introduced by Melvin Conway in 1958 (in assembly, for a COBOL compiler).
Appeared in some early languages: Simula, Modula-2, BCPL.
Then seemingly forgotten for many years.
Re-emerged recently; today found in many of the relevant languages: coming to C++; prominently in Go; JavaScript; Kotlin; Scheme (continuations), Haskell, F#, ...
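The primes example above drives a generator by hand through next(). A smaller sketch of the same mechanics, showing that a local variable survives every suspension because it lives in the generator's frame rather than on the call stack (upTo is an illustrative name, not from the slides):

```javascript
// A generator whose local `c` persists across suspensions.
function* upTo(limit) {
  let c = 1;            // lives in the generator's frame
  while (c <= limit) {
    yield c;            // suspend here; resume continues below
    ++c;
  }
}

const g = upTo(3);
const seen = [];
let r = g.next();       // run to the first yield
while (!r.done) {
  seen.push(r.value);
  r = g.next();         // resume: c and the loop position are restored
}
console.log(seen);      // [1, 2, 3]
```

Each next() call returns a `{ value, done }` pair; once the generator body returns, done becomes true and the loop exits.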
Subroutine
A subroutine has a call and a return.
call: pushes a new activation record/frame onto the stack, suspends the caller, jumps to the beginning of the function.
return: passes the return value to the caller, pops the frame, resumes the caller's execution.
Frames go on a stack in most languages. Stack-based frame allocation is even hardware-supported: a register holds the top of the stack, so frame allocation/deallocation == modifying the top-of-stack register.

Coroutine
A coroutine has a call, suspend, resume, destroy, and return.
suspend: suspend execution, remember the current point, save the current frame, transfer execution back to the caller.
resume: restore the saved frame, continue from the point of suspension.
destroy: deallocate the saved frame (without resuming execution).
Corollary: activation frame lifetimes are not nested, so frames are heap-allocated. A smart compiler may be able to optimize (and use alloca).

Variations
Symmetric vs. asymmetric.
Coroutines first class or not.
Stackful vs.
stackless.
Parallel execution or not.

Symmetric Coroutines
Symmetric coroutines have a single control transfer operator, which specifies the target; yield_to has been proposed for C++.

void producer_body(producer_type::self& self, std::string base, consumer_type& consumer) {
  std::sort(base.begin(), base.end());
  do {
    self.yield_to(consumer, base);
  } while (std::next_permutation(base.begin(), base.end()));
}

void consumer_body(consumer_type::self& self, const std::string& value, producer_type& producer) {
  std::cout << value << "\n";
  while (true) {
    std::cout << self.yield_to(producer) << "\n";
  }
}

Asymmetric Coroutines
Similar to subroutines, in that control is transferred back to the caller.

typedef coro::generator<int> generator_type;
int range_generator(generator_type::self& self, int min, int max) {
  while (min < max - 1)
    self.yield(min++);
  return min;
}

The control structure is simpler. Asymmetric and symmetric coroutines are equally expressive: either can emulate the other. Most modern languages implement asymmetric coroutines.

First class coroutines
First class means "behaves like any value": can be stored in a variable, passed as a parameter, returned from a function, and suspended and yielded!
Old languages (CLU, Sather) had restrictions; most modern implementations (JavaScript, Python, C++) provide first class coroutines.

Parallel Execution or Not
Coroutines can run asynchronously with their caller; the caller can await suspension or return.
Unproblematic in languages with no shared memory between caller and coroutine: Go, JavaScript (web worker operations and other asynchronous APIs' operations).

Stackful vs.
stackless
Stackful coroutines can suspend in nested functions; when resumed, execution continues in the nested function.
Stackless coroutines can only suspend at the top level. This is a bit of a limitation: yielding from wrapper/helper functions is cumbersome, as one must create a new coroutine layer.
Python's, and C++'s(?), coroutines are stackless; JavaScript has kind of both.

Example (wrong!)
function* yieldAll(arr) {
  for (let v of arr) yield v;
}
function* f() {
  yield 0;
  yieldAll([1,2,3]);   // wrong: merely calling a generator function yields nothing
  yield 4;
}
let p = f();
for (let v of f()) console.log(v);
Output: 0 4

Example (wrong also!)
function* yieldAll(arr) {
  for (let v of arr) yield v;
}
function* f() {
  yield 0;
  yield yieldAll([1,2,3]);   // wrong also: this yields the generator object itself
  yield 4;
}
let p = f();
for (let v of f()) console.log(v);
Output: 0 Object [Generator] {} 4

Example (OK)
function* yieldAll(arr) {
  for (let v of arr) yield v;
}
function* f() {
  yield 0;
  yield* yieldAll([1,2,3]);   // yield* delegates to the inner generator
  yield 4;
}
let p = f();
for (let v of f()) console.log(v);
Output: 0 1 2 3 4
This works because function* defines a generator, which is an iterable.

Example (OK also)
// function* yieldAll(arr) {
//   for (let v of arr) yield v;
// }
function* f() {
  yield 0;
  yield* [1,2,3];   // yield* accepts any iterable
  yield 4;
}
let p = f();
for (let v of f()) console.log(v);
Output: 0 1 2 3 4

Emulating coroutines

Generators—Iterators—Coroutines
JavaScript demonstrates that a coroutine implementation can be merely syntactic sugar over objects and normal procedure calls.
Python and C++ coroutines are also mostly library implementations; C++ plans
compiler/runtime support for performance.

JavaScript Iterator and Iterable Protocols
An object is an iterator if it implements the next() method as follows: it takes no arguments and returns an object p such that p.done is a boolean; if p.done is false, then p.value is the value returned by the iterator.
An object is an iterable if it has the computed property [Symbol.iterator], which is a nullary function that returns an iterator.

Example
class PrimeIterator {
  constructor() {
    this._c = 2;
    this._primes = [];
  }
  next() {
    while (true) {
      let composite = false;
      for (let p of this._primes) {
        if (this._c % p == 0) { composite = true; break; }
      }
      if (!composite) {
        this._primes.push(this._c);
        return { done: false, value: this._c };
      } else ++this._c;
    }
  }
}
class PrimeIterable {
  [Symbol.iterator]() { return new PrimeIterator(); }
}

Testing iterators
let p = new PrimeIterator();
while (true) {
  let r = p.next();
  if (r.value > 10) break;
  console.log(r.value);
}
console.log("-----");
for (let v of new PrimeIterable()) {
  if (v > 10) break;
  console.log(v);
}
Output: 2 3 5 7 ----- 2 3 5 7

Generator Functions Return Iterators and Iterables
function* primes() {
  let primes = [];
  let c = 2;
  while (true) {
    let composite = false;
    for (let p of primes) {
      if (c % p == 0) { composite = true; break; }
    }
    if (!composite) { primes.push(c); yield c; }
    ++c;
  }
}
let p = primes();
while (true) {
  let r = p.next();
  if (r.value > 10) break;
  console.log(r.value);
}
console.log("-----");
for (let v of primes()) {
  if (v > 10) break;
  console.log(v);
}
Output: 2 3 5 7 ----- 2 3 5 7

Coroutine Frame
Where is the coroutine frame stored in JavaScript coroutines?
In the iterator object: the coroutine's local variables become the iterator object's member variables.

Closing coroutines
Above, we stopped calling the prime generator, even though it would have continued to yield primes. Breaking out from the for..of loop, however, closed the generator.

let p = primes();
for (let v of p) {
  if (v > 10) break;
  console.log(v);
}
console.log("---");
for (let v of p) {
  if (v > 20) break;
  console.log(v);
}
Output: 2 3 5 7 ---
Generators are closable iterators by default; one can clean up resources at close.

Cleaning up at close
Let's make primes clean up at closing:
function* primes() {
  let primes = [];
  let c = 2;
  try {
    while (true) {
      let composite = false;
      for (let p of primes) {
        if (c % p == 0) { composite = true; break; }
      }
      if (!composite) { primes.push(c); yield c; }
      ++c;
    }
  } finally {
    console.log("cleaning up");
  }
}

With the same driver as above, the output becomes: 2 3 5 7 cleaning up ---

Closable
iterators
class PrimeIterator {
  constructor() {
    this._c = 2;
    this._primes = [];
    this._closed = false;
  }
  next() {
    if (this._closed) return { done: true, value: undefined };
    while (true) {
      let composite = false;
      for (let p of this._primes) {
        if (this._c % p == 0) { composite = true; break; }
      }
      if (!composite) {
        this._primes.push(this._c);
        return { done: false, value: this._c };
      } else ++this._c;
    }
  }
  return() {
    console.log("cleaning up");
    this._closed = true;
    return this.next();
  }
}
class PrimeIterable {
  [Symbol.iterator]() { return new PrimeIterator(); }
}

Testing closable iterators
let p = new PrimeIterable();
for (let v of p) {
  if (v > 10) break;
  console.log(v);
}
console.log("---");
for (let v of p) {
  if (v > 20) break;
  console.log(v);
}
Output: 2 3 5 7 cleaning up --- 2 3 5 7 11 13 17 19 cleaning up

Closing closable iterators directly
let p = new PrimeIterator();
let r;
while (true) {
  r = p.next();
  if (r.value > 10) break;
  console.log(r.value);
}
console.log("-----");
console.log(r.value);
r = p.next();
console.log(r.value);
r = p.return();
console.log(r);
r = p.next();
console.log(r);
Output: 2 3 5 7 ----- 11 13 cleaning up { done: true, value: undefined } { done: true, value: undefined }
Passing data with suspend and resume

Suspending with a value
yield communicates data from a coroutine to its caller when the coroutine is suspended:
function* counter() {
  let c = 1;
  while (true) yield c++;
}
let p = counter();
let sum = p.next().value + p.next().value;
console.log(sum);
Output: 3
But yield e also receives data when the coroutine is resumed.

Resuming with a value
yield e is an expression; its value is the value sent to the coroutine when it is resumed:
function* skipcounter() {
  let c = 0;
  while (true) c = yield ++c;
}
let p = skipcounter();
let s1 = p.next().value;
let s2 = p.next(s1 + 100).value;
let s3 = p.next(s2).value;
console.log(s1, s2, s3);
Output: 1 102 103

Resuming with an exception
yield e can also resume with an exception, sent from the caller:
function* skipcounter() {
  let c = 0;
  while (true) {
    try {
      c = yield ++c;
    } catch (e) {
      console.log("error: " + e + ". resetting...");
      c = 0;
    }
  }
}
let p = skipcounter();
let s1 = p.next().value;
let s2 = p.next(s1 + 100).value;
let s3 = p.throw("malfunction").value;
let s4 = p.next(s3).value;
console.log(s1, s2, s3, s4);
Output: error: malfunction. resetting... / 1 102 1 2

Async/await (asynchronous coroutines)
Many languages offer ways of running coroutines concurrently with their caller. This is perhaps not in line with the original notion of a coroutine, but the term has expanded to cover such abstractions.
Note! If coroutines run in parallel and also communicate through shared memory, we have all the usual problems (synchronization needed).
Common primitives are async functions/tasks and await statements. Found in JavaScript, Python, C#, C++(?), ...
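Before moving on to asynchronous coroutines, the two-way yield protocol from the skipcounter examples can be put to work as a consumer coroutine fed through next(v). A minimal runnable sketch (runningTotal is an illustrative name, not from the slides):

```javascript
// A consumer coroutine: each `yield total` suspends with the running
// total, and the argument of the following next(v) call becomes the
// value of that yield expression inside the generator.
function* runningTotal() {
  let total = 0;
  while (true) {
    const v = yield total;   // suspend with total, resume with next input
    total += v;
  }
}

const acc = runningTotal();
acc.next();                     // prime: run to the first yield
const t1 = acc.next(5).value;   // total after adding 5
const t2 = acc.next(7).value;   // total after adding 7
console.log(t1, t2);            // 5 12
```

Note the priming next() call with no argument: it runs the body up to the first yield, so there is a suspended yield expression ready to receive the first value.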
Async/await in JavaScript
A function defined as async can contain await statements. await e evaluates e to a promise, and waits until that promise is resolved; the resolved promise's value is the value of the expression await e; a rejected promise turns into an exception.

async function client(request) {
  let result = await sendServerRequest(request);
  return "Result=" + result;
}

Even though sendServerRequest is asynchronous (it returns a promise), the structure of the code is similar to a regular function call.
When called, an async function runs until await (or return), then returns a promise. In the case of await, it registers the rest of the function body as a continuation for the awaited promise.

Async/await, many suspend/resume points
async function client(req1, req2, req3) {
  let r1 = await sendServerRequest(req1);
  let r2 = await sendServerRequest(req2);
  let r3 = await sendServerRequest(req3);
  return "Results=" + combine(r1, r2, r3);
}
The above is essentially a synchronous call to asynchronous operations.

Async/await, concurrent execution
Several asynchronous coroutines can be launched first and awaited later:
async function client(req1, req2, req3) {
  let p1 = sendServerRequest(req1);
  let p2 = sendServerRequest(req2);
  let p3 = sendServerRequest(req3);
  // do other stuff
  let r1 = await p1;
  let r2 = await p2;
  let r3 = await p3;
  return "Results=" + combine(r1, r2, r3);
}

One can wait for many promises simultaneously:
async function client(req1, req2, req3) {
  let p1 = sendServerRequest(req1);
  let p2 = sendServerRequest(req2);
  let p3 = sendServerRequest(req3);
  // do other stuff
  let [r1, r2, r3] = await Promise.all([p1, p2, p3]);
  return "Results=" + combine(r1, r2, r3);
}

One can wait for the first out of many promises:
async function client(req1, req2, req3) {
  let p1 = sendServerRequest(req1);
  let p2 = sendServerRequest(req2);
  let p3 = sendServerRequest(req3);
  // do other stuff
  let r = await Promise.race([p1, p2, p3]);
  return "First result=" + r;
}

Intermediate Summary
function* and yield define synchronous coroutines. Under the covers, they are just generator objects, where the object's instance variables are the local variables of the coroutine frame.
async and await define asynchronous coroutines. Under the covers, async functions are functions that return promises, and await expressions register continuations for promises.
What then are promises in JavaScript?
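The launch-first, await-later pattern with Promise.all and Promise.race can be run standalone. A minimal sketch, where sleep is a stand-in for a real asynchronous API such as a server request (all names here are illustrative, not from the slides):

```javascript
// sleep resolves with v after ms milliseconds.
function sleep(ms, v) {
  return new Promise(resolve => setTimeout(() => resolve(v), ms));
}

async function demo() {
  // Launch all three first, so they run concurrently...
  const p1 = sleep(30, "a");
  const p2 = sleep(10, "b");
  const p3 = sleep(20, "c");
  // ...then await them together. Promise.all preserves launch order,
  // regardless of which promise settled first.
  const all = await Promise.all([p1, p2, p3]);
  // Promise.race settles with the first promise to settle.
  const first = await Promise.race([sleep(30, "x"), sleep(5, "y")]);
  return { all, first };
}

demo().then(r => console.log(r.all, r.first));   // [ 'a', 'b', 'c' ] y
```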
Promises: Callbacks
JavaScript is single-threaded. One event loop schedules tasks (JavaScript functions) for sequential execution from a task queue. Each task runs to completion before giving control back to the event loop. Each task has been put into the task queue as the result of some event.
The programmer can register event handlers (JavaScript functions) for various events, e.g.:
button.addEventListener("click", () => { alert("button clicked"); });
The environment (browser, or Node.js) has APIs for launching asynchronous operations and registering handlers/callbacks for when they finish: networking, file I/O, animation, timeouts, GUI (users perform asynchronous operations).

Example asynchronous APIs
let input = document.getElementById('input1');
let handler = null;
input.addEventListener("keyup", (event) => {
  clearTimeout(handler);
  let up = () => { input.value = input.value.toUpperCase(); };
  handler = setTimeout(up, 1000);
});

Callback hell
Callbacks are a seemingly simple way to program: every asynchronous function takes a callback, a continuation function, to be called when the operation finishes.
This, however, composes poorly: inversion of control, and an unstructured mess in general.

Example: Callbacks
Let's package the stable-reading functionality into a function:
function readStable(input, duration, cont) {
  let handler = setTimeout(() => { cleanup(); cont(input.value); }, duration);
  input.addEventListener("keyup", keyup);
  function keyup(event) {
    clearTimeout(handler);
    handler = setTimeout(() => { cleanup(); cont(input.value); }, duration);
  };
  function cleanup() {
    input.removeEventListener("keyup", keyup);
  }
}
Example: Callbacks
Another asynchronous function: look up a user's account:
function findUser(key, cont) {
  // emulate an asynchronous request
  setTimeout(() => {
    switch (key) {
      case "EDGAR": cont("Dijkstra"); return;
      case "TONY": cont("Hoare"); return;
      default: ...
    }
  });
}

Example: Callbacks
Composing with callbacks:
function showUserData() {
  let input = document.getElementById('input1');
  let output = document.getElementById('output1');
  readStable(input, 2000, (key) => {
    let uKey = key.toUpperCase();
    findUser(uKey, (v) => { output.value = v; });
  });
}

Promises
A promise represents the eventual result of an asynchronous operation. It is a placeholder for either an eventual value, which materializes if the operation succeeds, or an error, which materializes if the operation fails.
Promises make executing, composing, and managing asynchronous operations much simpler than programming with callbacks and events directly.
Corresponds roughly to C++ futures (except that promises are, for now, more composable).

Promise states
A promise can be in one of three states:
Pending: the promise has no value yet.
Fulfilled: the asynchronous operation has completed, and the promise has a value (that will not change).
Rejected: the asynchronous operation failed, and the promise will never be fulfilled. An error value indicates the reason for failure.
Valid state transitions: Pending → Fulfilled, Pending → Rejected.

Constructing a Promise
The Promise constructor's argument is a function (resolve, reject) => { ... }
The constructor invokes the function, binding resolve to a function that resolves the promise, and reject to a function that rejects the promise.
Example: a promise that immediately resolves to 1:
new Promise((resolve, reject) => { resolve(1); });
Example: a promise that immediately rejects with "error":
new Promise((resolve, reject) => { reject("error"); });

Example: Promises
function readStable(input, duration) {
  return new Promise((resolve, reject) => {
    let handler = setTimeout(() => resolve(input.value), duration);
    input.onkeyup = (event) => {
      clearTimeout(handler);
      handler = setTimeout(() => resolve(input.value), duration);
    };
  });
}

Example: Promises
function findUser(key) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      switch (key) {
        case "EDGAR": resolve("Dijkstra"); return;
        case "TONY": resolve("Hoare"); return;
        default: reject(key); return;
      }
    });
  });
}

Example: Promises
function showUserData() {
  let input = document.getElementById('input1');
  let output = document.getElementById('output1');
  readStable(input, 2000)
    .then(key => key.toUpperCase())
    .then(uKey => findUser(uKey))
    .then(v => { output.value = v; })
    .catch(e => { output.value = "No user " + e; });
}

Example: Promises (variation 2)
function showUserData() {
  let input = document.getElementById('input1');
  let output = document.getElementById('output1');
  readStable(input, 2000)
    .then(key => key.toUpperCase())
    .then(findUser)
    .then(v => { output.value = v; })
    .catch(e => { output.value = "No user " + e; });
}

Example: Promises (variation 3)
function showUserData() {
  let input = document.getElementById('input1');
  let
  output = document.getElementById('output1');
  readStable(input, 2000)
    .then(key => findUser(key.toUpperCase()))
    .then(v => { output.value = v; })
    .catch(e => { output.value = "No user " + e; });
}

Example: async/await
async function showUserData() {
  let input = document.getElementById('input1');
  let output = document.getElementById('output1');
  try {
    let key = await readStable(input, 2000);
    let uKey = key.toUpperCase();
    let v = await findUser(uKey);
    output.value = v;
  } catch (e) {
    output.value = "No user " + e;
  }
}

Translate async/await to promises and coroutines
function asyncInc(i) { return Promise.resolve(i + 1); }
async function countToThree() {
  let i = 0;
  i = await asyncInc(i);
  i = await asyncInc(i);
  i = await asyncInc(i);
  return i;
}
countToThree().then(v => console.log(v));
Output: 3

Translation
function _countToThree() {
  return spawn(function*() {
    let i = 0;
    i = yield asyncInc(i);
    i = yield asyncInc(i);
    i = yield asyncInc(i);
    return i;
  });
}
function spawn(generator) {
  return new Promise(function(resolve, reject) {
    let iter = generator();
    function step(nextf) {
      let r;
      try { r = nextf(); } catch (e) { reject(e); return; }
      if (r.done) { resolve(r.value); return; }
      Promise.resolve(r.value).then(
        v => step(() => iter.next(v)),
        e => step(() => iter.throw(e)));
    };
    step(() => iter.next());
  });
}
_countToThree().then((v) => console.log(v));
Output: 3

Asynchronous Iterators and Iterables
To
They are not merely synchronous iterables/iterators of promises: the number of elements the iterable has may depend on the asynchronous computations that produce the elements.

The asynchronous iterator protocol is the same as the synchronous iterator protocol, except that the next method returns a promise, and can be async.

The asynchronous iterable protocol requires that an object has a method [Symbol.asyncIterator]() that returns an object conforming to the asynchronous iterator protocol.

Example

  function sleep(time, v) {
    return new Promise((resolve) => setTimeout(() => resolve(v), time));
  }

  class TickAsyncIterator {
    constructor(time) { this._time = time; this._ticks = 0; this._done = false; }
    next() {
      if (this._done) return Promise.resolve({ done: true });
      return sleep(this._time, ++this._ticks).then(v => ({ done: false, value: v }));
    }
    return() { this._done = true; return this.next(); }
  }

  class TickAsyncIterable {
    constructor(time) { this._time = time; }
    [Symbol.asyncIterator]() { return new TickAsyncIterator(this._time); }
  }

Testing asynchronous iterators

  let p = new TickAsyncIterator(1000);
  let d = Date.now();
  p.next().then(v => console.log(v.value, Date.now() - d));
  p.next().then(v => console.log(v.value, Date.now() - d));
  p.next().then(v => console.log(v.value, Date.now() - d));

Output:
  1 1008
  2 1011
  3 1012

The iterator does not (necessarily) await the promise it yields.
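The timing above follows from the fact that each call to next() starts its own timer immediately, so the three promises are pending concurrently. A minimal sketch of the same phenomenon without timers (the class name CountingIterator is mine, not from the slides):

```javascript
// Hypothetical async iterator that hands out promises without awaiting them.
// All three next() calls below are issued before any result is consumed,
// which is why TickAsyncIterator's three ticks all arrive ~1000 ms after start.
class CountingIterator {
  constructor() { this._n = 0; }
  next() {
    // No await here: the result promise is created and returned immediately.
    return Promise.resolve({ done: false, value: ++this._n });
  }
}

const it = new CountingIterator();
const pending = [it.next(), it.next(), it.next()]; // three pending promises at once
Promise.all(pending).then(rs => {
  console.log(rs.map(r => r.value)); // the values 1, 2, 3
});
```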
Example

  class TickAsyncIterator {
    constructor(time) { this._time = time; this._ticks = 0; this._done = false; }
    async next() {
      if (this._done) return { done: true };
      let v = await sleep(this._time, ++this._ticks);
      return { done: false, value: v };
    }
    return() { this._done = true; return this.next(); }
  }

  class TickAsyncIterable {
    constructor(time) { this._time = time; }
    [Symbol.asyncIterator]() { return new TickAsyncIterator(this._time); }
  }

Here, await guarantees that the yielded promise is resolved.

Testing asynchronous iterators that await before yielding

  async function f() {
    let p = new TickAsyncIterator(1000);
    let d = Date.now();
    await p.next().then(v => console.log(v.value, Date.now() - d));
    await p.next().then(v => console.log(v.value, Date.now() - d));
    await p.next().then(v => console.log(v.value, Date.now() - d));
  }
  f();

Output:
  1 1003
  2 2013
  3 3018

for..await..of

Asynchronous iterables (will be) integrated with the rest of the language: with generators (yielding promises), and with loops.

  function mkDeferred() {
    let rr, jj;
    let p = new Promise((r, j) => { rr = r; jj = j; });
    p.resolve = rr;
    p.reject = jj;
    return p;
  }

  let mpos = mkDeferred();
  document.addEventListener("mousemove", evt => {
    mpos.resolve(evt);
    mpos = mkDeferred();
  });

  async function* mouseMoves() {
    while (true) {
      let evt = await mpos;
      yield "(" + evt.clientX + "," + evt.clientY + ")";
    }
  }

  (async function() {
    for await (const m of mouseMoves()) console.log(m);
  })();
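The mouseMoves idea can be exercised without a DOM: an async generator yields promised values, and for await suspends the loop until each value resolves. A small self-contained sketch (the names countdown and collect are mine):

```javascript
// Sketch: an async generator produces values that arrive asynchronously;
// "for await" consumes them in order, suspending between elements.
async function* countdown(from) {
  for (let i = from; i > 0; i--) {
    // Each value is wrapped in a promise; await resolves it before yielding.
    yield await Promise.resolve(i);
  }
}

async function collect() {
  const out = [];
  for await (const v of countdown(3)) out.push(v);
  return out;
}

collect().then(vs => console.log(vs)); // logs [ 3, 2, 1 ]
```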
Taste of Distributed Systems
Jaakko Järvi
University of Bergen
<2018-11-20 Tue>

Outline
1 Introduction
2 Computational Model
3 Consensus in Synchronous Distributed Systems
4 Time in distributed systems

Introduction

Class thus far
- Programs run on one computer
- Centrally coordinated
- Processes are assumed not to fail
- Message delivery is assumed not to fail
- Important and complex topic ignored: distributed computing
- Next: a small taste

Distributed Systems
- Composed of independently executing activities, typically in different computers
- No central authority, no central clock
- Loosely coupled: connections not fixed, message delivery may fail, varying latencies, agents can come and go
- Uncertainty: agents can fail, behave erratically, even maliciously
- Programming distributed systems is difficult: tricky algorithms (how to get a group of possibly faulty independent agents to converge to a desired result) and tricky proofs of those algorithms
Distributed systems are ubiquitous
- Replicated databases (how is Facebook's data stored?)
- Software fault-tolerance
- Communication networks
- Mobile ad-hoc networks
- Grid computing
- Blockchain
- Robot swarms

Typical Problems
- Distributed graph algorithms
- Leader election
- Mutual exclusion
- Consistency (think of the "Coordinated Attack Problem")
- Notions of time and causality
Many are some form of consensus problem.

About Consensus in Asynchronous Systems
- FLP Theorem: "Distributed Consensus is Impossible with One Faulty Process"
- It is possible that all processes stay undecided forever
- Randomized algorithms can make the probability of non-decision near 0, for long enough runs
- Programming without consensus is tricky
- "Conflict-free replicated data types": concurrently updatable objects that eventually converge to a consensus state if all updates are performed by all replicas; updates are monotonic operations, merges associative and commutative
Failures
- Crash: a faulty processor stops
- Byzantine: a faulty processor can do anything, even be adversarial
- Goal is often self-stabilization: the distributed system recovers from failures that cause corruption of state

Computational Model

Message Passing
- Processors p0, p1, ..., p(n-1)
- Bidirectional channels between processors, labelled in each processor; processors may not know the other end of a channel
- A processor's state is its local variables and the contents of its incoming message buffer
- Configuration/snapshot of the system: each processor's state, plus the messages on the fly (outgoing buffers)
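The model above can be played out directly in code. A small sketch (all names are mine, not the slides'): a processor's state is its local variables plus an in-buffer; a delivery event moves a message into the in-buffer; a computation event handles all buffered messages, may update local state, and leaves the in-buffer empty.

```javascript
// Sketch of the message-passing model with explicit delivery and
// computation events.
function makeProcessor(id) {
  return { id, count: 0, inBuffer: [] };
}

// Delivery event: message m (sent by processor `from`) becomes available at p.
function deliver(p, from, m) { p.inBuffer.push({ from, m }); }

// Computation event: handle every buffered message, update local state,
// and return the new outgoing messages (here: one ack per message).
function compute(p) {
  p.count += p.inBuffer.length;
  const outgoing = p.inBuffer.map(({ from }) => ({ to: from, m: "ack" }));
  p.inBuffer = []; // a computation event ends with an empty in-buffer
  return outgoing;
}

const p1 = makeProcessor(1);
deliver(p1, 0, "hello");
deliver(p1, 2, "world");
const out = compute(p1);
console.log(p1.count, out.length, p1.inBuffer.length); // 2 2 0
```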
Abstraction of Events in a Distributed System
- Delivery event: move a message from an out-buffer to an in-buffer, making it available to the receiver
- Computation event: a state transition that handles all incoming messages, possibly modifies local variables, and ends with an empty in-buffer and new messages in the out-buffers

Execution: configuration → event → configuration → event → configuration → ...

Simple example: Flooding
- Processor's local state: color (black/white)
- Initially, p0 is black and p0's out-buffers contain the flood message B
- Transition for all pi: if message B is in the in-buffer and color is white, change color to black and send B

Nondeterminism
- Several executions (sequences of events) are possible, depending on delays (order) in message delivery

Complexity measures
- Message complexity: maximum number of messages sent in any (admissible) execution; size of messages
- Time complexity: maximum "time" until termination. Set an arbitrary upper limit (1) for any message delivery delay: this bounds the worst-case message delivery delay while allowing arbitrary interleavings
Flooding Algorithm's Complexity
- Terminating states: color is black
- Message complexity: 2m, where m is the number of channels; one message over each channel (edge) in each direction
- Time complexity: diameter + 1; black from node a reaches a node b through the shortest path between a and b; a graph's diameter is the greatest distance between any two nodes

Consensus in Synchronous Distributed Systems

Synchronous Distributed Systems
- Execution is an infinite sequence of rounds
- On each round: perform all delivery events (move all sent messages into the in-buffers of their receivers), then perform a computation event on each processor pi

Consensus Results in a Synchronous System

                        crash failures    Byzantine failures
  number of rounds      f + 1             f + 1
  number of processors  f + 1             3f + 1
  message size          polynomial        polynomial

- At most f faulty processors; the above are tight bounds

Consensus algorithm for crash failures
- Example: each processor holds a value vi; agree on the minimum
- f is the maximum number of faulty processors tolerated
- Each processor's transition function:

  vsent = false;
  for (round = 1; round <= f + 1; round++) {
    if (!vsent) { send v to all; vsent = true; }
    receive all vi;                              // from the non-faulty ones
    if (some vi < v) { v = vi; vsent = false; }  // adopt the smaller value, re-send
  }
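The crash-failure algorithm can be simulated in a few lines: each round, every processor whose value is new broadcasts it, and everyone adopts the minimum it hears. A sketch (function name is mine; assumption: reliable synchronous rounds with no actual crashes, so the f + 1 rounds are merely the worst-case bound):

```javascript
// Sketch: synchronous min-consensus tolerating up to f crash failures.
// With no crashes one round already suffices, but running f + 1 rounds
// guarantees that a crash mid-broadcast cannot hide a value forever.
function minConsensus(values, f) {
  const v = values.slice();              // v[i] is processor i's current value
  const sent = values.map(() => false);  // vsent flag per processor
  for (let round = 1; round <= f + 1; round++) {
    const broadcast = [];
    for (let i = 0; i < v.length; i++)
      if (!sent[i]) { broadcast.push(v[i]); sent[i] = true; }
    for (let i = 0; i < v.length; i++)
      for (const w of broadcast)
        if (w < v[i]) { v[i] = w; sent[i] = false; } // adopt and re-send
  }
  return v;
}

console.log(minConsensus([5, 2, 9], 1)); // every entry becomes 2
```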
Time in distributed systems

Time in distributed systems
- Our perception of time is that all events are totally ordered by when they occur
- But time is relative, noticeably so in a distributed system, where different processors' "clocks" are not in sync
- Leslie Lamport's seminal paper (1978): Time, Clocks, and the Ordering of Events in a Distributed System

Happens Before Relation
a → b denotes that an event a happened before b
1. if a and b occur on the same processor and a precedes b, then a → b
2. if a is the sending of message m and b the receiving of m, then a → b
3. if a → b and b → c, then a → c
4. Happens before is a strict partial order: irreflexive, transitive, antisymmetric (implied by the previous two)

Concurrent events
If a ↛ b and b ↛ a, then a and b are concurrent: a ∥ b

Logical clocks
- For each process Pi, a logical clock is a function Ci that assigns a number Ci(a) to any event a in the process
- The entire system of logical clocks is a function C such that C(a) = Ci(a) if a is an event in Pi
- C satisfies the clock condition: for all events a and b, if a → b then C(a) < C(b)
- If C should define a total ordering, use process ids to order concurrent events
Logical Timestamps Algorithm
- Each process keeps a counter ci, initialized to 0
- Timestamp every message sent by Pi with ci
- At every computational event a, increment ci to be greater than both its current value and all the timestamps received in this event
- C(a) is then the value of ci
- Note: happens before is a partial order but clock values (integers) are a total order:
  a → b ⇒ C(a) < C(b), but C(a) < C(b) ⇏ a → b

Vector Clocks
- For a vector clock V we want: a → b if and only if V(a) < V(b)
- Each process pi keeps an n-vector v^i (one element for each process); v^i_j is pi's estimate of how many steps pj has taken
- Vector operations:
  v = w iff vi = wi for all i
  v ≤ w iff vi ≤ wi for all i
  v < w iff v ≤ w and v ≠ w
  v ∥ w iff v ≰ w and w ≰ v
- Examples:
  (2, 1, 3) = (2, 1, 3)
  (2, 1, 3) ≤ (3, 1, 3)
  (2, 1, 3) < (3, 1, 4)
  (2, 1, 3) ∥ (2, 0, 5)

Vector Timestamps Algorithm
- Initialize all v^i to the 0 vector
- With every message, pi sends its vector v^i
- At every computational event, pi increments v^i_i by one
- At receiving a message with vector t, pi updates the components of its clock vector other than v^i_i so that for all j, v^i_j = max(t_j, v^i_j)
- For an event a at pi, V(a) = v^i at the end of a

Correctness
1. a → b implies V(a) < V(b)
   1. if a and b are on processor i: v^i_i increases on every step
   2. if a on pi sends m, which b on pj receives: pj updates v^j so that V(a) ≤ V(b); the estimate v^i_j is never an overestimate of v^j_j, and since b increases v^j_j, V(a) < V(b)
   3. if ∃c such that a → c and c → b: by induction from 1. and 2., and transitivity of <
2. V(a) < V(b) implies a → b; i.e., a ↛ b implies V(a) ≮ V(b)
   Assume a occurs at pi, b at pj, and a ↛ b. Let k = V(a)_i.
   Since a ↛ b, there is no chain of messages from pi to pj originating on pi's k-th step or later and ending at pj before b.
   Therefore V(b)_i < k, and therefore V(a) ≮ V(b).

About vector timestamps
- They are big: n components each, and the values grow without bound
- There is no more efficient way: n is a tight bound
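The vector operations and the timestamp algorithm above fit in a few lines of code. A sketch (the helper names leq, lt, concurrent, step, receive are mine):

```javascript
// Sketch: vector clocks for n processes; v[i] counts steps of process i.
function leq(v, w) { return v.every((x, i) => x <= w[i]); }
function lt(v, w)  { return leq(v, w) && v.some((x, i) => x < w[i]); }
function concurrent(v, w) { return !leq(v, w) && !leq(w, v); }

// Computational event of process i: increment its own component.
function step(v, i) { const u = v.slice(); u[i] += 1; return u; }

// Receive event at process i with message timestamp t: take the
// componentwise max of the other components, then count the receive
// itself as a step of process i.
function receive(v, i, t) {
  const u = v.map((x, j) => (j === i ? x : Math.max(x, t[j])));
  return step(u, i);
}

console.log(lt([2, 1, 3], [3, 1, 4]));         // true
console.log(concurrent([2, 1, 3], [2, 0, 5])); // true
```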
Applications
- Recover a distributed system from a crash
- Each processor periodically stores a snapshot, including its vector clock value
- To restore the system, one must find a (recent) consistent snapshot, i.e., a consistent cut that is ≤ K
- A cut of the execution is K = (k0, ..., k(n-1)), where ki is the number of steps taken by processor pi
- In a consistent cut, if step s in pi happens before step kj of pj, then s ≤ ki: every received message in the cut was sent within the cut
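The consistency condition can be checked directly from per-step vector timestamps. A sketch of such a check (my own construction, under the assumption that each processor logs the vector clock value of each of its steps): a cut is consistent exactly when the last in-cut step of each processor does not "know of" more steps of any other processor than the cut includes.

```javascript
// Sketch: K[i] is the number of steps of processor i inside the cut;
// stamps[i][s] is the vector timestamp of processor i's (s+1)-th step.
function isConsistent(K, stamps) {
  for (let j = 0; j < K.length; j++) {
    if (K[j] === 0) continue;           // no steps of p_j inside the cut
    const v = stamps[j][K[j] - 1];      // stamp of p_j's last step in the cut
    for (let i = 0; i < K.length; i++)
      if (v[i] > K[i]) return false;    // p_j saw a step of p_i outside the cut
  }
  return true;
}

// Two processes: p0's step 1 sends a message that p1 receives at its step 1.
const stamps = [
  [[1, 0], [2, 0]], // vector stamps of p0's steps
  [[1, 1]],         // vector stamp of p1's receive step
];
console.log(isConsistent([0, 1], stamps)); // false: receive without its send
console.log(isConsistent([1, 1], stamps)); // true
```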