Lecture 7: Parallelism

Lecture 7: Parallelism Fall 2018 Jason Tang Slides based upon Operating System Concept slides, http://codex.cs.yale.edu/avi/os-book/OS9/slide-dir/index.html Copyright Silberschatz, Galvin, and Gagne, 2013 "1 Topics • Overview of Threading! • Multithreading Models! • Pthreads Library! • Implicit Threading "2 Why Threading? • Modern programs, including kernels, are usually multithreaded! • Example: Update display, while! • Handling key strokes, while! • Sending network messages! • Process creation is heavy-weight, thread creation is light-weight! • Can simplify code and increase computer usage "3 Single and Multithreaded Processes "4 Multithreading Benefits • Responsiveness - continue processing while waiting for other I/O, especially in GUIs! • Resource Sharing - because threads share the process’s resources, they may be more e$cient than having di%erent processes communicate via messages! • Economy - thread switching usually faster than process context switching! • Scalability - can run in parallel across multiple processors "5 Concurrency vs. Parallelism • Concurrency - Multiple processes making progress! • If only one CPU, no true parallelism; OS rapidly switches between processes to give illusion of parallelism! • Parallelism - Multiple processes simultaneously making progress! • Modern computers have multiple cores, with true parallelism! • Challenge is to write programs that run in parallel, which is a very hard problem "6 Concurrency vs. Parallelism • Concurrent execution on single core system:! • Concurrent and parallelism on a multicore system: "7 Types of Parallelism • Data Parallelism - distributes subsets of same data across multiple cores; same operation run on each core! • Examples: ray tracer, search engine, BitCoin miner! • Task Parallelism - distributes threads across cores, with each thread performing unique operations! • Examples: web server, database server! • In modern software, programs tend to have both data and task parallelism "8 Amdahl’s Law • Programs have a mix of serial (non-parallel) and parallel components! • Let S = percent of program that must be serial, and N = number of cores! • Example: if process foo is 75% parallel and 25% serial, then by going from 1 to 2 cores, the expected speedup is 1.6x! • As N approaches infinity, speedup approaches S-1 "9 User Threads and Kernel Threads • User Threads - threading handled entirely by user-level threading library; kernel is unaware of any threading! • Examples: Java “green” threads, Stackless Python, Tcl! • Kernel Threads - kernel supported threading! • Supported by nearly all modern OSes (Windows, macOS, Linux, Unix, …) "10 Many-to-One Multithreading Model • Many user-level threads mapped to single kernel thread! • If one thread blocks, entire process blocks! • Only model possible if no kernel support! • Threading library handles scheduling! • Faster to schedule (no context switching)! • Unable to take advantage of multiple processors "11 One-to-One Multithreading Model • Each thread created by user maps to a kernel thread! • OS schedules each thread independently! • Each thread could run on separate processor! • OS may limit number of threads per process! • Used by Windows, Linux, etc. "12 Many-to-Many (Hybrid) Multithreading Model • Many user-level threads map to many kernel threads! • Threading library handles scheduling! • OS creates kernel threads based upon number of processors available! • Much more complex to implement! • Available in Windows 7, older versions of FreeBSD, and in experimental OSes "13 Thread Libraries • Thread Library - provides programmer an API for creating and managing threads! • Threads could be entirely user-space construct (green threads)! • Or library invokes system calls to request OS to create threads! • POSIX Threads (Pthreads) - standard API for threading, used by Linux, macOS, and other Unix-like systems "14 About Pthreads • Pthreads library specifies thread creation, management, synchronizations (next lecture)! • Add #include <pthread.h> to programs! • On Linux, compile and link with -pthread flag! gcc -o foo foo.c -pthread • See man 7 pthreads for an overview! • On Ubuntu, make sure you have installed the manpages-dev package (sudo apt-get install manpages-dev) "15 Thread Creation int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg) • Use to create a thread! • First parameter is address to store new thread’s unique ID! • Second parameter gives thread settings, or set to NULL for default settings! • Third parameter gives starting address for thread! • Fourth parameter is for arbitrary data to be passed into thread! • Return value is 0 on success, or negative on error "16 Starting Routine void *thread_func(void *) { … } • Function where newly created thread will start running! • Fourth parameter of pthread_create() passed as this function’s parameter! • When this function returns, thread will terminate! • OS stores function’s return value until later retrieved! • In Pthreads, the new thread will be begin in the ready state (assuming thread creation was successful)! • In Windows, two-step process (create thread, and then start running it) "17 Joining Threads int pthread_join(pthread_t thread, void **retval); • Just as with processes, Linux stores a thread’s return status, until something later reaps the thread! • In Linux, threads and processes are implemented uniformly as tasks! • This function blocks until target thread terminates! • First parameter is target thread to join! • Second parameter gives location to store return status, or NULL to ignore return status "18 Example Threading Code #1 #include <pthread.h> #include <stdio.h> static void *example1(void *args) { printf("In thread\n"); return NULL; } int main(void) { pthread_t tid; pthread_create(&tid, NULL, example1, NULL); printf("In main thread\n"); pthread_join(tid, NULL); printf("After join\n"); return 0; } "19 Example Threading Code #2 #include <pthread.h> #include <stdio.h> static void *example2(void *args) { int my_ID = *(int *)(args); printf("I am thread %d\n", my_ID); return NULL; } int main(void) { int i, data[4]; pthread_t tids[4]; for (i = 0; i < 4; i++) { data[i] = i; pthread_create(tids + i, NULL, example2, data + i); } for (i = 0; i < 4; i++) { pthread_join(tids[i], NULL); } printf("After join\n"); return 0; } "20 Example Threading Code #3 #include <pthread.h> #include <stdio.h> static void *example2(void *args) { int my_ID = *(int *)(args); printf("I am thread %d\n", my_ID); return NULL; } int main(void) { int i, data[4]; pthread_t tids[4]; for (i = 0; i < 4; i++) { pthread_create(tids + i, NULL, example2, &i); } for (i = 0; i < 4; i++) { pthread_join(tids[i], NULL); } printf("After join\n"); return 0; } "21 Implicit Threading • Creation and thread management automatically done by compilers and run- time libraries, instead of programmers! • Examples:! • Thread pools! • OpenMP! • Grand Central Dispatch "22 Thread Pools • Pre-create many threads, in a pool awaiting work! • Usually faster to service request when thread already exists, rather than creating a new thread each time! • Number of threads can be customized based upon application workload, amount of RAM, number of CPUs! • Starting with Windows Vista, a process can push work into a thread pool! • Within Linux kernel code, lower-priority tasks run within workqueues "23 OpenMP • Compiler directives that will #include <stdio.h> automatically parallelize #include <math.h> loops! int main(void) { int i; • Creates threads when double vals[1000000], sum = 0; compiler detects parallel #pragma omp parallel for regions! for (i = 0; i < 1000000; i++) { vals[i] = sqrt(i) + logf(i + 1); } • Available for C, C++, Fortran, running on modern /* this loop cannot be optimized */ operating OSes! for (i = 0; i < 1000000; i++) { sum += vals[i]; } • Used in scientific computing printf("Sum is %f\n", sum); return 0; } "24 Grand Central Dispatch • Designed by Apple for macOS and iOS, but has been ported to other OSes! • Programmer designates parts of their program that can be run in parallel! • GCD automatically dispatches that code into thread(s)! • GCD abstracts thread creation and joining! • OS determines how many threads to create based upon system load "25.

Load more