
POSIX Threads
David McCaughan, HPC Analyst
SHARCNET, University of Guelph
[email protected]

Processes
• DEF'N: a process is a program in execution, as seen by the operating system
• A process imposes a specific set of management responsibilities on the OS
  – a unique virtual address space (stack, heap, code)
  – file descriptor table
  – program counter
  – register states, etc.
[Figure: Conceptual View of a Process (simplified): virtual address space (stack, heap), file descriptor table, program counter, register states]

HPC Resources

Threads
• DEF'N: a thread is a sequence of executable code within a process
• A serial process can be seen, at its simplest, as a single thread (a single "thread of control")
  – represented by the program counter
  – sequence (increment PC), iteration/conditional branch (set value of PC)
• In terms of record-keeping, only a small subset of a process is relevant when considering a thread
  – register states; program counter
[Figure: Conceptual View of a Thread (simplified): a thread (program counter, register states) inside the process's virtual address space]

Issues for Parallelization
• What does this have to do with HPC exactly?
  – multiprocessor machines are now relatively common in the consumer market
  – multi-core machines are already available, and are set to become commodity computing resources in the immediate future
• How do you take advantage of multiple CPUs/cores in a single system image?
  – multi-programming:
    • start multiple processes to perform work simultaneously
  – multi-threading:
    • introduce multiple threads of control within a single process to perform work simultaneously

Multi-programming
• Distribute work by spawning multiple processes
  – e.g. fork(2), exec(2)
• Advantages
  – conceptually simple: each process is completely independent of others
  – process functionality can be highly cohesive
  – easily distributed
• Disadvantages
  – high resource cost
  – sharing file descriptors requires care and effort
[Figure: process_1, process_2, process_3]

Multi-threading
• Distribute work by defining multiple threads to do the work
  – e.g. pthreads
• Advantages
  – all process resources are implicitly shared (memory, file descriptors, etc.)
  – overhead incurred to manage multiple threads is relatively low
  – looks much like serial code
• Disadvantages
  – all data being implicitly shared creates a world of hammers, and your code is the thumb
  – exclusive access, contention, etc.
[Figure: thread_1, thread_2, thread_3]

Threaded code on a Single Processor
• Multi-threading your code is not strictly an issue for multi-processor/core systems
  – even on a single processor system, it is possible to interleave work and improve performance
    • efficient handling of asynchronous events
    • allow computation to occur during long I/O operations
    • permit fine grained scheduling of different operations performed by the same process
  – provides "upward compatibility"
    • threading a suitable piece of code does not break it on a single processor system, but has built-in potential for speed-up if multiple processors/cores are available

Pthreads
• There have been a number of threading models historically (some of which are still used)
  – mach threads, etc.
• Lack of standards led the IEEE to define the POSIX 1003.1c standard for threading
  – POSIX threads, or Pthreads
• Note:
  – threads are peers
  – POSIX threads run in user space
    • contrast with what would be implied by kernel threads
    • trade-offs?

Pthreads vs. OpenMP
• OpenMP is a language extension for parallel programming in an SMP environment
  – allows the programmer to define "parallel regions" in code which are executed in separate threads (typically found around loops with no loop-carried dependencies)
  – the details of the thread creation are hidden from the user
• OpenMP is considered fairly easy to learn and use, so why bother with Pthreads at all?
  1. Right tool, right job: if OpenMP will service your needs you should be using it
  2. OpenMP supports parallelism in a very rigid sense and lacks versatility
     – Pthreads allows far more complex parallel approaches which would be difficult or impossible to implement in OpenMP

Pthreads Programming Basics
• Include Pthread header file
  – #include <pthread.h>
• Compile with Pthreads support/library
  – cc -pthread …
    • compiler vendors may differ in their usage to support pthreads (link a library, etc.)
    • GNU, Intel and Pathscale use the -pthread argument so this will suffice for our purposes
    • when in doubt, consult the man page for the compiler in question

Pthreads Programming Basics (cont.)
• Note that all processes have an implicit "main thread of control"
• We need a means of creating a new thread
  – pthread_create()
• We need a way of terminating a thread
  – threads are terminated implicitly when the function that was the entry point of the thread returns, or can be explicitly terminated using pthread_exit()
• We may need to distinguish one thread from another at run-time
  – pthread_self(), pthread_equal()

pthread_create
    int pthread_create
    (
        pthread_t *thread,
        const pthread_attr_t *attr,
        void *(*f_start)(void *),
        void *arg
    );
• Create a new thread with a function as its entry point
• thread
  – handle (ID) for the created thread
• attr
  – attributes for the thread (if not NULL)
• f_start
  – pointer to the function that is to be called first in the new thread
• arg
  – the argument provided to f_start when called
  – consider: how would you provide multiple arguments to a thread?
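
One common answer to the "multiple arguments" question is to bundle the values in a struct and pass its address as arg. The sketch below is only an illustration under that assumption: struct range_args, sum_range() and sum_in_thread() are hypothetical names, not part of the slides or of the Pthreads API. It also waits for the thread with pthread_join(), so the struct (which lives on the caller's stack) stays valid for as long as the thread uses it.

```c
#include <pthread.h>

/* Hypothetical argument bundle; the field names are illustrative only. */
struct range_args {
    int  lo;   /* first value to add */
    int  hi;   /* last value to add (inclusive) */
    long sum;  /* result, written by the thread */
};

/* Entry point with the signature pthread_create() expects. */
static void *sum_range(void *arg)
{
    struct range_args *a = arg;   /* recover the struct from the void pointer */
    int i;

    a->sum = 0;
    for (i = a->lo; i <= a->hi; i++)
        a->sum += i;
    return NULL;
}

/*
 * Run sum_range() in a new thread and wait for it to finish.
 * Returns the sum, or -1 if the thread could not be created or joined.
 */
long sum_in_thread(int lo, int hi)
{
    struct range_args args = { lo, hi, 0 };
    pthread_t t;

    if (pthread_create(&t, NULL, sum_range, &args) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)   /* args must outlive the thread */
        return -1;
    return args.sum;
}
```

Called from main() as sum_in_thread(1, 10), this should return 55. The same struct can carry results back to the caller, which avoids squeezing a value through the thread's exit status.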
pthread_self, pthread_equal
    pthread_t pthread_self
    (
        void
    );
• Returns the handle of the calling thread
• This value can be saved for later use in identifying a thread

    int pthread_equal
    (
        pthread_t t1,
        pthread_t t2
    );
• Compares two thread handles for equality
• e.g. conditional execution based on which thread is in a given section of code

pthread_exit
    void pthread_exit
    (
        void *status
    );
• Terminates the calling thread and performs any necessary clean-up
• status
  – exit status of the thread
  – made available to any join with the terminated thread
• Note:
  – if pthread_exit() is not called explicitly, the exit status of the thread is the return value of f_start

Synchronization
• There are several situations that arise where synchronization between threads is important
  – execution dependency
    • thread(s) must wait for other threads to complete their work before proceeding
  – mutual exclusion
    • a shared data structure must be protected from simultaneous modification by multiple threads
  – critical sections
    • a region of code must be executed by only one thread at a time
• Pthreads provides support for handling each of these situations

pthread_join
    int pthread_join
    (
        pthread_t thread,
        void **status
    );
• Suspends execution of the current thread until the specified thread is complete
• thread
  – handle (ID) of the thread we are waiting on to finish
• status
  – value returned by f_start, or provided to pthread_exit() (if not NULL)
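
The calls described above can be combined in a short sketch: a thread compares its own handle against a saved one with pthread_equal(), hands the result to pthread_exit(), and the creating thread picks that status up through pthread_join(). The names main_handle, worker() and run_worker() are hypothetical and chosen only for this illustration; passing a small integer through the void * status via intptr_t is one common convention, not the only option.

```c
#include <pthread.h>
#include <stdint.h>

/* Handle of the thread that calls run_worker(); saved for later comparison. */
static pthread_t main_handle;

/* Thread entry point: reports whether it is running in the saved thread. */
static void *worker(void *arg)
{
    int is_main;

    (void)arg;  /* unused */

    /* pthread_equal() returns non-zero when both handles name the same thread */
    is_main = pthread_equal(pthread_self(), main_handle) ? 1 : 0;

    /* Terminate explicitly; the status is made available to pthread_join(). */
    pthread_exit((void *)(intptr_t)is_main);
    return NULL;  /* not reached */
}

/*
 * Create one worker, wait for it, and return its exit status
 * (0 is expected, since the worker is not the calling thread).
 * Returns -1 if the thread could not be created or joined.
 */
int run_worker(void)
{
    pthread_t t;
    void *status;

    main_handle = pthread_self();
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_join(t, &status) != 0)
        return -1;
    return (int)(intptr_t)status;
}
```

Note that pthread_join() takes the handle by value and writes a full pointer-sized status, which is why status is declared as void * here rather than int.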
Example: join.c
    …
    /* note: intptr_t requires <stdint.h> */
    void *fact(void *n)
    {
        int i, prod = 1;

        for (i = 1; i <= *(int *)n; i++)
            prod *= i;

        /* return the result through the thread's exit status */
        return (void *)(intptr_t)prod;
    }

Example: join.c (cont.)
    …
    int main()
    {
        int i1 = 5, i2 = 2;
        void *r1, *r2;
        pthread_t t1, t2;

        pthread_create(&t1, NULL, fact, &i1);
        pthread_create(&t2, NULL, fact, &i2);

        /*
         * resulting sum must wait for results
         */
        pthread_join(t1, &r1);
        pthread_join(t2, &r2);

        printf("Sum of fact = %d\n", (int)(intptr_t)r1 + (int)(intptr_t)r2);

        return(0);
    }

Fundamental coding issues
• There are a couple of issues that tend to catch novice pthreads programmers (although not strictly pthread issues):
  – don't use a loop index directly as "thread id" passed to thread
    • providing a pointer to the loop index variable as the thread argument will result in all threads having a pointer to the same integer
    • where this is necessary, use an array of integers
      – store loop index in position i, pass address of that array entry to thread
      – not usually inconvenient as it's likely you're storing the thread handles in an array as well
  – don't allow main thread to exit while child threads are still pending
    • may appear that threads are doing nothing (or terminating early)
    • most implementations will terminate child threads when parent thread exits
    • use pthread_join to have main() wait for thread completion
• The following "Hello, world!" example illustrates both of these…

Example: "Hello, world!"
    #include <stdlib.h>
    #include <stdio.h>
    #include <pthread.h>

    void *output(void *);

    int main(int argc, char *argv[])
    {
        int id, nthreads = atoi(argv[1]);
        int id_array[nthreads];
        pthread_t thread[nthreads];

        /*
         * note: in order for a thread to receive an integer "id" which is
         * its thread number, we can't pass &id to every one...as that would
         * be a pointer to the same variable and the number would change
         */
        for (id = 0; id < nthreads; id++)
        {
            id_array[id] = id;
            pthread_create(&thread[id], NULL, output, &id_array[id]);
        }
        …

Example (cont.)
        …
        for (id = 0; id < nthreads; id++)
            pthread_join(thread[id], NULL);

        return(0);
    }

    void *output(void *thread_num)
    {
        printf("Hello, world! from thread %d\n", *((int *)thread_num));
        return NULL;
    }

Exercise: Threaded "Hello, world!"
The purpose of this exercise is to allow you to work with simple thread operations, and begin to consider some of the issues that arise with basic pthread functionality.

Exercise
1) The thello.c file in ~dbm/public/exercises/pthreads is a copy of the one used in

Exercise (cont.)
• Compile and run this program with