Introduction to Parallel Programming
Alberto Bosio, Associate Professor, UM Microelectronics Department – [email protected]

Definitions
- What is parallel programming? Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem.

Serial vs parallel
- Serial computing: traditionally, software has been written for serial computation:
  - to be run on a single CPU;
  - a problem is broken into a discrete series of instructions;
  - instructions are executed one after another;
  - only one instruction may execute at any moment.
- Parallel computing: simultaneous use of multiple compute resources to solve a computational problem:
  - to be run using multiple CPUs;
  - a problem is broken into discrete parts, solved concurrently;
  - each part is broken down into a series of instructions;
  - instructions from each part execute simultaneously on different CPUs.

Why parallel computing?
- Save time and/or money
- Solve larger problems
- Provide concurrency
- Use non-local resources
- Limits to serial computing, both physical and practical: transmission speeds, miniaturization, economic limitations

Shared Memory
- All processors access all memory as a global address space.
- CPUs operate independently but share memory resources; changes made to a memory location by one processor are visible to all other processors.
- Two main classes based upon memory access times: UMA and NUMA.

Uniform Memory Access (UMA)
- Identical processors
- Equal access and access times to memory
- Cache coherent: if one processor updates a location in shared memory, all the other processors know about the update.

Non-Uniform Memory Access (NUMA)
- Often made by physically linking two or more SMPs
- One SMP can directly access the memory of another SMP
- Not all processors have equal access time to all memories
- Memory access across the link is slower

Shared Memory: advantages and disadvantages
- Advantages: the global address space is easy to program; data sharing is fast and uniform due to the proximity of memory to the CPUs.
- Disadvantages: lack of scalability between memory and CPUs; the programmer is responsible for synchronization.

Distributed Memory
- A communication network connects the processors' memories; each processor has its own local memory.
- There is no concept of a global address space.
- CPUs operate independently; changes to local memory have no effect on the memory of other processors.
- Cache coherency does not apply.
- Data communication and synchronization are the programmer's responsibility.
- The interconnect used for data transfer varies (e.g., Ethernet).
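To make the shared-memory model described above concrete, here is a minimal sketch using OpenMP. OpenMP is not part of these slides; it is only one common way to program a shared-memory machine, so treat the example as an illustration under that assumption. All threads see the same array and the same accumulator, and the atomic directive supplies the synchronization that the slides note is the programmer's responsibility.

/* Shared-memory sketch (assumes an OpenMP-capable compiler,
 * e.g. gcc -fopenmp sum.c). All threads share the array and the
 * accumulator; without the atomic directive the concurrent updates
 * to "sum" would be a data race. */
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = 1.0;                  /* shared data, visible to every thread */

    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        #pragma omp atomic           /* synchronize concurrent updates to sum */
        sum += a[i];
    }

    printf("sum = %f\n", sum);       /* expected: 1000000.0 */
    return 0;
}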
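For the distributed-memory model, a minimal MPI sketch (again an assumption, since the slides prescribe no particular library) shows that each process owns its own memory and that data moves only through explicit messages over the interconnect.

/* Distributed-memory sketch (assumes an MPI installation;
 * build with mpicc msg.c, run with mpirun -np 2 ./a.out).
 * Each rank has private memory; data is exchanged only by
 * explicit send/receive calls. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                               /* exists only in rank 0's local memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);    /* copy arrived over the network */
    }

    MPI_Finalize();
    return 0;
}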