UNIT – IV NATIVE PROGRAMMING and SOFTWARE APPLICATIONS Desktop Supercomputing
CS693 Grid Computing, Unit - IV
MTech CSE (PT, 2011-14), SRM, Ramapuram, hcr:innovationcse@gg

Desktop supercomputing – parallel computing – parallel programming paradigms – problems of current parallel programming paradigms – Desktop supercomputing programming paradigms – parallelizing existing applications – Grid enabling software applications – Needs of the Grid users – methods of Grid deployment – Requirements for Grid enabling software – Grid enabling software applications

Desktop Supercomputing: Parallel Computing – Historical Background

MIMD Computers
The language of desktop supercomputing, CxC, combines the advantages of C, Java, and Fortran and is designed for MIMD architectures. Historically, any parallel computer not following the SIMD approach (one program on one processor controls all others) automatically fell into the MIMD category.

Parallel Asynchronous Hardware Architectures
List of popular MIMD hardware architectures:
o Symmetric Multiprocessing Systems (SMP)
o Massively Parallel Processing Systems (MPP)
o Cluster computers
o Proprietary supercomputers
o Cache-Coherent Non-Uniform Memory Access (CC-NUMA) computers
o Blade servers
o Clusters of blade servers

MIMD computer classification
o Single-Node/Single-Processor (SNSP)
o Single-Node/Multiple-Processors (SNMP)
o Multiple-Node/Single-Processor (MNSP)
o Multiple-Node/Multiple-Processor (MNMP)

Single-Node/Single-Processor (SNSP)
o also known as von Neumann computers
o same as Flynn’s Single-Instruction-Single-Data (SISD) category

Single-Node/Multiple-Processors (SNMP)
o shared-memory computers with multiple processors within the same node accessing the same memory
o Representatives are blade servers, symmetric multiprocessing systems (SMP), CC-NUMA architectures, and other custom-made high-performance computers.
o Array and vector computers (SIMD) would also fall into this category.

Multiple-Node/Single-Processor (MNSP)
o distributed-memory computers, represented by a network of workstations

Multiple-Node/Multiple-Processor (MNMP)
o multiple shared-memory computers (SNMPs) connected by a network
o MNMPs are a loosely coupled cluster of closely coupled nodes.
o Typical representatives of loosely coupled shared-memory computers are SMP clusters or clusters of blade servers.

Parallel Programming Paradigms
o Single Node Single Processor (SNSP)
o Single Node Multi Processor (SNMP)
o Multi Node Single Processor (MNSP)
o Multi Node Multi Processor (MNMP)

Single Node Single Processor (SNSP)
o Preemptive multitasking is used as the parallel processing model.
o All processes share the same processor, which spends only a limited amount of time on each process, so their execution appears to be quasi-parallel.
o The local memory can usually be accessed by all threads/processes during their execution time.

Single Node Multi Processor (SNMP)
o based on the symmetric multiprocessing hardware architecture
o Shared memory is used as the parallel processing model.
o Each processor works on processes truly in parallel, and each process can access the shared memory of the compute node.

Data access in SNMP computer
o The processors, connected through a high-speed connection fabric, represent a single shared-memory system.
o The OS has been parallelized so that each processor can access the system memory at the same time.
o The shared-memory programming model is easy to use: all processors are able to run a partitioned version of sequential algorithms created for single-processor systems.

(Figure: Shared Memory Paradigm in Symmetric Multi Processing)

Disadvantage of SNMP systems: scalability is limited to a small number of processors.
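The shared-memory model described above, in which each processor runs a partitioned version of a sequential algorithm against the same memory, can be sketched as follows. This is a minimal illustration in Python, not code from the course material: the function name `partitioned_sum` and the `num_workers` parameter are hypothetical, and Python threads only stand in for the processors of an SNMP node (in CPython they are not guaranteed to execute truly in parallel).

```python
import threading


def partitioned_sum(data, num_workers=4):
    """Sum `data` by assigning each worker thread one contiguous index range."""
    partial = [0] * num_workers  # shared result array, one slot per worker
    chunk = (len(data) + num_workers - 1) // num_workers  # ceiling division

    def worker(rank):
        # Each worker reads its index range of the shared input directly;
        # nothing is sent or received, which is the point of the model.
        start = rank * chunk
        end = min(start + chunk, len(data))
        partial[rank] = sum(data[i] for i in range(start, end))

    threads = [
        threading.Thread(target=worker, args=(rank,))
        for rank in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every partial result has been written
    return sum(partial)


if __name__ == "__main__":
    print(partitioned_sum(list(range(1, 101))))
```

The sequential algorithm (a running sum) is unchanged inside each worker; only the index range differs. Each worker writes to its own slot of `partial`, so no locking is needed in this sketch.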
The scalability limitations of SNMP systems stem from system design and include bottlenecks in the memory connection fabric or I/O channels, because every memory and I/O request has to go through the connection or I/O fabric.

o Methods for shared-memory (asynchronous) parallelism are OpenMP, Linda, and Global Arrays (GA).
o A method for data-parallel (synchronous) parallelism is High Performance Fortran (HPF).

Multi Node Single Processor (MNSP)
The programming model for standard MNSP (distributed-memory) computers, such as clusters and MPP systems, usually involves a message-passing model.

Message-Passing Model
Parallel programs must explicitly specify communication functions on the sender and the receiver sides. When data needed in a computation are not present on the local computer, they are obtained by
o issuing a send function at the remote computer holding the data
o issuing a receive function at the local computer

The process for passing information from one computer to another via the network includes the following steps:
o Data is transferred from the running application to a device driver.
o The device driver assembles the message into packets, which are then sent through networks and cables to the receiving computer.
o On the receiving computer’s side, the mirrored receiving process has to be initiated: the application triggers “wait for receiving a message,” using the device driver.
o Finally, the message arrives in packets, is reconstructed, and is handed over to the waiting application.
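The explicit send/receive pairing described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not a real cluster program: two threads stand in for the local and the remote computer, `queue.Queue` objects stand in for the network, and the names `remote_node` and `fetch_remote`, the sample data, and the initial request message are all hypothetical.

```python
import queue
import threading


def remote_node(inbox, outbox, local_data):
    """Node holding the data: waits for a request, then sends the value back."""
    key = inbox.get()            # receive: blocks until a request arrives
    outbox.put(local_data[key])  # send: reply with the requested value


def fetch_remote(key):
    """Local node: asks the remote node for `key` and waits for the reply."""
    to_remote = queue.Queue()  # channel local -> remote
    to_local = queue.Queue()   # channel remote -> local
    remote = threading.Thread(
        target=remote_node,
        args=(to_remote, to_local, {"x": 42, "y": 7}),  # hypothetical remote data
    )
    remote.start()
    to_remote.put(key)               # explicit send on the local side
    value = to_local.get(timeout=5)  # explicit receive; the timeout avoids waiting forever
    remote.join()
    return value


if __name__ == "__main__":
    print(fetch_remote("x"))
```

Every `put` needs a matching `get` on the other side. If either side omits its call, the partner blocks, which is why the sketch uses a timeout on the local receive.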
(Figures: Message Passing Paradigm; Distributed Memory Architecture)

Disadvantages of the Message-Passing Model
o time delay/loss
   o due to waiting on both transmitting and receiving ends
o synchronization problems, such as deadlocks
   o applications can wait indefinitely if a sender sends data to a remote computer that is not ready to receive it
o data loss
   o each sender needs a complementing receiver (if one fails, data gets lost)
o difficult programming
   o algorithms have to be specifically programmed for the message-passing model
   o sequential algorithms cannot be reused without significant changes

Distributed Shared Memory (DSM) or Virtual Shared Memory
o An approach used to overcome the difficult-to-program problem of the message-passing model.
o It simulates the shared-memory model on top of a distributed-memory environment.
o It provides the function of shared memory, even though physical memory is distributed among the nodes.
o Disadvantage: loss of performance. Every time a processor tries to access data in a remote computer, the local computer performs message passing of a whole memory page. This leads to huge network traffic, and the network becomes such a significant bottleneck that the decreased performance is unacceptable for most applications.

Problems Of Current Parallel Programming Paradigms
o Parallel programming is a new art for most software developers.
o Parallel computers are very expensive and not available to most software developers.
o Complexity of parallel programming:
   o With different architectures, there are also different parallel programming paradigms.
   o There is no satisfying model for Multiple-Node/Multiple-Processor computers.
   o Simulated shared memory leads to unacceptable performance for the application.
o The Shared-Memory Programming Model and the Message-Passing Programming Model both offer advantages and disadvantages.
   o Algorithms implemented in one model have to be reprogrammed with significant effort to run under the other model.
   o There is no effective parallel processing paradigm that works for both SNMP and MNSP systems.
   o Mixing models is unacceptable because the resulting program complexity makes maintenance difficult.
   o The complexity of programming and the lack of a standardized programming model for hybrid compute clusters is an unsatisfying and unacceptable situation.

Desktop supercomputing programming paradigms

Connected Memory Paradigm
o It allows developers to focus on building parallel algorithms by creating a virtual parallel computer consisting of virtual processing elements.
o It effectively maps, distributes, and executes programs on any available physical hardware.
o It maps the virtual parallel computer to the available physical hardware, so algorithms are created independent of any particular architecture.

Desktop Supercomputing makes the parallelization process simple, even for complex problems. It enables:
o Ease of programming: The language CxC allows developers to design algorithms by defining a virtual parallel computer instead of having to fit algorithms into the boundaries and restrictions of a real computer.
o Architecture independence: Executables run on any of the following architectures without modification: SNMP, MNSP, or MNMP. Today, developers use shared memory on SNMP and message passing on MNSP architectures; the two are distinctly different, so rewriting a program for the other architecture requires significant effort.
o Scalability: Developers can create programs on small computers and run these same programs on a cluster of hundreds or thousands of connected computers. This scalability allows testing of algorithms in a laboratory environment and tackling problems of sizes not previously solvable.
o Enhancement: It has the ability to unleash the performance of MNMP computers, which have the best performance/price ratio of all parallel computers.
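The idea of writing a program for virtual processing elements and letting a runtime map them onto whatever physical hardware is available can be sketched as follows. This is a rough Python analogy only: it is not CxC syntax and makes no claim about how the CxC runtime works. The names `run_virtual_machine`, `square`, and `physical_workers` are hypothetical, and a thread pool stands in for the physical processors.

```python
from concurrent.futures import ThreadPoolExecutor


def run_virtual_machine(num_virtual_pes, program, physical_workers=2):
    """Run `program(pe_id)` once per virtual processing element (PE).

    The caller only states how many virtual PEs exist; the pool decides
    which physical worker executes each one.
    """
    with ThreadPoolExecutor(max_workers=physical_workers) as pool:
        # map() returns results in PE order regardless of which worker ran them
        return list(pool.map(program, range(num_virtual_pes)))


def square(pe_id):
    """Hypothetical per-PE program: each virtual PE squares its own id."""
    return pe_id * pe_id


if __name__ == "__main__":
    print(run_virtual_machine(8, square, physical_workers=2))
    print(run_virtual_machine(8, square, physical_workers=4))
```

The program `square` is written once against eight virtual PEs. Changing `physical_workers` changes only how the work is mapped, not the program or its result, which mirrors the architecture-independence and scalability points in the list above.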
Desktop Supercomputing with CxC offers the advantages of message passing (using distributed-computing solutions) together with the easier programmability of shared memory.

Parallel Programming in CxC
CxC is the language of Desktop Supercomputing.