Parallel and Distributed Computing
Chapter 1: Introduction to Parallel Computing
Jun Zhang
Department of Computer Science, University of Kentucky, Lexington, KY 40506
1.1a: von Neumann Architecture
- Common machine model for over 70 years
- Stored-program concept
- The CPU executes a stored program: a sequence of read and write operations on memory
- The order of operations is sequential
1.1b: A More Detailed Architecture based on the von Neumann Model
1.1c: Old von Neumann Computer
1.1d: CISC von Neumann Computer
- CISC stands for Complex Instruction Set Computer; it uses a single bus system
- The Harvard (RISC) architecture uses two buses: separate buses for instructions and data
- RISC stands for Reduced Instruction Set Computer
- Both are SISD machines: a Single Instruction stream operating on a Single Data stream
1.1e: Personal Computer
1.1f: John von Neumann
- December 28, 1903 – February 8, 1957
- Hungarian mathematician
- Mastered calculus at age 8; doing graduate-level math at 12
- Received his Ph.D. at 23
- His proposal to his first wife: “You and I might be able to have some fun together, seeing as how we both like to drink.”
1.2a: Motivations for Parallel Computing
- Fundamental limits on single processor speed
- Heat dissipation from CPU chips
- Disparity between CPU & memory speeds
- Distributed data communications
- Need for very large scale computing platforms
1.2b: Fundamental Limits – Cycle Speed
Intel 8080            2 MHz      1974
ARM 2                 8 MHz      1986
Intel Pentium Pro     200 MHz    1996
AMD Athlon            1.2 GHz    2000
Intel QX6700          2.66 GHz   2006
Intel Core i7 3770K   3.9 GHz    2012

Speed of light: 30 cm in 1 ns; an electrical signal travels about 10 times slower.
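As a rough sanity check (assuming a 3 GHz clock for round numbers; this figure is not from the slide), the distance a signal can cover in one clock cycle is already close to chip dimensions:

\[
t_{\text{cycle}} = \frac{1}{3\ \text{GHz}} \approx 0.33\ \text{ns}, \qquad
d_{\text{signal}} \approx \frac{30\ \text{cm/ns}}{10} \times 0.33\ \text{ns} \approx 1\ \text{cm}.
\]

A signal cannot cross a chip much larger than a centimeter within a single cycle, which is one fundamental limit on clock speed.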
1.2c: High-End CPU is Expensive
The price of a high-end CPU rises sharply with performance.

Figure: Intel processor price/performance
1.2d: Moore’s Law
- Moore’s observation in 1965: the number of transistors per square inch on integrated circuits had doubled every year since the integrated circuit was invented
- Moore’s revised observation in 1975: the pace had slowed, with density doubling approximately every two years (popularly quoted as every 18 months)
- How about the future? (Does the price of computing power fall by half every 18 months?)
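As a rule of thumb, the revised observation can be written as a doubling law (the exact doubling period \(T\) is the only assumption here):

\[
N(t) \approx N_0 \cdot 2^{\,t/T}, \qquad T \approx 1.5\ \text{to}\ 2\ \text{years},
\]

where \(N_0\) is the transistor count in a reference year and \(N(t)\) is the count \(t\) years later.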
1.2e: Moore’s Law – Held for Now
1.3: Power Wall Effect in Computer Architecture
- Too many transistors in a given chip die area
- Tremendous increase in power density
- Increased chip temperature
- High temperature slows down the transistor switching rate and the overall speed of the computer
- The chip may melt down if not cooled properly
- Efficient cooling systems are expensive
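The standard first-order model of dynamic (switching) power makes the wall concrete:

\[
P_{\text{dynamic}} \approx \alpha\, C\, V^{2} f,
\]

where \(\alpha\) is the activity factor, \(C\) the switched capacitance, \(V\) the supply voltage, and \(f\) the clock frequency. Because raising \(f\) usually also requires raising \(V\), power grows much faster than linearly with clock rate, which is why frequency scaling stalled.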
1.3: Cooling Computer Chips
Some people have suggested putting computer chips in liquid nitrogen to cool them.
1.3: Solutions
- Use multiple inexpensive processors
- Use a processor with multiple cores
1.3: A Multi-core Processor
1.3a: CPU and Memory Speeds
- In 20 years, CPU speed (clock rate) has increased by a factor of 1000
- DRAM speed has increased by a factor of less than 4
- How can data be fed fast enough to keep the CPU busy?
- CPU cycle time: 1–2 ns; cache: ~10 ns; DRAM: 50–60 ns — so a single DRAM access can cost tens of CPU cycles
1.3b: Memory Access and CPU Speed
1.3b: CPU, Memory, and Disk Speed
1.3c: Possible Solutions
- A hierarchy of successively faster memory devices (multilevel caches)
- Locality of data and code references (see the sketch below)
- Efficient programming can be an issue
- Parallel systems may provide: 1) a larger aggregate cache; 2) higher aggregate bandwidth to the memory system
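The following minimal C sketch illustrates locality of reference (the matrix size N is an arbitrary choice for illustration): traversing a matrix row by row touches consecutive addresses and is cache-friendly, while traversing it column by column jumps N doubles per step and typically runs several times slower, even though both loops do identical arithmetic.

```c
#include <stdio.h>
#include <time.h>

#define N 4096

/* Same arithmetic, different memory access order: the row-major loop
   walks consecutive addresses (cache-friendly); the column-major loop
   jumps N doubles per step (cache-hostile). */
static double a[N][N];

int main(void) {
    double sum;
    clock_t t0;

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = 1.0;

    sum = 0.0;
    t0 = clock();
    for (int i = 0; i < N; i++)        /* row-major traversal */
        for (int j = 0; j < N; j++)
            sum += a[i][j];
    printf("row-major:    %.3f s (sum = %.0f)\n",
           (double)(clock() - t0) / CLOCKS_PER_SEC, sum);

    sum = 0.0;
    t0 = clock();
    for (int j = 0; j < N; j++)        /* column-major traversal */
        for (int i = 0; i < N; i++)
            sum += a[i][j];
    printf("column-major: %.3f s (sum = %.0f)\n",
           (double)(clock() - t0) / CLOCKS_PER_SEC, sum);

    return 0;
}
```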
1.3f: Multilevel Hierarchical Cache
1.4a: Distributed Data Communications
- Data may be collected and stored at different locations
- It is expensive to bring the data to a central location for processing
- Many computing tasks may be inherently parallel
- Privacy issues arise in data mining and other large-scale commercial database manipulations
1.4b: Distributed Data Communications
1.4c: Move Computation to Data (CS626: Large Scale Data Science)
1.5a: Why Use Parallel Computing
- Save time: reduce wall-clock time by having many processors work together (quantified below)
- Solve larger problems: larger than a single processor’s CPU and memory can handle
- Provide concurrency: do multiple things at the same time, e.g., online access to databases, search engines
- Google’s 4,000 PC servers were once one of the largest clusters in the world (a server farm)
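The time savings can be quantified with the usual definitions of speedup and efficiency (standard textbook definitions, not specific to this course):

\[
S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p},
\]

where \(T_1\) is the wall-clock time on one processor and \(T_p\) the wall-clock time on \(p\) processors; ideally \(S(p) = p\) and \(E(p) = 1\).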
1.5a: Google’s Data Center
1.5b: Other Reasons for Parallel Computing
- Taking advantage of non-local resources: using computing resources on a wide-area network, or even the Internet (grid or cloud computing)
- Cost savings: using multiple “cheap” computing resources instead of a high-end CPU
- Overcoming memory constraints: for large problems, using the memories of multiple computers may overcome the memory-constraint obstacle
1.6a: Need for Large Scale Modeling
- Long-term weather forecasting
- Large-scale ocean modeling
- Oil reservoir simulations
- Car and airplane manufacturing
- Semiconductor simulation
- Pollution tracking
- Large-scale commercial databases
- Aerospace (NASA microgravity modeling)
1.6b: Semiconductor Simulation
- Before 1975, an engineer had to make several runs through the fabrication line until a successful device was fabricated
- Device dimensions have shrunk below 0.1 micrometer
- A fabrication line costs about $1 billion to build
- A design must be thoroughly verified before it is committed to silicon
- A realistic simulation of one diffusion process may take days or months to run on a workstation
- Chip prices drop quickly after entering the market
1.6b: Semiconductor Diffusion Process
1.6c: Drug Design
- Most drugs work by binding to a specific site, called a receptor, on a protein
- A central problem is to find molecules (ligands) with high binding affinity
- We need to accurately and efficiently estimate electrostatic forces in molecular and atomic interactions (see below)
- Drug–protein binding energies are calculated using quantum mechanics, statistical mechanics, and simulation techniques
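For orientation (this is the generic pairwise Coulomb model, not necessarily the exact formulation used in drug-design codes), the electrostatic energy of \(N\) point charges is

\[
U = \frac{1}{4\pi\varepsilon_0} \sum_{i<j} \frac{q_i q_j}{r_{ij}},
\]

a sum over \(O(N^2)\) pairs. For large molecular systems this pair sum dominates the cost of each simulation step, which is a major reason such computations are parallelized.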
1.6d: Computing Protein Binding
1.6e: Computer-Aided Drug Design
1.7: Issues in Parallel Computing
- Design of parallel computers
- Design of efficient parallel algorithms
- Methods for evaluating parallel algorithms
- Parallel computer languages
- Parallel programming tools
- Portable parallel programs
- Automatic programming of parallel computers
- Education in the philosophy of parallel computing
1.8 Eclipse Parallel Tools Platform
- A standard, portable parallel integrated development environment that supports a wide range of parallel architectures and runtime systems (IBM)
- A scalable parallel debugger
- Support for the integration of a wide range of parallel tools
- An environment that simplifies end-user interaction with parallel systems
1.9 Message Passing Interface (MPI)
- We will use MPI for hands-on experience in this class
- The Message Passing Interface can be downloaded from an online website (MPICH)
- Parallel computing can be simulated on your own computers (Microsoft MPI)
- UK can provide distributed computing services for research purposes, for free
1.10 Cloud Computing
- Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided over the Internet
- Users need not have knowledge of, expertise in, or control over the technology infrastructure in the “cloud” that supports them
- Compare to: grid computing (a cluster of networked, loosely coupled computers), utility computing (packaging of computing resources as a metered service), and autonomic computing (computer systems capable of self-management)
1.11 Cloud Computing
Figure by Sam Johnston, via Wikimedia Commons
1.12 Cloud Computing
A means to increase computing capacity or add computing capabilities at any time without investing in new infrastructure, training new personnel, or licensing new software
1.13–1.15 Cloud Computing (figure slides)
1.16 Top Players in Cloud Computing
- Amazon.com (IaaS; most comprehensive)
- VMware (vCloud; software to build clouds)
- Microsoft (PaaS, IaaS)
- Salesforce.com (SaaS)
- Google (IaaS, PaaS; Google App Engine)
- Rackspace (IaaS)
- IBM (OpenStack)
- Citrix (competes with VMware; free cloud operating system)
- Joyent (competes with VMware, OpenStack, Citrix)
- SoftLayer (web hosting service provider)
1.17 Parallel and Distributed Computing (CS621)
- This class is about parallel and distributed computing algorithms and applications
- Algorithms for communication between processors (see the sketch below)
- Algorithms for solving scientific computing problems
- Hands-on experience with the Message Passing Interface (MPI), writing parallel programs to solve some problems
This class is NOT about parallel data processing (CS626)
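As a taste of the message-passing style used throughout the course, here is a minimal point-to-point example (a sketch, not one of the course's algorithms): rank 0 sends an integer to rank 1, which receives and prints it.

```c
#include <stdio.h>
#include <mpi.h>

/* Minimal point-to-point communication: rank 0 sends an integer
   to rank 1, which receives and prints it.
   Run with at least two processes, e.g.: mpiexec -n 2 ./ptp */
int main(int argc, char *argv[]) {
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;   /* arbitrary payload */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}
```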
*** Hands-on Experience
You can download and install a copy of Microsoft MPI (Message Passing Interface) at
https://www.microsoft.com/en-us/download/details.aspx?id=57467
You also need MS Visual Studio to work with it. https://visualstudio.microsoft.com/
*** Hands-on Experience
Here is a video demonstrating how to set up MS Visual Studio for MS MPI programming: https://www.youtube.com/watch?v=IW3SKDM_yEs&t=330s
Please make sure that you have installed MS Visual Studio and MS MPI on your computer
Please run the sample MPI “Hello World” program: https://mpitutorial.com/tutorials/mpi-hello-world/
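For reference, the core of that tutorial's program looks like the following (see the link above for the exact version); each MPI process prints its rank, the total number of processes, and its host name.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int world_size, world_rank, name_len;
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                      /* start the MPI runtime */
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);  /* total number of ranks */
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);  /* this process's rank   */
    MPI_Get_processor_name(processor_name, &name_len);

    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    MPI_Finalize();                              /* shut down the runtime */
    return 0;
}
```

Once compiled (with MPICH: mpicc hello.c -o hello), run it with several processes, e.g. mpiexec -n 4 ./hello (under MS-MPI: mpiexec -n 4 hello.exe; the executable names here are just examples).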
*** Hands-on Experience
If you have serious research projects that need a lot of computing resources, you can also request an account on a supercomputer at the UK Center for Computational Science
Go to the Center for Computational Science, get the request form, fill it out, and ask your advisor to sign it
https://www.ccs.uky.edu/