
Parallel and Distributed Computing

Chapter 1: Introduction to Parallel Computing

Jun Zhang

Department of Computer Science, University of Kentucky, Lexington, KY 40506

1.1a: The von Neumann Machine Model

• Common machine model for over 70 years
• Stored-program concept
• CPU executes a stored program
• A sequence of read and write operations on the memory
• Order of operations is sequential

1.1b: A More Detailed Architecture Based on the von Neumann Model

1.1c: Old von Neumann Computer

1.1d: CISC von Neumann Computer

• CISC stands for Complex Instruction Set Computer; it uses a single bus system
• The Harvard (RISC) architecture utilizes two buses: a separate instruction bus and data bus
• RISC stands for Reduced Instruction Set Computer
• Both are SISD machines – Single Instruction stream on a Single Data stream

1.1e:

1.1f: John von Neumann

• December 28, 1903 – February 8, 1957
• Hungarian mathematician
• Mastered calculus at 8
• Graduate-level math at 12
• Got his Ph.D. at 23
• His proposal to his 1st wife: “You and I might be able to have some fun together, seeing as how we both like to drink."

1.2a: Motivations for Parallel Computing

• Fundamental limits on single-CPU speed
• Heat dissipation from CPU chips
• Disparity between CPU and memory speeds
• Distributed data communications
• Need for very large scale computing platforms

1.2b: Fundamental Limits – Cycle Speed

• Intel 8080: 2 MHz (1974)
• ARM 2: 8 MHz (1986)
• Intel Pentium Pro: 200 MHz (1996)
• AMD Athlon: 1.2 GHz (2000)
• Intel QX6700: 2.66 GHz (2006)
• Intel Core i7 3770K: 3.9 GHz (2013)
• Speed of light: 30 cm in 1 ns
• Signals travel about 10 times slower in a chip (see the rough calculation below)
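A rough calculation (assuming a 3 GHz clock, a figure not taken from the slide) shows why signal propagation alone limits cycle speed: one clock cycle at 3 GHz lasts about 0.33 ns; light covers roughly 30 cm/ns × 0.33 ns ≈ 10 cm in that time; and an on-chip signal traveling about ten times slower covers only ≈ 1 cm. So even ignoring transistor switching delays, a signal cannot cross a chip much larger than a centimeter within a single clock cycle, which caps how far clock rates can usefully be pushed.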

1.2c: High-End CPUs Are Expensive

The price of a high-end CPU rises sharply

Intel processor price/performance

1.2d: Moore’s Law

• Moore’s observation in 1965: the number of transistors per square inch on integrated circuits had doubled every year since the integrated circuit was invented
• Moore’s revised observation in 1975: the pace had slowed a bit, but data density had doubled approximately every 18 months
• What about the future? (Does the price of computing power fall by half every 18 months?)

1.2e: Moore’s Law – Held for Now

1.3: Power Wall Effect

• Too many transistors in a given chip die area
• Tremendous increase in power density
• Increased chip temperature
• High temperature slows down the transistor switching rate and the overall speed of the computer
• The chip may melt down if not cooled properly
• Efficient cooling systems are expensive

1.3: Cooling Computer Chips

Some people suggest putting computer chips in liquid nitrogen to cool them

1.3: Solutions

• Use multiple inexpensive processors
• Use a processor with multiple cores

1.3: A Multi-core Processor

1.3a: CPU and Memory Speeds

• In 20 years, CPU speed has increased by a factor of 1000
• DRAM speed has increased only by a factor of less than 4
• How can data be fed fast enough to keep the CPU busy?
• CPU cycle time: 1-2 ns
• DRAM access time: 50-60 ns
• Cache access time: 10 ns

1.3b: Memory Access and CPU Speed

1.3b: CPU, Memory, and Disk Speed

1.3c: Possible Solutions

• A hierarchy of successively faster memory devices (multilevel caches)
• Locality of data reference (code)
• Efficient programming can be an issue (see the sketch after this list)
• Parallel systems may provide (1) a larger aggregate cache and (2) higher aggregate bandwidth to the memory system
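To make the locality point concrete, here is a minimal C sketch (not from the slides; the function names and the matrix size N are illustrative) that sums the same matrix twice. C stores arrays in row-major order, so the first loop walks memory sequentially and is cache friendly, while the second jumps N doubles between accesses; on a typical cached machine the first version runs noticeably faster even though both do the same arithmetic.

    #include <stdio.h>
    #include <stdlib.h>

    #define N 2048   /* 2048 x 2048 doubles = 32 MB */

    /* Cache-friendly: consecutive memory addresses */
    static double sum_row_major(const double *a) {
        double s = 0.0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i * N + j];
        return s;
    }

    /* Cache-hostile: stride of N doubles between accesses */
    static double sum_col_major(const double *a) {
        double s = 0.0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i * N + j];
        return s;
    }

    int main(void) {
        double *a = malloc((size_t)N * N * sizeof *a);
        if (!a) return 1;
        for (long k = 0; k < (long)N * N; k++)
            a[k] = 1.0;
        printf("row-major sum: %.0f\n", sum_row_major(a));
        printf("col-major sum: %.0f\n", sum_col_major(a));
        free(a);
        return 0;
    }

Timing the two functions (for example with clock() from <time.h>) makes the difference visible; the point is that memory layout and access order, not just operation count, determine performance.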

1.3f: Multilevel Hierarchical Cache

1.4a: Distributed Data Communications

• Data may be collected and stored at different locations
• It is expensive to bring the data to a central location for processing
• Many computing tasks may be inherently parallel
• Privacy issues in large-scale commercial data manipulation

1.4b: Distributed Data Communications

1.4c: Move Computation to Data (CS626: Large Scale Data Processing)

1.5a: Why Use Parallel Computing

• Save time – reduce wall-clock time by having many processors work together
• Solve larger problems – larger than one processor’s CPU and memory can handle
• Provide concurrency – do multiple things at the same time, such as online data access and search engines; Google’s 4,000 PC servers were one of the largest clusters in the world (a server farm)

1.4b: Google’s Data Center

1.5b: Other Reasons for Parallel Computing

• Taking advantage of non-local resources – using computing resources on a wide area network, or even the Internet (grid or cloud computing)
• Cost savings – using multiple “cheap” computing resources instead of a high-end CPU
• Overcoming memory constraints – for large problems, using the combined memory of multiple computers may overcome the memory limit of a single machine

1.6a: Need for Large Scale Modeling

• Long-term weather forecasting
• Large-scale ocean modeling
• Oil reservoir simulations
• Car and airplane manufacturing
• Semiconductor simulation
• Pollution tracking
• Large-scale commercial databases
• Aerospace (NASA microgravity modeling)

1.6b: Semiconductor Simulation

• Before 1975, an engineer had to make several runs through the fabrication line until a successful device was fabricated
• Device dimensions have shrunk below 0.1 micrometer
• A fabrication line costs 1.0 billion dollars to build
• A design must be thoroughly verified before it is committed to silicon
• A realistic simulation of one diffusion process may take days or months to run on a workstation
• Chip prices drop quickly after entering the market

1.4b: Semiconductor Diffusion Process

1.6c: Drug Design

• Most drugs work by binding to a specific site, called a receptor, on a protein
• A central problem is to find molecules (ligands) with high binding affinity
• Need to accurately and efficiently estimate electrostatic forces in molecular and atomic interactions
• Calculate drug-protein binding energies from quantum mechanics, statistical mechanics, and simulation techniques

1.6d: Computing Protein Binding

1.6d: Computer Aided Drug Design

1.7: Issues in Parallel Computing

• Design of parallel computers
• Design of efficient parallel algorithms
• Methods for evaluating parallel algorithms
• Parallel computer languages
• Parallel programming tools
• Portable parallel programs
• Automatic programming of parallel computers
• Education in the parallel computing philosophy

1.8: Eclipse Parallel Tools Platform

• A standard, portable parallel integrated development environment that supports a wide range of parallel architectures and runtime systems (IBM)
• A scalable parallel debugger
• Support for the integration of a wide range of parallel tools
• An environment that simplifies end-user interaction with parallel systems

1.9: Message Passing Interface (MPI)

• We will use MPI for hands-on experience in this class
• A Message Passing Interface implementation can be downloaded from an online website (MPICH)
• Parallel computing can be simulated on your own computer (Microsoft MPI)
• UK can provide computing services for research purposes, for free

1.10: Cloud Computing

• Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided over the Internet
• Users need not have knowledge of, expertise in, or control over the technology infrastructure in the “cloud” that supports them
• Compare to: Grid Computing (a cluster of networked, loosely coupled computers), Utility Computing (the packaging of computing resources as a metered service), and Autonomic Computing (computer systems capable of self-management)

1.11: Cloud Computing

• Diagram by Sam Johnston, Wikimedia Commons

1.12: Cloud Computing

A means to increase computing capacity or add computing capabilities at any time without investing in new infrastructure, training new personnel, or licensing new software

1.13: Cloud Computing

1.14: Cloud Computing

1.15: Cloud Computing

1.16: Top Players in Cloud Computing

• Amazon.com (IaaS, most comprehensive)
• VMware (vCloud, software to build clouds)
• Microsoft (PaaS, IaaS)
• Salesforce.com (SaaS)
• Google (IaaS, PaaS, Google App Engine)
• Rackspace (IaaS)
• IBM (OpenStack)
• Citrix (competes with VMware; free cloud software)
• Joyent (competes with VMware, OpenStack, Citrix)
• SoftLayer (web hosting service provider)

1.17: Parallel and Distributed Computing (CS621)

• This class is about parallel and distributed computing algorithms and applications
• Algorithms for communication between processors
• Algorithms to solve scientific computing problems
• Hands-on experience with the Message Passing Interface (MPI), writing parallel programs to solve some problems

• This class is NOT about parallel data processing (CS626)

*** Hands-on Experience

• You can download and install a copy of Microsoft MPI (Message Passing Interface) at:

https://www.microsoft.com/en-us/download/details.aspx?id=57467

You also need MS Visual Studio to work with it. https://visualstudio.microsoft.com/

*** Hands-on Experience

• Here is a video demonstrating how to set up MS Visual Studio for MS MPI programming: https://www.youtube.com/watch?v=IW3SKDM_yEs&t=330s

• Please make sure that you have installed MS Visual Studio and MS MPI on your computer

• Please run the sample MPI “Hello World” program (a minimal version is sketched below): https://mpitutorial.com/tutorials/mpi-hello-world/
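For reference, a minimal MPI “Hello World” along the lines of the mpitutorial.com example looks roughly like the sketch below. With MS-MPI it can be built in Visual Studio and launched with something like "mpiexec -n 4 hello.exe" (the executable name here is just a placeholder).

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        /* Start the MPI execution environment */
        MPI_Init(&argc, &argv);

        int world_size, world_rank;
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);  /* number of processes */
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);  /* this process's rank */

        char processor_name[MPI_MAX_PROCESSOR_NAME];
        int name_len;
        MPI_Get_processor_name(processor_name, &name_len);

        printf("Hello world from processor %s, rank %d out of %d processes\n",
               processor_name, world_rank, world_size);

        /* Shut down the MPI environment */
        MPI_Finalize();
        return 0;
    }

Every process launched by mpiexec runs the same program; the rank is what distinguishes them, which is the basic model used for the parallel programs in this class.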

*** Hands-on Experience

• If you have a serious research project that needs a lot of computing resources, you can also request an account on a supercomputer at the UK Center for Computational Science

• Go to the Center for Computational Science, get a form to fill out, and ask your advisor to sign it

• https://www.ccs.uky.edu/
