Chapter 1

Introduction: Why High Performance Computing?

Quote:

It is hard to understand an ocean because it is too big. It is hard to understand a molecule because it is too small. It is hard to understand nuclear physics because it is too fast. It is hard to understand the greenhouse effect because it is too slow. Supercomputers break these barriers to understanding. They, in effect, shrink oceans, zoom in on molecules, slow down physics, and fast-forward climates. Clearly a scientist who can see natural phenomena at the right size and the right speed learns more than one who is faced with a blur.

Al Gore, 1990

Barry Linnert, [email protected], Cluster Computing SoSe 2017

Why High Performance Computing?

• Grand Challenges (basic research)
  • Decoding of the human genome
  • Cosmogenesis
  • Global climate changes
  • Biological macromolecules
• Product development
  • Fluid mechanics
  • Crash tests
  • Material minimization
  • Drug design
  • Chip design
• IT infrastructure
  • Search engines
  • Data mining
  • Virtual reality
  • Rendering
  • Vision

Examples for HPC-Applications: Finite Element Method: Crash-Analysis

Asymmetric frontal impact of a car to a rigid obstacle

Examples for HPC-Applications: Aerodynamics

Examples for HPC-Applications: Molecular dynamics

Simulation of a noble gas (argon) with a partitioning for 8 processors: 2048 molecules in a cube of 16 nm edge length, simulated in time steps of 2×10⁻¹⁴ s.
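A molecular dynamics run like this integrates Newton's equations of motion for pairwise-interacting atoms. The following is an illustrative sketch only (not the original course code): it assumes reduced units, a naive O(n²) force loop, unit masses, and no periodic boundaries or processor partitioning. A Lennard-Jones system can be stepped with the velocity Verlet scheme like this:

```python
import numpy as np

def lj_forces(pos, eps=1.0, sigma=1.0):
    """Pairwise Lennard-Jones forces (reduced units) on a small particle set."""
    n = pos.shape[0]
    f = np.zeros_like(pos)
    for i in range(n):
        for j in range(i + 1, n):
            r = pos[i] - pos[j]
            d2 = float(np.dot(r, r))
            s6 = (sigma * sigma / d2) ** 3
            # Force on i along r: 24*eps*(2*(sigma/r)^12 - (sigma/r)^6) / r^2 * r
            fij = 24.0 * eps * (2.0 * s6 * s6 - s6) / d2 * r
            f[i] += fij
            f[j] -= fij   # Newton's third law
    return f

def velocity_verlet(pos, vel, dt, steps):
    """Advance positions and velocities with velocity Verlet (unit mass)."""
    f = lj_forces(pos)
    for _ in range(steps):
        vel += 0.5 * dt * f
        pos += dt * vel
        f = lj_forces(pos)
        vel += 0.5 * dt * f
    return pos, vel

# Two particles slightly outside the potential minimum attract each other.
pos = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
vel = np.zeros((2, 3))
pos, vel = velocity_verlet(pos, vel, dt=1e-3, steps=100)
```

In a real production run the O(n²) loop is replaced by cell lists or neighbor lists, and the simulation box is decomposed into domains, one per processor, which is where the "partitioning for 8 processors" above comes from.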

Examples for HPC-Applications: Nuclear Energy

• 1946 Baker Test
• 23,000 tons of TNT

• Large numbers of nuclear weapons produced during the Cold War
• Status today: largely unknown

Source: Wikipedia

Examples for HPC-Applications: Visualization

Rendering (interior architecture)
Rendering (movie: Titanic)

Photo-realistic presentation using sophisticated illumination models

Examples for HPC-Applications: Analysis of complex materials

C60-trimer Si1600

MoS2

4H-SiC

Source: Th. Frauenheim, Paderborn

Grosch's Law

[Plot: performance vs. price]

"The performance of a computer increases (roughly) quadratically with the price."

Consequence: It is better to buy one computer that is twice as fast than to buy two slower computers. (The law was valid in the sixties and seventies over a wide range of general-purpose computers.)
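The consequence follows directly from the quadratic relation. A toy calculation (the constant k and the prices are arbitrary illustrative values, not from the slide):

```python
def grosch_performance(price, k=1.0):
    # Grosch's Law: performance grows roughly as k * price^2
    return k * price ** 2

one_fast = grosch_performance(2.0)       # one machine at twice the price
two_slow = 2 * grosch_performance(1.0)   # two machines at the base price
# one_fast = 4.0 vs. two_slow = 2.0: under the law, the single fast machine wins
```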

Eighties

Availability of powerful microprocessors
• High integration density (VLSI)
• Single-chip processors
• Automated production process
• Automated development process
• High-volume production

Consequence: Grosch's Law is no longer valid: 1000 cheap microprocessors (theoretically) deliver more performance (in MFLOPS) than an expensive supercomputer (e.g. ).

Idea: To achieve high performance at low cost, use many microprocessors together: Parallel Processing

Eighties

• Widespread use of workstations and PCs
  • Terminals are replaced by PCs
  • Workstations achieve the (computational) performance of mainframes at a fraction of the price

• Availability of local area networks (LANs, e.g. Ethernet)
  • Possibility to connect a larger number of autonomous computers using a low-cost medium
  • Access to data on other computers
  • Use of programs and other resources of remote computers

• Networks of workstations as parallel computers
  • Possibility to exploit unused computational capacity of other computers for compute-intensive calculations (idle-time computing)

Nineties

• Parallel computers are built from large numbers of microprocessors (Massively Parallel Processors, MPP), e.g. Transputer systems, Connection Machine CM-5, Paragon
• Alternative architectures are built (Connection Machine CM-2, MasPar)
• The trend toward cost-efficient standard components ("commercial off-the-shelf", COTS) leads to the coupling of standard PCs into a "cluster" (Beowulf, 1995)
• Exploitation of unused compute power for HPC ("idle-time computing")
• Program libraries such as the Parallel Virtual Machine (PVM) and the Message Passing Interface (MPI) allow for the development of portable parallel programs
• Linux becomes the prevailing operating system for HPC
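PVM and MPI programs are built around explicit send and receive operations between processes ("ranks"): scatter work, compute locally, gather results. The sketch below imitates that pattern using only Python's standard threading and queue modules as a stand-in for a real message-passing library (no MPI installation is assumed; a real MPI program would use send/recv calls between separate processes, possibly on different machines):

```python
import threading
import queue

def worker(rank, in_q, out_q):
    # Each "rank" blocks on a receive, computes a partial sum, and sends it back.
    chunk = in_q.get()
    out_q.put((rank, sum(chunk)))

def parallel_sum(data, nworkers):
    out_q = queue.Queue()
    threads = []
    for rank in range(nworkers):
        in_q = queue.Queue()
        in_q.put(data[rank::nworkers])   # "scatter": strided chunk to each rank
        t = threading.Thread(target=worker, args=(rank, in_q, out_q))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    # "gather": collect one partial result per rank and reduce
    return sum(out_q.get()[1] for _ in range(nworkers))

total = parallel_sum(list(range(100)), nworkers=4)   # 4950
```

The point of MPI's standardized interface is exactly that the same scatter/compute/gather logic runs unchanged on a cluster of PCs, an MPP, or a single multicore machine.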

Top 10 of TOP500 List (06/2001)

Top 10 of TOP500 List (06/2002)

Earth Simulator (Rank 1, 2002-2004)

Earth Simulator

• 640 nodes
• 8 vector processors (8 GFLOPS each) and 16 GB per node
• 5120 CPUs
• 10 TB main memory
• 40 TFLOPS peak performance
• 65 m x 50 m physical dimensions

Source: H. Simon, NERSC

Top 10 / Nov. 2004

Rank | Site | Computer / Processors | Rmax | Rpeak | Country/Year | Manufacturer
1 | IBM/DOE | BlueGene/L DD2 beta-System (0.7 GHz PowerPC 440) / 32768 | 70720 | 91750 | United States/2004 | IBM
2 | NASA/Ames Research Center/NAS | Columbia - SGI Altix 1.5 GHz, Voltaire Infiniband / 10160 | 51870 | 60960 | United States/2004 | SGI
3 | The Earth Simulator Center | Earth-Simulator / 5120 | 35860 | 40960 | Japan/2002 | NEC
4 | Barcelona Supercomputer Center | MareNostrum - eServer BladeCenter JS20 (PowerPC970 2.2 GHz), Myrinet / 3564 | 20530 | 31363 | Spain/2004 | IBM
5 | Lawrence Livermore National Laboratory | Thunder - Intel Itanium2 Tiger4 1.4 GHz, Quadrics / 4096 | 19940 | 22938 | United States/2004 | California Digital Corporation
6 | Los Alamos National Laboratory | ASCI Q - AlphaServer SC45, 1.25 GHz / 8192 | 13880 | 20480 | United States/2002 | HP
7 | Virginia Tech | System X - 1100 Dual 2.3 GHz Apple XServe, Mellanox Infiniband 4X/Cisco GigE / 2200 | 12250 | 20240 | United States/2004 | Self-made
8 | IBM - Rochester | BlueGene/L DD1 Prototype (0.5 GHz PowerPC 440 w/Custom) / 8192 | 11680 | 16384 | United States/2004 | IBM/LLNL
9 | Naval Oceanographic Office | eServer pSeries 655 (1.7 GHz Power4+) / 2944 | 10310 | 20019.2 | United States/2004 | IBM
10 | NCSA | Tungsten - PowerEdge 1750, P4 3.06 GHz, Myrinet / 2500 | 9819 | 15300 | United States/2003 | Dell

Top 10 / Nov. 2005

Rank | Site | Computer | Processors | Year | Rmax
1 | DOE/NNSA/LLNL, United States | BlueGene/L - eServer Blue Gene Solution (IBM) | 131072 | 2005 | 280600
2 | IBM Thomas J. Watson Research Center, United States | BGW - eServer Blue Gene Solution (IBM) | 40960 | 2005 | 91290
3 | DOE/NNSA/LLNL, United States | ASC Purple - eServer pSeries p5 575 1.9 GHz (IBM) | 10240 | 2005 | 63390
4 | NASA/Ames Research Center/NAS, United States | Columbia - SGI Altix 1.5 GHz, Voltaire Infiniband (SGI) | 10160 | 2004 | 51870
5 | Sandia National Laboratories, United States | Thunderbird - PowerEdge 1850, 3.6 GHz, Infiniband (Dell) | 8000 | 2005 | 38270
6 | Sandia National Laboratories, United States | Cray XT3, 2.0 GHz (Cray Inc.) | 10880 | 2005 | 36190
7 | The Earth Simulator Center, Japan | Earth-Simulator (NEC) | 5120 | 2002 | 35860
8 | Barcelona Supercomputer Center, Spain | MareNostrum - JS20 Cluster, PPC 970, 2.2 GHz, Myrinet (IBM) | 4800 | 2005 | 27910
9 | ASTRON/University Groningen, Netherlands | Stella - eServer Blue Gene Solution (IBM) | 12288 | 2005 | 27450
10 | Oak Ridge National Laboratory, United States | Jaguar - Cray XT3, 2.4 GHz (Cray Inc.) | 5200 | 2005 | 20527

IBM Blue Gene

IBM Blue Gene

TOP 10 / Nov. 2006

Top 500 Nov 2008

Rank | Computer | Country | Year | Cores | Rmax | Processor Family | System Family | Interconnect
1 | BladeCenter QS22/LS21 Cluster, PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz, Voltaire Infiniband (IBM) | United States | 2008 | 129600 | 1105000 | Power | Cluster | Infiniband
2 | Cray XT5 QC 2.3 GHz | United States | 2008 | 150152 | 1059000 | AMD x86_64 | Cray XT | Proprietary
3 | SGI Altix ICE 8200EX, Xeon QC 3.0/2.66 GHz | United States | 2008 | 51200 | 487005 | Intel EM64T | SGI Altix | Infiniband
4 | eServer Blue Gene Solution (IBM) | United States | 2007 | 212992 | 478200 | Power | BlueGene | Proprietary
5 | Blue Gene/P Solution (IBM) | United States | 2007 | 163840 | 450300 | Power | BlueGene | Proprietary
6 | SunBlade x6420, Opteron QC 2.3 GHz, Infiniband | United States | 2008 | 62976 | 433200 | AMD x86_64 | Sun Blade System | Infiniband
7 | Cray XT4 QuadCore 2.3 GHz | United States | 2008 | 38642 | 266300 | AMD x86_64 | Cray XT | Proprietary
8 | Cray XT4 QuadCore 2.1 GHz | United States | 2008 | 30976 | 205000 | AMD x86_64 | Cray XT | Proprietary
9 | Sandia/Cray Red Storm, XT3/4, 2.4/2.2 GHz dual/quad core | United States | 2008 | 38208 | 204200 | AMD x86_64 | Cray XT | Cray Interconnect
10 | Dawning 5000A, QC Opteron 1.9 GHz, Infiniband, Windows HPC 2008 | China | 2008 | 30720 | 180600 | AMD x86_64 | Dawning Cluster | Infiniband

Top 500 Nov 2008

Rank | Computer | Country | Year | Cores | Rmax | Processor Family | System Family | Interconnect
11 | Blue Gene/P Solution (IBM) | Germany | 2007 | 65536 | 180000 | Power | BlueGene | Proprietary
12 | SGI Altix ICE 8200, Xeon quad core 3.0 GHz | United States | 2007 | 14336 | 133200 | Intel EM64T | SGI Altix | Infiniband
13 | Cluster Platform 3000 BL460c, Xeon 53xx 3 GHz, Infiniband (HP) | India | 2008 | 14384 | 132800 | Intel EM64T | HP Cluster Platform 3000BL | Infiniband
14 | SGI Altix ICE 8200EX, Xeon quad core 3.0 GHz | France | 2008 | 12288 | 128400 | Intel EM64T | SGI Altix | Infiniband
15 | Cray XT4 QuadCore 2.3 GHz | United States | 2008 | 17956 | 125128 | AMD x86_64 | Cray XT | Proprietary
16 | Blue Gene/P Solution (IBM) | France | 2008 | 40960 | 112500 | Power | BlueGene | Proprietary
17 | SGI Altix ICE 8200EX, Xeon quad core 3.0 GHz | France | 2008 | 10240 | 106100 | Intel EM64T | SGI Altix | Infiniband
18 | Cluster Platform 3000 BL460c, Xeon 53xx 2.66 GHz, Infiniband (HP) | Sweden | 2007 | 13728 | 102800 | Intel EM64T | HP Cluster Platform 3000BL | Infiniband
19 | DeepComp 7000, HS21/x3950 Cluster, Xeon QC HT 3 GHz/2.93 GHz, Infiniband | China | 2008 | 12216 | 102800 | Intel EM64T | Lenovo Cluster | Infiniband

Top 500 Nov 2009

Rank | Site | Computer / Year / Vendor | Cores | Rmax | Rpeak | Power
1 | Oak Ridge National Laboratory, United States | Jaguar - Cray XT5-HE Opteron Six Core 2.6 GHz / 2009 / Cray Inc. | 224162 | 1759.00 | 2331.00 | 6950.60
2 | DOE/NNSA/LANL, United States | Roadrunner - BladeCenter QS22/LS21 Cluster, PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz, Voltaire Infiniband / 2009 / IBM | 122400 | 1042.00 | 1375.78 | 2345.50
3 | National Institute for Computational Sciences/University of Tennessee, United States | Kraken XT5 - Cray XT5-HE Opteron Six Core 2.6 GHz / 2009 / Cray Inc. | 98928 | 831.70 | 1028.85 |
4 | Forschungszentrum Juelich (FZJ), Germany | JUGENE - Blue Gene/P Solution / 2009 / IBM | 294912 | 825.50 | 1002.70 | 2268.00
5 | National SuperComputer Center in Tianjin/NUDT, China | Tianhe-1 - NUDT TH-1 Cluster, Xeon E5540/E5450, ATI Radeon HD 4870 2, Infiniband / 2009 / NUDT | 71680 | 563.10 | 1206.19 |
6 | NASA/Ames Research Center/NAS, United States | Pleiades - SGI Altix ICE 8200EX, Xeon QC 3.0 GHz/Nehalem EP 2.93 GHz / 2009 / SGI | 56320 | 544.30 | 673.26 | 2348.00
7 | DOE/NNSA/LLNL, United States | BlueGene/L - eServer Blue Gene Solution / 2007 / IBM | 212992 | 478.20 | 596.38 | 2329.60
8 | Argonne National Laboratory, United States | Blue Gene/P Solution / 2007 / IBM | 163840 | 458.61 | 557.06 | 1260.00
9 | Texas Advanced Computing Center/Univ. of Texas, United States | Ranger - SunBlade x6420, Opteron QC 2.3 GHz, Infiniband / 2008 / Sun Microsystems | 62976 | 433.20 | 579.38 | 2000.00
10 | Sandia National Laboratories / National Renewable Energy Laboratory, United States | Red Sky - Sun Blade x6275, Xeon X55xx 2.93 GHz, Infiniband / 2009 / Sun Microsystems | 41616 | 423.90 | 487.74 |

Top 500 Nov 2011

Rank | Computer | Site | Country | Total Cores | Rmax | Rpeak
1 | K computer, SPARC64 VIIIfx 2.0 GHz, Tofu interconnect | RIKEN Advanced Institute for Computational Science (AICS) | Japan | 705024 | 10510000 | 11280384
2 | NUDT YH MPP, Xeon X5670 6C 2.93 GHz, NVIDIA 2050 | National Supercomputing Center in Tianjin | China | 186368 | 2566000 | 4701000
3 | Cray XT5-HE Opteron 6-core 2.6 GHz | DOE/SC/Oak Ridge National Laboratory | United States | 224162 | 1759000 | 2331000
4 | Dawning TC3600 Blade System, Xeon X5650 6C 2.66 GHz, Infiniband QDR, NVIDIA 2050 | National Supercomputing Centre in Shenzhen (NSCS) | China | 120640 | 1271000 | 2984300
5 | HP ProLiant SL390s G7 Xeon 6C X5670, Nvidia GPU, Linux/Windows | GSIC Center, Tokyo Institute of Technology | Japan | 73278 | 1192000 | 2287630
6 | Cray XE6, Opteron 6136 8C 2.40 GHz, Custom | DOE/NNSA/LANL/SNL | United States | 142272 | 1110000 | 1365811
7 | SGI Altix ICE 8200EX/8400EX, Xeon HT QC 3.0/Xeon 5570/5670 2.93 GHz, Infiniband | NASA/Ames Research Center/NAS | United States | 111104 | 1088000 | 1315328
8 | Cray XE6, Opteron 6172 12C 2.10 GHz, Custom | DOE/SC/LBNL/NERSC | United States | 153408 | 1054000 | 1288627
9 | Bull bullx super-node S6010/S6030 | Commissariat a l'Energie Atomique (CEA) | France | 138368 | 1050000 | 1254550
10 | BladeCenter QS22/LS21 Cluster, PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz, Voltaire Infiniband | DOE/NNSA/LANL | United States | 122400 | 1042000 | 1375776

K-computer (AICS), Rank 1

• 800 racks with 88,128 SPARC64 CPUs / 705,024 cores
• 200,000 cables with a total length of over 1,000 km
• Third floor (50 m x 60 m) free of structural pillars
• New problem: floor load capacity of 1 t/m² (avg.), while each rack weighs up to 1.5 t; high-tech construction required
• First floor, housing the global file system, has just three pillars
• Own power station (300 m²)

The K-Computer

2013: Human brain simulation (cooperation with Forschungszentrum Jülich). Simulating 1 second of the activity of 1% of a brain took 40 minutes. Simulation of a whole brain possible with exascale computers?
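The figures quoted above imply a large gap to real-time simulation. A back-of-the-envelope check (assuming, as a simplification, that compute cost scales linearly with the simulated fraction of the brain):

```python
simulated_seconds = 1.0          # simulated biological time
wallclock_seconds = 40 * 60      # 40 minutes on the K computer
scale_up = 100                   # whole brain has ~100x the neurons of the 1% slice

slowdown = wallclock_seconds / simulated_seconds   # 2400x slower than real time
whole_brain_slowdown = slowdown * scale_up         # ~240000x under linear scaling
```

A machine roughly 100x faster than the K computer's ~10 PFLOPS, i.e. an exascale system, would still only bring a whole-brain simulation back to the 2400x slowdown of the 1% run, which is why exascale is framed here as the entry point, not the solution.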

Jugene (Forschungszentrum Jülich), Rank 13

Top 500 Nov 2012

Rank | Name | Computer | Site | Country | Total Cores | Rmax | Rpeak
1 | Titan | Cray XK7, Opteron 6274 16C 2.200 GHz, Cray Gemini interconnect, NVIDIA K20x | DOE/SC/Oak Ridge National Laboratory | United States | 560640 | 17590000 | 27112550
2 | Sequoia | BlueGene/Q, Power BQC 16C 1.60 GHz, Custom | DOE/NNSA/LLNL | United States | 1572864 | 16324751 | 20132659.2
3 | K computer | K computer, SPARC64 VIIIfx 2.0 GHz, Tofu interconnect | RIKEN Advanced Institute for Computational Science (AICS) | Japan | 705024 | 10510000 | 11280384
4 | Mira | BlueGene/Q, Power BQC 16C 1.60 GHz, Custom | DOE/SC/Argonne National Laboratory | United States | 786432 | 8162376 | 10066330
5 | JUQUEEN | BlueGene/Q, Power BQC 16C 1.600 GHz, Custom Interconnect | Forschungszentrum Juelich (FZJ) | Germany | 393216 | 4141180 | 5033165
6 | SuperMUC | iDataPlex DX360M4, Xeon E5-2680 8C 2.70 GHz, Infiniband FDR | Leibniz Rechenzentrum | Germany | 147456 | 2897000 | 3185050
7 | Stampede | PowerEdge C8220, Xeon E5-2680 8C 2.700 GHz, Infiniband FDR, Intel | Texas Advanced Computing Center/Univ. of Texas | United States | 204900 | 2660290 | 3958965
8 | Tianhe-1A | NUDT YH MPP, Xeon X5670 6C 2.93 GHz, NVIDIA 2050 | National Supercomputing Center in Tianjin | China | 186368 | 2566000 | 4701000
9 | Fermi | BlueGene/Q, Power BQC 16C 1.60 GHz, Custom | CINECA | Italy | 163840 | 1725492 | 2097152
10 | DARPA Trial Subset | Power 775, POWER7 8C 3.836 GHz, Custom Interconnect | IBM Development Engineering | United States | 63360 | 1515000 | 1944391.68

Top 500 Nov 2013

Rank | Name | Computer | Site | Country | Total Cores | Rmax | Rpeak
1 | Tianhe-2 (MilkyWay-2) | TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200 GHz, TH Express-2, Intel Xeon Phi 31S1P | National Super Computer Center in Guangzhou | China | 3120000 | 33862700 | 54902400
2 | Titan | Cray XK7, Opteron 6274 16C 2.200 GHz, Cray Gemini interconnect, NVIDIA K20x | DOE/SC/Oak Ridge National Laboratory | United States | 560640 | 17590000 | 27112550
3 | Sequoia | BlueGene/Q, Power BQC 16C 1.60 GHz, Custom | DOE/NNSA/LLNL | United States | 1572864 | 17173224 | 20132659.2
4 | K computer | K computer, SPARC64 VIIIfx 2.0 GHz, Tofu interconnect | RIKEN Advanced Institute for Computational Science (AICS) | Japan | 705024 | 10510000 | 11280384
5 | Mira | BlueGene/Q, Power BQC 16C 1.60 GHz, Custom | DOE/SC/Argonne National Laboratory | United States | 786432 | 8586612 | 10066330
6 | Piz Daint | Cray XC30, Xeon E5-2670 8C 2.600 GHz, Aries interconnect, NVIDIA K20x | Swiss National Supercomputing Centre (CSCS) | Switzerland | 115984 | 6271000 | 7788852.8
7 | Stampede | PowerEdge C8220, Xeon E5-2680 8C 2.700 GHz, Infiniband FDR, Intel Xeon Phi SE10P | Texas Advanced Computing Center/Univ. of Texas | United States | 462462 | 5168110 | 8520111.6
8 | JUQUEEN | BlueGene/Q, Power BQC 16C 1.600 GHz, Custom Interconnect | Forschungszentrum Juelich (FZJ) | Germany | 458752 | 5008857 | 5872025.6
9 | Vulcan | BlueGene/Q, Power BQC 16C 1.600 GHz, Custom Interconnect | DOE/NNSA/LLNL | United States | 393216 | 4293306 | 5033165
10 | SuperMUC | iDataPlex DX360M4, Xeon E5-2680 8C 2.70 GHz, Infiniband FDR | Leibniz Rechenzentrum | Germany | 147456 | 2897000 | 3185050

Tianhe-2 (MilkyWay-2), National Super Computer Center in Guangzhou, Rank 1

• Intel Xeon system with Xeon Phi accelerators
• 32,000 Xeon E5-2692 (12 cores, 2.2 GHz)
• 48,000 Xeon Phi 31S1P (57 cores, 1.1 GHz)
• Organization: 16,000 nodes, each with
  • 2 Xeon E5-2692
  • 3 Xeon Phi 31S1P
• 1408 TB memory
  • 1024 TB CPU memory (64 GB per node)
  • 384 TB Xeon Phi memory (3 x 8 GB per node)
• Power: 17.6 MW (24 MW with cooling)
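The memory totals above can be cross-checked from the per-node configuration (using 1 TB = 1000 GB, which is how the slide's figures work out):

```python
nodes = 16000
cpu_gb_per_node = 64       # per-node CPU memory (2 x Xeon E5-2692)
phi_gb_per_node = 3 * 8    # three Xeon Phi 31S1P cards, 8 GB each

cpu_tb = nodes * cpu_gb_per_node / 1000    # 1024.0 TB
phi_tb = nodes * phi_gb_per_node / 1000    # 384.0 TB
total_tb = cpu_tb + phi_tb                 # 1408.0 TB, matching the slide
```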

Titan (Oak Ridge National Laboratory), Rank 2 (2012: Rank 1)

Cray XK7
• 18,688 nodes with 299,008 cores, 710 TB memory (598 TB CPU and 112 TB GPU)
• AMD Opteron 6274 16-core CPU
• Nvidia Tesla K20X GPU
• Memory per node: 32 GB (CPU) + 6 GB (GPU)
• Cray Linux Environment
• Power: 8.2 MW

JUQUEEN (Forschungszentrum Jülich), Rank 8 (2012: Rank 5)

• 28 racks (7 rows of 4 racks)
• 28,672 nodes (458,752 cores)
• Rack: 2 midplanes of 16 nodeboards each (16,384 cores)
• Nodeboard: 32 compute nodes
• Node: 16 IBM PowerPC A2 cores, 1.6 GHz
• Main memory: 448 TB (16 GB per node)
• Network: 5D torus; 40 GBps; 2.5 μs latency (worst case)
• Overall peak performance: 5.9 petaflops
• Linpack: > 4.141 petaflops

HERMIT (HLRS Stuttgart), Rank 39 (2012: Rank 27)

• Cray XE6

• 3,552 nodes
• 113,664 cores
• 126 TB RAM

Top 500 Nov 2014

Rank | Country | System | Cores | Rmax (TF/s) | Rpeak (TF/s) | Power (kW)
1 | China | Tianhe-2 (MilkyWay-2) | 3,120,000 | 33,862.7 | 54,902.4 | 17,808
2 | United States | Titan | 560,640 | 17,590.0 | 27,112.5 | 8,209
3 | United States | Sequoia | 1,572,864 | 17,173.2 | 20,132.7 | 7,890
4 | Japan | K computer | 705,024 | 10,510.0 | 11,280.4 | 12,660
5 | United States | Mira | 786,432 | 8,586.6 | 10,066.3 | 3,945
6 | Switzerland | Piz Daint | 115,984 | 6,271.0 | 7,788.9 | 2,325
7 | United States | Stampede | 462,462 | 5,168.1 | 8,520.1 | 4,510
8 | Germany | JUQUEEN | 458,752 | 5,008.9 | 5,872.0 | 2,301
9 | United States | Vulcan | 393,216 | 4,293.3 | 5,033.2 | 1,972
10 | United States | Cray CS-Storm | 72,800 | 3,577.0 | 6,131.8 | 1,499

Green 500 Top entries Nov. 2014

Top 500 vs. Green 500

Top 500 | System | Rmax (TF/s) | Rpeak (TF/s) | Power (kW) | Green 500
1 | Tianhe-2 (MilkyWay-2) | 33,862.7 | 54,902.4 | 17,808 | 57
2 | Titan | 17,590.0 | 27,112.5 | 8,209 | 53
3 | Sequoia | 17,173.2 | 20,132.7 | 7,890 | 48
4 | K computer | 10,510.0 | 11,280.4 | 12,660 | 156
5 | Mira | 8,586.6 | 10,066.3 | 3,945 | 47
6 | Piz Daint | 6,271.0 | 7,788.9 | 2,325 | 9
7 | Stampede | 5,168.1 | 8,520.1 | 4,510 | 91
8 | JUQUEEN | 5,008.9 | 5,872.0 | 2,301 | 40
9 | Vulcan | 4,293.3 | 5,033.2 | 1,972 | 38
10 | Cray CS-Storm | 3,577.0 | 6,131.8 | 1,499 | 4
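The Green 500 orders systems by energy efficiency (performance per watt) rather than by raw Rmax. Dividing Rmax by power for a few systems from the table reproduces the ranking gap, e.g. why Piz Daint sits far higher on the Green 500 than the K computer (values taken from the table above; the helper function is just for illustration):

```python
# (Rmax in TF/s, power in kW) from the Top 500 vs. Green 500 table
systems = {
    "Tianhe-2":   (33862.7, 17808),
    "Piz Daint":  (6271.0, 2325),
    "K computer": (10510.0, 12660),
}

def gflops_per_watt(rmax_tflops, power_kw):
    # TFLOP/s per kW equals GFLOP/s per W (both numerator and denominator scale by 1000)
    return rmax_tflops / power_kw

eff = {name: gflops_per_watt(*vals) for name, vals in systems.items()}
# Piz Daint (~2.7 GFLOPS/W) > Tianhe-2 (~1.9) > K computer (~0.8),
# matching their Green 500 ranks of 9, 57, and 156
```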

HLRN (Hannover + Berlin), Rank 120 + 121

Volunteer Computing

Cloud Computing (XaaS)

• Usage of remote physical and logical resources

• Software as a Service (SaaS)
• Platform as a Service (PaaS)
• Infrastructure as a Service (IaaS)

Supercomputer in the Cloud: Rank 64

• Amazon EC2 C3 instances
• Intel Xeon E5-2680v2 10C 2.800 GHz
• 10G Ethernet
• 26,496 cores (according to the TOP500 list)
• Rmax: 484,179 GFLOPS
• Rpeak: 593,510.4 GFLOPS
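The ratio of Rmax (measured Linpack performance) to Rpeak (theoretical peak) shows how much of the hardware's nominal capability this cloud cluster actually achieved:

```python
rmax_gflops = 484179.0     # measured Linpack performance
rpeak_gflops = 593510.4    # theoretical peak

linpack_efficiency = rmax_gflops / rpeak_gflops   # ~0.82 of peak
```

Roughly 82% of peak over 10G Ethernet is notably high for a commodity interconnect; tightly coupled systems with Infiniband or custom networks typically justify their cost on exactly this ratio.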

Support for parallel programs

Parallel Application
Program libraries (e.g. communication, synchronization, ...)
Middleware (e.g. administration, scheduling, ...)
Distributed operating system (local OS on each node)
node 1 | node 2 | node 3 | node 4 | node 5 | ... | node n
Connection network

Tasks

User's point of view (programming comfort, short response times):
• Efficient interaction (information exchange)
  • Message passing
  • Shared memory
• Synchronization
• Automatic distribution and allocation of code and data
• Load balancing
• Debugging support
• Security
• Machine independence

Operator's point of view (high utilization, high throughput):
• Multiprogramming / partitioning
  • Time sharing
  • Space sharing
• Load distribution / load balancing
• Stability
• Security