IBM BG/P Workshop
Lukas Arnold, Forschungszentrum Jülich, 14.-16.10.2009
Contact: l.arnold@fz-juelich.de

aim of this workshop contribution
- give a brief introduction to the IBM BG/P (software and hardware)
- guide intensively through two aspects
- spend most of the time on hands-on exercises
- this is not a complete reference talk, as there are already many of them
- aimed at HPC beginners

contents
- part I - Introduction to FZJ/BGP
  - systems at FZJ
  - IBM Blue Gene/P architecture overview
- part II - jugene Usage
  - compiler, submission system
  - hands-on: “Hallo (MPI) World!”
- part III - PowerPC 450
  - ASIC, internal structure, compiler optimization
  - hands-on: “Matrix-Matrix-Multiplication, a.k.a. dgemm”
- part IV - 3D Torus Network
  - torus network strategy, linkage and usage, DMA engine
  - hands-on: “Simple Hyperbolic Solver” and “communication and computation overlap”

PART I
INTRODUCTION TO FZJ/BGP

Forschungszentrum Jülich (FZJ)
- one of the 15 Helmholtz Research Centers in Germany
- Europe’s largest multi-disciplinary research center
- area 2.2 km², 4400 employees, 1300 scientists

Jülich Supercomputing Center (JSC) @ FZJ
- operation of the supercomputers, user support, R&D work in the field of computer and computational science, education and training, 130 employees
- peer-reviewed provision of computer time to national and European computational science projects (NIC, John von Neumann Institute for Computing)

research fields of current projects
[figure]

user support at JSC
[figure]

simulation laboratories
[figure]

systems @ JSC
- systems: jugene, just, hpc-ff, juropa
- total power consumption: 2.5 MW (jugene) + 0.3 MW (just) + 1.5 MW (hpc-ff+juropa) + 0.9 MW (cooling) ≈ 5 MW
- total performance: 1000 TF/s (jugene) + 300 TF/s (hpc-ff+juropa) ≈ 1300 TF/s = 1.3 PF/s
- total storage: 0.3 PB (Lustre-FS) + 2.2 PB (GPFS@34GB/s) + 2.5 PB (Archive) ≈ 5 PB

hpc-ff + juropa
- 3288 compute nodes in total
- 2 Intel Xeon X5570 (Nehalem-EP) quad-core processors per node
- 2.93 GHz and Hyperthreading
- 3 GB per physical core
- installed at JSC in April-June 2009
- 308 TFlop/s peak performance
- 274.8 TFlop/s LINPACK performance
- No. 10 in TOP500 of June 2009

jugene
- IBM BlueGene/P system
- 72 racks (294,912 cores)
- installed at JSC in April/May 2009
- 1 PFlop/s peak performance
- 825.5 TFlop/s LINPACK performance
- No. 3 in TOP500 of June 2009
- No. 1 system in Europe

jugene setup in 60 seconds
[figure]

jugene building blocks
- Chip: 4 processors, 13.6 GF/s
- Compute Card: 1 chip, 13.6 GF/s, 2.0 GB DDR2 (4.0 GB optional)
- Node Card: 32 chips (4x4x2), 32 compute cards, 0-2 I/O cards, 435 GF/s, 64 GB
- Rack: 32 node cards, cabled 8x8x16, 13.9 TF/s, 2 TB
- Jugene system: 72 racks, 72x32x32, 1 PF/s, 144 TB

BG/P compute and node card
[figure labels: Blue Gene/P compute ASIC (4 cores, 8 MB cache); Cu heatsink; SDRAM (DDR2), 2 GB memory; node card connector (network, power)]

BG/P in numbers

Node properties
  Node processors                        4 × PowerPC 450
  Processor frequency                    0.85 GHz
  Coherency                              SMP
  L3 cache size (shared)                 8 MB
  Main store                             2 GB
  Main store bandwidth (1:2 pclk)        13.6 GB/s
  Peak performance                       13.6 GF/s per node
Torus network
  Bandwidth                              6 × 2 × 425 MB/s = 5.1 GB/s
  Hardware latency (nearest neighbour)   100 ns (32 B packet), 800 ns (256 B packet)
  Hardware latency (worst case)          3.2 µs (64 hops)
Tree network
  Bandwidth                              2 × 0.85 GB/s = 1.7 GB/s
  Hardware latency (worst case)          3.5 µs
System properties
  Area (72k nodes)                       160 m²
  Peak performance (72k nodes)           ~1 PF
  Total power                            ~2.3 MW

system access
[diagram labels: Blue Gene/P (73728 compute nodes, 600 I/O nodes); control system; service node (DB2); front-end nodes (SSH, mpirun); file server JUST (RAID)]
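
The numbers on the building-blocks slide can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original talk: it multiplies up the hierarchy (chip, node card, rack, system) using the values quoted above. The factor of 4 floating-point operations per cycle and core is an assumption (it is not stated on the slides), and the function and constant names are hypothetical.

```python
# Sketch: recompute the jugene headline numbers from the building blocks.
# All inputs except FLOPS_PER_CYCLE are taken from the slides above.

CLOCK_GHZ = 0.85            # processor frequency
FLOPS_PER_CYCLE = 4         # assumption: not stated on the slides
CORES_PER_CHIP = 4
CHIPS_PER_NODE_CARD = 32
NODE_CARDS_PER_RACK = 32
RACKS = 72
MEM_PER_NODE_GB = 2.0


def system_summary(racks=RACKS):
    """Return node/core counts, peak performance and memory for `racks` racks."""
    nodes_per_rack = NODE_CARDS_PER_RACK * CHIPS_PER_NODE_CARD
    nodes = racks * nodes_per_rack
    node_peak_gf = CLOCK_GHZ * FLOPS_PER_CYCLE * CORES_PER_CHIP
    return {
        "nodes": nodes,
        "cores": nodes * CORES_PER_CHIP,
        "node_peak_gf": node_peak_gf,
        "node_card_peak_gf": node_peak_gf * CHIPS_PER_NODE_CARD,
        "rack_peak_tf": node_peak_gf * nodes_per_rack / 1000.0,
        "system_peak_pf": node_peak_gf * nodes / 1.0e6,
        # the slides count memory in binary units (1 TB = 1024 GB)
        "memory_tb": nodes * MEM_PER_NODE_GB / 1024.0,
    }


if __name__ == "__main__":
    for key, value in system_summary().items():
        print(f"{key:>18}: {value}")
```

Under these assumptions the full 72-rack system comes out at 73728 nodes and 294912 cores, matching the core count on the jugene slide, and calling `system_summary(racks=1)` gives the per-rack values.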

system access (cont.)
- Compute Nodes are dedicated to running the user application and almost nothing else; they run a simple compute node kernel (CNK)
- I/O Nodes run Linux and provide a more complete range of OS services: files, sockets, process launch, signalling, debugging, and termination
- Service Node performs system management services (e.g., partitioning, heart beating, monitoring errors), transparent to application software

BG/P compute node software
- Compute Node Kernel (CNK)
  - minimal kernel
  - handles signals, function shipping of system calls to I/O nodes, starting/stopping jobs, threads
  - not much else
- very “Linux-like”, uses glibc
  - missing some system calls (fork() mostly)
  - limited support for mmap(), execve()
- but most apps that run on Linux work out-of-the-box on BG/P

BG/P I/O node software
- I/O Node Kernel, Mini-Control Program (MCP)
- Linux
  - port of the Linux kernel, GPL/LGPL licensed
  - Linux version 2.6.16
- very minimal distribution
- only connection from compute nodes to outside world
- handles syscalls (e.g. fopen()) and I/O requests
- file system support: NFS, PVFS, GPFS, Lustre FS

BG/P networks
- 3D torus network
  - only for point-to-point between compute nodes
  - hardware latency: 0.5 to 5 µs; MPI latency: 3 to 10 µs
  - bandwidth: 6 × 2 × 425 MB/s = 5.1 GB/s (per compute node)
  - direct memory access (DMA) unit, communication and computation overlap
- collective network
  - one-to-all, reduction functionality (compute and I/O nodes)
  - one-way tree traversal latency: 1.3 µs; MPI: 5 µs
  - bandwidth: 850 MB/s per link

BG/P networks (cont.)
- barrier network
  - hardware latency for full system: 0.65 µs; MPI: 1.6 µs
- 10 Gb network
  - I/O nodes only
  - file I/O, all external communication
- 1 Gb network
  - control network (boot, debug, monitor)
  - compute and I/O nodes

BG/P architectural features
- small footprint (4k cores per rack)
- high energy efficiency (2.5 kW per 1 TF/s)
- no network hierarchy, scalable up to full system
- easy programming based on MPI
- high reliability
- balanced system

comparison to other architectures (approximation)
- core linpack performance
  - BG/P: 3 GF/s
  - XT5/PWR6/x86: 7/ 12.5/ 12 GF/s
- triad memory bandwidth per core [related to GF/s]
  - BG/P: 4.4 GB/s [ 1.5 byte/flop ]
  - XT5/PWR6/x86: 2.5/ 3.3/ (8) GB/s [ 0.3/ 0.25/ 0.7 ]
- all-to-all performance, two nodes [related to GF/s]
  - BG/P: 1 GB/s [ 0.08 byte/flop ]
  - XT5/PWR6/x86: 3/ 3/ 2 GB/s [ 0.05/ 0.004/ 0.01 ]
- energy efficiency
  - BG/P: 300 MF/J
  - XT5/PWR6/x86: 150/ 85/ 200 MF/J

BG/P cons
- only 512 MB memory per core
- low core performance, 5 to 10 times more cores needed (compared to current general-purpose CPUs)
- torus network might not perform well for unstructured communication patterns
- cross compilation
- CNK (compute node kernel) is not a full Linux system

application scaling example
[figure: PEPC performance; time in inner loop [s] (1 to 100) versus number of cores (512 to 4096); series: IBM BG/P - jugene, Intel Nehalem - juropa, Cray XT5 - louhi, IBM Power6 - huygens]

application scaling example (cont.)
[figure: PEPC performance; time in inner loop [s] (1 to 100) versus partition performance [TF/s] (1 to 100); series: IBM BG/P - jugene, Intel Nehalem - juropa, Cray XT5 - louhi, IBM Power6 - huygens]

practical information
- contact me (now or tomorrow) for a private key
- account will be valid until 18.10.2009
- common passphrase: (WS-kra09)
- make sure you are able to log in on jugene [login command illegible in source]
- have a brief look at our documentation and user info, http://www.fz-juelich.de/jsc/jugene/
- you will be able to submit jobs on 16./17.10.2009

PART II
JUGENE USAGE

login
- use the individually distributed private key
- log in via
  ssh -i ssh_key [email protected]
- logins are automatically distributed to two different login nodes: jugene3 and jugene4
see: http://www.fz-juelich.de/jsc/jugene/usage/logon/

available compiler
- need to cross-compile
- compilers for the front-end (Power6) only
  - GNU: gcc, gfortran, ...
  - IBM XL: xlc, xlf90, ...
- and for jugene (PowerPC 450) with MPI wrapper
  - GNU: mpicc, mpif90, ...
  - IBM XL: mpixlc, mpixlf90, ...
- thread-safe versions available (*_r)
FZJ-Info: http://www.fz-juelich.de/jsc/jugene/usage/tuning/
IBM XL documentation: http://publib.boulder.ibm.com/infocenter/compbgpl/v9v111/index.jsp
BG/P redbook: http://www.fz-juelich.de/jsc/datapool/jugene/bgp_appl_sg247287_V1.4.pdf

XL compiler options (optimization)
- -O2
  - default optimization level
  - eliminates redundant code
  - basic loop optimization
  - can structure code to take advantage of -qarch and -qtune settings
- -O3
  - in-depth memory access analysis
  - better loop scheduling
  - high-order loop analysis and transformations
  - inlining of small procedures within a compilation unit by default
  - pointer aliasing improvements to enhance other optimizations
- ..
