NVIDIA GPU Applications Presentation

Outline of NVIDIA tutorial at MSI

I. Intro to GPUs
   A. MSI plans to add significantly more GPUs in the future; the current system is described at: https://www.msi.umn.edu/hpc/cascade
   B. MSI currently has:
      i. 8 Westmere nodes with 4 Fermi M2070 GPGPUs/node (Cascade queue)
      ii. 4 SandyBridge nodes with 2 Kepler K20m GPGPUs/node (Kepler queue)
II. Computational Chemistry and Biology Applications
   A. Molecular Dynamics Applications, including:
      i. AMBER, NAMD, LAMMPS, GROMACS, CHARMM, DESMOND, DL_POLY, OpenMM
      ii. GPU only codes: ACEMD, HOOMD-Blue
   B. Quantum Chemistry, including:
      i. Abinit, BigDFT, CP2K, Gaussian, GAMESS, GPAW, NWChem, Quantum Espresso, Q-CHEM, VASP & more
      ii. GPU only code: TeraChem (ParaChem)
   C. Bioinformatics
      i. NVBIO (nvBowtie), BWA-MEM (Both currently in development)

Updated: Oct. 21, 2013

NVIDIA
   Founded 1993
   Invented GPU 1999 – Computer Graphics
   Visual Computing, Supercomputing, Cloud & Mobile Computing

NVIDIA - Core Technologies and Brands
   GPU: GeForce®, Quadro®, Tesla®
   Mobile: Tegra®
   Cloud: GRID

Accelerated Computing: Multi-core plus Many-cores
   CPU: Optimized for Serial Tasks
   GPU Accelerator: Optimized for Many Parallel Tasks
   3-10X+ Comp Thruput
   7X Memory Bandwidth
   5x Energy Efficiency

How GPU Acceleration Works
   Application Code
      GPU: Compute-Intensive Functions (5% of Code)
      CPU: Rest of Sequential CPU Code
   GPU + CPU

GPUs: Two Year Heart Beat
   [Chart: DP GFLOPS per Watt (axis 0.5 to 32) vs. year (2008 to 2014)]
   Tesla: CUDA
   Fermi: FP64
   Kepler: Dynamic Parallelism
   Maxwell: Unified Virtual Memory
   Volta: Stacked DRAM

Kepler Features Make GPU Coding Easier
   Hyper-Q: Speedup Legacy MPI Apps
      FERMI: 1 Work Queue
      KEPLER: 32 Concurrent Work Queues
   Dynamic Parallelism: Less Back-Forth, Simpler Code
      [Diagram: CPU and Fermi GPU vs. CPU and Kepler GPU]

Developer Momentum Continues to Grow (2008 vs. 2013)
   CUDA-Capable GPUs: 100M vs. 430M
   CUDA Downloads: 150K vs. 1.6M
   Supercomputers: 1 vs. 50
   University Courses: 60 vs. 640
   Academic Papers: 4,000 vs. 37,000
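The "How GPU Acceleration Works" slide above moves the compute-intensive functions to the GPU and leaves the rest of the sequential code on the CPU. As a rough illustration of why the overall gain depends on how much of the run time those functions account for, the following Python sketch applies Amdahl's law. Amdahl's law is not named in the presentation; the function name `overall_speedup`, the 90% run-time fraction, and the 10x kernel speedup are illustrative assumptions, not figures from the slides.

```python
def overall_speedup(offloaded_fraction: float, kernel_speedup: float) -> float:
    """Estimate whole-application speedup with Amdahl's law.

    offloaded_fraction: share of the original run time (0..1) spent in the
        functions moved to the GPU (hypothetical input, not from the slides).
    kernel_speedup: how much faster those functions run on the GPU.
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    cpu_part = 1.0 - offloaded_fraction             # time that stays on the CPU
    gpu_part = offloaded_fraction / kernel_speedup  # accelerated portion
    return 1.0 / (cpu_part + gpu_part)


if __name__ == "__main__":
    # Assumed example: 90% of run time offloaded, kernels run 10x faster.
    print(f"{overall_speedup(0.90, 10.0):.2f}x")
```

Under these assumed inputs the estimate is roughly 5.3x rather than 10x, which may help when reading speedup figures that are reported as kernel-to-kernel comparisons.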
Explosive Growth of GPU Accelerated Apps
   [Chart: # of Apps (axis 0 to 200) for 2010, 2011, 2012, annotated with a 40% Increase and a 61% Increase; legend: Accelerated, In Development]
   Top Scientific Apps
      Molecular Dynamics: AMBER, LAMMPS, CHARMM, NAMD, GROMACS, DL_POLY
      Quantum Chemistry: QMCPACK, Gaussian, Quantum Espresso, NWChem, GAMESS-US, VASP
      Climate & Weather: CAM-SE, COSMO, NIM, GEOS-5, WRF
      Physics: Chroma, GTS, Denovo, ENZO, GTC, MILC
      CAE: ANSYS Mechanical, ANSYS Fluent, MSC Nastran, OpenFOAM, SIMULIA Abaqus, LS-DYNA

Overview of Life & Material Accelerated Apps
   MD: All key codes are available
      AMBER, CHARMM, DESMOND, DL_POLY, GROMACS, LAMMPS, NAMD
      GPU only codes: ACEMD, HOOMD-Blue
      Great multi-GPU performance
      Focus: scaling to large numbers of GPUs / nodes
   QC: All key codes are ported/optimizing:
      Active GPU acceleration projects: Abinit, BigDFT, CP2K, GAMESS, Gaussian, GPAW, NWChem, Quantum Espresso, VASP & more
      GPU only code: TeraChem
   Analytical instruments – actively recruiting
   Bioinformatics – market development

GPU Test Drive
   Experience GPU Acceleration
   For Computational Chemistry Researchers, Biophysicists
   Preconfigured with Molecular Dynamics Apps
   Remotely Hosted GPU Servers
   Free & Easy – Sign up, Log in and See Results
   www.nvidia.com/gputestdrive

Molecular Dynamics (MD) Applications

AMBER
   Features Supported: PMEMD Explicit Solvent & GB Implicit Solvent
   GPU Perf: > 100 ns/day JAC NVE on 2X K20s
   Release Status: Released; Multi-GPU, multi-node
   Notes/Benchmarks: AMBER 12, GPU Revision Support 12.2; http://ambermd.org/gpus/benchmarks.htm#Benchmarks

CHARMM
   Features Supported: Implicit (5x), Explicit (2x) Solvent via OpenMM
   GPU Perf: 2x C2070 equals 32-35x X5667 CPUs
   Release Status: Released; Single & Multi-GPU in single node
   Notes/Benchmarks: Release C37b1; http://www.charmm.org/news/c37b1.html#postjump

DL_POLY
   Features Supported: Two-body Forces, Link-cell Pairs, Ewald SPME forces, Shake VV
   GPU Perf: 4x
   Release Status: Release V 4.04; Multi-GPU, multi-node
   Notes/Benchmarks: Source only, Results Published; http://www.stfc.ac.uk/CSE/randd/ccg/software/DL_POLY/25526.aspx

GROMACS
   Features Supported: Implicit (5x), Explicit (2x)
   GPU Perf: 165 ns/Day DHFR on 4X C2075s
   Release Status: Released; Multi-GPU, multi-node
   Notes/Benchmarks: Release 4.6.3; 1st Multi-GPU support; www.gromacs.org

LAMMPS
   Features Supported: Lennard-Jones, Gay-Berne, Tersoff & many more potentials
   GPU Perf: 3.5-18x on ORNL Titan
   Release Status: Released; Multi-GPU, multi-node
   Notes/Benchmarks: http://lammps.sandia.gov/bench.html#desktop and http://lammps.sandia.gov/bench.html#titan

NAMD
   Features Supported: Full electrostatics with PME and most simulation features
   GPU Perf: 4.0 ns/day F1-ATPase on 1x K20X
   Release Status: Released; 100M atom capable; Multi-GPU, multi-node
   Notes/Benchmarks: NAMD 2.9; http://www.ks.uiuc.edu/Research/namd/

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.

New/Additional MD Applications Ramping

ACEMD
   Features Supported: Production biomolecular dynamics (MD) software specially optimized to run on GPUs
   GPU Perf: 150 ns/day DHFR on 1x K20
   Release Status: Released; Single and Multi-GPUs
   Notes: Written for use only on GPUs; http://www.acellera.com/

DESMOND
   Features Supported: Bonded, pair, excluded interactions; Van der Waals, electrostatic, nonbonded far interactions
   GPU Perf: TBD
   Release Status: Released, Version 3.4.0/0.7.2
   Notes: http://www.schrodinger.com/productpage/14/3/

Folding@Home
   Features Supported: Powerful distributed computing molecular dynamics system; implicit solvent and folding
   GPU Perf: Depends upon number of GPUs
   Release Status: Released; GPUs and CPUs
   Notes: http://folding.stanford.edu; GPUs get 4X the points of CPUs

GPUGrid.net
   Features Supported: High-performance all-atom biomolecular simulations; explicit solvent and binding
   GPU Perf: Depends upon number of GPUs
   Release Status: Released
   Notes: http://www.gpugrid.net/

HALMD
   Features Supported: Simple fluids and binary mixtures (pair potentials, high-precision NVE and NVT, dynamic correlations)
   GPU Perf: Up to 66x on 2090 vs. 1 CPU core
   Release Status: Released, Version 0.2.0; Single GPU
   Notes: http://halmd.org/benchmarks.html#supercooled-binary-mixture-kob-andersen

HOOMD-Blue
   Features Supported: Written for use only on GPUs
   GPU Perf: Kepler 2X faster than Fermi
   Release Status: Released, Version 0.11.3; Single and Multi-GPU on 1 node
   Notes: http://codeblue.umich.edu/hoomd-blue/; Multi-GPU w/ MPI in late 2013

mdcore
   Features Supported: TBD
   GPU Perf: TBD
   Release Status: Released, Version 0.1.7
   Notes: http://mdcore.sourceforge.net/download.html

OpenMM
   Features Supported: Implicit and explicit solvent, custom forces
   GPU Perf: Implicit: 127-213 ns/day; Explicit: 18-55 ns/day DHFR
   Release Status: Released, Version 5.2; Multi-GPU
   Notes: Library and application for molecular dynamics on high-performance; https://simtk.org/home/openmm_suite

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.

Quantum Chemistry Applications

Note for all entries: GPU Perf compared against Multi-core x86 CPU socket. GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.

ABINIT
   Features Supported: Local Hamiltonian, non-local Hamiltonian, LOBPCG algorithm, diagonalization / orthogonalization
   GPU Perf: 1.3-2.7X
   Release Status: Released; Version 7.4.1; Multi-GPU support
   Notes: www.abinit.org

ACES III
   Features Supported: Integrating scheduling GPU into SIAL programming language and SIP runtime environment
   GPU Perf: 10X on kernels
   Release Status: Under development; Multi-GPU support
   Notes: http://www.olcf.ornl.gov/wp-content/training/electronic-structure-2012/deumens_ESaccel_2012.pdf

ADF
   Features Supported: Fock Matrix, Hessians
   GPU Perf: TBD
   Release Status: Available Q1 2014; Multi-GPU support
   Notes: www.scm.com

BigDFT
   Features Supported: DFT; Daubechies wavelets, part of Abinit
   GPU Perf: 2-5X
   Release Status: Released, Version 1.7.0; Multi-GPU support
   Notes: http://bigdft.org

Casino
   Features Supported: Code for performing quantum Monte Carlo (QMC) electronic structure calculations for finite and periodic systems
   GPU Perf: TBD
   Release Status: Under development; Multi-GPU support
   Notes: http://www.tcm.phy.cam.ac.uk/~mdt26/casino.html

CASTEP
   Features Supported: TBD
   GPU Perf: TBD
   Release Status: Under development
   Notes: http://www.castep.org/Main/HomePage

CP2K
   Features Supported: DBCSR (sparse matrix multiply library)
   GPU Perf: 2-7X
   Release Status: Released; Multi-GPU support
   Notes: http://www.olcf.ornl.gov/wp-content/training/ascc_2012/friday/ACSS_2012_VandeVondele_s.pdf

GAMESS-US
   Features Supported: Libqc with Rys Quadrature Algorithm, Hartree-Fock, MP2 and CCSD
   GPU Perf: 1.3-1.6X, 2.3-2.9x HF
   Release Status: Released; Multi-GPU support
   Notes: http://www.msg.ameslab.gov/gamess/index.html

GAMESS-UK
   Features Supported: (ss|ss) type integrals within calculations using Hartree Fock ab initio methods and density functional theory. Supports organics & inorganics.
   GPU Perf: 8x
   Release Status: Released, Version 7.0; Multi-GPU support
   Notes: http://www.ncbi.nlm.nih.gov/pubmed/21541963

Gaussian
   Features Supported: Joint PGI, NVIDIA & Gaussian Collaboration
   GPU Perf: TBD
   Release Status: Under development; Multi-GPU support
   Notes: http://www.gaussian.com/g_press/nvidia_press.htm

GPAW
   Features Supported: Electrostatic Poisson equation, orthonormalizing of vectors, residual minimization method (rmm-diis)
   GPU Perf: 8x
   Release Status: Released; Multi-GPU support
   Notes: https://wiki.fysik.dtu.dk/gpaw/devel/projects/gpu.html, Samuli Hakala (CSC Finland) & Chris O’Grady (SLAC)

Jaguar
   Features Supported: Investigating GPU acceleration
   GPU Perf: TBD
   Release Status: Under development; Multi-GPU support
   Notes: Schrodinger, Inc.; http://www.schrodinger.com/kb/278; http://www.schrodinger.com/productpage/14/7/32/

LATTE
   Features Supported: CU_BLAS, SP2 algorithm
   GPU Perf: TBD
   Release Status: Released; Multi-GPU support
   Notes: http://on-demand.gputechconf.com/gtc/2013/presentations/S3195-Fast-Quantum-Molecular-Dynamics-in-LATTE.pdf

MOLCAS
   Features Supported: CU_BLAS support
   GPU Perf: 1.1x
   Release Status: Released, Version 7.8; Single GPU; Additional GPU support coming in Version 8
   Notes: www.molcas.org

MOLPRO
   Features Supported: Density-fitted MP2 (DF-MP2), density fitted local correlation methods (DF-RHF, DF-KS), DFT
   GPU Perf: 1.7-2.3X projected
   Release Status: Under development; Multiple GPU
   Notes: www.molpro.net; Hans-Joachim Werner

MOPAC2012
   Features Supported: Pseudodiagonalization, full diagonalization, and density matrix assembling
   GPU Perf: 3.8-14X
   Release Status: Released; MOPAC2013 available Q1 2014; Single GPU
   Notes: Academic port; http://openmopac.net

Development
