Accelerating Materials-Simulation Computations on GPUs (Accélération de calculs de simulations des matériaux sur GPU)

PDF, 16 pages, 1020 kB

Accélération de calculs de simulations des matériaux sur GPU (Accelerating materials-simulation computations on GPUs) — François Courteille, Senior Solution Architect, Accelerated Computing, NVIDIA.

AGENDA
- Tesla, the NVIDIA compute platform
- Quantum chemistry applications overview
- Molecular dynamics applications overview
- Materials science applications at the start of the Pascal era

THE WORLD LEADER IN VISUAL COMPUTING: gaming, auto, enterprise, HPC & cloud technology.

OUR TEN YEARS IN HPC — milestones, 2006-2016: CUDA launched (2006); world's first GPU Top500 system; Fermi, world's first HPC GPU; discovered how H1N1 mutates to resist drugs; world's first 3-D mapping of the human genome; Oak Ridge deploys world's fastest supercomputer with GPUs; AlexNet beats expert code by a huge margin using GPUs; Stanford builds AI machine using GPUs; world's first atomic model of the HIV capsid; Google outperforms humans in ImageNet; GPU-trained AI machine beats the world champion in Go (2016).

CREDENTIALS BUILT OVER TIME
- GPU-accelerated applications by year: 113 (2011), 206 (2012), 242 (2013), 287 (2014), 370 (2015), 410 (2016). The majority of HPC applications are GPU-accelerated — 410 and growing (www.nvidia.com/appscatalog).
- Domains: academia, games, finance, manufacturing, internet, oil & gas, national labs, automotive, defense, M&E.
- 300K CUDA developers — 4x growth in 4 years.
- 100% of deep-learning frameworks are accelerated: Torch, Caffe, Theano, MatConvNet, Mocha.jl, Purine, Minerva, MXNet, Big Sur, TensorFlow, Watson, CNTK.

END-TO-END TESLA PRODUCT FAMILY
- Tesla M4, M40 (DL hyperscale): hyperscale deployment for DL training, inference, video & image processing.
- Tesla K80 (mixed-apps HPC): HPC data centers running a mix of CPU and GPU workloads.
- Tesla P100 (strong-scaling HPC): hyperscale & HPC data centers running apps that scale to multiple GPUs.
- DGX-1 (fully integrated DL supercomputer): for customers who need to get going now with a fully integrated solution.

TESLA FOR SIMULATION: libraries, directives, languages.
ACCELERATED COMPUTING TOOLKIT — Tesla accelerated computing.

OVERVIEW OF LIFE & MATERIAL ACCELERATED APPS
- MD: all key codes are GPU-accelerated; great multi-GPU performance; focus on dense (up to 16 GPUs) nodes and/or a large number of GPU nodes.
- QC: all key codes are ported or being optimized; focus on GPU-accelerated math libraries and OpenACC directives.
- GPU-accelerated and available today: ABINIT, ACEMD*, ACES III, ADF, AMBER (PMEMD)*, BAND, BigDFT, CHARMM, CP2K, DESMOND, ESPResSo, Folding@Home, GAMESS, GAMESS-UK, GPAW, GPUgrid.net, GROMACS, HALMD, HOOMD-Blue*, LAMMPS, LATTE, Lattice Microbes*, LSDalton, LSMS, mdcore, MELD, miniMD, MOLCAS, MOPAC2012, NAMD, NWChem, OCTOPUS*, OpenMM, PEtot, PolyFTS, Q-Chem, QMCPack, Quantum Espresso/PWscf, QUICK, SOP-GPU*, TeraChem*, and more.
- Active GPU-acceleration projects: CASTEP, GAMESS, Gaussian, ONETEP, Quantum Supercharger Library*, VASP, and more.
- * marks applications where >90% of the workload runs on the GPU.

MD VS. QC ON GPUS
- What is computed — MD ("classical" molecular dynamics): positions of atoms over time, i.e. chemical-biological or chemical-material behaviors. QC (MO, PW, DFT, semi-empirical): electronic properties — ground state, excited states, spectral properties, making/breaking bonds, physical properties.
- Forces — MD: calculated from simple empirical formulas (bond rearrangement generally forbidden). QC: derived from the electron wave function (bond rearrangement OK, e.g. bond energies).
- System size — MD: up to millions of atoms. QC: up to a few thousand atoms.
- Solvent — MD: included without difficulty. QC: generally in vacuum; if needed, solvent is treated classically (QM/MM) or with implicit methods.
- Precision — MD: single precision dominates. QC: double precision is important.
- Libraries — MD: cuBLAS, cuFFT, CUDA. QC: cuBLAS, cuFFT, OpenACC.
- Hardware — MD: GeForce (academics) or Tesla (servers), ECC off. QC: Tesla recommended, ECC on.

GPU-ACCELERATED QUANTUM CHEMISTRY APPS (green lettering in the deck indicates that performance slides are included): ABINIT, ACES III, ADF, BigDFT, CP2K, GAMESS-US, Gaussian, GPAW, LATTE, LSDalton, MOLCAS, MOPAC2012, NWChem, Octopus, ONETEP, PEtot, Q-Chem, QMCPACK, Quantum Espresso, Quantum SuperCharger Library, TeraChem, VASP, WL-LSMS. GPU performance is compared against a dual multi-core x86 CPU socket.

ABINIT
6th International ABINIT Developer Workshop, April 15-18, 2013, Dinard, France — "Using ABINIT on graphics processing units (GPU)", Marc Torrent (CEA, DAM, DIF, F-91297 Arpajon, France) and Florent Dahm (C-S, 22 av. Galilée, 92350 Le Plessis-Robinson), with a contribution from Yann Pouillon for the build system. www.cea.fr — ABINIT on GPU, April 15, 2013.

OUTLINE: introduction to GPU computing (architecture, programming model); density-functional theory with plane waves and how to port it to GPU; ABINIT on GPU (implementation, performance of ABINIT v7.0); how to compile and use.

LOCAL OPERATOR: FAST FOURIER TRANSFORMS
In a plane-wave basis, the local effective potential couples components $g$ and $g'$:

$(v_{\mathrm{eff}}\,\tilde\psi)(g) = \sum_{g'} v_{\mathrm{eff}}(g,g')\, c_{g'}$, with $v_{\mathrm{eff}}(g,g') = \int dr\, e^{-i g\cdot r}\, v_{\mathrm{eff}}(r)\, e^{i g'\cdot r}$.

Our choice (for a first version): replace the MPI plane-wave-level FFT by an FFT computed on the GPU, using the cuFFT library included in the CUDA package (interfaced with C and Fortran). A system-dependent efficiency is expected: small FFTs underuse the GPU and are dominated by the time required to transfer data to/from the GPU, and within PAW small plane-wave bases are used.

Implementation: a CUDA/Fortran interface; new routines that encapsulate cuFFT calls; development of three small CUDA kernels; optimization of memory copies to overlap computation and data transfer (fourwf, option 2). Performance is highly dependent on FFT size. Test case: a 107-gold-atom cell on TGCC-Curie (Intel Westmere + NVIDIA M2090).

Improvement: multiple-band optimization — the CUDA kernels are used more intensively and fewer transfers are required.
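The local-operator slides reduce to a simple pattern: transform the wave function to real space, multiply by v_eff(r), transform back, and batch the FFTs over several bands (the bandpp idea) so the transforms are large enough to keep the GPU busy. A minimal NumPy sketch of that pattern — NumPy's FFT standing in for cuFFT, and the function name `apply_local_potential` being ours, not ABINIT's fourwf:

```python
import numpy as np

def apply_local_potential(psi_g, v_r):
    """Apply v_eff to a whole block of bands at once.

    psi_g : (nband, n1, n2, n3) plane-wave coefficients, one 3-D grid per band
    v_r   : (n1, n2, n3) local effective potential on the real-space grid

    Batching the 3-D FFT over all bands replaces many small transforms
    by one large one and amortizes host<->device transfer costs.
    """
    psi_r = np.fft.ifftn(psi_g, axes=(-3, -2, -1))  # G -> r, all bands at once
    vpsi_r = v_r * psi_r                            # pointwise multiply in r
    return np.fft.fftn(vpsi_r, axes=(-3, -2, -1))   # r -> G

rng = np.random.default_rng(0)
psi = rng.standard_normal((4, 8, 8, 8)) + 1j * rng.standard_normal((4, 8, 8, 8))
v = rng.standard_normal((8, 8, 8))
vpsi = apply_local_potential(psi, v)
```

On a GPU the same structure maps to batched cuFFT plans; the point of the bandpp table above is exactly the gain from growing that batch dimension.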
See the bandpp input variable, used when the locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm is selected. fourwf time on GPU (seconds) versus bandpp, test cases on TGCC-Curie (Intel Westmere + NVIDIA M2090):

  107 gold atoms:  bandpp = 1: 112.94 | 2: 84.55 | 4: 70.55 | 8: 61.91 | 18: 54.79 | 72: 50.55
  31 copper atoms: bandpp = 1: 36.02  | 2: 20.79 | 4: 14.14 | 10: 9.16 | 20: 9.50  | 100: 7.94

NON-LOCAL OPERATOR
Implementation: two CUDA kernels for the projections

  $\langle \tilde p_i | \psi \rangle = \sum_{g'} \tilde p_i(g')\, \psi(g')$

and two CUDA kernels for the operator application

  $\langle g | V_{NL} | \psi \rangle = \sum_{i,j} \langle g | \tilde p_i \rangle\, D_{ij}\, \langle \tilde p_j | \psi \rangle$,

using collaborative memory zones as much as possible (threads linked to the same plane wave lie in the same collaborative zone). Used for the non-local operator, the overlap operator, forces and the stress tensor, and the PAW occupation matrix — in addition to all MPI parallelization levels. It needs a large number of plane waves and a large number of non-local projectors.

Performance (NL operator only), TGCC-Curie (Intel Westmere + NVIDIA M2090):

                    nonlop CPU (s)   nonlop GPU (s)   speedup GPU vs CPU
  107 gold atoms    3142             1122             2.8
  31 copper atoms   85               42               2.0
  3 UO2 atoms       63               36               1.8

Algorithm 1 — lobpcg (locally optimal block preconditioned conjugate gradient)
Require: $\Psi^{(0)} = \{\psi_1^{(0)}, \ldots, \psi_m^{(0)}\}$ close to the minimum and $K$ a preconditioner; $P^{(0)} = \{P_1^{(0)}, \ldots, P_m^{(0)}\}$ is initialized to 0.
1: for $i = 0, 1, 2, \ldots$ do
2:   $\Lambda^{(i)} = \Lambda(\Psi^{(i)})$
3:   $R^{(i)} = H\,\Psi^{(i)} - \Lambda^{(i)}\, O\, \Psi^{(i)}$
4:   $W^{(i)} = K\, R^{(i)}$
5:   the Rayleigh-Ritz method is applied within the subspace $\Xi = \{P_1^{(i)}, \ldots, P_m^{(i)}, \psi_1^{(i)}, \ldots, \psi_m^{(i)}, W_1^{(i)}, \ldots, W_m^{(i)}\}$
6:   $\Psi^{(i+1)} = \alpha^{(i)}\,\Psi^{(i)} + \tau^{(i)}\, W^{(i)} + \gamma^{(i)}\, P^{(i)}$
7:   $P^{(i+1)} = \tau^{(i)}\, W^{(i)} + \gamma^{(i)}\, P^{(i)}$
8: end for

Timings versus bandpp (seconds):

  bandpp   total time   linear algebra
  1        674          119
  2        612          116
  10       1077         428
  50       5816         5167

The algorithm is dominated by matrix-matrix and matrix-vector products plus diagonalization/orthogonalization within blocks. These products are not parallelized, and the size of the matrices increases when band parallelization is used.
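The non-local operator above is just two dense contractions — project onto the $\tilde p_i$, scale by $D_{ij}$, expand back — which is why cuBLAS-style batched matrix products pay off once there are many plane waves and projectors. A hedged NumPy sketch (the function name is ours; ABINIT's nonlop handles PAW bookkeeping this toy ignores):

```python
import numpy as np

def apply_nonlocal(psi_g, p_g, D):
    """V_NL|psi> = sum_ij |p_i> D_ij <p_j|psi> in a plane-wave basis.

    psi_g : (npw,) complex wave-function coefficients psi(g')
    p_g   : (nproj, npw) projector functions <g|p_i>
    D     : (nproj, nproj) non-local strengths D_ij
    """
    proj = p_g.conj() @ psi_g    # <p_j|psi> = sum_g' p_j(g')* psi(g')
    return p_g.T @ (D @ proj)    # sum_ij <g|p_i> D_ij <p_j|psi>

rng = np.random.default_rng(1)
npw, nproj = 64, 6
psi = rng.standard_normal(npw) + 1j * rng.standard_normal(npw)
p = rng.standard_normal((nproj, npw)) + 1j * rng.standard_normal((nproj, npw))
D = rng.standard_normal((nproj, nproj))
vnl_psi = apply_nonlocal(psi, p, D)
```

Both steps are plain GEMV/GEMM calls, so on the GPU they fall straight through to cuBLAS.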
The size of the vectors/matrices increases with the number of band/plane-wave CPU cores — hence the need for a free LAPACK-like package on GPU. Our choices: cuBLAS for the products, MAGMA for the diagonalization.

MAGMA — Matrix Algebra on GPU and Multicore Architectures. The MAGMA project aims to develop a dense linear algebra library similar to LAPACK, but for heterogeneous/hybrid architectures, starting with current "multicore + GPU" systems. The MAGMA research is based on the idea that, to address the complex challenges of the emerging hybrid environments, optimal software solutions will themselves have to hybridize, combining the strengths of different algorithms within a single framework; the goal is to enable applications to fully exploit the power of each of the hybrid components. A project by the Innovative Computing Laboratory, University of Tennessee; free to use; first stable release 25-08-2011, version 1.3 current at the time of the slides. Computation is cut up so as to offload the consuming parts onto the GPU.

ITERATIVE EIGENSOLVER: speedup of the cuBLAS/MAGMA code — performance.
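SciPy ships the same LOBPCG algorithm, so the block iteration of Algorithm 1 can be tried on a toy Hamiltonian in a few lines (a dense stand-in for ABINIT's H and O, with a Jacobi diagonal playing the role of the preconditioner K; the overlap O is the identity here):

```python
import numpy as np
from scipy.sparse.linalg import lobpcg

n, m = 200, 4
# Toy Hamiltonian: 1-D Laplacian plus a slowly varying diagonal "potential"
H = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
H += np.diag(0.01 * np.arange(n))

rng = np.random.default_rng(0)
X = rng.standard_normal((n, m))   # block of m trial eigenvectors, Psi^(0)
K = np.diag(1.0 / np.diag(H))     # diagonal (Jacobi) preconditioner

# Smallest m eigenpairs of H
eigvals, eigvecs = lobpcg(H, X, M=K, largest=False, tol=1e-8, maxiter=500)
```

The per-iteration cost is exactly what the slides flag: block matrix products (cuBLAS territory) and a small Rayleigh-Ritz diagonalization (MAGMA territory).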
Recommended publications
  • GPAW, GPUs, and LUMI
    GPAW, GPUs, and LUMI — Martti Louhivuori (CSC - IT Center for Science) and Jussi Enkovaara. GPAW 2021: Users and Developers Meeting, 2021-06-01. Outline: the LUMI supercomputer; a brief history of GPAW with GPUs; GPUs and DFT; current status; roadmap. LUMI - EuroHPC system of the North: a pre-exascale system with AMD CPUs and GPUs, ~550 Pflop/s performance, half of the resources dedicated to consortium members (Finland, Belgium, Czechia, Denmark, Estonia, Iceland, Norway, Poland, Sweden, and Switzerland). Programming for LUMI: MPI between nodes/GPUs, HIP and OpenMP for GPUs — and how to use Python with AMD GPUs? https://www.lumi-supercomputer.eu GPAW and GPUs, history (1/2): an early proof-of-concept implementation for NVIDIA GPUs in 2012 — ground-state DFT and real-time TD-DFT with a finite-difference basis, plus a separate version for RPA with plane waves (Hakala et al. in "Electronic Structure Calculations on Graphics Processing Units", Wiley (2016), https://doi.org/10.1002/9781118670712), built on PyCUDA, cuBLAS, cuFFT, and custom CUDA kernels, with promising performance: a factor of 4-8 speedup in the best cases (CPU node vs. GPU node). GPAW and GPUs, history (2/2): the code base diverged from the main branch quite a bit — the proof-of-concept implementation had lots of quick-and-dirty hacks, fixes and features were pulled from other branches and patches, and there were no proper unit tests for GPU functionality; active development stopped soon after the publications. Before development re-started, the code didn't even work anymore on modern GPUs without applying a few small patches. Lesson learned: try to always get new functionality to the
  • Free and Open Source Software for Computational Chemistry Education
    Free and Open Source Software for Computational Chemistry Education Susi Lehtola∗,y and Antti J. Karttunenz yMolecular Sciences Software Institute, Blacksburg, Virginia 24061, United States zDepartment of Chemistry and Materials Science, Aalto University, Espoo, Finland E-mail: [email protected].fi Abstract Long in the making, computational chemistry for the masses [J. Chem. Educ. 1996, 73, 104] is finally here. We point out the existence of a variety of free and open source software (FOSS) packages for computational chemistry that offer a wide range of functionality all the way from approximate semiempirical calculations with tight- binding density functional theory to sophisticated ab initio wave function methods such as coupled-cluster theory, both for molecular and for solid-state systems. By their very definition, FOSS packages allow usage for whatever purpose by anyone, meaning they can also be used in industrial applications without limitation. Also, FOSS software has no limitations to redistribution in source or binary form, allowing their easy distribution and installation by third parties. Many FOSS scientific software packages are available as part of popular Linux distributions, and other package managers such as pip and conda. Combined with the remarkable increase in the power of personal devices—which rival that of the fastest supercomputers in the world of the 1990s—a decentralized model for teaching computational chemistry is now possible, enabling students to perform reasonable modeling on their own computing devices, in the bring your own device 1 (BYOD) scheme. In addition to the programs’ use for various applications, open access to the programs’ source code also enables comprehensive teaching strategies, as actual algorithms’ implementations can be used in teaching.
  • D6.1 Report on the Deployment of the MaX Demonstrators and Feedback to WP1-5
    Ref. Ares(2020)2820381 - 31/05/2020. HORIZON 2020 European Centre of Excellence — Deliverable D6.1: Report on the deployment of the MaX Demonstrators and feedback to WP1-5. Pablo Ordejón, Uliana Alekseeva, Stefano Baroni, Riccardo Bertossa, Miki Bonacci, Pietro Bonfà, Claudia Cardoso, Carlo Cavazzoni, Vladimir Dikan, Stefano de Gironcoli, Andrea Ferretti, Alberto García, Luigi Genovese, Federico Grasselli, Anton Kozhevnikov, Deborah Prezzi, Davide Sangalli, Joost VandeVondele, Daniele Varsano, Daniel Wortmann. Due date of deliverable: 31/05/2020. Actual submission date: 31/05/2020. Final version: 31/05/2020. Lead beneficiary: ICN2 (participant number 3). Dissemination level: PU - Public. www.max-centre.eu Document information — Project acronym: MaX. Project full title: Materials Design at the Exascale. Project type: European Centre of Excellence in materials modelling, simulations and design (Research Action). EC Grant agreement no.: 824143. Project starting / end date: 01/12/2018 (month 1) / 30/11/2021 (month 36). Website: www.max-centre.eu. Deliverable No.: D6.1. To be cited as: Ordejón et al. (2020): Report on the deployment of the MaX Demonstrators and feedback to WP1-5. Deliverable D6.1 of the H2020 project MaX (final version as of 31/05/2020).
  • Introducing ONETEP: Linear-Scaling Density Functional Simulations on Parallel Computers
    THE JOURNAL OF CHEMICAL PHYSICS 122, 084119 (2005). Introducing ONETEP: Linear-scaling density functional simulations on parallel computers. Chris-Kriton Skylaris, Peter D. Haynes, Arash A. Mostofi, and Mike C. Payne, Theory of Condensed Matter, Cavendish Laboratory, Madingley Road, Cambridge CB3 0HE, United Kingdom (Received 29 September 2004; accepted 4 November 2004; published online 23 February 2005). We present ONETEP (order-N electronic total energy package), a density functional program for parallel computers whose computational cost scales linearly with the number of atoms and the number of processors. ONETEP is based on our reformulation of the plane wave pseudopotential method which exploits the electronic localization that is inherent in systems with a nonvanishing band gap. We summarize the theoretical developments that enable the direct optimization of strictly localized quantities expressed in terms of a delocalized plane wave basis. These same localized quantities lead us to a physical way of dividing the computational effort among many processors to allow calculations to be performed efficiently on parallel supercomputers. We show with examples that ONETEP achieves excellent speedups with increasing numbers of processors and confirm that the time taken by ONETEP as a function of increasing number of atoms for a given number of processors is indeed linear. What distinguishes our approach is that the localization is achieved in a controlled and mathematically consistent manner so that ONETEP obtains the same accuracy as conventional cubic-scaling plane wave approaches and offers fast and stable convergence. We expect that calculations with ONETEP have the potential to provide quantitative theoretical predictions for problems involving thousands of atoms such as those often encountered in nanoscience and biophysics.
  • Supporting Information
    Electronic Supplementary Material (ESI) for RSC Advances. This journal is © The Royal Society of Chemistry 2020 Supporting Information How to Select Ionic Liquids as Extracting Agent Systematically? Special Case Study for Extractive Denitrification Process Shurong Gaoa,b,c,*, Jiaxin Jina,b, Masroor Abroc, Ruozhen Songc, Miao Hed, Xiaochun Chenc,* a State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources, North China Electric Power University, Beijing, 102206, China b Research Center of Engineering Thermophysics, North China Electric Power University, Beijing, 102206, China c Beijing Key Laboratory of Membrane Science and Technology & College of Chemical Engineering, Beijing University of Chemical Technology, Beijing 100029, PR China d Office of Laboratory Safety Administration, Beijing University of Technology, Beijing 100124, China * Corresponding author, Tel./Fax: +86-10-6443-3570, E-mail: [email protected], [email protected] 1 COSMO-RS Computation COSMOtherm allows for simple and efficient processing of large numbers of compounds, i.e., a database of molecular COSMO files; e.g. the COSMObase database. COSMObase is a database of molecular COSMO files available from COSMOlogic GmbH & Co KG. Currently COSMObase consists of over 2000 compounds including a large number of industrial solvents plus a wide variety of common organic compounds. All compounds in COSMObase are indexed by their Chemical Abstracts / Registry Number (CAS/RN), by a trivial name and additionally by their sum formula and molecular weight, allowing a simple identification of the compounds. We obtained the anions and cations of different ILs and the molecular structure of typical N-compounds directly from the COSMObase database in this manuscript.
  • ELSI: An Open Infrastructure for Electronic Structure Solvers
    ELSI — An Open Infrastructure for Electronic Structure Solvers. Victor Wen-zhe Yu, Carmen Campos, William Dawson, Alberto García, Ville Havu, Ben Hourahine, William P. Huhn, Mathias Jacquelin, Weile Jia, Murat Keçeli, Raul Laasner, Yingzhou Li, Lin Lin, Jianfeng Lu, Jonathan Moussa, Jose E. Roman, Álvaro Vázquez-Mayagoitia, Chao Yang, Volker Blum. Affiliations: Duke University (Mechanical Engineering and Materials Science, Mathematics, Physics, Chemistry), Durham, NC, USA; Universitat Politècnica de València, Spain; RIKEN Center for Computational Science, Kobe, Japan; Institut de Ciència de Materials de Barcelona (ICMAB-CSIC), Bellaterra, Spain; Aalto University, Finland; SUPA, University of Strathclyde, Glasgow, UK; Lawrence Berkeley National Laboratory and University of California, Berkeley, CA, USA; Argonne National Laboratory, Argonne, IL, USA; Molecular Sciences Software Institute, Blacksburg, VA, USA. arXiv:1912.13403v3 [physics.comp-ph], 5 Jul 2020. Abstract: Routine applications of electronic structure theory to molecules and periodic systems need to compute the electron density from given Hamiltonian and, in case of non-orthogonal basis sets, overlap matrices. System sizes can range from few to thousands or, in some examples, millions of atoms. Different discretization schemes (basis sets) and different system geometries (finite non-periodic vs.
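The kernel ELSI abstracts — given a Hamiltonian H and, for non-orthogonal bases, an overlap matrix S, produce the density matrix — fits in a few lines at small scale with a dense solver. A sketch with `scipy.linalg.eigh` standing in for ELSI's ELPA/PEXSI/SLEPc back ends (the function name is ours, not ELSI's API):

```python
import numpy as np
from scipy.linalg import eigh

def density_matrix(H, S, n_occ):
    """Solve H C = S C diag(eps), then P = 2 * sum_occ c_i c_i^T.

    H, S  : (n, n) symmetric Hamiltonian and positive-definite overlap
    n_occ : number of (doubly) occupied orbitals
    """
    eps, C = eigh(H, S)            # generalized symmetric eigenproblem
    C_occ = C[:, :n_occ]           # eigh returns eigenvalues in ascending order
    return 2.0 * C_occ @ C_occ.T   # spin-restricted density matrix

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6))
H = (A + A.T) / 2                  # model Hamiltonian
B = rng.standard_normal((6, 6))
S = B @ B.T + 6.0 * np.eye(6)      # well-conditioned model overlap
P = density_matrix(H, S, n_occ=2)
```

At the thousands-to-millions-of-atoms scale the abstract mentions, this dense O(n^3) route is exactly what gets swapped for the distributed or reduced-scaling solvers behind the ELSI interface.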
  • GROMACS: Fast, Flexible, and Free
    GROMACS: Fast, Flexible, and Free DAVID VAN DER SPOEL,1 ERIK LINDAHL,2 BERK HESS,3 GERRIT GROENHOF,4 ALAN E. MARK,4 HERMAN J. C. BERENDSEN4 1Department of Cell and Molecular Biology, Uppsala University, Husargatan 3, Box 596, S-75124 Uppsala, Sweden 2Stockholm Bioinformatics Center, SCFAB, Stockholm University, SE-10691 Stockholm, Sweden 3Max-Planck-Institut für Polymerforschung, Ackermannweg 10, D-55128 Mainz, Germany 4Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 4, NL-9747 AG Groningen, The Netherlands Received 12 February 2005; Accepted 18 March 2005 DOI 10.1002/jcc.20291 Published online in Wiley InterScience (www.interscience.wiley.com). Abstract: This article describes the software suite GROMACS (Groningen MAchine for Chemical Simulation) that was developed at the University of Groningen, The Netherlands, in the early 1990s. The software, written in ANSI C, originates from a parallel hardware project, and is well suited for parallelization on processor clusters. By careful optimization of neighbor searching and of inner loop performance, GROMACS is a very fast program for molecular dynamics simulation. It does not have a force field of its own, but is compatible with GROMOS, OPLS, AMBER, and ENCAD force fields. In addition, it can handle polarizable shell models and flexible constraints. The program is versatile, as force routines can be added by the user, tabulated functions can be specified, and analyses can be easily customized. Nonequilibrium dynamics and free energy determinations are incorporated. Interfaces with popular quantum-chemical packages (MOPAC, GAMESS-UK, GAUSSIAN) are provided to perform mixed MM/QM simulations. The package includes about 100 utility and analysis programs.
  • An Ab Initio Materials Simulation Code
    CP2K: An ab initio materials simulation code Lianheng Tong Physics on Boat Tutorial , Helsinki, Finland 2015-06-08 Faculty of Natural & Mathematical Sciences Department of Physics Brief Overview • Overview of CP2K - General information about the package - QUICKSTEP: DFT engine • Practical example of using CP2K to generate simulated STM images - Terminal states in AGNR segments with 1H and 2H termination groups What is CP2K? Swiss army knife of molecular simulation www.cp2k.org • Geometry and cell optimisation Energy and • Molecular dynamics (NVE, Force Engine NVT, NPT, Langevin) • STM Images • Sampling energy surfaces (metadynamics) • DFT (LDA, GGA, vdW, • Finding transition states Hybrid) (Nudged Elastic Band) • Quantum Chemistry (MP2, • Path integral molecular RPA) dynamics • Semi-Empirical (DFTB) • Monte Carlo • Classical Force Fields (FIST) • And many more… • Combinations (QM/MM) Development • Freely available, open source, GNU Public License • www.cp2k.org • FORTRAN 95, > 1,000,000 lines of code, very active development (daily commits) • Currently being developed and maintained by community of developers: • Switzerland: Paul Scherrer Institute Switzerland (PSI), Swiss Federal Institute of Technology in Zurich (ETHZ), Universität Zürich (UZH) • USA: IBM Research, Lawrence Livermore National Laboratory (LLNL), Pacific Northwest National Laboratory (PNL) • UK: Edinburgh Parallel Computing Centre (EPCC), King’s College London (KCL), University College London (UCL) • Germany: Ruhr-University Bochum • Others: We welcome contributions from
  • CESMIX: Center for the Exascale Simulation of Materials in Extreme Environments
    CESMIX: Center for the Exascale Simulation of Materials in Extreme Environments Project Overview Youssef Marzouk MIT PSAAP-3 Team 18 August 2020 The CESMIX team • Our team integrates expertise in quantum chemistry, atomistic simulation, materials science; hypersonic flow; validation & uncertainty quantification; numerical algorithms; parallel computing, programming languages, compilers, and software performance engineering 1 Project objectives • Exascale simulation of materials in extreme environments • In particular: ultrahigh temperature ceramics in hypersonic flows – Complex materials, e.g., metal diborides – Extreme aerothermal and chemical loading – Predict materials degradation and damage (oxidation, melting, ablation), capturing the central role of surfaces and interfaces • New predictive simulation paradigms and new CS tools for the exascale 2 Broad relevance • Intense current interest in reentry vehicles and hypersonic flight – A national priority! – Materials technologies are a key limiting factor • Material properties are of cross-cutting importance: – Oxidation rates – Thermo-mechanical properties: thermal expansion, creep, fracture – Melting and ablation – Void formation • New systems being proposed and fabricated (e.g., metallic high-entropy alloys) • May have relevance to materials aging • Yet extreme environments are largely inaccessible in the laboratory – Predictive simulation is an essential path… 3 Demonstration problem: specifics • Aerosurfaces of a hypersonic vehicle… • Hafnium diboride (HfB2) promises necessary temperature
  • Automated Construction of Quantum-Classical Hybrid Models (arXiv:2102.09355, 18 Feb 2021)
    Automated construction of quantum-classical hybrid models. Christoph Brunken and Markus Reiher, ETH Zürich, Laboratorium für Physikalische Chemie, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland. February 18, 2021. arXiv:2102.09355v1 [physics.chem-ph]. Abstract: We present a protocol for the fully automated construction of quantum mechanical (QM)-classical hybrid models by extending our previously reported approach on self-parametrizing system-focused atomistic models (SFAM) [J. Chem. Theory Comput. 2020, 16 (3), 1646-1665]. In this QM/SFAM approach, the size and composition of the QM region is evaluated in an automated manner based on first principles so that the hybrid model describes the atomic forces in the center of the QM region accurately. This entails the automated construction and evaluation of differently sized QM regions with a bearable computational overhead that needs to be paid for automated validation procedures. Applying SFAM for the classical part of the model eliminates any dependence on pre-existing parameters due to its system-focused quantum mechanically derived parametrization. Hence, QM/SFAM is capable of delivering a high fidelity and complete automation. Furthermore, since SFAM parameters are generated for the whole system, our ansatz allows for a convenient re-definition of the QM region during a molecular exploration. For this purpose, a local re-parametrization scheme is introduced, which efficiently generates additional classical parameters on the fly when new covalent bonds are formed (or broken) and moved to the classical region. Corresponding author; e-mail: [email protected]. 1 Introduction: In contrast to most protocols of computational quantum chemistry that consider isolated molecules, chemical processes can take place in a vast variety of complex environments.
  • Chem Compute Quickstart
    Chem Compute Quickstart Chem Compute is maintained by Mark Perri at Sonoma State University and hosted on Jetstream at Indiana University. The Chem Compute URL is https://chemcompute.org/. We will use Chem Compute as the frontend for running electronic structure calculations with The General Atomic and Molecular Electronic Structure System, GAMESS (http://www.msg.ameslab.gov/gamess/). Chem Compute also provides access to other computational chemistry resources including PSI4 and the molecular dynamics packages TINKER and NAMD, though we will not be using those resources at this time. Follow this link, https://chemcompute.org/gamess/submit, to directly access the Chem Compute GAMESS guided submission interface. If you are a returning Chem Compute user, please log in now. If your University is part of the InCommon Federation you can log in without registering by clicking Login then "Log In with Google or your University" — select your University from the dropdown list. Otherwise, if this is your first time working with Chem Compute, please register as a Chem Compute user by clicking the "Register" link in the top-right corner of the page. This gives you access to all of the computational resources available at ChemCompute.org and will allow you to maintain copies of your calculations in your user "Dashboard" that you can refer to later. Registering also helps Chem Compute track usage and obtain the resources needed to continue providing this service. When logged in with the GAMESS-Submit tabs selected, an instruction section appears on the left side of the page with instructions for several different kinds of calculations.
  • Popular GPU-Accelerated Applications
    LIFE & MATERIALS SCIENCES GPU-ACCELERATED APPLICATIONS CATALOG (AUG 12). Expected speed-ups are relative to CPU versions. Bioinformatics:
    - BarraCUDA — sequence mapping software; alignment of short sequencing reads; 6-10x; multi-GPU; available now (version 0.6.2).
    - CUDASW++ — open-source software for Smith-Waterman protein database searches on GPUs; parallel search of Smith-Waterman database; 10-50x; multi-GPU; available now (version 2.0.8).
    - CUSHAW — parallelized short-read aligner; parallel, accurate long-read aligner with gapped alignments to large genomes; 10x; multi-GPU; available now (version 1.0.40).
    - GPU-BLAST — local search with fast k-tuple heuristic; protein alignment according to blastp with multiple CPU threads; 3-4x; single GPU only; available now (version 2.2.26).
    - GPU-HMMER — parallelized local and global search with profile hidden Markov models; 60-100x; multi-GPU; available now (version 2.3.2).
    - mCUDA-MEME — ultrafast scalable motif discovery algorithm based on MEME; 4-10x; multi-GPU; available now (version 3.0.12).
    - MUMmerGPU — high-throughput DNA sequence alignment program; aligns multiple query sequences against a reference sequence in parallel; 3-10x; multi-GPU; available now (version 2).
    - SeqNFind — GPU-accelerated sequence analysis toolset; HW & SW for reference assembly, blast, SW, HMM, de novo assembly; 400x; multi-GPU; available now.
    - UGENE — open-source Smith-Waterman for SSE/CUDA; fast short