Journal of Computational Science and Technology

Vol. 5, No. 3, 2011

Implementation of GPU-FFT into Planewave Based First Principles Calculation Method∗

Hidekazu TOMONO∗∗, Masaru AOKI∗∗∗,∗∗, Toshiaki IITAKA† and Kazuo TSUMURAYA∗∗

∗∗ Department of Mechanical Engineering Informatics, School of Science and Technology, Meiji University, 1–1–1 Higashimita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan. E-mail: [email protected] (KT)
∗∗∗ School of Management, Shizuoka Sangyo University, 1572-1 Ohwara, Iwata, Shizuoka 438-0043, Japan
† RIKEN (The Institute of Physical and Chemical Research), 2-1 Hirosawa, Wako, Saitama 351-0198, Japan

∗ Received 30 May, 2011 (No. 11-0305) [DOI: 10.1299/jcst.5.89] Copyright © 2011 by JSME

Abstract

We present an implementation of a GPU based FFT routine (Graphics Processing Unit based Fast Fourier Transformation) into a CPU based ab initio periodic DFT (Density Functional Theory) calculation code. The FFT calculation is the most time-consuming part of CPU based DFT codes; for a system of 128 silicon atoms, the CPU FFT calculation accounts for a fraction of 0.64 of the time of the whole periodic DFT calculation. Replacing the double precision FFT in the periodic PWscf code with a single precision FFT gives no appreciable differences in either the numerical total energies or the interatomic forces, which justifies the use of a single precision GPU based FFT, CUFFT, in the code. The use of CUFFT reduces the FFT fraction to 0.20 of the whole PWscf calculation; the replacement speeds up the calculation by a factor of 2.2 on a single-CPU system. A multi-CPU system with the GPU FFT accelerates the calculation by a factor of 2.2f, where f is the acceleration factor of the multi-CPU system. The single precision GPU calculation is implementable in any self-consistent electronic structure code, except for the eigensolver part of the DFT codes.

Key words: GPU, First Principles Calculation, DFT, Planewaves, GPGPU, CUFFT

1. Introduction

First principles or ab initio electronic structure calculation methods are a robust technique to calculate and predict the properties of materials; the methods enable us not only to predict solid state properties but also to design new materials. The methods, however, require a high computational cost. There have been two approaches to accelerating the computation; one is an improvement of the hardware system and the other is an improvement of the software system.

Devices with higher carrier mobility than silicon, such as GaAs compounds, have been tried as CPU (Central Processing Unit) devices to accelerate the calculations. Since GaAs is a two-component system, it has been difficult to manufacture defect-free GaAs devices. At present we are unable to find commercial computers using GaAs devices. Silicon devices with more finely spaced wiring have therefore been manufactured instead of two-component CPU devices to accelerate the computations. The decrease of the wiring spacing, however, has saturated because of the limit on the use of short-wavelength light. This has led to a stagnation in the acceleration of calculations with single CPUs.

On one hand, the MPI (Message Passing Interface) system has enabled us to accelerate the calculations and overcome the stagnation. This is parallel computing. On the other hand, GPU devices, which have been used for fast output of data to graphic displays, have begun to be used for scientific numerical calculations. This is another attempt to overcome the stagnation. This is the GPGPU (General Purpose Graphics Processing Units). The application of the GPGPU to the planewave based first principles electronic structure code is essential for accelerating the code.
The code allows us to solve the Kohn-Sham equation, a partial differential equation,

$$H_{KS}\Psi_i = \epsilon_i\Psi_i. \tag{1}$$

Here $\Psi_i$ is the wavefunction of the $i$-th eigenstate and $\epsilon_i$ is the corresponding eigenvalue. $H_{KS}$ is the Kohn-Sham Hamiltonian, which is given by

$$H_{KS} = -\frac{\nabla^2}{2} + V_{ion}(\mathbf{r}) + V_{H}(\mathbf{r}) + V_{xc}(\mathbf{r}) \tag{2}$$

in units of the Hartree energy $E_h$ and the Bohr radius $a_0$. The first term is the kinetic operator for the electrons, the second the electron-atom potential, the third the Hartree potential, and the fourth the exchange and correlation (XC) potential of the electrons. The XC potential is given by

$$V_{xc} = \frac{\delta E_{xc}[\rho]}{\delta\rho(\mathbf{r})}, \tag{3}$$

where ρ is the charge density.

The planewave based DFT methods expand the wavefunctions in planewaves to solve the equation. The following iterative process is needed to reach the self-consistent solution of the Kohn-Sham equation. The charge density ρ calculated from the wavefunction Ψ, which is the solution of Eq. (1), allows us to calculate the Hartree potential and the XC potential. Replacing the old potential in Eq. (1) with the new potential, which is the sum of the Hartree potential, the XC potential, and the pseudopotentials, enables us to solve the equation again, which in turn gives another new potential. This repeated improvement of the potential is the self-consistent process of the Kohn-Sham equation.

Since the XC potential is defined in real space, we need to transform the charge density from spectral space into real space, evaluate the XC potential, and transform back to spectral space. These steps use the fast Fourier transformation (FFT) routines. To obtain the wavefunctions Ψ, which are the eigenvectors of the equation, we are able to use an iterative procedure for the large rectangular Hamiltonian matrices. This is the CP (Car-Parrinello) method(1). The planewave based self-consistent process uses a number of FFT and inverse FFT routines to reach the self-consistent solutions.
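As a rough illustration of the real-space round trip described above, the following Python sketch takes a toy charge density given by its planewave coefficients, transforms it to a real-space grid, evaluates a local potential there, and transforms the result back. It is not taken from PWscf or CUFFT: the one-dimensional grid, the toy density, all function names, and the use of the Slater (LDA) exchange potential as a stand-in for the full XC potential are assumptions made for illustration. A small radix-2 FFT is written out so that the example has no dependencies.

```python
import cmath
import math


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of two.

    With inverse=True the sign of the exponent flips; any 1/N factor is left
    to the caller.
    """
    n = len(values)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def to_real_space(coeffs):
    """rho(G) -> rho(r_j) = sum_G rho(G) exp(+i G r_j); assumes a real density."""
    return [v.real for v in fft(coeffs, inverse=True)]


def to_reciprocal_space(samples):
    """f(r_j) -> f(G) = (1/N) sum_j f(r_j) exp(-i G r_j)."""
    n = len(samples)
    return [v / n for v in fft(samples)]


def slater_exchange_potential(rho):
    """LDA exchange potential V_x = -(3 rho / pi)^(1/3), Hartree atomic units.

    Stand-in for the full XC potential; the paper does not specify the functional.
    """
    return -((3.0 * rho / math.pi) ** (1.0 / 3.0))


def xc_potential_in_reciprocal_space(rho_g):
    """Inverse FFT, pointwise evaluation in real space, forward FFT."""
    rho_r = to_real_space(rho_g)
    v_r = [slater_exchange_potential(rho) for rho in rho_r]
    return to_reciprocal_space(v_r)


# Hypothetical toy density on 8 grid points: rho(r) = 1.0 + 0.2 cos(2 pi r / L).
n_grid = 8
rho_g = [0j] * n_grid
rho_g[0] = 1.0
rho_g[1] = 0.1
rho_g[-1] = 0.1

v_xc_g = xc_potential_in_reciprocal_space(rho_g)
```

In a production code the `fft` helper would be a library routine on a three-dimensional grid; that library call is the piece the paper replaces with its GPU counterpart.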
So the FFT calculations take up a large fraction of the total calculation time in first principles calculations. As will be shown later, the FFT accounts for a fraction of 0.65 of the total calculation time for the system with two silicon atoms in a rhombohedral unit cell, and for 0.64 for the system with 128 silicon atoms. The implementation of GPU FFT routines instead of the CPU FFT routines is expected to reduce the total computation time of the self-consistent process.

In this paper, we accelerate the code using the GPU FFT instead of the CPU FFT. An earlier, primitive version of this paper has been presented elsewhere(2). While Goedecker's group accelerated their real-space BIGDFT code with a GPU based wavelet transformation routine and the Intel compiler in 2009(3), there has been no report on the implementation of a GPU based FFT code in a reciprocal space, periodic DFT code.

2. GPU and GPGPU

2.1. Graphics Processing Unit: GPU

GPUs are specialized devices for rendering and accelerating graphics operations, and are commercial, mass-produced products for general consumers, including gamers. In a broad sense, a GPU is a graphics card; in a narrow sense, GPU means the processor on the graphics card. In this paper, we use GPU in the former, broad sense.

The GPU has the following features. First, the GPU is a processor system for floating point operations. Almost all GPUs are designed for four-byte floating point operations, which are faster than eight-byte operations. Second, GPUs are designed as Single Instruction Multiple Data (SIMD) systems. We are able to apply a single instruction stream to multiple data streams to perform operations that are naturally parallelized.
GPUs have advantages over CPUs in primitive operations for graphics display output, such as matrix multiplication and trigonometric functions, and are able to draw directly to the screen much faster than CPUs. Third, GPUs incorporate many processor cores; for instance, the GeForce GTX 285 contains 240 processor cores, which are used for special mathematical operations such as trigonometric functions and matrix operations. Fourth, GPUs have three types of memory: local, shared, and global; each of the microchips uses all three types. The microchips have high memory bandwidth. While the bandwidth of a CPU system ranges from 2.0 to 25.6 GB/s, that of a GPU system ranges from 10.0 GB/s to over 177 GB/s. These four features enable us to extend the original use of GPUs to general computational science and engineering. This application is the GPGPU.

2.2. GPU as heterogeneous multicore

Computer processing has been accelerated by the development of the CPU in the following sequence: scalar, vector, parallel, and multicore parallel processing. In the past a single computer contained a single processor; to accelerate the computations, the CPU changed from the scalar CPU to the vector CPU and then to the parallel CPU. After that, more than one processor was integrated into a single computer. These are the multicore processors. The processors have enabled us to accelerate numerical calculations. This development has been forced by the stagnation in the development of high speed devices with high carrier mobilities.

The next stage in the acceleration of computations has been heterogeneous multicore processing. One example is the GRAPE system, which uses dedicated hardware to accelerate the computation of the gravitational interaction, which takes up a large fraction of the total computation time(4).
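Offloading one dominant kernel to an accelerator, whether the gravitational interaction on GRAPE or the FFT on a GPU, can only pay off in proportion to the fraction of time that kernel occupies. The short Python sketch below applies this bookkeeping to the FFT fractions quoted in the abstract for the 128-silicon system (0.64 before and 0.20 after the replacement). It assumes that the non-FFT time is unchanged by the replacement, which the text does not state explicitly; the function names are hypothetical, and the FFT-only gain it yields is derived here rather than reported in the paper.

```python
def overall_speedup(fraction_before, fraction_after):
    """Total speedup if only the kernel gets faster.

    The non-kernel time is the same before and after:
    (1 - f_before) * T_old = (1 - f_after) * T_new.
    """
    return (1.0 - fraction_after) / (1.0 - fraction_before)


def kernel_speedup(fraction_before, fraction_after):
    """Speedup of the kernel alone implied by the two fractions."""
    total = overall_speedup(fraction_before, fraction_after)
    return (fraction_before / fraction_after) * total


def amdahl_speedup(fraction, kernel_gain):
    """Amdahl's law: total speedup when a fraction of the run is sped up."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_gain)


# Fractions quoted in the abstract for the 128-silicon system.
fft_before = 0.64
fft_after = 0.20

total = overall_speedup(fft_before, fft_after)       # 0.80 / 0.36
fft_gain = kernel_speedup(fft_before, fft_after)     # implied FFT-only gain
upper_bound = 1.0 / (1.0 - fft_before)               # limit for an infinitely fast FFT

print(f"overall speedup: {total:.2f}")
print(f"implied FFT speedup: {fft_gain:.2f}")
print(f"upper bound: {upper_bound:.2f}")
```

Under this assumption the overall speedup is 0.80/0.36, about 2.2, which agrees with the factor of 2.2 stated in the abstract; the implied speedup of the FFT alone is about 7.1, and even an infinitely fast FFT could not push the overall speedup beyond 1/0.36, about 2.8.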