HPC Documentation Release 0.0
Total pages: 16
File type: PDF, size: 1020 KB
Recommended publications
- Slurm Overview and Elasticsearch Plugin
Slurm Workload Manager Overview, SC15, presented by Alejandro Sanchez (Copyright 2015 SchedMD LLC, http://www.schedmd.com). Slurm was originally intended as a simple resource manager but has evolved into a sophisticated batch scheduler, able to satisfy the scheduling requirements of major computer centers with the use of optional plugins. It has no single point of failure (backup daemons, fault-tolerant job options), is highly scalable (the 3.1M-core Tianhe-2 at NUDT), highly portable (autoconf, extensive plugins for various environments), and open source (GPL v2). It operates on many of the world's largest computers and today comprises about 500,000 lines of code, plus test suite and documentation.
Architecturally, Slurm is a kernel with core functions plus about 100 plugins supporting various architectures and features; it is easily configured using a building-block approach and is easy to enhance for new architectures or features, typically just by adding new plugins. The presentation also covers Slurm's enterprise architecture and its Elasticsearch plugin. Scheduling capabilities include fair-share scheduling with hierarchical bank accounts, preemptive and gang scheduling (time-slicing parallel jobs), integration with a database for accounting and configuration, resource allocations optimized for topology, advanced resource reservations, and managing resources across an enterprise.
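This entry summarizes the slides only; as a hedged illustration of how users typically drive a batch scheduler such as Slurm, the following minimal Python sketch (not taken from the presentation) submits a job through the sbatch command line. It assumes sbatch is on the PATH and that the illustrative task count and walltime are acceptable to the cluster.

```python
"""Minimal sketch (not from the slides): submitting a batch job to Slurm
from Python by shelling out to sbatch. Assumes sbatch is on PATH and that
the cluster accepts the illustrative task count and walltime used here."""
import subprocess

def submit(command: str, ntasks: int = 4, walltime: str = "00:10:00") -> int:
    """Submit `command` as a Slurm job and return the numeric job id."""
    result = subprocess.run(
        ["sbatch", "--parsable",          # print only the job id (and cluster name)
         f"--ntasks={ntasks}",
         f"--time={walltime}",
         "--wrap", command],              # wrap a shell command in a generated job script
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip().split(";")[0])

if __name__ == "__main__":
    job_id = submit("srun hostname")
    print(f"submitted job {job_id}")
```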
- The Component Architecture of Open MPI: Enabling Third-Party Collective Algorithms
Jeffrey M. Squyres and Andrew Lumsdaine, Open Systems Laboratory, Indiana University, Bloomington, Indiana, USA. Abstract: As large-scale clusters become more distributed and heterogeneous, significant research interest has emerged in optimizing MPI collective operations because of the performance gains that can be realized. However, researchers wishing to develop new algorithms for MPI collective operations are typically faced with significant design, implementation, and logistical challenges. To address a number of needs in the MPI research community, Open MPI has been developed: a new MPI-2 implementation centered around a lightweight component architecture that provides a set of component frameworks for realizing collective algorithms, point-to-point communication, and other aspects of MPI implementations. The paper focuses on the collective algorithm component framework; the "coll" framework provides tools for researchers to easily design, implement, and experiment with new collective algorithms in the context of a production-quality MPI. Performance results with basic collective operations demonstrate that the component architecture of Open MPI does not introduce any performance penalty. Keywords: MPI implementation, parallel computing, component architecture, collective algorithms, high performance. The introduction notes that although the performance of the MPI collective operations [6, 17] can be a large factor in the overall run-time of a parallel application, their optimization has not necessarily been a focus in some MPI implementations until recently. (This work was supported by a grant from the Lilly Endowment and by National Science Foundation grant 0116050.)
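The "coll" framework discussed in the paper supplies the algorithms behind MPI's collective calls. As a hedged, user-level illustration of the kind of operation those components implement (this is not Open MPI's internal component code), the sketch below performs an allreduce with mpi4py; it assumes mpi4py is installed on top of an MPI library such as Open MPI.

```python
"""User-level view of an MPI collective via mpi4py (illustration only, not
Open MPI's internal "coll" component code). Run with something like
`mpiexec -n 4 python allreduce_demo.py`; the script name is arbitrary."""
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank contributes a small vector; the collective reduces them element-wise.
local = np.full(4, rank, dtype="d")
total = np.empty(4, dtype="d")
comm.Allreduce(local, total, op=MPI.SUM)   # the MPI library's coll framework picks the algorithm

if rank == 0:
    print("sum over ranks:", total)
```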
- Kepler GPUs and NVIDIA's Life and Material Science
Life and Material Sciences, Mark Berger, NVIDIA. NVIDIA was founded in 1993 and invented the GPU in 1999; its core technologies span computer graphics, visual computing, supercomputing, and cloud and mobile computing under the GeForce®, Tegra®, GRID, Quadro®, and Tesla® brands. Accelerated computing pairs a multi-core CPU, optimized for serial tasks, with a many-core GPU accelerator optimized for parallel tasks, offering 3-10X+ compute throughput, 7X memory bandwidth, and 5x energy efficiency. GPU acceleration works by moving the compute-intensive functions (often only about 5% of the code) to the GPU while the rest of the sequential code remains on the CPU. The CPU+GPU roadmap follows a roughly two-year heartbeat of double-precision GFLOPS per watt: Tesla (CUDA, 2008), Fermi (FP64, 2010), Kepler (dynamic parallelism, 2012), Maxwell (unified virtual memory, 2014), and Volta (stacked DRAM). Kepler features make GPU coding easier: Hyper-Q replaces Fermi's single work queue with 32 concurrent work queues to speed up legacy MPI applications, and dynamic parallelism means less back-and-forth with the CPU and simpler code. Developer momentum continues to grow (2008 vs. 2013): 100M to 430M CUDA-capable GPUs, 150K to 1.6M CUDA downloads, 1 to 50 supercomputers, 60 to 640 university courses, and 4,000 to 37,000 academic papers.
The number of GPU-accelerated applications grew explosively from 2010 to 2012 (chart in the deck shows roughly 61% and then 40% increases). Top scientific applications include molecular dynamics (AMBER, LAMMPS, CHARMM, NAMD, GROMACS, DL_POLY), quantum chemistry (QMCPACK, Gaussian, Quantum Espresso, NWChem, GAMESS-US, VASP), climate and weather (CAM-SE, COSMO, NIM, GEOS-5, WRF), physics (Chroma, GTS, Denovo, ENZO, GTC, MILC), and CAE (ANSYS Mechanical, ANSYS Fluent, MSC Nastran, OpenFOAM, SIMULIA Abaqus, LS-DYNA), with more accelerated or in development. NVIDIA's GPU life-science focus covers molecular dynamics, where all major codes are available (AMBER, CHARMM, DESMOND, DL_POLY, GROMACS, LAMMPS, NAMD) with great multi-GPU performance, GPU-native codes such as ACEMD and HOOMD-Blue, and a focus on scaling to large numbers of GPUs; and quantum chemistry, where key codes are ported or being optimized, with active GPU acceleration projects for VASP, NWChem, Gaussian, GAMESS, ABINIT, Quantum Espresso, BigDFT, CP2K, GPAW, and others.
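The "How GPU Acceleration Works" slide describes moving only the compute-intensive functions to the GPU while the rest of the sequential code stays on the CPU. The sketch below illustrates that offload pattern with CuPy; the choice of CuPy, the array size, and the kernel are illustrative assumptions rather than anything the slides prescribe, and a CUDA-capable GPU with cupy installed is assumed.

```python
"""Minimal sketch of the offload pattern described in the slides: keep the
sequential program on the CPU and move only the compute-intensive kernel to
the GPU. CuPy is used purely for illustration."""
import numpy as np
import cupy as cp

def heavy_kernel_gpu(x: np.ndarray) -> float:
    """Compute-intensive part, offloaded: sum of squares on the device."""
    d_x = cp.asarray(x)                 # host -> device copy
    result = cp.sum(d_x * d_x)          # runs on the GPU
    return float(result)                # device -> host scalar

def main() -> None:
    x = np.random.rand(1_000_000)       # the "rest of the code" stays on the CPU
    print("sum of squares:", heavy_kernel_gpu(x))

if __name__ == "__main__":
    main()
```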
- Modern quantum chemistry with [Open]Molcas
Modern quantum chemistry with [Open]Molcas. J. Chem. Phys. 152, 214117 (2020); https://doi.org/10.1063/5.0004835. Submitted 17 February 2020; accepted 11 May 2020; published online 5 June 2020. Authors: Francesco Aquilante, Jochen Autschbach, Alberto Baiardi, Stefano Battaglia, Veniamin A. Borin, Liviu F. Chibotaru, Irene Conti, Luca De Vico, Mickaël Delcey, Ignacio Fdez. Galván, Nicolas Ferré, Leon Freitag, Marco Garavelli, Xuejun Gong, Stefan Knecht, Ernst D. Larsson, Roland Lindh, Marcus Lundberg, Per Åke Malmqvist, Artur Nenov, Jesper Norell, Michael Odelius, Massimo Olivucci, Thomas B. Pedersen, Laura Pedraza-González, Quan M. Phung, Kristine Pierloot, Markus Reiher, Igor Schapiro, Javier Segarra-Martí, Francesco Segatta, Luis Seijo, Saumik Sen, Dumitru-Claudiu Sergentu, Christopher J. Stein, Liviu Ungur, Morgane Vacher, Alessio Valentini, and Valera Veryazov. Published in The Journal of Chemical Physics (scitation.org/journal/jcp); © 2020 Author(s).
- Multireference Electron Correlation Methods: Journeys Along Potential Energy Surfaces (arXiv:1911.06836v1 [physics.chem-ph])
Jae Woo Park (Department of Chemistry, Chungbuk National University, Cheongju, Korea), Rachael Al-Saadon (Department of Chemistry, Northwestern University, Evanston, IL, USA), Matthew K. MacLeod (Workday, Boulder, CO, USA), Toru Shiozaki (Northwestern University and Quantum Simulation Technologies, Inc., Cambridge, MA, USA), and Bess Vlaisavljevich (Department of Chemistry, University of South Dakota, Vermillion, SD, USA). Dated November 19, 2019. Abstract: Multireference electron correlation methods describe static and dynamical electron correlation in a balanced way and, therefore, can yield accurate and predictive results even when single-reference methods or multiconfigurational self-consistent field (MCSCF) theory fails. One of their most prominent applications in quantum chemistry is the exploration of potential energy surfaces (PES). This includes the optimization of molecular geometries, such as equilibrium geometries and conical intersections, and on-the-fly photodynamics simulations; both depend heavily on the ability of the method to properly explore the PES. Since such applications require the nuclear gradients and derivative couplings, the availability of analytical nuclear gradients greatly improves the utility of quantum chemical methods. This review focuses on the developments and advances made in the past two decades. To motivate the readers, we first summarize the notable applications of multireference electron correlation methods to mainstream chemistry, including geometry optimizations and on-the-fly dynamics. Subsequently, we review the analytical nuclear gradient and derivative coupling theories for these methods, and the software infrastructure that allows one to make use of these quantities in applications.
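For context, these are the standard definitions (not reproduced from the paper) of the two quantities the abstract refers to: the analytical nuclear gradient of electronic state I and the first-order derivative (nonadiabatic) coupling between states I and J, both taken with respect to the nuclear coordinates R.

```latex
% Standard textbook definitions, stated here for orientation only.
\begin{align}
  \mathbf{g}_I    &= \nabla_{\mathbf{R}}\, E_I
                   = \nabla_{\mathbf{R}} \langle \Psi_I | \hat{H} | \Psi_I \rangle , \\
  \mathbf{d}_{IJ} &= \langle \Psi_I | \nabla_{\mathbf{R}}\, \Psi_J \rangle ,
  \qquad I \neq J .
\end{align}
```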
- Intel® SSF Reference Design: Intel® Xeon® Processor E5-2600 v4 Family, Intel® Ethernet
Intel® Scalable System Framework (Intel® SSF) Reference Design: cluster installation for systems based on the Intel® Xeon® processor E5-2600 v4 family, including Intel® Ethernet, based on OpenHPC v1.1 (Version 1.0). Summary: this Reference Design is part of the Intel® Scalable System Framework series of reference collateral. A Reference Design is a verified implementation example of a given reference architecture, complete with hardware and software bill-of-materials information and cluster configuration instructions. It can confidently be used as is, or be the foundation for enhancements and/or modifications. Additional Reference Designs are expected in the future to provide example solutions for existing reference architecture definitions and for utilizing additional Intel® SSF elements; similarly, more Reference Designs are expected as new reference architecture definitions are introduced. This Reference Design was developed in support of the Intel® Scalable System Framework Architecture Specification using certain Intel® SSF elements: servers with Intel® Xeon® processor E5-2600 v4 family processors, Intel® Ethernet, and a software stack based on OpenHPC v1.1. The document's legal notices state that no license to any intellectual property rights is granted, that Intel disclaims all express and implied warranties, and that the document contains information on products, services, and/or processes in development, all of which is subject to change without notice.
- A Tutorial for TheoDORE 2.0.2
A Tutorial for TheoDORE 2.0.2, Felix Plasser and Patrick Kimber, Department of Chemistry, Loughborough University, 2019. Contents: 1. Before Starting (introduction, notation, installation); 2. Natural transition orbitals (Turbomole): input generation, transition density matrix (1TDM) analysis, plotting of the orbitals; 3. Charge transfer number and exciton analysis (Turbomole): input generation, 1TDM analysis, electron-hole correlation plots; 4. Interface to the external cclib library (Gaussian 09): check the log file, input generation, 1TDM analysis; 5. Advanced fragment input and double excitations (Columbus): fragment preparation using Avogadro, input generation, 1TDM analysis; 6. Fragment decomposition for a transition metal complex: input generation, transition density matrix analysis and decomposition; 7. Domain NTO and conditional density analysis: input generation, transition density matrix analysis, plotting of the orbitals; 8. Attachment/detachment analysis (Molcas - natural orbitals): input generation, state density matrix analysis, plotting of the orbitals; 9. Contact. Before starting: the tutorial is intended to provide an overview of the functionalities of the TheoDORE program package; various tasks of different complexity are discussed using interfaces to different quantum chemistry packages.
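Section 4 of the tutorial uses the external cclib library to parse Gaussian 09 log files. As a hedged illustration of that dependency (not an excerpt from the tutorial), the sketch below reads an excited-state log with cclib; the file name is a placeholder and cclib is assumed to be installed.

```python
"""Minimal sketch: parsing a quantum chemistry log file with cclib.
"excited_states.log" is a placeholder name; the attributes shown are only
populated when the calculation actually contains excited-state output."""
import cclib

data = cclib.io.ccread("excited_states.log")   # parse the Gaussian 09 log

print("number of basis functions:", data.nbasis)
print("first excitation energies (cm^-1):", data.etenergies[:3])
print("first oscillator strengths:", data.etoscs[:3])
```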
- SUSE Linux Enterprise High Performance Computing 15 SP3 Administration Guide
SUSE Linux Enterprise High Performance Computing 15 SP3 Administration Guide. Publication date: September 24, 2021. SUSE LLC, 1800 South Novell Place, Provo, UT 84606, USA; https://documentation.suse.com. Copyright © 2020-2021 SUSE LLC and contributors; permission is granted to copy, distribute, and/or modify the document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3. All third-party trademarks are the property of their respective owners. The guide opens with a preface (available documentation, giving feedback, documentation conventions, support statement for SUSE Linux Enterprise High Performance Computing, technology previews) and an introduction covering the components provided, hardware platform support, and support.
- From NWChem to NWChemEx: Evolving with the Computational Chemistry Landscape (Lawrence Berkeley National Laboratory Recent Work)
From NWChem to NWChemEx: Evolving with the Computational Chemistry Landscape. Lawrence Berkeley National Laboratory Recent Work; permalink: https://escholarship.org/uc/item/4sm897jh. Published in Chemical Reviews 121(8), April 2021 (ISSN 0009-2665); DOI: 10.1021/acs.chemrev.0c00998; peer reviewed. Authors: Karol Kowalski, Raymond Bair, Nicholas P. Bauman, Jeffery S. Boschen, Eric J. Bylaska, Jeff Daily, Wibe A. de Jong, Thom Dunning, Jr, Niranjan Govind, Robert J. Harrison, Murat Keçeli, Kristopher Keipert, Sriram Krishnamoorthy, Suraj Kumar, Erdal Mutlu, Bruce Palmer, Ajay Panyala, Bo Peng, Ryan M. Richard, T. P. Straatsma, Peter Sushko, Edward F. Valeev, Marat Valiev, Hubertus J. J. van Dam, Jonathan M. Waldrop, David B. Williams-Young, Chao Yang, Marcin Zalewski, and Theresa L. Windus, with affiliations spanning Pacific Northwest National Laboratory, Argonne National Laboratory, Ames Laboratory, Lawrence Berkeley National Laboratory, Stony Brook University, NVIDIA, Oak Ridge National Laboratory, Virginia Tech, Brookhaven National Laboratory, and Iowa State University. The abstract opens: since the advent of the first computers, chemists have been at the forefront of using computers to understand and solve complex chemical problems.
- Natural Transition Orbital Calculations in the RASSI Module of OpenMolcas
Introduction: this document is a manual for the natural transition orbital (NTO) calculation performed in the RASSI module of OpenMolcas. It includes a discussion of the theory of NTOs, an explanation of the code, and several input and output examples of NTO calculations with the RASSI module. Prior to adding the NTO calculation to the RASSI module, OpenMolcas already contained code for NTO calculations (and other analysis tools) in the WFA module (https://molcas.gitlab.io/OpenMolcas/sphinx/users.guide/programs/wfa.html), but the new implementation extends the capability in two key ways. First, an external package, in particular TheoDORE, is required for visualizing the NTOs calculated in the WFA module; NTOs calculated in the RASSI module, however, can be visualized with luscus (https://sourceforge.net/projects/luscus/), which is a more commonly used external package among OpenMolcas users than TheoDORE. The new implementation prints the NTO files in the same format as other orbital files, such as the ScfOrb and RasOrb files, so the output files can also be used for three-dimensional graphics with the GRID_IT module. Second, the WFA module cannot calculate NTOs for states with different symmetries, but the new implementation in RASSI can, as shown in Section 4 of the document. To distinguish the NTO calculations in the two modules, those performed in the RASSI module are called RASSI-NTO, or simply NTO, and those performed in the WFA module are called WFA-NTO. The comparison between RASSI-NTO and WFA-NTO is made only in Section 4, so all discussions in the other sections refer to the implementation in the RASSI module.
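Conceptually, NTOs are obtained from a singular value decomposition of the one-particle transition density matrix between the two states: hole NTOs come from the left singular vectors, particle NTOs from the right singular vectors, and the squared singular values weight each hole-particle pair. The NumPy sketch below illustrates that decomposition under the assumption of an orthonormal orbital basis; it is a conceptual illustration only, not the RASSI implementation.

```python
"""Conceptual NTO construction via SVD of a one-particle transition density
matrix (1TDM) expressed in an orthonormal MO basis. Illustration only."""
import numpy as np

def natural_transition_orbitals(tdm: np.ndarray):
    """Return (hole_orbitals, weights, particle_orbitals) for a 1TDM `tdm`."""
    u, s, vt = np.linalg.svd(tdm, full_matrices=False)
    weights = s**2                       # contribution of each hole/particle pair
    return u, weights, vt.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tdm = rng.standard_normal((8, 8))    # placeholder 1TDM for illustration
    holes, w, particles = natural_transition_orbitals(tdm)
    print("relative NTO pair weights:", np.round(w / w.sum(), 3))
```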
- Version Control Graphical Interface for Open OnDemand
Version Control Graphical Interface for Open OnDemand, by Huan Chen. Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science, Department of Electrical Engineering and Computer Science, Case Western Reserve University, August 2018. Committee: Chris Fietkiewicz (chair), Christian Zorman, and Roger Bielefeld; date of defense: June 27, 2018. The table of contents covers an abstract, an introduction (Chapter 1), and methods (Chapter 2) describing installation of the environments and Open OnDemand: installing SLURM (creating the user, installing and configuring Munge, installing and configuring SLURM, enabling accounting), installing Open OnDemand, and the Git version control interface for Open OnDemand.
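Purely as a hedged illustration of the kind of programmatic Git operation such a graphical interface might wrap (this is not code from the thesis), the sketch below stages and commits all changes in a repository using the GitPython package; the repository path is a placeholder and GitPython is assumed to be installed.

```python
"""Illustrative sketch only: staging and committing every change in an
existing repository via GitPython. The path below is a placeholder."""
from git import Repo

def commit_all(repo_path: str, message: str) -> str:
    """Stage all changes in `repo_path`, commit them, and return the new SHA."""
    repo = Repo(repo_path)            # open an existing repository
    repo.git.add(all=True)            # equivalent to `git add --all`
    repo.git.commit(m=message)        # equivalent to `git commit -m "<message>"` (fails if nothing changed)
    return repo.head.commit.hexsha

if __name__ == "__main__":
    print(commit_all("/home/user/ondemand_project", "Save work from OnDemand session"))
```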
- Extending SLURM for Dynamic Resource-Aware Adaptive Batch Scheduling
Mohak Chadha, Jophin John, and Michael Gerndt, Chair of Computer Architecture and Parallel Systems, Technische Universität München, Garching (near Munich), Germany. Abstract: With the growing constraints on power budget and increasing hardware failure rates, the operation of future exascale systems faces several challenges. Towards this, resource awareness and adaptivity by enabling malleable jobs has been actively researched in the HPC community. Malleable jobs can change their computing resources at runtime and can significantly improve HPC system performance. However, due to the rigid nature of popular parallel programming paradigms such as MPI and the lack of support for dynamic resource management in batch systems, malleable jobs have been largely unrealized. In this paper, we extend the SLURM batch system to support the execution and batch scheduling of malleable jobs; the malleable applications are written using a new adaptive parallel paradigm called Invasive MPI, which extends the MPI standard to support resource-adaptivity at runtime. The introduction notes that applications can change their resource requirements at runtime due to varying computational phases based upon refinement or coarsening of meshes; a solution to overcome these challenges and efficiently utilize the system components is resource awareness and adaptivity, and a method to achieve adaptivity in HPC systems is by enabling malleable jobs. A job is said to be malleable if it can adapt to resource changes triggered by the batch system at runtime [6]. These resource changes can either increase (expand operation) or reduce (shrink operation) the number of processors. Malleable jobs have been shown to significantly improve the performance of a batch system in terms of both system- and user-centric metrics such as system utilization and average response time [7].
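To make the expand/shrink terminology concrete, the toy sketch below shows a resource-aware rebalancing decision for a malleable job. The policy and numbers are invented for illustration only; the paper's actual approach extends SLURM and uses Invasive MPI.

```python
"""Toy sketch (not from the paper): the expand/shrink decision a resource-aware
scheduler might make for a malleable job. Policy and numbers are illustrative."""
from dataclasses import dataclass

@dataclass
class MalleableJob:
    name: str
    nodes: int          # nodes currently allocated
    min_nodes: int      # smallest allocation the job can run on
    max_nodes: int      # largest allocation the job can exploit

def rebalance(job: MalleableJob, idle_nodes: int) -> int:
    """Return how many nodes to add (>0, expand) or release (<0, shrink)."""
    if idle_nodes > 0 and job.nodes < job.max_nodes:
        return min(idle_nodes, job.max_nodes - job.nodes)    # expand into idle nodes
    if idle_nodes == 0 and job.nodes > job.min_nodes:
        return -(job.nodes - job.min_nodes)                   # shrink to free capacity
    return 0

if __name__ == "__main__":
    job = MalleableJob("amr_solver", nodes=8, min_nodes=4, max_nodes=16)
    print("delta nodes:", rebalance(job, idle_nodes=6))
```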