
Jupyter+HPC for Computational Chemistry
ECSS Symposium, Jun 18, 2019
Presented by: Albert Lu [email protected]

Outline
• What is Jupyter Notebook
• TACC Visualization Portal
• Run chemistry applications in Jupyter
• Parallel computing and workflow management

Jupyter Notebook
The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.
https://jupyter.org/

Jupyter Notebook
[Screenshots: basic arithmetic, function, symbolic math, plot, interactive interface]

Use code, equations, figures, tables, images, and narrative text to tell a story

TACC Visualization Portal
https://vis.tacc.utexas.edu
• Use your TACC or XSEDE User Portal username and password to log in
• Run VNC, IPython/Jupyter Notebook, or RStudio on Stampede2 or Wrangler
• Available Stampede2 queues for IPython/Jupyter: development, skx-dev, normal, skx-normal, …
• One compute node per session (KNL: 68 cores, SKX: 48 cores)

TACC Visualization Portal (continued)
https://vis.tacc.utexas.edu

Computational Chemistry in One Slide
"Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems." (Wikipedia)
Dealing with
• Static/dynamic problems
• Temporal scales (< femtosecond to > day)
• Spatial scales (electronic structure, molecule, nanoparticle, continuum, …)
Approaches: simulation, modeling, theory, data analysis
Common methods: ab initio, semi-empirical, molecular mechanics, molecular dynamics, multi-scale, continuum approach, …
Popular software: AMBER, CHARMM, GROMACS, LAMMPS, NAMD, VASP, Gaussian, QE, …

LAMMPS
• LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel Simulator.
• LAMMPS is a classical molecular dynamics simulation code with a focus on materials modeling.
• It was designed to run efficiently on parallel computers.
• It was developed originally at Sandia National Laboratories.
1. Build structure (Xi, Yi, Zi)
2. Specify interactions (Fij)
3. Time integration (R(t) → R(t'))
https://lammps.sandia.gov/

LAMMPS
Command line: lmp_stampede < input
On Stampede2: $ module load lammps
This loads the default version, 16Mar18. The latest version, 5Jun19, is also available.
In Python3: [screenshot]
In Jupyter: [screenshot]
https://lammps.sandia.gov/doc/Python_head.html

LAMMPS: Learning from examples
• More than 200 examples are included in the package
• Use Jupyter Notebook…
• Examples: flow, crack, deposit, granregion
https://github.com/lammps/lammps/tree/master/examples

Not Just LAMMPS
Some Jupyter-friendly chemistry packages that I installed and tested on Stampede2 (not a full list):

Name        Version    Function
gpaw        1.5.1      Quantum DFT
lammps      22Aug18    Classical MD
hoomd-blue  2.3.5      Classical MD, CGMD
ase         3.17.0     Simulation interface
tsase       master     Transition state library for ASE
rdkit       2018_03_4  Cheminformatics, ML
mdtraj      1.9.2      Analysis tool
pytraj      2.0.3      Analysis tool
cpptraj     18.00      Analysis tool
parmed      3.03       Analysis tool
OpenKIM     1.9.7      Force field database
libxc       4.2.3      XC library
libvdwxc    0.3.2      XC-VDW library
nglview     1.1.7      Visualizer

There are many, many other packages that you can install and use in Jupyter.
Many packages can be installed simply by using pip:
$ pip install myPackage --user

Atomic Simulation Environment (ASE)
[Diagram: Jupyter → ASE → VASP]
Supported Software

Name        Description                                   ST2 (Stampede2)
Asap        Highly efficient EMT code
gaussian    Gaussian based electronic structure code      YES
GPAW        Real-space/plane-wave/LCAO PAW code           YES
gromacs     Classical molecular dynamics code             YES
Hotbit      DFT based tight binding
gulp        Interatomic potential code                    YES
abinit      Plane-wave pseudopotential code
jacapo      Plane-wave ultra-soft pseudopotential code
amber       Classical molecular dynamics code             YES
lammps      Classical molecular dynamics code             YES
castep      Plane-wave pseudopotential code
mopac       Semiempirical quantum chemistry code
cp2k        DFT and classical potentials
nwchem      Gaussian based electronic structure code      YES
demon       Gaussian based DFT code
dftb        DFT based tight binding
octopus     Real-space pseudopotential code
dmol        Atomic orbital DFT code
onetep      Linear-scaling pseudopotential code
QE          Plane-wave pseudopotential code               YES
siesta      LCAO pseudopotential code                     YES
exciting    Full Potential LAPW code
turbomole   Fast atom orbital code
aims        Numeric atomic orbital, full potential code
VASP        Plane-wave PAW code                           YES
fleur       Full Potential LAPW code
dftd3       DFT-D3 dispersion correction calculator

https://wiki.fysik.dtu.dk/ase/ase/calculators/calculators.html#module-ase.calculators

LAMMPS+Jupyter+HPC
TACC Vis Portal lets you run Jupyter Notebook on ONE compute node (KNL: 68 cores, SKX: 48 cores)
• LAMMPS's USER-OMP package (provides optimized and multi-threaded versions of many LAMMPS functions)
• Run LAMMPS on multiple processors (use ipyparallel + mpi4py)

MPI4Py: MPI for Python provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors.
https://mpi4py.readthedocs.io/en/stable/

Ipyparallel: Ipyparallel (formerly IPython parallel) enables all types of parallel applications to be developed, executed, debugged, and monitored interactively.
https://ipyparallel.readthedocs.io/en/latest/intro.html

Two parallel schemes in LAMMPS

Type 1: Domain decomposition (16 cores)
   1  2  3  4
   5  6  7  8
   9 10 11 12
  13 14 15 16

Type 2: Replicas with domain decomposition (4x4 cores)
  Replica 1   Replica 2   Replica 3   Replica 4
   1  2        5  6        9 10       13 14
   3  4        7  8       11 12       15 16

Type 1: Simple domain decomposition
Step 1: Start ipcluster (start with 4 IPython engines)
Step 2: Import ipyparallel
Step 3: Import mpi4py and start LAMMPS (use the cell magic for executing Python commands on the IPython engines)

Type 2: Multi-replica (Example: lammps_neb_hop1.ipynb)
E.g.
13 replicas, 2 CPUs/replica
• Start with 26 IPython engines
• Start LAMMPS with 13 replicas with 2 cores/replica

• Run LAMMPS and many other packages using Jupyter Notebook on Vis Portal
• Use package("omp n") to enable USER-OMP multithreading support
• Use ipyparallel and MPI4Py to run parallel LAMMPS simulations in Jupyter Notebook
Next
• Compose and execute workflows in Jupyter

PARSL
Parsl is a native Python library. It allows you to write functions that execute in parallel and tie them together with dependencies to create workflows.
• An "App" is a piece of code that can be asynchronously executed on an execution resource.
• Parsl provides support for pure Python apps (python_app) and also command-line apps executed via Bash (bash_app).
• Parsl creates implicit workflows based on the passing of control or data between Apps.

    from parsl.app.app import python_app, bash_app

    # Pure Python app: the function body runs asynchronously.
    @python_app
    def hello():
        return 'Hello World!'

    # Bash app: the returned string is the command line to execute.
    @bash_app
    def echo_hello(stdout='echo-hello.stdout', stderr='echo-hello.stderr'):
        return 'echo "Hello World!"'

    @bash_app
    def run_lammps(stdout='stdout', stderr='lmp.stderr'):
        return 'lmp_stampede < input'

http://parsl-project.org/

Example 1 (Fe/Cr random alloy)
Pure Python code
[Workflow diagram: Create structure → Energy minimization → Heating → E1, E2, E3, …, EN]
Parallel workflow: M tasks, N workers

Example 2 (Al surface diffusion, Hessian matrix)
Pure Python code
$H = \left\{ \frac{\partial^2 V}{\partial x_i \, \partial x_j} \right\}_{3n \times 3n}$
[Workflow diagram: Create structures of two minimum states (M1, M2) → Nudged-elastic band → Saddle point structure (S) → Compute Hessian (3n elements/worker) → Calculate eigenvalues → Compute rate prefactor]

Example 3 (Alkane C-H Bond Dissociation Energy)
Alkane isomers, e.g. C5H12: n-pentane, neo-pentane, i-pentane

Number of isomers:
C10H22      75
C12H26      355
C15H32      4,347
C18H38      60,523
C20H42      366,319
C21H44      910,726
C24H50      > 14.5M

Workflow:
Isomer structure code (e.g. 40000)
  → [Python script] → SMILES code (e.g. C(C)(C)(C)(C), neo-pentane)
  → [openbabel] → 3D structure (PDB format: X, Y, Z, …)
  → [Python script] → LAMMPS input files
  → [MD & Minimization (LAMMPS)] → E1
  → [Remove H & Minimization (LAMMPS)] → E2
  → [Python script] → E2-E1

Example 3 (Alkane C-H Bond Dissociation Energy)
Performance test on a Stampede2 SKX node (48 cores)
$ time python3 alkane.py
Run directly: 96 mins → 48 workers: 4 mins
Efficiency ∼50%

Example 4 (Fe/Cr random alloy, ipywidgets)
[Workflow diagram: Set Cr concentration → Create structure → Compute energy → Plot]

Thank You
[email protected]

Questions?
[email protected]
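The core numbering in the "Type 2: Replicas with domain decomposition" figure, and the 13-replica, 2-cores-per-replica case, can be reproduced with a few lines of Python. This is a minimal sketch, not the notebook from the talk: the function names are hypothetical, it assumes each replica gets a consecutive block of cores (which is how LAMMPS documents its -partition MxN command-line switch), and it uses the figure's 1-based numbering rather than 0-based MPI ranks. Whether lammps_neb_hop1.ipynb passes exactly this switch is an assumption.

```python
def replica_layout(n_replicas, cores_per_replica):
    """Return {replica number: [core numbers]}, both 1-based as in the slide figure."""
    return {
        r + 1: list(range(r * cores_per_replica + 1, (r + 1) * cores_per_replica + 1))
        for r in range(n_replicas)
    }


def partition_switch(n_replicas, cores_per_replica):
    """Format the MxN argument of the LAMMPS -partition switch (hypothetical helper)."""
    return f"{n_replicas}x{cores_per_replica}"


figure = replica_layout(4, 4)   # the 4x4-core figure
neb = replica_layout(13, 2)     # the 13-replica example

print("Replica 2 of the figure uses cores", figure[2])
n_engines = sum(len(cores) for cores in neb.values())
print(n_engines, "engines needed, switch: -partition", partition_switch(13, 2))
```

The engine count is simply replicas times cores per replica, which is why the multi-replica slide starts 26 IPython engines.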
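The "3n elements/worker" note in Example 2 suggests that each worker computes one row of the 3n × 3n Hessian. The sketch below illustrates that split under stated assumptions: the potential is a hypothetical toy quadratic rather than the Al surface model, derivatives are taken by central finite differences (the talk does not say how they were computed), and a standard-library thread pool stands in for the Parsl workers.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_potential(x):
    """Hypothetical stand-in for V: a quadratic form with one coupling term."""
    return 0.5 * (2.0 * x[0] ** 2 + 3.0 * x[1] ** 2 + 4.0 * x[2] ** 2) + 0.5 * x[0] * x[1]


def hessian_row(potential, x, i, h=1e-3):
    """Row i of H_ij = d^2 V / (dx_i dx_j), by central finite differences."""
    def shifted(j, di, dj):
        y = list(x)
        y[i] += di
        y[j] += dj
        return potential(y)

    return [
        (shifted(j, h, h) - shifted(j, h, -h) - shifted(j, -h, h) + shifted(j, -h, -h))
        / (4.0 * h * h)
        for j in range(len(x))
    ]


def hessian(potential, x, workers=3):
    """Build the full matrix, handing one row (3n elements) to each task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: hessian_row(potential, x, i), range(len(x))))


# One atom (n = 1), so the Hessian is 3 x 3.
H = hessian(toy_potential, [0.1, -0.2, 0.3])
for row in H:
    print([round(value, 6) for value in row])
```

Threads only show how the rows are distributed; for an expensive potential each row would go to a separate process or node, which is the role the workflow manager plays in the talk.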
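The efficiency quoted in the Example 3 performance test follows from the two timings on that slide. As a quick check, assuming the usual definitions (speedup is serial time over parallel time, efficiency is speedup over worker count; the function name is hypothetical):

```python
def parallel_efficiency(serial_minutes, parallel_minutes, workers):
    """Return (speedup, efficiency) for a run spread over `workers` workers."""
    speedup = serial_minutes / parallel_minutes
    return speedup, speedup / workers


# Timings from the slide: 96 mins run directly, 4 mins with 48 workers.
speedup, efficiency = parallel_efficiency(96, 4, 48)
print(f"speedup = {speedup:.0f}x, efficiency = {efficiency:.0%}")
```

A 24-fold speedup on 48 workers gives the roughly 50% efficiency reported.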