MD Simulations with NAMD (and VMD)
João V. Ribeiro
Research Programmer
NIH Center for Macromolecular Modeling and Bioinformatics
University of Illinois at Urbana-Champaign
www.ks.uiuc.edu/~jribeiro
PRACE/BioExcel Spring School 2019, HPC for Life Sciences, Sweden

A Brief History (and Future) of NAMD and VMD
[Figures: Number of Citations per Year (VMD, NAMD); Hours Until Next Citation (VMD + NAMD)]

NAMD in a Nutshell
• Developed in C++, CUDA (GPU), and Charm++
• Performance Scales to Hundreds of Thousands of Cores and Hundreds of GPUs
  - Large Systems
  - Enhanced Sampling
• Large Variety of User Defined Forces and Biased Simulations
• Tcl Script as Input File
  - Allows Scripting in the Input File
  - Workflow Control
  - Method Development at Higher Level
• Close Relationship with VMD
  - Preparation - QwikMD
  - Analysis - Cross Correlation, Clustering…
  - Visualization - Ray Tracing
IEEE Fernbach Award 2012: “For outstanding contributions to the development of widely used parallel software for large biomolecular systems simulation”
[Figures: E. coli Chemosensory Array; Protocell]
NAMD: http://www.ks.uiuc.edu/Research/namd/
VMD: https://www.ks.uiuc.edu/Research/vmd/

Main NAMD Developers and Contributors
David Hardy - Senior Research Programmer
Julio Maia - Research Programmer
Jim Phillips - NCSA Blue Waters
Ryan McGreevy - Research Programmer
Jérôme Hénin - Institut de Biologie Physico-Chimique (Paris)
Brian Radak - Research Programmer
Wei Jiang - Argonne Lab
Giacomo Fiorin - Temple University

NAMD Developer Workshop
I NAMD Developer Workshop - Chicago, IL 2016
II NAMD Developer Workshop - Chicago, IL 2017
III NAMD Developer Workshop - Urbana, IL 2018
Upcoming NAMD Developer Workshop: August 19-20, 2019, Urbana, Illinois
http://www.ks.uiuc.edu/Training/Workshop/Urbana2019/

Hands-On NAMD
• 55 Workshops
• 50+ Tutorials
  - 5 New Tutorials
  - 1800+ pages of Tutorials
• 12 Case Studies
• Hands-On Workshop on Enhanced Sampling and Free-Energy Calculation
  - September (2019) - to be Announced
Training - https://www.ks.uiuc.edu/Training/
Previous Workshop Streams - https://www.youtube.com/user/tcbguiuc/playlists
[Figure: Hands-On Workshops]

NAMD 2.13 - What’s New
• Stochastic Velocity Rescaling Thermostat
• Replica Exchange with Solute Scaling (REST2)
• Hybrid Quantum Mechanics/Molecular Mechanics (QM/MM) Simulation
• Interleaved Double-Wide Sampling for Alchemical FEP
• Constant-pH MD
• Gaussian accelerated MD (GaMD)
• τ-Random acceleration MD (τRAMD)
• Improved Support for Lone Pair and Polarizable Drude Force Field
• Scaling on Summit Supercomputer
• Support for billion-atom systems
[Figure: scaling on Summit; NAMD pre-2.13, STMV matrix, 2 fs timesteps; 5x2x2 STMV ≈ 21M atoms; 7x6x5 STMV ≈ 224M atoms]
NAMD: https://www.ks.uiuc.edu/Research/namd/2.13/features.html
NAMD On Summit: http://www.ks.uiuc.edu/Research/namd/2.13/NAMD-IBM-Journal-Manuscript-Revised.pdf

NAMD 2.13 GPU Performance Improvements
• Simulation Parameters:
  - Integration Time Step: 1 fs
  - Cutoff: 12 Å
  - Switch: 10 Å
  - CHARMM Force Field
• Ivy Bridge system: dual-socket Intel Xeon CPU E5-2690 v2 @ 3.00 GHz, 20 total cores
• Haswell system: dual-socket Intel Xeon CPU E5-2698 v3 @ 2.30 GHz, 32 total cores
• Skylake system: dual-socket Intel Xeon Gold 6148 CPU @ 2.4 GHz, 40 total cores
Benchmark systems: Apolipoprotein A-I; Satellite Tobacco Mosaic Virus
NAMD: http://www.ks.uiuc.edu/Research/namd/benchmarks/

Exemplary NAMD Features
• User Defined Forces
  - Grid Forces
  - Interactive Molecular Dynamics
  - Steered Molecular Dynamics
• Accelerated Sampling Methods
  - Replica Exchange
• Collective Variables (Colvars)
  - Biased Simulations
  - Enhanced Sampling
• Free-Energy Calculation Methods
  - Free-Energy Perturbation
  - Adaptive Biasing Force
  - Constant pH Simulations
• Hybrid QM/MM Simulations
Complete List of NAMD Features: https://www.ks.uiuc.edu/Research/namd/2.13/ug/

Grid Forces
• Additional Potential Term
• Arbitrary Shape and Magnitude
• Three-Dimensional Grid with Scaling Factor on Each Voxel
• Use VMD to “Translate” Density Data into Potential Grid
  - Molecular Dynamics Flexible Fitting

$$U_{EM}(\mathbf{R}) = \sum_j w_j V_{EM}(\mathbf{r}_j)$$

$$V_{EM}(\mathbf{r}) = \begin{cases} \xi \left[ 1 - \dfrac{\Phi(\mathbf{r}) - \Phi_{thr}}{\Phi_{max} - \Phi_{thr}} \right], & \Phi(\mathbf{r}) \ge \Phi_{thr} \\ \xi, & \Phi(\mathbf{r}) < \Phi_{thr} \end{cases}$$

Trabuco et al. Structure (2008)
Trabuco et al. Methods (2009)
Wehmer et al. PNAS (2017)
Grid Forces - https://www.ks.uiuc.edu/Research/namd/2.13/ug/node41.html
MDFF - https://www.ks.uiuc.edu/Research/mdff/
MDFF Tutorial - https://www.ks.uiuc.edu/Training/Tutorials/#mdff

Molecular Dynamics Flexible Fitting (Ribosome-bound YidC)
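A minimal sketch of the MDFF grid potential from the Grid Forces slide, U_EM(R) = Σ_j w_j V_EM(r_j), where V_EM equals ξ[1 − (Φ − Φ_thr)/(Φ_max − Φ_thr)] at or above the density threshold and ξ below it. The function names, density values, weights, and parameter values are hypothetical and chosen only for illustration; this is not how NAMD or the VMD MDFF plugin implements the grid.

```python
def v_em(phi, phi_thr, phi_max, xi):
    """Per-atom MDFF potential V_EM for a density value phi."""
    if phi >= phi_thr:
        # High density means low potential; V_EM reaches 0 at phi_max.
        return xi * (1.0 - (phi - phi_thr) / (phi_max - phi_thr))
    # Below the threshold the potential is flat at xi.
    return xi


def u_em(phi_at_atoms, weights, phi_thr, phi_max, xi):
    """Total potential U_EM = sum_j w_j * V_EM(r_j)."""
    return sum(w * v_em(p, phi_thr, phi_max, xi)
               for p, w in zip(phi_at_atoms, weights))


# Hypothetical density values sampled at three atom positions.
phi_atoms = [0.2, 0.6, 1.0]
weights = [1.0, 1.0, 2.0]
energy = u_em(phi_atoms, weights, phi_thr=0.4, phi_max=1.0, xi=0.3)
print(energy)
```

Atoms sitting in the densest region contribute nothing, while atoms outside the thresholded map all pay the same penalty ξ, so the resulting forces steer atoms toward high-density voxels.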
[Figure: EM density map (Electron Microscope) and crystallographic structure (APS Synchrotron); Match through MD (Supercomputer)]

Molecular Dynamics Flexible Fitting (MDFF)
Integrating experimental data to produce models of biomolecular complexes with atomic detail
[Figures: E. coli Chemosensory Array; Proteasome; Cascade MDFF; High Resolution Density Maps]
Trabuco et al. Structure (2008)
Trabuco et al. Methods (2009)
Wehmer et al. PNAS (2017)
Cassidy et al. eLife (2015)
Chemotaxis - http://www.ks.uiuc.edu/Research/chemotaxis/
MDFF - https://www.ks.uiuc.edu/Research/mdff/
MDFF Tutorial - https://www.ks.uiuc.edu/Training/Tutorials/#mdff

Interactive Modeling with MDFF GUI
• Apply forces to manually manipulate structure into the density
• Useful for difficult-to-fit structures with large conformational changes
[Figures: Set up and run interactive (or traditional) MDFF/xMDFF simulations; Analyze interactive simulations in real-time]

Modeling Large Complex Membrane Systems
Workflow: Vesicle Construction, Coarse Grain Protein, CG Protein Placement, Combine Lipid + Protein
Distribution of proteins across the membrane surface (dense environment)
• Ability to handle a variety of protein geometries
• Proper orientation of proteins in relation to the membrane surface
• Generalizable and automated method for membranes of arbitrary shape
Embedding proteins into the membrane
• Account for surface area occupied by proteins in inner and outer leaflets
• Proper lipid packing around embedded proteins
113 million Martini particles representing 1 billion atoms
3.7 M lipids (DPPC), 2.4 M Na+ & Cl- ions, 104 M water particles (4 H2O / particle)
[Figure scale bar: 0.4 μm]

Protein Components                            Copy #
Aquaporin Z                                       97
Copper Transporter (CopA)                        166
F1 ATPase                                         63
Lipid Flippase (MsbA)                             29
Molybdenum transporter (ModBC)                   130
Translocon (SecY)                                103
Methionine transporter (MetNI)                   136
Membrane chaperone (YidC)                        126
Energy coupling factor (ECF)                     117
Potassium transporter (KtrAB)                    148
Glutamate transporter (GltTk)                     41
Cytidine-Diphosphate diacylglycerol (Cds)         50
Membrane-bound protease (PCAT)                    57
Folate transporter (FolT)                        134
Total                                          1,397

Simulating Large Complex Membrane Systems
• 200 nm diameter spherical vesicle constructed from 390k POPC lipid molecules
• Solvated with explicit water with 150 mM NaCl salt concentration
• 1600 proteins, 400 copies each of:
  - Kv1.2 potassium channel [the small, inward-pointing protrusion]
  - F1c10 ATPase complex [large, inward-pointing, ball protrusion]
  - Multi-drug transporter P-glycoprotein (P-gp) [the V-shaped protein]
  - Human glucose transporter (GLUT1)
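As a rough plausibility check on the lipid count quoted for the 200 nm vesicle, the sketch below estimates how many lipids fit in the two leaflets of a spherical bilayer. The area per lipid of 0.64 nm² is an assumed, typical literature-style value for POPC and does not come from the slides; the estimate also ignores the membrane area taken up by the embedded proteins and, by default, the bilayer thickness.

```python
import math


def lipids_in_vesicle(diameter_nm, area_per_lipid_nm2, bilayer_thickness_nm=0.0):
    """Estimate the lipid count from the surface area of both leaflets."""
    r_outer = diameter_nm / 2.0
    # With thickness 0 both leaflets are treated as having the same radius.
    r_inner = r_outer - bilayer_thickness_nm
    total_area = 4.0 * math.pi * (r_outer ** 2 + r_inner ** 2)
    return total_area / area_per_lipid_nm2


# Assumed area per lipid (not given on the slide).
estimate = lipids_in_vesicle(200.0, 0.64)
print(round(estimate))
```

Under these assumptions the estimate lands in the same range as the 390k lipids on the slide; passing a nonzero `bilayer_thickness_nm` lowers it slightly.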
Workflow for Multi-Scale Modeling
[Figure: Membrane Budding/Fusion; Time Domain]
Workflow integrating multiple tools designed to generate potentials for multi-scale simulations
• Construct mesh from shape(s) (123D Design)
• Manipulate mesh (MeshLab)
• Transform to potentials (Visual Molecular Dynamics)
Grid Design and Construction
• Start from a plane and a sphere
• Remove plane points within the sphere
• Remove sphere points below the plane
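The grid construction steps above can be sketched as follows: sample a plane and a sphere surface, drop the plane points that fall inside the sphere, and drop the sphere points that lie below the plane. The function name, lattice sampling, and dimensions are hypothetical; the slides build the shapes with mesh tools rather than with code like this.

```python
import math


def build_bud_grid(radius, center_z, half_width, spacing, n_theta=24, n_phi=12):
    """Return points of a plane (z = 0) merged with a sphere that buds out of it."""
    # Plane z = 0 sampled on a square lattice.
    steps = int(round(half_width / spacing))
    plane = [(i * spacing, j * spacing, 0.0)
             for i in range(-steps, steps + 1)
             for j in range(-steps, steps + 1)]
    # Sphere surface sampled on a latitude/longitude lattice
    # (the poles are sampled repeatedly, which is fine for a sketch).
    sphere = []
    for a in range(n_phi + 1):
        phi = math.pi * a / n_phi
        for b in range(n_theta):
            theta = 2.0 * math.pi * b / n_theta
            sphere.append((radius * math.sin(phi) * math.cos(theta),
                           radius * math.sin(phi) * math.sin(theta),
                           center_z + radius * math.cos(phi)))
    # Remove plane points within the sphere.
    plane = [p for p in plane
             if p[0] ** 2 + p[1] ** 2 + (p[2] - center_z) ** 2 > radius ** 2]
    # Remove sphere points below the plane.
    sphere = [s for s in sphere if s[2] >= 0.0]
    return plane + sphere


points = build_bud_grid(radius=5.0, center_z=3.0, half_width=10.0, spacing=1.0)
print(len(points))
```

Moving `center_z` up or down changes how far the sphere protrudes, which gives a crude series of shapes along a budding or fusion pathway.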
[Figure: Membrane Budding/Fusion]

Bacterial Infection (MRSA): Methicillin-Resistant Staphylococcus aureus
There’s a dearth of new antibiotics to treat what the U.S. Centers for Disease Control calls “nightmare bacteria.”
[Figure labels: Staphylococcus bacterium; Bacterium; Adhesin; Human Extracellular Matrix]
www.ks.uiuc.edu/~rcbernardi

Steered Molecular Dynamics
• Biased Simulation
• Constant Force
• Constant Velocity
  - Pulling with a spring (Hooke’s Law)
    ‣ F = -k · Δx
  - Atomic Force Microscopy (AFM)
• Pulling Selection
  - Single Atom or Selection’s Center of Mass
[Figure: “Dummy Atom” pulling setup; Force vs. Extension plot]
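A minimal sketch of the constant-velocity pulling force, assuming the usual picture in which a dummy atom moves at constant velocity along a pulling direction and is tied to the pulled selection's center of mass by a harmonic spring (F = -k · Δx). The function name, argument names, and numeric values are hypothetical, and units are left abstract.

```python
def smd_force(k, velocity, time, com_now, com_start, direction):
    """Return (force along the pulling direction, force vector) for constant-velocity SMD."""
    norm = sum(d * d for d in direction) ** 0.5
    unit = [d / norm for d in direction]
    # Distance the selection's center of mass has moved along the pulling direction.
    displacement = sum((c - c0) * u for c, c0, u in zip(com_now, com_start, unit))
    # Delta x: position of the selection relative to the dummy atom.
    extension = displacement - velocity * time
    force = -k * extension  # Hooke's law, F = -k * Delta x
    return force, [force * u for u in unit]


# The dummy atom has moved 5.0 along z while the selection has only moved 3.0.
force, vector = smd_force(k=7.0, velocity=0.5, time=10.0,
                          com_now=(0.0, 0.0, 3.0), com_start=(0.0, 0.0, 0.0),
                          direction=(0.0, 0.0, 1.0))
print(force, vector)
```

The selection lags behind the dummy atom, so the spring pulls it forward; recording this force against the extension gives the force-extension trace compared with AFM experiments.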
H. Grubmüller et al. Science (1996)
S. Izrailev et al. Langmuir (1997)
SMD on NAMD Tutorial: http://www.ks.uiuc.edu/Training/Tutorials/namd/namd-tutorial-unix-html/node18.html

Adhesion Mechanism – Staphylococcus epidermidis’ SdrG
SdrG (serine-aspartate repeat protein G)
Staph. Targets Human’s Fibrinogen β (Fgβ)
[Figure labels: Blood flow; Bacterium; SdrG; B1 & B2 domains; Bacterium targets host; Human Fgβ; Human Extracellular Matrix]
Hermann Gaub (LMU)
Milles et al. Science (2018)
https://www.ks.uiuc.edu/~rcbernardi/

Force Profile Sampling (Loading Rate Dependent)
Over 2400 Steered Molecular Dynamics Simulations
[Figure label: Receptor:Li]
• NAMD Enabled the Generation of Extensive Sampling (2400 Independent Simulations)
• The Experimental Traces Agreed with the Simulation (Force Loading Rate Dependent)
  - Dudko-Hummer-Szabo (DHS) Theory
Dudko et al. Physical Review Letters (2006)
Bullerjahn et al. Nature Communications (2014)
Verdorfer et al. JACS (2017)
Milles et al. Science (2018)
https://www.ks.uiuc.edu/~rcbernardi/

Overcoming Timescale Limitations: When “Let It Go” Is Not an Option
• Enhanced Sampling Techniques
• Biased Simulations
  - Steered MD
  - Targeted MD
  - ABF
  - Grid Potentials
  - String Method with Swarm of Trajectories
  - Umbrella Sampling
  - Metadynamics
• Replica Exchange Simulations
  - Temperature
  - Solute Tempering
  - Metadynamics (Bias Exchange)
• Colvars - Wicked Useful
  - RMSD
  - Distance
  - Orientation
[Figures: Stochastic Simulations; Umbrella Sampling; Replica Exchange]

Introduction to Free-Energy Calculations
[Figure: Free-Energy Methods timeline, 2000 to 2018, covering FEP, TI, US, ABF, MtD, Colvars, scripted variables, Hamiltonian hopping, MW/ABF, eABF, gABF, egABF, meta-eABF, WCA, geometric transformations, and constant-pH MD; legend: Chipot group, Roux group, Fiorin / Hénin, Others]
Introduction to Free Energy Calculations (Chris Chipot) - https://youtu.be/LCKtsR1ijsA
Slide courtesy of Chris Chipot (2016)

Portable Innovation using Tcl and Colvars: Milestoning
Use string method to identify low-energy transition path and partition space into Voronoi polygons
Run many trajectories, stop at boundary
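A minimal sketch of the two milestoning steps above, assuming the anchors are the images of a converged string in collective-variable space: each configuration belongs to the Voronoi cell of its nearest anchor, and a trajectory is stopped as soon as it enters a different cell. The anchor coordinates and the short trajectory are hypothetical, and the functions are illustrative rather than part of NAMD or Colvars.

```python
def nearest_anchor(point, anchors):
    """Index of the Voronoi cell (nearest anchor) that contains point."""
    dists = [sum((p - a) ** 2 for p, a in zip(point, anchor)) for anchor in anchors]
    return dists.index(min(dists))


def run_until_boundary(trajectory, anchors):
    """Return (frame index, old cell, new cell) at the first cell change, else None."""
    start = nearest_anchor(trajectory[0], anchors)
    for i, frame in enumerate(trajectory[1:], start=1):
        cell = nearest_anchor(frame, anchors)
        if cell != start:
            return i, start, cell
    return None


# Hypothetical string images in a 2D collective-variable space.
anchors = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
trajectory = [(0.1, 0.0), (0.3, 0.1), (0.45, 0.0), (0.6, -0.1), (0.9, 0.0)]
crossing = run_until_boundary(trajectory, anchors)
print(crossing)
```

Collecting the crossing statistics from many such short trajectories is what lets milestoning estimate kinetics without simulating the full transition in one run.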
Faradjian and Elber J. Chem. Phys. (2004)
Bello-Rivas and Elber J. Chem. Phys. (2015)
Ma and Schulten JACS (2015)

Portable Innovation using Tcl and Colvars: SEEKR
Combines NAMD (MD) with BrownDye (BD) through milestoning to efficiently predict kinetics of ligand-receptor binding and off-rates
• Anton requires ~600 to 15000 ns MD for a single binding event (predicted kinetics are less accurate)
• Provides accurate & efficient binding kinetics predictions
• Computes on- and off-rates
Votapka & Amaro, PLOS Comp Biol (2015)
Votapka, Jagger, Heyneman, Amaro, J Phys Chem B (2017)
SEEKR: https://github.com/nbcrrolls/SEEKR

Using Colvars To Explore Transition Pathways in ABC Exporters
[Figure: NBD Doorknob Mechanism]
M. Moradi and ET PNAS (2013)
M. Moradi and ET JCTC (2014)

Complex Processes Require Complex Treatments
M. Moradi and ET PNAS (2013)
M. Moradi and ET JCTC (2014)
M. Moradi, G. Enkavi, and ET Nature Comm. (2015)
Complex Reaction Pathways - https://youtu.be/ax74TgWY3wA and https://youtu.be/3hCaeFF05Jc

Describing a Complete Cycle (Adding Substrate) Requiring a Combination of Multiple Collective Variables
[Figure: simulation stages labeled by replica count and length: 30 replicas x 20 ns; 12 replicas x 40 ns (H1/H7); 24 replicas x 20 ns (H1/H7); 50 replicas x 20 ns (10 Hs); 50 replicas x 20 ns; 150 replicas; 200 replicas (2D) x 5 ns]

QM/MM – Fully Featured and Flexible Hybrid Interface
QM/MM highly requested by users
Integrating Quantum Mechanics (QM) with Molecular Mechanics (MM), QM/MM allows study of chemical reactions and many other quantum processes.
https://www.ks.uiuc.edu/Research/qmmm/
Main Features
• Interface to ORCA, MOPAC, and “Generic Interface to any QM Software”
• Multiple QM/MM Coupling Schemes (Charge Redistribution Schemes)
• PME for Long-Range Electrostatics
• Solvent Molecule Switcher (necessary for long timescale QM/MM simulations)
• Multiple Independent QM Regions
• Easy Setup Interface with QwikMD
• Mix and Match (Combine with any other NAMD Feature):
  - Polarizable Force Field in MM Region
  - Replica Exchange Molecular Dynamics
  - Adaptive Biasing Force
  - Steered Molecular Dynamics
  - and many other features.
• Multi-Level QM Regions (Coming Soon)
Collaborators
• Frank Neese (ORCA) - Max Planck Institute for Chemical Energy Conversion, Mülheim an der Ruhr, Germany
• Gerd Rocha (MOPAC-GPU) - Theoretical Quantum Chemistry Group, UFPB - João Pessoa, Brazil

Combining Hybrid QM/MM Simulations with Enhanced Sampling
[Figure: Metadynamics]
Enhanced Sampling Techniques can be used with QM/MM to investigate, for example, reaction mechanisms.
Combining the Collective Variables module of NAMD with QM/MM allows reaction pathways to be investigated at the highest level of detail.
[Figures: Stochastic Simulations; Replica Exchange Molecular Dynamics]
Melo et al. Nature Methods (2018)
Bernardi et al. Biochimica et Biophysica Acta (BBA) (2015)

NAMD Constant pH Simulations
[Figure: Staph nuclease (SNase)]
http://www.ks.uiuc.edu/Training/Tutorials
• Protein Residues
• neMD/MC Molecular Dynamics
• Dual topology
• Tcl scripts
  - Milestoning
  - and others...

NAMD Constant pH Simulations
[Figure: neMD/MC cycle: MC sample of auxiliary coordinates; neMD alchemical growth through λ = 0, 1/3, 2/3, 1; removal of auxiliary coordinates]
• Drive alchemical growth with nonequilibrium work
• Accept/reject with a generalized Metropolis criterion
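A simplified sketch of the accept/reject step, assuming the nonequilibrium switch yields a single work value W and that acceptance follows min(1, exp(-W/kT)). This is a generic Metropolis test on work, not the full constant-pH criterion from the cited papers, which also involves pH-dependent terms that are not reproduced here; the function name and the kT value are hypothetical.

```python
import math
import random


def accept_switch(work, kT, rng=random.random):
    """Metropolis test on nonequilibrium work: accept with probability min(1, exp(-work/kT))."""
    if work <= 0.0:
        # Switches that cost no work are always accepted.
        return True
    return rng() < math.exp(-work / kT)


# Assumed thermal energy in the same (arbitrary) units as the work values.
kT = 0.6
favourable = accept_switch(-1.0, kT)
costly = accept_switch(1.0, kT)  # accepted only some of the time
print(favourable, costly)
```

A rejected switch discards the alchemically grown state and continues from the configuration before the switch, which is why keeping the switching work small matters for efficiency.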
Stern J Chem Phys (2007)
Chen and Roux J Chem Theory Comput (2015)
Radak et al. J Chem Theory Comput (2017)

Electrostatics with Multilevel Summation Method
• Uses Hierarchical Interpolation of Smoothed Pairwise Potential
• Improves Parallel Communication over FFT-based PME
[Figures: Simulate non-periodic systems; Simulate semi-periodic systems]
Hardy et al. JCTC (2015)
Hardy et al. JCP (2016)

NAMD Coming Attractions
• Fast Single Node GPU-Accelerated Mode
• Improved single-GPU/single-node performance
• GPU support for FEP & TI
• GPU support for multilevel summation method (MSM)
• GPU support for Drude polarizable force field
• Long-range dispersion forces (LJ-PME, LJ-MSM)
• Support for Martini 2.x force field

NAMD – Single-node GPU Performance Trace
CPU integrator is now the bottleneck
Integrator running on CPU cores
1% of computation is now ~50% of timestep work.
Next step: Offload it to the GPU
Force compute running on GPU
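The observation that 1% of the computation now accounts for about 50% of the timestep is an instance of Amdahl's law. The sketch below computes the share of the timestep taken by the part left on the CPU after the rest is accelerated; the 99x speedup is an illustrative assumption chosen to reproduce the ~50% figure, not a number from the slides.

```python
def remaining_fraction(serial_fraction, speedup):
    """Share of the new timestep spent in the part that was not accelerated."""
    # Time of the accelerated part, relative to the original timestep.
    accelerated = (1.0 - serial_fraction) / speedup
    return serial_fraction / (serial_fraction + accelerated)


# Integrator = 1% of the original work; force compute assumed 99x faster on the GPU.
share = remaining_fraction(0.01, 99.0)
print(share)
```

The faster the force kernels get, the larger the integrator's share becomes, which is the motivation for moving the integrator to the GPU as well.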
[Figure: trace from the NVIDIA Nsight Systems profiler]

Ongoing Development
Benchmarked System: ApoA1 (92,224 atoms), 2 fs timestep, 12 Å cutoff
Hardware: Intel Xeon E5-2650 (16 Cores) + Nvidia Titan V

GPU integrator          ns/day   Notes
Per patch                  1.2   One kernel launch per patch; lots of memory transfers
Per CPU core              14.5   Better usage of GPU resources; still bottlenecked by memory transfers
Per system (ongoing)      45.5   Avoiding memory transfers; keeping data on GPU as much as possible
Future Developments
• Gaps in the GPU usage: get rid of additional Charm++ related overheads
• Atom migration steps are becoming larger as intermediate steps shrink: move them to the GPU as well

Thanks to: NIH, NSF, DOE, NCSA, ALCF, OLCF, and 20+ years of NAMD and Charm++ developers and users.
João V. Ribeiro
NIH Center for Macromolecular Modeling and Bioinformatics
University of Illinois at Urbana-Champaign
http://www.ks.uiuc.edu/Research/namd/