Chemical Science

Chemical Science View Article Online EDGE ARTICLE View Journal | View Issue Automatic discovery of photoisomerization mechanisms with nanosecond machine learning Cite this: Chem. Sci.,2021,12,5302 † All publication charges for this article photodynamics simulations have been paid for by the Royal Society a b d c of Chemistry Jingbai Li, Patrick Reiser, Benjamin R. Boswell, Andre´ Eberhard, Noah Z. Burns, *d Pascal Friederich*bc and Steven A. Lopez *a Photochemical reactions are widely used by academic and industrial researchers to construct complex molecular architectures via mechanisms that often require harsh reaction conditions. Photodynamics simulations provide time-resolved snapshots of molecular excited-state structures required to understand and predict reactivities and chemoselectivities. Molecular excited-states are often nearly degenerate and require computationally intensive multiconfigurational quantum mechanical methods, especially at conical intersections. Non-adiabatic molecular dynamics require thousands of these computations per trajectory, which limits simulations to 1 picosecond for most organic photochemical Creative Commons Attribution-NonCommercial 3.0 Unported Licence. reactions. Westermayr et al. recently introduced a neural-network-based method to accelerate the predictions of electronic properties and pushed the simulation limit to 1 ns for the model system, + methylenimmonium cation (CH2NH2 ). We have adapted this methodology to develop the Python- based, Python Rapid Artificial Intelligence Ab Initio Molecular Dynamics (PyRAI2MD) software for the cis– trans isomerization of trans-hexafluoro-2-butene and the 4p-electrocyclic ring-closing of a norbornyl hexacyclodiene. We performed a 10 ns simulation for trans-hexafluoro-2-butene in just 2 days. The same simulation would take approximately 58 years with traditional multiconfigurational photodynamics simulations. We generated training data by combining Wigner sampling, geometrical interpolations, and short-time quantum chemical trajectories to adaptively sample sparse data regions along reaction This article is licensed under a coordinates. The final data set of the cis–trans isomerization and the 4p-electrocyclic ring-closing model has 6207 and 6267 data points, respectively. The training errors in energy using feedforward neural networks achieved chemical accuracy (0.023–0.032 eV). The neural network photodynamics Received 9th October 2020 Open Access Article. Published on 10 March 2021. Downloaded 4/27/2021 5:13:29 PM. simulations of trans-hexafluoro-2-butene agree with the quantum chemical calculations showing the Accepted 24th February 2021 formation of the cis-product and reactive carbene intermediate. The neural network trajectories of the DOI: 10.1039/d0sc05610c norbornyl cyclohexadiene corroborate the low-yielding syn-product, which was absent in the quantum rsc.li/chemical-science chemical trajectories, and revealed subsequent thermal reactions in 1 ns. 1. Introduction Photochemistry provides access to highly strained molecular architectures,1 photoswitches,2 organic photovoltaics,3 and 4,5 aDepartment of Chemistry and Chemical Biology, Northeastern University, Boston, MA solar fuel materials, which are characterized by mild condi- 02115, USA. E-mail: [email protected] tions and high atom economy. Photochemical reactions typi- bInstitute of Nanotechnology, Karlsruhe Institute of Technology, Karlsruhe, Germany. cally consist of a series of molecular transformations that occur E-mail: [email protected] in excited molecules aer light absorption. Unlike ground state cInstitute of Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, processes, where spectroscopy and crystallography reveal Germany dDepartment of Chemistry, Stanford University, Stanford, CA, USA. E-mail: nburns@ structural information, light-promoted excited-state reactions À15 À12 stanford.edu involve short-lived femto-to picosecond (10 to 10 s) † Electronic supplementary information (ESI) available: Machine learning model, molecular excited states and reactive intermediates. This the initial training set generation, NACs phase corrections, grid search, and ultrafast processes involve relaxation to the excited-state adaptive sampling, the computational cost of ML-NAMD, data and code minima for uorescence or non-radiative transition to the availability, the QC calculations of hexauoro-2-butene and norbornyl ground-state through a state crossing point or seam, which cyclohexadiene systems, the Cartesian coordinates of optimized geometries, and the experimental details for the norbornyl cyclohexadiene system. See DOI: plays essential roles in the chemoselectivity of a photochemical 6,7 10.1039/d0sc05610c reaction. 5302 | Chem. Sci.,2021,12, 5302–5314 © 2021 The Author(s). Published by the Royal Society of Chemistry View Article Online Edge Article Chemical Science Unravelling the origin of chemoselectivity and stereo- a nontoxic and not ammable industrial working uid used as selectivity in organic photochemical reactions is challenging a refrigerant and a foam-blowing agent.20 We also applied the because of the short-lived molecular excited states. Quantum ML-NAMD simulations to the 4p-electrocyclic ring-closing of chemical calculations offer insight into the bonding changes norbornyl cyclohexadiene (3), a previously unknown and that occur along a reaction coordinate and non-adiabatic complex photochemical system. molecular dynamics (NAMD) simulations to gain mechanistic insights and develop structure–reactivity relationships in 2. Computational methods complex photochemical reactions. The nuclei-electron coupling and time-dependent terms increase the complexity of Hamil- 2.1 Quantum chemical calculations tonian in NAMD and, thus, computation time. Multiple We performed quantum chemical calculations with Open- methods developed in the last two decades simplied the time- Molcas 19.11 21 to generate reference and training data. We dependent molecular wave functions, e.g. ab initio multiple chose the less resource-intensive CASSCF method because it spawning (AIMS),8,9 and fewest switches surface hopping produced consistent potential energy surfaces with CASPT2 (FSSH).10–12 However, computing high dimensional PESs with calculations, tested by the interpolated reaction coordinate the requisite multicongurational methods is extremely diagrams of the model reactions (Fig. S11 and S17†). The resource-intensive. For example, most quantum chemical so- converged orbitals and optimized geometries are available in ware packages encode an upper limit to active space of 16 the ESI.† electrons and 16 orbitals for complete active space self- In the cis–trans isomerization of trans-1, we selected an consistent eld (CASSCF) calculations. A typical NAMD experi- active space of 2 p-electrons and 2 p-type orbitals (i.e. (2,2)) for ment requires thousands of such calculations, resulting in the C]C bond. The geometries of trans-1 and cis-1 were opti- a maximum simulation time on the order of 1 picosecond, at mized with the cc-pVDZ basis set.22 A vibrational analysis a computational cost of approximately 101–104 wall-clock hours. conrmed only positive frequencies. We located two minimum / Creative Commons Attribution-NonCommercial 3.0 Unported Licence. Unfortunately, many direct excitation photochemical reactions energy crossing points (MECPs) corresponding to the trans have low quantum yields and long excited-state lifetimes. This cis (MECP-trans-1) and the reverse reaction (MECP-cis-1). The results in prohibitively expensive NAMD simulations, thus NAMD simulations with CASSCF(2,2)/cc-pVDZ used the fewest limiting broad understanding of excited-state structure–reac- switches surface hopping algorithm (FSSH)10–12 in Hammes- tivity relationships in photochemistry. Schiffer/Tully (HST) scheme10 with decoherence correction of An increasing number of studies reported that tting 0.1 Hartree23 implemented in OpenMolcas 19.11.21 We gener- machine learning (ML) potentials could substantially accelerate ated 1500 initial geometries and associated velocities near the the NAMD simulation.13–15 Aleotti et al. have parameterized ad equilibrium geometry of trans-1 with Wigner sampling at 300 K. hoc force elds for a 10 ps dynamic simulation of azobenzene.16 The NAMD trajectories were propagated at 300 K (Nosé–Hoover This article is licensed under a 24 Westermeyr et al. have trained multilayer feedforward neural thermostat ) from the S1 Franck–Condon region for 500 fs networks (NNs) to enable 1 ns simulation of methyl- with 0.5 fs timesteps. It is important to note that applying + 17 enimmonium cation (CH2NH2 ) in 59 days. Later, they have a thermostat can bias the excited-state dynamics because the Open Access Article. Published on 10 March 2021. Downloaded 4/27/2021 5:13:29 PM. extended the application of the NN model in predicting excited- excited-state lifetime of a few hundred femtoseconds o en does states electronic properties for other small molecules, such as not suffice for thermalization. We used a thermostat to reduce SO2 and CSH2 employing a deep continuous-lter the kinetic energy gained from surface hopping, which convolutional-layer neural network, SchNet, combined with compensates for the overestimated excitation energy of CASSCF SHARC.18 Recently, Ha et al. have trained the SchNet model to calculation. We collected 1371 of the 1500 trajectories that study the excited-state dynamics of penta-2,4-dieniminium reached the ground-state within 500 fs, whereas the others cation.19 Applying

Chemical Science

Real-Time Pymol Visualization for Rosetta and Pyrosetta

An Explicit-Solvent Conformation Search Method Using Open Software

Python Tools in Computational Chemistry (And Biology)

Computational Chemistry for Chemistry Educators Shawn C

Preparing and Analyzing Large Molecular Simulations With

Kepler Gpus and NVIDIA's Life and Material Science

Trends in Atomistic Simulation Software Usage [1.3]

Methods for the Refinement of Protein Structure 3D Models

A Coarse-Grained Molecular Dynamics Simulation Using NAMD Package to Reveal Aggregation Profile of Phospholipids Self-Assembly in Water

Statistics of Software Package Usage in Supercomputer Complexes

Chemistry 6V39

NAMD User's Guide