SBM CDT 2019 Computational Module

Day 2

Dr Fernanda Duarte Department of Chemistry, University of Oxford

http://fduartegroup.org

Workplan

              Tuesday       Wednesday      Friday
9:00-10:00                  Lecture 2      Lecture 3*
10:30-12:00   Lecture 1     Project Work   Project Work
14:00-17:00   Lab session   Project Work   Presentations

Outline (Lecture 2)

•Day 1

•The good side: Applications of DFT in Chemistry

•The other side …. Challenges in DFT modelling

•A bit more on Functionals and Basis sets

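Why should you care?

Schrödinger equation: $\hat{H}\Psi = E\Psi$

Born-Oppenheimer Approximation: $\hat{H} = \hat{H}_N + \hat{H}_e$, $\Psi = \psi_e \psi_N$

Electronic Schrödinger Equation: $\hat{H}_e \psi_e = E_e \psi_e$

$$\hat{H}_e = -\frac{\hbar^2}{2m}\sum_{i}^{\text{electrons}} \nabla_i^2 \;-\; \sum_{i}^{\text{electrons}}\sum_{A}^{\text{nuclei}} \frac{Z_A}{|r_i - R_A|} \;+\; \sum_{i<j}^{\text{electrons}} \frac{1}{|r_i - r_j|}$$

Terms, in order: kinetic energy; Coulomb attraction (nuclei-electrons); electronic repulsion.

Theory

Modelling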

Experiments

Synthesis

Kinetics

spectroscopy

“Artificial Intelligence will not replace chemists. But chemists who don’t use AI will be replaced by those who do.” Willem Van Hoorn

Computational Chemistry

What is it, and why is it relevant?

Which System Do I Have?

What Do You Want to Compute (and Why)?

Which Model /Method Should I Choose?

Verify Approach (vs. Experiment)

Interpret/Analyse

Computational Chemistry

Which System Do I Have?

10 atoms – organic molecule – singlet

Computational Chemistry

What Do I Want to Compute (and Why)?

Asymmetric Induction via 1,2-Addition to Carbonyl Compounds

Conformations for the starting material and TS

Which product is preferred? What is the molecular origin of such preference?

Computational Chemistry

Which Model /Method Should I Choose?

Chemical Accuracy

Hierarchy of density functional approximations, from Hartree Fock theory (bottom) to Chemical Accuracy (top); axis labels: Simplicity, Accuracy.

• double hybrid ({φi}): ωB97X-2, XYG3, B2PLYP
• hyper-meta-GGA (εx): M06-2X, M11, TPSSh
• hybrid-GGA (εx): B3LYP, mPW1K
• meta-GGA (τ or ∇²ρ(r)): τHCTH, TPSS, M06-L
• GGA (∇ρ(r)): PBE, BLYP, OLYP, B97
• LDA (ρ(r)): VWN, GPW92

HF/3-21G: NOT recommended. Many known deficiencies. But fast…

Wong and Paddon-Row, "Theoretical evidence in support of the Anh–Eisenstein electronic model in controlling π-facial stereoselectivity in nucleophilic additions to carbonyl compounds", J. Chem. Soc. Chem. Commun. 1990, 456

Computational Chemistry

Which System Do I Have?

What Do You Want to Compute (and Why)?

Which Model /Method Should I Choose?

Verify Approach (vs. Experiment/Previous studies)

Often the most interesting result is when the “calculation gets it wrong”

Interpret/Analyse

Computational Chemistry

Conformational Analysis: 2 minima

ΔG = -RT ln K, so K = exp[-ΔΔG/RT]

K = exp[-3.6 kcal mol-1 / ((0.001987 kcal K-1 mol-1)(298 K))] = 0.0023 = [A]/[B], where B is the lower-energy minimum (min1)

Thus % min2 = (0.0023/1.0023) x 100% = 0.2% and % min1 = (1/1.0023) x 100% = 99.8%

T = 298 K
R = 0.001987 kcal K-1 mol-1 = 8.3144598 x 10^-3 kJ K-1 mol-1
h = 6.6262 x 10^-34 J s
kB = 1.3807 x 10^-23 J K-1

Computational Chemistry

Conformational Analysis: 2 minima

Eyring equation: $k = \frac{k_B T}{h} e^{-\Delta G^{\ddagger}/RT}$

T = 298 K
R = 0.001987 kcal K-1 mol-1 = 8.3144598 x 10^-3 kJ K-1 mol-1
h = 6.6262 x 10^-34 J s
kB = 1.3807 x 10^-23 J K-1

Computational Chemistry

Asymmetric Induction via 1,2-Addition to Carbonyl Compounds

Cornforth model J. Chem. Soc. 1959

polar Felkin−Anh (PFA) model Tetrahedron Lett. 1968

Computational Organic Chemistry Steven M. Bachrach

Paddon-Row, Rondan & Houk J. Am. Chem. Soc. 1982, 104, 7162
Houk, Paddon-Row, Rondan, Wu, Brown, Spellmeyer, Metz, Li & Loncharich Science 1986, 231, 1108
Cee, Cramer & Evans J. Am. Chem. Soc. 2006, 128, 2920

Computational Chemistry

Physical Organic Chemistry

Reaction scheme (in benzene): at RT the endo diastereomer is formed (Kinetic Product); on heating (Δ) the exo diastereomer is formed (Thermodynamic Product). Scheme label: "transforms slowly at room temperature".

https://pubs.rsc.org/en/Content/ArticleLanding/2016/CS/C6CS00573J

Computational Chemistry

Free energy profile for the endo and exo pathways: reactants at 0.0; transition states at 20.4 and 19.6 (difference marked ΔΔG); products at -6.7 and -7.6 (difference marked ΔΔG rxn).

Computational Chemistry

Computational Chemistry

Conformations

ΔG (kcal mol-1)   n = exp[-(ΔGi-ΔG1)/RT]   ni/∑n (%)
0.0               1.00                     83.4
1.0               0.18                     15.4
2.5               0.01                     1.2
5.0               0.00                     0.0
Sum               1.20                     100.0

T = 298 K
R = 0.001987 kcal K-1 mol-1 = 8.3144598 x 10^-3 kJ K-1 mol-1
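The population table above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `boltzmann_populations` is hypothetical, and it assumes relative free energies in kcal mol-1 with the slide's value of R.

```python
import math

R_KCAL = 0.001987  # gas constant in kcal K^-1 mol^-1 (value used on the slide)


def boltzmann_populations(rel_free_energies, temperature=298.0):
    """Return fractional populations for relative free energies in kcal/mol."""
    rt = R_KCAL * temperature
    # Shift by the lowest energy so the most stable conformer has weight 1.
    g_min = min(rel_free_energies)
    weights = [math.exp(-(g - g_min) / rt) for g in rel_free_energies]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    energies = [0.0, 1.0, 2.5, 5.0]
    for g, p in zip(energies, boltzmann_populations(energies)):
        print(f"{g:4.1f} kcal/mol -> {100 * p:5.1f} %")
```

Calling `boltzmann_populations([0.0, 3.6])` covers the two-minima case in the same way.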

http://www.metadynamics.cz/eyring/eyring.html

Computational Chemistry

Kinetics

$k = \frac{k_B T}{h} e^{-\Delta G^{\ddagger}/RT}$

T = 298 K
R = 0.001987 kcal K-1 mol-1 = 8.3144598 x 10^-3 kJ K-1 mol-1
h = 6.6262 x 10^-34 J s
kB = 1.3807 x 10^-23 J K-1

ΔG‡ (kcal mol-1)   k (s-1)        t1/2 (s)      t1/2
12                 9.8 x 10^3     7.1 x 10^-5   70.5 μs
17                 2.11           3.3 x 10^-1   327 ms
22                 4.5 x 10^-4    1.5 x 10^3    25 min
27                 9.8 x 10^-8    7.1 x 10^6    81.1 days
30                 6.2 x 10^-10   1.1 x 10^9    35.5 years
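As a rough check of the table, the Eyring equation can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, the constants are the slide's values, and it assumes a transmission coefficient of 1 and first-order kinetics for the half-life.

```python
import math

K_B = 1.3807e-23   # Boltzmann constant, J K^-1 (slide value)
H = 6.6262e-34     # Planck constant, J s (slide value)
R_KCAL = 0.001987  # gas constant, kcal K^-1 mol^-1 (slide value)


def eyring_rate(dg_act, temperature=298.0):
    """First-order rate constant (s^-1) for a barrier dg_act in kcal/mol."""
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-dg_act / (R_KCAL * temperature))


def half_life(rate_constant):
    """Half-life (s) of a first-order process."""
    return math.log(2) / rate_constant


if __name__ == "__main__":
    for barrier in (12.0, 17.0, 22.0, 27.0, 30.0):
        k = eyring_rate(barrier)
        print(f"{barrier:4.0f} kcal/mol  k = {k:9.2e} s^-1  t1/2 = {half_life(k):9.2e} s")
```

Each additional 1.4 kcal mol-1 of barrier at 298 K slows the reaction by roughly a factor of ten, which is why the half-life spans microseconds to decades over this range.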

Computational Chemistry

"When there are competing pathways leading from interconverting intermediates, the product ratio is determined by the relative heights of the highest energy barriers leading to the products"
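The statement above (commonly called the Curtin-Hammett principle) can be turned into a number. The following sketch is an illustration under stated assumptions: both barriers are measured from the same reference state, the two pathways share the same pre-exponential factor, and the function name and example barriers are hypothetical.

```python
import math

R_KCAL = 0.001987  # gas constant, kcal K^-1 mol^-1


def product_ratio(dg_act_a, dg_act_b, temperature=298.0):
    """Ratio [A]/[B] of products formed over barriers dg_act_a and dg_act_b (kcal/mol)."""
    ddg = dg_act_a - dg_act_b
    return math.exp(-ddg / (R_KCAL * temperature))


if __name__ == "__main__":
    # Illustrative barriers differing by 0.8 kcal/mol.
    ratio = product_ratio(19.6, 20.4)
    print(f"A:B = {ratio:.1f}:1")
```

A difference of under 1 kcal mol-1 already changes the product ratio several-fold, which is comparable to the typical error of many DFT methods.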

Computational Chemistry

Experimental Determinations of Activation Parameters

ΔG‡ = ΔH‡ – TΔS‡

Enthalpy: energy associated with conformation, bond strength and vibrational states, and how changes in these properties affect the overall energy of the system. Enthalpy can be related to the height of the surface, while entropy is related to the width of the channels leading from one energy well to another.

Computational Chemistry

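Experimental Determinations of Activation and Arrhenius Parameters

ΔG‡ = ΔH‡ – TΔS‡

The Eyring equation can be mathematically manipulated to give the equation of a line with a dependence on temperature:

$$k = \frac{k_B T}{h}\exp\left(\frac{\Delta S^{\ddagger}}{R}\right)\exp\left(-\frac{\Delta H^{\ddagger}}{RT}\right)$$

$$\ln k = \ln\frac{k_B T}{h} - \frac{\Delta H^{\ddagger}}{RT} + \frac{\Delta S^{\ddagger}}{R}$$

$$\ln\frac{k h}{k_B T} = -\frac{\Delta H^{\ddagger}}{RT} + \frac{\Delta S^{\ddagger}}{R}$$

Eyring plot: $\ln(kh/k_B T)$ against 1/T (K-1). The slope gives ΔH‡ (slope = $-\Delta H^{\ddagger}/R$) and the y-intercept gives ΔS‡ (intercept = $\Delta S^{\ddagger}/R$).

Similar manipulation of the Arrhenius equation allows one to experimentally determine values for Ea and A:

$$k = A\exp\left(-\frac{E_a}{RT}\right), \qquad \ln k = \ln A - \frac{E_a}{RT}$$

Arrhenius plot: ln k against 1/T (K-1). The slope gives Ea (slope = $-E_a/R$) and the y-intercept gives A (intercept = ln A).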
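An Eyring analysis of rate constants measured at several temperatures amounts to a straight-line fit. The sketch below is one possible way to do it in plain Python: the function names are hypothetical, the data are synthetic (generated from made-up activation parameters, not from any experiment), and a simple unweighted least-squares fit is assumed.

```python
import math

K_B = 1.3807e-23  # Boltzmann constant, J K^-1
H = 6.6262e-34    # Planck constant, J s
R = 8.3144598e-3  # gas constant, kJ K^-1 mol^-1


def eyring_fit(temperatures, rate_constants):
    """Fit ln(k h / (k_B T)) against 1/T by least squares.

    Returns (dH_act in kJ/mol, dS_act in kJ K^-1 mol^-1).
    """
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(k * H / (K_B * t)) for t, k in zip(temperatures, rate_constants)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # equals -dH_act / R
    intercept = y_mean - slope * x_mean  # equals dS_act / R
    return -slope * R, intercept * R


def eyring_rate(dh_act, ds_act, temperature):
    """Forward Eyring equation, used here only to generate synthetic data."""
    prefactor = K_B * temperature / H
    return prefactor * math.exp(ds_act / R) * math.exp(-dh_act / (R * temperature))


if __name__ == "__main__":
    temps = [280.0, 290.0, 300.0, 310.0, 320.0]
    ks = [eyring_rate(80.0, -0.050, t) for t in temps]  # hypothetical parameters
    dh, ds = eyring_fit(temps, ks)
    print(f"dH = {dh:.1f} kJ/mol, dS = {1000 * ds:.1f} J/(K mol)")
```

The same fitting routine gives Arrhenius parameters if ln k is used as the y value instead, with Ea from the slope and ln A from the intercept.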

What is DFT useful for?

Phosphate/sulfate hydrolysis

Dissociative Associative

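Figure: dissociative and associative transition states with labelled values 2.34, 2.45, 2.27 and 1.75; plot labels include "Bond Forming" and "P-Olg Bond Breaking".

Neese et al. J. Chem. Phys. 2013, 138, 034106
Kumar et al. Chem. Sci. 2018, 9, 2655

What is DFT useful for?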

Phosphate/sulfate hydrolysis

associative / dissociative

Linear Free energy Relationship (LFER): plot of log k / s-1 (axis from -4 to -18) against pKa (axis from 6 to 14), annotated with -1.42±0.03.

Leaving groups: a. 3,5-NO2; b. 4-NO2; c. 3-NO2-4-Cl; d. 3-NO2; e. 3,4-Cl; f. 3-Cl; g. 4-Cl; h. H

Duarte et al. J. Am. Chem. Soc. 2015, 137, 1081 (Cover article and Spotlight)
Duarte et al. J. Am. Chem. Soc. 2016, 138, 10664

What is DFT useful for?

Magnitudes and origins of nonbonded interactions

Cation–π interactions: [C6H6][NH4]+, [C6H6][Gdm]+, [C6H6][ImiT]+, [C6H6][ImiP]+

Neese et al. J. Chem. Phys. 2013, 138, 034106
Kumar et al. Chem. Sci. 2018, 9, 2655

Excerpt shown on the slide (Chemical Science, Edge Article):

Fig. 2 Cation–π complexes analyzed in this work. Models of (A)/(B) lysine; (C)/(D), arginine; and (E)–(H), histidine sidechains interacting with benzene.

… with one or three N–H atoms facing the aromatic ring.[41] This binding mode is most prevalent in protein inter-residue interactions.[42] Guanidinium–benzene complexes can adopt at least two conformations: perpendicular (T-shaped) or parallel. While T-shaped geometries are preferred in gas-phase, parallel complexes are preferred in solution and have been observed more frequently in protein structures.[43] We studied the parallel configuration, based on its biological relevance.[7,44] For [C6H6][Imi]+ complexes, both perpendicular and parallel arrangements are found in protein interactions.[45] We included both interaction types in our analysis.

We studied the interaction energy (Eint) of each model cation–π complex as a function of intermolecular separation and of horizontal displacement parallel to the aromatic plane. Distances were calculated between the center of mass of both species in the complex. Cartesian (x,y) displacements are defined with respect to the benzene ring as shown (Fig. 3). Potential energy curves (PECs) were generated at the domain-based local pair-natural orbital coupled cluster with perturbative triple excitations, DLPNO-CCSD(T),[46] level of theory. An augmented, correlation consistent basis set, aug-cc-pVTZ, was used. The convergence of valence DZ, TZ, and QZ quality basis sets was examined for the [benzene][Na]+ complex (Fig. S1†). The aug-cc-pVTZ equilibrium separation closely matches that obtained with a larger aug-cc-pVQZ basis, with an interaction energy within 0.5 kcal mol-1. DLPNO-CCSD(T) energies achieve an accuracy of 1 kcal mol-1 or better compared to CCSD(T),[47] while CCSD(T)/CBS values are generally considered benchmark values for intermolecular interaction energies.[48] DLPNO-CCSD(T) thermochemistry of small organic molecules is accurate to within 3 kJ mol-1 against experimental data.[49] For complexes (A), (C), (E), and (G) CCSD(T) calculations were also carried out using a dielectric constant of 4.2 (diethyl ether) and 78.4 (water).

In relation to the D6h symmetry of benzene, two vectors in the plane of the ring represent extreme scenarios of displacement: one towards a C–H bond (angle displacement) and the other towards a C–C bond (side displacement). These vectors are related by a rotation of 30° about the C6 axis (Fig. 3). By plotting a potential energy curve (PEC) with vertical distance …

Fig. 3 (Left) Parameters describing the relative geometry for PEC calculations between the cation and benzene using the distance (R), vertical offset (Rz, along the normal) and horizontal offsets (Rx and Ry, parallel to the plane of benzene). (Right) The side and angle displacements of the cation relative to benzene corresponding to vectors pointing to a C–C/C–H bond by adjusting X and Y coordinates, used to describe the difference in geometry between pairs of complexes, e.g. (E) and (F).

This journal is © The Royal Society of Chemistry 2018. Chem. Sci., 2018

What is DFT useful for?

Magnitudes and origins of nonbonded interactions

Figure 4. Top: DLPNO-CCSD(T)/aug-cc-pVTZ interaction energies (kcal mol–1) as a function of intermolecular separation (Rz) along an intermolecular axis perpendicular to the aromatic plane, computed for all eight cation–π complexes. Minimum energies (Emin) and equilibrium separations shown. Bottom: isosurfaces at the minimum energy separations.

Panels on the slide: [C6H6][NH4]+ and [C6H6][Gdm]+ (complexes A to D), with energy labels –19.2, –18.7, –7.5 and –7.4 kcal mol–1.

Slide labels: DLPNO-CCSD(T)/aug-cc-pVTZ; CPCM-MP2/cc-pVTZ; NCI

Figure 7. Normalized distance dependence of empirical interactions (bars) compared against DLPNO-CCSD(T)/aug-cc-pVTZ computed potential energy curves (kcal mol–1) in the gas phase and with a dielectric constant of 4.2 (diethyl ether) and 78.4 (water). Solvation corrections were computed at the CPCM-MP2/cc-pVTZ level of theory.

Each cation–π interaction is weakened by the presence of a surrounding dielectric medium. For lysine, the interaction strength decreases to 19% of its gas-phase value (to 3.6 kcal mol–1) for ε = 4.2, and 7% (1.3 kcal mol–1) for ε = 78.4. For arginine, the decrease is less pronounced at these values of dielectric constant, to 47% (3.2 kcal mol–1) and 34% (2.3 kcal mol–1); in water this interaction is stronger than lysine’s even though it is less favorable by 11.5 kcal mol–1 in the gas-phase. With the SMD solvation model,[66] arginine’s cation–π interaction is also stronger than lysine’s in water, although the interaction strengths were greater. A previous computational estimate of the lysine–benzene interaction strength in water is larger (5.5 kcal mol–1 with SM5.42R/HF/6-31+G*)[7b] compared to the values of 1.3/2.8 kcal mol–1 (CPCM/SMD) we have obtained with correlated wavefunction theory and a larger basis set.

What is DFT useful for?

Kinetic Catalytic Model

DFT optimisation and frequency: B3LYP/6-311G(d)
Single-point energies: CCSD(T)-F12 (explicit treatment of electron correlation)

Harvey et al Angew. Chem. Int. Ed. 2014, 53, 8672

Excerpt shown on the slide (Angewandte Chemie):

Scheme 1. Modeled catalytic cycle for alkene hydroformylation and hydrogenation.

… TSs corresponding to such mechanisms, but could not locate any that would be competitive with the dissociative route.[16]

The next steps in the mechanism are insertion of the alkene into the Co–H bond of 4 through TS5 to yield the 16-electron ethyl complex 6, which can add carbon monoxide (without a potential energy barrier) to form the corresponding tetracarbonyl species 7. This can in turn undergo CO insertion into the cobalt–alkyl bond through TS8 to yield an unsaturated acyl species 9. Addition of carbon monoxide to 9 involves a small potential energy barrier, TS10, because of the need to “displace” the acyl oxygen atom that interacts with the formally vacant site at the metal. This process yields the saturated acylcobalt tetracarbonyl species 11, the intermediate with the lowest potential energy in the whole catalytic cycle (though it lies higher in standard free energy at 150 °C than 7). From this stable species, the first step towards the hydroformylation product is the reverse of the last process: loss of carbon monoxide to yield 9, followed by addition of molecular hydrogen (over the low barrier TS12) leading to the dihydrogen complex 13. Dihydrogen activation can then occur to yield the dihydride 15, followed by reductive elimination to form a weakly bonded complex 17 between the product and [HCo(CO)3]. Finally, to complete the formal catalytic cycle, this species can release the product and add carbon monoxide.

While the above cycle is described as starting from monomeric 1, this species is known to be in (fairly rapid) equilibrium with [Co2(CO)8] 23 under catalytic conditions. For this species CCSD(T) calculations are not possible (because of computational expense and multireference behavior) so our computed energy is based on DFT. The calculated free energy change of 30.6 kJ mol-1 for forming two equivalents of 1 from 23 and hydrogen is in good agreement with the experimental value measured in heptane (22.6 kJ mol-1).[17]

Alkene hydrogenation is a wasteful side reaction in some applications of hydroformylation, but has not been considered in previous studies of the cobalt-catalyzed reaction.[10] We propose that it occurs from intermediate 6, by addition of H2 instead of CO, to yield the dihydrogen complex 20, over a low barrier TS19. In contrast to the case of the related complex 13, which yields product 18 through oxidative addition/reductive elimination, release of propane 22 is found to occur through a one-step σ-bond metathesis over TS21. This transition state has a structure somewhat similar to that of the reductive elimination TS16 in the hydroformylation mechanism. The putative TS for oxidative addition of H2 to 20 has apparently disappeared. It is noteworthy that the corresponding TS14 lies lower in potential energy than the CoIII dihydride species 15, perhaps because this species is a minimum at the B3LYP level of theory used for optimization, but not at the CCSD(T) level. Note that hydrogenation of the aldehyde product to form an alcohol also often occurs during hydroformylation (especially with phosphine-modified catalysts).[3] Exploring the mechanism and relative rate of this reaction would be valuable but goes beyond the scope of this study.

Based on our experience,[18] our CCSD(T)-F12 protocol should yield accurate energies for transition metal compounds, with an error of approximately 10 kJ mol-1. The experimental room temperature enthalpy change for the hydrogenation and hydroformylation of propene are -123.9 and -109.9 kJ mol-1 respectively,[19] compared to calculated values of -122.9 and -108.4 kJ mol-1.

Angew. Chem. Int. Ed. 2014, 53, 8672–8676, © 2014 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim, www.angewandte.org

Which Software Do I Use?

• Turbomole 6.2 $$ (http://www.turbomole.com)
• 09/16 $$$$$$ (http://www.gaussian.com)
• Q-Chem 3.2 $$$$ (http://www.q-chem.com)
• Molpro7 $$$$ (http://www.molpro.net): Accurate correlated ab initio methods
• ADF 2010 $$$$$$$$$$$$$$$$$$$$$$$$$ (http://www.scm.com): General purpose, DFT-oriented
• Molcas 7 $? (http://www.teokem.lu.se/molcas): Excited states (CASSCF, RASSCF, CASPT2)
• Jaguar 2010 $$$$$$$$$$$$$$$$$$$$$$$$$ (http://www.schrodinger.com/products/14/7): General purpose, fast DFT
• Crystal 09 $ (http://www.crystal.unito.it): Solid state and physics, periodic conditions
• Spartan’10 $$ (http://www.wavefun.com/products/spartan.html): General purpose, GUI included

Two further descriptions appear on this slide and belong to entries among Turbomole, 09/16 and Q-Chem: "General purpose, easy interface" and "General purpose, fast DFT and post-HF".

Which Software Do I Use?

• Abinit 6.6 (http://www.abinit.org): Light and portable DFT code
• GAMESS Oct1, 2010 (http://www.msg.ameslab.gov/gamess): General purpose and highly scalable
• 6.6 (http://wiki.chem.vu.nl/dirac/index.php/Dirac_Program): Properties using relativistic calculations
• NWChem 6.0 (http://www.nwchem-sw.org): General purpose and intensively parallelized
• Siesta 3.0 (http://www.icmab.es/siesta): Simulations of materials
• Orca 2.8 (http://www.thch.uni-bonn.de/tc/orca): General purpose, extra-fast RI-DFT and RI-CC
• CPMD 3.13 (http://www.cpmd.org): Carr-Parrinello
• 2.0 (http://www.kjemi.uio.no/software/dalton): General purpose, multi-reference calculations
• CP2K (http://cp2k.berlios.de): Solid state, liquids and biological simulations
• Mopac 2009 (http://openmopac.net/MOPAC2009.html): Semiempirical methods (PM3, PM6)
• Octopus 3.2 (http://www.tddft.org/programs/octopus/wiki): TDDFT
• SAPT 2008 (http://www.physics.udel.edu/~szalewic/SAPT): Symmetry-Adapted Perturbation Theory

Break
