METHODOLOGY and APPLICATIONS. by Vitalii Vanovschi a Disse

ELECTRONIC STRUCTURE AND SPECTROSCOPY IN THE GAS AND CONDENSED PHASE: METHODOLOGY AND APPLICATIONS.

Vitalii Vanovschi

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulﬁllment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (CHEMISTRY)

January 2009

I’d like to thank my scientiﬁc adviser and mentor Prof. Anna I. Krylov. Her advice and support were the critical success factors in the projects presented in this thesis. Her enthusiasm toward the results I’ve obtained was highly encouraging. I appreciate the mentoring on the number various topics Anna provided me with. Also, I would like to thank Anna for the team-building activities she organized: hikes, trips, rock-climbing. Thanks to the current and the former members of Anna’s group, especially Vadim Mozhayskiy, Dr. Kadir Diri, Evgeny Epifanovsky, Lucasz Koziol, Dr. Sergey V.Levchenko and Dr. Piotr A. Pieniazek for their help throught the projects. I’d like to thank Prof. Curt Witting for the fascinating physical theories I’ve learned in his class and for mentoring me during the course of studies. I am grateful to our main collaborator on the effective fragment potential project, Prof. Lyudmila Slipchenko for answering all the numerous questions I had about this complicated method. Also I’d like to thank Prof. Mark S. Gordon for organizing a welcome visit to Iowa State University where, working in the laboratory of the original EFP inventors, I made a substantial progress on this project. I’d like to thank Dr. Yihan Shao and Dr. Ryan Steele who helped me to ﬁgure out the nuts and bolts of Q-Chem. Especially I’d like to thank Dr. Christopher Williams for implementing the multipole integrals, required for the EFP electrostatics and polarization components.

ii Table of Contents

Acknowledgements ii

List of Figures v

List of Tables viii

Abstract ix

1 Implementation of the Effective Fragment Potential method 1 1.1 Introduction ...... 1 1.2 Electrostatics ...... 9 1.3 Polarization ...... 14 1.4 Dispersion ...... 21 1.5 Exchange-repulsion ...... 25 1.6 Implementation of EFP in Q-Chem ...... 28 1.7 EFP application: Beyond benzene dimer ...... 37 1.8 Conclusions ...... 50 Reference list ...... 51

2 Structure, vibrational frequencies, ionization energies, and photoelectron spectrum of the para-benzyne radical anion 57 2.1 Introduction ...... 57 2.2 Molecular orbital picture and the origin of vibronic interactions in the para-benzyne anion ...... 60 2.3 Computational details ...... 65 2.4 Equilibrium structures, charge distributions, and the D2h→ C2v potential energy proﬁles at different levels of theory ...... 67 − 2.5 Vibrational frequencies and the photoelectron spectrum of para-C6H4 . 77 2.6 DFT self-interaction error and vibronic interactions ...... 82 2.7 Calculation of electron afﬁnities of p-benzyne ...... 86 2.8 Conclusions ...... 90 Reference list ...... 92

iii 3 Future work 96 3.1 Methodology ...... 96 3.2 Implementation ...... 100 3.3 Applications ...... 101 Reference list ...... 102

Bibliography 104

A GAMESS input for generating EFP parameters of benzene 117

B Geometries of benezene oligomers 119

C Converter of the EFP parameters 122

D Fragment-fragment electrostatic interaction 128

E Buckingham multipoles 130 E.1 The conversion from the sperical to the Buckingham multipoles . . . . 130 E.2 The conversion from the Cartesian to the Buckingham multipoles . . . . 132

F The Q-Chem input for EFP water cluster 134

G Q-Chem’s keywords affecting the EFP parameters computation 138

iv List of Figures

1.1 Comparison of continuum (a) and discrete (b) models for a water molecule in the environment of other water molecules...... 3

1.2 Uracil in aqueous solution. Active region (uracil) is shown with a balls and sticks model. Effective fragments (water) are depicted by lines. . . 4

1.3 Distributed multipole expansion points for a water molecule...... 10

1.4 Sketch of the charge penetration phenomenon for two water molecules. 13

1.5 The shape of the grid (monotonous gray area) for ﬁtting the screening parameters of a water molecule...... 15

1.6 Distributed polarizabilities for a water molecule...... 16

1.7 Computing the induced dipoles through the two level self-consistent iterative procedure. Here Ψ is the ab initio wavefunction, µ is a set of induced dipoles, FΨ is the force due to the ab initio wavefunction and Fµ is the force due to the induced dipoles...... 19 1.8 Distributed dispersion points for a water molecule...... 23

1.9 The exchange-repulsion LMO centroids for a water molecule...... 27

1.10 Class diagram for the EFP energy part of the implementation...... 29

1.11 The internal structure of the FragmentParams class...... 29

1.12 Representing the orientation with the Euler angles. xyz is the ﬁxed system, XYZ is the rotated system, N is the intersection between the xy and XY planes called the line of nodes. Transformation between the xyz and XYZ systems is given by α rotation in the xy plane, followed by β rotation in the new y’z’ plane (around the N axis), followed by γ rotation in the new x”y” plane...... 30

1.13 The rotation matrix corresponding to the Euler angles α, β, γ...... 31

v 1.14 The ﬂow of control between Q-Chem and EFP energy module...... 32

1.15 The format for the $efp params section...... 35

1.16 The format for the $efp fragments section...... 35

1.17 Sandwitch (S), t-shaped (T) and parallel-displaced (PD) structures of the benzene dimer...... 40

1.18 The structures of eight benzene trimers...... 45

1.19 Dependence of the non-additive energy share on the length of benzene oligomers...... 50

2.1 Frontier MOs and leading electronic conﬁgurations of the two lowest electronic states at D2h and C2v structures...... 62

− 2.2 The two lowest electronic states of C6H4 can be described by: (i) EOM- IP with the dianion reference; (ii) EOM-EA with the triplet neutral p- benzyne reference; (iii) EOM-SF with the quartet p-benzyne anion reference...... 64

2.3 D2h structures optimized at different levels of theory...... 68 2.4 CCC angles, bonds lengths calculated by different coupled-cluster methods and B3LYP...... 69

2.5 The distance between the radical-anion centers calculated by different coupled-cluster methods and B3LYP...... 70

2.6 The NBO charge at C1 (top panel) and dipole moments (bottom panel) at the optimized C2v geometries. The ROHF-CCSD and ROHF-EA- CCSD values are calculated at the geometries optimized by the corresponding UHF-based method...... 71

2.7 Energy differences between the C2v and D2h structures (6-311+G** basis). Upper panel: calculated using the UHF-CCSD optimized geometries. MCSCF and CASPT2N employ the MCSCF geometries. Lower panel: calculated at the respective optimized geometries...... 73

2.8 The potential energy proﬁles along the D2h-C2v scans. At the D2h structure, CCSD and CCSD(T) energies are calculated using symmetry-broken and pure-symmetry UHF references (see section 2.3)...... 75

2.9 Displacements along the normal modes upon ionization (D2h structure). 80

vi 2.10 Top: Singlet (solid line) and triplet (dotted line) bands of the electron photodetachment spectrum for D2h using the CCSD equilibrium geometries and frequencies of the anion. Bottom: the experimental (solid line) and the calculated (dotted line) spectra for D2h using the CCSD structure and normal modes...... 81

2.11 The experimental (solid line) and the calculated (dotted line) spectra for C2v using the CCSD structure and normal modes...... 83 2.12 The experimental (solid line) and the calculated (dotted line) spectra for D2h using the B3LYP structure and normal modes...... 84

2.13 Energy differences between the D2h and C2v minima (using the CCSD structures) calculated using different density functionals...... 87

2.14 Adiabatic electron afﬁnities calculated by energy differences for the singlet (top panel) and triplet (bottom panel) state of the para-benzyne diradical...... 89

vii List of Tables

1.1 The geometric parameters and energies of the benzene dimers . . . . . 41

1.2 Dissociation energies of the benzene dimers at different levels of theory. 42

1.3 Total and many-body interaction energies of the benzene trimers (ﬁrst group)...... 43

1.4 Total and many-body interaction energies of the benzene trimers (second group)...... 44

1.5 Binding energies of benzene oligomers computed with EFP...... 49

2.1 Energy differences between the C2v and D2h minima (eV) in the 6- 311+G** basis set. Negative values correspond to the C2v minimum being lower. ZPEs are not included...... 74

2.2 Vibrational frequencies of the para-benzyne radical anion and the diradical...... 78 √ 2.3 Displacements in normal modes upon ionization, A ∗ amu...... 80

3 1 2.4 Electron afﬁnities (eV) of the a B1u and X Ag states of the para-benzyne diradical...... 90

viii Abstract

Method developments and applications of the condensed phase and gas phase modeling techniques are presented in two parts of this work. In the ﬁrst part, the theory of the effective fragment potential (EFP) method designed for accurate modeling of the molecular properties of the extended systems is introduced. The implementation of EFP developed in this work within the Q-Chem electronic structure package is discussed in details. The code is applied to the studies of π − π interactions in benzene oligomers. The primary interest of these studies are the many-body and total non-additive π − π interactions. A detailed comparison of EFP with ab initio theories is presented for the set of eight benzene trimers. Three-body intermolecular interaction energies at the EFP level of theory are within 0.04 kcal/mol from the full ab initio results. To elucidate the asymptomatic behavior of the total non-additive binding energy, three types of linear benzene oligomers with 10, 20, 30 and 40 monomers are modeled using our new EFP implementation. The result of this investigation is that with the increase in the size of linear benzene clusters, the share of the total non-binding energy approaches a constant and does not exceed 4% for the structures considered in the study. This result indicates that there are no substantial non-local effects in the binding mechanism of linear benzene clusters. In the second part, the equilibrium structure, vibrational frequencies, and ionization energies of the para-benzyne radical anion are characterized by coupled-cluster

ix and equation-of-motion methods. Vibronic interactions with the low-lying excited state result in a flat potential energy surface along the coupling mode and even in lower- symmetry C2v structures. Additional complications arise due to the Hartree-Fock instabilities and near-instabilities. The magnitude of vibronic interactions was characterized by geometrical parameters, charge localization patterns and energy differences between the D2h and C2v structures. The observed trends suggest that the C2v minimum predicted by several theoretical methods is an artifact of an incomplete correlation treatment. The comparison between the calculated and experimental spectrum confirmed D2h structure of the anion, as well as accuracy of the coupled-cluster and spin-flip geometries, frequencies and normal modes of the anion and the diradical. Density functional calculations (B3LYP) yielded only a D2h minimum, however, the quality of the structure and vibrational frequencies is poor, as follows from the comparison to the high-level wave function calculations and the calculated spectrum. The analysis of charge localization patterns and the performance of different functionals revealed that B3LYP underestimates the magnitude of vibronic interactions due to self-interaction error.

x Chapter 1

Implementation of the Effective Fragment Potential method

1.1 Introduction

Condensed phase encompasses the majority of matter on the Earth. Not surprisingly, many molecular systems that are significant due to their biological, electrical or mechanical properties also exist in the condensed phase. Properties of individual molecules in such systems are influenced by the environment. Therefore, intermolecular interactions play crucial role in chemical and physical properties of the condensed phase. There are three distinct classes of methods for modeling the properties of the extended systems at molecular level: ab initio (QM), classical (MM) and hybrid (QM/MM) approaches. The ab initio methods treat a molecule of interest together with the neighboring molecules as one system at the same level of ab initio theory. The benefit of this approach is that a high accuracy can be achieved, with a proper choice of the ab initio method. However, computational costs of many ab initio methods scale steeply with the size of the system [for instance, as N 7 for CCSD(T)1], which makes the full ab initio treatment of the environmental effects impractical. New directions in the development of the ab initio methods with more favorable computational costs are:

1 • Linear scaling variants of ab initio methods2 are based on local and/or density ﬁtting (resolution of identity) approximations. At present, they are available for a few ab initio methods (HF3, DFT4, MP25, CC6).

• Separation techniques such as semi-classical divide and conquer7 and fragment molecular orbitals8 decompose the molecular system into fragments (e.g., molecules), which are described by individual and thus localized wavefunctions. The wavefunctions for each fragment are computed in the ﬁeld of the frozen wave- functions of all other fragments until self-consistency is achieved.

• Plane-wave DFT represents the wavefunction in terms of the plane waves9. It is suitable for systems with periodic boundary conditions (such as metals and crystals) and features linear scaling of the computational effort.

On the other side of the theory spectrum, the whole molecular system can be modeled using molecular mechanics, that is classically using empirical force fields10–17. The advantage of this approach is its efficiency - very large systems with many thousands of atoms can be modeled. However, molecular mechanics is incapable of modeling chemical reactions or molecular properties depending on the electronic structure. Furthermore description of the intermolecular interactions is typically limited to parametrized pairwise potentials and thus lacks accuracy. In recent years, a number of hybrid methods combining the accuracy of ab initio and the efficiency of molecular mechanics by treating different parts of the system with different levels of theory has been developed18–21. The hybrid methods can be subcategorized into continuum and discrete ones (Fig. 1.1). The continuum methods consider the environment to be uniform, as the name suggests. Its effects on an individual molecule is represented with a potential that depends only on the shape of the cavity the molecule allocates and the bulk properties

2 of the matter. The simplicity of continuum models makes them computationally efﬁ- cient, but limit their accuracy. The discrete methods treat individual molecules of the environment explicitly, for instance by including all the neighbor molecules within a certain radius in the model. Unlike the fully ab initio methods, the molecules of the environment are described classically, i.e., by a molecular mechanics (MM) force ﬁeld. The discrete methods are superior in the sense that their accuracy can be systematically improved by (1) increasing the number of neighboring molecules included in the model and by (2) using a higher level theory to describe them. Overall, the continuum models are preferred when accurate description of the bulk is essential. For example, a shift in the absorption spectrum of a molecule in a solvent can be well reproduced by a continuum model. They break down when the environment is non-uniform (e.g., enzyme active center) or when relaxation is important due to the strong interaction between solvent and solute. Such cases call for discrete methods. However, to describe bulk behavior the discrete methods need to be coupled with a sampling technique (molecular dynamics or Monte Carlo).

Figure 1.1: Comparison of continuum (a) and discrete (b) models for a water molecule in the environment of other water molecules.

3 Depending on the way QM and MM parts are integrated with each other, the discrete QM/MM methods can be categorized as mechanical embedding and electrostatics embedding18. Mechanical embedding computes the MM parameters from the QM wavefunction and then treats the whole system classically. Although mechanical embedding is simple and computationally cheap, the large drawback is that the perturbation of the QM wavefunction due to MM part is ignored. Electrical embedding explicitly accounts for the presence of the MM part by including the perturbative potential from MM into the Hamiltonian of the QM system. Thus, it provides better description of the electronic structure of the QM part, at the price of the increased computational complexity. One of the promising hybrid methods based on the electrostatic embedding is the effective fragment potential (EFP) approach22–33. In EFP the molecular system is decomposed into an active region and a number of effective fragments. An example of such decomposition for uracil (active region) in water (effective fragments) is demonstrated in Fig. 1.2.

Figure 1.2: Uracil in aqueous solution. Active region (uracil) is shown with a balls and sticks model. Effective fragments (water) are depicted by lines.

The active region is a part of the molecular system that requires accurate description, for instance, because it participates in a chemical reaction or its electronic properties are

4 of interest. Therefore, the active region is modeled with an ab initio method. The effective fragments are the spectator molecules and thus do not require high accuracy for their internal description. However, since the interaction between the active region and the effective fragments affects the properties of the active region, this interaction should also be modeled with reasonable accuracy. An effective fragment in EFP is described semi- classically by a set of multipoles (with screening coefﬁcients), polarizability points, dispersion points and exchange-repulsion parameters. The interaction between the effective fragments and the active region is accounted for by adding a one-electron term to the Hamiltonian of the active region:

H0 = H + hp |V | qi , (1.1) where V is the potential due to the effective fragments. The expression for the inter-fragment interaction energy is derived by applying perturbation theory to the wavefunctions of two interacting molecules, as detailed in Ref. 53. Within this approach, the total energy for fragments A and B is:

Etotal = E0 + E1 + E2 + ...

E0 = EA + EB (1.2) E1 = h00 |T | 00i X h00 |T | mni hmn |T | 00i E2 = − , E0 − E0 mn mn 00 where EA and EB are energies of the individual isolated fragments, En are the pertur-

0 bative corrections, T is the electrostatics operator, Emn is the energy of the excited state mn for the two non-interacting fragments.

5 The ﬁrst order perturbative correction E1 is the electrostatic interaction. In EFP, it is described by a distributed multipole expansion (section 1.2). The second order perturbative correction consists of polarization and dispersion terms. The polarization component of EFP is accounted for by distributed polarizabilities (section 1.3). The dispersion is represented similarly by a set of distributed dispersion points (section 1.4). These terms are derived from the exact ab initio expressions and, therefore, their accuracy can be systematically improved. The higher order terms in perturbative expansion are expected to have smaller values and are neglected in the current EFP formalism. One of the reasons of the EFP success29 is that some of the higher order terms are believed to cancel out each other, thus omitting them does not affect the total interaction energy substantially. For example, charge-transfer and exchange-induction are similar in magnitude, but have the opposite signs and thus cancel out each other. The exchange-dispersion partially cancels out with higher order dispersion terms (such as induced dipole-induced quadrupole) that are also omitted in the EFP formalism29. Perturbative treatment of intermolecular interactions described above works well for large distances, but breaks down when molecules are close to each other and their electronic densities overlap. To account for this effect, EFP includes a charge-penetration correction (section 1.2) to the electrostatics and the exchange-repulsion component (section 1.5) accounting for the Pauli exclusion principle. Both terms are also derived from the exact ab initio expressions. Thus, EFP intermolecular interaction consists of four components: electrostatics (with charge-penetration correction), polarization, dispersion and exchange-repulsion. The interaction energy between the effective fragments is computed as:

Eef−ef = Eelec + Epol + Edisp + Eex−rep (1.3)

6 Since the active region is described by an ab initio method, the electrostatics and polarization interactions with the effective fragments are accounted for by additional one-electron terms in the Hamiltonian:

0 elec pol H = H + p V + V q (1.4)

In the present EFP model, dispersion and exchange-repulsion interactions between active region and the effective fragments are treated similarly to the fragment-fragment interactions as additive corrections to the total energy. One of the advantages of EFP is that all of its terms has been derived from the exact ab initio expressions. Thus, the parameters necessary to describe an effective fragment can be obtained from the wavefunction of an isolated molecule. The parameters are computed only once for every fragment type and then can be reused for all consequent computations. This makes EFP method very efﬁcient. Its computational cost is O(N 2) for the system with N effective fragments, plus the cost of the ab initio computation for the active region, which does not depend on the number of fragments. Another advantage of EFP is that, unlike standard QM/MM approaches, it is a polarized force ﬁeld. Therefore, the effective fragments not only polarize the active region, but also are polarized by it (polarized embedding), as well as by each other. Therefore, EFP describes polarization effects more accurately than standard QM/MM approaches. The third advantage is that the accuracy of EFP can be systematically improved by introducing higher order terms and cross terms in the perturbative expansion (e.g. exchange-dispersion), as well as higher order terms in the expansion of the individual components (e.g. hexapole-hexapole term for electrostatics, or induced dipole-induced quadrupole term for dispersion).

7 The first implementation of EFP appeared in the GAMESS34 quantum chemistry package, which is developed by the original authors of the EFP method22. By now EFP has been successfully applied for treating the environmental effects in a number of molecular systems35–41. In addition, due to the accurate description of the inter- fragment interactions, EFP has been successfully used to study the relative stabilities of the molecular clusters and the dissociation energy profiles27, 28, 42–46. The primary goal of this work was to implement the EFP method in Q-Chem47. The benefits of such implementation are: (1) the extended availability of the EFP method, (2) the ability to generate EFP fragments parameters with a higher level methods available in Q-Chem (such as CCSD) and (3) creating the base for future development of EFP in which the active region can be modeled with a correlated ab initio theory a. The second goal was to apply the developed EFP implementation to the benzene clusters. These systems are interesting due to their π-π stacking interactions, which are similar in nature to those in DNA48. The individual components of EFP are described in details in the four following sections. Section 1.6 describes the developed implementation of EFP in Q-Chem. Its application to benzene trimers and higher oligomers is presented in section 1.7. The last section of the chapter presents summary of the results and conclusion.

aThe present GAMESS implementation only enables a non-correlated description of the active region (HF or DFT).

8 1.2 Electrostatics

Electrostatic component of the EFP energy accounts for Coulomb interactions. For molecular systems with hydrogen bonds or for strongly polar molecules, electrostatics has the leading contribution to total EFP energy49. EFP electrostatics was devised to provide an accurate, yet compact and computationally efﬁcient description of the molecular electrostatic properties22. Buckingham49 has shown that the Coulomb potential of a molecule can be expressed using multipole expansion at a single spacial point k:

x,y,z x,y,z x,y,z elec qk X k 1 X k 1 X k Vk (x) = + µaFa(rkx) + ΘabFab(rkx) − ΩabcFabc(rkx), (1.5) rkx 3 15 a a,b a,b,c where q, µ, Θ, and Ω are net charge, dipole, quadrupole, and octopole, respectively, and

Fa, Fab, and Fabc are the electric field, field gradient, and field second derivatives, due to the effective fragment. This expression converges to the exact electrostatic potential when an infinite number of terms are included. However, it is known to converge slowly50 and thus requires a large number of terms to achieve a reasonable accuracy, even if such expansions are assigned to the individual atoms in the molecule, as in Mul- liken population analysis51. That makes a single point or atomic based representations of the molecular electrostatic potential impractical. In his work on distributed multipoles analysis50, 52–54, Stone has shown that a better description of the Coulomb potential can be obtained by using many points as centers of the multipole expansions. In fact, for a system with a wavefunction expressed in the basis of N gaussians, the Coulomb potential can be described exactly with a finite number of multipoles located at N 2 points corresponding to the gaussian products centers55. However, this representation becomes inefficient for systems with a large basis.

9 Numerical tests conducted on a number of systems have demonstrated that an accurate representation of the electrostatic potential can be achieved by using multipoles expansion around atomic centers and bond midpoints (that is, the points known to have high electronic density) and truncating multipoles expansions after the octopoles50, 52. This representation has been adopted by EFP22, 25. Thus, electrostatic potential of an effective fragment is speciﬁed by charges, dipoles, quadrupoles and octopoles placed at atomic centers and bond midpoints of a molecule. An example of such expansion for a water molecule is shown in Fig. 1.3.

Figure 1.3: Distributed multipole expansion points for a water molecule.

The energy of electrostatic interaction for a system of effective fragments can be expressed as:

! 1 X X X X X X X X Eelec = Eelec + Eelec + Eelec , (1.6) 2 kl IJ Il A B6=A k∈A l∈B I∈A J∈B I∈A l∈B where A and B are the fragments, k and l are the multipole expansion points, I and J are the nuclei.

10 The electrostatic interaction between two expansion points is computed using the following equation22:

elec ch−ch ch−dip ch−quad Ekl = Ekl + Ekl + Ekl + (1.7) ch−oct dip−dip dip−quad quad−quad Ekl + Ekl + Ekl + Ekl

The speciﬁc equations for charge-charge (D.1), charge-dipole (D.2), charge-quadrupole (D.3), charge-octopole (D.4), dipole-dipole (D.5), dipole-quadrupole (D.6) and quadrupole-quadrupole (D.7) are provided in Appendix D. Nuclei-multipole and nuclei-nuclei terms are given by equations (1.8) and (1.9) respectively22.

elec nuc−ch nuc−dip nuc−quad nuc−oct EIl = EIl + EIl + EIl + EIl (1.8)

elec ZI ZJ EIJ = (1.9) RIJ

Equations for nuclei-charge (D.8), nuclei-dipole (D.9), nuclei-quadrupole (D.10) and nuclei-octopole (D.11) terms can be found in Appendix D. Electrostatic interaction between an effective fragment and the ab initio part is described by introducing perturbation to ab initio Hamiltonian:

H = H0 + V (1.10)

The perturbation is a one-electron term computed as a sum of contributions from all the expansion points and nuclei of all the effective fragments:

! X X X X elec X ZI V = dpq p V q + p q , (1.11) k R p q A k∈A I∈A

11 where A is an effective fragment, I is a nucleus, p and q are atomic orbitals of the ab

elec initio part, ZI is nucleus charge, dpq is an element of the atomic density matrix and Vk is electrostatic potential from an expansion point. A one-electron contribution to the Hamiltonian from an individual expansion point consists of 4 terms, each of which originates from the electrostatic potential of the corresponding multipole22:

k Px,y,z k elec q a µaa p|V |q = p q + p q k R R3 * Px,y,z k 2 + Θ (3ab − R δab) + p a,b ab q 5 (1.12) 3R * Px,y,z k 2 + Ω (5abc − R (aδbc + bδac + cδab)) + p a,b,c abc q , 7 5R where R, a, b and c are the distance and distance components between the electron and the expansion point k.

Electrostatics parameters

The parameters q, µ, Θ, Ω of the effective fragment electrostatics need to be computed only once for every fragment type. These multipoles are obtained from an ab initio electronic density of the individual molecule using DMA (distributed multipoles analysis) procedure described by Stone50, 52. First, a molecular wavefunction expressed in a basis of N gaussians is computed with an ab initio method. Then spherical multipoles expansions are calculated at the centers of the gaussian basis function products. These expansions are ﬁnite and include multipoles up to L1 + L2 order where L1 and L2 are angular moments of the two gaussians. At the next step, these multipoles are trans- lated to the EFP expansion points (atomic centers and bond midpoints) as described in Ref. 50 and truncated after the 4th term. Thus obtained set of spherical multipoles

12 is then converted to traceless Buckingham multipoles using the equations provided in Appendix E.1.

Charge penetration

Model of EFP electrostatics described above works well when electronic densities of the effective fragments do not overlap. However, for many molecular systems the overlap is substantial and is known as charge-penetration effect. A sketch demonstrating charge penetration for two water molecules is shown in Fig. 1.4

Figure 1.4: Sketch of the charge penetration phenomenon for two water molecules.

The magnitude of this effect is commonly around 15% of the total electrostatic energy and for some systems is as large as 200%27. It was demonstrated in Ref. 24 that charge penetration can be efﬁciently accounted for by introducing a screening function as a prefactor to the Coulomb potential:

elec −αµkrx−µk elec Vµk (x) → 1 − e Vµk (1.13)

13 This screening is applied only to the potential originating from charges representing electronic density, and not to the potential of higher multipoles, which represents electronic density variations27. Therefore only the charge-charge term is modiﬁed to account for the charge-penetration24:

 k l  αkRkl −αkRkl q q  1 − (1 + )e if αk = αl ch−ch  2 rkl Ekl = (1.14) α2 α2 k l  l −αkRkl k −αlRkl q q  1 − 2 2 e − 2 2 e if αk 6= αl  αl −αk αk−αl rkl

Electrostatics screening parameters

Parameters a and b of the screening function are computed using a ﬁtting procedure described in Ref. 24 once for every type of the fragment. This procedure optimizes the screening parameters so that the sum of differences between the ab initio electrostatic potential and the EFP electrostatic potential with screening is minimized:

X Elec Elec ∆ = Vab initio(p) − VEFP (p) , (1.15) p where the sum is taken over all the points p of the grid. The grid is constructed such that it includes the points with substantial electronic density and excludes the points close to nuclei and bond midpoints, which are inaccessible to multipoles of another fragments. A projection of such grid for a water molecule is shown in Fig. 1.5.

1.3 Polarization

Polarization accounts for the intramolecular charge redistribution under the inﬂuence of an external electric ﬁeld. It is the major component of many-body interactions, which

14 Figure 1.5: The shape of the grid (monotonous gray area) for ﬁtting the screening parameters of a water molecule. are important for proper description of cooperative molecular behavior. Since polarization cannot be parametrized by two-body potentials, it is often disregarded in molecular mechanics models. However, for hydrogen bonded systems, up to 20% of the total intermolecular interaction energy comes from polarization22. The exact expression for the polarization energy appears as the second-order term of a long-range perturbation theory53:

X h00|V |0ni h0n|V |00i Epol = − , (1.16) En − E0 n6=0 where n is the excited electronic state and V is the perturbing electrostatic potential.

1 n Expanding electrostatic potential V in R series in equation (1.16) and then combining terms with the same electric ﬁeld derivatives yields the following equation:

pol 1 1 1 E = − αabFaFb − βabcFaFbFc − Aa,bcFaFbc 2 6 3 (1.17) 1 1 − B F F F − C F F + ..., 6 ab,dc a b cd 6 ab,dc ab cd

15 where Fa and Fab are electric ﬁeld derivatives, α is the dipole polarizability tensor, β is the dipole hyperpolarizability, A, B and C are quadrupole polarizability tensors. In order to achieve a better convergence, EFP uses distributed polarizabilities placed at the centers of valence LMOs. This allows one to truncate the expansion (1.17) after the ﬁrst term and still retain a reasonable accuracy22. It should be noted that unlike the total molecular polarizability tensor, which is symmetric, the distributed polarizability tensors are asymmetric and thus all of 9 tensor components are required. Polarizability points for a water molecule are shown in Fig. 1.6.

Figure 1.6: Distributed polarizabilities for a water molecule.

Let us discuss how polarization energy is computed for an individual effective fragment in an external force ﬁeld. First, induced dipoles for every polarizability point are computed:

µk = αkFk, (1.18)

where µk, αk and Fk are the induced dipole, the polarizability tensor and the external force ﬁeld at point k.

16 Then induced dipoles are used to compute the polarization energy:

pol X E = µkFk (1.19) k

Now, let us consider a system with many effective fragments and an ab initio part. In this case, the electric ﬁeld acting on an effective fragment depends on: multipoles and nuclei of other effective fragments, induced dipoles of other effective fragments and electronic density and nuclei of the ab initio part:

! X X mult X nuc X ind Fk = Fi (k) + FI (k) + Fj (k) B6=A i∈B I∈B j∈B (1.20) +F ab initio elec(k) + F ab initio nuc(k),

where Fk is the total electric ﬁeld at polarizability point k of the effective fragment A, j

mult is a polarizability point, i is a multipole expansion point, I is a nucleus, Fi (k) is the nuc ind field due to the multipole i, FI (k) is the field from the nucleus I, Fj (k) is the field due to the induced dipole j, F ab initio elec(k) and F ab initio nuc(k) are the forces from the electron density and nuclei of the ab initio part. The dependence of the total electric field on other induced dipoles and ab initio density gives rise to two complications in computing the polarization energy discussed below. The first complication arises from the fact that an induced dipole depends on the values of all other induced dipoles in all other fragments in according with equations (1.18) and (1.20). Therefore, they must be computed self-consistently.

17 Another complication originates from the dependence of the induced dipoles on the ab initio electron density, which, in turn, is affected by the ﬁeld created by these induced dipoles through a one electron contribution to the Hamiltonian:

x,y,z X X P (µa +µ ¯a)a Hind = a k k , (1.21) R3 A k∈A where R, a, b and c are the distance and distance components between the electron and

a a the polarizability point k, µk and µ¯k are components of the induced dipole and conjugated induced dipole, respectively. Thus, when the ab initio part is present, conjugated induced dipoles should also be computed:

T µ¯k = αk Fk, (1.22)

T where αk and Fk are the transposed polarizability tensor and the external force ﬁeld at the polarization point k. In the following discussion it is implied that the conjugated induced dipoles are always computed together with the regular induced dipoles. As a result of these self-consistency requirements, the total polarization energy is computed with a two level iterative procedure illustrated in Fig. 1.7. The objective of the higher level is to converge the wavefunction. The lower level is tasked with converging the induced dipoles for a given ﬁxed wavefunction. In the beginning, the induced dipoles are initialized to 0 and the electronic density is obtained through a guess procedure (such as a generalized Wolfsberg-Helmholtz proce- dure56). At every iteration of the higher level, the wavefunction is updated based on the values of the induced dipoles on the previous step. Then the force from this new wavefunction acting on the polarizability points is recomputed. The convergence criteria is that the

18 μ = 0 Exit Ψ = guess()

No Yes Compute Ψ Is Ψ converged?

Compute FΨ

No Yes Compute Fμ Is μ converged?

Compute μ

Figure 1.7: Computing the induced dipoles through the two level self-consistent iterative procedure. Here Ψ is the ab initio wavefunction, µ is a set of induced dipoles, FΨ is the force due to the ab initio wavefunction and Fµ is the force due to the induced dipoles. wavefunction parameters (molecular orbital coefficients) are within a predefined range from their values on the previous iteration. At the lower level, the values of the induced dipoles are computed at every iteration based on the electric force from the ab initio part (which does not change during the lower-level iterations), as well as the forces from multipoles, nuclei and the force from the induced dipoles of the previous lower-level iteration. The convergence criteria is that the difference between the new induced dipoles and those from the previous iteration is within a predefined threshold. Thus, when the lower-level iterative procedure exits the induced dipoles are self- consistent and are consistent with a fixed ab initio wavefunction. Convergence of the

19 higher-level iterative procedure results in the induced dipoles and the ab initio wavefunction that are self-consistent and consistent with each other. If the molecular system does not contain the ab initio part, then only the lower-level procedure is required. Once the values of the induced dipoles are obtained, the total energy of the system can be computed as22:

1 X X 1 X X E = Eother − µ (F mult + F nuc) + (µ +µ ¯ )F ab initio elec, (1.23) 2 k k k 2 k k k A k∈A A k∈A where Eother is the sum of energies of the ab initio part and other components of EFP,

A is an effective fragment, k is a polarizability point, µk and µ¯k are induced dipoles and

mult conjugated induced dipoles, Fk is the force due to the multipole expansion points of nuc ab initio elec other fragments, Fk is the force due to the all other nuclei, Fk is the force due D E ab initio elec ˆelec to the ab initio electronic density (Fk = Ψ fk Ψ ). In equation (1.23), the second term is the total polarization energy of all the effective fragments. However, the second and the third term do not add up to the total polarization energy because a part of the later is implicitly included in the ab initio energy (due to the effect of the induced dipoles on the wavefunction). Computational cost of the polarization component consists of the cost of computing the induced dipoles and the energy. The latter is a trivial step with the cost of O(N 2) for N effective fragments. Thus, the former deﬁnes the overall computational cost. At the higher level, the number of iterations h is similar to that for the ab initio system without EFP, while at the lower level the convergence is achieved within bounded number of iterations. Every lower-level iteration scales as O(N 2), thus, the overall computational cost is O(hN 2). If system does not have the ab initio part, then only the lower-level iterative procedure is required and computational cost is O(N 2).

20 Polarization parameters

The polarization parameters in the EFP model are LMOs and polarizability tensors. Both can be obtained from an ab initio calculation once for every type of the fragment and then reused in all consequent calculations. LMOs are obtained with the Boys localization procedure based on the ab initio electronic density57, 58. Polarizabilities are obtained based on the definition (1.24) as a derivative of the dipole moment (µ) w.r.t. electric field (F ) using the finite differences method.

∂µa αab = , (1.24) ∂Fb

In this approach, a dipole moment of an LMO is first computed without an electric field. Then, by means of 3 additional ab initio computations, the LMO and its dipole moment are computed in the presence of a small electric field acting along one of the coordinate axis. Based on LMO dipole moments, the polarizability tensor can be computed as:

µa(Fb) − µa(0) αab = (1.25) Fb

1.4 Dispersion

For many molecular systems dispersion is the main attractive component of the intermolecular interaction energy. An accurate description of dispersion interaction is a difﬁ- cult task even for ab initio methods — inclusion of the electron correlation at high level is required. An additional challenge for EFP is that its dispersion description must be computationally efﬁcient.

21 The exact expression (1.26) for the dispersion energy appears in the second order term of a long-range perturbation theory.

X X h00|V |mni hmn|V |00i Edisp = − , (1.26) EA + EB − EA − EB n6=0 m6=0 m n 0 0 where A and B are two interaction molecules, m and n are their excited electronic states and V is the perturbing electrostatic potential.

1 n By expanding electrostatic potential V in R series dispersion energy can be expressed in the so called London series:

C C C Edisp = 6 + 8 + 10 + ..., (1.27) R6 R8 R10

where the Cn coefﬁcients correspond to induced dipole—induced dipole (C6), induced dipole—induced quadrupole (C8), induced quadrupole—induced quadrupole and induced dipole—induced octopole (C10) etc. interactions, Thus, dispersion energy can be alternatively expressed as a sum:

Edisp = Eind dip−ind dip + Eind dip−ind quad + Eind quad−ind quad + Eind dip−ind oct + ... (1.28) A better convergence can be obtained by using a distributed dispersion model with expansion points located at localized molecular orbitals centroids59. Expansion points for a water molecule are demonstrated in Fig. 1.8. When using distributed dispersion, a reasonable accuracy can be achieved by truncating the expansion after the ﬁrst term and approximating the rest as 1/3 of the this

22 Figure 1.8: Distributed dispersion points for a water molecule. term59. Induced dipole-induced dipole interaction energy between two induced dipoles k and l can be computed by using the following equation:

x,y,z Z ∞ ind dip−ind dip X kl kl k l Ekl = Tab Tcd αac(iν)αbd(iν)dν, (1.29) abcd 0

k l where αac and αbd are dynamic polarizability tensors, iν is the the imaginary frequency, 2 kl kl kl 3ab−Rklδab Tab and Tcd are the electrostatic tensors of the second rank, i.e., Tab = 5 . Rkl In general, the interaction energy described by (1.29) is anisotropic, distance dependent and computationally expensive26. The approximation used by EFP includes only the trace of polarizability tensor, which is anisotropic and distance independent. The total dispersion energy for a system of effective fragments is:

4 X X X X Ckl Edisp = 6 , (1.30) 3 R6 A B k∈A l∈B kl

23 kl where A and B are effective fragments, k and l are LMOs, C6 is intermolecular disper- 6 sion coefﬁcient, and Rkl is the distance between two LMO centroids. Intermolecular dispersion coefﬁcients are computed using the 12 point Gauss- Legendre integration formula:

12 X 2ν0 Ckl = w α¯k(iν )¯αl(iν ), (1.31) 6 i (1 − t )2 i i i=1 i

k l where α¯ (iνi) and α¯ (iνi) are 1/3 of the traces of the corresponding polarizability tensors.

Dispersion damping

For small distances between effective fragments dispersion coefﬁcients must be corrected to take into account the charge penetration effect. In the original EFP model, the Tang—Toennies formula60 with parameter b = 1.5 is used:

∞ ! X (bR)k Ckl → 1 − e−bR Ckl (1.32) 6 k! 6 k=0

However, recent studies revealed that this formula over-damps the dispersion and a more appropriate value for b lies between 2 and 3 and is system dependent29. As an alternative, Ref. 29 suggested to use a damping function dependent on the electronic density overlap Skl between LMOs k and l:

kl kl 2 kl 2 kl kl C6 → 1 − |S | (1 − 2ln|S | + 2ln |S |) C6 (1.33)

24 Dispersion parameters

As follows from equations (1.30) and (1.31), the dispersion parameters for an effective fragment are the coordinates of LMO centroids and sets of dynamic polarizability tensor traces for 12 frequencies for each LMO. These parameters can be computed once for every type of an effective fragment and then reused in EFP calculations. Dynamic polarizabilities can be computed using dynamic time-dependent Hartree-Fock (TDHF) or dynamic time-dependent density functional theory (TDDFT) as described in Ref. 26.

1.5 Exchange-repulsion

Exchange-repulsion originates from the Pauli exclusion principle, which states that for two identical fermions the wavefunction must be anti-symmetric. In ab initio methods such as Hartree-Fock and DFT, it is accounted for by the exchange operator. In classical

1 12 force ﬁelds, exchange-repulsion is introduced as a positive (repulsive) term, e.g., r in the Lennard-Jones potential. However, since the Pauli exclusion principle is intrin- sically quantum, its classical descriptions are not accurate. EFP uses a wavefunction- based formalism to account for the electron exchange. Thus exchange-repulsion is the only non-classical component of EFP and the only one that is repulsive. The EFP exchange-repulsion energy term has been derived from the exact equation for the exchange-repulsion energy23, 61:

25 where A and B are two molecules described by ΨA and ΨB wavefunctions, VAB is the intermolecular Coulomb interaction, and Aˆ is the antisymmetrization operator:

ˆ A = P0 − P1 + P2 − P3 + ..., (1.35)

where Pi is a permutation of i electron pairs consisting of one electron from each molecule. Truncating sequence (1.35) after the second term, applying an inﬁnite basis set approximation and a spherical gaussian overlap approximation as shown in Ref. 61 leads to the expression for the exchange-repulsion energy in terms of localized molecular orbitals:

1 X X X X Eexch = Eexch (1.36) 2 ij A B6=A i∈A j∈B

r 2 exch −2ln|Sij| Sij Eij = −4 π Rij ! X A X B −2Sij Fik Skj + Fjl Sil − 2Tij (1.37) k∈A l∈B ! 2 X −1 X −1 X −1 X −1 −1 +2Sij −ZJ RiJ + 2 Ril −ZI RIj + 2 Rkj − Rij , J∈B l∈B I∈A k∈A where A and B are the effective fragments, i, j, k and l are LMOs, I and J are nuclei, S and T are intermolecular overlap and kinetic energy integrals, F is the Fock matrix element. The positions of LMO centroids for a water molecule are shown in Fig. 1.9.

exch Formula for the Eij involves intermolecular overlap and kinetic energy integrals, which are expensive to compute. In addition, since equation (1.37) is derived within an

26 Figure 1.9: The exchange-repulsion LMO centroids for a water molecule. inﬁnite basis set approximation, it requires a reasonably large basis set to be accurate. These factors make exchange-repulsion the most computationally expensive part of the EFP energy calculations of moderately sized systems. Large systems require additional considerations. Equation (1.36) contains a sum over all the fragment pairs and its computational cost formally scales as O(N 2) with the number of effective fragments N. However, exchange-repulsion is a short range interaction —- the overlap and kinetic energy integrals of LMOs vanish very fast with the interfragment distance. Therefore, by employing a distance based screening, the number of overlap and kinetic energy integrals scales as O(N). As a result, for large systems exchange-repulsion becomes less computationally expensive than long range components of EFP (such as Coulomb interactions).

Exchange-repulsion parameters

The parameters required for equation (1.37) are the Fock matrix elements and the LMOs (orbital coefﬁcients, basis and centroids). They can be computed once for every type of

27 the fragment by performing a Hartree-Fock calculation on the molecule followed by an an orbital localization procedure (such as Boys LMO)23.

1.6 Implementation of EFP in Q-Chem

This section describes the new implementation of the EFP method in the Q-Chem electronic structure package. As detailed in section 1.1, within EFP formalism the molecular system is decomposed into an active region and a number of effective fragments. Since the active region is treated by an ab initio method, EFP requires an interface with a general-purpose quantum chemistry package. Our implementation was developed using C++62 as a module within the state of-the-art quantum chemistry package Q-Chem47. The standard template library (STL) was extensively used throughout the implementation. The developed code consists of the two independent parts: EFP energy and EFP parameters. The former is responsible for computing energy of the molecular system, whereas the latter allows one to obtain effective fragment parameters.

Computing EFP energy

The EFP energy part of the implementation allows one to compute total energy of a system comprised of an optional active region and a number of effective fragments. The total energy includes contributions from all the EFP components described in the four previous sections: electrostatics (with charge-penetration correction), polarization, dispersion, exchange-repulsion. A simpliﬁed high-level class diagram for the EFP energy code is shown in Fig. 1.10 (a dashed arrow represents a dependence). The main class EFP is responsible for man- aging the interactions between other classes and providing the interface to the rest of

28 the Q-Chem package. Classes Electrostatics, Polarization, Dispersion and Exchange- repulsion encapsulate the functionality required for the corresponding individual energy components. Class Polarization has the dependence on the class Electrostatics because induced dipoles depend on the field created by multipoles. The responsibility of the ConfigReader class is reading the effective fragment parameters and configuration of the molecular system, and storing them in the internal representation using classes Frag- mentParams and Fragment.

Electrostatics FragmentParams Fragment

Dispersion Polarization ExchangeRepulsion ConfigReader

EFP

Figure 1.10: Class diagram for the EFP energy part of the implementation.

std::vector< double > er_fock_matrix

std::vector< PolarizationPoint > polarizabilities

std::vector< XYZPoint > er_lmos

std::vector< std::vector< double > > er_wavefunction FragmentParams dispersions std::vector< DispersionPoint > er_basis

std::string atoms

std::vector< EFPAtom > multipoles

std::vector< MultipolePoint >

Figure 1.11: The internal structure of the FragmentParams class.

29 The internal structure for FragmentParams class is shown in Fig. 1.11. The labels above the arrows indicate the name of the data member. This class encapsulates all the parameters for a given EFP fragment type: atoms, multipoles, dispersion points, polarizabilities and parameters for exchange-repulsion (which start with er prefix). The class Fragment represents an individual effective fragment in the molecular system. It stores the fragment type, its position and orientation. The position is specified in terms of translation in Cartesian coordinates (x, y, z) relative to the origin of the ab initio system. The orientation is specified by the Euler angles (α, β, γ) in terms of the 3 consequent rotations, as explained in Fig. 1.12. The rotation matrix corresponding to these angles is given in Fig. 1.13.

Y Z

y X α γ

x N

Figure 1.12: Representing the orientation with the Euler angles. xyz is the ﬁxed system, XYZ is the rotated system, N is the intersection between the xy and XY planes called the line of nodes. Transformation between the xyz and XYZ systems is given by α rotation in the xy plane, followed by β rotation in the new y’z’ plane (around the N axis), followed by γ rotation in the new x”y” plane.

30 R =

 cosα cosγ − sinα cosβ sinγ −cosα sinγ − sinα cosβ cosγ sinβ sinα     sinα cosγ + cosα cosβ sinγ −sinα sinγ + cosα cosβ cosγ −sinβ cosα  sinβ sinγ sinβ cosγ cosβ

Figure 1.13: The rotation matrix corresponding to the Euler angles α, β, γ.

0 0 0 The position xi, yi, zi of an atom i of an effective fragment can be computed as:

      0 x xi x  i            0  = R   +   , (1.38) yi yi y             0 zi zi z

where xi, yi, zi are the coordinates of the atom in the fragment frame, R is the rotation matrix defined in Fig. 1.13, and x, y, z are the fragment coordinates. The flow of control between Q-Chem and the EFP module is presented in Fig. 1.14. The ReadParams and ReadFragments calls instruct the EFP module to read the definitions of the fragment types and the molecular system configuration from the Q-Chem standard input (discussed in the next subsection). The Init call checks the effective fragments parameters and the configuration for consistency, creates and initializes the instances of Electrostatics, Polarization, Dispersion and Exchange-Repulsion classes. The UpdateRepresentation method is called every time when the geometry of the molecular system changes. It performs rotation and translation of atoms, multipoles, polarization tensors, dispersion points and LMOs in accordance with the coordinates and orientation for each individual fragment. The NuclearEnergy function computes the EFP energy components, which are independent from the wavefunction of ab initio system (inter-fragment electrostatics, dispersion and exchange-repulsion).

31 If the active region is present, then for every iteration of the SCF procedure Q-Chem calls the Hamiltonian method and the WFDependentEnergy function. The Hamiltonian method updates the one-electron term of the active region Hamiltonian in accrodance with the equations (1.10) and (1.21). After updating the wavefunction accordingly, Q- Chem calls WFDependentEnergy to compute the wavefunction-dependent part of the EFP energy (polarization). WFDependentEnergy takes the electronic density of the active region as a argument. If the active region is absent, Q-Chem calls WFDepen- dentEnergy only once with a zero electronic density as an argument. Finally, the PrintSummary method is called to print the individual energy components for the inter-fragment EFP energy.

Q-Chem EFP

ReadParams()

ReadFragments()

Init()

for every new geometry UpdateRepresentation()

NuclearEnergy() energy

for every SCF cycle Hamiltonian() hamiltonian

WFDependentEnergy() energy

PrintSummary()

Figure 1.14: The ﬂow of control between Q-Chem and EFP energy module.

32 The correctness of the EFP energy implementation in Q-Chem was veriﬁed by performing the energy computation on a few dozens of examples and comparing the results with the GAMESS implementation developed by the original authors of the EFP method. The examples used for testing include systems composed of water, ammonia and benzene molecules in different conﬁgurations.

EFP input format

The Q-Chem standard input consists of a number of sections delimited by $keyword and $end. The mandatory sections are $molecule, which describes the geometry of ab initio part, and $rem section, which specifies parameters of the ab initio computation. Other sections are specific to different Q-Chem modules. The EFP code extends the Q-Chem input with 2 new sections: $efp params and $efp fragments, which describe the parameters of the EFP fragment types and define the molecular system by providing fragments’ positions and orientations, respectively. In the EFP input sections, all coordinates are in Angstroms, unless Q-Chem’s INPUT BOHR keyword is set to TRUE, in which case all the coordinates are in Bohr, all angles are in radians and other values are in atomic units. All keywords used in the EFP input section are case insensitive. Fig. 1.15 presentswhenhe format for the $efp params section. It consists of the parameters’ definitions for one or more fragment types. A definition starts by providing the name of the fragment type, listing all atoms and their coordinates. It is followed by the list of all the multipole expansion points (mult). Each multipole expansion point is defined by its coordinate, multipoles and a damping parameter (DAMP). Multipoles with different angular momentum are specified on the separate lines in an arbitrary order. In thier definition C is charge, D , Q and O are dipole, quadrupole and octupole components, respectively. All multipoles are optional, but when provided, then all their

33 components should be specified in the order demonstrated in Fig. 1.15. The damping parameter is optional and should be preceeded by the cdamp keyword. Next in the fragment definition are the polarization points (pol). They are specified by the coordinates and components of the polarization tensor P . The dispersion points (disp) are described similarly by the coordinates and the traces of the dynamic polarizability tensors C at 12 different frequencies. The last subsection describes the parameters required for the exchange-repulsion: er basis specifies the basis used, er fock provides the Fock matrix, er lmos specifies the LMO centroids and er wavefunction defines each LMO in terms of the atomic orbitals. The subsections for different EFP components are optional and can appear in an arbitrary order. Fig. 1.16 presents the format for the $efp fragments section. Every effective fragment in the molecular system is defined by the fragment type, followed by its Cartesian coordinates (X, Y, Z) specifying the translation relative to the ab initio system origin and Euler angles specifying the fragment orientation (α, β, γ). If the orientation is not provided, the default values of the angles are set to 0. The fragment type should match one of the fragments in the $efp params section. The active region is specified in the $molecule section the same way as an ab initio system is specified in Q-Chem47. In order to define a molecular system composed of the effective fragments only, the keyword EFP FRAGMENTS ONLY should be set to TRUE in the $rem section of the Q-Chem standard input b. A complete example of the Q-Chem standard input for the system composed of 3 water molecules (one as an active region and two others as effective fragments) is given in Appendix F.

bSince the $molecule section is mandatory in Q-Chem, even if the active region is disabled it should still contain at least one atom, for instance, He. In the nutshell, assigning EFP FRAGMENTS ONLY keyword to TRUE simply disables any interaction between the ab initio part and the effective fragments.

34 $efp params fragment ATOM X Y Z ... mult X Y Z [cdamp DAMP] [C] [DX DY DZ] [QXX QXY QXZ QYY QYZ QZZ] [OXXX OXXY OXXZ OXYY OXYZ OXZZ OYYY OYYZ OYZZ OZZZ] ... pol X Y Z PXX PXY PXZ PYX PYY PYZ PZX PZY PZZ ... disp X Y Z S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 S11 S12 ... er basis BASIS er fock F1 F2 F3 F4 ... er lmo XYZ ... er wavefunction 1 C1 C2 C3 C4 ... 2 C1 C2 C3 C4 ...... fragment ... $end

Figure 1.15: The format for the $efp params section. $efp fragments X Y Z [α β γ] ... $end

Figure 1.16: The format for the $efp fragments section.

Computing EFP parameters

The EFP parameters part of the code computes the electrostatics, charge-penetration and polarization parameters for an isolated molecule. The parameters need to be computed 35 only once for every fragment type and then can be reused in all subsequent EFP energy computations. Since the EFP model has been derived from the ab initio theory, the first step in computing the EFP parameters for a molecule is to obtain its electronic density. This can be achieved by means of any ab initio method available in Q-Chem that supports explicit electronic density (for instance: HF, DFT, CCSD). From the electronic density, the electrostatic parameters (multipoles) are computed using the distributed multipole analysis procedure as described in section 1.2. The multipole expansion is computed up to octupoles. The points for the expansion are the nuclei and the bond midpoints. The screening parameters accounting for the charge-penetration are computed by the fitting procedure, which minimizes the difference between the exact electrostatic potential and potential from the multipoles expansion points multiplied by the screening function, as detailed in the section 1.2. The optimization of screening parameters is based on the Newton-Raphson algorithm63. The polarization parameters are computed using the finite difference method based on 4 ab initio computations: one with no external field and 3 with a small (by default 10−5 V/m) electrostatic field along one of the coordinate axis, as described in section 1.3. In each computation, the localized molecular orbitals are obtained with the Boys localization procedure57, 58. The computation of the dispersion and exchange-repulsion parameters have not yet been implemented in Q-Chem. They can be obtained from the quantum chemistry package GAMESS34, using the code developed by the original authors of the EFP method. Using its makefp job type, all the EFP parameters can be produced in the GAMESS format. An example of the GAMESS input file generating the EFP parameters for benzene and the output produced are shown in Appendix A. To facilitate the conversion from

36 the GAMESS format to the Q-Chem format, a converter script (Appendix C) has been developed in Python64. It takes a ﬁle with the GAMESS EFP parameters as a command line argument and prints out the parameters in the Q-Chem format. The conversions performed internally by the script are the syntax conversion, unit conversion (from Bohr to A)˚ and redundancy elimination (replacing a dynamic polarizability tensor with its trace). The Q-Chem keywords affecting the EFP parameters computation are provided in Appendix G.

1.7 EFP application: Beyond benzene dimer

Aromatic π − π interactions have the major influence on the structure and properties of many biologically important molecular systems65. They govern nucleotide stacking patterns in DNA and RNA48, control the ternary structures of proteins66, play an important role in the intercalation of drugs with DNA67, affect the properties of the techno- logically important polymers aramids68, impact the crystal structure of many aromatic compounds69, and stabilize host—guest complexes70, 71. Studying the π−π interactions is a major challenge for both experiment and theory72. The experimental investigations are difficult due to the complex environments in which these interactions occur (proteins, DNA). The weakness of individual π − π interactions and the flatness of the PES along the coordinates defining the relative position of the aromatic rings make it difficult to compute even for small model systems in the gas phase73–75. The theoretical studies of π − π interactions, as of any non-covalent interactions, are challenging due to the weakness of these interactions relative to typical covalent interactions. Their proper treatment requires high accuracy ab initio methods [e.g. CCSD(T)]

37 and large basis sets with diffuse functions [e.g. aug-cc-pVTZ]76, which both contribute to the high computational complexity. At present, high accuracy ab initio computations are feasible only for systems with a few π − π interacting aromatic rings. However, in nature several interacting π − π moieties are usually present, e.g., many nucleotides are responsible for the secondary secondary structure of DNA48. Therefore, it is important to understand whether multiple π − π interactions can be treated as a sum of the pair-wise interactions between neighboring aromatic rings or there are significant non-additive many-body interaction components. Theoretical studies of multiple π − π interactions have been so far limited due to the computational complexity reasons discussed above77–80 . An interesting approach was taken in the work by Tauer and Sherrill78 who studied benzene trimers and tetramers using MP2 with a specially selected basis set. MP2 is known to overestimate the binding energies. For instance, MP2 with a large basis set (aug-cc-pVQZ) overestimates the dissociation energy of the benezene dimer by 70%76. In contrast, small basis sets are known to underestimate binding energies. Thus, MP2 with a small basis set may benifit from the cancelation of errors. The authors selected a basis (cc-pVDZ+78) such that the errors from MP2 and the basis cancel out each other. Binding energies of the different benzene dimer configurations at the MP2/cc-pVDZ+ level of theory are within 15% from the best known theoretical values [CCSD(T)/CBS]. This level of theory has been used to investigate the nature of binding in benzene trimers and tetramers. The goal was to understand the importance of the total nonadditive and the individual many- body interaction energies. The authors concluded that the binding energy of trimers and tetramers can be described as a simple nearest neighbor sum with 10% accuracy, and that non-additive (many-body and long range two-body) interactions will become increasingly important for larger clusters.

38 An important question for the π − π stacked systems is how the share of the non- additive energy component grows with the increase in the number of aromatic rings in the cluster: whether it becomes significant in large clusters or does it stay similar to that of benzene trimers. A related question is if the binding energy of multiple π − π interactions can be well approximated by the sum of the pair-wise nearest neighbor interaction energies. To answer these fundamental questions, one needs to study a series of the increasingly large aromatic clusters and observe the asymptotic behavior of the non-additive energy component. Unfortunately, aromatic clusters with more than a few benzene rings are beyond the reach of high level ab initio theories. Even using the proposed in Ref. 78 MP2/cc-pVDZ+ method, computing the energy of the cluster with a few tens of the benzene rings is presently unfeasible. The approach to the multiple π − π interactions problem taken in this work is to employ the effective fragment potential (EFP) — a semi-empirical method designed specifically to account for the inter-molecular interactions. In recent works27, 28, 81, EFP has been used in the extensive studies of benzene dimers. EFP was found to be in excellent agreement with high-level ab initio theories. For the three conformers of the benzene dimer shown in Fig. 1.17, the maximum binding energy was within 0.4 kcal/- mol and the intermolecular separation was within 0.2 A˚ from the values obtained with CCSD(T)/aug-cc-pVQZ. Additionally, the authors found that the EFP energy profiles (Fig 10.2 and 10.3 in Ref. 28) for the individual types of interactions (electrostatics, polarization, dispersion and exchange-repulsion) along the intermolecular separation coordinates were in excellent agreement with the symmetry-adapted perturbation theory (SAPT). Thus, not only the total binding energies of the benzene dimers are reproduced well by EFP, but also the individual types of the interactions are properly accounted for.

39 R2 ` R R R1

S T PD

Figure 1.17: Sandwitch (S), t-shaped (T) and parallel-displaced (PD) structures of the benzene dimer.

The discussion above indicates that EFP is an excellent candidate for studying the systems with π−π interactions. This work uses our new implementation of EFP to investigate multiple π − π interactions. Firstly, the EFP binding energies and geometries for the sandwich (S), t-shaped (T) and parallel-displaced (PD) structures are compared with those obtained at different ab initio levels of theory, including the best known to date theoretical estimates [CCST(T)/CBS]. Then, eight different benzene trimers are studied in order to understand how well EFP accounts for the non-additive energy contributions. Finally, the asymptotic behavior of multiple π − π interactions is investigated in 3 types of benzene oligomers.

Benzene dimers

The parameters for benzene used throughout this work are those recommended in Ref. 28. The geometry of the monomer [d(C-C)=1.3942, d(H-H)=1.0823] computed at the MP2/aug-cc-pVTZ level of theory is from Ref. 82. The effective fragment potential for the benzene monomer has been generated with GAMESS34 at the HF/6- 311G++(3df,2p) level of theory. As explained in Ref. 26, a basis set of such high quality is necessary in order to generate accurate dispersion parameters. The input ﬁle used

40 for computing EFP parameters is provided in Appendix A. The parameters have been converted to the Q-Chem format using the script provided in Appendix C. The benzene dimer structures shown in Fig. 1.17 where constructed analogously to Ref. 82. The geometries and energies (table 1.1) of 3 benzene dimers corresponding to the maximum binding at the EFP level of theory were found by performing scans along the degrees of the freedom shown in Fig. 1.17 with a resolution of 0.1 A.˚ At these geometries, we veriﬁed that our implementation computes the total binding energies and individual energy components correctly by comparing with an analogous GAMESS computation. The geometries and maximum binding energies we found are only slightly different (0.1 A˚ and 0.15 kcal/mol, respectively) from those obtained in Ref. 28. The discrepancy is due to the different techniques of the electrostatics screening. c

Table 1.1: The geometric parameters and energies of the benzene dimers

S T PD

Geometric parameters, A˚ R = 4.0 R = 5.2 R1 = 3.9 , R2 = 1.2 Energy, kcal/mol -1.96 -2.38 -2.18

Table 1.2 compares the energies of the 3 benzene dimer structures at different levels of theory. The best known theoretical estimates are the CCSD(T)/CBS values, which will serve as a benchmark for our comparison. We can see that CCSD(T)/aug-cc-pVQZ gives the results within 0.15 kcal/mol from the CBS result. Thus, aug-cc-pVQZ can be considered sufﬁcient. However, the MP2 results with this basis set are signiﬁcantly off (error up to 70%), which is attributed to the overestimation of electron correlation by MP2 in the benzene dimer83. The method (MP2/aug-cc-pVDZ+) proposed in Ref. 78

cIn Ref. 28 in addition to the screening of charges, which our implementation also performs, the higher order multipoles (dipoles, quadrupoles, octupoles) were also screened. However, as it was demonstrated in Ref. 27, the accuracy of charge-charge only and charge-charge plus high-order electrostatic screenings is similar. Therefore, we omitted screening of the higher order multipoles in our implementation.

41 is within 0.4 kcal/mol due to the cancellation of errors. Our EFP results are within 0.6 kcal/mol and thus are of accuracy comparable to MP2/aug-cc-pVDZ+.

Table 1.2: Dissociation energies of the benzene dimers at different levels of theory.

Methoda S T PD MP2/aug-cc-pVDZ* -2.83 -3.00 -4.12 MP2/aug-cc-pVTZ -3.25 -3.44 -4.65 MP2/aug-cc-pVQZ* -3.35 -3.48 -4.73 MP2/aug-cc-pVQZ -3.37 -3.54 -4.79 MP2/cc-pVDZ+ -1.87 -2.35 -2.84 CCSD(T)/cc-pVDZ* -1.33 -2.24 -2.22 Est. CCSD(T)/aug-cc-pVQZ* -1.70 -2.61 -2.63 Est. CCSD(T)/CBS -1.81 -2.74 -2.78 EFP -1.96 -2.38 -2.18 a The results for the ab initio methods are quoted from Ref. 76

Benzene trimers

Using the geometric parameters obtained for dimers (table 1.1), linear benzene trimers shown in Fig. 1.18 have been constructed. The cyclic structure was obtained by mini- mizing the binding energy along the intermolecular separation and rotation in the plane connecting the molecular centers. The minimum corresponds to the molecular sepa- ratioin of 5.3 A,˚ tilted 15 ◦ from the perpendicular. The geometries of the trimers are provided in Appendix B. These trimer structures are analogous to the trimers studied in Ref. 78 by MP2. Binding energies for the trimers along with their breakdown into the components are shown in tables 1.3 and 1.4. E2 is total two-body interaction energy:

E2 = E2(12) + E2(13) + E2(23), (1.39)

42 Table 1.3: Total and many-body interaction energies of the benzene trimers (ﬁrst group).

S PD T1 T2 EFP

E2(12) -1.96 -2.18 -2.38 -2.40

E2(13) 0.01 0.01 0.02 0.00

E2(23) -1.96 -2.18 -2.40 -2.38

E2 -3.91 -4.35 -4.77 -4.79

E3 0.02 -0.02 0.13 0.06

Etotal -3.89 -4.38 -4.64 -4.73

Etotal(dimer) -3.92 -4.36 -4.76 -4.76

Enon−add 0.03 -0.02 0.12 0.03 ∆ 0.77% -0.37% 2.66% 0.57% MP2/cc-pVDZ+

E3 0.03 0.00 0.08 0.06

Etotal -3.83 -5.88 -4.64 -4.73

Etotal(dimer) -3.74 -5.68 -4.70 -4.70

Enon−add -0.09 -0.20 0.06 -0.03 CP Enon−add 0.03 -0.06 0.1 0.03 ∆ 0.78% -1.02% 2.16% 0.63 % CCSD(T)/cc-pVDZ+

E3 0.04 0.01 0.07

Etotal -0.90 -1.84 -3.14

Etotal(dimer) -0.86 -1.72 -3.20

Enon−add -0.04 -0.12 0.06 CP Enon−add 0.06 0.00 0.10 ∆ 6.67% 0.00% 3.18%

E2(NM) - two body interaction between benzenes N and M, E2 - total two body interaction, E3 - three-body interaction, Etotal - total energy of trimer, Etotal(dimer) - total energy of trimer estimated form dimer, Enon−add - non-additive energy component, CP Enon−add - non-additive energy component with counterpoise correction, ∆ - share of the non-additive energy.

43 Table 1.4: Total and many-body interaction energies of the benzene trimers (second group).

C PD-S S-T PD-T EFP

E2(12) -2.19 -1.96 -1.96 -2.38

E2(13) -2.19 0.00 -0.12 -0.11

E2(23) -2.19 -2.18 -2.38 -2.18

E2 -6.57 -4.14 -4.46 -4.67

E3 -0.24 0.01 -0.07 -0.06

Etotal -6.81 -4.12 -4.52 -4.73

Etotal(dimer) -6.57 -4.14 -4.34 -4.56

Enon−add -0.25 0.02 -0.18 -0.17 ∆ -3.63% 0.43% -4.05% -3.66% MP2/cc-pVDZ+

E3 -0.33 0.02 -0.03 0.00

Etotal -7.88 -4.86 -4.49 -5.42

Etotal(dimer) -7.32 -4.71 -4.22 -5.19

Enon−add -0.56 -0.15 -0.27 -0.23 CP Enon−add -0.32 -0.01 -0.15 -0.13 ∆ -4.06% -0.21% -3.34% -2.40% CCSD(T)/cc-pVDZ+

E3 -0.25

Etotal -5.09

Etotal(dimer) -4.62

Enon−add -0.47 CP Enon−add -0.26 ∆ -5.11%

44 S T1 T2 PD

C PD-T PD-S S-T

Figure 1.18: The structures of eight benzene trimers.

where E2(NM) is binding energy for the dimers, created by removing one benzene from the trimer and computed in the full basis set of the trimer. E3 is three-body interaction energy computed as:

E3 = Etotal − E2 (1.40)

Etotal(dimer) is total binding energy estimated as the sum of nearest neighbor pair-wise

CP binding energies. Enon−add and Enon−add are the non-additive energy components without and with the counterpoise correction:

Enon−add = Etotal − Etotal(dimer) (1.41) CP CP Enon−add = Etotal − Etotal(dimer)

45 ∆ is the share of the non-additive interaction energy which EFP computes as:

E ∆ = non−add (1.42) Etotal and ab initio methods as: ECP ∆ = non−add (1.43) Etotal

To evaluate how well EFP describes multiple π − π interactions, it is necessary to conduct a comparison with ab initio methods. The quality of the description of the two-body interactions (between two benzene rings) has been investigated in the extensive studies of the benzene dimers energy proﬁles in Ref. 28. The conclusion was that EFP results are in good agreement with the high level ab initio theories [CCSD(T) and SAPT]. Three-body interactions for EFP, MP2/cc-pVDZ+ and CCSD(T)/cc-pVDZ+ are compared in tables 1.3 and 1.4. The ﬁrst important observation is that the ab initio

E3 values quoted from Ref. 78 were computed using the counterpoise correction to account for the basis set superposition error, which results from the basis set incom- pleteness. Three-body interaction energies were calculated by subtracting from total binding energy of the trimer energies of the 3 possible benzene dimers computed in the full set basis of the trimer.

All EFP many-body interactions are due to the polarization component. Thus, its E3 term is three-body polarization energy. From tables 1.3 and 1.4, we can see that three- body interaction energies computed with EFP have the same order of magnitude as for the ab initio methods. For most trimers studied here, the E3 correction is rather small and miscellaneous errors may have a signiﬁcant impact on its value. The exception is the cyclic benzene trimer (see Fig. 1.18), which is a true three-body system demonstrating substantial three-body interactions at all the considered levels of theory. For this trimer

E3 computed with EFP is within 0.01 kcal/mol from the CCSD(T) value. The fact that

46 these values almost coincide might be accidental, but their closeness implies a certain level of accuracy in the treatment of three-body π − π interactions by EFP. On the basis of the above comparison, we conclude that EFP provides a good description of three- body interactions in benzene trimers. Another interesting conclusion can be drawn from the comparison of total non additive interaction energies (Enon−add). Firstly, Enon−add for EFP are greater than for CCSD(T) and MP2 for all eight considered trimers. However, we concluded that this effect is mostly due to the basis set superposition error (BSSE). Enon−add is computed as a difference between trimer binding energy and energies of two dimers (which were derived from the trimer by removing one of the edge benzenes). Thus, the basis sets used for the trimer and the dimers energy computations are different. Since trimer has a larger basis set, its nearest neighbor π − π interactions are stronger than in the corresponding dimers. That is, the additional basis functions contribute to the wavefunction relaxation, leading to binding energy lowering. In fact, table 1 of Ref. 78 shows the energies of the dimers computed in the full basis set of the trimer. These energies are 0.05-0.10 kcal/mol lower than for the dimers in its own basis set. Thus, as much as 0.3 kcal/mol for the cyclic trimer and 0.2 kcal/mol for other trimers can be contributed to the lowering of Enon−add due to the basis set relaxation effect. Since EFP does not suffer from the basis set relaxation, its Enon−add for benzene trimers are systematically higher.

Enon−add computed by ab initio methods can be corrected by using the binding energies of the dimers computed in the full basis set of the trimer, similarly to the counterpoise correction often employed to rectify BSSE. Since these values are readily available in table 1 of Ref. 78, we were able to compute corrected non-additive interaction energy

CP Enon−add, which is also provided in tables 1.3 and 1.4. EFP non-additive interaction CP energies are within 0.03 kcal/mol from Enon−add for CCSD(T) and within 0.04 kcal/- mol from MP2 values (except for the cyclic trimer). The share of the total non-additive

47 energy correction ∆ does not exceed 4% for EFP and MP2. This share is larger for CCSD(T) because the method signiﬁcantly underestimates the total binding energy. It also may attributed to the inclusion of higher-order three-body interaction energy terms (e.g., three-body dispersion) in CCSD(T). Based on the above comparisons of three-body and total non-additive interaction energies, we conclude that EFP accurately describes the multiple π − π interactions in the considered benzene trimers.

Large benzene oligomers

The last part of our investigation is a study of the asymptotic behavior of the non-additive energy contribution. Using the intermolecular separation parameters for dimers (table 1.1), we constructed the linear oligomers with 10, 20, 30 and 40 benzene molecules. The oligomers S-N, T-N and PD-N, where N is the number of monomers, have the same structural motives as trimers S, T1 and PD shown in Fig. 1.18. The geometries of the oligomers are provided in Appendix B. The total binding energies (Etotal), the nearest neighbor estimates (Etotal(dimer)), the non-additive energy (Enon−add) and its share (∆) in the total energy are shown in table 1.5. The nearest neighbor estimate for the oligomer with N benzene rings has been computed as:

Etotal(dimer) = (N − 1)Edimer (1.44)

From table 1.5, we can see that for all the oligomers Enon−add is positive and thus destabilizing in nature, as it was correctly predicted in Ref. 78. It is also in good agreement with the observation that nonadditive energy is positive in the bulk benzene made in Ref. 84. Non-additive energy grows with the size of the system because of the

48 Table 1.5: Binding energies of benzene oligomers computed with EFP.

Etotal Etotal(dimer) Enon−add ∆ S-10 -17.14 -17.64 0.50 2.94% S-20 -36.01 -37.24 1.23 3.41% S-30 -54.89 -56.84 1.95 3.56% S-40 -73.76 -76.44 2.68 3.63% T-10 -20.89 -21.42 0.53 2.53% T-20 -44.05 -45.22 1.17 2.65% T-30 -67.21 -69.02 1.81 2.69% T-40 -90.37 -92.82 2.45 2.71% PD-10 -19.54 -19.62 0.08 0.41% PD-20 -41.16 -41.42 0.26 0.63% PD-30 -62.78 -63.22 0.44 0.70% PD-40 -84.40 -85.02 0.62 0.73% a S-N, T-N and PD-N are linear benzene oligomers with N benzene monomers, constructed using sandwich, t-shaped and parallel-displaced structural motives, respectively. large number of many-body and long-range two-body interactions possible in the larger oligomers. The dependence of the share of non-binding energy ∆ on the system size is shown in Fig. 1.19. For all 3 types of the oligomers, ∆ asymptotically approaches a constant value and does not exceed 4% of the total binding energy. Based on the investigation of benzene trimers and higher-order oligomers at the EFP level of theory, we conclude that the share of total non-additive energy is about 4% for the systems considered. This result implies that total interaction energy in clustered π−π systems can be estimated reasonably well as the sum of two-body nearest neighbor interactions. However, it is important to note that neither EFP applied in our work nor MP2 used in the studies of trimers in Ref. 78 account for many-body dispersion. As demonstrated in the recent study79, many-body dispersion terms contribute about

49 % 3 . 5

, y g r

e 3 . 0 n e

v 2 . 5 i t i d d a - 2 . 0 S n o

n P D

e 1 . 5 T h t

f o 1 . 0 e r a h

S 0 . 5

1 0 2 0 3 0 4 0 T h e n u m e r o f b e n z e n e m o n o m e r s

Figure 1.19: Dependence of the non-additive energy share on the length of benzene oligomers.

50% to total three-body interaction energy of the cyclic benzene trimer. Additionally, we expect that the share of the non-additive energy contributions will increase in the three-dimensional clusters and bulk benzene due to the large number of many-body interactions, which can occur between molecules located within a short distance from each other.

1.8 Conclusions

This chapter presented a review of the theoretical approaches to modeling the environmental effects and a detailed description of the EFP method. The details of our EFP implementation in the Q-Chem package were presented. The developed code was

50 applied to investigate multiple π − π interactions in benzene oligomers focusing on the role of many-body and total non-additive energies in these systems. We analyzed previously reported78 ab initio results for benzene trimers and concluded that they underestimate non-additive energy due to the basis set superposition error. Based on the data presented in Ref. 78, we were able to recompute ab initio non-additive energy with the counterpoise correction. Using our EFP code, we computed three-body and total non-additive energies for benzene trimers. These energies for all eight timers included in our study appeared to be in excellent agreement with the counterpoise-corrected ab initio values (within 0.04 kcal/mol). The magnitude of non- additive interaction energy was within 4% of total binding energy for all the considered benzene clusters. Based on the detailed comparison of the binding energy components (presented in tables 1.3 and 1.4), we concluded that EFP describes the intermolecular π − π interactions with the accuracy comparable to that of ab initio methods. In the ﬁnal part, our EFP code was applied to study linear benzene oligomers. Based on the results (presented in table 1.5), we concluded that the share of non-additive interaction energy in linear benzene clusters asymptotically approaches a constant value with the increase in cluster size, and does not exceed 4% of total binding energy. Thus, it is approximately of the same magnitude as for the benzene trimers. This result implies that there are no signiﬁcant non-local effects in the binding mechanism of the linear benzene clusters.

Reference list

[1] Raghavachari, K. ; Trucks, G.W. ; Pople, J.A. ; Head-Gordon, M. Chem. Phys. Lett. 1989, 157, 479.

51 [2] Bowler, D.R. ; Miyazaki, T. ; Gillan, M.J. J. Phys.: Condens. Matter 2002, 14, 2781.

[3] Schwegler, E. ; Challacombe, M. J. Chem. Phys. 1996, 105, 2726.

[4] White, C.A. ; Johnson, B.G. ; Gill, P.M.W. ; Head-Gordon, M. Chem. Phys. Lett. 1996, 253, 268.

[5] Schutz,¨ M. ; Hetzer, G. ; Werner, H. J. Chem. Phys. 1999, 111, 5691.

[6] Scuseria, G.E. ; Ayala, P.Y. J. Chem. Phys. 1999, 111, 8330.

[7] Gogonea, V. ; Westerhoff, L.M. ; K.M. Merz, Jr. J. Chem. Phys. 2000, 113, 5604.

[8] Kitaura, K. ; Ikeo, E. ; Asada, T. ; Nakano, T. ; Uebayasi, M. Chem. Phys. Lett. 1999, 313, 701 .

[9] Lippert, G. ; Hutter, J. ; Parrinello, M. Mol. Phys. 1997, 92, 477.

[10] Allunger, N.L. ; Yuh, Y.H. ; Lii, J.H. J. Am. Chem. Soc. 1989, 111, 8551.

[11] Casewit, C.J. ; Colwell, K.S. ; Rappe, A.K. J. Am. Chem. Soc. 1992, 114, 10035.

[12] Mayo, S.L. ; Olafson, B.D. ; Goddard, W.A. J. Phys. Chem. 1990, 94, 8897.

[13] Rizzo, R.C. ; Jorgensen, W.L. J. Am. Chem. Soc. 1999, 121, 4827.

[14] Firman, T.K. ; Landis, C.R. J. Am. Chem. Soc. 2001, 123, 11728.

[15] Jorgensen, W.L. ; McDonald, N.A. J. Molec. Struct. (Theochem) 1998, 424, 145 .

[16] MacKerell, A.D. ; Bashford, D. ; Bellott; Dunbrack, R.L. ; Evanseck, J.D. ; Field, M.J. ; Fischer, S. ; Gao, J. ; Guo, H. ; Ha, S. ; Joseph-McCarthy, D. ; Kuchnir, L. ; Kuczera, K. ; Lau, F.T.K. ; Mattos, C. ; Michnick, S. ; Ngo, T. ; Nguyen, D.T. ; Prodhom, B. ; Reiher, W.E. ; Roux, B. ; Schlenkrich, M. ; Smith, J.C. ; Stote, R. ; Straub, J. ; Watanabe, M. ; Wiorkiewicz-Kuczera, J. ; Yin, D. ; Karplus, M. J. Phys. Chem. B 1998, 102, 3586.

[17] Hill, J.R. ; Sauer, J. J. Phys. Chem. 1995, 99, 9536.

[18] Lin, H. ; Truhlar, D.G. Theor. Chem. Acc. 2007, 117, 185.

[19] Gao, J.L. ; Truhlar, D.G. Annu. Rev. Phys. Chem. 2002, 53, 467.

[20] Sierka, M. ; Sauer, J. J. Chem. Phys. 2000, 112, 6983.

[21] Monard, G. ; Merz, K.M. Acc. Chem. Res. 1999, 32, 904.

52 [22] Day, P.N. ; Jensen, J.H. ; Gordon, M.S. ; Webb, S.P. ; Stevens, W.J. ; M.Krauss; D.Garmer; H.Basch; D.Cohen J. Chem. Phys. 1996, 105, 1968.

[23] Jensen, J.H. ; Gordon, M.S. J. Chem. Phys. 1998, 108, 4772.

[24] Freitag, M.A. ; Gordon, M.S. ; Jensen, J.H. ; Stevens, W.J. J. Chem. Phys. 2000, 112, 7300.

[25] Gordon, M.S. ; Freitag, M.A. ; Bandyopadhyay, P. ; Jensen, J.H. ; Kairys, V. ; Stevens, W.J. J. Phys. Chem. A 2001, 105, 293.

[26] Adamovix, I. ; Gordon, M.S. Mol. Phys. 2005, 103, 379.

[27] Slipchenko, L.V. ; Gordon, M.S. J. Comput. Chem. 2007, 28, 276.

[28] Gordon, M.S. ; Slipchenko, L. ; Li, H. ; Jensen, J.H. In Annual Reports in Com- putational Chemistry; Spellmeyer, D.C. , Wheeler, R. , Eds., Vol. 3 of Annual Reports in Computational Chemistry; Elsevier, 2007; pages 177 – 193.

[29] Slipchenko, L.V. ; Gordon, M.S. Mol. Phys. 2009.

[30] Li, H. ; Netzloff, H.M. ; Gordon, M.S. J. Chem. Phys. 2006, 125.

[31] Li, H. ; Gordon, M.S. ; Jensen, J.H. J. Chem. Phys. 2006, 124.

[32] Li, H. ; Gordon, M.S. J. Chem. Phys. 2007, 126.

[33] Li, H. ; Gordon, M.S. Theor. Chem. Acc. 2006, 115, 385.

[34] Schmidt, M.W. ; Baldridge, K.K. ; J.A. Boatz, S.T. Elbert ; Gordon, M.S. ; J.H. Jensen, S. Koseki ; Mastunaga, N. ; Nguyen, K.A. ; S. Su, T.L. Windus ; Dupuis, M. ; Montgomery, J.A. J. Comput. Chem. 1993, 14, 1347.

[35] Adamovic, I. ; Gordon, M.S. J. Phys. Chem. A 2005, 109, 1629.

[36] Balawender, R. ; Saﬁ, B. ; Geerlings, P. J. Phys. Chem. A 2001, 105, 6703.

[37] Chandrakumar, K.R.S. ; Ghanty, T.K. ; Ghosh, S.K. ; Mukherjee, T. J. Molec. Struct. (Theochem) 2007, 807, 93.

[38] Day, P.N. ; Pachter, R. J. Chem. Phys. 1997, 107, 2990.

[39] Merrill, G.N. ; Webb, S.P. J. Phys. Chem. A 2004, 108, 833.

[40] Yoo, S. ; Zahariev, F. ; Sok, S. ; Gordon, M.S. J. Chem. Phys. 2008, 129, 8.

[41] Freitag, M.A. ; Gordon, M.S. Abstracts of Papers of the American Chemical Soci- ety 1999, 218, U511; 1.

53 [42] Merrill, G.N. ; Gordon, M.S. J. Phys. Chem. A 1998, 102, 2650.

[43] Merrill, G.N. ; Webb, S.P. J. Phys. Chem. A 2003, 107, 7852.

[44] Merrill, G.N. ; Webb, S.P. ; Bivin, D.B. J. Phys. Chem. A 2003, 107, 386.

[45] Netzloff, H.M. ; Gordon, M.S. J. Chem. Phys. 2004, 121, 2711.

[46] Day, P.N. ; Pachter, R.and Gordon, M.S. ; Merrill, G.N. J. Chem. Phys. 2000, 112, 2063.

[47] Shao, Y. ; Molnar, L.F. ; Jung, Y. ; Kussmann, J. ; Ochsenfeld, C. ; Brown, S. ; Gilbert, A.T.B. ; Slipchenko, L.V. ; Levchenko, S.V. ; O’Neil, D. P. ; Distasio Jr., R.A. ; Lochan, R.C. ; Wang, T. ; Beran, G.J.O. ; Besley, N.A. ; Herbert, J.M. ; Lin, C.Y. ; Van Voorhis, T. ; Chien, S.H. ; Sodt, A. ; Steele, R.P. ; Rassolov, V. A. ; Maslen, P. ; Korambath, P.P. ; Adamson, R.D. ; Austin, B. ; Baker, J. ; Bird, E.F.C. ; Daschel, H. ; Doerksen, R.J. ; Drew, A. ; Dunietz, B.D. ; Dutoi, A.D. ; Furlani, T.R. ; Gwaltney, S.R. ; Heyden, A. ; Hirata, S. ; Hsu, C.-P. ; Kedziora, G.S. ; Khalliulin, R.Z. ; Klunziger, P. ; Lee, A.M. ; Liang, W.Z. ; Lotan, I. ; Nair, N. ; Peters, B. ; Proynov, E.I. ; Pieniazek, P.A. ; Rhee, Y.M. ; Ritchie, J. ; Rosta, E. ; Sherrill, C.D. ; Simmonett, A.C. ; Subotnik, J.E. ; Woodcock III, H.L. ; Zhang, W. ; Bell, A.T. ; Chakraborty, A.K. ; Chipman, D.M. ; Keil, F.J. ; Warshel, A. ; Herhe, W.J. ; Schaefer III, H.F. ; Kong, J. ; Krylov, A.I. ; Gill, P.M.W. ; Head-Gordon, M. Phys. Chem. Chem. Phys. 2006, 8, 3172.

[48] Saenger, W. Principles of Nucleic Acid Structure; Springer-Verlag: New York, 1984.

[49] Buckingham, A.D. Q. Rev. Chem. Soc. 1959, 13, 183.

[50] Stone, A.J. Chem. Phys. Lett. 1981, 83, 233 .

[51] Mulliken, R.S. J. Chem. Phys. 1955, 23, 1833.

[52] Stone, A.J. ; Alderton, M. Mol. Phys. 1985, 56, 1047.

[53] Stone, A.J. In The Theory of Intermolecular Forces; Oxford University Press, 1997; pages 50 – 62.

[54] Stone, A.J. J. Chem. Theory Comput. 2005, 1, 1128.

[55] Rabinowitz, J.R. ; Namboodiri, K. ; Weinstein, H. Int. J. Quant. Chem. 1986, 29, 1697.

[56] Wolfsberg, M. ; Helmholz, L. J. Chem. Phys. 1952, 20, 837.

[57] Boys, S.F. Rev. Mod. Phys. 1960, 32, 296.

54 [58] Boys, S.F. In Quantum Science of Atoms, Molecules, and Solids; Academic Press, 1966; pages 253–262.

[59] Sanz-Garcia, A. ; Wheatley, R.J. ChemPhysChem 2003, 5, 801.

[60] Tang, K.T. ; Toennies, J.P. J. Chem. Phys. 1984, 80, 3726.

[61] Jensen, J.H. ; Gordon, M.S. Mol. Phys. 1996, 89, 1313.

[62] ISO/IEC ISO/IEC 14882:2003: Programming languages: C++ Technical report, International Organization for Standardization, Geneva, Switzerland., 2003.

[63] Press, W.H. ; Teukolsky, S.A. ; Vetterling, W.T. ; Flannery, B.P. Numerical Recipes 3rd Edition: The Art of Scientiﬁc Computing; Cambridge University Press: New York, NY, USA, 2007.

[64] Rossum, G.V. The Python Language Reference Manual; Network Theory Ltd., 2003.

[65] Hunter, C.A. ; Sanders, J.K.M. J. Am. Chem. Soc. 1990, 112, 5525.

[66] Burley, S.K. ; Petsko, G.A. Science 1985, 229, 23.

[67] Brana, M.F. ; Cacho, M. ; Gradillas, A. ; Pascual-Teresa, B. ; Ramos, A. Curr. Pharm. Des. 2001, 7, 1745.

[68] Gazit, E. Chem. Soc. Rev. 2007, 36, 1263.

[69] Desiraju, G.R. ; Gavezzotti, A. JCS Chem. Comm. 1989, page 621.

[70] Askew, B. ; Ballester, P. ; Buhr, C. ; Jeong, K.S. ; Jones, S. ; Parris, K. ; Williams, K. ; Rebek, J. J. Am. Chem. Soc. 1989, 111, 1082.

[71] Smithrud, D.B. ; Diederich, F. J. Am. Chem. Soc. 1990, 112, 339.

[72] Muller-Dethlefs, K. ; Hobza, P. Chem. Rev. 2000, 100, 143.

[73] Felker, P.M. ; Maxton, P.M. ; Schaeffer, M.W. Chem. Rev. 1994, 94, 1787.

[74] Leopold, K.R. ; Fraser, G.T. ; Novick, S.E. ; Klemperer, W. Chem. Rev. 1994, 94, 1807.

[75] Mueller-Dethlefs, K. ; Dopfer, O. ; Wright, T.G. Chem. Rev. 1994, 94, 1845.

[76] Sinnokrot, M.O. ; Sherrill, C.D. J. Phys. Chem. A 2006, 110, 10656.

[77] Ye, X. ; Li, Z. ; Wang, W. ; Fan, K. ; Xu, W. ; Hua, Z. Chem. Phys. Lett. 2004, 397, 56 .

55 [78] Tauer, T.P. ; Sherrill, C.D. J. Phys. Chem. A 2005, 109, 10475.

[79] Podeszwa, R. ; Szalewicz, K. J. Chem. Phys. 2007, 126, 194101.

[80] Podeszwa, R. J. Phys. Chem. A 2008, 112, 8884.

[81] Smith, T. ; Slipchenko, L.V. ; Gordon, M.S. J. Phys. Chem. A 2008, 112, 5286.

[82] Sinnokrot, M.O. ; Valeev, E.F. ; Sherrill, C.D. J. Am. Chem. Soc. 2002, 124, 10887.

[83] Tsuzuki, S. ; Uchimaru, T. ; Matsumura, K. ; Mikami, M. ; Tanabe, K. Chem. Phys. Lett. 2000, 319, 547 .

[84] Podeszwa, R. ; Bukowski, R. ; Szalewicz, K. J. Phys. Chem. A 2006, 110, 10345.

56 Chapter 2

Structure, vibrational frequencies, ionization energies, and photoelectron spectrum of the para-benzyne radical anion

2.1 Introduction

Para-benzyne1, an intermediate in Bergman cyclization2, is believed to be a warhead in antitumor enediyne antibiotics3–5 due to its ability to cause double-strand DNA cleavage and self-programmed cell death or apoptosis6, 7, and extensive studies have been carried out in the past 20 years to understand the factors that control the reaction8–10. Originally, it was interest in para-benzyne electronic structure that motivated the development of procedures to generate the corresponding anion for use in photoelectron spectroscopy experiments11, 12. Recent studies suggested that anionic cycloaromatization reactions are also feasible13, 14 including the anionic version of the Bergman cyclization15. Larger sensitivity to substituents15 suggests that the reactions of the anions might be designed to be more selective than the prototypic non-anionic Bergman cyclization, important for their applications in cancer-DNA damaging drugs. Previous theoretical studies of the para-benzyne anion16 predicted strong vibronic interactions (often referred to as pseudo or second-order Jahn-Teller effect) that produce

57 a lower symmetry C2v structure in addition to the D2h minimum. The energy difference between the two structures was found to be highly sensitive to the correlation method, with higher-level models favoring D2h. The analysis of the photoelectron spectrum of

12 the anion supported the assignment of the D2h symmetry, and it was concluded that the lower symmetry structures are artifacts of incomplete correlation treatment, a well known phenomenon in electronic structure17. Interestingly, density functional calculations produced only D2h structures, which was regarded as success of DFT methodology and the authors concluded that “this level of theory is particularly well-suited for computational studies of distonic radical anions derived from diradicals”16. The robust behavior of DFT in cases of symmetry breaking has been noted by other researches as well18. The goal of this work is to investigate the performance of coupled-cluster (CC) and equation-of-motion (EOM) methods in this challenging case of strong vibronic interactions and to assess the quality of the ab initio and DFT results by comparing the calculated photoelectron spectrum with the experimental one. We quantify the symmetry lowering by the selected geometrical parameters, charge distributions, and relative energies. In addition, the shape of the potential energy surface along the symmetry breaking coordinate is analyzed. We also present an efficient scheme for calculating ionization energies (IEs) of the anion in the spirit of isodesmic reactions. Several recent studies19–21 discussed different aspects of symmetry breaking in electronic structure calculations distinguishing between: (i) purely artefactual spatial and spin symmetry breaking of approximate wave functions, i.e., Lowdin¨ dilemma22; (ii) real interactions between closely-lying electronic states that result in lower-symmetry structures, significant changes in vibrational frequencies (see Fig. 1 from Ref. 21), and even singularities (first order poles) in force constants (e.g., Fig. 3 from Ref. 20). While the latter is a real physical phenomenon, it may or may not be accurately described by an

58 approximate method. If vibronic interactions are overestimated, calculations may yield incorrect lower-symmetry structure, and vice verse. Formally, the effect of the interactions between electronic states on the shape of potential energy surface can be described20 by the Herzberg-Teller expansion of the potential energy of electronic state i:

where Qα denotes normal vibrational modes, wave functions Ψk and energies Ek are adiabatic wave functions and electronic energies at Qα = 0. The last term, which is quadratic in nuclear displacement — hence second-order Jahn-Teller, describes vibronic effects. For the ground state, i.e., when Ej > Ei, it causes softening of the force constant along Qα, which may ultimately result in a lower symmetry structure, e.g., see Fig. 1 from Ref. 21. The above mentioned poles in force constants originate in the energy denominator. The formal analysis of the CC and EOM-CC second derivatives by Stanton20 demonstrated that the quadratic force constant of standard coupled-cluster methods, e.g., coupled-cluster with singles and doubles (CCSD), necessarily contains unphysi- cal terms, which may become signiﬁcant and spoil the CC force constant in the cases of strongly interacting states. Nevertheless, the standard CC methods are quite reliable for systems exhibiting pseudo Jahn-Teller interactions, as follows from Refs. 20, 21, as well as a host of numerical evidence. EOM-CC methods, on the other hand, provide most satisfactory description of the interacting states20, not at all surprising in view of a

59 multistage nature of EOM. It should be noted that both CC and EOM-CC may exhibit spurious frequencies if the orbital response poles, i.e., coupled-perturbed Hartree-Fock (CPHF) poles, are present due to near-instabilities in the reference, a condition closely related to (i). Perturbative corrections, e.g., as in CCSD(T), do not provide systematic improvement, and often results in wider instability volcanoes19. By analyzing the properties of linear response of multi-conﬁgurational SCF (MCSCF) and the relationship between response equations and poles’ structure, Stanton dispelled a common misconception of relying on multi-reference treatment of systems with strong vibronic interactions20. Using the same arguments, he also concluded that DFT describes pseudo Jahn-Teller effects reasonably well. While some numerical evi- dence18 seems to support this conclusion, other examples21 indicated that DFT (B3LYP) tend to underestimate the magnitude of vibronic interactions. Our results are in favor of the latter, and we attribute this behavior to self-interaction error. The structure of this chapter is as follows. The next section describes molecular orbitals (MOs) and relevant electronic states of the para-benzyne anion, as well as the nature of vibronic interactions in this system. Section 2.3 outlines computational details. Section 2.4 discusses the competition between lower and higher symmetry structures at different levels of theory. Photodetachment spectra are presented in section 2.5. Differ- ent computational strategies for calculating IEs are discussed in section 2.7. Our ﬁnal remarks are given in the last section.

2.2 Molecular orbital picture and the origin of vibronic

interactions in the para-benzyne anion

The frontier MOs of the para-benzyne anion and the diradical are shown in Fig. 2.1. At

D2h, the two orbitals belong to different irreps, b1u and ag, and their density is equally

60 distributed between the two radical centers (C1 and C4). At C2v, both orbitals become a1, and their densities are localized at different carbons. As in the benzyne diradical, the lowest MO, which hosts two electrons in the anion, is of anti-bonding character with respect to C1-C4, whereas the singly occupied MO is bonding. In the case of the diradical, three singlet states and one triplet state are derived from different distributions of

1 1 1 3 two electrons on these two orbitals: X Ag, 2 Ag, 1 B1u, 2 B1u. The ground singlet state belongs to the same irrep as the doubly excited singlet, and both have significant multi- configurational character. Likewise, distributing three electrons on these two orbitals in the anion gives rise to two electronic configurations shown in the lower panel of Fig. 2.1.

At D2h, these two determinants are of different symmetry and correspond to two distinct

2 2 electronic states, X Ag and B1u, separated by the energy gap of about 0.95 eV, as calculated by EOM-CC for electron-attached states (see below). At C2v, both determinants are of A1 symmetry and, therefore, can interact. This interaction increases the energy separation between the two electronic states that acquire multi-configurational character and, therefore, “softens” the corresponding normal mode in the lower state. If the interaction is sufficiently strong, that is, if the coupling matrix element is large relative to the energy gap between the interacting states, the lower potential energy surface (PES) may develop a lower symmetry minimum. In other words, lowering symmetry stabilizes the lowest electronic state through configurational interaction, and this is the driving force for symmetry breaking. This analysis of the wave functions of interacting states complements the lessons learned from Eq. (2.1) and poles’ structure of quartic force constants20, 21. For example, it shows that two factors are important for accurate description of systems with considerable vibronic interactions: (i) balanced description of possibly multi-configurational wave function (non-dynamical correlation); and (ii) quantitative accuracy in describing

61 D2h C2v

C1 C1

a g a 1

C4 C4

C1 C1

b1u a 1

C4 C4

6a g 11a1 2 2 A B1u A A1 5b1u 10a1

6ag 11a1 2 2 X Ag X A1 5b1u 10a1

Figure 2.1: Frontier MOs and leading electronic conﬁgurations of the two lowest electronic states at D2h and C2v structures.

62 excited state ordering (dynamical correlation). Unbalanced treatment may overempha- size the magnitude of vibronic interaction and, therefore, may lead to so-called artefactual symmetry breaking17. While (i) can be achieved by multi-configurational self- consistent field (MCSCF), (ii) is much more difficult to satisfy. For example, MCSCF exhibits artificial symmetry breaking in NO3, which is removed when dynamical correlation is included23. Ground-state coupled-cluster methods, e.g., CCSD or CCSD(T), may fail due to the insufficient description of non-dynamical correlation, as it happens

24, 25 26 in NO3 , although they often tackle such challenging systems successfully . The analysis of the MOs and the electronic states of the para-benzyne anion (Fig. 2.1) suggests a moderate multi-conﬁgurational character of the ground state wave function, and this is conﬁrmed by EOM-EA amplitudes (as calculated at EOM-EA-

CCSD/6-311+G**) For the D2h structure, the EOM coefficient of the leading configuration (the Ag deteminant from Fig. 2.1 and 2.2) is 0.86. The three next important configurations have weights of 0.28, 0.26 and 0.07. Likewise, for the C2v structure the weights of the four leading configurations are 0.87, 0.27, 0.16 and 0.15. The EOM- SF-CCSD/6-311+G** amplitudes are similar: the weights of the leading configurations are 0.81 and 0.80 for the D2h and C2v structures, respectevily. This confirms minor multi-configurational character of the para-benzyne anion. For comparison, the weights of the four leading configurations of the singlet para-benzyne diradical, which is a truly multiconfigurational system, are 0.57, 0.38, 0.32 and 0.31 and the two leaing configurations are double excitations with respect to each other. Thus, we expect ground-state

− CC methods to be capable of treating C6H4 reasonably well, unlike heavily multicon- ﬁgurational wavefunctions, as those of the singlet diradicals24. The EOM methods that simultaneously include both dynamical and non-dynamical correlation are particularly attractive for the systems with vibronic interactions. More- over, as formal analysis of Stanton shows, the pole structure of the exact force constant

63 Reference:

6a Ψ Singlet g t arg et 2- = I R P p-C6H 4 Ψ 5b1u ref

6ag EA 6a Triplet Ψtarget = R Ψ ref g

p-C6 H 4 + 5b1u 5b1u

f - re SF Ψ Doublet p-C6H4 7ag R t = ge Quartet Ψ tar - 6ag p-C6 H4

5b1u

− Figure 2.2: The two lowest electronic states of C6H4 can be described by: (i) EOM-IP with the dianion reference; (ii) EOM-EA with the triplet neutral p-benzyne reference; (iii) EOM-SF with the quartet p-benzyne anion reference. of Eq. (2.1) is correctly reproduced within EOM-CC formalism. From the conﬁgura- tion interaction point of view, the balanced description of two important determinants from Fig. 2.1 is easily provided by several EOM-CC methods, i.e., ionization potential, electron-attachment, and spin-ﬂip EOM-CC (EOM-EA, EOM-IP, and EOM-SF, respectively), as explained in Fig. 2.2. All three EOM models include dynamical correlation as well. Their reliability depends on how well the corresponding reference state is described by the single-reference CCSD method and possible (near)-instabilities of the Hartree-Fock references.

64 2.3 Computational details

Two basis sets were employed in this work. All the geometry optimizations, frequencies’ calculations and most single point calculations were performed with the 6- 311+G**27, 28 basis. Additional single point calculations of IEs used the aug-cc-pVTZ29 basis. The choice of reference in our calculations requires additional comments. The

2 ground-state methods [i.e., MP2, CCSD, CCSD(T)], which employ Ag reference, were conducted using pure-symmetry ROHF and UHF references. The < S2 > values for

UHF were 0.869 and 1.226 at D2h and C2v, respectively. The stability analysis revealed several symmetry-breaking triplet instabilities at D2h. The lowest-energy UHF solution is heavily spin-contaminated, < S2 >=1.248, and is also of broken symmetry. When employed in the CCSD calculations, the lower-energy broken-symmetry UHF reference yielded higher correlated energies, in agreement with our experience (see also footnote 17 in Ref. 20). Thus, we consider symmetry-pure UHF references to be more appropriate in CC calculations, even though this results in cusps in the PES scans along the D2h→ C2v distortion. Using symmetry-broken reference at D2h yields continuous curves. In the cases of moderate spin-contamination (as for pure-symmetry UHF references), CCSD is orbital insensitive, however, the (T) correction is often more reliable for ROHF (see, for example, results from Ref. 30). The optimized-orbital CC models avoid ambiguity of the reference choice for non-variational CCSD, and, in view of UHF instabilities, we consider optimized orbitals coupled-cluster with doubles (OO-CCD) reference to be most appropriate (among the ground-state CC methods) for this system. References for the EOM calculations (see Fig. 2.2) were not well behaved either.

1 2− The closed-shell A1 reference for EOM-IP corresponds to the C6H4 dianion, which is unstable. This caused severe convergence problems and we abandoned EOM-IP treat-

3 3 ment. The EOM-EA calculations employed the B1u state ( A1 at C2v) of the diradical.

65 Unexpectedly, the corresponding UHF reference develops strong spin contamination

2 at relatively small displacements from D2h: for example, < S >=2.026 and 2.468 and D2h and C2v, respectively. This compromises the reliability of EOM-EA-CCSD at

C2v. The problem can be resolved by performing the EOM-EA calculations using the

4 OO-CCD triplet reference (EOM-EA-OD). The EOM-SF calculations employ the B1u

4 3 reference ( A1 at C2v), which can be derived from the B1u conﬁguration by placing an additional electron on a σ-like ag orbital. Despite the similarity to the parent triplet, the quartet reference has been found to be well behaved throughout the whole range of the

2 D2h→C2v distortions, e.g., < S >=3.776 and 3.779 and D2h and C2v, respectively. Optimized geometries were obtained at the UHF, ROHF, UHF-CCSD31, UHF- CCSD(T)32, OO-CCD31, 33, EOM-EA-CCSD34, EOM-SF-CCSD35, 36, and B3LYP37 levels of theory. The latter two methods have produced only D2h structures. For all other methods, two minima were found, and the C2v-D2h energy differences ∆E’s were calculated from the corresponding total energies. In addition, ∆E’s were calculated by a variety of methods at the UHF-CCSD optimized geometries. The lowest electronically excited states for C2v and D2h structures where computed by EOM-EA-CCSD at the ground state geometries optimized at the same method. We also performed the UHF-CCSD, UHF-CCSD(T), OO-CCD, OO-CCD(T), EOM-EA-CCSD, EOM-SF-

CCSD and EOM-EA-OD PES scans along the D2h-C2v displacement coordinate R(x) deﬁned as:

R(x) = (1 − x) ∗ r(D2h) + x ∗ r(C2v), (2.2)

where x is a fraction of C2v structure and vector r denotes Cartesian coordinates of the

D2h and C2v structures. The photodetachment spectra were calculated within double harmonic and parallel normal modes approximations (using normal modes of anion) by program PES 38. In all spectra calculations, we employed the SF-DFT geometry and frequencies of the singlet

66 state of the diradical39, while the triplet state was described by CCSD. For the anion, we considered UHF-CCSD and B3LYP geometries and harmonic vibrational frequencies of the anion. Charge distributions were obtained from the natural bond orbital (NBO) analysis40. Dipole moments were computed relative to the molecular center of mass. The Mulliken

41 conventions for symmetry labels were used: for both C2v and D2h, the plane of the molecule is YZ and the Z-axis passes through the two carbons that host the unpaired electrons. Electronic structure calculations were performed using Q-Chem42 and ACES II 43.

2.4 Equilibrium structures, charge distributions, and

the D2h→ C2v potential energy proﬁles at different levels of theory

This section discusses the D2h and C2v optimized structures, respective energy differences, PES scans, as well as changes in charge distributions upon symmetry lowering. As pointed out by Nash and Squires16, most theoretical methods predict two minima on the PES, with D2h and C2v symmetries. As follows from the MOs (see Fig. 2.1), the latter structures are characterized by a localized charge, i.e., are of a distonic type.

The optimized D2h structures are presented in Fig. 2.3. Fig. 2.4 and Fig. 2.5 sum- marizes differences between the D2h and C2v structures calculated at different levels of theory. For coupled-cluster methods, the changes in geometries between different levels of theory (Fig. 2.4, 2.5) are not signiﬁcant and are smaller for the D2h structure than for C2v. For example, the variation in bond lengths and angles for D2h structures are

67 CCSD CCSD(T) 1.391 116.5 B3LYP 1.398 116.5 OO-CCD 1.384 117.2 EA-CCSD 1.390 116.8 SF-CCSD 1.393 116.6 1.393 116.6 H H 121.9 1.435 121.8 1.440 1.093 122.0 1.433 1.096 121.9 1.435 1.092 121.6 1.433 1.093 121.5 1.433 1.093 H H 1.092

Figure 2.3: D2h structures optimized at different levels of theory.

◦ below 0.02 A˚ and 2 , respectively. The differences are larger for the C2v structures, possibly because of larger spin-contamination of the doublet and triplet UHF references.

The D2h structures calculated by the CC methods with double substitutions, i.e., CCSD, OO-CCD, EOM-EA-CCSD, and EOM-SF-CCSD, are very close to each other, which is reassuring in view of the instability of the doublet UHF reference. The changes in the bond lengths and angles between CCSD(T) and CCSD are within 0.01 A˚ and 1◦, which is consistent with typical differences for well behaved molecules (Ref. 44). The largest change due to (T) is observed for the distance between the radical centers. The B3LYP structure is considerably different, e.g., the corresponding C1-C4 distance is 0.035 A˚ shorter than the CCSD(T) value. The above trends suggest that the accuracy of the CC structures is not affected by the HF instabilities and vibronic interactions, and question the reliability of B3LYP results.

68 C : C 4 2 v D : C 1 1 2 2 2 h C : C 1 2 v

1 2 0 g

e 1 1 8 d

, e l g n 1 1 6 A

1 1 4

1 1 2 B 3 L Y P C C S D O O - C C D C C S D ( T ) E A - C C S D S F - C C S D M e t h o d

1 . 4 5 D : C 2 - C 3 C : C 2 - C 3 C : C 1 - C 2 2 h 2 v 2 v D : C 1 - C 2 C : C 3 - C 4 2 h 2 v 1 . 4 4

1 . 4 3 A

, 1 . 4 2 h t g n e l 1 . 4 1 d n o B 1 . 4 0

1 . 3 9

1 . 3 8 B 3 L Y P C C S D O O - C C D C C S D ( T ) E A - C C S D S F - C C S D M e t h o d

Figure 2.4: CCC angles, bonds lengths calculated by different coupled-cluster methods and B3LYP.

The comparisons between the D2h and C2v structures allows one to quantify the degree of symmetry lowering at each level of theory. For example, the magnitude of

69 2 . 9 1 D : C - C 2 h 1 4 C : C - C 2 v 1 4

2 . 9 0 A

, e c

n 2 . 8 9 a t s i D

2 . 8 8

2 . 8 7 B 3 L Y P C C S D O O - C C D S F - C C S D E A - C C S D C C S D ( T ) M e t h o d

Figure 2.5: The distance between the radical-anion centers calculated by different coupled-cluster methods and B3LYP.

C2v distortions can be characterized by the difference between C1-C2 and C3-C4, which are identical in D2h. As shown in the lower panel of Fig. 2.4, the difference in C1-C2 between the D2h and C2v structures is largest for CCSD and OO-CCSD and is reduced by the inclusion of triples. At the EOM-EA-CCSD level, it shrinks even further. The C1-

C2 —C2-C3 difference, as well as difference in the C1 and C4 angles, follows the same trend. Thus, the magnitude of symmetry breaking is smaller for higher-level methods, or methods that use better references. This, along with the absence of a C2v minimum at the EOM-SF-CCSD level, suggests that its occurrence is artefactual. The degree of symmetry breaking can also be quantiﬁed by the charge localization, which gives rise to a non-zero dipole moment shown in Fig. 2.6. The corresponding atomic charges calculated using the NBO procedure are consistent with the dipole moment changes, except for ROHF, which may be an artifact of the NBO analysis. In agreement with the trends in structural changes, the charge localization is smaller for

70 - 0 . 3 2 R O H F - E A - C C S D

- 0 . 3 6 U H F - E A - C C S D R O H F - C C S D u . a

, ) 1 - 0 . 4 0 C ( q R O H F

- 0 . 4 4 U H F - C C S D

U H F - 0 . 4 8

5 . 5

R O H F 5 . 0 U H F U H F - C C S D 4 . 5 e y

b 4 . 0 e d

, e l 3 . 5 R O H F - C C S D o p i D 3 . 0 U H F - E A - C C S D

2 . 5

R O H F - E A - C C S D 2 . 0

Figure 2.6: The NBO charge at C1 (top panel) and dipole moments (bottom panel) at the optimized C2v geometries. The ROHF-CCSD and ROHF-EA-CCSD values are calculated at the geometries optimized by the corresponding UHF-based method.

71 more correlated wave functions. Interestingly, the overall dipole moment decreases in spite of increased C1-C4 distance. Thus, the charge localization patterns also suggest that the symmetry lowering is an artifact of an incomplete treatment of electron correlation. Finally, symmetry lowering can be characterized by energy differences (∆E’s) between the D2h and C2v structures summarized in Table 2.1 and Fig. 2.7. In addition, several PES scans are shown in Fig. 2.8. The ﬁrst part of the table and the upper panel of Fig. 2.7 summarize single point calculations at the CCSD optimized structures (MCSCF and CASPT2N employ the MCSCF geometries and are from Ref. 16), while the second part and the lower panel of Fig. 2.7 present energy differences calculated using the respective optimized structures. All ∆E’s from Table 2.1 and Fig. 2.7 are calculated using symmetry-pure UHF references at D2h (see section 2.3). Fig. 2.8 shows both the symmetry-pure and symmetry-broken CCSD/CCSD(T) values at D2h. For these methods, the correct-symmetry reference yields lower correlated energies, which results in cusps on the PESs (see section 2.3). OO-CCD has only one solution at D2h minimum, and the respective curves are smooth. Similarly to the trends in structures and charge localization, the energy difference between the D2h and C2v structures is also very sensitive to correlation treatment, in

16 agreement with earlier studies . UHF and ROHF place the C2v structure signiﬁcantly lower (by about 1 eV), than the D2h one, while MP2 gives the almost exactly opposite result. At higher level methods, the energy difference becomes much smaller. At the CCSD and OO-CCD levels, the C2v structure is just a little (<0.1 eV) more stable than the D2h one. Finally, (T) correlation at the CCSD or OO-CCD levels changes the balance and the D2h structure becomes the global minimum. Similarly, at the EOM-EA-

CCSD level, D2h is also the lowest minimum. The B3LYP and EOM-SF PES do not have a C2v minimum. Overall, the PES along this distortion is rather ﬂat (due to vibronic

72 1 . 2

0 . 8 V e

, )

h 0 . 4 2 D ( E

- 0 . 0

) v 2 C (

E - 0 . 4

- 0 . 8

- 1 . 2 ) ) ) 2 2 P F F N D D D D F D D T T T P ( 2 P ( ( Y S C S H S S H C O - T L D D D M S M C C C C U O C - - 3 A P - S S C C C C C C R F F B - - - - E S O C / C C - M H F H - F F A A T C O C F - S - O U E H H O C F - - H F F R U O D O F F U H H R H H U O U U R

1 . 2

0 . 8 V e

, )

h 0 . 4 2 D ( E

- 0 . 0

) v 2 C (

E - 0 . 4

- 0 . 8

- 1 . 2 ) F D D F D T ( S H S H C D C C U O C - S C C R - - O C F A O C - E H - F U F H H U U

Figure 2.7: Energy differences between the C2v and D2h structures (6-311+G** basis). Upper panel: calculated using the UHF-CCSD optimized geometries. MCSCF and CASPT2N employ the MCSCF geometries. Lower panel: calculated at the respective optimized geometries.

73 Table 2.1: Energy differences between the C2v and D2h minima (eV) in the 6-311+G** basis set. Negative values correspond to the C2v minimum being lower. ZPEs are not included.

Method UHF ROHF at UHF-CCSD optimized geometries HF -0.999 -1.049 MP2 1.191 0.887 CCSDa -0.046 (-0.116) -0.130 CCSD(T)a 0.140 (-0.052) 0.077 B3LYP 0.070 5050b -0.149 OO-CCD -0.096 OO-CCD(T) 0.081 EOM-EA-CCSD 0.086 EOM-EA-OD 0.050 EOM-SF-CCSD 0.052 at the geometries optimized by the corresponding method HF -0.971 -1.039 5050 -0.136 CCSD -0.046 CCSD(T) 0.140 OO-CCD -0.097 EOM-EA-CCSD 0.071 a Calculated using symmetry-pure and symmetry-broken (values in parenthesis) doublet UHF references at D2h. b DFT functional with 50% of the HF exchange. interactions): the most reliable estimates of ∆Es between the two CCSD optimized minima range between 0.05 eV (EOM-SF-CCSD) to 0.08 eV [CCSD(T) and OO-CCD(T)]. Optimized-orbital and regular CC methods give similar energy orderings, although OO-

CCD and OO-CCD(T) place the D2h structure a little higher than the corresponding CC methods based on the Hartree-Fock reference. EOM-SF behave similarly to EOM-EA.

74 C C S D E A - C C S D 0 . 1 2 C C S D ( T ) S F - C C S D O O - C C D E A - O D 0 . 0 8 O O - C C D ( T ) B 3 L Y P

0 . 0 4 V e

) 0 . 0 0 v 2

C (

E - 0 . 0 4 - E - 0 . 0 8

- 0 . 1 2

- 0 . 1 6 0 2 0 4 0 6 0 8 0 1 0 0 % o f C 2 v

Figure 2.8: The potential energy proﬁles along the D2h-C2v scans. At the D2h structure, CCSD and CCSD(T) energies are calculated using symmetry-broken and pure- symmetry UHF references (see section 2.3).

Similar trends, although with larger variations, were observed for multi-reference methods, e.g. MCSCF and CASPT2. The previous study16 reported the energy differences between the C2v and D2h structures calculated by multi-reference methods as -0.286 eV for MCSCF(9,8)/cc-pVDZ and 0.388 for CASPT2N/cc-pVDZ. Thus, only when dynamical correlation is included in a multi-reference method, the D2h structure becomes the lowest in energy. As follows from Table 2.1, using different optimized geometries has only a minor effect on ∆E’s. For example, the energy differences calculated at the CCSD optimized geometries are within 0.03 eV from those calculated using the structures optimized at the same level of theory, and the observed trends are the same.

75 Thus, the presented ∆E’s exhibit the same trend as the optimized geometries and charge localization patterns — the more correlation we add the more stable the D2h structure becomes. Energy profiles from Fig. 2.8 give more detailed information on the shape of the PES. Due to the HF instabilities, the CCSD and CCSD(T) curves are discontinuous, unless the symmetry-broken UHF reference is used at D2h. Apart from the cusp, the shapes of the OO-CCD and CCSD curves are surprisingly similar. The difference becomes more profound when triples are included: the OO-CCD(T) curve smoothly descends towards the D2h minimum, while the CCSD(T) curve is gradually climbing up all the way till the final drop at D2h. Even though the corresponding D2h→ C2v ∆E’s are close, the shape of the EOM-SF-CCSD and OO-CCD(T) curves are quite different: the former is much flatter at D2h than the latter. The EOM-EA-CCSD curve parallels

EOM-SF-CCSD around D2h, but soon becomes spoiled by the instability in the triplet reference. Using optimized orbitals triplet reference solves the problem of HF instability and the EOM-EA-OD curve is very similar to the EOM-SF-CCSD one. The B3LYP curve is steeper than the EOM-SF one, consistent with large value of the corresponding frequency (see next section). Overall, Fig. 2.8 demonstrates that describing the shape of PES is more challenging than calculating energy difference between two minima. One should expect considerable differences in the corresponding vibrational frequency. Note that the analytic CCSD or CCSD(T) frequencies would differ dramatically from those calculated by ﬁnite differences: the analytic frequencies will characterize the curvature of the PES corresponding to the symmetry-pure reference, whereas ﬁnite difference calculations will sample the cusp. To conclude, the analysis of the equilibrium structures, charge distributions, and energy differences calculated at different levels of theory suggests that the C2v minimum is an artifact of incomplete treatment of electron correlation and HF instabilities in the

76 doublet or triplet references. The strong vibronic interactions result in rather ﬂat PES along the D2h→ C2v distortion. In the next section, we present our calculations of the photoelectron spectrum, which allows us to assess the quality of calculated geometries and to unambiguously determine the symmetry of the anion.

2.5 Vibrational frequencies and the photoelectron spec-

− trum of para-C6H4

Photoelectron spectroscopy is a very sensitive structural tool. The spectra depend on equilibrium geometries, frequencies and normal modes of the initial (radical anion) and target (diradical) states. The equilibrium geometries were discussed in the previous section. The vibrational frequencies of the anion (at D2h) and the diradical are given in Table 2.2. The CCSD and B3LYP frequencies of the anion appear to be similar, except for the lowest b1u mode that describes D2h→C2v displacements and is most affected by vibronic interactions. This mode also exhibits the largest change relative to the neu-

− tral, i.e., it is considerably softer in C6H4 . The B3LYP frequency is 40% higher than the CCSD value, in favor of our assumption that B3LYP underestimates the magnitude of vibronic interaction. Unfortunately, the only signiﬁcant vibrational progressions present the Franck-Condon factors (see section 2.3) are those for fully-symmetric normal modes that exhibit displacements relative to the anion. Modes with signiﬁcant frequency change, such as the above b1u mode, yield much smaller features that are likely to be buried under major progressions, as it happens for the para-benzyne anion. The electron photodetachment spectrum of the para-benzyne radical anion reported by Wenthold and coworkers12 consists of the two overlapping bands corresponding to

1 3 the singlet (X Ag) and triplet ( Bu) states of the diradical with the origins, i.e., electron

77 Table 2.2: Vibrational frequencies of the para-benzyne radical anion and the diradical.

2 a 2 b 3 a 1 c Symmetry X A1 X A1 a B1u X A1

ag 3137 3085 3221 3317

ag 1471 1434 1565 1440

ag 1189 1175 1166 1202

ag 992 981 1028 1065

ag 624 631 615 660

au 879 917 920 994

au 380 398 368 450

b1g 691 721 810 722

b2g 1120 890 871 961

b2g 647 601 437 616

b3g 3114 3061 3207 3302

b3g 1579 1549 1652 1738

b3g 1297 1284 1300 1327

b3g 608 612 585 597

b1u 3111 3056 3206 3300

b1u 1445 1425 1471 1511

b1u 1071 1060 1046 1101

b1u 452 639 947 973

b2u 3136 3082 3220 3316

b2u 1331 1332 1375 1448

b2u 1189 1203 1267 1254

b2u 1027 1020 1081 1076

b3u 715 718 753 796

b3u 422 447 400 477 a Calculated at UHF-CCSD/6-311+G** b Calculated at B3LYP/6-311+G** c From Ref. 39, SF-5050/6-31G** afﬁnities (EAs), at 1.265 ± 0.008 eV and 1.430 ± 0.015 eV, respectively. Long vibrational progressions for frequencies at 635 ± 20 cm−1 and 990 ± 20 cm−1 (assigned to the singlet band), at 610 ± 15 cm−1 and 995 ± 20 cm−1 (assigned to the triplet band),

78 as well as a hotband at 615 ± 30 cm−1 were reported. For both the singlet and the triplet bands, the lower (about 600 cm−1) and the higher (about 1000 cm−1) energy progressions were assigned to symmetric ring deformation and ring breathing, respectively.

These bands, four lowest ag vibrations, are also present in all the computed spectra.

Including the lowest b1u mode resulted in very small features, completely obscured by the dense lines of major progressions. In the original paper12, the normal displacements were derived from the fitting procedure. Table 2.3 and Fig. 2.9 compare these values with the shifts computed in this work from the corresponding D2h optimized geometries and normal modes of the anion. The triplet and the singlet states of the diradical are described by the CCSD and spin-flip DFT methods, respectively (see section 2.3). The computed CCSD shifts are very close to those derived from the experiment. Values obtained with the CCSD normal modes and different equilibrium structures [e.g., CCSD(T), EOM-EA-CCSD, OO-CCD, and B3LYP] are quite similar to each other. However, the full B3LYP shifts (the last line in Table 2.3), i.e., those computed using the B3LYP normal modes and B3LYP equilibrium structures, are much larger for the ring breathing mode, which will result in a longer progression. Finally, Fig. 2.10 compares the calculated and experimental photodetachment spectra. The upper panel of Fig. 2.10 shows the singlet and triplet bands of the spectrum for the D2h structures calculated using the CCSD description of the anion. To compare with the experimental spectrum, the scaled intensities of the singlet and triplet bands were added together. Two scaling coefficients were determined from the following conditions: (i) the combined spectrum is normalized to 1; and (ii) the intensity of peak number 3 in the experimental and the calculated spectra coincide. The experimental spectrum was normalized to 1 as well. The bottom part of Fig. 2.10 compares the computed D2h and the experimental spectra. The agreement between the two spectra

79 √ Table 2.3: Displacements in normal modes upon ionization, A ∗ amu.

a b a b Method ν5 ν4 ν5 ν4 Singlet Triplet Exp. ﬁtc] -0.44 -0.14 0.76 -0.12 CCSDd -0.49 -0.21 -0.69 -0.09 CCSD(T)d -0.50 -0.27 -0.71 -0.14 EOM-EA-CCSDd -0.49 -0.22 -0.70 -0.10 OO-CCDd -0.49 -0.21 -0.69 -0.09 DFT/B3LYPd -0.47 -0.17 -0.67 -0.05 DFT/B3LYPe -0.40 -0.29 -0.60 -0.23 aRing deformation bRing breathing cRef. 12 dusing the CCSD normal modes eusing the DFT/B3LYP normal modes

0 . 0 e x p . f i t C C S D C C S D ( T ) E O M - E A - C C S D 2

/ - 0 . 2

1 O O - C C D

u D F T m

a D F T - f u l l ∗ A

s - 0 . 4 t n e m e c a l

p - 0 . 6 s i D

- 0 . 8 S - Q 6 3 5 S - Q 9 9 0 T - Q 6 1 0 T - Q 9 9 5 N o r m a l m o d e s

Figure 2.9: Displacements along the normal modes upon ionization (D2h structure).

80 y t i s n e t n I

1 . 2 1 . 4 1 . 6 1 . 8 2 . 0 2 . 2 2 . 4 E l e c t r o n b i n d i n g e n e r g y , e V y t i s n e t n I

1 . 2 1 . 4 1 . 6 1 . 8 2 . 0 2 . 2 2 . 4 E l e c t r o n b i n d i n g e n e r g y , e V

Figure 2.10: Top: Singlet (solid line) and triplet (dotted line) bands of the electron photodetachment spectrum for D2h using the CCSD equilibrium geometries and frequencies of the anion. Bottom: the experimental (solid line) and the calculated (dotted line) spectra for D2h using the CCSD structure and normal modes.

81 is excellent: the computed spectrum has the same features as the experimental one and differs only slightly in the peaks’ intensities. Fig. 2.11 shows the spectrum calculated for the C2v CCSD structure, which is markedly different — it features additional progressions that are not present in the experimental spectrum. They correspond to the ring deformation band and the combination bands involving this mode. To conclude, an excellent agreement between the spectrum calculated using the D2h structure and the experimental one, along with significant differences between D2h and C2v, unambiguously proves the D2h structure of the para-benzyne radical anion. Moreover, it also confirms high quality of the CCSD equilibrium structure and frequencies for the anion, as well as the accuracy of the spin-flip description of the diradical. We also calculated

− the spectrum using the B3LYP optimized geometries and frequencies for p-C6H4 (and the same structures and frequencies for the diradical, as above). The resulting spectrum (Fig. 2.12) has signiﬁcantly different ratio of intensities for the two main bands and much longer progressions. Thus, even though DFT correctly predicts symmetry of the para-benzyne radical anion, the overall structure is of poor quality. The reasons for this are discussed below.

2.6 DFT self-interaction error and vibronic interactions

The analysis of equilibrium structures, charge localization, and energy differences between D2h and C2v structures presented in section 2.4 demonstrated that the magnitude of symmetry breaking decreases when higher level methods are employed. More- over, the PES scans and the analysis of UHF references strongly suggest that the exis- tence of a C2v minimum is an artifact. Indeed, EOM-SF-CCSD, the only CC method

82 y t i s n e t n I

1 . 2 1 . 4 1 . 6 1 . 8 2 . 0 2 . 2 2 . 4 E l e c t r o n b i n d i n g e n e r g y , e V

Figure 2.11: The experimental (solid line) and the calculated (dotted line) spectra for C2v using the CCSD structure and normal modes. that employs a well-behaved UHF reference (that is, symmetry-pure, not strongly spin- contaminated and stable), yields only a D2h minimum and a smooth PES. The EOM- EA-OD potential energy proﬁle parallels the EOM-SF one. Finally, modeling of the photoelectron spectrum ruled out the C2v structure. Thus, we conclude that symmetry

− breaking in p-C6H4 is purely artefactual. DFT/B3LYP appears to be more robust with respect to this artefactual symmetry breaking as it yields only a D2h structure. The important question is why, and whether or not one should consider DFT a reliable tool for modeling systems with strong vibronic interactions. Detailed benchmark studies (e.g., Ref. 18) provide many examples of DFT giving “the right answer” for difﬁcult symmetry breaking systems. Stanton explained these observations by making connections between DFT response theory, TD-DFT excitation energies and their Tamm-Dancoff approximation (footnote 23 in Ref. 20), and

83 y t i s n e t n I

1 . 2 1 . 4 1 . 6 1 . 8 2 . 0 2 . 2 2 . 4 E l e c t r o n b i n d i n g e n e r g y , e V

Figure 2.12: The experimental (solid line) and the calculated (dotted line) spectra for D2h using the B3LYP structure and normal modes. concluded that DFT describes pseudo Jahn-Teller effects reasonably well “for the right reason”, that is, because the poles appear in approximately the right places. Other examples, however, suggested that DFT might actually underestimate the magnitude of vibronic interactions21. To complicate matters even further, we would like to mention

− the dehydro-meta-xylylene anion (DMX ) for which DFT yields low symmetry (Cs or

26 C2) equilibrium geometries , while wave function based methods predict planar C2v structures45. To gain insight, we begin by analyzing the character of interacting electronic states

− − in p-C6H4 and DMX at symmetric and symmetry-broken geometries. As described in − sections 2.2 and 2.4, the two interacting states of p-C6H4 have the excess charge equally delocalized between two radical centers (C1 and C4) at D2h, whereas C2v distortions

84 result in the charge localization at one of the carbons. DMX− shows exactly the oppo-

26, 45 site behavior : at the planar C2v geometries, most of the excess charge resides on the

σ-radical center, whereas C2 or Cs displacements facilitate the ﬂow of the charge into the π-system. Thus, in both cases B3LYP favors the structures with more delocalized

+ 46 excess charge, which immediately brings about the infamous H2 example . While HF + dissociation curve for a one-electron H2 is exact, many DFT methods result in quasi- bound potential energy curves and with errors as large as 50 kcal/mol! This happens because DFT over-stabilizes the solutions with an electron being equally split between the two hydrogens, which is an artifact of self-interaction error (SIE) present in many functionals47–49. SIE is considerably larger in systems with odd number of electrons. Self-interaction energy, Coulomb interaction of an electron with itself, is always positive, and electron delocalization lowers its magnitude thus causing artificial stabilization of delocalized configurations. Other well understood examples of artificial stabilization of delocalized states due to SIE include underestimated barriers for radical reactions, dissociation of cationic radicals, and mixed-valence transition metal dimers49.

+ As shown in Ref. 48, the magnitude of SIE in H2 strongly depends on the distance between the atoms, varying from a negligibly small value around the equilibrium to 55 kcal/mol (B3LYP) at the dissociation limit. The charge-bearing radical centers in the

+ para-benzyne radical anion are 2.9 A˚ apart. At this distance, the H2 curve exhibits as much as 60% of its maximum SIE. This suggests substantial SIE in the para-benzyne radical anion.

To test this assumption, we computed energy differences between the D2h and C2v minima (using the CCSD optimized geometries) employing functionals with a varying fraction of HF exchange. The results are shown in Fig. 2.13. The trend is clear: the relative stabilization of the D2h structure is inversely proportional to the fraction of the HF exchange in the functional, and, consequently, is proportional to SIE. Combining

85 different correlation functionals with the ﬁxed fraction of the HF exchange has little effect on the ∆E’s, for example, all functionals with 0% HF exchange yield approxi- matly the same D2h-C2v enegry separition. Therefore, we conclude that for DFT, the strengths of vibronic interactions and, consequently, a relative ordering of the D2h and

C2v structures, are governed by SIE. Thus, B3LYP yields the right structure for the “wrong reason”. Therefore, it is not at all surprising that the quality of the B3LYP structure and the shape of PES is relatively poor, as follows from the comparison of the computed photo-electron spectrum with the experimental one. In DMX−, B3LYP yields the wrong structure, but for the same “right reason” (SIE). Therefore, B3LYP (or any functional with SIE) is not a reliable method for the systems with strong vibronic interactions that affect charge localization patters, or when the interacting states are characterized by considerably different charge distributions.

0 . 2 E D F 2

0 . 0 E D F 1 B 3 L Y P 5 0 - 5 0 G i l l B 3 L Y P 5 V - 0 . 2 e B L Y P

B 3 P W 9 1 , )

h S V W N 2

D B V W N ( - 0 . 4 7 5 - 2 5

E G G 9 9

) v 2 C

( - 0 . 6 E L Y P - 0 . 8 V W N H F

- 1 . 0 0 2 0 4 0 6 0 8 0 1 0 0 H F e x c h a n g e , %

Figure 2.13: Energy differences between the D2h and C2v minima (using the CCSD structures) calculated using different density functionals.

86 2.7 Calculation of electron afﬁnities of p-benzyne

Different strategies for calculating electron affinities in problematic open-shell systems were discussed in Ref. 26. The simplest ”brute-force” approach, i.e., to take the difference between the total energies of the anion and the neutral calculated at the same level of theory, can give accurate values only if the errors in the corresponding total energies are very small (which can be achieved only at very high levels of theory), or if both species are described on the equal footing by the method and errors in the total energies cancel out. The latter is difficult to achieve in the case when both the neutral and anion wave functions have a complicated open-shell character. For example, the singlet state of the diradical is significantly multi-configurational and it would be described less accurately than the anion by single reference methods. The triplet state of the diradical, however, is single-configurational and can be described by single-reference methods with approximately the same accuracy as the doublet. Thus, triplet EAs calculated by energy differences employing correlated single-reference methods are more reliable than singlet EAs. The latter can be accurately computed by subtracting the experimental (or a calculated by an appropriate method) singlet—triplet gap from the calculated triplet electron affinity. This approach was used for calculating EAs of triradicals26, 45 and diradicals50. Below we demonstrate that this scheme in the spirit of isodesmic reactions indeed yields accurate singlet EAs. EAs calculated at different levels of theory for the singlet and the triplet states of the diradical are presented in Table 2.4 and Fig. 2.14. The first two columns present triplet and singlet EAs computed by energy differences. The last column presents singlet EA calculated from the corresponding triplet EA and the best theoretical estimate of the singlet-triplet gap. For the triplet state, which is predominantly single-configurational, even relatively low level methods using a moderate basis, e.g. B3LYP and MP2 in 6- 311+G**, yield reasonable EAs, e.g., within 0.11 eV and 0.07 eV from the experiment,

87 respectively. For the singlet state, however, EAs calculated by DFT and MP2 are 1.03 eV and 0.67 eV off, because these methods are not appropriate for the multi-conﬁgurational wave function of the singlet. Note that such brute-force approach reverses the states ordering in the diradical even at the CCSD level. For both states coupled-cluster methods, especially with triples’ corrections, are in a reasonable agreement with the experimental values. For example, ROHF-CCSD(T) is within 0.16 eV for triplet and within 0.05 eV for singlet.

3 1 Table 2.4: Electron afﬁnities (eV) of the a B1u and X Ag states of the para-benzyne diradical.

Method Triplet EA Singlet EA Singlet EAa Experiment 1.430 1.265 Experiment-ZPE 1.375 1.140 6-311+G** basis B3LYP 1.481 2.172 1.310 UHF -0.994 2.679 -1.165 UHF-MP2 1.547 0.467 1.376 UHF-CCSD 0.893 1.659 0.722 UHF-CCSD(T) 1.207 1.085 1.036 ROHF -1.148 2.356 -1.319 ROHF-MP2 2.053 0.914 1.882 ROHF-CCSD 0.872 1.629 0.701 ROHF-CCSD(T) 1.216 1.094 1.045 OO-CCD 0.875 1.644 0.704 OO-CCD(T) 1.210 1.080 1.039 aug-cc-pVTZ basis ROHF -1.199 2.297 -1.370 ROHF-MP2 2.266 1.010 2.095 ROHF-CCSD 0.998 1.766 0.827 ROHF-CCSD(T) 1.395 1.258 1.224 aCalculated by using triplet EA and the singlet-triplet gap from Ref. 51.

88 6 - 3 1 1 + G * * 2 . 5 a u g - c c - p V T Z

2 . 0 V e

A 1 . 5 E

E x p . 1 . 0

0 . 5 ) ) ) F 2 D D P D T T T ( ( ( P S S H Y C L D D D C C U M C - 3 - S S C C C F - - B O C C / C F F - H T O C C H H - - U O F F F U O O D H H R U O R

2 . 0 6 - 3 1 1 + G * * a u g - c c - p V T Z 1 . 5 E x p . 1 . 0 V e

A 0 . 5 E

0 . 0

- 0 . 5

- 1 . 0 ) ) ) D P D F 2 D T T T ( ( ( S Y P S C H D L D D C C C U M 3 - - S S C C C - B F - O C C / C F - F H T C O C - - H H O U F F F O U O D H H R O U R

Figure 2.14: Adiabatic electron afﬁnities calculated by energy differences for the singlet (top panel) and triplet (bottom panel) state of the para-benzyne diradical.

89 For quantitative thermochemical results, larger basis sets are required. We performed additional coupled-cluster calculations with the aug-cc-pVTZ basis (see Table 2.4). EA for the triplet calculated by CCSD(T) is within 0.02 eV from the experimental one, as expected for this method. However, the singlet CCSD(T) value is 0.12 eV off, because of the poor description of the multi-conﬁgurational singlet state by ground-state single- reference methods. By employing the best theoretical estimate for the singlet-triplet gap in para- benzyne, i.e., 0.171 eV from Ref. 51, singlet EA calculated from the CCSD(T) triplet value is within 0.08 eV from the experiment, as compared to 0.12 eV for the brute-force approach. As follows from the excellent agreement of triplet EA, the accuracy of singlet EA can be improved by reﬁning the value of the singlet-triplet gap.

2.8 Conclusions

− The shape of ground-state PES of C6H4 is strongly affected by the vibronic interactions with a low-lying excited state. The magnitude of vibronic interactions depends cru- cially on the the energy gap and couplings between the interacting states. Complicated open-shell character of the wave functions of the interacting states and HF instabilities challenge ab initio methodology: the shape of PES along the vibronic coupling coordinate differs dramatically for different methods. For example, many wave function based methods predict vibronic interactions to be strong enough to result in a lower-symmetry

C2v minimum, in addition to the D2h structure. EOM-SF-CCSD and B3LYP predict only a single minimum, however, the curvature of the surface is quite different. The degree of this symmetry lowering can be quantiﬁed by structural parameters (e.g., the magnitude of nuclear displacements), charge localization and the resulting dipole moment, as well as energy difference between the two minima. We found that

90 the magnitude of symmetry breaking (as characterized by the above metrics) decreases when higher-level methods are employed. For example, the dipole moment of the C2v structure decreases in the HF→CCSD→EOM-EA-CCSD series. Likewise, coupled- cluster methods with triples corrections as well as EOM-EA/SF-CCSD predict that D2h is a lower energy structure. Moreover, EOM-SF predicted only D2h minima. Calculation of the potential energy proﬁles along the D2h→ C2v distortion along with the stability analysis of UHF references allowed us to attribute the lower-symmetry minimum to the incomplete treatment of electron correlation and UHF instabilities.

The quality of D2h structure was assessed by comparing the computed photoelectron spectrum against the experiment. The spectrum calculated using the CCSD description of the anion and the SF-DFT results for the diradical is in an excellent agreement with the experimental one, which is particularly encouraging in view of its dense and complicated nature. The spectrum calculated using the C2v geometry and frequencies is markedly different, which allows us to rule out the C2v structure. As pointed out in earlier studies16, B3LYP appears to be robust w.r.t. this artefactual symmetry breaking, and yields only D2h structure. The analysis of the structure and the potential energy proﬁle along the D2h→ C2v distortion reveals that the structure and the curvature are considerably different from those calculated by reliable wave functions methods. Moreover, the computed spectrum differs signiﬁcantly from the experimental one. The shape of the surface, as well as the corresponding harmonic frequency reveals that B3LYP underestimates the magnitude of vibronic interactions, as pointed out earlier by Crawford and coworkers21. Further analysis allowed us to attribute the failure of B3LYP to the self-interaction error. A similar behavior (that yielded seemingly different result) was observed in DMX−, where SIE resulted in B3LYP overestimation of vibronic interactions and, consequently, a lower symmetry structure. In both cases,

91 large SIE originates in signiﬁcant changes in charge localization patterns along symmetry breaking coordinate. Thus, B3LYP is not a reliable method for systems where vibronic interactions are coupled to different charge localization patterns, as happens in distonic radical and diradical anions. The work presented in this chapter was published in Ref. 52.

Reference list

[1] Wenk, H.H. ; Winkler, M. ; Sander, W. Angew. Chem. Int. Ed. Engl. 2003, 42, 502.

[2] Jones, R.R. ; Bergman, R.G. J. Am. Chem. Soc. 1972, 94, 660.

[3] Lee, M.D. ; Dunne, T.S. ; Siegel, M.M. ; Chang, C.C. ; Morton, G.0. ; Borders, D.B. J. Am. Chem. Soc. 1987, 109, 3464.

[4] Borders, D.B. , Doyle, T.W. , Eds. Enediyne antibiotics as antitumor agents; Mar- cel Deekker, New York, 1995.

[5] Maeda, H. , Edo, K. , Ishida, N. , Eds. Neocarzinostatin: the past, present, and future of an anticancer drug; Springer, New York, 1997.

[6] Schottelius, M.J. ; Chen, P. Angew. Chem. Int. Ed. Engl. 1996, 3, 1478.

[7] Hoffner, J. ; Schottelius, M.J. ; Feichtinger, D. ; Chen, P. J. Am. Chem. Soc. 1998, 120, 376.

[8] Nicolaou, K.C. ; Smith, A.L. Acc. Chem. Res. 1992, 25, 497.

[9] Bowles, D.M. ; Palmer, G.J. ; Landis, C.A. ; Scott, J.L. ; Anthony, J.E. Tetrahedron 2001, 57, 3753.

[10] Zeidan, T. ; Manoharan, M. ; Alabugin, I.V. J. Org. Chem. 2006, 71, 954.

[11] Wenthold, P.G. ; Hu, J. ; Squires, R.R. J. Am. Chem. Soc. 1996, 118, 11865.

[12] Wenthold, P.G. ; Squires, R.R. ; Lineberger, W.C. J. Am. Chem. Soc. 1998, 120, 5279.

[13] Eshdat, L. ; Berger, H. ; Hopf, H. ; Rabinovitz, M. J. Am. Chem. Soc. 2002, 124, 3822.

92 [14] Alabugin, I.V. ; Kovalenko, S.V. J. Am. Chem. Soc. 2002, 124, 9052.

[15] Alabugin, I.V. ; Manoharan, M. J. Am. Chem. Soc. 2003, 125, 4495.

[16] Nash, J.J. ; Squires, R.R. J. Am. Chem. Soc. 1996, 118, 11872.

[17] Davidson, E.R. ; Borden, W.T. J. Phys. Chem. 1983, 87, 4783.

[18] Cohen, R.D. ; Sherrill, C.D. J. Chem. Phys. 2001, 114, 8257.

[19] Crawford, T.D. ; Stanton, J.F. ; Allen, W.D. ; Schaefer III, H.F. J. Chem. Phys. 1997, 107, 10626.

[20] Stanton, J.F. J. Chem. Phys. 2001, 115, 10382.

[21] Russ, N.J. ; Crawford, T.D. ; Tschumper, G.S. J. Chem. Phys. 2004, 120, 7298.

[22] Lowdin,¨ P.-O. Rev. Mod. Phys. 1963, 35, 496.

[23] Eisfeld, W. ; Morokuma, K. J. Chem. Phys. 2000, 113, 5587.

[24] Crawford, T.D. ; Kraka, E. ; Stanton, J.F. ; Cremer, D. J. Chem. Phys. 2001, 114, 10638.

[25] Wladyslawski, M. ; Nooijen, M. In ACS Symposium Series, Vol. 828, pages 65–92, 2002.

[26] Slipchenko, L.V. ; Krylov, A.I. J. Phys. Chem. A 2006, 110, 291.

[27] Krishnan, R. ; Binkley, J.S. ; Seeger, R. ; Pople, J.A. J. Chem. Phys. 1980, 72, 650.

[28] Frisch, M.J. ; Pople, J.A. ; Binkley, J.S. J. Chem. Phys. 1984, 80, 3265.

[29] Woon, D. E. ; Jr., T. H. Dunning J. Chem. Phys. 1994, 100.

[30] Cristian, A.M.C. ; Shao, Y. ; Krylov, A.I. J. Phys. Chem. A 2004, 108, 6581.

[31] Purvis, G.D. ; Bartlett, R.J. J. Chem. Phys 1982, 76, 1910.

[32] Raghavachari, K. ; Trucks, G.W. ; Pople, J.A. ; Head-Gordon, M. Chem. Phys. Lett. 1989, 157, 479.

[33] Sherrill, C.D. ; Krylov, A.I. ; Byrd, E.F.C. ; Head-Gordon, M. J. Chem. Phys. 1998, 109, 4171.

[34] Nooijen, M. ; Bartlett, R.J. J. Chem. Phys. 1995, 102, 3629.

[35] Krylov, A.I. Chem. Phys. Lett. 2001, 338, 375.

93 [36] Levchenko, S.V. ; Krylov, A.I. J. Chem. Phys. 2004, 120, 175.

[37] Becke, A.D. J. Chem. Phys. 1993, 98, 5648.

[38] Arnold, D.W. Ph.D. Thesis, UC Berkeley, 1994.

[39] Shao, Y. ; Head-Gordon, M. ; Krylov, A.I. J. Chem. Phys. 2003, 118, 4807.

[40] NBO 5.0. Glendening, E.D. ; Badenhoop, J.K. ; Reed, A.E. ; Carpenter, J.E. ; Bohmann, J.A. ; Morales, C.M. ; Weinhold, F. Theoretical Chemistry Institute, University of Wisconsin, Madison, WI, 2001.

[41] Mulliken, R.S. J. Chem. Phys. 1955, 23, 1997.

[42] Shao, Y. ; Molnar, L.F. ; Jung, Y. ; Kussmann, J. ; Ochsenfeld, C. ; Brown, S. ; Gilbert, A.T.B. ; Slipchenko, L.V. ; Levchenko, S.V. ; O’Neil, D. P. ; Distasio Jr., R.A. ; Lochan, R.C. ; Wang, T. ; Beran, G.J.O. ; Besley, N.A. ; Herbert, J.M. ; Lin, C.Y. ; Van Voorhis, T. ; Chien, S.H. ; Sodt, A. ; Steele, R.P. ; Rassolov, V. A. ; Maslen, P. ; Korambath, P.P. ; Adamson, R.D. ; Austin, B. ; Baker, J. ; Bird, E.F.C. ; Daschel, H. ; Doerksen, R.J. ; Drew, A. ; Dunietz, B.D. ; Dutoi, A.D. ; Furlani, T.R. ; Gwaltney, S.R. ; Heyden, A. ; Hirata, S. ; Hsu, C.-P. ; Kedziora, G.S. ; Khalliulin, R.Z. ; Klunziger, P. ; Lee, A.M. ; Liang, W.Z. ; Lotan, I. ; Nair, N. ; Peters, B. ; Proynov, E.I. ; Pieniazek, P.A. ; Rhee, Y.M. ; Ritchie, J. ; Rosta, E. ; Sherrill, C.D. ; Simmonett, A.C. ; Subotnik, J.E. ; Woodcock III, H.L. ; Zhang, W. ; Bell, A.T. ; Chakraborty, A.K. ; Chipman, D.M. ; Keil, F.J. ; Warshel, A. ; Herhe, W.J. ; Schaefer III, H.F. ; Kong, J. ; Krylov, A.I. ; Gill, P.M.W. ; Head-Gordon, M. Phys. Chem. Chem. Phys. 2006, 8, 3172.

[43] ACES II. Stanton, J.F. ; Gauss, J. ; Watts, J.D. ; Lauderdale, W.J. ; Bartlett, R.J. 1993. The package also contains modiﬁed versions of the MOLECULE Gaussian integral program of J. Almlof¨ and P.R. Taylor, the ABACUS integral derivative program written by T.U. Helgaker, H.J.Aa. Jensen, P. Jørgensen and P.R. Taylor, and the PROPS property evaluation integral code of P.R. Taylor.

[44] Helgaker, T. ; Jørgensen, P. ; Olsen, J. Molecular electronic structure theory; Wiley & Sons, 2000.

[45] Munsch, T.E. ; Slipchenko, L.V. ; Krylov, A.I. ; Wenthold, P.G. J. Org. Chem. 2004, 69, 5735.

[46] Bally, T. ; Sastry, G.N. J. Phys. Chem. A 1997, 101, 7923.

[47] Polo, V. ; Kraka, E. ; Cremer, D. Molecular Physics 2002, 100, 1771 .

[48] Zhang, Y. ; Yang, W. J. Chem. Phys. 1998, 109, 2604.

94 [49] Lundber, M. ; Siegbahn, P.E.M. J. Chem. Phys. 2005, 122, 1.

[50] Reed, D.R. ; Hare, M. ; Kass, S.R. J. Am. Chem. Soc. 2000, 122, 10689.

[51] Slipchenko, L.V. ; Krylov, A.I. J. Chem. Phys. 2002, 117, 4694.

[52] Vanovschi, V. ; Krylov, A.I. ; Wenthold, P.G. Theor. Chim. Acta 2008, 120, 45.

95 Chapter 3

Future work

The effective fragment potential method, which emerged in the last decade, is currently an active area of research in both the methodology and applications. The implementation developed in this work increases the availability of the method and opens the possibility to combine it with various ab initio theories available in Q-Chem1 (e.g., RI-CCSD, EOM-SF-CCSD). Thus, it is likely to increase the interest to this method and its potential even further. Since EFP is a relatively new method, the opportunities for the future developments are abound. Below the most interesting directions of the future work are presented in the separate sections for the method, implementation and applications.

3.1 Methodology

Three-body dispersion

A recent work by Podeszwa ar al.2 demonstrated that three-body dispersion can contribute up to 50% to total three-body interaction energy for non-polar molecular clusters, such as cyclic benzene timer. Therefore, we expect that three body dispersion will become increasingly important in the three-dimensional aromatic clusters and in the bulk. To model intermolecular interactions in such systems accurately, the three-body dispersion term needs to be included in the EFP model. The approach, which must not compromise the computational efﬁciency of EFP, is yet to be developed. One of the

96 possible directions is to apply the approximations made in the derivation of the two- body EFP dispersion3 to the three-body dispersion expression in terms of frequency- dependent electric polarizabilities derived in Ref. 4.

Impoved electrostatics expansion

In the present EFP formulation5, the exchange-repulsion, polarization and dispersion points are the LMO centroids, whereas the electrostatics expansion points are the nuclei and bond midpoints. The LMOs centroids account for both bond midpoints and lone pairs. Thus, the lone pairs, which also represent concentration of the electronic density, are not included in the electrostatic expansion. An interesting question is if using the LMO centroids and the nuclei (instead of the bond midpoints and the nuclei) as the electrostatics expansion points will make the EFP description more balanced and thus more accurate. Additionally, the multipoles at the LMO centroids might better ﬁt in the uniﬁed EFP damping scheme proposed in Ref. 29.

Linear scaling EFP

The computational complexity of all EFP components in their straightforward formulation is O(N 2), where N is the number of fragments. Application of the fast multipole method6 to the electrostatics and polarization components along with a cutoff procedure to fast decaying exchange-repulsion and dispersion, will bring the computational complexity down to O(Nlog(N)). This will open the possibility for modeling much larger molecular systems and the properties of the bulk condensed phase at the EFP level.

Correlated treatment of the active region

In the present EFP formulation5 the active region is treated at the HF level of theory, which does not account for electron correlation. However, it is well established7–18 that

97 accurate description of many molecular properties requires correlated methods. Elec- tron correlation is especially important for the open-shell species, such as multiconfigurational excited states, which require inclusion of at least static correlation. The implementation of EFP developed in this work is a part of Q-Chem. This state of the art ab initio electronic structure package implements a wide variety of correlated methods1. Therefore, a promising direction of the future work is to develop a correlated EFP theory and to extend our implementation so that the active region can be treated with any of correlation methods available in Q-Chem. This theory will allow one to account for the environmental effects in the electronic structure studies of radicals and electronically excited species, as well as to model chemical reactions in the condensed phase. Relevant to the research conducted in this work (chapter 2), correlated EFP will allow to model the reaction of Bergman cyclization in solutions. Below a possible framework for such correlated EFP theory is outlined. In the present EFP formulation, the effective fragments affect the active region via one-electron terms in the Hamiltonian. These terms originate from the electrostatics and polarization components. Correlated methods can similarly use this modified Hamiltonian to account for the field of the effective fragments. However, a complication arises from the polarization component. Its contribution to the Hamiltonian depends on the values of the induced dipoles. The induced dipoles depend on the wavefunction of the active region, which in turn depends on the Hamiltonian. Thus the correlated wavefunction and the induced dipoles should be consistent with each other. To satisfy the consistency requirement, the induced dipoles and the correlated wavefunctions should be computed self-consistently with the following iterative procedure:

1. Compute the correlated wavefunction for a given values of the induced dipoles.

2. Update the induced dipoles to account for the new wavefunction.

98 3. If the induced dipoles and the wavefunction are converged exit, otherwise return to step 1.

However, such strict self-consistency requires to recompute the correlated wavefunction multiple times and, therefore, is computationally expensive. Alternatively, one may use the induced dipoles consistent with the HF wavefunction, which is computed as a prerequisite for all single reference correlated methods. This way the consistency is compromised to reduce the computational cost. Yet another approach is to perform just one step toward the consistency, i.e., to compute the correlated wavefunction based on the HF induced dipoles, update the induced dipoles and recompute the correlated wavefunction again. The accuracy of the last two approximations is yet to be investigated. For the methods computing the correlated wavefunction iteratively (such as CCSD), it is possible to design a more efﬁcient integrated approach, similar to that for the Hartree- Fock SCF procedure (section 1.3). In this approach, at every iteration both the correlated wavefunction and the induced dipoles are updated and thus consistence of both will be achieved once they converge. The drawback is the increased complexity in the implementation.

3.2 Implementation

Gradients

As discussed in section 1.7, the energy minima for the benzene dimer structures were obtained using scans of the potential energy surface. For large systems, a better approach would be to use a gradient-based optimization procedure. The gradient of the electrostatic energy was formulated in the original EFP paper19. Gradient for the dispersion

1 energy is simply a derivative of R6 . More intricate and difﬁcult to implement are recently

99 published gradients for polarization20, 21 and exchange-repulsion energies22. Implement- ing the gradient of EFP energy will allow one to perform geometry optimizations and molecular dynamics simulations for large molecular systems.

Fragment paramters

In the developed implementation, the EFP fragment parameters are deﬁned in the Q- Chem standard input, as described in section 1.6. The parameters of an effective fragment are computed from an individual isolated molecules, as discussed in sections 1.2, 1.3 1.4 and 1.5. Thus, for a given molecule the same set of EFP parameters can be used in all computations. This immediately prompts creating a library of the pre-computed EFP parameters for molecules that frequently occur in the systems studied with EFP. Such library can, for instance, include the parameters for common organic solvents (e.g., water, alcohols, acetone, DMSO, acetonitrile, formic acid, dichloromethane, carbon tetrachloride, chloroform, tetrahydrofuran, benzene, phenol, toluene, etc.). The EFP parameters included in the library can be carefully selected and benchmarked against high level ab initio theories. Therefore, the beneﬁts of such a library are not just the convenience it provides, but also a high-quality assurance for the EFP parameters.

QM-EF dispersion and exchange-repulsion interactions

In the current state of EFP theory, the interaction between the effective fragments and the ab initio region is due to the electrostatics and polarization components only. Thus, there is no account for dispersion and exchange-repulsion interactions between them. The proper theories for such interactions are presently developed in M.S. Gordon’s group. Meanwhile, it is possible to account for such interactions by treating the ab initio system as an effective fragment for these two types of interactions. Since the wavefunction of the ab initio part is readily available, the dispersion and exchange-repulsion parameters

100 can be computed as discussed in sections 1.4 and 1.5, respectively. From these parameters, dispersion and exchange-repulsion interaction energies between the active region and the effective fragments can be evaluated. Implementing such scheme will allow one to properly compute the binding energy in the systems with an ab initio region. In the present EFP theory, binding energy is properly computed for systems composed of the effective fragments only, whereas molecular systems with an ab initio region are used to model the solvation effects. Implementing the suggested technique will allow one to combine accurate description of the solvation effects and binding energies.

3.3 Applications

Numerous applications of EFP for modeling the solvation effects and the binding energies in the clusters and condensed phase can be envisioned. Here, the future direction of the research of multiple π − π interactions conducted in this work (section 1.7) is presented. As demonstrated in Ref. 2 using the cyclic benzene trimer example, three-body dispersion has a substantial contribution to many-body interaction energy. Introduction of the three-body dispersion term in EFP is proposed in section 3.1. Once this term is available, the extended EFP method can be applied to:

• study the role of many-body π − π interactions in the multi dimensional aromatic clusters and the bulk;

• model the mechanical properties of carbon-based materials, such as graphite and fullerene powders;

• investigate the role of π − π stacking in the rigidity of DNA.

101 Reference list

[1] Shao, Y. ; Molnar, L.F. ; Jung, Y. ; Kussmann, J. ; Ochsenfeld, C. ; Brown, S. ; Gilbert, A.T.B. ; Slipchenko, L.V. ; Levchenko, S.V. ; O’Neil, D. P. ; Distasio Jr., R.A. ; Lochan, R.C. ; Wang, T. ; Beran, G.J.O. ; Besley, N.A. ; Herbert, J.M. ; Lin, C.Y. ; Van Voorhis, T. ; Chien, S.H. ; Sodt, A. ; Steele, R.P. ; Rassolov, V. A. ; Maslen, P. ; Korambath, P.P. ; Adamson, R.D. ; Austin, B. ; Baker, J. ; Bird, E.F.C. ; Daschel, H. ; Doerksen, R.J. ; Drew, A. ; Dunietz, B.D. ; Dutoi, A.D. ; Furlani, T.R. ; Gwaltney, S.R. ; Heyden, A. ; Hirata, S. ; Hsu, C.-P. ; Kedziora, G.S. ; Khalliulin, R.Z. ; Klunziger, P. ; Lee, A.M. ; Liang, W.Z. ; Lotan, I. ; Nair, N. ; Peters, B. ; Proynov, E.I. ; Pieniazek, P.A. ; Rhee, Y.M. ; Ritchie, J. ; Rosta, E. ; Sherrill, C.D. ; Simmonett, A.C. ; Subotnik, J.E. ; Woodcock III, H.L. ; Zhang, W. ; Bell, A.T. ; Chakraborty, A.K. ; Chipman, D.M. ; Keil, F.J. ; Warshel, A. ; Herhe, W.J. ; Schaefer III, H.F. ; Kong, J. ; Krylov, A.I. ; Gill, P.M.W. ; Head-Gordon, M. Phys. Chem. Chem. Phys. 2006, 8, 3172.

[2] Podeszwa, R. ; Szalewicz, K. J. Chem. Phys. 2007, 126, 194101.

[3] Adamovix, I. ; Gordon, M.S. Mol. Phys. 2005, 103, 379.

[4] McLachlan, A.D. Mol. Phys. 1963, 6, 423.

[5] Gordon, M.S. ; Slipchenko, L. ; Li, H. ; Jensen, J.H. In Annual Reports in Com- putational Chemistry; Spellmeyer, D.C. , Wheeler, R. , Eds., Vol. 3 of Annual Reports in Computational Chemistry; Elsevier, 2007; pages 177 – 193.

[6] Schmidt, K.E. ; Lee, M.A. J. Stat. Phys. 1991, 63, 1223.

[7] Woon, D.E. ; Dunning, T.H. J. Chem. Phys. 1993, 99, 1914.

[8] Peterson, K.A. ; Kendall, R.A. ; Dunning, T.H. J. Chem. Phys. 1993, 99, 1930.

[9] Peterson, K.A. ; Kendall, R.A. ; Dunning, T.H. J. Chem. Phys. 1993, 99, 9790.

[10] Peterson, K.A. ; Woon, D.E. ; Dunning, T.H. J. Chem. Phys. 1994, 100, 7410.

[11] Woon, D.E. J. Chem. Phys. 1994, 100, 2838.

[12] Woon, D.E. ; Dunning, T.H. J. Chem. Phys. 1994, 101, 8877.

[13] Peterson, K.A. ; Dunning, T.H. J. Chem. Phys. 1995, 102, 2032.

[14] Peterson, K.A. ; Dunning, T.H. J. Chem. Phys. 1997, 106, 4119.

[15] Woon, D.E. ; Peterson, K.A. ; Dunning, T.H. J. Chem. Phys. 1998, 109, 2233.

102 [16] Wilson, A.K. ; Dunning, T.H. J. Chem. Phys. 1997, 106, 8718.

[17] Peterson, K.A. ; Dunning, T.H. J. Phys. Chem. A 1997, 101, 6280.

[18] Peterson, K.A. ; Wilson, A.K. ; Woon, D.E. ; Dunning, T.H. Theor. Chem. Acc. 1997, 97, 251.

[19] Day, P.N. ; Jensen, J.H. ; Gordon, M.S. ; Webb, S.P. ; Stevens, W.J. ; M.Krauss; D.Garmer; H.Basch; D.Cohen J. Chem. Phys. 1996, 105, 1968.

[20] Li, H. ; Gordon, M.S. J. Chem. Phys. 2007, 126.

[21] Li, H. ; Netzloff, H.M. ; Gordon, M.S. J. Chem. Phys. 2006, 125.

[22] Li, H. ; Gordon, M.S. Theor. Chem. Acc. 2006, 115, 385.

103 Bibliography

[1] I. Adamovic and M.S. Gordon. Solvent effects on the S(N)2 reaction: Application of the density functional theory-based effective fragment potential method. J. Phys. Chem. A, 109(8):1629–1636, 2005. [2] I. Adamovix and M.S. Gordon. Dynamic polarizability, dispersion coefficient C6 and dispersion energy in the effective fragment potential method. Mol. Phys., 103(2-3):379–387, 2005. [3] I.V. Alabugin and S.V. Kovalenko. C1-C5 photochemical cyclization of enediynes. J. Am. Chem. Soc., 124:9052–9053, 2002. [4] I.V. Alabugin and M. Manoharan. Radical-anionic cyclizations of enediynes: Remarkable effects of benzannelation and remote substituents on cycliorearoma- tization reactions. J. Am. Chem. Soc., 125:4495–4509, 2003. [5] N.L. Allunger, Y.H. Yuh, and J.H. Lii. Molecular mechanics - the MM3 force- field for hydrocarbons .1. J. Am. Chem. Soc., 111(23):8551–8566, 1989. [6] D.W. Arnold. Study of Radicals, Clusters, and Transition State Species by Anion Photoelectron Spectroscopy. PhD thesis, UC Berkeley, 1994. [7] B. Askew, P. Ballester, C. Buhr, K.S. Jeong, S. Jones, K. Parris, K. Williams, and J. Rebek. Molecular recognition with convergent functional groups. VI. Synthetic and structural studies with a model receptor for nucleic acid components. J. Am. Chem. Soc., 111(3):1082–1090, 1989. [8] R. Balawender, B. Safi, and P. Geerlings. Solvent effect on the global and atomic DFT-based reactivity descriptors using the effective fragment potential model. Solvation of ammonia. J. Phys. Chem. A, 105(27):6703–6710, 2001. [9] T. Bally and G.N. Sastry. Incorrect dissociation behavior of radical ions in density functional calculations. J. Phys. Chem. A, 101:7923–7925, 1997. [10] A.D. Becke. Density-functional thermochemistry. iii. The role of exact exchange. J. Chem. Phys., 98:5648, 1993.

104 [11] D.B. Borders and T.W. Doyle, editors. Enediyne antibiotics as antitumor agents. Marcel Deekker, New York, 1995.

[12] D.R. Bowler, T. Miyazaki, and M.J. Gillan. Recent progress in linear scaling ab initio electronic structure techniques. J. Phys.: Condens. Matter, 14(11):2781– 2798, 2002.

[13] D.M. Bowles, G.J. Palmer, C.A. Landis, J.L. Scott, and J.E. Anthony. The Bergman reaction as a synthetic tool: Advantages and restrictions. Tetrahedron, 57:3753–3760, 2001.

[14] S.F. Boys. Construction of some molecular orbitals to be approximately invariant for changes from one molecule to another. Rev. Mod. Phys., 32(2):296–299, 1960.

[15] S.F. Boys. In Quantum Science of Atoms, Molecules, and Solids, pages 253–262. Academic Press, 1966.

[16] M.F. Brana, M. Cacho, A. Gradillas, B. Pascual-Teresa, and A. Ramos. Interca- lators as anticancer drugs. Curr. Pharm. Des., 7(17):1745–1780, 2001.

[17] A.D. Buckingham. Molecular quadrupole moments. Q. Rev. Chem. Soc., 13(3):183–214, 1959.

[18] S.K. Burley and G.A. Petsko. Aromatic-aromatic interaction: A mechanism of protein structure stabilization. Science, 229(4708):23–28, 1985.

[19] C.J. Casewit, K.S. Colwell, and A.K. Rappe. Application of a universal force- ﬁeld to organic-molecules. J. Am. Chem. Soc., 114(25):10035–10046, 1992.

[20] K.R.S. Chandrakumar, T.K. Ghanty, S.K. Ghosh, and T. Mukherjee. Hydra- tion of uranyl cations: Effective fragment potential approach. J. Molec. Struct. (Theochem), 807(1-3):93–99, 2007.

[21] R.D. Cohen and C.D. Sherrill. The performance of density functional theory for equilibrium molecular properties of symmetry breaking molecules. J. Chem. Phys., 114:8257–8269, 2001.

[22] T.D. Crawford, E. Kraka, J.F. Stanton, and D. Cremer. Problematic p-benzyne: Orbital instabilities, biradical character, and broken symmetry. J. Chem. Phys., 114:10638–10650, 2001.

[23] T.D. Crawford, J.F. Stanton, W.D. Allen, and H.F. Schaefer III. Hartree-Fock orbital instability envelopes in highly correlated single-reference wave functions. J. Chem. Phys., 107:10626–10632, 1997.

105 [24] A.M.C. Cristian, Y. Shao, and A.I. Krylov. Bonding patterns in benzene triradicals from structural, spectroscopic, and thermochemical perspectives. J. Phys. Chem. A, 108:6581–6588, 2004.

[25] E.R. Davidson and W.T. Borden. Symmetry breaking in polyatomic molecules: Real and artifactual. J. Phys. Chem., 87:4783–4790, 1983.

[26] P.N. Day, J.H. Jensen, M.S. Gordon, S.P. Webb, W.J. Stevens, M.Krauss, D.Garmer, H.Basch, and D.Cohen. An effective fragment method for modeling solvent effects in quantum mechanical calculations. J. Chem. Phys., 105(5):1968– 1986, 1996.

[27] P.N. Day, M.S. Pachter, R.and Gordon, and G.N. Merrill. A study of water clusters using the effective fragment potential and Monte Carlo simulated annealing. J. Chem. Phys., 112(5):2063–2073, 2000.

[28] P.N. Day and R. Pachter. A study of aqueous glutamic acid using the effective fragment potential method. J. Chem. Phys., 107(8):2990–2999, 1997.

[29] G.R. Desiraju and A. Gavezzotti. From molecular to crystal structure; polynu- clear aromatic hydrocarbons. JCS Chem. Comm., pages 621–623, 1989.

[30] W. Eisfeld and K. Morokuma. A detailed study on the symmetry breaking an its effect on the potential surface of NO3. J. Chem. Phys., 113:5587–5597, 2000. [31] L. Eshdat, H. Berger, H. Hopf, and M. Rabinovitz. Anionic cyclization of a cross-conjugated enediyne. J. Am. Chem. Soc., 124:3822–3823, 2002.

[32] P.M. Felker, P.M. Maxton, and M.W. Schaeffer. Nonlinear raman studies of weakly bound complexes and clusters in molecular beams. Chem. Rev., 94(7):1787–1805, 1994.

[33] T.K. Firman and C.R. Landis. Valence bond concepts applied to the molecular mechanics description of molecular shapes. 4. Transition metals with π-bonds. J. Am. Chem. Soc., 123(47):11728–11742, 2001.

[34] M.A. Freitag and M.S. Gordon. Using the effective fragment-potential model to study the solvation of acetic and formic acids. Abstracts of Papers of the American Chemical Society, 218:U511–U511, 1999. 1.

[35] M.A. Freitag, M.S. Gordon, J.H. Jensen, and W.J. Stevens. Evaluation of charge penetration between distributed multipolar expansions. J. Chem. Phys., 112(17):7300–7306, 2000.

106 [36] M.J. Frisch, J.A. Pople, and J.S. Binkley. Self-consistent molecular orbital methods 25. Supplementary functions for Gaussian basis sets. J. Chem. Phys., 80:3265–3269, 1984. [37] J.L. Gao and D.G. Truhlar. Quantum mechanical methods for enzyme kinetics. Annu. Rev. Phys. Chem., 53:467–505, 2002. [38] E. Gazit. Self-assembled peptide nanostructures: the design of molecular building blocks and their technological utilization. Chem. Soc. Rev., 36(8):1263–1269, 2007. [39] E.D. Glendening, J.K. Badenhoop, A.E. Reed, J.E. Carpenter, J.A. Bohmann, C.M. Morales, and F. Weinhold. NBO 5.0. Theoretical Chemistry Institute, Uni- versity of Wisconsin, Madison, WI, 2001. [40] V. Gogonea, L.M. Westerhoff, and Jr. K.M. Merz. Quantum mechanical/quantum mechanical methods. I. A divide and conquer strategy for solving the schrodinger equation for large molecular systems using a composite density functional– semiempirical Hamiltonian. J. Chem. Phys., 113(14):5604–5613, 2000. [41] M.S. Gordon, M.A. Freitag, P. Bandyopadhyay, J.H. Jensen, V. Kairys, and W.J. Stevens. The effective fragment potential method: A QM-based MM approach to modeling environmental effects in chemistry. J. Phys. Chem. A, 105(2):293–307, 2001. [42] M.S. Gordon, L. Slipchenko, H. Li, and J.H. Jensen. Chapter 10 the effective fragment potential: A general method for predicting intermolecular interactions. In D.C. Spellmeyer and R. Wheeler, editors, Annual Reports in Computational Chemistry, volume 3 of Annual Reports in Computational Chemistry, pages 177 – 193. Elsevier, 2007. [43] T. Helgaker, P. Jørgensen, and J. Olsen. Molecular electronic structure theory. Wiley & Sons, 2000. [44] J.R. Hill and J. Sauer. Molecular mechanics potential for silica and zeolite catalysts based on ab initio calculations. 2. Aluminosilicates. J. Phys. Chem., 99(23):9536–9550, 1995. [45] J. Hoffner, M.J. Schottelius, D. Feichtinger, and P. Chen. J. Am. Chem. Soc., 120:376, 1998. [46] C.A. Hunter and J.K.M. Sanders. The nature of π − π interactions. J. Am. Chem. Soc., 112(14):5525–5534, 1990. [47] ISO/IEC. ISO/IEC 14882:2003: Programming languages: C++. Technical report, International Organization for Standardization, Geneva, Switzerland., 2003.

107 [48] J.H. Jensen and M.S. Gordon. An approximate formula for the intermolecular Pauli repulsion between closed shell molecules. Mol. Phys., 89(5):1313–1325, 1996. [49] J.H. Jensen and M.S. Gordon. An approximate formula for the intermolecular Pauli repulsion between closed shell molecules. II. Application to the effective fragment potential method. J. Chem. Phys., 108(12):4772–4782, 1998. [50] R.R. Jones and R.G. Bergman. p-Benzyne. Generation as an intermediate in a thermal isomerization reaction and trapping evidence for the 1,4-benzenediyl structure. J. Am. Chem. Soc., 94:660, 1972. [51] W.L. Jorgensen and N.A. McDonald. Development of an all-atom force field for heterocycles. Properties of liquid pyridine and diazenes. J. Molec. Struct. (Theochem), 424(1-2):145 – 155, 1998. [52] K. Kitaura, E. Ikeo, T. Asada, T. Nakano, and M. Uebayasi. Fragment molecular orbital method: An approximate computational method for large molecules. Chem. Phys. Lett., 313(3-4):701 – 706, 1999. [53] R. Krishnan, J.S. Binkley, R. Seeger, and J.A. Pople. Self-consistent molecular orbital methods. XX. A basis set for correlated wave functions. J. Chem. Phys., 72:650, 1980. [54] A.I. Krylov. Size-consistent wave functions for bond-breaking: The equation-of- motion spin-flip model. Chem. Phys. Lett., 338:375–384, 2001. [55] M.D. Lee, T.S. Dunne, M.M. Siegel, C.C. Chang, G.0. Morton, and D.B. Borders. Calichemicins, a novel family of antitumor antibiotics. 1. Chemistry and partial structure of calichemicin. J. Am. Chem. Soc., 109:3464, 1987. [56] K.R. Leopold, G.T. Fraser, S.E. Novick, and W. Klemperer. Current themes in microwave and infrared spectroscopy of weakly bound complexes. Chem. Rev., 94(7):1807–1827, 1994. [57] S.V. Levchenko and A.I. Krylov. Equation-of-motion spin-flip coupled-cluster model with single and double substitutions: Theory and application to cyclobu- tadiene. J. Chem. Phys., 120(1):175–185, 2004. [58] H. Li and M.S. Gordon. Gradients of the exchange-repulsion energy in the general effective fragment potential method. Theor. Chem. Acc., 115(5):385–390, 2006. [59] H. Li and M.S. Gordon. Polarization energy gradients in combined quantum mechanics, effective fragment potential, and polarizable continuum model calculations. J. Chem. Phys., 126(12), 2007.

108 [60] H. Li, M.S. Gordon, and J.H. Jensen. Charge transfer interaction in the effective fragment potential method. J. Chem. Phys., 124(21), 2006.

[61] H. Li, H.M. Netzloff, and M.S. Gordon. Gradients of the polarization energy in the effective fragment potential method. J. Chem. Phys., 125(19), 2006.

[62] H. Lin and D.G. Truhlar. QM/MM: what have we learned, where are we, and where do we go from here? Theor. Chem. Acc., 117(2):185–199, 2007.

[63] G. Lippert, J. Hutter, and M. Parrinello. A hybrid gaussian and plane wave density functional scheme. Mol. Phys., 92(3):477–487, 1997.

[64] P.-O. Lowdin.¨ Rev. Mod. Phys., 35:496, 1963.

[65] M. Lundber and P.E.M. Siegbahn. Quantifying the effects of the self-interaction error in DFT: When do the delocalized states appear? J. Chem. Phys., 122(224103):1–9, 2005.

[66] A.D. MacKerell, D. Bashford, Bellott, R.L. Dunbrack, J.D. Evanseck, M.J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F.T.K. Lau, C. Mattos, S. Michnick, T. Ngo, D.T. Nguyen, B. Prodhom, W.E. Reiher, B. Roux, M. Schlenkrich, J.C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus. All-atom empirical potential for molecular modeling and dynamics studies of proteins. J. Phys. Chem. B, 102(18):3586–3616, 1998.

[67] H. Maeda, K. Edo, and N. Ishida, editors. Neocarzinostatin: the past, present, and future of an anticancer drug. Springer, New York, 1997.

[68] S.L. Mayo, B.D. Olafson, and W.A. Goddard. DREIDING - a generic force-ﬁeld for molecular simulations. J. Phys. Chem., 94(26):8897–8909, 1990.

[69] A.D. McLachlan. Three-body dispersion forces. Mol. Phys., 6(4):423–427, 1963.

[70] G.N. Merrill and M.S. Gordon. Study of small water clusters using the effective fragment potential model. J. Phys. Chem. A, 102(16):2650–2657, 1998.

[71] G.N. Merrill and S.P. Webb. Anion-water clusters A(-)(H2O)(1-6), A = OH, F, SH, Cl, and Br. an effective fragment potential test case. J. Phys. Chem. A, 107(39):7852–7860, 2003.

[72] G.N. Merrill and S.P. Webb. The application of the effective fragment potential method to molecular anion solvation: A study of ten oxyanion-water clusters, A(-)(H2O)(1-4). J. Phys. Chem. A, 108(5):833–839, 2004.

109 [73] G.N. Merrill, S.P. Webb, and D.B. Bivin. Formation of alkali metal/alkaline earth cation water clusters, M(H2O)(1-6), M = Li+, Na+, K+, Mg2+, and Ca2+: An effective fragment potential (EFP) case study. J. Phys. Chem. A, 107(3):386–396, 2003.

[74] G. Monard and K.M. Merz. Combined quantum mechanical/molecular mechanical methodologies applied to biomolecular systems. Acc. Chem. Res., 32(10):904–911, 1999.

[75] K. Mueller-Dethlefs, O. Dopfer, and T.G. Wright. ZEKE spectroscopy of complexes and clusters. Chem. Rev., 94(7):1845–1871, 1994.

[76] K. Muller-Dethlefs and P. Hobza. Noncovalent interactions: A challenge for experiment and theory. Chem. Rev., 100(1):143–168, 2000.

[77] R.S. Mulliken. Electronic population analysis on LCAO-MO molecular wave functions. I. J. Chem. Phys., 23(10):1833–1840, 1955.

[78] R.S. Mulliken. Report on notation for the spectra of polyatomic molecules. J. Chem. Phys., 23:1997, 1955.

[79] T.E. Munsch, L.V. Slipchenko, A.I. Krylov, and P.G. Wenthold. Reactivity and structure of the 5-dehydro-m-xylylene anion. J. Org. Chem., 69:5735–5741, 2004.

[80] J.J. Nash and R.R. Squires. Theoretical studies of o-,m-, and p-benzyne negative ions. J. Am. Chem. Soc., 118:11872–11883, 1996.

[81] H.M. Netzloff and M.S. Gordon. The effective fragment potential: Small clusters and radial distribution functions. J. Chem. Phys., 121(6):2711–2714, 2004.

[82] K.C. Nicolaou and A.L. Smith. Molecular design, chemical synthesis, and biological action of enediynes. Acc. Chem. Res., 25:497–503, 1992.

[83] M. Nooijen and R.J. Bartlett. Equation of motion coupled cluster method for electron attachment. J. Chem. Phys., 102:3629–3647, 1995.

[84] K.A. Peterson and T.H. Dunning. Benchmark calculations with correlated molecular wave-functions. 7. Binding-energy and structure of the hf dimer. J. Chem. Phys., 102(5):2032–2041, 1995.

[85] K.A. Peterson and T.H. Dunning. Benchmark calculations with correlated molecular wave functions. 11. Energetics of the elementary reactions F+H-2, O+H-2, and H’+HCl. J. Phys. Chem. A, 101(35):6280–6292, 1997.

110 [86] K.A. Peterson and T.H. Dunning. Benchmark calculations with correlated molecular wave functions. 8. Bond energies and equilibrium geometries of the chn and c2hn (n=1-4) series. J. Chem. Phys., 106(10):4119–4140, 1997.

[87] K.A. Peterson, R.A. Kendall, and T.H. Dunning. Benchmark calculations with correlated molecular wave-functions. 2. Conﬁguration-interaction calculations on 1st-row diatomic hydrides. J. Chem. Phys., 99(3):1930–1944, 1993.

[88] K.A. Peterson, R.A. Kendall, and T.H. Dunning. Benchmark calculations with correlated molecular wave-functions. 3. Conﬁguration-interaction calculations on 1st row homonuclear diatomics. J. Chem. Phys., 99(12):9790–9805, 1993.

[89] K.A. Peterson, A.K. Wilson, D.E. Woon, and T.H. Dunning. Benchmark calculations with correlated molecular wave functions. 12. Core correlation effects on the homonuclear diatomic molecules b-2-f-2. Theor. Chem. Acc., 97(1-4):251– 259, 1997.

[90] K.A. Peterson, D.E. Woon, and T.H. Dunning. Benchmark calculations with correlated molecular wave-functions. 4. The classical barrier height of the h+h-2- ]h-2+h reaction. J. Chem. Phys., 100(10):7410–7415, 1994.

[91] R. Podeszwa. Comment on ”Beyond the benzene dimer: An investigation of the additivity of π − π interactions”. J. Phys. Chem. A, 112(37):8884–8885, 2008.

[92] R. Podeszwa, R. Bukowski, and K. Szalewicz. Potential energy surface for the benzene dimer and perturbational analysis of π − π interactions. J. Phys. Chem. A, 110(34):10345–10354, 2006.

[93] R. Podeszwa and K. Szalewicz. Three-body symmetry-adapted perturbation theory based on kohn-sham description of the monomers. J. Chem. Phys., 126(19):194101, 2007.

[94] V. Polo, E. Kraka, and D. Cremer. Electron correlation and the self-interaction error of density functional theory. Molecular Physics, 100:1771 – 1790, 2002.

[95] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery. Numerical Recipes 3rd Edition: The Art of Scientiﬁc Computing. Cambridge University Press, New York, NY, USA, 2007.

[96] G.D. Purvis and R.J. Bartlett. A full coupled-cluster singles and doubles model: The inclusion of disconnected triples. J. Chem. Phys, 76:1910, 1982.

[97] J.R. Rabinowitz, K. Namboodiri, and H. Weinstein. A ﬁnite expansion method for the calculation and interpretation of molecular electrostatic potentials. Int. J. Quant. Chem., 29(6):1697–1704, 1986.

111 [98] K. Raghavachari, G.W. Trucks, J.A. Pople, and M. Head-Gordon. A ﬁfth-order perturbation comparison of electron correlation theories. Chem. Phys. Lett., 157:479–483, 1989. [99] D.R. Reed, M. Hare, and S.R. Kass. Formation of gas-phase dianions and distonic ions as a general method for the synthesis of protected reactive intermediates. Energetics of 2,3- and 2,6-dehydronaphthalene. J. Am. Chem. Soc., 122:10689– 10696, 2000. [100] R.C. Rizzo and W.L. Jorgensen. OPLS all-atom model for amines: Resolution of the amine hydration problem. J. Am. Chem. Soc., 121(20):4827–4836, 1999. [101] G.V. Rossum. The Python Language Reference Manual. Network Theory Ltd., 2003. [102] N.J. Russ, T.D. Crawford, and G.S. Tschumper. Real versus artifactual symmetry breaking effects in Hartree-Fock, density-functional, and coupled-cluster methods. J. Chem. Phys., 120:7298–7306, 2004. [103] W. Saenger. Principles of Nucleic Acid Structure. Springer-Verlag: New York, 1984. [104] A. Sanz-Garcia and R.J. Wheatley. Atomic representation of the dispersion interaction energy. ChemPhysChem, 5(5):801–807, 2003. [105] K.E. Schmidt and M.A. Lee. Implementing the fast multipole method in three dimensions. J. Stat. Phys., 63(5):1223–1235, 1991. [106] M.W. Schmidt, K.K. Baldridge, S.T. Elbert J.A. Boatz, M.S. Gordon, S. Koseki J.H. Jensen, N. Mastunaga, K.A. Nguyen, T.L. Windus S. Su, M. Dupuis, and J.A. Montgomery. General atomic and molecular electronic structure system. J. Comput. Chem., 14(11):1347–1363, 1993. [107] M.J. Schottelius and P. Chen. Angew. Chem. Int. Ed. Engl., 3:1478, 1996. [108] M. Schutz,¨ G. Hetzer, and H. Werner. Low-order scaling local electron correlation methods. I. Linear scaling local MP2. J. Chem. Phys., 111(13):5691–5705, 1999. [109] E. Schwegler and M. Challacombe. Linear scaling computation of the Hartree- Fock exchange matrix. J. Chem. Phys., 105(7):2726–2734, 1996. [110] G.E. Scuseria and P.Y. Ayala. Linear scaling coupled cluster and perturbation theories in the atomic orbital basis. J. Chem. Phys., 111(18):8330–8343, 1999. [111] Y. Shao, M. Head-Gordon, and A.I. Krylov. The spin-ﬂip approach within time- dependent density functional theory: Theory and applications to diradicals. J. Chem. Phys., 118:4807–4818, 2003.

112 [112] Y. Shao, L.F. Molnar, Y. Jung, J. Kussmann, C. Ochsenfeld, S. Brown, A.T.B. Gilbert, L.V. Slipchenko, S.V. Levchenko, D. P. O’Neil, R.A. Distasio Jr., R.C. Lochan, T. Wang, G.J.O. Beran, N.A. Besley, J.M. Herbert, C.Y. Lin, T. Van Voorhis, S.H. Chien, A. Sodt, R.P. Steele, V. A. Rassolov, P. Maslen, P.P. Koram- bath, R.D. Adamson, B. Austin, J. Baker, E.F.C. Bird, H. Daschel, R.J. Doerk- sen, A. Drew, B.D. Dunietz, A.D. Dutoi, T.R. Furlani, S.R. Gwaltney, A. Hey- den, S. Hirata, C.-P. Hsu, G.S. Kedziora, R.Z. Khalliulin, P. Klunziger, A.M. Lee, W.Z. Liang, I. Lotan, N. Nair, B. Peters, E.I. Proynov, P.A. Pieniazek, Y.M. Rhee, J. Ritchie, E. Rosta, C.D. Sherrill, A.C. Simmonett, J.E. Subotnik, H.L. Woodcock III, W. Zhang, A.T. Bell, A.K. Chakraborty, D.M. Chipman, F.J. Keil, A. Warshel, W.J. Herhe, H.F. Schaefer III, J. Kong, A.I. Krylov, P.M.W. Gill, and M. Head-Gordon. Advances in methods and algorithms in a modern quantum chemistry program package. Phys. Chem. Chem. Phys., 8:3172–3191, 2006.

[113] C.D. Sherrill, A.I. Krylov, E.F.C. Byrd, and M. Head-Gordon. Energies and analytic gradients for a coupled-cluster doubles model using variational Brueckner + orbitals: Application to symmetry breaking in O4 . J. Chem. Phys., 109:4171, 1998.

[114] M. Sierka and J. Sauer. Finding transition structures in extended systems: A strategy based on a combined quantum mechanics-empirical valence bond approach. J. Chem. Phys., 112(16):6983–6996, 2000.

[115] M.O. Sinnokrot and C.D. Sherrill. High-accuracy quantum mechanical studies of π − π interactions in benzene dimers. J. Phys. Chem. A, 110(37):10656–10668, 2006.

[116] M.O. Sinnokrot, E.F. Valeev, and C.D. Sherrill. Estimates of the ab initio limit for π−π interactions: The benzene dimer. J. Am. Chem. Soc., 124(36):10887–10893, 2002.

[117] L.V. Slipchenko and M.S. Gordon. Electrostatic energy in the effective fragment potential method: Theory and application to benzene dimer. J. Comput. Chem., 28(1):276–291, 2007.

[118] L.V. Slipchenko and M.S. Gordon. Damping functions in the effective fragment potential method. Mol. Phys., 2009.

[119] L.V. Slipchenko and A.I. Krylov. Singlet-triplet gaps in diradicals by the spin-ﬂip approach: A benchmark study. J. Chem. Phys., 117:4694–4708, 2002.

[120] L.V. Slipchenko and A.I. Krylov. Efﬁcient strategies for accurate calculations of electronic excitation and ionization energies: Theory and application to the dehydro-meta-xylylene anion. J. Phys. Chem. A, 110:291–298, 2006.

113 [121] T. Smith, L.V. Slipchenko, and M.S. Gordon. Modeling π − π interactions with the effective fragment potential method: The benzene dimer and substituents. J. Phys. Chem. A, 112(23):5286–5294, 2008.

[122] D.B. Smithrud and F. Diederich. Strength of molecular complexation of apolar solutes in water and in organic solvents is predictable by linear free energy rela- tionships: A general model for solvation effects on apolar binding. J. Am. Chem. Soc., 112(1):339–343, 1990.

[123] J.F. Stanton. Coupled-cluster theory, pseudo-Jahn-Teller effects and conical intersections. J. Chem. Phys., 115:10382–10393, 2001.

[124] J.F. Stanton, J. Gauss, J.D. Watts, W.J. Lauderdale, and R.J. Bartlett. ACES II, 1993. The package also contains modiﬁed versions of the MOLECULE Gaussian integral program of J. Almlof¨ and P.R. Taylor, the ABACUS integral derivative program written by T.U. Helgaker, H.J.Aa. Jensen, P. Jørgensen and P.R. Taylor, and the PROPS property evaluation integral code of P.R. Taylor.

[125] A.J. Stone. Distributed multipole analysis, or how to describe a molecular charge distribution. Chem. Phys. Lett., 83(2):233 – 239, 1981.

[126] A.J. Stone. In The Theory of Intermolecular Forces, pages 50 – 62. Oxford Uni- versity Press, 1997.

[127] A.J. Stone. Distributed multipole analysis: Stability for large basis sets. J. Chem. Theory Comput., 1(6):1128–1132, 2005.

[128] A.J. Stone and M. Alderton. Distributed multipole analysis - methods and applications. Mol. Phys., 56(5):1047–1064, 1985.

[129] K.T. Tang and J.P. Toennies. An improved simple model for the van der Waals potential based on universal damping functions for the dispersion coefﬁcients. J. Chem. Phys., 80(8):3726–3741, 1984.

[130] T.P. Tauer and C.D. Sherrill. Beyond the benzene dimer: An investigation of the additivity of π − π interactions. J. Phys. Chem. A, 109(46):10475–10478, 2005.

[131] S. Tsuzuki, T. Uchimaru, K. Matsumura, M. Mikami, and K. Tanabe. Effects of the higher electron correlation correction on the calculated intermolecular interaction energies of benzene and naphthalene dimers: comparison between MP2 and CCSD(T) calculations. Chem. Phys. Lett., 319(5-6):547 – 554, 2000.

[132] V. Vanovschi, A.I. Krylov, and P.G. Wenthold. Structure, vibrational frequencies, ionization energies, and photoelectron spectrum of the para-benzyne radical anion. Theor. Chim. Acta, 120:45–58, 2008.

114 [133] H.H. Wenk, M. Winkler, and W. Sander. One century of aryne chemistry. Angew. Chem. Int. Ed. Engl., 42:502–528, 2003.

[134] P.G. Wenthold, J. Hu, and R.R. Squires. J. Am. Chem. Soc., 118:11865–11871, 1996.

[135] P.G. Wenthold, R.R. Squires, and W.C. Lineberger. Ultraviolet photoelectron spectroscopy of the o-, m-, and p-benzyne negative ions. Electron afﬁnities and singlet-triplet splittings for o-, m-, and p-benzyne. J. Am. Chem. Soc., 120:5279– 5290, 1998.

[136] C.A. White, B.G. Johnson, P.M.W. Gill, and M. Head-Gordon. Linear scaling density functional calculations via the continuous fast multipole method. Chem. Phys. Lett., 253(3-4):268–278, 1996.

[137] A.K. Wilson and T.H. Dunning. Benchmark calculations with correlated molecular wave functions. 10. Comparison with ”exact” mp2 calculations on ne, hf, h2o, and n-2. J. Chem. Phys., 106(21):8718–8726, 1997.

[138] M. Wladyslawski and M. Nooijen. The photoelectron spectrum of the NO3 radical revisited: A theoretical investigation of potential energy surfaces and conical intersections. In ACS Symposium Series, volume 828, pages 65–92, 2002.

[139] M. Wolfsberg and L. Helmholz. The spectra and electronic structure of the tetra- hedral ions MnO4-, CrO4–, and ClO4-. J. Chem. Phys., 20(5):837–843, 1952.

[140] D. E. Woon and T. H. Dunning Jr. Gaussian basis sets for use in correlated molecular calculations. iv. calculation of static electrical response properties. J. Chem. Phys., 100, 1994.

[141] D.E. Woon. Benchmark calculations with correlated molecular wave-functions. 5. The determination of accurate ab-initio intermolecular potentials for he-2, ne- 2, and ar-2. J. Chem. Phys., 100(4):2838–2850, 1994.

[142] D.E. Woon and T.H. Dunning. Benchmark calculations with correlated molecular wave-functions. 1. Multireference conﬁguration-interaction calculations for the 2nd-row diatomic hydrides. J. Chem. Phys., 99(3):1914–1929, 1993.

[143] D.E. Woon and T.H. Dunning. Benchmark calculations with correlated molecular wave-functions. 6. 2nd-row-A(2) and ﬁrst-row 2nd-row-AB diatomic-molecules. J. Chem. Phys., 101(10):8877–8893, 1994.

[144] D.E. Woon, K.A. Peterson, and T.H. Dunning. Benchmark calculations with correlated molecular wave functions. IX. The weakly bound complexes Ar-H-2 and Ar-HCl. J. Chem. Phys., 109(6):2233–2241, 1998.

115 [145] X. Ye, Z. Li, W. Wang, K. Fan, W. Xu, and Z. Hua. The parallel π − π stacking: a model study with MP2 and DFT methods. Chem. Phys. Lett., 397(1-3):56 – 61, 2004.

[146] S. Yoo, F. Zahariev, S. Sok, and M.S. Gordon. Solvent effects on optical properties of molecules: A combined time-dependent density functional theory/effective fragment potential approach. J. Chem. Phys., 129(14):8, 2008.

[147] T. Zeidan, M. Manoharan, and I.V. Alabugin. Ortho effect in the Bergman cyclization: Interception of p-benzyne intermediate by intramolecular hydrogen abstraction. J. Org. Chem., 71:954–961, 2006.

[148] Y. Zhang and W. Yang. A challenge for density functionals: Self-interaction error increases for systems with a noninteger number of electrons. J. Chem. Phys., 109(7):2604–2608, 1998.

116 Appendix A

GAMESS input for generating EFP parameters of benzene

The GAMESS input ﬁle for generation the EFP parameters at the HF/6-311G++(3df,2p) level of theory for the benzene monomer used in this work is given below.

$CONTRL RUNTYP=MAKEFP ISPHER=1 $END $SYSTEM TIMLIM=99999 MWORDS=150 $END $BASIS GBASIS=N311 NGAUSS=6 NFFUNC=1 NDFUNC=3 NPFUNC=2 DIFFSP=.T. DIFFS=.T. $END $SCF SOSCF=.F. DIIS=.T. $END $STONE BIGEXP=4.0 $END $DAMP IFTTYP(1)=3,2 IFTFIX(1)=1,1 THRSH=500 $END $DAMPGS C3=C1 H4=H2 C5=C1 H6=H2 C7=C1 H8=H2 C9=C1 H10=H2 C11=C1 H12=H2 BO43=BO21 BO53=BO31 BO65=BO21 BO75=BO31 BO87=BO21 BO97=BO31 BO109=BO21 BO111=BO31 BO119=BO31 BO1211=BO21 $END $DATA benzene

117 C1 C1 6.0 -1.3942300 0.00000000 0.0000000 H2 1.0 -2.4764870 0.00000000 0.0000000 C3 6.0 -0.6971150 0.00000000 1.2074388 H4 1.0 -1.2382435 0.00000000 2.1447002 C5 6.0 0.6971150 0.00000000 1.2074388 H6 1.0 1.2382435 0.00000000 2.1447002 C7 6.0 1.3942300 0.00000000 -0.0000000 H8 1.0 2.4764870 0.00000000 0.0000000 C9 6.0 0.6971150 0.00000000 -1.2074388 H10 1.0 1.2382435 0.00000000 -2.1447002 C11 6.0 -0.6971150 0.00000000 -1.2074388 H12 1.0 -1.2382435 0.00000000 -2.1447002 $END

118 Appendix B

Geometries of benezene oligomers

Geometries of all the benzene oligomers studied in this work are provided below in the format of the Q-Chem $efp fragments section. Fig. 1.16 provides the deﬁnition of the format.

Sandwich (S) structure of benzene dimer: benzene 0.0 0.0 0.0 benzene 0.0 4.0 0.0

T-shaped (T) structure of benzene dimer: benzene 0.0 0.0 0.0 1.57079633 1.57079633 1.57079633 benzene 0.0 5.2 0.0 1.57079633 1.57079633 0.00000000

Parallel-displaced (PD) structure of benzene dimer: benzene 0.0 0.0 0.0 benzene 1.2 3.9 0.0

S benzene trimer: benzene 0.0 0.0 0.0 benzene 0.0 4.0 0.0 benzene 0.0 8.0 0.0

T1 benzene trimer: benzene 0.0 0.0 0.0 1.57079633 1.57079633 1.57079633 benzene 0.0 5.2 0.0 1.57079633 1.57079633 0.00000000 benzene 0.0 10.4 0.0 1.57079633 1.57079633 1.57079633

T2 benzene trimer: benzene 0.0 0.0 0.0 1.57079633 1.57079633 1.57079633 benzene 0.0 5.2 0.0 1.57079633 1.57079633 0.00000000 benzene 0.0 -5.2 0.0 1.57079633 1.57079633 0.00000000

PD benzene trimer:

119 benzene 0.0 0.0 0.0 benzene 1.2 3.9 0.0 benzene 0.0 7.8 0.0 C benzene trimer: benzene 2.6 0.0 0.0 -0.25943951 0.00000000 0.00000000 benzene -2.6 0.0 0.0 1.83495559 0.00000000 0.00000000 benzene 0.0 -4.503332 0.0 3.92935069 0.00000000 0.00000000 PD-T benzene trimer: benzene 0.0 0.0 0.0 1.57079633 1.57079633 1.57079633 benzene 0.0 5.2 0.0 1.57079633 1.57079633 0.00000000 benzene 0.0 -3.9 1.2 1.57079633 1.57079633 1.57079633 PD-S benzene trimer: benzene 0.0 -4.0 0.0 benzene 0.0 0.0 0.0 benzene 1.2 3.9 0.0 S-T benzene trimer: benzene 0.0 0.0 0.0 1.57079633 1.57079633 1.57079633 benzene 0.0 5.2 0.0 1.57079633 1.57079633 0.00000000 benzene 0.0 -4.0 0.0 1.57079633 1.57079633 1.57079633 S-10 benzene oligomer (larger S-oligomers are constructed similarly): benzene 0.0 0.0 0.0 benzene 0.0 4.0 0.0 benzene 0.0 8.0 0.0 benzene 0.0 12.0 0.0 benzene 0.0 16.0 0.0 benzene 0.0 20.0 0.0 benzene 0.0 24.0 0.0 benzene 0.0 28.0 0.0 benzene 0.0 32.0 0.0 benzene 0.0 36.0 0.0 T-10 benzene oligomer (larger T-oligomers are constructed similarly): benzene 0.0 0.0 0.0 1.57079633 1.57079633 1.57079633 benzene 0.0 5.2 0.0 1.57079633 1.57079633 0.00000000 benzene 0.0 10.4 0.0 1.57079633 1.57079633 1.57079633 benzene 0.0 15.6 0.0 1.57079633 1.57079633 0.00000000 benzene 0.0 20.8 0.0 1.57079633 1.57079633 1.57079633 benzene 0.0 26.0 0.0 1.57079633 1.57079633 0.00000000 benzene 0.0 31.2 0.0 1.57079633 1.57079633 1.57079633 benzene 0.0 36.4 0.0 1.57079633 1.57079633 0.00000000 benzene 0.0 41.6 0.0 1.57079633 1.57079633 1.57079633 benzene 0.0 46.8 0.0 1.57079633 1.57079633 0.00000000

120 PD-10 benzene oligomer (larger PD-oligomers are constructed similarly): benzene 0.0 0.0 0.0 benzene 1.2 3.9 0.0 benzene 0.0 7.8 0.0 benzene 1.2 11.7 0.0 benzene 0.0 15.6 0.0 benzene 1.2 19.5 0.0 benzene 0.0 23.4 0.0 benzene 1.2 27.3 0.0 benzene 0.0 31.2 0.0 benzene 1.2 35.1 0.0

121 Appendix C

Converter of the EFP parameters

A script, which converts the GAMESS input parameters to the Q-Chem format is given below:

#!/usr/bin/env python """EFP parameters converter. Converts from GAMESS to Q-Chem input format.

Usage: python converter.py """ import sys import re

# Bohrs to Angstroms convertion factor B2A = 0.529177249

__version__ = "1.15" __author__ = "Vitalii Vanovschi" if "set" not in globals(): from sets import Set as set

(FRAG, COORDS, MONOPOLES, DIPOLES, QUADRUPOLES, OCTUPOLES, \ BASIS, WAVEFUNCTION, FOCK, LMO, MULTIPLICITY, SCREEN2,\ POLARIZABLE, DYNAMIC, CANONVEC, CANONFOK, SCREEN3) = xrange(0, 17) def dump_qchem_frag(frag_name, labels, charges, dipoles, quadrupoles, \ octupoles, wavefunction, fock, centroids, screen2, \ ordered_labels, nuc_charges, pols, disps, basis, coordinates): """Outputs EFP fragment parameters in the Q-Chem format"""

print "fragment", frag_name

# output atoms for coordinate in coordinates: if coordinate[0][0:2] != "bo": print coordinate[0].rstrip("0123456789") + " " \ + " ".join(coordinate[1:])

# output atomic charges (as multipoles) for label in ordered_labels: coord = labels[label] if label in nuc_charges and float(nuc_charges[label]) > 0.1: print "mult ", " ".join(coord) print nuc_charges[label]

# output multipoles for label in ordered_labels:

122 coord = labels[label] print "mult ", " ".join(coord) if label in screen2: print "cdamp " + screen2[label][1] if label in charges: print charges[label] if label in dipoles: print " ".join(dipoles[label]) if label in quadrupoles: print " ".join(quadrupoles[label]) if label in octupoles: print " ".join(octupoles[label])

# output polarization points for pol in pols: print "pol ", pol[1], " ", pol[2], " ", pol[3] print " ".join(pol[4:])

# output dispersion points dispindexes = list(set([x[0] for x in disps])) dispindexes.sort() for i in dispindexes: isfirst = True for disp in disps: if disp[0] == i: if isfirst: print "disp ", disp[1], " ", disp[2], " ", disp[3] isfirst = False print disp[4], print

# output exchange-repulsion print "er_basis" print "er_wavefunction" wfindexes = wavefunction.keys() wfindexes.sort()

if len(wfindexes) * (len(wfindexes) + 1) != 2 * len(fock): raise AssertionError("Inconsistent input.\n" "The following assertion has failed:\n" "len(wfindexes) * (len(wfindexes) + 1) == 2 * len(fock)\n" "Values: len(wfindexes) = " + str(len(wfindexes)) + ", " "len(fock) = " + str(len(fock)) + ".\n" "Did you remove ’>’ from the last line of the FOCK MATRIX \n" "section in GAMESS input? (an old GAMESS bug)")

for ind in wfindexes: orb = wavefunction[ind] i = 0 for shell in basis: if shell == ’s’: i += 1 if shell == ’p’: i += 3 if shell == ’sp’ or shell == ’l’: i += 4 if shell == ’d’: tmp = orb[i:i+6] #10 -> 10 #11 -> 12 #12 -> 15 #13 -> 11 #14 -> 13

123 #15 -> 14 orb[i] = tmp[0] orb[i+1] = tmp[3] orb[i+2] = tmp[1] orb[i+3] = tmp[4] orb[i+4] = tmp[5] orb[i+5] = tmp[2] i += 6 if shell == ’f’: tmp = orb[i:i+10] orb[i] = tmp[0] #fxxx orb[i+1] = tmp[3] #fxxy orb[i+2] = tmp[5] #fxyy orb[i+3] = tmp[1] #fyyy orb[i+4] = tmp[4] #fxxz orb[i+5] = tmp[9] #fxyz orb[i+6] = tmp[6] #fyyz orb[i+7] = tmp[7] #fxzz orb[i+8] = tmp[8] #fyzz orb[i+9] = tmp[2] #fzzz i += 10 for i in range(len(orb)): if i % 5 == 0: print ind, if orb[i][0] == "-": print orb[i], else: print " " + orb[i], if i % 5 == 4: print print

print "er_fock_matrix" for i in range(len(fock)): print fock[i], if i % 5 == 4: print print print "er_lmos" for lmo in centroids: print " ".join(lmo) def parse_gamess(gamess_file): """Parses GAMESS input file"""

str2int = {"frag" : FRAG, "coordinates" : COORDS, "monopoles" : MONOPOLES, "dipoles" : DIPOLES, "quadrupoles" : QUADRUPOLES, "octupoles" : OCTUPOLES, "basis" : BASIS, "wavefunction" : WAVEFUNCTION, "fock" : FOCK, "lmo" : LMO, "multiplicity" : MULTIPLICITY, "screen2": SCREEN2, "polarizable": POLARIZABLE, "dynamic": DYNAMIC, "canonvec": CANONVEC, "canonfok": CANONFOK,

124 "screen3": SCREEN3 } frag_name = None ordered_labels = [] labels = {} coordinates = [] charges = {} nuc_charges = {} dipoles = {} quadrupoles = {} octupoles = {} screen2 = {} centroids = [] wavefunction = {} fock = [] expect = FRAG prefix = "" pols = [] disps = [] basis = [] print "$efp_params" for line1 in gamess_file: line_orig = line1 line1 = line1.strip()

if not line1 or line1.lower() == ’STOP’: continue

if line1[-1] == ">": prefix += " " + line1 continue else: line = prefix + " " + line1 prefix = ""

fragment_start = re.search("\$([A-Za-z0-9]+)", line) if fragment_start: text = fragment_start.groups()[0].lower() if text in ("efrag", "contrl"): continue

if text == "end" and frag_name: # print "frag_name:", frag_name # print "labels:", labels # print "dipoles:", dipoles # print "quadrupoles:", quadrupoles # print "octupoles:", octupoles # print "wavefunction:", wavefunction # print "fock:", fock # print "centroids:", centroids

dump_qchem_frag(frag_name, labels, charges, dipoles, quadrupoles, octupoles, wavefunction, fock, centroids, screen2, ordered_labels, nuc_charges, pols, disps, basis, coordinates)

frag_name = None labels = {} charges = {} nuc_charges = {} dipoles = {}

125 quadrupoles = {} octupoles = {} screen2 = {} centroids = [] coordinates = [] ordered_labels = [] wavefunction = {} fock = [] expect = FRAG prefix = "" pols = [] disps = [] basis = []

if text != "end": frag_name = text while 1: line = gamess_file.next() words = re.findall("([A-Za-z0-9\.+-]+)", line) if words[0].lower() == "coordinates": break expect = COORDS continue

words = re.findall("([A-Za-z0-9\.+-]+)", line) if words: firstword = words[0].lower() else: firstword = None

if firstword in str2int: expect = str2int[firstword] continue

if firstword == ’projection’ and len(words) > 1: expect = str2int[words[1].lower()] continue

if expect == COORDS and len(words) > 3: atom_x = str(float(words[1])*B2A) atom_y = str(float(words[2])*B2A) atom_z = str(float(words[3])*B2A) labels[firstword] = (atom_x, atom_y, atom_z) coordinates.append((firstword, atom_x, atom_y, atom_z)) ordered_labels.append(firstword)

if expect == MONOPOLES and len(words) > 1: charges[firstword] = words[1] nuc_charges[firstword] = words[2]

if expect == DIPOLES and len(words) == 4: dipoles[firstword] = words[1:]

if expect == QUADRUPOLES and len(words) == 7: quadrupoles[firstword] = words[1:]

if expect == OCTUPOLES and len(words) == 11: octupoles[firstword] = words[1:]

if expect == WAVEFUNCTION: # 3 1-1.40747045E-01 2.10323654E-01 2.42989533E-01-4.68907985E-01-1.03447318E-01 key = int(line_orig[:3].strip())

126 if key not in wavefunction: wavefunction[key] = [] values = line_orig[5:].rstrip() while values: wavefunction[key].append(values[:15].strip()) values = values[15:]

if expect == FOCK: fock += words

if expect == LMO and len(words) == 4: centroids.append([str(float(x)*B2A) for x in words[1:]])

if expect == SCREEN2: screen2[firstword] = words[1:]

if expect == POLARIZABLE: if len(words) == 4: pol_x = str(float(words[1])*B2A) pol_y = str(float(words[2])*B2A) pol_z = str(float(words[3])*B2A) pols.append([words[0], pol_x, pol_y, pol_z]) elif len(words) == 9: pols[-1].extend(words)

if expect == DYNAMIC: if len(words) == 5 or len(words) > 5 and words[5] == "--": index = int(words[1]) disp_x = str(float(words[2])*B2A) disp_y = str(float(words[3])*B2A) disp_z = str(float(words[4])*B2A) disps.append([index, disp_x, disp_y, disp_z]) elif len(words) == 9: disps[-1].append(sum([float(x) for x in words[:3]]) / 3)

if expect == BASIS: if firstword in (’s’, ’sp’, ’p’, ’d’, ’l’, ’f’): basis.append(firstword)

print "$end" if __name__ == "__main__": if len(sys.argv) != 2: print "Usage: python converter.py " sys.exit() INPUT = open(sys.argv[1]) parse_gamess(INPUT) INPUT.close()

127 Appendix D

Fragment-fragment electrostatic interaction

The equations for electrostatics inter-fragment interaction energy are presented below. k and l are the multipole expansion points. q is the charge, µ is the dipole, Θ is the quadrupole, Ω is the octupole, R is the distance between the expansion points k and l, a, b and c are the components of the distance. Charge-charge interaction energy:

k l ch−ch q q Ekl = (D.1) Rkl

Charge-dipole interaction energy:

k Px,y,z l ch−dip q a µaa Ekl = 3 (D.2) Rkl

Charge-quadrupole interaction energy:

k Px,y,z Px,y,z l ch−quad q a b Θabab Ekl = 5 (D.3) Rkl

Charge-octupole interaction energy:

k Px,y,z Px,y,z Px,y,z l ch−oct q a b c Ωabcabc Ekl = 7 (D.4) Rkl

128 Dipole-dipole interaction energy:

Px,y,z k l Px,y,z Px,y,z k l dip−dip a µaµb a b µaµbab Ekl = 3 − 3 5 (D.5) Rkl Rkl

Dipole-quadrupole interaction energy:

Px,y,z Px,y,z k l Px,y,z Px,y,z k Px,y,z l dip−quad a b Θabµab a b Θabab c µcc Ekl = −2 5 + 5 7 (D.6) Rkl Rkl

Quadrupole-quadrupole interaction energy:

Px,y,z Px,y,z Θk Θl Equad−quad = 2 a b ab ab kl 3R5 kl (D.7) Px,y,z Px,y,z k Px,y,z l Px,y,z Px,y,z k a b Θabb c Θacc a b Θabab −20 7 + 35 9 3Rkl 3Rkl

Nuclei-charge interaction energy:

I l nuc−ch Z q EIl = (D.8) rkl

Nuclei-dipole interaction energy:

I Px,y,z l nuc−dip Z a µaa EIl = 3 (D.9) Rkl

Nuclei-quadrupole interaction energy:

I Px,y,z Px,y,z l nuc−quad Z a b Θabab EIl = 5 (D.10) Rkl

Nuclei-octupole interaction energy:

I Px,y,z Px,y,z Px,y,z l nuc−oct Z a b c Ωabcabc EIl = 7 (D.11) Rkl

129 Appendix E

Buckingham multipoles

E.1 The conversion from the sperical to the Bucking-

ham multipoles

The Buckingham charge q in terms of the spherical multipole Q with l = 0 angular momentum:

q = Q0,0 (E.1)

The Buckingham dipole µ in terms of the spherical multipole Q with l = 1 angular momentum:

µx = −2Q11

µy = 2Q11¯ (E.2)

µz = Q10

130 The Buckingham quadrupole Θ in terms of the spherical multipole Q with l = 2 angular momentum:

Θxx = 6Q22 − Q20

Θxy = −6Q22¯

Θyy = −6Q22 − Q20 (E.3) Θxz = −3Q21

Θyz = 3Q21¯

Θzz = 2Q20

The Buckingham octupole Ω in terms of the spherical multipole Q with l = 3 angular momentum:

Ωxxx = −30Q33 + 6Q31

Ωxxy = 30Q33¯ − 2Q31¯

Ωxxz = −3Q30 + 10Q32

Ωxyy = 30Q33 + 6Q31

Ωxyz = −10Q32¯ (E.4) Ωxzz = −8Q31

Ωyyy = −30Q33¯ − 6Q31¯

Ωyyz = −3Q30 − 10Q32

Ωyzz = 8Q31¯

Ωzzz = 6Q30

131 E.2 The conversion from the Cartesian to the Bucking-

ham multipoles

. The Buckingham charge q in terms of the Cartesian charge q0:

q = q0 (E.5)

The Buckingham dipole µ in terms of the Cartesian dipole µ0:

0 µx = µx

0 µy = µy (E.6)

0 µz = µz

The Buckingham quadrupole Θ in terms of the Cartesian quadrupole Θ0:

Θ0 + Θ0 + Θ0 halftrace = xx yy zz 2 3 Θ = Θ0 − halftrace xx 2 xx 3 Θ = Θ0 xy 2 xy 3 Θ = Θ0 − halftrace (E.7) yy 2 yy 3 Θ = Θ0 xz 2 xz 3 Θ = Θ0 yz 2 yz 3 Θ = Θ0 − halftrace zz 2 zz

132 The Buckingham octupole Ω in terms of the Cartesian octupole Ω0:

0 0 0 traceX = Ωxxx + Ωxyy + Ωxzz

0 0 0 traceY = Ωxxy + Ωyyy + Ωyzz

0 0 0 traceZ = Ωxxz + Ωyyz + Ωzzz 5 3 Ω = Ω0 − traceX xxx 2 xxx 2 5 3 Ω = Ω0 − traceY xxy 2 xxy 2 5 3 Ω = Ω0 − traceZ xxz 2 xxz 2 5 1 Ω = Ω0 − traceY xyy 2 xyy 2 (E.8) 5 1 Ω = Ω0 − traceZ xyz 2 xyz 2 5 1 Ω = Ω0 − traceX xzz 2 xzz 2 5 1 Ω = Ω0 − traceX yyy 2 yyy 2 5 1 Ω = Ω0 − traceZ yyz 2 yyz 2 5 1 Ω = Ω0 − traceY yzz 2 yzz 2 5 Ω = Ω0 zzz 2 zzz

133 Appendix F

The Q-Chem input for EFP water cluster

The Q-Chem input ﬁle a for a molecular system with two effective fragment water molecules and another water moleucle in the the active region is presented below:

$molecule 0 1 O 2.98679899864 0.000000000000 0.000000000000 H 3.37280799847 0.408736299796 0.759199899653 H 3.37280799847 0.408736299796 -0.759199899653 $end

$rem basis 6-31G* exchange hf $end

$efp_fragments water 2.5 0.0 0.0 water 0.0 2.5 0.0 $end

$efp_params fragment water o 2.98679899864 0.000000000000 0.000000000000 h 3.37280799847 0.408736299796 0.759199899653 h 3.37280799847 0.408736299796 -0.759199899653 mult 2.98679899864 0.000000000000 0.000000000000 8.00000 mult 3.37280799847 0.408736299796 0.759199899653 1.00000 mult 3.37280799847 0.408736299796 -0.759199899653

aLines ending at a backslash symbol should be joined with the next line in the actual Q-Chem input.

134 1.00000 mult 2.98679899864 0.000000000000 0.000000000000 cdamp 2.432093890 -5.9880774156 0.9069997308 0.960401737 0.000000000 -2.0776300674 -2.055053110 -1.094067772 0.1972105926\ 0.0000000000 0.000000000 1.2431711201 1.307173512 0.0000000000 0.3882335020\ 0.0000000000 0.357964785 0.0000000000 0.3997408222\ 0.4232766191 0.000000000 mult 3.37280799847 0.408736299796 0.759199899653 cdamp 1.856169014 -0.6161272254 -0.0607832521 -0.0643620268 -0.1385305992 -0.2984353445 -0.2970178207 -0.2629326306 0.0123821246\ 0.0186582649 0.0197568195 -0.1419121249 -0.1496397602 -0.2838114377 -0.0466364802\ -0.1010324765 -0.0434504063 -0.0996406903 -0.0238456713\ -0.0252496482 0.0121573058 mult 3.37280799847 0.408736299796 -0.759199899653 cdamp 1.856168614 -0.6161272254 -0.0607832521 -0.0643620268 0.1385305992 -0.2984353445 -0.2970178207 -0.2629326306 0.0123821246\ -0.0186582649 -0.0197568195 -0.1419121249 -0.1496397602 0.2838114377 -0.0466364802\ 0.1010324765 -0.0434504063 0.0996406903 -0.0238456713\ -0.0252496482 -0.0121573058 mult 3.17980349856 0.204368149898 0.379599949853 cdamp 10.000000000 -1.3898340671 0.0671975530 0.0711539868 -0.0800483392 -0.7200126389 -0.7080787920 -0.3459299814 0.1042426037\ 0.1265126185 0.1339613836 0.0823824078 0.0860368352 -0.1633636547 0.0224998491\ -0.0684206741 0.0201192189 -0.0686914603 0.0517621518\ 0.0548097854 -0.0023653270 mult 3.17980349856 0.204368149898 -0.379599949853 cdamp 10.000000000 -1.3898340671 0.0671975530 0.0711539868 0.0800483392 -0.7200126389 -0.7080787926 -0.3459299814 0.1042426037\ -0.1265126185 -0.1339613836 0.0823824078 0.0860368352 0.1633636547 0.0224998491\ 0.0684206741 0.0201192189 0.0686914603 0.0517621518\

135 0.0548097854 0.0023653270 pol 3.225043346880 0.252271132154 -0.448809812124 0.3364082841 0.3771885556 2.0857377185 0.3562152772\ -0.6146554330 -0.6508454378 0.3562153645 -0.9006532748\ -0.9536819287 pol 3.22504358327 0.252271266354 0.448809784236 0.3364078945 0.3771881750 2.0857352855 0.3562148807\ 0.6146555083 0.6508456481 0.3562149890 0.9006533941\ 0.9536821303 pol 2.68786044569 0.0680155240902 0.000000000000 0.0546318425 0.2156829505 0.4678560525 0.0987167685\ -0.0000001586 -0.0000002666 0.1193632771 -0.0000002037\ -0.0000002725 pol 3.07178559579 -0.294563387213 0.000000000000 0.2027164107 0.0676015168 0.4678627585 0.1278414717\ 0.0000000833 0.0000000562 0.1071947676 0.0000000844\ 0.0000000709 disp 3.22504334688 0.252271132154 -0.448809812124 0.9330994836 0.9328329368 0.9312726852 0.9258111592\ 0.9104538267 0.8718871966 0.7834519894 0.6086537072\ 0.3529710151 0.1263476070 0.0217816395 0.0007641188 disp 3.22504358327 0.252271266354 0.448809784236 0.9330984158 0.9328318692 0.931271618733 0.92581009\ 0.9104527744 0.8718861726 0.7834510398 0.6086529349\ 0.3529705524 0.1263474433 0.0217816120 0.0007641179 disp 2.68786044569 0.068015524090 0.000000000000 0.2460531659 0.2459405339 0.2452820477 0.2429879390\ 0.2366256365 0.2211913563 0.1883852733 0.1320296167\ 0.0651263370 0.0192960190 0.0029443470 0.0001007762 disp 3.07178559579 -0.294563387213 0.000000000000 0.2460564460 0.2459438126 0.2452853183 0.2429911814\ 0.2366288005 0.2211943284 0.1883878314 0.1320314467\ 0.0651272751 0.0192963144 0.0029443942 0.0001007779 er_basis sto-3g er_wavefunction 1 9.12366435E-02 -2.55851988E-01 -2.74993695E-01 -2.91183082E-01 1 4.26891343E-01 1.09661524E-01 -5.16313349E-01 2 9.12367358E-02 -2.55852205E-01 -2.74994306E-01 -2.91183344E-01 2 -4.26890773E-01 -5.16313220E-01 1.09660818E-01

136 3 1.54680314E-01 -6.48520365E-01 7.87066922E-01 -1.96453849E-01 3 -1.20289533E-07 8.88260577E-02 8.88262341E-02 4 1.54680523E-01 -6.48521682E-01 -2.41107828E-01 7.74554226E-01 4 3.89668406E-08 8.88271814E-02 8.88271243E-02 er_fock_matrix -0.8097112963 -0.1811928224 -0.8097117054 -0.1951745608 -0.1951748560 -0.5598738582 -0.1951742273 -0.1951744336 -0.1696642207 -0.5598736581 er_lmos 3.22504334688 0.252271132154 -0.448809812124 3.22504358327 0.252271266354 0.448809784236 2.68786044569 0.068015524090 0.000000000000 3.07178559579 -0.294563387213 0.000000000000 $end

137 Appendix G

Q-Chem’s keywords affecting the EFP parameters computation

Q-Chem’s keywords affecting the computation of the electrostatics and polarization parameters are presented below in the format compartiblie with the Q-Chem manual: POL GEN Whether distributed polarizabilities should be computed for makeefp jobtype. TYPE: LOGICAL DEFAULT: TRUE OPTIONS: TRUE/FALSE RECOMMENDATION: none

POL PRINT BOHR Print polarizabilities in Bohr instead of Angstroms. TYPE: LOGICAL DEFAULT: FALSE OPTIONS: TRUE/FALSE RECOMMENDATION: none

138 POL DIFF TYPE Finite differentitation type TYPE: INTEGER DEFAULT: 2 OPTIONS: 2 two steps ﬁnite difference (for higher accuracy) 1 one step ﬁnite difference (faster) RECOMMENDATION: none

POL FIELD DELTA Electric ﬁled (step size for numerical differentiation of polarizabilities) TYPE: INTEGER DEFAULT: 100 OPTIONS: n n ∗ 10−7 step size. RECOMMENDATION: none

POL LMO FREEZE How many core orbitals to freeze for LMO computation TYPE: INTEGER DEFAULT: -1 OPTIONS: -1 freeze all core orbitals n freeze n core orbitals RECOMMENDATION: none

DMA GEN Whether to perform distributed multipole analysis (DMA) for jobtype makeefp. TYPE: LOGICAL DEFAULT: TRUE OPTIONS: TRUE/FALSE RECOMMENDATION: none

139 DMA MIDPOINTS Add bond midpoint to DMA expansion points on bond centers (in addtion to expansion points on nucleai) TYPE: LOGICAL DEFAULT: TRUE OPTIONS: TRUE/FALSE RECOMMENDATION: none

DAMP STO1G GUESS Whether to use STO1G guess procedure TYPE: LOGICAL DEFAULT: TRUE OPTIONS: TRUE/FALSE RECOMMENDATION: none

DAMP GEN EXP FIT Whether to compute exponential damping TYPE: LOGICAL DEFAULT: FALSE OPTIONS: TRUE/FALSE RECOMMENDATION: none

DAMP GEN GAUSS FIT Whether to compute gaussian damping TYPE: LOGICAL DEFAULT: FALSE OPTIONS: TRUE/FALSE RECOMMENDATION: none

140 DAMP EXP TOL Tolerance for exponential ﬁtting TYPE: INTEGER DEFAULT: 10 OPTIONS: n n ∗ 10−7 tolerance RECOMMENDATION: none

DAMP GAUSS TOL Tolerance for gaussian ﬁtting TYPE: INTEGER DEFAULT: 1000 OPTIONS: n n ∗ 10−7 tolerance RECOMMENDATION: none

DAMP GRID STEP Grid step for damping of electrostatic potential TYPE: INTEGER DEFAULT: 50 OPTIONS: n n/100 Angstroms RECOMMENDATION: none

DAMP RMIN Minimum distance from a nucleai to a grid point TYPE: INTEGER DEFAULT: 67 OPTIONS: n n/100 Angstroms RECOMMENDATION: none

141 DAMP RMAX Maximum distance from a nucleai to a grid point TYPE: INTEGER DEFAULT: 300 OPTIONS: n n/100 Angstroms RECOMMENDATION: none

142