
Automated structure generation for first-principles transition-metal catalysis

by Efthymios Ioannis Ioannidis

Diploma, Chemical Engineering, National Technical University of Athens (2013)
M.S., Chemical Engineering Practice, Massachusetts Institute of Technology (2014)

Submitted to the Department of Chemical Engineering in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Chemical Engineering Practice

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2018

© Massachusetts Institute of Technology 2018. All rights reserved.

Author: Department of Chemical Engineering, June 2018

Certified by: Heather J. Kulik, Joseph R. Mares ’24 Career Development Professor in Chemical Engineering, Thesis Supervisor

Accepted by: Patrick S. Doyle, Robert T. Haslam (1911) Professor of Chemical Engineering, Chairman, Committee for Graduate Students

Submitted to the Department of Chemical Engineering in June 2018, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Chemical Engineering Practice

Abstract

Efficient discovery of new catalytic materials necessitates the rapid but selective generation of candidate structures from a very wide chemical space and the efficient estimation of their properties. We developed an efficient and reliable utility for high-throughput screening of inorganic complexes that enables chemical discovery by automating molecular and intermolecular complex structure generation, job preparation, and post-processing analysis to elucidate correlations of electronic or geometric descriptors with energetics. The developed software was then used to unveil different binding modes of small anions on organometallic complexes as well as functionalizations that allow for selective binding. We additionally employed our materials design framework to study the binding of CO on functionalized metalloporphyrins, providing tuning strategies and uncertainty estimation.

Computational approaches such as density functional theory (DFT) that directly simulate the electronic properties have been increasingly used as tools for materials design, mainly due to recent developments in computational speed and accuracy. DFT recasts the many-body problem of interacting electrons into an equivalent problem of non-interacting electrons, greatly simplifying the solution procedure. This approach introduces certain approximations that are effectively modeled with an exchange and correlation functional that accounts for the many-body effects that are not included in the simplified problem. The functional choice is an important modeling decision, and computational predictions can therefore be sensitive to user selection. This sensitivity is greatest for systems with highly localized electrons, such as transition metals, due to self-interaction error, where one electron interacts with its own mean field, resulting in an unphysical delocalization of the electron density.
We studied extensively how the incorporation of the widely employed Hartree-Fock and meta-GGA-type exchange functionals affects DFT predictions on complexes.

Thesis Supervisor: Heather J. Kulik
Title: Joseph R. Mares ’24 Career Development Professor in Chemical Engineering

Acknowledgments

At the verge of completing my PhDCEP thesis I would like to express my gratitude to all the people that have helped me shape this amazing journey. First and foremost, I would like to thank my academic advisor, Prof. Heather Kulik who despite the short 3-year research period of the PhDCEP program provided me with all I needed to jump start the thesis and then let me construct my own path with proper and invaluable feedback and guidance. Being one of the first students in the Kulik lab and helping start up the group was a great experience and watching it gradually grow and take its own shape and character has been very fulfilling for me. I would like to additionally thank my thesis committee members Prof. Bill Green and Prof. Yuriy Roman for their continuous support in our thesis committee meetings and the advice they offered me during the term of this research project. I have been extremely fortunate to be part of a very special research group for the past 3 years. I would like to thank everybody who is or has been a member of the Kulik lab including Natasha, Lisi, Niladri, Qing, Helena, Yusu, Jeong Yun, JP and especially Terry for their valuable feedback and fruitful discussions on matters concerning various aspects of my thesis. I would also like to thank all my dear friends back home that have been supporting me these years I have been away. Despite the distance I am grateful that we are still close and I do consider you my life-long friends. Also, I am grateful for the new friends I made here in Boston, the Greek community and my friends from the chemical engineering department at MIT, all of them being special people that have helped me through the transition of living and studying in the US. I cherish every moment that I have spent with you and I am looking forward to even more exciting ones. I would not have been here if it weren’t for my parents, Liana and Ippokratis and my sisters Dioni and Eleonora. 
They have always been my driving force, my source of energy and motivation and have supported every decision that I have made in my life. This thesis is dedicated to you.

Contents

1 Introduction
  1.1 Density functional theory (DFT)
    1.1.1 Introduction to density functional theory
    1.1.2 Spin density functional theory
  1.2 DFT for transition metal catalysis
  1.3 Ligand field theories
    1.3.1 Crystal field theory
    1.3.2 Ligand field theory
  1.4 High-throughput screening
  1.5 Thesis outline

2 Effect of Hartree-Fock exchange
  2.1 Computational details
  2.2 Dependence of spin-state ordering on functional choice
  2.3 Dependence of spin-state ordering on HF exchange
    2.3.1 Spin-state ordering dependence with Fe(III) complex test cases
    2.3.2 Spin-state ordering: comparison with Fe(II) complexes
  2.4 Trends in charge localization measures
  2.5 Corroborating geometric and energetic relationships
  2.6 Quantitative vs. qualitative spin-state ordering
  2.7 Conclusions

3 Effect of meta-GGA exchange
  3.1 Theory
  3.2 Computational details
  3.3 Effect of meta-GGA exchange on single ions spin-state splittings
  3.4 Dependence of spin-state ordering on meta-GGA exchange
  3.5 Trends in charge localization measures
  3.6 Combined effect of HF exchange and meta-GGA exchange
  3.7 Conclusions

4 Automatic structure generation
  4.1 Code overview
  4.2 Code architecture
  4.3 Structure generation
    4.3.1 General approach
    4.3.2 Customized cores
    4.3.3 Modify function
  4.4 Additional features
    4.4.1 Random generation
    4.4.2 Database search
    4.4.3 Supramolecular complex building
    4.4.4 Simulation automation
    4.4.5 Structure-property correlation and analysis
  4.5 Benchmarking molSimplify
  4.6 Conclusions

5 Selective anion binding by functionalized organometallics
  5.1 Computational details
  5.2 Binding modes
  5.3 Selective binding
    5.3.1 Hydrogen bonding
    5.3.2 Additional correlations
  5.4 Conclusions

6 CO binding on metalloporphyrins
  6.1 Computational details
  6.2 Structures
  6.3 Binding energies
  6.4 Charge measures
  6.5 Sensitivity analysis
  6.6 Conclusions

7 Concluding remarks

8 Market analysis of the Catalysis industry: Capstone paper
  8.1 Introduction
  8.2 Types of Catalysts
  8.3 Catalyst market segments
  8.4 Environmental catalysts
  8.5 Refining catalysts
  8.6 Polymer catalysts
  8.7 Chemical catalysts
  8.8 Global Catalysis Market Trends
    8.8.1 Toward higher activity and selectivity of catalysts
    8.8.2 Changes in feedstock and more effective use of feedstock
    8.8.3 Lower Operating Temperatures
    8.8.4 Energy efficiency
    8.8.5 Creation of processes around catalyst technologies
    8.8.6 Gas-to-liquids (GTL) technologies
  8.9 Conclusions

List of Figures

1-1 Crystal field theory picture of orbital interaction
1-2 Orbital splitting in an octahedral field
1-3 High- and low-spin configurations for mid-row transition metals
1-4 Ligand field theory molecular orbital diagram
1-5 Examples of coordination complexes
1-6 Example of SMILES string and corresponding structure

2-1 Octahedral iron complex structures
2-2 Relative spin state ordering for Fe(III) octahedral complexes
2-3 Relative spin state ordering for Fe(II) octahedral complexes
2-4 Relative spin state ordering for Fe(II/III)-N complexes
2-5 Augmented data set with 5 additional octahedral structures
2-6 Spin-state sensitivity against NBO charges for Fe(III) complexes
2-7 Spin-state sensitivity against NBO charges for Fe(II) complexes
2-8 Spin-state dependence on HFX against NBO charge dependence
2-9 Derivative of spin-state splitting with HFX vs spin-state splitting

3-1 TPSS and mTPSS enhancement factors
3-2 Error of spin-state energies at 0% and 100% meta-GGAX for atoms
3-3 Spin sensitivity to meta-GGA exchange
3-4 Spin sensitivity versus splitting for Cr(II)-Co(III)
3-5 Spin sensitivity versus splitting for early- late-row TM
3-6 Spin-state sensitivity versus charge measures
3-7 Splitting versus charge sensitivity
3-8 HFX and meta-GGAX composite spin-state splitting
3-9 Complexes of the extended data set

4-1 Flowchart for molSimplify process
4-2 Example octahedral complex generation
4-3 Alignment procedure for bidentate and monodentate ligands
4-4 Example of custom core functionalization
4-5 Chemical discovery workflow
4-6 Supramolecular complex generation
4-7 RMS gradients for benchmark data set
4-8 Energy differences for benchmark data set with and without FF
4-9 Energy difference between molSimplify- and UFF-generated structures
4-10 Representative complexes with FF optimized lowest energy structures
4-11 Comparison of UFF and molSimplify structures for distorted complex

5-1 Formate initial guesses and binding modes
5-2 Formate binding modes
5-3 Functional groups for ferrocenium functionalization
5-4 Formate and perchlorate adsorption energies histogram
5-5 Hydrogen bonds for strongly and weakly binding Fc-anion complexes
5-6 Relative adsorption energies versus formate adsorption energy
5-7 Relative adsorption energies versus charge difference measures

6-1 Structure of metal tetraphenylporphyrin (MTPP)
6-2 MTPP and CO molecular orbitals
6-3 Bond length versus MTPP-L/CO dissociation energy
6-4 Orbital polarization by axial ligands
6-5 NAO descriptor for binding strength
6-6 Metal charge transfer versus binding strength
6-7 Bond dissociation energy sensitivity plot
6-8 Binding strength sensitivity versus spin-state sensitivity

8-1 Catalyst market shares by technology
8-2 Catalyst market shares by application
8-3 Catalyst market shares by application
8-4 Polymer catalyst market growth projection by subsegment
8-5 Trends in catalyst development in Hydrocracking

List of Tables

2.1 Ground states of octahedral coordination complexes
2.2 Difference in bond lengths at 20% HFX between HS and LS complexes
2.3 Dependence of HS-LS bond length differences on HFX

3.1 Experimental spin-state splittings for 8 first-row transition metals
3.2 SCAN vs TPSS results for representative Fe(II)L6 complexes
3.3 Charge vs energy sensitivity on meta-GGA exchange
3.4 Extended data set for meta-GGA calculations

4.1 Reported properties by the molSimplify post-processing module
4.2 Comparison of properties for UFF and molSimplify structures
4.3 Comparison of UFF and molSimplify Ni(II) structures

6.1 Five-coordinate metal distal ligand bond lengths
6.2 Six-coordinate metal distal ligand bond lengths
6.3 Six-coordinate metal-proximal CO ligand bond lengths
6.4 Relaxation energies for 6-coordinate MTPP-L/CO
6.5 Porphyrin-CO bond dissociation energies

8.1 Global chemical catalyst market (2016)

Abbreviations

BCP  Bond critical point
B3LYP  Becke, 3-parameter Lee-Yang-Parr
CFT  Crystal field theory
CASPT2  Complete active space perturbation theory
CLI  Command line interface
CN  Coordination number
COM  Center of mass
CT  Coordination template
CSD  Cambridge structural database
DFT  Density functional theory
ELF  Electron localization function
FG  Functional group
GGA  Generalized gradient approximation
GUI  Graphical user interface
HF  Hartree-Fock
HOMO  Highest occupied molecular orbital
HS  High-spin
IS  Intermediate-spin
KS  Kohn-Sham
LDA  Local density approximation
LFT  Ligand field theory
LS  Low-spin
LSDA  Local spin density approximation
LUMO  Lowest unoccupied molecular orbital
meta-GGA  meta generalized gradient approximation
ML  Metal-ligand
modB3LYP  modified B3LYP
NAO  Natural atomic orbital
NBO  Natural bond orbital
NIST  National institute of standards and technology
NPA  Natural population analysis
QTAIM  Quantum theory of atoms in molecules
RMSD  Root-mean square deviation
SCAN  Strongly constrained and appropriately normed
SCO  Spin crossover
SDF  Structure data file
SIE  Self interaction error
SMARTS  SMILES arbitrary target specification
SMILES  Simplified molecular input line entry system
TM  Transition metal
TPP  Tetraphenylporphyrin
TPSS  Tao, Perdew, Scuseria, Staroverov
UEG  Uniform electron gas
UFF  Universal force field
UKS  Unrestricted Kohn-Sham

Chapter 1

Introduction

Efficient discovery of new materials necessitates the rapid but selective generation of candidate structures from a very wide chemical space and the efficient estimation of their properties. Experimental discovery of new materials is limited by the cost and the time required to perform the experiments. Computational approaches [1] such as density functional theory (DFT) that directly simulate the electronic properties have been increasingly used as tools for materials design, mainly due to recent developments in computational speed and accuracy. DFT methods are now able to rapidly characterize candidate materials before they are synthesized, reducing both the cost and the time required for their development.

1.1 Density functional theory (DFT)

1.1.1 Introduction to density functional theory

The first-principles method we will primarily use in this work is density functional theory. DFT is among the most popular and versatile methods available in computational chemistry, since it combines accuracy with computational efficiency. The theoretical foundations for its development were set by the Hohenberg-Kohn theorems [2], which established that the ground-state properties of a system can be described in terms of its ground-state electronic density instead of the far more complicated wavefunction. Starting from the many-electron time-independent Schrödinger equation and employing the Born-Oppenheimer approximation, the equation becomes (in atomic units):

\[
\left[ -\frac{1}{2}\sum_{i=1}^{N}\nabla_i^2 - \sum_{i=1}^{N}\sum_{A=1}^{M}\frac{Z_A}{|\mathbf{r}_i-\mathbf{R}_A|} + \sum_{i=1}^{N}\sum_{j>i}^{N}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|} \right]\Psi = E\Psi, \tag{1.1}
\]

where the Hamiltonian operator includes contributions from the kinetic energy of the electrons, the electrostatic attraction between electrons and nuclei, and the repulsion between electrons. M is the number of nuclei and N is the number of electrons in the system. However, the wavefunction depends on the positions of all the electrons for a given configuration of nuclei, and thus on 3N variables, which makes the direct solution of the equation intractable in practice. Reformulating the problem using the three-dimensional electron density, $\rho(\mathbf{r}) = \sum_i |\psi_i(\mathbf{r})|^2$, significantly simplifies the solution procedure.

Furthermore, as demonstrated by Hohenberg and Kohn [2], the ground-state electron density uniquely determines the external potential, v(r), up to a constant, and with it all the physical quantities of interest. The energy of the system can therefore be defined as a functional of the electron density:

\[
E[\rho(\mathbf{r})] = F[\rho(\mathbf{r})] + \int v(\mathbf{r})\rho(\mathbf{r})\,d\mathbf{r}, \tag{1.2}
\]

where F[ρ(r)] is a universal functional, independent of the system in question, that contains the contributions from the kinetic energy and the Coulomb interactions between the electrons, whereas v(r) represents the external potential (in our case the Coulomb interactions of electrons with the nuclei). This reformulation of the problem allows us to obtain the ground state energy by variationally minimizing the energy under the constraint that the total number of particles is preserved.

In practice, however, the universal functional F[ρ(r)] is not known. Kohn and Sham [3] bypassed this problem by mapping the original system onto a system of non-interacting electrons that has the same electron density and thus the same energy. This system can be described by a Slater determinant of single-particle orbitals. The universal functional F[ρ(r)] of the real system can now be expressed as:

\[
F[\rho(\mathbf{r})] = \underbrace{-\frac{1}{2}\sum_i \int \psi_i^*(\mathbf{r})\nabla^2\psi_i(\mathbf{r})\,d\mathbf{r}}_{T_0} + \underbrace{\frac{1}{2}\iint \frac{\rho(\mathbf{r})\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}'}_{E_H} + E_{xc}[\rho(\mathbf{r})], \tag{1.3}
\]

where the first term is the kinetic energy of the fictitious system of non-interacting electrons, the second term is the classical Coulomb interaction of the electrons described through their density (also called the Hartree term, $E_H$), and $E_{xc}[\rho(\mathbf{r})]$, the exchange-correlation energy, accounts for the many-body effects that are not included in the rest of the functional. This last contribution to the energy collects everything that we do not know about the universal functional in one term, and in practical DFT we try to account for it using several approximations.

The simplest of these approximations is the local density approximation (LDA) [4], in which we assume that the electrons behave locally as a uniform electron gas (UEG) with constant density. Therefore, the exchange-correlation term can be expressed as:

\[
E_{xc}^{\mathrm{LDA}}[\rho(\mathbf{r})] = \int \varepsilon_{xc}^{\mathrm{UEG}}(\mathbf{r})\rho(\mathbf{r})\,d\mathbf{r}, \tag{1.4}
\]

where $\varepsilon_{xc}^{\mathrm{UEG}}(\mathbf{r})$ is the exchange-correlation energy of the homogeneous electron gas and is calculated using high-accuracy quantum Monte Carlo simulations [5]. The exchange part of the functional can be calculated exactly as:

\[
E_x^{\mathrm{LDA}}[\rho] = -\frac{3}{2}\left(\frac{3}{4\pi}\right)^{1/3}\int \rho(\mathbf{r})^{4/3}\,d\mathbf{r}. \tag{1.5}
\]

For quickly varying densities, as in molecules with localized subshells, LDA ex- change provides a particularly poor estimate of the exchange energy.
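As a rough numerical illustration of this point, the sketch below evaluates the local exchange expression of Eq. 1.5 for the hydrogen 1s density, ρ(r) = exp(−2r)/π in atomic units. The prefactor in Eq. 1.5 is the one that applies to a single spin density, so the fully spin-polarized hydrogen atom is a convenient test case. The radial grid, the helper names `trapezoid` and `lda_exchange_energy`, and the use of simple trapezoidal integration are assumptions made for this example and are not taken from the thesis. The exact exchange energy of this one-electron density is −5/16 Ha.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def lda_exchange_energy(rho, r):
    """Eq. 1.5 for a spherically symmetric density sampled on a radial grid r.

    E_x = -(3/2) (3 / (4 pi))^(1/3) * integral of rho^(4/3) over all space.
    """
    prefactor = -1.5 * (3.0 / (4.0 * np.pi)) ** (1.0 / 3.0)
    integrand = 4.0 * np.pi * r**2 * rho ** (4.0 / 3.0)  # volume element 4 pi r^2 dr
    return prefactor * trapezoid(integrand, r)

# Radial grid in bohr (an assumed, deliberately dense grid).
r = np.linspace(1e-6, 40.0, 200_001)
rho_1s = np.exp(-2.0 * r) / np.pi          # hydrogen 1s density

n_electrons = trapezoid(4.0 * np.pi * r**2 * rho_1s, r)  # sanity check: one electron
e_x_lda = lda_exchange_energy(rho_1s, r)
e_x_exact = -5.0 / 16.0                    # exact exchange of the 1s density (Ha)

print(f"N = {n_electrons:.4f}")
print(f"E_x(LDA) = {e_x_lda:.4f} Ha, exact = {e_x_exact:.4f} Ha")
```

The local estimate recovers only about 86% of the exact exchange energy for this strongly localized density, which is the kind of shortfall the paragraph above describes.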

Beyond the LDA, gradients of the density may be directly incorporated into semi-local descriptions of exchange [6], typically rescaled by the absolute value of the density as in the so-called generalized gradient approximation (GGA). The B88 [7] GGA exchange energy is given by:

\[
E_x^{\mathrm{GGA}} = E_x^{\mathrm{LDA}} - \beta \int \rho_\sigma^{4/3}\,\frac{\chi_\sigma^2}{1 + 6\beta\chi_\sigma \sinh^{-1}\chi_\sigma}\,d\mathbf{r}, \tag{1.6}
\]

where this exchange energy is referenced with respect to the LDA energy and is a rescaled integral of the spin density (spin index σ, see Section 1.1.2) with a semi-empirical parameter β = 0.0042 a.u. Here, the variable $\chi_\sigma$ is the rescaled gradient of the density:

\[
\chi_\sigma = \frac{|\nabla\rho_\sigma|}{\rho_\sigma^{4/3}}. \tag{1.7}
\]
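To make Eqs. 1.6 and 1.7 concrete, the following sketch evaluates the B88 gradient correction for the hydrogen 1s spin density, ρ(r) = exp(−2r)/π, whose radial derivative is −2ρ. It is only an illustration under stated assumptions: the radial grid, the trapezoidal integration and the function names are chosen here for convenience and are not taken from the thesis or from any DFT package. The exact exchange energy of this density, −5/16 Ha, serves as the reference.

```python
import numpy as np

BETA = 0.0042  # semi-empirical B88 parameter quoted in the text (a.u.)

def trapezoid(y, x):
    """Composite trapezoidal rule on a radial grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def reduced_gradient(rho, grad_rho):
    """Eq. 1.7: chi = |grad rho| / rho^(4/3)."""
    return np.abs(grad_rho) / rho ** (4.0 / 3.0)

def b88_exchange(rho, grad_rho, r):
    """Return (LDA, B88) exchange energies for one spherical spin density."""
    weight = 4.0 * np.pi * r**2 * rho ** (4.0 / 3.0)
    # LDA reference energy, Eq. 1.5.
    e_lda = -1.5 * (3.0 / (4.0 * np.pi)) ** (1.0 / 3.0) * trapezoid(weight, r)
    # Gradient correction, Eq. 1.6.
    chi = reduced_gradient(rho, grad_rho)
    kernel = chi**2 / (1.0 + 6.0 * BETA * chi * np.arcsinh(chi))
    e_gga = e_lda - BETA * trapezoid(weight * kernel, r)
    return e_lda, e_gga

r = np.linspace(1e-6, 30.0, 150_001)   # assumed radial grid (bohr)
rho = np.exp(-2.0 * r) / np.pi         # hydrogen 1s spin density
grad_rho = -2.0 * rho                  # analytic radial derivative

e_x_lda, e_x_b88 = b88_exchange(rho, grad_rho, r)
e_x_exact = -5.0 / 16.0

print(f"LDA: {e_x_lda:.4f} Ha, B88: {e_x_b88:.4f} Ha, exact: {e_x_exact:.4f} Ha")
```

Because the integrand of the correction is positive everywhere, the B88 term always lowers the exchange energy relative to the LDA value. For this density it moves the estimate closer to the exact result.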

Higher-level approximations introduce the Laplacian of the density or, equivalently, the kinetic energy density of the electrons (meta-GGA) [8], all of them trying to provide more accurate results by incorporating more complexity in the functional expression. However, it is not yet clear that these functionals provide improved accuracy in all cases, since all of them are based on a mean-field formalism that is expected to work well only for systems with delocalized electrons.

On the other hand, the Hartree-Fock (HF) method accounts for the exchange interaction exactly while fully neglecting electron correlation. The exchange energy in this method is explicitly given by:

\[
E_x^{\mathrm{HF}}[\rho(\mathbf{r})] = -\frac{1}{2}\sum_{i,j}^{\mathrm{occ}} \iint \frac{\psi_i^*(\mathbf{r})\,\psi_j^*(\mathbf{r}')\,\psi_j(\mathbf{r})\,\psi_i(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}', \tag{1.8}
\]

where $\psi_i(\mathbf{r})$ is the single-particle orbital i and the sum is over all occupied orbitals. Since HF includes the exact exchange energy, hybrid functionals [9], another class of exchange-correlation functionals, incorporate a portion of exact HF exchange in an attempt to better account for this type of interaction. Hybrid functionals are very popular due to their good performance for a range of different systems; however, the results obtained with these functionals depend strongly on the amount of exact exchange included, and they should therefore be used with caution.

The mathematical form of hybrid functionals is given by:

\[
E_{xc} = E_x^{\mathrm{LDA}} + a_0\left(E_x^{\mathrm{HF}} - E_x^{\mathrm{LDA}}\right) + a_x\left(E_x^{\mathrm{GGA}} - E_x^{\mathrm{LDA}}\right) + E_c^{\mathrm{LDA}} + a_c\left(E_c^{\mathrm{GGA}} - E_c^{\mathrm{LDA}}\right). \tag{1.9}
\]

By varying the three parameters, it is possible to tune the hybrid functional and adjust the HF, LDA and GGA parts. The most popular hybrid functionals in chemistry are B3LYP [9–11] with $a_0 = 0.2$, $a_x = 0.72$ and $a_c = 0.81$ and PBE0 [12, 13] with $a_0 = 0.25$, $a_x = 0.75$ and $a_c = 1.00$.
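The three-parameter mixing of Eq. 1.9 can be sketched in a few lines. The snippet below only combines energy components that are assumed to have been computed elsewhere; the class and function names are hypothetical, and the numerical energies in the usage example are arbitrary placeholders rather than results for any real system. Only the parameter values are taken from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HybridParameters:
    a0: float  # fraction of exact (HF) exchange
    ax: float  # weight of the GGA gradient correction to exchange
    ac: float  # weight of the GGA gradient correction to correlation

# Parameter values quoted in the text.
B3LYP = HybridParameters(a0=0.20, ax=0.72, ac=0.81)
PBE0 = HybridParameters(a0=0.25, ax=0.75, ac=1.00)

def hybrid_xc_energy(p, ex_lda, ex_hf, ex_gga, ec_lda, ec_gga):
    """Combine precomputed energy components according to Eq. 1.9."""
    return (ex_lda
            + p.a0 * (ex_hf - ex_lda)
            + p.ax * (ex_gga - ex_lda)
            + ec_lda
            + p.ac * (ec_gga - ec_lda))

def exchange_weights(p):
    """Net weights of LDA, HF and GGA exchange implied by Eq. 1.9."""
    return {"LDA": 1.0 - p.a0 - p.ax, "HF": p.a0, "GGA": p.ax}

# Placeholder component energies (hartree); purely illustrative numbers.
components = dict(ex_lda=-1.00, ex_hf=-1.10, ex_gga=-1.08, ec_lda=-0.10, ec_gga=-0.05)

for name, params in (("B3LYP", B3LYP), ("PBE0", PBE0)):
    print(name, exchange_weights(params), hybrid_xc_energy(params, **components))
```

Collecting terms this way shows that, with the parameters quoted above, PBE0 retains no net LDA exchange, whereas B3LYP retains a weight of 0.08.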

1.1.2 Spin density functional theory

A useful extension of the KS approach treats separately the densities $\rho_\alpha(\mathbf{r})$ and $\rho_\beta(\mathbf{r})$ of electrons with spin projection up and down. Equivalently, one can deal with

\[
\rho(\mathbf{r}) = \rho_\alpha(\mathbf{r}) + \rho_\beta(\mathbf{r}), \tag{1.10}
\]

together with the polarization function:

\[
\zeta(\mathbf{r}) = \frac{\rho_\alpha(\mathbf{r}) - \rho_\beta(\mathbf{r})}{\rho_\alpha(\mathbf{r}) + \rho_\beta(\mathbf{r})}, \tag{1.11}
\]

that takes values between −1 (fully polarized downwards) and +1 (fully polarized upwards). The spin-up and spin-down densities are generated from spin-up and spin-down KS wavefunctions,

\[
\rho_\alpha(\mathbf{r}) = \sum_i^{\mathrm{occ}} |\psi_{i,\alpha}|^2, \qquad \rho_\beta(\mathbf{r}) = \sum_i^{\mathrm{occ}} |\psi_{i,\beta}|^2. \tag{1.12}
\]

The local density approximation can be extended to the local spin density approximation (LSDA) based on the spin-polarized uniform gas in a manner analogous to the LDA approach. LSDA represents a considerable improvement over LDA for atomic and molecular systems with unpaired spins or open-shell systems, for which the unpolarized electron gas is clearly not a very good model. This approach is commonly referred to as unrestricted Kohn-Sham (UKS) [14] and it can be employed in systems where a standard KS calculation would unphysically restrict the true symmetry (e.g., dissociation of the H2 molecule).
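A minimal sketch of the polarization function of Eq. 1.11, evaluated pointwise on sampled spin densities, is given below. The function name and the sample values are hypothetical, and returning zero where the total density vanishes is a convention assumed here to avoid division by zero; it is not something the text specifies.

```python
import numpy as np

def spin_polarization(rho_alpha, rho_beta):
    """Eq. 1.11 on sampled spin densities.

    Points with zero total density are assigned zeta = 0 (assumed convention).
    """
    rho_alpha = np.asarray(rho_alpha, dtype=float)
    rho_beta = np.asarray(rho_beta, dtype=float)
    total = rho_alpha + rho_beta                 # Eq. 1.10
    zeta = np.zeros_like(total)
    np.divide(rho_alpha - rho_beta, total, out=zeta, where=total > 0)
    return zeta

# Illustrative sample points: all spin-up, unpolarized, all spin-down, empty.
rho_a = [0.3, 0.2, 0.0, 0.0]
rho_b = [0.0, 0.2, 0.1, 0.0]
zeta = spin_polarization(rho_a, rho_b)
print(zeta)
```

The first three points reproduce the limiting values discussed above: +1 for a fully spin-up density, 0 for an unpolarized density, and −1 for a fully spin-down density.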

1.2 DFT for transition metal catalysis

Efficient design and discovery of catalysts is central to solving modern challenges in energy and resource utilization [15]. In the field of heterogeneous catalysis, it has been shown that a wide range of catalysts can be screened [16–18] using suitable chemical descriptors such as binding energies calculated with ab initio methods. A similar approach has been proposed for molecular catalysts [19], where binding energies of specific molecules can be used as indicators for catalytic activity. Molecular catalysts enhance reaction rates and selectivities at metal centers coordinated with specific ligands. These coordination complexes have well-defined geometric and electronic structures that foster selective and targeted interactions with various classes of molecules. The complexity of these interactions makes a molecular-level understanding of the underlying interplay between the atoms essential for the design of effective molecular catalysts. By understanding the properties that determine their catalytic activity, it is possible to tune the structure of the catalysts in order to achieve the highest possible turnover frequency.

Transition metals are present as catalytic reactive centers in a wide range of biological [20] and inorganic systems [21–23]. In addition, the majority of molecular catalysts that have been studied contain transition metals. Common approaches, including density functional theory techniques, utilize mean-field approximations to describe them, with their accuracy strongly depending on the underlying assumptions of each approximation. It therefore becomes necessary to develop a better understanding of how the choice of different first-principles approaches can affect the calculated properties of transition metal complexes and to provide estimates of the uncertainty in our predictions.

Transition metals are unique due to their open-shell character. They include partially occupied valence d or f orbitals that impart special properties such as paramagnetism [24], vivid color of their compounds [25] or electrical conductivity [26]. Transition metals exhibit a wide range of oxidation states that allow them to form many different compounds.

Transition metal compounds are currently one of the biggest challenges for theoretical chemistry [21,27]. The high localization of the d electrons cannot be described adequately by most exchange-correlation functionals, which tend to delocalize the electron density [28]. The main source of this problem is self-interaction error (SIE). As we can see from Eq. 1.3, the Hartree term, $E_H$, includes the repulsion of each reference electron in the mean field of the rest. However, the calculated mean-field density includes the charge of the reference electron as well, thus causing the electron to interact with its own mean field. For a simple, one-electron system with wavefunction $\phi(\mathbf{r})$, the Hartree term from Eq. 1.3 becomes:

\[
E_H = \frac{1}{2}\iint \frac{|\phi(\mathbf{r})|^2\,|\phi(\mathbf{r}')|^2}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}' \neq 0, \tag{1.13}
\]

which indicates that the Hartree term, even in a one-electron system, predicts the unphysical repulsion of an electron by its own mean field. Including the exact exchange energy of Eq. 1.8, the sum becomes:

\[
E_H + E_x = \frac{1}{2}\iint \frac{|\phi(\mathbf{r})|^2\,|\phi(\mathbf{r}')|^2}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}' - \frac{1}{2}\iint \frac{|\phi(\mathbf{r})|^2\,|\phi(\mathbf{r}')|^2}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}' = 0. \tag{1.14}
\]

Thus, in Hartree-Fock theory self-interaction cancels exactly; however, the lack of dynamic correlation makes the method unable to accurately describe transition-metal-containing systems. Most exchange-correlation functionals, including hybrids, still fail to systematically localize d electrons [29–31] and thereby correct for self-interaction, and thus alternative approaches are still the topic of ongoing interest [32].
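The one-electron argument of Eqs. 1.13 and 1.14 can be checked numerically for the hydrogen atom. The sketch below integrates the Hartree energy of the 1s density, ρ(r) = exp(−2r)/π, on a radial grid and compares the exact cancellation by HF exchange with the incomplete cancellation obtained from the local exchange expression of Eq. 1.5. The grid, the helper name `cumulative_trapezoid` and the restriction to exchange only (correlation is ignored) are assumptions of this example and are not part of the thesis.

```python
import numpy as np

def cumulative_trapezoid(y, x):
    """Running trapezoidal integral of y from x[0] up to each grid point."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(steps)))

r = np.linspace(1e-6, 40.0, 200_001)      # assumed radial grid (bohr)
rho = np.exp(-2.0 * r) / np.pi            # |phi(r)|^2 for the hydrogen 1s orbital
shell = 4.0 * np.pi * r**2 * rho          # charge per unit radius

# Hartree potential of a spherical density:
# v_H(r) = (charge inside r) / r + integral from r to infinity of 4 pi s rho(s) ds
enclosed = cumulative_trapezoid(shell, r)
outer_running = cumulative_trapezoid(shell / r, r)
outer = outer_running[-1] - outer_running
v_hartree = enclosed / r + outer

e_hartree = 0.5 * cumulative_trapezoid(shell * v_hartree, r)[-1]   # Eq. 1.13

# Exact (HF) exchange of a single orbital is minus the Hartree energy, Eq. 1.14.
e_x_hf = -e_hartree

# Local exchange estimate from Eq. 1.5 for the same spin density.
lda_integral = cumulative_trapezoid(4.0 * np.pi * r**2 * rho ** (4.0 / 3.0), r)[-1]
e_x_lda = -1.5 * (3.0 / (4.0 * np.pi)) ** (1.0 / 3.0) * lda_integral

print(f"E_H          = {e_hartree:.4f} Ha")
print(f"E_H + E_x^HF  = {e_hartree + e_x_hf:.4f} Ha")
print(f"E_H + E_x^LDA = {e_hartree + e_x_lda:.4f} Ha  (residual self-interaction)")
```

The Hartree energy evaluates to 5/16 Ha, and the local exchange energy leaves a positive residual of roughly 0.04 Ha: a spurious self-repulsion of the single electron.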

1.3 Ligand field theories

Fundamental for the description of open-shell transition metal coordination complexes is an understanding of the interactions between the central metal atom and the coordinating ligands. Various theories have been proposed to describe the bonding in coordination complexes, the most popular among them being crystal field theory and ligand field theory.

1.3.1 Crystal field theory

Within crystal field theory (CFT) we assume that the 6 ligands in an octahedral complex behave as negative point charges that are brought near the metal in an octahedral array along the three principal axes of the d orbitals. These point charges represent lone pairs on the ligands that are considered to be fully localized and therefore are not involved in any type of covalent bonding with the metal. As a result, CFT assumes a purely ionic type of interaction between the metal and the ligands.

The $d_{x^2-y^2}$ and $d_{z^2}$ orbitals of the metal are affected the most by the negative charges that point directly at their corresponding electron clouds (Figure 1-1). Any electrons in these orbitals will be strongly repelled by the corresponding point charges, thus raising the energy levels of the two orbitals. In contrast, the $d_{xy}$, $d_{xz}$ and $d_{yz}$ orbitals have their lobes directed between the ligands (Figure 1-1), increasing the stability of these orbitals and thus lowering their energy.

The net result of this purely electrostatic interaction is that the 5 degenerate metal d orbitals are split into two groups, a set of 2 orbitals, $e_g$, with high energy and a set of 3 orbitals, $t_{2g}$, with low energy, separated by the so-called crystal-field splitting energy, $\Delta_o$ (Figure 1-2).

The splitting energy is crucial in accounting for magnetic properties in coordination complexes. Pairing electrons in the same orbital requires energy input, and if $\Delta_o$ is smaller than this pairing energy, then a configuration where unpaired electrons occupy $e_g$ orbitals will be preferred in a so-called high-spin configuration. If $\Delta_o$ is larger than the pairing energy, then pairs of electrons will occupy the lower-energy $t_{2g}$ orbitals (Figure 1-3), obtaining a low-spin configuration.

Figure 1-1: Crystal field theory representation of the repulsion between the metal d orbitals and the 6 point charges that represent the ligands.

Figure 1-2: Splitting of the degenerate metal d orbitals in the presence of an octahedral crystal field.

The magnitude of the splitting energy will depend on the electrostatic or crystal field created by the ligands. A strong crystal field will result in stronger repulsion and a larger $\Delta_o$ that favors low-spin configurations. Weaker crystal fields will result in weaker repulsion, lowering $\Delta_o$ and thus favoring high-spin electron configurations. Some of the most common ligands can be listed by their corresponding crystal field strength as indicated by the spectrochemical series:

\[
\underbrace{\mathrm{CO},\ \mathrm{CN}^- > \mathrm{en}}_{\text{strong field}} > \underbrace{\mathrm{NH_3} > \mathrm{NCS}^- > \mathrm{H_2O}}_{\text{intermediate field}} > \underbrace{\mathrm{OH}^-,\ \mathrm{F}^- > \mathrm{Cl}^- > \mathrm{Br}^- > \mathrm{I}^-}_{\text{weak field}}.
\]

Figure 1-3: High-spin (HS) and low-spin (LS) configurations for common mid-row transition metals.
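The competition between the splitting energy and the pairing energy described in this section can be written as a small bookkeeping routine. The sketch below fills the $t_{2g}$ and $e_g$ levels of an octahedral complex for a given d-electron count and reports the number of unpaired electrons. The function name is hypothetical, and the single comparison of $\Delta_o$ against one fixed pairing energy is a deliberately simplified criterion assumed here; the two energies only need to share the same (arbitrary) units. For d1–d3 and d8–d10 both fillings coincide, so the HS/LS label carries no information in those cases.

```python
def octahedral_spin_state(n_d, delta_o, pairing_energy):
    """Fill t2g/eg levels for n_d d electrons in the simple CFT picture."""
    if not 0 <= n_d <= 10:
        raise ValueError("n_d must be between 0 and 10")

    high_spin = delta_o < pairing_energy
    if high_spin:
        # Hund's rule across all five orbitals before any pairing.
        if n_d <= 3:
            t2g = n_d
        elif n_d <= 5:
            t2g = 3
        elif n_d <= 8:
            t2g = n_d - 2
        else:
            t2g = 6
    else:
        # Fill (and pair in) the lower t2g set before occupying eg.
        t2g = min(n_d, 6)
    eg = n_d - t2g

    unpaired_t2g = t2g if t2g <= 3 else 6 - t2g
    unpaired_eg = eg if eg <= 2 else 4 - eg
    return {
        "label": "HS" if high_spin else "LS",
        "t2g": t2g,
        "eg": eg,
        "unpaired": unpaired_t2g + unpaired_eg,
    }

# Illustrative values only: a d6 ion such as Fe(II) in a weak and a strong field.
weak = octahedral_spin_state(6, delta_o=1.0, pairing_energy=2.0)
strong = octahedral_spin_state(6, delta_o=3.0, pairing_energy=2.0)
print(weak)    # high-spin filling: t2g^4 eg^2, four unpaired electrons
print(strong)  # low-spin filling: t2g^6, no unpaired electrons
```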

1.3.2 Ligand field theory

Extending the concepts that were developed in crystal field theory, ligand field theory (LFT) explicitly takes into account the orbitals of the ligands. Within LFT, six ligand orbitals are initially assumed to have σ symmetry around the metal-ligand bond lines and are allowed to interact with six of the nine valence metal orbitals, namely $s$, $p_x$, $p_y$, $p_z$, $d_{x^2-y^2}$ and $d_{z^2}$. The $d_{xy}$, $d_{yz}$ and $d_{xz}$ orbitals have the wrong symmetry for combining with σ-type ligand orbitals and are therefore nonbonding in character. The resulting energy-level diagram includes 6 bonding, 6 antibonding and 3 nonbonding orbitals (Figure 1-4). The nonbonding level and the lowest antibonding level correspond to the two levels, $t_{2g}$ and $e_g$, predicted by CFT.

Figure 1-4: Simplified molecular orbital diagram for an octahedral complex as predicted by ligand field theory.

Including the effect of ligand π-type orbitals does not change the molecular orbital diagram significantly, since they mainly interact with the nonbonding $t_{2g}$ molecular orbitals. If the π-type ligand orbitals are populated, they will repel the $t_{2g}$ orbitals that have strong metal $d_{xy}$, $d_{xz}$ and $d_{yz}$ character, raising their energy level and thus decreasing $\Delta_o$. Any ligand with filled orbitals having such π symmetry around the ligand-metal axis (e.g., Cl−, OH−) is classified as weak-field under LFT.

On the other hand, ligands that have an unfilled antibonding orbital with π symmetry behave differently. The antibonding π* orbitals of these ligands accept electron density from the nonbonding $t_{2g}$ orbitals, which become partially delocalized. This delocalization stabilizes them and lowers their energy, resulting in an increased $\Delta_o$ and favored low-spin configurations. The interaction between the metal $t_{2g}$ orbitals and empty ligand π* orbitals is called π back-bonding and strengthens the metal-ligand bond. Ligands that increase the splitting of the levels in this way (e.g., CO, NO2−) are classified as strong-field under LFT.

Both theories can be extended to other geometries using the same methodology that was initially developed to explain properties of octahedral coordination complexes.

Transition metals are present as catalytic reactive centers in a wide range of biological [20] and inorganic systems [22, 23]. In addition, the majority of molecular catalysts that have been stu

1.4 High-throughput screening

Catalysts, be they molecular [33], heterogeneous [34], or biological [20], selectively and efficiently convert abundant feedstocks to more useful, functionalized products. Therefore, efficient design and discovery of catalysts is central to solving modern challenges in energy and resource utilization. Nevertheless, experimental discovery of new catalysts is slow, and computational study is often limited to rationalizing experimental observations. While experimental catalyst design efforts have focused on developing new synthesis techniques [35], characterizing alloys [36], using chemical intuition [37], or taking inspiration from biology [38], there are essentially infinite candidate catalysts that have not yet been synthesized or characterized. Catalyst discovery often relies on a make-first, explain-later approach, where reactivity is rationalized after compounds are synthesized. Exponential growth in computational power has enabled major first-principles screening efforts in heterogeneous catalysis [16,17,39,40] and materials [41–43]. Part of the success of these solid-state screening efforts lies in the fact that for a given composition, a relatively limited number of possible crystal structures exists that may be efficiently enumerated, for instance by evolutionary algorithms [42,43]. Molecular catalysts are another attractive target for computational screening efforts due to their high selectivity, activity, and wide range of tuning made possible through variation of metal and ligand identities. While some experimental [44, 45] or computational [33, 46–51] screens have been carried out for the discovery of molecular catalysts, robust and broadly applicable tools for the rapid generation and assessment of inorganic complexes (Figure 1-5) are not yet available.
Recent work [19] has shown that the binding energy relations used in the solid state can be generalized into an energetic span model that relates turnover frequencies to binding energies across all catalytic systems, thus enabling the fast assessment of the catalytic properties for molecular systems with an approach similar to the volcano plots used in heterogeneous catalysis.

Figure 1-5: Three coordination complexes: Zn(NH3)4, Co(CO)5, and Fe(phen)2(NCS)2, with Zn, Co and Fe as metal centers and tetrahedral, trigonal-bipyramidal and octahedral coordinations, respectively.

Unlike transition-metal complexes, organic molecules are much more straightforward targets for computational screening. Often, first-principles simulation is not even needed to evaluate chemical properties of simple organic molecules. Instead, tools have been developed to store [52–55] and analyze [56–60] large quantities of chemical data for organic molecules including chemical formulas, structures, and connectivity information. The development of these methodologies has also enabled encoding of all structural information in a molecule into a two-dimensional (2D) representation such as a connection table, a simplified molecular input line entry system (SMILES) string [61] (Fig. 1-6) or a SMILES ARbitrary Target Specification (SMARTS) string [62]. Most tools evaluate chemical properties solely on the basis of connectivity definitions [63–65], while some also work with three-dimensional (3D) structures [66–68]. The interface between cheminformatics and simulation is made possible by generators that can turn these 2D strings into 3D coordinates in a process called structure diagram generation [69]. These coordinates can then be used as input for the calculation of various properties of interest with first-principles simulation methods such as DFT. However, for these structure generation tools to succeed, atom valency and connectivity should be well defined, which is seldom the case for transition metal complexes.

[Figure 1-6 graphic: the SMILES string O=Cc1ccc(O)c(OC)c1, the SMARTS pattern c:c, and the 3D structure of vanillin.]

Figure 1-6: Example of a SMILES string and the SMARTS pattern c:c that indicates aromatic carbons joined by an aromatic bond. The resulting 3D structure of vanillin is also illustrated.

While the generation of 3D coordinates from a 2D representation is straightforward for small organic molecules, this process becomes challenging for inorganic complexes. Here, structure prediction is complicated by the fact that the number of bonds and valency are variable for inorganic complexes and not well defined as in organic molecules. However, the open-shell character, large number of electrons, and diversity of chemical bonds observed in transition metal complexes [70–73] that make it difficult to predict 3D structures are also the properties that impart interesting catalytic activity and chemical properties to these systems. Therefore, there is a clear need to develop automatic structure generation tools that can robustly obtain 3D properties of inorganic complexes. Additionally, strategies [74] for generating multiple conformers of a given 2D formula are implemented in several codes [67,68,75–77] and work reasonably well for organic molecules. However, generating conformers of inorganic complexes becomes more challenging, and few tools exist for weakly bound complexes. Molecular catalysis screening efforts necessitate evaluation of binding energetics with adsorbates and non-covalent interactions. Initial guesses for such studies typically require painstaking hands-on generation or customized, user-built scripts. Crucially, first-principles geometry optimizations will only find the closest local minimum to a user-guessed geometry, and there may be many such local minima for most chemical systems of interest.

In contrast to organic molecules, most inorganic complex screening typically relies on databases of experimentally characterized chemical compounds [78–85]. While mining such databases for existing complexes that can be repurposed for other objectives in catalysis [86] often proves fruitful, discovery of new materials also necessitates generating molecules that are not already experimentally known. Automatic structure generation tools that combine new structures with fragments from experimental structures will instead allow the extension of the surveyed chemical space in a combinatorial search to previously uncharacterized molecules. Some more flexible 3D structure generation approaches [87–89] for inorganic complexes have instead decomposed the molecule into elementary fragments that are matched against 3D structures in experimental libraries. However, the reliance of most of these tools on commercial databases [78] has not been without controversy [90, 91], leading to their limited distribution. For a screening effort to succeed, both the calculations and structure generation should be automated. However, only a few examples (e.g., the Atomistic Simulation Environment Python toolkit in the solid state [92] or the code [93] for ) are available that both generate structures and aid the preparation of first-principles calculations through the generation of input files and analysis of results. In order to accelerate discovery in transition metal chemistry, there is a clear need to enable both the flexible generation of 3D structures and the evaluation of their properties with first-principles simulations in a reliable and automatic framework.

1.5 Thesis outline

The remainder of this thesis is outlined as follows:

• Chapter 2 discusses the effect and role of Hartree-Fock exchange in describing the electronic structure of transition-metal complexes. Detailed results on the spin-state energetics of iron octahedral complexes are reported including charge analysis and structural properties. The relative performance of hybrid density functionals is also compared against alternative exchange-correlation functionals and literature results.

• Chapter 3 considers the effect of meta-GGA exchange on the spin-state energetics of a wide range of single metal ions and octahedral coordination complexes. Correlations between electronic structure and functional performance are drawn with the results compared against reference values. Furthermore, the combined effect of Hartree-Fock exchange and meta-GGA exchange is discussed for a set of iron coordination complexes, suggesting differing effects that depend on the ligand-field strength.

• Chapter 4 discusses the development of molSimplify, a structure generation software that utilizes freely available modules and custom routines for the accurate generation of 3D coordinates in a wide range of transition metal complexes. The program employs an array of geometric manipulation routines that offer significant advantages against alternative methods for generating coordination complexes. Furthermore, additional tools such as post-processing modules or database searching routines that are embedded in the program are presented and their functionality discussed.

• Chapters 5 and 6 present two applications of this framework. Initially, the selective binding of anions on functionalized ferrocenium complexes is discussed, with the results suggesting that hydrogen-bonding plays an important role in the interactions between organometallics and anions. Finally, results regarding the binding strength of carbon monoxide on model catalytic metalloporphyrins functionalized with an additional axial ligand are presented. Correlations between electronic and structural properties are obtained and a binding strength descriptor is suggested. Additionally, uncertainty estimates on the results are introduced based on sensitivity analysis.

Chapter 2

Effect of Hartree-Fock exchange

Reprinted (adapted) with permission from [94]. Copyright 2015 American .

Density functional theory has seen widespread use and exponential growth [95] owing to its relatively computationally efficient description of short-range, dynamic correlation. Ease of entry for new users has made practical DFT one of the most popular "black box" techniques, despite well-known shortcomings. Namely, predictions are sensitive to user selection of the exchange-correlation functional amongst a "zoo" of choices. Decisions about functional choice are in turn often influenced by word of mouth or popular opinion [96] and availability in a localized basis or plane wave electronic structure code. Extensive optimization of exchange-correlation functional parameters in DFT against test sets with a large number of parameters [97] has improved accuracy, though reduction in parameters [98] or limited use of parameters in some functionals [12,13] can provide improved transparency. Nevertheless, mathematical expressions for exchange and correlation still prevent a clear understanding of how accuracy may be systematically and globally improved. Established test and training sets for functional development primarily focus on thermochemistry of main group molecules [99], and accuracy is not necessarily transferable to other properties or elements.

Importantly, exchange-correlation functionals that work well for main group elements may not work as well for transition metals [100], which are central to homogeneous [21–23], heterogeneous [101], or enzymatic [20] catalysis. Transition metals are increasingly prominent in computational design screens [101,102] for which high-accuracy and high-efficiency black box DFT predictions are needed. Nevertheless, transition metals remain a challenge due to the close spacing of electron configurations (e.g. 3d⁷4s¹ vs. 3d⁶4s² in neutral Fe) that leads to several accessible spin states and oxidation states [103]. Spin-crossover (SCO) complexes [104, 105], which typically contain Fe(II) or Fe(III) centers [106], represent a particularly challenging class of molecules because the spin state can change with small changes in temperature. SCO molecules have shown promise in storage devices [107], spintronics [108, 109], and catalysis [110–112]. Nevertheless, common exchange-correlation functionals struggle to reproduce critical features for the spin-dependent potential energy surfaces [113–116].

In transition metal complexes, low-spin states are known to be favored by generalized gradient approximation exchange-correlation functionals, while hybrid functionals that include a fraction of Hartree-Fock exchange often prefer high-spin states [116–121], and different energy gaps are obtained with different exchange-correlation functionals [122,123]. One suggested reason for the failure of practical DFT in describing transition metal complexes is that relatively localized 3d valence electrons suffer strongly from self-interaction error (SIE) present in pure DFT functionals and only approximately corrected in hybrid functionals. Strides have been made in systematic removal of SIE [124,125] and identification of paths to improve balance in spin-state ordering [116,117,126–130] but hybrid functionals remain a popular and straightforward approach to approximately correct for energetic errors driven by imbalances in SIE between spin states.

It is worthwhile to note that mixing in of Hartree-Fock exchange may potentially trade reduction in self-interaction errors for an increase in static correlation errors, which tend to plague Hartree-Fock more than density functional theory approaches. For a balanced treatment of static correlation in the absence of self-interaction error, multireference wavefunction techniques have been used [131–138] to study spin crossover complexes up to around 45 atoms in size [136]. The predominant method employed in the study of spin-crossover complexes is CASPT2, and, while it scales more expensively than density functional theory approaches, recent improvements in scaling [132] have made larger systems tractable. Other studies have applied the even more expensively-scaling CCSD(T) to smaller spin crossover complexes [137]. Wavefunction approaches are not without challenges and in some cases still produce sizeable, 5 kcal/mol energetic errors in spin-state ordering [136]. However, they are typically in very good agreement with experimental spin crossover properties and are a suitable reference for benchmarking of exchange-correlation functionals for higher throughput studies. Despite advances in wavefunction theory, approximate density functionals are still preferred by most computational researchers due to ease of use and lower scaling that makes geometry optimization and high-throughput calculations feasible.

Extending study [119–121] of how inorganic complex spin states vary with functional choice can broaden an understanding of the ways in which hybrid functionals improve predictions of spin-state orderings, especially since both low [120, 139, 140] and high [119,141,142] percentages of Hartree-Fock exchange have been proposed for the description of transition metal complexes. Rather than focusing on finding one prescription for exchange, we aim to understand the way in which relative energetic, electronic, and structural properties of spin states are sensitive to these descriptions. Understanding this variability unifies many functionals and can provide a useful guide for interpreting the prediction bias introduced through functional choice in DFT literature. Finally, we aim to enlarge a quantitative understanding of how self-interaction error manifests and is balanced through the use of Hartree-Fock exchange.

While B3LYP is commonly employed to successfully describe organic systems, its direct application to organometallics leads to mixed results. One approach is to adjust the extent of Hartree-Fock exchange in a functional in order to reproduce key energetics and spin-state orderings in organometallic systems where multiple spin multiplicities lie close in energy [115, 119, 139–142]. However, such exchange-correlation functional tuning is then constrained by the availability of experimental data or well-converged correlated quantum chemistry results. Importantly, the outcomes from these fitting studies are often contradictory: alternative mixings of 0% [120], 15% [139,140], 25% [141, 142], and 30-50% [119] HF exchange have all been proposed for Fe(II) octahedral complexes alone. Such broad outcomes suggest that the mixing of exact exchange in a functional is highly dependent on the underlying chemistry of the system, and a one-size-fits-all approach is not likely to be successful. Preliminary success has been made in identifying chemically-motivated ways to tune functional parameters [143–145] outside of . The appropriate tuning for organometallic complexes or correlated materials is largely still approximated on local measures of the chemical potential of the subshell [130] of interest, and efforts to improve functionals on transition metal test sets have indicated no clear path to optimization [146].

2.1 Computational details

Calculations were carried out using the TeraChem [147] package for all LDA, GGA, and GGA hybrid calculations. The default B3LYP definition in TeraChem uses the VWN1-RPA form for the LDA VWN [4] component of LYP correlation [10]. Initial calculations on GGA hybrids also considered the effect of using instead the 3-parameter or 5-parameter forms of the VWN correlation (B3LYP3, B3LYP5 keywords) as well as using other forms of the correlation with the B3P86 [9,148], B3PW91 [6,9], PBE0 [12,13] (25% HF exchange vs. 20% in B3LYP), or B97 [149] (19% HF exchange) GGA hybrids. Overall qualitative GGA hybrid predictions were unchanged and therefore B3LYP1 is chosen as the representative functional. Altered Hartree-Fock exchange percentages in a modified form of B3LYP were implemented in TeraChem for this work. Meta-GGA calculations were carried out with Q-Chem 4.2. All calculations were performed using the LANL2DZ effective core potential basis for the iron atom and the 6-31G* basis for the other atoms. Geometry optimizations were carried out using the L-BFGS algorithm in Cartesian coordinates, as implemented in DL-FIND [150], to default thresholds of 4.5×10⁻⁴ hartree/bohr for the maximum gradient and 1×10⁻⁶ hartree for the change in SCF energy between steps.
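As a minimal sketch of how the two quoted thresholds act together, the hypothetical helper below tests a single optimization step against both. It is not the DL-FIND or TeraChem interface; the function name, the use of absolute values, and the inclusive comparison are all assumptions made for illustration, and a production optimizer may apply further criteria.

```python
# Thresholds quoted in the text (atomic units).
MAX_GRADIENT_TOL = 4.5e-4    # hartree/bohr, largest gradient component
ENERGY_CHANGE_TOL = 1.0e-6   # hartree, energy change between steps


def is_converged(max_gradient, energy_change):
    """Return True when both quoted criteria are met for one step.

    max_gradient  : largest gradient component at this step (hartree/bohr)
    energy_change : energy difference from the previous step (hartree)
    Assumption: both criteria must hold simultaneously.
    """
    gradient_ok = abs(max_gradient) <= MAX_GRADIENT_TOL
    energy_ok = abs(energy_change) <= ENERGY_CHANGE_TOL
    return gradient_ok and energy_ok


# Hypothetical step values, chosen only to exercise both branches.
print(is_converged(3.0e-4, -5.0e-7))   # both below threshold
print(is_converged(1.0e-3, -5.0e-7))   # gradient still too large
```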

High-spin states (quintet multiplicity for Fe(II) and sextet for Fe(III)) are compared against low-spin states (singlet for Fe(II) and doublet for Fe(III)). Intermediate spin states were not considered. Oxidation states are qualitative and obtained by constraining total charge to correspond to the net charge on the respective ligands along with a positive (+2 or +3) charge for the iron center. Quantitative determination of the charges and occupation of subshells (i.e. 3d and 4s) was obtained from the TeraChem interface with the Natural Bond Orbital (NBO) v6.0 package [151]. NBO calculates the natural atomic orbitals (NAOs) for each atom by computing the orthogonal eigenorbitals of the atomic blocks in the density matrix. After the set of NAOs is defined, NAO occupancy is obtained using natural population analysis (NPA) [152], which permits estimation of 3d and 4s subshell occupation. The NBO partial charge (q) on an atom is calculated by taking the difference between the atomic number (Z) and the total population (N) for the NAOs for each atom (i):

\[ q_i = Z_i - N_i. \tag{2.1} \]
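Equation 2.1 amounts to a single subtraction once the NAO populations are summed. The sketch below illustrates this with a hypothetical helper; the populations passed in are made-up placeholder numbers, not values from this work or from any NBO output.

```python
def npa_partial_charge(atomic_number, nao_populations):
    """Partial charge q_i = Z_i - N_i (Eq. 2.1).

    atomic_number   : Z_i of the atom
    nao_populations : iterable of NAO occupancies whose sum is N_i
    """
    total_population = sum(nao_populations)
    return atomic_number - total_population


# Placeholder populations for an iron center (Z = 26), grouped as
# core, 3d and 4s contributions. These numbers are illustrative only.
example_populations = {"core": 18.0, "3d": 6.4, "4s": 0.3}
q_fe = npa_partial_charge(26, example_populations.values())
print("illustrative Fe partial charge:", round(q_fe, 3))
```

Grouping the populations by subshell, as in the placeholder dictionary, also gives direct access to the 3d and 4s occupations mentioned above.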

Several octahedral complex structures (ligands: CO, CN–, CNH, NCH, NH3, H2O) were generated from simplified molecular input line entry system (SMILES) [61] strings. Using OpenBabel [68], the SMILES strings were converted to structures that were starting points for TeraChem geometry optimizations. The larger octahedral complex structures (ligands: (phen)2(NCS)2, PEPXEP, HICPEQ, bpy, terpy) were obtained from the Cambridge Structural Database (CSD) [78]. PEPXEP denotes the CSD accession code for a compound with N6C26H38 stoichiometry, while HICPEQ corresponds to a N8C18H26 compound. The (phen)2(NCS)2 structure was previously identified as a good test case [139]. The other ligands were selected by using the CCDC ConQuest web-screening tool with a query limiting elements to Fe, C, N, H in an octahedral complex with symmetric Fe-N bonds, as was previously used for catalyst screening [153].

2.2 Dependence of spin-state ordering on functional choice

We have considered a test set of representative Fe(II) and Fe(III) octahedral complexes (Fig. 2-1) for various exchange-correlation functionals. In all cases, the ground state spin is known experimentally or may be suggested from ligand field theory. Fe(II) and Fe(III) have nominally 3d⁶ and 3d⁵ electron configurations, giving rise to low-spin (LS) singlet or doublet spin multiplicity or high-spin (HS) quintet or sextet electronic states. The adiabatic electronic energy gap between HS and LS states is:

\[ \Delta E^{\mathrm{HS-LS}} = E_{\mathrm{HS}}(R_{\mathrm{HS}}) - E_{\mathrm{LS}}(R_{\mathrm{LS}}), \tag{2.2} \]

where EHS(RHS) is the electronic energy of the HS state at its geometry optimized coordinates and ELS(RLS) is the equivalent for the LS state. The initial set of structures includes two carbon ligand sets (CO and CNH), three nitrogen ligand sets (NH3, NCH, and (phen)2(NCS)2), and one oxygen ligand set (H2O) (see structures in Fig. 2-1). One representative functional is chosen for each class: LDA (PZ81 [125]), GGA (PBE [153]), GGA hybrid (B3LYP) and meta-GGA (M06-L [154]) to compare qualitative relative high-spin/low-spin energetics. Reliance on a single representative functional for each class is motivated by preliminary findings in comparing a wider array of functionals. Pure density functionals (LDA or GGA) consistently predict low-spin ground states in nine of the ten cases (six are Fe(II) and four are Fe(III)) considered (Table 2.1), although only half of the ten cases are expected to be low spin. Pure GGA preference for low-spin Fe(II)/Fe(III) complexes is consistent with earlier observations [29, 119, 157]. Including higher order dependence on the density as in a meta-GGA improves identification of some high-spin states: Fe(II)(NH3)6 and Fe(III)(NCH)6 are predicted to be high spin with a meta-GGA, while they were

Figure 2-1: Structures of octahedral iron complexes classified by direct ligand identity: carbon (top), nitrogen (middle), or oxygen (bottom).

predicted to be low-spin with a GGA. However, meta-GGA results are inconsistent: Fe(III)(NH3)6 and Fe(II)(NCH)6 have high-spin ground states [119,156] but the meta-GGA predicts both to be low-spin. Identification of how the higher-order terms of the density may be systematically incorporated to improve predictions of magnetic ordering or spin states is of ongoing interest since meta-GGAs have the potential to improve predictions in extended systems where explicit incorporation of Hartree-

Structure               Reference   LDA   GGA   hybrid   meta-GGA
Fe(II)(CO)6             LS          LS    LS    LS       LS
Fe(II)(H2O)6            HS          HS    HS    HS       HS
Fe(II)(CNH)6            LS          LS    LS    LS       LS
Fe(II)(NCH)6            HS          LS    LS    HS       LS
Fe(II)(NH3)6            HS          LS    LS    HS       HS
Fe(II)(phen)2(NCS)2     LS          LS    LS    HS       LS
Fe(III)(NH3)6           HS          LS    LS    LS       LS
Fe(III)(NCH)6           HS          LS    LS    HS       HS
Fe(III)(CNH)6           LS          LS    LS    LS       LS
Fe(III)(CO)6            LS          LS    LS    LS       LS

Table 2.1: Ground states of octahedral Fe(II) and Fe(III) complexes with specified ligand sets for LDA, GGA, GGA hybrid, and meta-GGA classes of exchange-correlation functionals. Reference data are from experiment (indicated in bold), otherwise approximations from ligand-field theory are provided. Incorrect predictions of the ground state spin for a functional are indicated by red color. The experimental data are from those collected in Ref. [119], except for Fe(II)(H2O)6 (Ref. [155]), Fe(II/III)(NH3)6 (Ref. [156]), and Fe(II)(phen)2(NCS)2 (Ref. [139]).

Fock exchange may be prohibitive (see Chapter 3). For the GGA hybrid class of functionals, correct qualitative identification of spin states is achieved in eight out of ten cases. However, in the case of (phen)2(NCS)2, a high-spin ground state is predicted despite experimental observation [139] of a low-spin ground state. While this test set is relatively small, it reinforces general observations that GGA hybrids tend to over-predict high-spin ground states, while pure density functionals predict low-spin ground states. This trend will be investigated on an expanded molecule test set in Sec. 2.3. Qualitative spin-state assignment is difficult in weak ligand cases where the quantitative gap falls below 5 kcal/mol due to basis set dependence or zero-point energy and vibrational entropy effects [119,121] not considered here but covered in detail in the recent work by Mortensen and Kepp [121]. For the GGA, Fe(II)(NH3)6 is close to crossover to high-spin, which would improve agreement with experiment. Three of the meta-GGA predictions, Fe(II)(NCH)6, Fe(III)(NH3)6, and Fe(II)(phen)2(NCS)2, are close to the spin crossover point to HS states, which would improve agreement with experiment in the first two cases but worsen agreement for the last case. We also note that these meta-GGA results may be more sensitive to the functional form since M06-L, for instance, is highly parameterized. We thus compare against TPSS [8], a meta-GGA with fewer adjustable parameters. Comparing the TPSS and M06-L meta-GGAs, we find the two are qualitatively consistent, but TPSS has a stronger bias for high spin systems. This bias leads to improved qualitative agreement for two compounds (Fe(II)(NCH)6 and Fe(III)(NH3)6) as high-spin but also reduced qualitative agreement for two low-spin compounds that TPSS predicts to be high-spin (Fe(II)(phen)2(NCS)2 and Fe(III)(CO)6).
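The counts quoted in this section for the representative functionals can be reproduced directly from the entries of Table 2.1. The sketch below transcribes the table into a dictionary and tallies, for each functional class, how many predictions match the reference and how many are low spin. The variable and function names are hypothetical and the snippet is only a bookkeeping aid, not part of the original analysis.

```python
# Table 2.1 transcribed as: structure -> (reference, LDA, GGA, hybrid, meta-GGA)
TABLE_2_1 = {
    "Fe(II)(CO)6":         ("LS", "LS", "LS", "LS", "LS"),
    "Fe(II)(H2O)6":        ("HS", "HS", "HS", "HS", "HS"),
    "Fe(II)(CNH)6":        ("LS", "LS", "LS", "LS", "LS"),
    "Fe(II)(NCH)6":        ("HS", "LS", "LS", "HS", "LS"),
    "Fe(II)(NH3)6":        ("HS", "LS", "LS", "HS", "HS"),
    "Fe(II)(phen)2(NCS)2": ("LS", "LS", "LS", "HS", "LS"),
    "Fe(III)(NH3)6":       ("HS", "LS", "LS", "LS", "LS"),
    "Fe(III)(NCH)6":       ("HS", "LS", "LS", "HS", "HS"),
    "Fe(III)(CNH)6":       ("LS", "LS", "LS", "LS", "LS"),
    "Fe(III)(CO)6":        ("LS", "LS", "LS", "LS", "LS"),
}
CLASSES = ("LDA", "GGA", "hybrid", "meta-GGA")


def tally(table):
    """Count correct and low-spin predictions per functional class."""
    correct = {name: 0 for name in CLASSES}
    low_spin = {name: 0 for name in CLASSES}
    for reference, *predictions in table.values():
        for name, predicted in zip(CLASSES, predictions):
            correct[name] += int(predicted == reference)
            low_spin[name] += int(predicted == "LS")
    return correct, low_spin


correct, low_spin = tally(TABLE_2_1)
for name in CLASSES:
    print(f"{name:9s} correct: {correct[name]:2d}/10   low-spin: {low_spin[name]:2d}/10")
```

Keeping the reference in the first tuple position lets the same loop report both agreement with the reference and the low-spin bias of each class.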

2.3 Dependence of spin-state ordering on HF exchange

In order to broadly investigate the effect of HF exchange on spin-state ordering, we vary the amount of HF exchange included in a modified B3LYP (modB3LYP) functional. The DFT exchange for the modB3LYP functional is calculated using the following expression:

\[ E_x^{\mathrm{modB3LYP}} = \alpha_{\mathrm{HF}} E_x^{\mathrm{HF}} + (1 - \alpha_{\mathrm{HF}}) E_x^{\mathrm{LDA}} + 0.9\,(1 - \alpha_{\mathrm{HF}})\,(E_x^{\mathrm{GGA}} - E_x^{\mathrm{LDA}}), \tag{2.3} \]

where αHF is the amount of HF exchange. For αHF → 0, the exchange is pure DFT-GGA (as in BLYP), while for αHF → 1, the exchange becomes pure HF. The factor 0.9 was introduced so that the ratio

\[ \frac{E_x^{\mathrm{GGA}}}{E_x^{\mathrm{LDA}}} = 9 \tag{2.4} \]

is equal to that of the original B3LYP functional (0.72 for E_x^GGA and 0.08 for E_x^LDA) and constant for all αHF. We apply the modB3LYP functional (with HF exchange = 0-50%) to select octahedral complexes from the initial test set (Sec. 2.2) as well as an expanded test set. Throughout, we also compare to a narrow range of 12.7-28.3%, which corresponds to the 3σ confidence interval on the normal distribution fit to the votes for standard hybrid exchange-correlation functionals in a popular density functional theory poll [96]. While the narrower range indicates the most common hybrid exchange ratios, the wider range permits connection to both pure GGA and high HF exchange functionals.
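Collecting terms in Eq. 2.3 gives the effective weights on the HF, LDA, and GGA exchange energies for any αHF. The sketch below does this bookkeeping with a hypothetical helper function; it is not the TeraChem implementation and only handles the exchange mixing, not the correlation part of the functional.

```python
def modb3lyp_exchange_weights(alpha_hf):
    """Effective weights on (HF, LDA, GGA) exchange implied by Eq. 2.3.

    E_x = alpha*E_HF + (1 - alpha)*E_LDA + 0.9*(1 - alpha)*(E_GGA - E_LDA)
        = alpha*E_HF + 0.1*(1 - alpha)*E_LDA + 0.9*(1 - alpha)*E_GGA
    """
    if not 0.0 <= alpha_hf <= 1.0:
        raise ValueError("alpha_hf must lie between 0 and 1")
    w_hf = alpha_hf
    w_gga = 0.9 * (1.0 - alpha_hf)
    w_lda = (1.0 - alpha_hf) - w_gga
    return w_hf, w_lda, w_gga


# Pure GGA limit, bounds of the narrow range, standard B3LYP, and 50% HF.
for alpha in (0.0, 0.127, 0.20, 0.283, 0.50):
    w_hf, w_lda, w_gga = modb3lyp_exchange_weights(alpha)
    print(f"alpha_HF={alpha:.3f}  HF={w_hf:.3f}  LDA={w_lda:.3f}  "
          f"GGA={w_gga:.3f}  GGA/LDA={w_gga / w_lda:.1f}")
```

At αHF = 0.20 the weights reduce to the 0.72 and 0.08 values quoted for standard B3LYP, and the GGA-to-LDA ratio stays at 9 for every αHF below 1, as Eq. 2.4 requires.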

2.3.1 Spin-state ordering dependence with Fe(III) complex test cases

We first focus on the relative electronic energy between high-spin (HS) and low-spin (LS) electronic states (ΔEHS–LS) for four Fe(III) octahedral complexes (N ligands: NCH, NH3; C ligands: CNH, CO) across the 0-50% HF exchange range (Fig. 2-2). Fe(III) complexes have a d5 configuration that will lead to single occupation of all five d levels in the high-spin case or a doublet with one unpaired electron in the low-spin case. Linear behavior in spin-state energetics is observed over the complete range of HF exchange covered with modB3LYP exchange for Fe(III) complexes with both carbon and nitrogen ligands, extending and confirming earlier observations by Droghetti on octahedral Fe(II) complexes [119] over a range of about 15-40% HF exchange. This linear energetic dependence is surprising because it suggests that any response that the density has to the modified HF potential is of the same magnitude in both the high-spin and low-spin state. That is, it is evident that simply mixing increasing fractions of HF exchange energy on a high-spin state will lower its energy linearly with respect to a low-spin state. However, if the density responds differently in the case of the low-spin state, e.g. through increased localization with respect to the high-spin state due to imbalances in self-interaction error, then the energetic dependence should contain higher order terms than a simple linear averaging. This linear result suggests that HF-derived localization of covalent, delocalized orbitals may not be suitable for understanding the effect of higher fractions of HF exchange. A comparison to self-interaction correction schemes [124,125] and the delocalization-penalty +U approach [130] will likely be instructive in the future. Since ΔEHS–LS varies linearly with HF exchange, linear-regression fits are very

[Figure 2-2 plot: ΔEHS–LS (kcal/mol, from -20 to 80) versus % HF exchange (from 0 to 50) for Fe(III)(CNH)6, Fe(III)(CO)6, Fe(III)(NH3)6, and Fe(III)(NCH)6.]

Figure 2-2: Relative high spin (HS)-low spin (LS) energy (ΔEHS–LS) in kcal/mol of four Fe(III) octahedral complexes (two nitrogen ligands: NCH and NH3 and two carbon ligands: CNH and CO) with 3σ confidence interval from normal distribution poll data on hybrid exchange functionals, as indicated with black dashed lines and black arrow.

good approximations to the partial derivative of the energy with respect to HF exchange (αHF),

\[ \mathrm{slope} = \frac{\Delta\Delta E^{\mathrm{HS-LS}}}{\Delta\alpha_{\mathrm{HF}}} \approx \frac{\partial \Delta E^{\mathrm{HS-LS}}}{\partial \alpha_{\mathrm{HF}}}, \tag{2.5} \]

where the correlation coefficients (i.e., R² values) for these fits are all 0.999. We introduce here the unit notation "HFX", where one unit of HFX corresponds to the range from 0% to 100% HF exchange. The identity of the directly bonded element dominates the value of ∂ΔEHS–LS/∂αHF, and nitrogen-containing ligands have nearly identical values: -75 kcal/(mol·HFX) for Fe(III)(NH3)6 and -77 kcal/(mol·HFX) for Fe(III)(NCH)6. For carbon-containing ligands, the correspondence is also quite close: -110 kcal/(mol·HFX) for Fe(III)(CNH)6 and -106 kcal/(mol·HFX) for Fe(III)(CO)6. Over the 3σ confidence interval, the carbon ligand sets always prefer a low-spin ground state, but ΔEHS–LS is reduced by 17 kcal/mol over this range, which is a significant change in predictions of quantitative energetics. For comparison, the shift in ΔEHS–LS from BLYP (0%) to B3LYP (20%) is larger at around -21 kcal/mol, and the difference between 20% and 25% HF exchange shifts the prediction by -5 kcal/mol. The ratio of Hartree-Fock exchange and the direct ligand dominate these trends, rather than the form of the DFT exchange or the associated correlation functional. When calculations are carried out with a modified PBE0 functional instead of modB3LYP, slopes are qualitatively unchanged, with an average difference of 6% in computed slopes and a maximum deviation of -9 kcal/(mol·HFX) for Fe(III)(CO)6. While the spin-state splitting derivatives are smaller for the nitrogen-containing ligands, the proximity of the curves to the HS-LS crossover makes the qualitative spin-state assignment more challenging. Furthermore, both Fe(III)(NH3)6 and Fe(III)(NCH)6 are low-spin at the lower bound of the 3σ confidence interval (αHF=0.127), while they are both high-spin at the upper bound of that same interval (αHF=0.283). If the objective of a computational study is qualitative spin-state assignment, such an assignment would be highly sensitive to functional choice. Experimentally [8], Fe(III)(NH3)6 is known to be high-spin, but HS-LS spin crossover occurs at 27.2% HF exchange, which is a higher fraction than is incorporated in B3LYP or PBE0. Despite challenges in qualitative assignment, quantitative changes in spin-state orderings are slightly smaller than for the carbon ligands: the shift in ΔEHS–LS over the confidence interval is -12 kcal/mol, from BLYP to B3LYP it is -15 kcal/mol, and the difference between 20% and 25% HF exchange is -4 kcal/mol.
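Because the dependence is linear, the quoted shifts follow from multiplying each slope by the width of the αHF range. The sketch below repeats that arithmetic with the slopes given in the text. The helper names are hypothetical, and the second function assumes a strictly linear ΔEHS–LS anchored at the quoted 27.2% crossover for Fe(III)(NH3)6, which is a simplification of the fitted lines.

```python
# Slopes quoted in the text, in kcal/(mol*HFX).
SLOPES = {
    "Fe(III)(NH3)6": -75.0,
    "Fe(III)(NCH)6": -77.0,
    "Fe(III)(CNH)6": -110.0,
    "Fe(III)(CO)6": -106.0,
}
ALPHA_LOW, ALPHA_HIGH = 0.127, 0.283   # bounds of the 3-sigma interval


def gap_shift(slope, alpha_start, alpha_end):
    """Change in the HS-LS gap (kcal/mol) between two HF exchange fractions."""
    return slope * (alpha_end - alpha_start)


def gap_from_crossover(slope, alpha_cross, alpha):
    """HS-LS gap (kcal/mol), assuming a line that is zero at alpha_cross."""
    return slope * (alpha - alpha_cross)


for name, slope in SLOPES.items():
    print(name,
          "| 3-sigma interval:", round(gap_shift(slope, ALPHA_LOW, ALPHA_HIGH), 1),
          "| 0% to 20%:", round(gap_shift(slope, 0.0, 0.20), 1),
          "| 20% to 25%:", round(gap_shift(slope, 0.20, 0.25), 1))

# Sign of the gap for Fe(III)(NH3)6 at both interval bounds
# (positive gap: LS ground state, negative gap: HS ground state).
for alpha in (ALPHA_LOW, ALPHA_HIGH):
    gap = gap_from_crossover(SLOPES["Fe(III)(NH3)6"], 0.272, alpha)
    print("Fe(III)(NH3)6 at alpha_HF =", alpha, "->", "LS" if gap > 0 else "HS")
```

For Fe(III)(NH3)6 the three shifts come out near -11.7, -15.0 and -3.75 kcal/mol, consistent with the rounded -12, -15 and -4 kcal/mol quoted above, and the sign of the gap changes between the two bounds of the interval.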

Previous work in this area [121,158–160] suggests that a ligand field theory picture that focuses on ligand strength following the spectrochemical series [161] may provide some, albeit tenuous, guidance regarding observations in HS-LS splitting. The CO ligand is the strongest in the spectrochemical series and should maximize octahedral field splitting (Δo) between the three low-energy t2g and two high-energy eg states, while the NH3 ligand is considerably weaker and should have a smaller Δo value. Across the range of all HF exchange percentages, the LS state is relatively preferred for the strong CO with respect to the NH3 ligand. However, for high HF exchange (40-50%), the HS state is the ground state for Fe(III)(CO)6 and the relative penalty of ΔEHS–LS(CO) vs. ΔEHS–LS(NH3) narrows. In a simplified LFT picture, increasing HF exchange is modulating the octahedral field splitting more dramatically for strong ligands (e.g. CO) than for weak ligands (e.g. NH3). Thus, these trends suggest that too-high ratios of exact exchange in functionals will override established ligand field concepts.

2.3.2 Spin-state ordering: comparison with Fe(II) complexes

Qualitative trends in spin-state ordering with αHF (Fig. 2-3) previously observed for Fe(III) are preserved in Fe(II), but with slightly lower correlation coefficients (R² = 0.995-0.997). For Fe(II), carbon ligand systems have larger-magnitude ∂ΔEHS–LS/∂αHF values: -151 kcal/(mol·HFX) for Fe(II)(CNH)6 and -155 kcal/(mol·HFX) for Fe(II)(CO)6, and this steeper slope appears to correlate with a ΔEHS–LS that is higher than that of Fe(III) by 20 kcal/mol at a GGA reference. Such a difference between Fe(II) and Fe(III) ΔEHS–LS diverges from ligand field theory, since in LFT, Fe(II) does not populate any additional high-energy levels in the HS or LS state. The Fe(II)(NH3)6 and Fe(II)(NCH)6 ∂ΔEHS–LS/∂αHF values diverge slightly from their Fe(III) counterparts, with the slope magnitude reduced to -63 kcal/(mol·HFX) in the former case and increased to -86 kcal/(mol·HFX) in the latter. Nevertheless, trends are preserved: here, spin-crossover from LS to HS occurs near the lower bound of the 3σ confidence interval, while both ligands prefer high spin at the upper bound.

[Figure 2-3 plot: ΔEHS–LS (kcal/mol) versus % HF exchange (0-50%) for Fe(II)(CNH)6, Fe(II)(CO)6, Fe(II)(NH3)6, and Fe(II)(NCH)6.]

Figure 2-3: Relative high spin (HS)-low spin (LS) energy (ΔEHS–LS) in kcal/mol of four Fe(II) octahedral complexes (two nitrogen ligands: NCH and NH3 and two carbon ligands: CNH and CO) with 3σ confidence interval from normal distribution poll data on hybrid exchange functionals, as indicated with black dashed lines and black arrow.

Despite high correlation coefficients, the fit to linear trend lines appears poorer in the case of Fe(II) compared to Fe(III). It is likely that extra degrees of freedom associated with the unpaired minority-spin 3d electron in Fe(II) make energetic predictions more sensitive to HF exchange ratios. Quadratic fits of the data were thus also obtained, leading naturally to an improved fit. First derivatives of the second-order polynomials give access to a range of instantaneous ∂ΔEHS–LS/∂αHF values. By definition, the derivatives obtained at 25% HF exchange from either approach match exactly, but this single value underestimates the slope magnitude at low αHF and overestimates it at high αHF. The carbon ligand slope ranges (in kcal/(mol·HFX)) are from -106 (αHF=0.5) to -206 (αHF=0.0) for CO and -114 (αHF=0.5) to -183 (αHF=0.0) for CNH. Nitrogen ligand ranges are slightly smaller: -48 to -81 for NH3 and -50 to -125 for NCH. Such ranges are subject to the number of data points and nature of the higher order fit, but they provide a reference frame for evaluating the magnitude of variation of derivatives. Thus, although there is a 23 kcal/(mol·HFX) difference in ∂ΔEHS–LS/∂αHF from linear regression for Fe(II)/N complexes, this difference is relatively small. As in the case of Fe(III), use of a modified PBE0 functional produces comparable slopes to those from modB3LYP. The average deviation in slopes between the two approaches is 5%, with the largest deviation being -10 kcal/(mol·HFX) for Fe(II)(CO)6. However, in light of the non-linearity analysis for Fe(II), it becomes clear that discrepancies between the two classes of tuned functionals are within the uncertainty of the slope assignment.
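The agreement between the linear slope and the quadratic derivative at 25% HF exchange can be illustrated numerically. The sketch below assumes the HF exchange values are sampled on an evenly spaced grid centred on 25% (an assumption about the data grid, not stated in the text) and uses a hypothetical quadratic whose derivative spans the -206 to -106 kcal/(mol·HFX) range quoted for Fe(II)(CO)6; the intercept is an arbitrary placeholder.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical quadratic model E(a) = c0 + c1*a + c2*a**2 whose derivative
# runs from -206 at a = 0.0 to -106 at a = 0.5. The intercept c0 is an
# arbitrary placeholder and does not affect any slope.
c0, c1, c2 = 0.0, -206.0, 100.0


def quad_derivative(a):
    """Instantaneous slope of the quadratic model at HF exchange fraction a."""
    return c1 + 2.0 * c2 * a


alphas = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]  # evenly spaced, midpoint 0.25
energies = [c0 + c1 * a + c2 * a * a for a in alphas]

linear_slope = ols_slope(alphas, energies)
print("linear-fit slope:          ", round(linear_slope, 6))
print("quadratic slope at a=0.25: ", quad_derivative(0.25))
```

For a symmetric grid the cubic moment about the mean vanishes, so the least-squares line through quadratic data has exactly the slope of the quadratic at the grid midpoint; on an asymmetric grid the two values would generally differ.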

Direct comparison of Fe(II) and Fe(III) nitrogen ligand trends permits identification of the magnitude of differences between the oxidation states (Fig. 2-4). The ΔEHS–LS splitting of Fe(II)(NCH)6 and Fe(III)(NCH)6 is nearly identical for data points in the 10-40% HF exchange range. Small differences in ΔEHS–LS of 1-2 kcal/mol shift the numerical slope by -8 kcal/(mol·HFX) from Fe(III) to Fe(II). In contrast, ΔEHS–LS shifts downward by nearly 20 kcal/mol from Fe(III)(NH3)6 to Fe(II)(NH3)6. The two NH3 curves appear parallel for HF exchange below 20%, but the rate of HS stabilization declines at higher % HF exchange for Fe(II). Such observations suggest a ∂ΔEHS–LS/∂αHF that depends more strongly on direct ligand identity than on oxidation state. Therefore, prediction variability with exchange-correlation parameter changes may be determined on a small set and broadly applied to an array of compounds.

[Figure 2-4 plot: ΔEHS–LS (kcal/mol) versus % HF exchange (0-50%) for Fe(II)(NH3)6, Fe(II)(NCH)6, Fe(III)(NH3)6, and Fe(III)(NCH)6.]

Figure 2-4: Relative high spin (HS)-low spin (LS) energy (ΔEHS–LS) in kcal/mol of octahedral complexes with nitrogen ligands (NCH, indicated in red, and NH3, indicated in blue) for both Fe(II) (diamond symbols) and Fe(III) (square symbols) with 3σ confidence interval from normal distribution poll data on hybrid exchange functionals, as indicated with black dashed lines and black arrow. The trend lines for Fe(II) are dashed lines, while the trend lines for Fe(III) are solid lines.

In order to investigate whether the correlations observed thus far hold for larger transition-metal complexes, we enlarge the data set for ∂ΔEHS–LS/∂αHF evaluation (structures in Fig. 2-5). For the Fe(III) compounds, there is a narrow range of derivatives from about -70 to -80 kcal/(mol·HFX) for the six nitrogen ligand sets and -105 to -110 kcal/(mol·HFX) for the three carbon ligand sets. The agreement for direct-ligand-based spin-state splitting dependence on exact exchange is less strong for Fe(II) compounds: four nitrogen complexes (HICPEQ, bpy, terpy, and PEPXEP) have values around -110 to -120 kcal/(mol·HFX), while NH3 and NCH are -63 and -86 kcal/(mol·HFX), respectively. For the Fe(II) carbon ligand sets, CO and CNH slopes are around -155 kcal/(mol·HFX) while CN is slightly shallower at -130 kcal/(mol·HFX). For almost all data points, the Fe(II) αHF gradients are larger in magnitude than the Fe(III) αHF gradients.

Figure 2-5: Ball-and-stick models of nitrogen- and carbon-ligand structures for octahedral iron complexes with compound labels used throughout text. The atoms are color-coded with nitrogen in blue, carbon in gray, hydrogen in white, and iron in maroon.

2.4 Trends in charge localization measures

Self-interaction-error induced delocalization is one rationale for why pure density functionals fail to correctly predict relative spin-state energetics in transition metal complexes. Localized 3d electrons are expected to be particularly sensitive to SIE, and low-spin states permit greater delocalization through higher occupancy of bonding orbitals than high-spin states do. In order to quantify the extent of charge localization on the transition metal center, we compute partial charges with NBO natural population analysis. The difference in charge obtained between the HS and LS states is then a predictor of the relative charge localization between the states:

$$\Delta q^{\mathrm{HS-LS}} = q_{\mathrm{Fe}}^{\mathrm{HS}} - q_{\mathrm{Fe}}^{\mathrm{LS}}, \tag{2.6}$$

where the charge on Fe for all cases considered is positive (i.e., the electron population on Fe is less than the atomic number of iron). A positive ΔqHS–LS corresponds to a net loss of electrons on the iron center from LS to HS states. For all 18 cases considered (both Fe(II) and Fe(III)), increasing the % of HF exchange increases the positive partial charge on Fe in both HS and LS states, which indicates an absence of charge localization on the Fe center through the inclusion of HF exchange. Instead, this effective delocalization of charge from the metal to neighboring ligands runs counter to the usual view of how HF exchange corrects SIE on transition metal valence states. A second metric is the dependence of the charge difference on the percentage of HF exchange,

$$\text{slope} = \frac{\Delta\Delta q^{\mathrm{HS-LS}}}{\Delta\alpha_{\mathrm{HF}}} \approx \frac{\partial\Delta q^{\mathrm{HS-LS}}}{\partial\alpha_{\mathrm{HF}}}, \tag{2.7}$$

where this derivative is obtained from linear regression of charge differences with αHF. Positive values for the charge difference derivative, ∂ΔqHS–LS/∂αHF, indicate that the HS state loses electrons to surrounding ligands faster than the LS state, and a negative value corresponds to the reverse.
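The two charge metrics of Eqs. 2.6 and 2.7 can be sketched in a few lines. The charges below are hypothetical placeholders chosen only to make the arithmetic easy to follow; they are not NBO results from this work, and the helper `ols_slope` is a plain least-squares slope written for this illustration.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical partial charges on Fe (in e) at six HF exchange fractions.
# Placeholder numbers for illustration only, not values from this study.
alphas = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
q_hs = [1.50, 1.54, 1.58, 1.62, 1.66, 1.70]
q_ls = [0.80, 0.83, 0.86, 0.89, 0.92, 0.95]

# Eq. 2.6: HS-LS charge difference at each HF exchange fraction.
delta_q = [hs - ls for hs, ls in zip(q_hs, q_ls)]

# Eq. 2.7: slope of the charge difference with respect to alpha (e/HFX).
dq_slope = ols_slope(alphas, delta_q)

print(f"delta_q at 20% HF exchange: {delta_q[2]:.2f} e")
print(f"d(delta_q)/d(alpha):        {dq_slope:.2f} e/HFX")
```

In this made-up case both states become more positive with increasing HF exchange, and the positive slope means the HS state loses electrons faster than the LS state.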

First, correlations between spin-state HF exchange dependence and the shift in charge from the LS to HS state (ΔqHS–LS) are compared for Fe(III) compounds (Fig. 2-6). Charge reduction from LS to HS is greater in the case of carbon (∼1.2 e⁻ loss) than nitrogen (∼0.6-0.8 e⁻ loss) ligands. Derivatives of ΔEHS–LS with respect to HF exchange demonstrate a strong correlation (R²=0.94) to the charge shift (ΔqHS–LS) evaluated at 20% HF exchange. That is, the more charge lost from the LS to the HS state on the iron center, the greater the stabilization of the HS state as the fraction of exact exchange is increased. The resulting linear-scaling relation from the least squares fit is

$$\frac{\partial\Delta E^{\mathrm{HS-LS}}}{\partial\alpha_{\mathrm{HF}}} \approx -73.5\,\Delta q^{\mathrm{HS-LS}}_{@20\%} - 19.6. \tag{2.8}$$

[Figure 2-6 plot: ∂ΔEHS–LS/∂αHF versus ΔqHS–LS for Fe(III) complexes; legend entries CNH, CO, CN, terpy, bpy, HICPEQ, PEPXEP, and NCH; linear fit with R² = 0.94.]

Figure 2-6: Plot of the derivative of spin-state splitting with Hartree-Fock exchange (αHF) against the difference in Fe(III) NBO charges between high-spin (HS) and low-spin (LS) states. Results are shown for six nitrogen containing ligands (blue symbols) and three carbon containing ligands (red symbols), as indicated with legend. A linear regression fit and associated R² value is also shown on the plot.

Extrapolation beyond the surveyed compounds may not be on firm footing, and collection of results on additional ligand sets could alter this linear-scaling relation. Nevertheless, this relationship suggests that if the relative charge increases on the HS state, then the dependence of the splitting will be reduced. In particular, a stationary point (i.e. ∂ΔEHS–LS/∂αHF = 0) may be extrapolated to occur at a ΔqHS–LS of -0.27 e⁻. That is, if the high-spin Fe(III) state accumulates charge with respect to the low-spin state, then increasing HF exchange from a pure GGA should not affect the HS-LS splitting. It is likely difficult to isolate such systems because covalent interactions are stronger in LS iron states than HS states, leading to a higher net charge. Nevertheless, this observation provides a search direction for identification of Fe(III) complexes that have relatively inert qualitative and quantitative spin-state orderings with respect to HF exchange. While it may be preferable to continue to develop electronic structure methods that are increasingly accurate, narrowing the focus of materials screening and discovery efforts to exchange-correlation-inert compounds is a potential direction to circumvent many of the uncertainties faced in applications of practical DFT. This narrowing still leaves a wide chemical space open that is inclusive of many potentially catalytically active iron complexes.

It is worthwhile to see whether these correlations hold for Fe(II) compounds (Fig. 2-7), recalling that for Fe(II) the dependence of spin-state splitting on HF exchange exhibited more variability and non-linear behavior. A comparable correlation is found for the Fe(II) compounds, although with a reduced correlation coefficient (R² = 0.67). Three Fe(II) complexes (NH3, NCH, and CN) lie above the best-fit line, and a shallower dependence on charge would be obtained if those three points were excluded. Using all nine data points, a linear-scaling relationship is obtained,

$$\frac{\partial\Delta E^{\mathrm{HS-LS}}}{\partial\alpha_{\mathrm{HF}}} \approx -67.7\,\Delta q^{\mathrm{HS-LS}}_{@20\%} - 47.0. \tag{2.9}$$

Despite the reduced quality of the fit for Fe(II) complexes with respect to Fe(III), the linear-scaling relationship is consistent. The second derivative of spin-state splittings with respect to both HF exchange and charge (-73.5 vs. -67.7 kcal/mol per HFX·e⁻) appears nearly invariant to oxidation state of the transition metal and direct ligand identity. One difference for Fe(II) is the prediction of the stationary point for spin-state splitting with respect to HF exchange at a higher charge increase on the HS state of ΔqHS–LS = -0.7 e⁻.

[Figure 2-7 plot: ∂ΔEHS–LS/∂αHF versus ΔqHS–LS for Fe(II) complexes; legend entries terpy, bpy, HICPEQ, PEPXEP, NCH, CNH, CO, and CN; linear fit with R² = 0.67.]

Figure 2-7: Plot of the derivative of spin-state splitting with Hartree-Fock exchange (αHF) against the difference in Fe(II) NBO charges between high-spin (HS) and low-spin (LS) states. Results are shown for six nitrogen containing ligands (blue symbols) and three carbon containing ligands (red symbols), as indicated with legend. A linear regression fit and associated R² value is also shown on the plot.

Now considering all 18 complexes, ∂ΔEHS–LS/∂αHF is plotted against the partial derivative of the charge difference with respect to HF exchange, ∂ΔqHS–LS/∂αHF, in Fig. 2-8. Both positive and negative values of ∂ΔqHS–LS/∂αHF (from -1.2 to 0.2 e⁻/HFX) are observed. Positive charge shift derivatives indicate that HS states lose charge faster than LS states with increasing αHF and are associated with Fe(III)-nitrogen complexes. The Fe(III)/N complexes also have the flattest dependence of spin-state splittings on HF exchange, suggesting that if increasing HF exchange causes depletion of charge in the HS state with respect to the LS state, the HF-exchange derived stabilization of the HS state is reduced.

Fe(II)-carbon complexes exhibit opposite behavior, with charge depleting more slowly from the HS state with increasing αHF and larger dependence of ΔEHS–LS on HF exchange. The Fe(III)/C and Fe(II)/N complexes are intermediate in their values of charge dependence on HF exchange. For the best fit line, data from all oxidation state/ligand combinations is included, excluding four outlying points: Fe(II)(NH3)6, Fe(II)(NCH)6, Fe(II)(terpy)2, and Fe(II)(CN)6. Exclusion of these four compounds leads to a strong correlation (R²=0.93) for the remaining 14 data points with a linear-scaling relationship,

$$\frac{\partial\Delta E^{\mathrm{HS-LS}}}{\partial\alpha_{\mathrm{HF}}} \approx 106\,\frac{\partial\Delta q^{\mathrm{HS-LS}}}{\partial\alpha_{\mathrm{HF}}} - 84. \tag{2.10}$$

This relationship provides quantitative support for the observation that αHF-inert complexes are most probable for cases where charge accumulates on the HS state with increasing HF exchange. A stationary point for ΔEHS–LS may be extrapolated to ∂ΔqHS–LS/∂αHF = 0.8 e⁻/HFX. This quantitative result suggests that a GGA will give the same spin-state ordering as a hybrid if increasing HF exchange causes increased localization of charge on the HS state with respect to the LS state. This result is somewhat surprising because when SIE is invoked for suggesting a hybrid functional over a GGA for 3d electrons, it is assumed that HF exchange should always enhance localization in a high-spin state. However, few compounds in the data set have positive derivatives of charge shift with respect to HF exchange. Identification of putative compounds that extend the set surveyed thus far would allow this prediction to be tested.

[Figure 2-8 plot: ∂ΔEHS–LS/∂αHF versus ∂ΔqHS–LS/∂αHF (from -1.2 to 0.2) for the FeII-C, FeII-N, FeIII-C, and FeIII-N classes; linear fit with R² = 0.93.]

Figure 2-8: Plot of the derivative of spin-state splitting with Hartree-Fock exchange (αHF) against the high-spin-low-spin NBO charge difference derivative with αHF. Results are shown for Fe(II) and Fe(III) with six nitrogen containing ligands (green symbols for Fe(II), blue for Fe(III)) and three carbon containing ligands (orange symbols for Fe(II), red for Fe(III)). Symbol shapes follow the legend in Figs. 2-6 and 2-7. A linear least-squares regression fit is indicated with a dashed line, with four outlying data points excluded from the fit as indicated by dashed outer circles.
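The stationary points quoted for the linear-scaling relations in Eqs. 2.8, 2.9, and 2.10 follow from setting each fitted line to zero. The short sketch below repeats that arithmetic using only the fitted coefficients given in the text; the function name `stationary_point` is introduced here for illustration.

```python
def stationary_point(slope, intercept):
    """Root x0 of the linear relation y = slope * x + intercept."""
    return -intercept / slope


# (slope, intercept) pairs copied from Eqs. 2.8, 2.9, and 2.10.
relations = {
    "Eq. 2.8, Fe(III): x = delta_q at 20% HF exchange (e)": (-73.5, -19.6),
    "Eq. 2.9, Fe(II): x = delta_q at 20% HF exchange (e)": (-67.7, -47.0),
    "Eq. 2.10, 14 points: x = d(delta_q)/d(alpha) (e/HFX)": (106.0, -84.0),
}

for label, (slope, intercept) in relations.items():
    x0 = stationary_point(slope, intercept)
    print(f"{label}: x0 = {x0:.2f}")
```

All three roots lie at or beyond the edge of the surveyed data, which is why the text treats them as extrapolations rather than observed exchange-inert complexes.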

We now provide justification for exclusion of the four outliers from the linear regression in Fig. 2-8. Fe(II)(NH3)6, Fe(II)(NCH)6, and Fe(II)(CN)6 were previously observed to have lower than expected spin-state splitting dependence on HF exchange (see Fig. 2-7). The linear fit to obtain ∂ΔqHS–LS/∂αHF for Fe(II)(NH3)6 and Fe(II)(CN)6 is poorer with respect to other data points because first derivatives increase abruptly around 20-30% HF exchange. Earlier analysis also identified that depending on the evaluation point (e.g. at lower HF exchange), values of ∂ΔEHS–LS/∂αHF for NCH and NH3 ligands could be evaluated as significantly higher than values obtained at 25% HF exchange. For Fe(II)(CN)6, the net relative increase in 3d occupations in the HS state over the LS state for increasing HF exchange (∂Δ3dHS–LS/∂αHF) is double that observed for other compounds (1.2 e⁻/HFX vs. 0.6 e⁻/HFX). Generally, both HS and LS states lose 3d occupancy with increasing αHF, and the LS state has higher overall 3d occupancy than the HS state. For Fe(II)(CN)6, the LS state loses electrons much more rapidly than the HS state, suggesting that HF exchange has a much stronger effect on the LS state than the HS state. The Fe(II)(terpy)2 data point is omitted to avoid asymmetric exclusion of outlying data points, which would skew the trend line. One further justification is that the Fe(II)(terpy)2 complex was challenging to converge, and there is some increased variability in the fit for ∂ΔqHS–LS/∂αHF as a result.

2.5 Corroborating geometric and energetic relationships

In all studies, we observe consistently increased variability in both energetic and charge descriptions of Fe(II) octahedral complexes over Fe(III) complexes. There also appears to be a stronger dependence of energetic properties on HF exchange for carbon ligands with respect to nitrogen ligands. Structural correlations (ΔRHS–LS in Table 2.2 and the partial derivative in Table 2.3) support the charge and energetic trends. As expected, bond elongation occurs for octahedral complexes in the HS state, but ΔRHS–LS at 20% HF exchange is largest for the Fe(II)/carbon complexes, both in magnitude (avg=0.38 Å) and in range (max-min=0.08 Å). This ΔRHS–LS shift is smallest for Fe(III)/nitrogen complexes (avg=0.16 Å), and the range is also reduced. The partial derivative of bond length shifts with respect to HF exchange is negative in all cases, with comparable averages across all compound classes of about -0.15 to -0.09 Å/HFX. The Fe(II)/carbon octahedral complexes again have the highest degree of variability. Negative derivatives in the case of all complexes indicate that the inclusion of increasing percentages of HF exchange makes the formal bond order between HS and LS states more equivalent. In Fe(II) complexes, this behavior corresponds to bond elongation in the LS state and unchanging or slightly decreasing HS state bond lengths, suggesting that HF exchange is reducing covalent hybridization in LS state bonding orbitals. The majority of Fe(III) complexes instead exhibit negative derivatives due to decreases in bond length for the high-spin complex, indicating stronger bonding, and unchanging bond lengths in the low-spin complex.

ΔRHS–LS (Å)
M-L class    avg    min    max
Fe(II)-C     0.38   0.34   0.42
Fe(II)-N     0.21   0.20   0.23
Fe(III)-C    0.26   0.25   0.28
Fe(III)-N    0.16   0.15   0.17

Table 2.2: Difference in bond length at 20% Hartree-Fock exchange between high-spin (HS) and low-spin (LS) complexes. The average (avg), minimum (min), and maximum (max) differences are reported for each class of metal-ligand complexes.

∂ΔRHS–LS/∂αHF (Å/HFX)
M-L class    avg     min     max
Fe(II)-C     -0.15   -0.28   -0.05
Fe(II)-N     -0.12   -0.17   -0.10
Fe(III)-C    -0.14   -0.18   -0.11
Fe(III)-N    -0.09   -0.11   -0.08

Table 2.3: Derivative of bond length differences between high-spin (HS) and low-spin (LS) complexes with respect to Hartree-Fock exchange. The average (avg), minimum (min), and maximum (max) differences are reported for each class of metal-ligand complexes.
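Taken together, Tables 2.2 and 2.3 allow a first-order estimate of the HS-LS bond length difference at HF exchange fractions other than 20%. The sketch below is only an illustration: it assumes a linear dependence on αHF and pairs each class-average value with the class-average derivative, which is an approximation not made in the text. The names `bond_data` and `delta_r_estimate` are hypothetical.

```python
# Class averages copied from Tables 2.2 and 2.3:
# (delta_R at 20% HF exchange in Angstrom, d(delta_R)/d(alpha) in Angstrom/HFX)
bond_data = {
    "Fe(II)-C": (0.38, -0.15),
    "Fe(II)-N": (0.21, -0.12),
    "Fe(III)-C": (0.26, -0.14),
    "Fe(III)-N": (0.16, -0.09),
}


def delta_r_estimate(ml_class, alpha, alpha_ref=0.20):
    """First-order estimate of the HS-LS bond length difference at alpha."""
    delta_r_ref, slope = bond_data[ml_class]
    return delta_r_ref + slope * (alpha - alpha_ref)


for ml_class in bond_data:
    estimate = delta_r_estimate(ml_class, 0.25)
    print(f"{ml_class}: {estimate:.4f} Angstrom at 25% HF exchange")
```

The negative derivatives mean every estimate shrinks as αHF grows, consistent with the statement that HF exchange makes HS and LS bonding more equivalent.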

2.6 Quantitative vs. qualitative spin-state ordering

In earlier spin-state dependence observations (see Figs. 2-5 and 2-6), qualitative spin-state assignment and quantitative spin-state dependence on HF exchange appeared decoupled. That is, if HS and LS states were nearly degenerate at 20% HF exchange, the variation of those spin-state splittings with respect to HF exchange was reduced compared to cases where the HS-LS splitting was high. We now consider the strength of this correlation on all 18 octahedral complexes in this study (Fig. 2-9). Over the data set, complexes that are HS at 20% HF exchange have the smallest dependence of splittings on HF exchange. Such a result suggests that stable HS complexes should be more exact-exchange inert. When the 18 octahedral complexes are grouped by oxidation state, relatively good correlations are observed in the fit for all nine Fe(III) complexes (R²=0.80) and nine Fe(II) complexes (R²=0.72). In both cases, a stationary point for HF exchange dependence of spin-state splitting is predicted at a ΔEHS–LS around -65 to -73 kcal/mol. Such compounds would be both quantitatively and qualitatively identified as high-spin and would likely correspond to weak field ligand interactions. A recent study [121] of eight density functionals provides further validation that stronger ligands are more sensitive to differences in functional parameters including, but not limited to, HF exchange mixing [121, 158–160]. Weak ligands that can test the exchange inert ranges should be discoverable, as ΔEHS–LS in isolated Fe2+ and Fe3+ gas phase ions is -86 and -134 kcal/mol, respectively [162]. Another target for extrapolation from linear scaling relations is that the energy should change by no more than 2 kcal/mol (i.e., ±1 kcal/mol chemical accuracy with respect to the midpoint of the normal distribution) over the 3σ confidence interval (12.7-28.3% HF exchange) in the functional. Using this measure, such narrow uncertainties would occur for octahedral complexes with relative HS spin state energetics at around -53 to -64 kcal/mol.
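The 2 kcal/mol target can be restated as a bound on the slope itself. Assuming the linear dependence on αHF used throughout this chapter, the width of the 3σ interval fixes the largest tolerable slope magnitude:

```latex
\left|\frac{\partial \Delta E^{\mathrm{HS-LS}}}{\partial \alpha_{\mathrm{HF}}}\right|
\,(0.283 - 0.127) \le 2~\mathrm{kcal/mol}
\quad\Longrightarrow\quad
\left|\frac{\partial \Delta E^{\mathrm{HS-LS}}}{\partial \alpha_{\mathrm{HF}}}\right|
\le \frac{2}{0.156} \approx 12.8~\frac{\mathrm{kcal}}{\mathrm{mol\cdot HFX}}
```

This bound is several times smaller in magnitude than the shallowest slope reported above (-63 kcal/(mol·HFX)), which is why the corresponding ΔEHS–LS range has to be obtained by extrapolating the fitted lines.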

2.7 Conclusions

In this study, we have quantified ranges of property prediction for transition metal complexes based on exchange-correlation functional choice over a range of the most commonly used functional parameters (i.e. GGA hybrids with a typical HF exchange range (3σ) of 12.7-28.3%). Despite the proliferation of a "zoo" of functionals, we observed qualitative agreement for various functionals within a functional class when comparing spin-state ordering across ten representative Fe(II) and Fe(III) octahedral complexes with various ligands. We used a fixed GGA/LDA ratio modified B3LYP functional to study the dependence of properties across the 3σ interval and beyond (0-50%). With increasing HF exchange, we observed strong high-spin stabilization over low-spin complexes, as much as 1-2 kcal/mol per 1% HF exchange. High HF exchange (> 30%) even overrode qualitative ligand field theory arguments by stabilizing high-spin, strong-ligand Fe(CO)6 complexes. While HS-LS energetics depend strongly on HF exchange, the variation was linear in nature and broadly applicable across oxidation state, and varied most with respect to the element of the direct ligand. These observations suggest that HF-exchange dependence may be straightforwardly determined across broad classes of materials by a handful of calculations on representative molecules.

[Figure 2-9 plot: ∂ΔEHS–LS/∂αHF versus ΔEHS–LS at 20% HF exchange for the FeII-C, FeII-N, FeIII-C, and FeIII-N classes; linear fits with R² = 0.80 for Fe(III) and R² = 0.72 for Fe(II).]

Figure 2-9: Plot of the derivative of spin-state splitting with Hartree-Fock exchange (αHF) against spin-state splitting at 20% HF exchange. Results are shown for Fe(II) and Fe(III) with six nitrogen containing ligands (green symbols for Fe(II), blue for Fe(III)) and three carbon containing ligands (orange symbols for Fe(II), red for Fe(III)). Symbol shapes follow the legend in Figs. 2-6 and 2-7. Two best-fit lines are obtained for Fe(II) and Fe(III) compounds separately, as indicated by dashed lines.

We then identified the extent to which tuning HF exchange changes the underlying charge density. We made the surprising observation that the electron population on iron decreases (i.e., the positive partial charge increases) as HF exchange is increased, corresponding to 3d electron delocalization to ligands. This runs counter to typical explanations of how HF exchange can approximately correct self-interaction errors. Further, we identified a general correlation between the dependence of HS-LS splitting on HF exchange and HS-LS charge differences and their first derivatives, suggesting that when the charges are more balanced between HS and LS states, the energy dependence on HF exchange should be reduced. This work echoes prior observations of the lack of a one-size-fits-all percentage of HF exchange for transition-metal complexes. Nevertheless, we have identified relative uncertainties in spin-state ordering that correlate well with broad metal and ligand identities, and we have identified weak field ligands and balanced HS-LS partial charges to be good objectives in catalyst and materials design, because molecules with such properties will have reduced sensitivity to exchange-correlation functional choice.

Chapter 3

Effect of meta-GGA exchange

Density functional theory exchange-correlation (xc) approximations are often expressed in terms of a "Jacob's ladder" [163], where the first rung is the local density approximation (LDA) that depends only on the density at a given point, and increasing information about the environment through first and second derivatives of the density is incorporated into the second-rung generalized gradient approximation (GGA) and third-rung meta-GGAs, respectively. The popular non-local, hybrid functionals that incorporate a fraction of exact exchange from Hartree-Fock (HF) theory, and therefore belong to a generalized Kohn-Sham formalism [164], have typically demonstrated [12, 165, 166] improved performance over pure xc approximations in a wide variety of molecules [94,120,167]. Whereas the first three rungs of pure xc approximations are generally expected to improve descriptions of electron correlation, the incorporation of HF exchange serves to more efficiently correct self-interaction errors (SIE) [168] introduced in the Kohn-Sham approach to modeling Coulomb repulsion of each electron with the total density of the system [3].

Nevertheless, the 10-100x additional cost of evaluating non-local exchange within periodic boundary conditions [169, 170], as well as the high sensitivity, and therefore uncertain accuracy, of transition metal complex properties with respect to HF exchange [94, 119, 120], have motivated the ongoing development [8,154,171–176] of increasingly accurate, pure xc approximations. In the solid state, the alternative DFT+U approach [130], wherein a Hubbard model correction [177, 178] is added to local or semi-local functionals, approximately corrects SIE with little or no additional computational cost. However, DFT+U [179,180] necessitates determination of the atoms and subshells to which a U correction should be applied as well as the magnitude of U, either through fitting or calculation. Therefore, there has been ongoing interest in developing higher-rung, pure xc approximations (i.e., meta-GGAs) as an efficient black box approach for transition-metal-containing molecules and solids. The improved performance of meta-GGAs has been demonstrated with the TPSS [8] functional for transition metal diatomics [29], the more heavily parameterized M06-L [181] for heats of formation of transition metal complexes, and the recently introduced SCAN [171] for transition-metal oxide polymorphs [182] and generally for solids [183]. Previous work [94,121,126] has additionally shown evidence that meta-GGAs improve over SIE-prone GGA predictions of spin-state ordering for octahedral transition-metal complexes. In an effort to improve predictions of hybrid functionals, most meta-GGAs have been developed in conjunction with HF exchange, and although there have been many systematic studies of the role of HF exchange on the spin-state ordering of transition metal complexes, no such systematic study of the role of meta-GGA exchange tuning has been carried out.

In this study, we introduce and apply a strategy to continuously tune from GGA to meta-GGA exchange and identify the effect of meta-GGA exchange on energetic and electronic properties of transition-metal complexes.

3.1 Theory

The most commonly employed exchange-correlation approximations within practical DFT are those that depend only locally on the density (i.e., the local density approximation or LDA) or also on the re-scaled gradient of the density (i.e., the generalized-gradient approximation or GGA). The widely employed Perdew-Burke-Ernzerhof (PBE) [153] GGA exchange energy is:

$$E_x^{\mathrm{PBE}}(\rho_\alpha, \rho_\beta) = \int \rho\,\varepsilon_x^{\mathrm{UEG}}(\rho_\alpha, \rho_\beta)\,F_x^{\mathrm{PBE}}(s)\,d\mathbf{r}, \tag{3.1}$$

where $\varepsilon_x^{\mathrm{UEG}}$ is the exchange energy per particle of the uniform electron gas. The additional contribution of the electron density gradient to standard LDA exchange is incorporated through the exchange enhancement factor, $F_x^{\mathrm{PBE}}$:

$$F_x^{\mathrm{PBE}} = 1 + \lambda - \frac{\lambda}{1 + \mu s^2/\lambda}, \tag{3.2}$$

where λ and μ are non-empirical constants, with μ selected to recover the local-spin density linear response [184] and λ to satisfy the Lieb-Oxford bound [185], and s = s(ρ, ∇ρ) is the reduced density gradient. Despite the formally parameter-free nature of the PBE functional, it and other GGAs suffer from self-interaction error (SIE) due to non-cancellation by the xc functional of the overestimated Coulomb repulsion. Because low- and high-spin states of open-shell transition metal complexes have differing fractions of delocalized (bonding) versus localized (non- or antibonding) states, LDA or GGA descriptions of exchange within DFT are known to systematically destabilize high-spin states [94, 121]. Ganzenmüller et al. [120] quantified this effect on FeH6, showing that increasing GGA exchange favors low-spin states, whereas increasing HF exchange favors high-spin states.

Beyond the LDA and GGA, meta-GGA functionals represent a higher rung on the "Jacob's ladder" of DFT by incorporating the Laplacian of the density or the kinetic energy density. Among pure meta-GGAs, the Tao, Perdew, Staroverov, and Scuseria (TPSS) [8] meta-GGA satisfies all of the exact or nearly exact constraints satisfied by PBE as well as three additional constraints [186] without relying on any empirical parameters. The TPSS functional has produced promising results on both molecules [187] and solids [188]. Furche and Perdew showed that the TPSS meta-GGA has improved accuracy in predicting energetic and structural properties for transition metal dimers and oxides compared to GGAs without significant computational overhead [29]. The TPSS exchange energy is expressed in a manner similar to the PBE GGA exchange energy:

$$E_x^{\mathrm{TPSS}}(\rho_\alpha, \rho_\beta) = \int \rho\,\varepsilon_x^{\mathrm{UEG}}(\rho_\alpha, \rho_\beta)\,F_x^{\mathrm{TPSS}}(s)\,d\mathbf{r}, \tag{3.3}$$

with a modified enhancement factor:

$$F_x^{\mathrm{TPSS}} = 1 + \lambda - \frac{\lambda}{1 + x/\lambda}. \tag{3.4}$$

Here λ is the same constant as in Eq. 3.2 and x = x(ρ, ∇ρ, τ) is a function of the electron density, the electron density gradient, and the kinetic energy density τ, which enters through the reduced kinetic energy, α, as defined by Tao et al. [8]. The similarities in exchange energy expressions for PBE and TPSS enable construction of a combined form of exchange where the pure gradient contribution can be separated from the meta-GGA contribution. Including an additional weighting factor ax between the two contributions to exchange, the meta-GGA enhancement factor becomes:

$$F_x^{\mathrm{TPSS/PBE}} = 1 + \lambda - \frac{\lambda}{1 + \left[a_x x + (1 - a_x)\mu s^2\right]/\lambda}. \tag{3.5}$$

By varying ax, we can smoothly vary exchange from the PBE GGA limit (ax → 0) to the TPSS meta-GGA limit (ax → 1), in a fashion similar to the tuning of exact exchange in hybrid functionals [94]. The exchange enhancement factor for intermediate values of ax is a non-linear admixture of the PBE and the TPSS values (Figure 3-1). This tuning strategy serves to provide insight on the effect of meta-GGA exchange on spin-state energetics and electronic structure properties for the transition metal complexes studied here.

[Figure 3-1 plot: two panels, (a) and (b), of Fx versus s (0 to 3.0); curves labeled PBE and either TPSS or TPSS/PBE at α = 0.0, 0.1, 1.0, and ∞.]

Figure 3-1: (a) TPSS (solid lines) and PBE (dashed line) enhancement factors, Fx, as functions of the reduced gradient s for four different values of the reduced kinetic energy, α. (b) TPSS/PBE (solid lines) and PBE (dashed line) enhancement factors, Fx, as functions of the reduced gradient s for four different values of the reduced kinetic energy, α, at ax = 0.5.

Recently, Sun, Ruzsinszky and Perdew developed the strongly constrained and appropriately normed (SCAN) [171] meta-GGA that satisfies all 17 exact constraints appropriate to a semi-local functional, intended for broad applicability in molecular [189] and solid state chemistry [190]. Unlike TPSS, the highly non-linear functional form for the enhancement factor, Fx, in SCAN does not allow the continuous shift from GGA to meta-GGA exchange, and we therefore focus on TPSS tuning in this work. Nevertheless, due to the promise of SCAN that has recently been demonstrated for transition metal complexes [182], we also provide comparison between absolute predictions of TPSS and SCAN (see Sec. 3.4).

3.2 Computational details

Atoms. For the +2- and +3-charged transition metal atoms (Ti-Cu), we performed single point energy calculations in the CPU-based open source software GAMESS-US [191] using Dunning's correlation-consistent triple-ζ basis set [192] (cc-pVTZ) and the following exchange and correlation functionals: pure Hartree-Fock, PBE [153], revTPSS [176] and the modified TPSS functional in 10-25% increments of meta-GGA exchange, starting at 0% up to 100% meta-GGA. For all unrestricted calculations, the default values for level shifting and convergence tolerance (1×10⁻⁵ a.u.) were used. The calculated energies for the single atoms were compared with reference data from the National Institute of Standards and Technology [162] (NIST) database. The NIST energies, electron configurations, and spin multiplicities studied in this work are summarized in Table 3.1.

Inorganic complexes. For each of the 8 transition metals studied here (Ti-Cu), we generated a set of 3 transition metal complexes, one coordinated with a strong-field carbonyl ligand, one with the intermediate-field ligand ammonia, and one with the weak-field water ligand, for a total of 24 structures. The same oxidation and spin states were studied as in the single atom calculations by assigning charge and spin state for the total complex. For each of the octahedral structures generated, we performed initial geometry optimizations in GAMESS-US with the 6-31G* basis set [193] for all atoms. We employed the modified mTPSS functional in increments of 10% meta-GGA exchange starting at 0% (pure PBE exchange) up to 100% meta-GGA (pure TPSS exchange) for a total of 11 calculations. Default values were used for level shifting and the convergence criteria were set to 1×10⁻⁴ a.u. for both the geometry optimization and the self-consistent field calculation. For each optimized structure, we then performed a single point energy calculation with the larger triple-ζ def2-TZVP basis set [194] from Ahlrichs and co-workers using the same values for level shifting and the SCF convergence criteria as in the smaller basis set.

Preparation and Post-processing. All structures, input files, and job scripts were generated using molSimplify [195] (see Chapter 4). For the single atoms, no ligands were specified, while for the octahedral complexes, we specified one ligand at a time and generated symmetric octahedral six-coordinate metal complexes. Quantitative determination of the charges and occupation of subshells (i.e. 3d and 4s) was obtained from the GAMESS-US interface with the Natural Bond Orbital (NBO) v6.0 package [151].

Topological analysis was performed within the context of quantum theory of atoms in molecules (QTAIM) [196] using the Multiwfn post-processing package [197]. QTAIM relies on quantum observables such as the electron density, ρ(r), and energy densities to define atoms and bonds within molecules. The topology of the electron density is dominated by the attractive forces of the nuclei that correspond to a sharp maximum of the electron density at each nucleus. A consequence of nuclear maxima in the electron density distribution is the association of an atom with a region of space the boundaries of which are determined by the balance in the forces the neighboring nuclei exert on the electrons. A critical point (CP) in the electron density is a point in space where the first derivatives of the density become zero:

$$\mathrm{CPs} = \{P \in \mathbb{R}^3 : \nabla\rho(P) = 0\}. \tag{3.6}$$

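These critical points can be classified according to their rank (ω) and signature (σ) and are symbolized by (ω,σ). These two measures correspond to properties of the Hessian matrix defined as:

$$H(\mathbf{r}_{\mathrm{CP}}) = \begin{pmatrix} \dfrac{\partial^2\rho}{\partial x^2} & \dfrac{\partial^2\rho}{\partial x\,\partial y} & \dfrac{\partial^2\rho}{\partial x\,\partial z} \\ \dfrac{\partial^2\rho}{\partial y\,\partial x} & \dfrac{\partial^2\rho}{\partial y^2} & \dfrac{\partial^2\rho}{\partial y\,\partial z} \\ \dfrac{\partial^2\rho}{\partial z\,\partial x} & \dfrac{\partial^2\rho}{\partial z\,\partial y} & \dfrac{\partial^2\rho}{\partial z^2} \end{pmatrix}_{\mathbf{r} = \mathbf{r}_{\mathrm{CP}}}. \tag{3.7}$$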

The rank is the number of non-zero curvatures of ρ at the critical point whereas the signature is the algebraic sum of the signs of the curvatures. There are only four types of stable critical points that are classified according to their rank and signature as:

• (3,-3) corresponds to a local maximum of ρ

• (3,-1) corresponds to a local maximum of ρ in the plane defined by the corresponding eigenvectors but is a minimum along the third axis perpendicular to this plane

• (3,+1) corresponds to a local minimum of ρ in the plane defined by the corresponding eigenvectors but is a maximum along the third axis perpendicular to this plane

• (3,+3) corresponds to a local minimum of ρ.

Each type of critical point can be matched with an element of the chemical structure as:

• (3,-3) Nuclear critical point (NCP)

• (3,-1) Bond critical point (BCP)

• (3,+1) Ring critical point (RCP)

• (3,+3) Cage critical point (CCP).
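The rank and signature bookkeeping above can be sketched in a few lines of Python. This is an illustrative helper only, not part of Multiwfn or of the workflow used here: it assumes the three curvatures (Hessian eigenvalues of ρ at the critical point) have already been computed, and the function name and tolerance are hypothetical.

```python
def classify_critical_point(curvatures, tol=1e-8):
    """Return (rank, signature, label) from the Hessian eigenvalues of rho.

    rank      = number of non-zero curvatures
    signature = algebraic sum of the signs of the non-zero curvatures
    """
    nonzero = [c for c in curvatures if abs(c) > tol]
    rank = len(nonzero)
    signature = sum(1 if c > 0 else -1 for c in nonzero)
    labels = {
        (3, -3): "NCP",  # nuclear critical point, local maximum
        (3, -1): "BCP",  # bond critical point
        (3, +1): "RCP",  # ring critical point
        (3, +3): "CCP",  # cage critical point, local minimum
    }
    # Anything with rank < 3 is not one of the four stable types.
    return rank, signature, labels.get((rank, signature), "degenerate")


if __name__ == "__main__":
    # HYPOTHETICAL curvatures: two negative, one positive.
    print(classify_critical_point([-0.52, -0.48, 0.21]))
```

Two negative curvatures and one positive curvature give (3,-1), which is the bond critical point type used for the metal-ligand analysis in this chapter.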

By joining adjacent identified BCPs we can partition space into basins that can then be assigned to atoms with NCPs that lie within these basins. Properties such as charge or electron delocalization can then be calculated by integrating functions of the electron density within these basins. In this application, we will identify the BCPs along a metal-ligand bond and then calculate properties of interest at these points that can be useful descriptors of the bond strength and the electron delocalization.

3.3 Effect of meta-GGA exchange on single-ion spin-state splittings

In the first set of calculations, we evaluated the energies of different spin states on the set of 8 first-row transition metal ions from Ti to Cu in the +2 and +3 oxidation states using 0% and 100% meta-GGA exchange, which correspond to PBE and TPSS exchange, respectively. The results for the various spin states were then compared with available experimental data (Table 3.1) from the NIST database and the corresponding error in the relative energetics of high- and low-spin states computed as:

$$\mathrm{Error} = \Delta E_{\mathrm{DFT}}^{\mathrm{HS-LS}} - \Delta E_{\mathrm{NIST}}^{\mathrm{HS-LS}}. \tag{3.8}$$
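As a worked illustration of Eq. 3.8, the short Python sketch below converts a NIST splitting from eV to kcal/mol and evaluates the error for a single ion. The Fe2+ reference energies are the ones listed in Table 3.1; the DFT value is a hypothetical placeholder, not a result from this work, and the splitting is taken as E(HS) minus E(LS), following the sign convention used in this chapter.

```python
EV_TO_KCAL_PER_MOL = 23.0605  # 1 eV expressed in kcal/mol


def hs_ls_splitting(e_hs, e_ls):
    """Splitting defined as E(HS) - E(LS); negative when HS is the ground state."""
    return e_hs - e_ls


def splitting_error(de_dft, de_ref):
    """Eq. 3.8: signed error of the computed splitting against the reference."""
    return de_dft - de_ref


# Fe2+ in Table 3.1: HS quintet at 0.00 eV, LS singlet at 3.76 eV.
nist_fe2 = hs_ls_splitting(0.00, 3.76) * EV_TO_KCAL_PER_MOL

# HYPOTHETICAL DFT splitting in kcal/mol, used only to show the arithmetic.
dft_fe2 = -80.0

if __name__ == "__main__":
    print(f"NIST splitting: {nist_fe2:.1f} kcal/mol")
    print(f"Error:          {splitting_error(dft_fe2, nist_fe2):.1f} kcal/mol")
```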

The results indicate that meta-GGA exchange improves predictions for spin-state splittings in most of the cases studied (Figure 3-2) compared to GGA values. Pure GGA exchange overestimates the splittings in the case of early (Ti2+, V2+/3+, Cr3+) and late (Co2+, Ni2+/3+, Cu3+) transition metal ions, whereas it underestimates the splitting for mid-row transition metals (Cr2+, Fe2+/3+, Co3+) with the exception of Mn2+ where GGA exchange predicts a larger energy splitting. Introducing meta-GGA exchange reduces the error in 10 out of 13 cases with the strongest effect observed in mid-row transition metal ions where the error drops below 5 kcal/mol in all cases. For the early transition metals Ti2+ and V3+ the error is slightly reduced by 1 kcal/mol whereas in the case of V2+ and Cr3+ the already overestimated spin-state splitting is further increased by approximately 5 kcal/mol. Similarly, meta-GGA exchange slightly reduces the error of Ni2+/3+ and Cu3+ by 2, 1 and 2 kcal/mol respectively whereas the spin-state splitting error for Co2+ is increased by 2 kcal/mol.

               M(II)                                        M(III)
Element        HS             LS             IS             HS             LS             IS
Ti(3d2/–)      triplet 0.00   singlet 1.05   –              –              –              –
V(3d3/3d2)     quartet 0.00   doublet 1.48   –              triplet 0.00   singlet 1.36   –
Cr(3d4/3d3)    quintet 0.00   singlet 3.12   triplet 2.08   quartet 0.00   doublet 1.87   –
Mn(3d5/3d4)    sextet 0.00    doublet 4.85   quartet 3.32   quintet 0.00   singlet –      triplet 2.56
Fe(3d6/3d5)    quintet 0.00   singlet 3.76   triplet 2.40   sextet 0.00    doublet 5.83   quartet 3.99
Co(3d7/3d6)    quartet 0.00   doublet 2.10   –              quintet 0.00   singlet 4.45   triplet 2.83
Ni(3d8/3d7)    triplet 0.00   singlet 1.73   –              quartet 0.00   doublet 2.46   –
Cu(–/3d8)      –              –              –              triplet 0.00   singlet 2.01   –

Table 3.1: High- (HS), low- (LS) and intermediate-spin (IS) states with corresponding electron configurations and experimental relative energies (in eV) from the NIST database for the 8 first-row transition metals considered.

3.4 Dependence of spin-state ordering on meta-GGA exchange

The next set of calculations included evaluation of the spin-state splitting of coordination complexes with the modified TPSS functional. In order to assess the sensitivity of the results to the choice of meta-GGA functional used, we initially performed calculations using both the original TPSS and the SCAN meta-GGA functionals. The TPSS and SCAN meta-GGAs predict spin-state splittings of comparable accuracy with respect to a CASPT2 reference for a set of four representative iron octahedral coordination complexes (Table 3.2). The unsigned average disagreement between TPSS and SCAN is 4 kcal/mol, with smaller discrepancies for carbonyl (3.2 kcal/mol) and phosphine ligands (2.9 kcal/mol), whereas the largest disagreement is observed for the weak-field, hexa-aqua complex (5.6 kcal/mol). These meta-GGA differences do not correspond to a systematic preference in spin state, nor one that is strongly ligand-dependent, as SCAN slightly favors the HS state in two cases (strong-field carbonyl and weak-field water ligands) and the LS state in the other two cases (strong-field phosphine and weak-field ammonia). In terms of qualitative spin-state assignment, the two meta-GGAs differ only in the predicted ground state for Fe(II)(NH3)6, which is an LS ground state with SCAN and HS with TPSS, although this 4 kcal/mol prediction difference would be sensitive to inclusion of zero-point vibrational energy, entropic, and solvation effects. Furthermore, these small differences between the two functionals do not lead to systematic improvement with respect to a CASPT2 reference for one over the other. The mean unsigned error is comparable for TPSS (21.9 kcal/mol) and SCAN (20.0 kcal/mol). The maximum error for TPSS (25.9 kcal/mol) is also only slightly larger than the SCAN maximum error (22.9 kcal/mol). Although a systematic tuning of the contribution of meta-GGA exchange is only straightforward with the TPSS functional, we anticipate the similar performance of the two functionals should make conclusions drawn from analysis of TPSS calculations valid for SCAN as well.

Figure 3-2: Error (in kcal/mol) for the relative energetics of the high- and low-spin states ($\Delta E^{\mathrm{HS-LS}}$) for 8 first-row transition metal ions in the +2 and +3 oxidation state. Gray circles indicate results at 0% and triangles results at 100% meta-GGA exchange. Green and red correspond to improved and worsened predictions respectively.

Complex               TPSS     SCAN     Error TPSS   Error SCAN   Reference
Fe(II)(PH3)6          36.4     39.3     -19.1        -16.2        55.5 [198]
Fe(II)(CO)6           71.5     68.3     23.8         20.6         47.7 [198]
Fe(II)(NH3)6          -1.7     2.6      18.6         22.9         -20.3 [199]
Fe(II)(H2O)6          -20.8    -26.4    25.9         20.3         -46.7 [199]
Mean absolute error   –        –        21.9         20.0         –

Table 3.2: Energy difference ($\Delta E^{\mathrm{HS-LS}}$) in kcal/mol for representative iron octahedral coordination complexes calculated with TPSS and SCAN and the def2-TZVP basis set. The corresponding error and mean absolute error (in kcal/mol) with respect to CASPT2 reference values are also reported.
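The summary statistics quoted for Table 3.2 follow directly from the tabulated splittings. The Python sketch below recomputes them from the table entries; the dictionary layout and function name are illustrative choices, not part of the original workflow.

```python
# (TPSS, SCAN, CASPT2 reference) HS-LS splittings in kcal/mol from Table 3.2.
TABLE_3_2 = {
    "Fe(II)(PH3)6": (36.4, 39.3, 55.5),
    "Fe(II)(CO)6": (71.5, 68.3, 47.7),
    "Fe(II)(NH3)6": (-1.7, 2.6, -20.3),
    "Fe(II)(H2O)6": (-20.8, -26.4, -46.7),
}


def mean_unsigned_difference(a, b):
    """Average of |a_i - b_i| over paired values."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


tpss = [v[0] for v in TABLE_3_2.values()]
scan = [v[1] for v in TABLE_3_2.values()]
ref = [v[2] for v in TABLE_3_2.values()]

mue_tpss = mean_unsigned_difference(tpss, ref)        # error of TPSS vs CASPT2
mue_scan = mean_unsigned_difference(scan, ref)        # error of SCAN vs CASPT2
tpss_vs_scan = mean_unsigned_difference(tpss, scan)   # functional disagreement

if __name__ == "__main__":
    print(f"MUE TPSS: {mue_tpss:.2f}  MUE SCAN: {mue_scan:.2f}")
    print(f"Mean |TPSS - SCAN|: {tpss_vs_scan:.2f}")
```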

We approximate the partial derivative of the relative electronic energy between HS and LS states ($\Delta E^{\mathrm{HS-LS}}$) with respect to meta-GGA exchange ($a_x$) with a linear regression fit:

$$\frac{\partial \Delta E^{\mathrm{HS-LS}}}{\partial a_x} \approx \frac{\Delta\Delta E^{\mathrm{HS-LS}}}{\Delta a_x}. \tag{3.9}$$

The $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ provides a measure of sensitivity of spin-state energetics to meta-GGA exchange. We also introduce the notation "MX" where one unit of MX corresponds to the range from $a_x$ = 0 (0%) to $a_x$ = 1 (100%) meta-GGA exchange. The calculated $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ for the octahedral complexes and the single metal atoms indicate a strong metal and ligand dependence of the $\Delta E^{\mathrm{HS-LS}}$ meta-GGA sensitivity (Figure 3-3) with values ranging from -7.7 kcal/(mol·MX) for Fe(II)(CO)6 to 12.9 kcal/(mol·MX) for the isolated Mn2+ atom. Note that the partially filled point in Figure 3-3 that corresponds to Mn3+ was excluded from the atom analysis due to significant spin contamination in the singlet spin state, with a reported $\langle S^2 \rangle$ = 1.95 versus the expected value of 0.0. Results on spin-contamination-free atoms instead indicated a negative slope of -26.2 kcal/(mol·MX).
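The sensitivity in Eq. 3.9 is simply the slope of an ordinary least-squares line through the splittings computed at each fraction of meta-GGA exchange. A minimal pure-Python sketch follows; the 11 data points are synthetic and exactly linear, generated only to exercise the fit, and are not results from this work.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Fractions of meta-GGA exchange: 0%, 10%, ..., 100% (11 calculations).
a_x = [i / 10 for i in range(11)]

# SYNTHETIC splittings in kcal/mol, for illustration only.
splittings = [70.0 - 7.7 * a for a in a_x]

sensitivity, intercept = least_squares_line(a_x, splittings)

if __name__ == "__main__":
    # The slope has units of kcal/mol per MX because a_x spans one MX unit.
    print(f"sensitivity = {sensitivity:.2f} kcal/(mol*MX)")
```

Because one MX unit spans the full 0-100% range, the slope can be read directly as the change in splitting on going from pure GGA to pure meta-GGA exchange.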

Figure 3-3: Plot of the linear regression approximation for the partial derivative of spin-state splitting with meta-GGA exchange ($a_x$) across the periodic table for single atom calculations (green diamonds, Mn3+ case is partially filled for reasons indicated in the main text) and the 3 corresponding octahedral complexes M(CO)6 (gray squares), M(NH3)6 (blue triangles) and M(OH2)6 (red circles).

As with atoms, $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ values for early (Ti2+, V3+) and late transition metal complexes (Ni3+, Ni2+, Cu3+) are small, owing to the smaller difference in the number of unpaired electrons on the metal center between high- and low-spin states in d2-d3 and d7-d8 electron configurations. Across the periodic table, complexes with weak-field ligands (e.g., OH2) demonstrate comparable, if slightly reduced, spin-state sensitivity to the isolated atoms. Although NH3 is nominally a stronger field ligand, M(NH3)6 complexes display comparable behavior to the atoms with again slightly reduced sensitivities compared to some of the atoms and hexa-aqua complexes (e.g., Fe(III)). Thus for isolated metal atoms, ammonia, and water complexes, HS states are generally penalized in favor of LS states with added meta-GGA exchange. Conversely, complexes with the strong-field CO ligand exhibit the opposite trend with the most negative $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ of -7.7 kcal/(mol·MX) observed for Fe(II). As we have shown previously for Hartree-Fock exchange [94], we may use the 0% meta-GGA exchange HS-LS spin-state splitting as a measure of ligand field and correlate it with the spin-state sensitivity, $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ (Figure 3-4). This good correlation for mid-row complexes (R²=0.88) indicates that a 1 kcal/mol increase in the HS-LS splitting shifts the meta-GGA exchange sensitivity 0.15 units more negative. Similar correlations were obtained for early- and late-row transition metal complexes with good accuracy (Figure 3-5).

Figure 3-4: Plot of the linear regression approximation for the partial derivative of spin-state splitting with meta-GGA exchange ($a_x$) against the spin-state splitting at 0% meta-GGA exchange. Results are shown for the octahedral complexes of the mid-row transition metals Cr(II)-Co(III) with symbols representing the different metals colored corresponding to the ligands: CO (gray), NH3 (blue) and OH2 (red). A linear regression fit and associated R² value (0.88) is also shown on the plot.

Among the mid-row transition metal complexes considered, the Mn(III)(CO)6 complex with near degenerate LS and HS states is the most meta-GGA exchange inert with a $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ of 0.7 kcal/(mol·MX). Molecules that prefer the LS state at 0% meta-GGA exchange ($\Delta E^{\mathrm{HS-LS}}$ > 0) have the most strongly negative $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ whereas HS complexes ($\Delta E^{\mathrm{HS-LS}}$ < 0) have the most strongly positive spin-state-energetics dependence with meta-GGA exchange. These results thus suggest that meta-GGA exchange in all transition metal complex cases considered reduces the absolute value of the HS-LS energetic splitting, regardless of the ground state identified with the GGA or meta-GGA functional. This behavior is in contrast to HF exchange tuning [94, 119, 120] where increasing HF exchange always favors the HS state over the LS state. In that case, HS-LS splittings are increased for weak-field ligands and decreased for strong-field ligands. The reversal of sign in meta-GGA exchange sensitivity with ligand field thus merits further evaluation to identify the chemical origins.

Figure 3-5: Plot of the linear regression approximation for the partial derivative of spin-state splitting with meta-GGA exchange ($a_x$) against the spin-state splitting at 0% meta-GGA exchange. Results are shown for the octahedral complexes of the early- (Ti(II)-Cr(III)) and late-row (Co(II)-Cu(III)) transition metals with symbols representing the different metals colored corresponding to the ligands: CO (gray), NH3 (blue) and OH2 (red). A linear regression fit and associated R² value is also shown on the plot.

3.5 Trends in charge localization measures

In order to quantify the extent of charge localization on the transition metal center, we compute metal formal charges with NBO natural population analysis. The difference in the calculated charge between the HS and the LS state represents the relative charge localization between the states,

$$\Delta q^{\mathrm{HS-LS}} = q_M^{\mathrm{HS}} - q_M^{\mathrm{LS}}. \tag{3.10}$$

A positive ΔqHS–LS, which we will refer to as the charge shift, corresponds to a net loss of electrons (charge increase) on the metal center from the LS to the HS state.

For all complexes considered, increasing $a_x$ increases the partial charge of the metal center for both HS and LS states with an effective delocalization of electron density from the metal to the neighboring ligands. Alternatively, this effect may be viewed as electron localization but onto the ligand, not onto the metal. Despite differences between HF exchange and meta-GGA exchange tuning behavior on energetics (see Ch. 2), the observation of electron localization onto the ligand here is consistent with our previous observations for HF exchange tuning [94].

Comparison between spin-state energetics dependence on meta-GGA exchange and charge shift from LS to HS state at 0% meta-GGA exchange ($\Delta q_{0\%}^{\mathrm{HS-LS}}$) reveals a good correlation (R²=0.86) for the mid-row (Cr(II)-Co(III)) octahedral complexes (Figure 3-6 left). Charge increase from LS to HS is greater for the carbonyl complexes (ca. 0.8-1.6 e⁻ loss), followed by the ammonia (ca. 0.35-0.75 e⁻ loss) and water complexes (ca. 0.25-0.5 e⁻ loss). The spin-state splitting sensitivity, $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$, becomes more negative as the charge shift, $\Delta q_{0\%}^{\mathrm{HS-LS}}$, increases. The resulting linear-scaling relation from the least squares fit is

$$\frac{\Delta\Delta E^{\mathrm{HS-LS}}}{\Delta a_x} = -14.1\,\Delta q_{0\%}^{\mathrm{HS-LS}} + 14.2. \tag{3.11}$$

This relationship suggests that a stationary point (i.e. $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ = 0) may be interpolated to occur at a $\Delta q_{0\%}^{\mathrm{HS-LS}}$ of ∼1.01 e⁻. That is, if the metal center in the HS state carries one fewer electron than in the LS state, then the HS-LS energy splitting is invariant to meta-GGA exchange. Complexes of Cr(II)-Mn(III) with CO lie the closest to that stationary point with a $\Delta q_{0\%}^{\mathrm{HS-LS}}$ of 0.85-0.9 e⁻ and negligible spin-state splitting sensitivity to $a_x$. In the weak-field ligand complexes studied here (e.g. NH3, OH2), $\Delta q_{0\%}^{\mathrm{HS-LS}}$ is less than 0.7 e⁻ in all cases with smaller values corresponding to increasing $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ and stabilization of LS states.

Figure 3-6: Plot of the linear regression approximation for the partial derivative of spin-state splitting with meta-GGA exchange ($a_x$) against the difference in the NBO charges between high-spin and low-spin states at 0% meta-GGA exchange (left, R²=0.86) and the average difference in the electron density between high-spin and low-spin states at the metal-ligand bond critical point for 0% meta-GGA exchange (right, R²=0.82). Results are shown for the octahedral complexes of the mid-row transition metals Cr(II)-Co(III) with symbols representing the different metals colored corresponding to the ligands: CO (gray), NH3 (blue) and OH2 (red). A linear regression fit for each plot and associated R² value is also shown.

In addition to a metal-centered view, we may interpret differences in meta-GGA sensitivity through bond-centered metrics. We compared the average electron density difference between HS and LS state at the metal-ligand BCPs for 0% meta-GGA exchange:

$$\Delta\rho_{\mathrm{BCP}}^{\mathrm{HS-LS}} = \rho_{\mathrm{BCP}}^{\mathrm{HS}} - \rho_{\mathrm{BCP}}^{\mathrm{LS}}, \tag{3.12}$$

where $\rho_{\mathrm{BCP}}^{\mathrm{HS}}$ is the average electron density at the metal-ligand BCPs for the HS state calculated at 0% meta-GGA exchange and $\rho_{\mathrm{BCP}}^{\mathrm{LS}}$ the corresponding quantity for the LS state. Electron density decrease from LS to HS is greatest for strong-field ligand CO cases (ca. 0.01-0.04 a.u. loss) corresponding to the largest decrease in metal-ligand bonding from LS to HS (Figure 3-6 right). In contrast to CO, complexes with weak-field ligands (NH3, OH2) show smaller and more variable density difference between HS and LS states with $\Delta\rho_{\mathrm{BCP}}^{\mathrm{HS-LS}}$ in the range of -0.01 to 0.01 a.u. The only complexes that have an increase in bonding density in the HS state over the LS state are Cr(II)(OH2)6, Cr(II)(NH3)6 and Mn(III)(OH2)6, where the metal-ligand bond length increases from LS to HS complexes only by 0.08 Å, 0.14 Å and 0.07 Å respectively, compared to a larger increase (> 0.10 Å for OH2 and > 0.20 Å for NH3) for the corresponding complexes of the other metals. The strong correlation (R²=0.82) of $\Delta\rho_{\mathrm{BCP}}^{\mathrm{HS-LS}}$ with $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ suggests that a large relative density accumulation at the BCPs for LS states results in stabilization of HS states with increasing $a_x$, whereas smaller $\Delta\rho_{\mathrm{BCP}}^{\mathrm{HS-LS}}$ values correspond to stabilization of LS states as meta-GGA exchange is increased. The $\Delta\rho_{\mathrm{BCP}}^{\mathrm{HS-LS}}$ predicted to produce meta-GGA exchange invariant structures is -0.022 a.u. Increased electron density at the BCP indicates stronger bonding and thus the negative $\Delta\rho_{\mathrm{BCP}}^{\mathrm{HS-LS}}$ values observed for most complexes indicate that bonding is consistently weaker in HS complexes than in LS complexes.

In order to determine whether inclusion of meta-GGA exchange also alters relative properties of the electron density, we also examine the sensitivity of the charge shift, $\Delta q^{\mathrm{HS-LS}}$, to meta-GGA exchange ($a_x$):

$$\frac{\partial q^{\mathrm{HS-LS}}}{\partial a_x} \approx \frac{\Delta q^{\mathrm{HS-LS}}}{\Delta a_x}, \tag{3.13}$$

where this approximation is again obtained from the slope of a linear regression fit of the charge shift with $a_x$. Negative values for the charge difference derivative approximation, $\Delta q^{\mathrm{HS-LS}}/\Delta a_x$, indicate that the charge difference between HS and LS states decreases with increasing $a_x$ resulting in closer metal charges, and a positive value corresponds to the opposite. Negative $\Delta q^{\mathrm{HS-LS}}/\Delta a_x$ values are observed in all cases (Figure 3-7) ranging from -0.08 to -0.01 e⁻/MX, indicating LS states have stronger charge shift sensitivity to meta-GGA exchange. Among complexes with the same ligand, the charge shift for metals with large charge difference between HS and LS states such as Cr(II) and Mn(III) (Figure 3-6 left) is most sensitive to meta-GGA exchange whereas complexes of Fe(II) and Co(III) that have similar charges in the HS and LS states show decreased sensitivity of the charge shift to meta-GGA exchange.

Although the sensitivity of $\Delta q^{\mathrm{HS-LS}}/\Delta a_x$ to the charge shift shows similar trends for the 3 ligands, the different intercepts in the linear regression lines suggest opposite ways of interference between the electron density and the spin-state energetics (Table 3.3). Starting at $\Delta q^{\mathrm{HS-LS}}/\Delta a_x$ ≈ 0 and with decreasing $\Delta q^{\mathrm{HS-LS}}/\Delta a_x$, the density difference interacts constructively with the energetics for CO complexes with $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ < 0 in all cases whereas destructive interaction is observed for the weak-field ligands (NH3, OH2) with $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$ > 0 for all $\Delta q^{\mathrm{HS-LS}}/\Delta a_x$ < 0.

Figure 3-7: Plot of the linear regression approximation for the partial derivative of spin-state splitting with meta-GGA exchange ($a_x$) against the linear regression approximation for the derivative of the difference in the NBO charges between high-spin and low-spin states with $a_x$. Results are shown for the octahedral complexes of the mid-row transition metals Cr(II)-Co(III) with symbols representing the different metals colored corresponding to the ligands: CO (gray), NH3 (blue) and OH2 (red). Three linear regression fits and associated R² values corresponding to the three ligands are also shown on the plot.

Ligand   Slope (kcal/mol/e⁻)   Intercept (kcal/mol/MX)   R²
CO       -109.9                -9.0                      0.79
NH3      -114.1                0.9                       0.99
OH2      -141.3                4.6                       0.91

Table 3.3: Slopes, intercepts and R² values of the linear regression approximation for the partial derivative of spin-state splitting with meta-GGA exchange ($a_x$) against the linear regression approximation for the derivative of the difference in the NBO charges between high-spin and low-spin states with $a_x$. Results are grouped by ligand and correspond to mid-row transition metal complexes.
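The sign behavior described for the three ligands can be read off the fits in Table 3.3. The Python sketch below evaluates each fitted line at a given charge-shift sensitivity; the slope and intercept values are those listed in the table, while the function name and the sample input are illustrative assumptions.

```python
# (slope, intercept) of the per-ligand linear fits listed in Table 3.3.
FITS = {
    "CO": (-109.9, -9.0),
    "NH3": (-114.1, 0.9),
    "OH2": (-141.3, 4.6),
}


def energy_sensitivity(ligand, dq_dax):
    """Fitted spin-state sensitivity (kcal/mol/MX) at a charge-shift sensitivity (e-/MX)."""
    slope, intercept = FITS[ligand]
    return slope * dq_dax + intercept


def zero_crossing(ligand):
    """Charge-shift sensitivity at which the fitted energy sensitivity vanishes."""
    slope, intercept = FITS[ligand]
    return -intercept / slope


if __name__ == "__main__":
    dq_dax = -0.04  # HYPOTHETICAL value inside the observed -0.08 to -0.01 range
    for ligand in FITS:
        print(ligand, round(energy_sensitivity(ligand, dq_dax), 2))
```

For CO the fitted line only changes sign near -0.08 e⁻/MX, at the edge of the observed range, which is why the CO values stay negative, whereas the NH3 and OH2 lines are positive for every negative charge-shift sensitivity.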

3.6 Combined effect of HF exchange and meta-GGA exchange

In order to assess the total influence of both HF and meta-GGA exchange on representative strong-field, Fe(II)(CO)6, and weak-field, Fe(II)(NH3)6, cases, we approximate the effect of both as a composite spin-state splitting:

$$\Delta E^{\mathrm{HS-LS}} = \Delta E_{\mathrm{GGA}}^{\mathrm{HS-LS}} + \frac{\Delta\Delta E^{\mathrm{HS-LS}}}{\Delta a_x}\, a_x + \frac{\Delta\Delta E^{\mathrm{HS-LS}}}{\Delta a_{\mathrm{HF}}}\, a_{\mathrm{HF}}, \tag{3.14}$$

where $\Delta E_{\mathrm{GGA}}^{\mathrm{HS-LS}}$ is the spin-state splitting calculated at 0% HF exchange with modB3LYP [94], $\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_{\mathrm{HF}}$ is the approximate partial derivative of the splitting with respect to HF exchange and $a_{\mathrm{HF}}$ the fraction of HF exchange. A composite spin-state splitting approximated at any fraction of HF or meta-GGA exchange may be compared to literature benchmark values calculated with complete active space perturbation theory (CASPT2) [198, 199] (Figure 3-8). We observe stronger dependence of spin-state splitting on HF exchange than meta-GGA exchange, e.g. for Fe(II)(CO)6, $\Delta E^{\mathrm{HS-LS}}$ decreases by around 60 kcal/mol over the range of 0-40% HF exchange that is typically used in hybrid functionals, but only by around 8 kcal/mol over the range of 0-100% meta-GGA exchange. For Fe(II)(NH3)6, HF and meta-GGA exchange have opposite effects on $\Delta E^{\mathrm{HS-LS}}$, with a HS-LS splitting decrease of 25 kcal/mol observed over the range 0-40% HF exchange, whereas including full meta-GGA exchange increases this spin-state splitting by 5 kcal/mol.

Figure 3-8: Spin-state splitting, $\Delta E^{\mathrm{HS-LS}}$, in kcal/mol based on the % of HF exchange (x-axis) and the % of meta-GGA exchange (y-axis) included for Fe(II)(CO)6 (top) and Fe(II)(NH3)6 (bottom). Benchmark CASPT2 values [198,199] are indicated with a green solid line in both cases.

The synergistic effect of meta-GGA and HF exchange on the spin-state splitting of Fe(II)(CO)6 indicates that in this case incorporation of meta-GGA exchange decreases the amount of HF exchange required from 12% to 7% to reach quantitative agreement with CASPT2 (Figure 3-8 top). This reduction suggests that meta-GGA exchange may be incorporated to improve predictions of spin-state ordering for complexes coordinated with strong-field ligands, which may be useful for analogous periodic systems where explicit incorporation of HF exchange in plane-wave basis sets is computationally prohibitive. If a pure meta-GGA calculation were carried out for Fe(II)(CO)6, the HS-LS splitting would be overestimated only by 11 kcal/mol whereas a GGA (e.g., PBE) would produce a 20 kcal/mol error. The opposite behavior is observed for the weak-field Fe(II)(NH3)6 complex where an already high HF exchange ratio of 24% needed for a GGA hybrid to reach quantitative agreement with CASPT2 is increased to 32% by inclusion of meta-GGA exchange (Figure 3-8 bottom). The high HF exchange ratio needed for weak-field ligands is likely due to our earlier observations of decreased HF exchange sensitivity of HS-LS splittings for weaker field ligands, i.e., HF exchange is less efficient in those cases for correcting GGA-derived self-interaction errors. Here, the opposing effect of meta-GGA exchange effectively reduces the efficiency of HF exchange tuning even further. The different amounts of required HF exchange observed in our data are in agreement with the literature where arguments for both high and low percentages of included HF exchange have been proposed in conjunction with either GGA or meta-GGA functionals for the description of transition metal complexes.

For complexes beyond those studied here (Figure 3-9), the difference in electron density along the bond in the LS state with respect to the HS state can provide a guide for whether the complex is in the strong- or weak-field limit. Over our data set containing the mid-row transition metal octahedral complexes and the additional Fe(II) coordination complexes (Table 3.4), values of the electron localization function (ELF) at the BCP in the LS state are 0.10-0.17 for strong-field complexes (e.g., CO), whereas this number is reduced to 0.04-0.07 in weaker-field ligands (e.g., NH3 and OH2).
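The composite splitting of Eq. 3.14 is a linear model in the two exchange fractions, so the HF fraction needed to match a reference value at a given meta-GGA fraction follows by rearrangement. The Python sketch below illustrates this for Fe(II)(CO)6 under loudly stated assumptions: the GGA-limit splitting (67.7 kcal/mol) and HF sensitivity (-150 kcal/mol per unit HF fraction) are rough back-of-the-envelope values inferred from the "20 kcal/mol error" and "around 60 kcal/mol over 0-40%" statements in the text, not fitted parameters from this work, so the output only approximates the HF percentages quoted above.

```python
def composite_splitting(de_gga, slope_mgga, a_x, slope_hf, a_hf):
    """Eq. 3.14: linear model of the HS-LS splitting in both exchange fractions."""
    return de_gga + slope_mgga * a_x + slope_hf * a_hf


def hf_fraction_to_match(target, de_gga, slope_mgga, a_x, slope_hf):
    """Solve Eq. 3.14 for the HF fraction that reproduces a target splitting."""
    return (target - de_gga - slope_mgga * a_x) / slope_hf


# ASSUMED, approximate inputs for Fe(II)(CO)6 (kcal/mol); see the lead-in.
DE_GGA = 67.7        # rough GGA-limit splitting (reference + ~20 kcal/mol)
SLOPE_MGGA = -7.7    # meta-GGA sensitivity reported for Fe(II)(CO)6
SLOPE_HF = -150.0    # rough HF sensitivity (~ -60 kcal/mol over 0-40% HF)
CASPT2 = 47.7        # reference splitting from Table 3.2

hf_without_mgga = hf_fraction_to_match(CASPT2, DE_GGA, SLOPE_MGGA, 0.0, SLOPE_HF)
hf_with_mgga = hf_fraction_to_match(CASPT2, DE_GGA, SLOPE_MGGA, 1.0, SLOPE_HF)

if __name__ == "__main__":
    print(f"HF fraction needed at   0% meta-GGA: {hf_without_mgga:.3f}")
    print(f"HF fraction needed at 100% meta-GGA: {hf_with_mgga:.3f}")
```

Because both sensitivities are negative for the strong-field case, adding meta-GGA exchange lowers the HF fraction required; a positive meta-GGA sensitivity, as for the weak-field ammonia complex, would raise it instead.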

Figure 3-9: Ball-and-stick models of nitrogen-, carbon- and oxygen-ligand structures for octahedral iron (II) complexes with compound labels. The atoms are color-coded with nitrogen in blue, carbon in gray, oxygen in red, hydrogen in white, sulfur in yellow, and iron in brown.

Complex                  ΔΔE(HS-LS)/Δa_x   ELF@BCP   Reference
Fe(II)(PH3)6             -4.7              0.17      LS [198]
Fe(II)(CO)6              -7.7              0.11      LS [198]
Fe(II)(NCH)6             1.9               0.06      HS [198]
Fe(II)(NH3)6             4.9               0.05      HS [199]
Fe(II)(H2O)6             6.2               0.06      HS [199]
Fe(II)(NCS⁻)6            3.4               0.06      HS
Fe(II)(NCO⁻)6            4.5               0.07      HS
HICPEQ                   -1.9              0.13      LS
DEVDUD                   -2.0              0.14      LS
Fe(II)(NO)7              -3.6              0.15      LS [200]
Fe(II)(terpy)2           -3.6              0.15      LS [198]
Fe(II)(phen)2(NCS⁻)2     -1.3              0.11      LS [198]
BENVIZ                   4.8               0.07      HS

Table 3.4: Linear regression approximation for the partial derivative of spin-state splitting with meta-GGA exchange ($\Delta\Delta E^{\mathrm{HS-LS}}/\Delta a_x$) in kcal/mol/MX, average electron localization function (ELF) at the bond critical points (BCPs) and reference ground states for an extended set of 8 Fe(II) octahedral coordination complexes. Reference spin states are from experiment, CASPT2 calculations, or ligand-field theory approximations (indicated by italics).

3.7 Conclusions

In this chapter, we used a modified TPSS functional to study the dependence of properties across the 0-100% meta-GGA exchange range for first-row transition metal ions and coordination complexes. We observed very low sensitivity of early- and late-row transition metal ions and octahedral complexes whereas mid-row transition metal ions and complexes showed higher sensitivity to meta-GGA exchange. High-spin stabilization over low-spin complexes was predicted for ions and complexes with weak-field coordinated ligands with the opposite trend observed for complexes with strong-field coordinated ligands. Further analysis of the mid-row transition metal complexes showed that meta-GGA exchange reduces the absolute high- vs low-spin energy splitting in all cases, favoring degeneracy among spin states.

We then identified the extent to which underlying charge density properties correlate with the observed energetic trends. We observed that the metal partial charge is always higher in the HS states and increases with meta-GGA exchange, corresponding to 3d electron delocalization to ligands. The magnitude of the charge difference depends on the ligand-field character with strong-field ligands showing higher relative delocalization between HS and LS states whereas this difference is smaller in the case of weak-field ligands. Similar conclusions are drawn by calculating the electron density at the bond critical points, indicating that in HS complexes electron density is further delocalized to the ligands from both the metal center and the bond when compared to LS states.

Finally, we investigated the composite effect of HF exchange and meta-GGA exchange in calculating the spin-state splitting of octahedral coordination complexes. Opposite behavior is observed for strong- and weak-field cases with meta-GGA exchange acting in synergy with HF exchange for the Fe(II)(CO)6 complex, indicating that meta-GGA exchange can be incorporated to improve predictions of spin-state ordering for complexes with strong-field ligands. The opposite behavior is observed for weak-field ligands such as NH3 where inclusion of meta-GGA exchange effectively reduces the efficiency of HF exchange tuning even further.

Chapter 4

Automatic structure generation

Reprinted in part with permission from [195]. Copyright 2016 John Wiley and Sons Inc.

High-throughput screening has taken center stage in guiding heterogeneous catalysis [40] and materials discovery [41], where the order of crystal structure lattices restricts the possible number of structures that must be computationally enumerated. Molecular catalyst motifs remain an attractive target for computational screening. Although some experimental [44, 45] and computational [33, 46–48] screening efforts have been carried out for inorganic complex discovery, robust, user-friendly tools for the rapid generation and assessment of inorganic complexes are not widely available.

To accelerate discovery in inorganic chemistry, both flexible generation of 3D structures and efficient first-principles calculations to evaluate properties are necessary. In organic chemistry, molecular screening benefits from tools that have been developed to store [54, 55] and analyze [57, 58, 64] chemical structures and connectivity, facilitating rapid property evaluation based on connectivity [65] or generation of 3D structures from these descriptions [66, 67]. Within the first-principles community, OpenBabel [201] and Avogadro [93] provide users access to these rapid structure generation tools in a user-friendly graphical user interface (GUI). In the solid state, the Atomic Simulation Environment [92] provides a useful Python interface to periodic electronic structure codes. However, such cheminformatics and heterogeneous catalysis tools are not straightforwardly extended to molecular inorganic chemistry. Instead, users have relied on searching commercial databases of experimentally characterized compounds. More recently, fragment-based and rules-based structure generation tools [88, 89] have been developed for 3D coordinate generation, as long as free-to-use fragments are placed into a predefined database, but a seamless connection to first-principles modeling in an open source framework is not yet available.

We introduce an efficient, reliable approach to the high-throughput screening of inorganic complexes in our open source molSimplify toolkit. This streamlined procedure enables chemical discovery by automating molecular and intermolecular complex structure generation, job preparation and execution, and post-processing analysis to elucidate correlations of electronic or geometric descriptors with energetics. In the Code Overview section, we provide a description of the code layout and the detailed routines involved in (i) structure generation, (ii) simulation automation, and (iii) post-processing analysis. We then present Benchmarking results of our approach over a 190-molecule test set. Finally, we provide the Conclusions and outlook for our software toolkit.

4.1 Code overview

The developed software utility (molSimplify) is an open source workflow that incorporates geometric manipulation routines necessary for the generation of transition metal complexes, automated setup and execution of electronic structure calculations, and post-processing data analysis. The software is designed for maximum flexibility, supporting a range of coordination numbers, metals, ligand identities, and ligating geometries. The code both builds coordination complexes starting from a single metal atom and functionalizes complex structures or generates supramolecular complexes. Although intended for inorganic chemistry, these tools are straightforwardly applicable to a number of challenges ranging from studying non-covalent interactions in organic molecules [202] to building complex, selectively functionalized molecules such as dendritic polymers [203]. Our structure generation tools are designed to enable fast reaction mechanism screening by evaluating binding interactions and generating candidate reactive intermediates in catalysis. The code accelerates high-throughput chemical discovery through interaction with internal and external chemical databases to generate new candidates and with both internal and external post-processing routines to aid data interpretation and trend elucidation. In addition to a command-line version, molSimplify is available through a GUI to make the code accessible to the broader scientific community.

4.2 Code architecture

The molSimplify code is written in Python 2.7 [204], a modern high-level programming language that is widely used in the computational chemistry community. Python’s object-oriented approach enables natural representation of the components we manipulate in our framework, including atoms and molecules, molecular fragments, and supramolecular complexes. The molSimplify program makes use of the OpenBabel [68] toolbox via the pybel Python bindings [201] in order to convert between chemical formats and perform force field optimizations of generated structures. Portability of the code is maintained by including other commonly used Python modules such as NumPy in the installation. Users may install and run molSimplify on any Unix or Linux-based (e.g., Mac OSX) platform where Python and OpenBabel are available.

The user interacts with molSimplify in one of three ways: at a command-line interface (CLI), with an input file, or through a GUI. The GUI has been developed using the Qt5 library with its corresponding Python bindings (PyQt5). The GUI enables at-a-glance molecule building in a manner intended to be intuitive for the broader scientific community. Thus, the code consists of two distinct parts: i) the modules that build the GUI and interact with the user and ii) the processing core that executes structure generation and processing features. The framework comprises over 13,000 lines of code and includes three main modules (Figure 4-1) described in greater detail in the rest of this chapter:

1. The main structure generation module builds and modifies inorganic complexes and supramolecular complexes. It also generates the appropriate input files and job scripts to run first-principles quantum chemistry calculations on the generated structures.

2. The database module interacts with external chemical databases for ligand search and retrieval to be fed back into the structure generation module.

3. The post-processing module contains routines that monitor progress of quantum chemistry calculations, analyze their outcomes, and post-process wavefunction and charge density data to elucidate structure-property correlations. This module also interacts with external codes for some electronic structure analysis.

[Figure 4-1 flowchart labels: input (CLI, input file, GUI); exe; structure generation; database search (DB); post-process; analysis summary; .xyz; input scripts; multi-molecule file; .wfn; .out; CPU.]

Figure 4-1: Flowchart for molSimplify process.

We have represented all objects in the molSimplify code by classes, ranging from the GUI class with subclasses for its various widgets to the atom3D and mol3D classes that represent atoms and molecules, respectively, in the structure generation module. These structure-oriented classes contain routines that perform geometric operations, manipulate the molecules by adding, deleting or combining with other molecules, and calculate molecular or fragment properties. Example properties include the center of mass of a molecule, the root-mean-square deviation (RMSD) of one molecule with respect to other molecules, and the number of hydrogen atoms connected to specific atoms.

The code stores both core (e.g., metal) and ligand structures in internal libraries that are prepopulated with some common metals and ligands. The GUI enables the user to add to these internal libraries and supports user inputs in the form of 2D Simplified Molecular Input Line Entry System (SMILES) strings [61] or 3D coordinates from .mol [205] or .xyz files. The code supports user-specified ligand taxonomy in the form of ligand groups and designations for whether a ligand only adds to organic cores or forms metal-ligand bonds, aiding subsequent database retrieval. In order to convert 2D SMILES representations to 3D coordinates, we combine OpenBabel [68, 201] modules with our custom molecule class mol3D that carries out geometric manipulations relevant for inorganic complexes.
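As an illustration of the kind of structure-oriented classes described above, the following is a minimal sketch of an atom class and a molecule class with center-of-mass and RMSD routines. The class names `Atom3D` and `Mol3D`, their methods, and the small mass table are simplified stand-ins chosen for this example and do not reproduce the actual atom3D and mol3D interfaces. The sketch is written for Python 3.8 or later rather than Python 2.7.

```python
import math

# Approximate atomic masses in amu (rounded standard values).
MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "Fe": 55.845}


class Atom3D:
    """A single atom: element symbol, Cartesian coordinates, mass."""

    def __init__(self, symbol, xyz):
        self.symbol = symbol
        self.xyz = tuple(float(c) for c in xyz)
        self.mass = MASS[symbol]


class Mol3D:
    """A molecule stored as an ordered list of atoms."""

    def __init__(self, atoms=None):
        self.atoms = list(atoms or [])

    def combine(self, other):
        """Return a new molecule holding the atoms of both molecules."""
        return Mol3D(self.atoms + other.atoms)

    def center_of_mass(self):
        total = sum(a.mass for a in self.atoms)
        return tuple(sum(a.mass * a.xyz[k] for a in self.atoms) / total
                     for k in range(3))

    def rmsd(self, other):
        """RMSD over atoms paired by index; no superposition is performed."""
        if len(self.atoms) != len(other.atoms):
            raise ValueError("molecules must have the same number of atoms")
        sq = sum(math.dist(a.xyz, b.xyz) ** 2
                 for a, b in zip(self.atoms, other.atoms))
        return math.sqrt(sq / len(self.atoms))


co = Mol3D([Atom3D("C", (0, 0, 0)), Atom3D("O", (0, 0, 1.128))])
com = co.center_of_mass()
```

Keeping geometry routines on the molecule object means that a complex assembled with `combine` can be queried with the same methods as an isolated ligand.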

4.3 Structure generation

The structure generation module is the core of molSimplify and introduces tools with broad application in computational chemistry. This module synthesizes wide-ranging features including: simple transition-metal complex generation, custom molecule functionalization, functional group replacement, database search and retrieval for ligands in new molecule discovery, and supramolecular complex generation.

4.3.1 General approach

In the standard inorganic complex structure generation approach, the user specifies i) a central metal, ii) a coordination number and geometry, iii) a list of ligands to fill the coordination sphere, and iv) the numbers and indices of ligand atoms that coordinate to the metal. Using this user input (example of input file and GUI input shown in Figure 4-2), the code then generates and writes 3D coordinates of the inorganic complex.

Coordination templates

The code supports metal coordination numbers ranging from 1 to 8 with native support for the 13 most common coordination geometries (e.g., CN=6 trigonal prism or octahedral). These geometries are encoded by a coordination template (CT) that is a set of points in space with the appropriate positioning and angles between a metal center and direct ligand atoms (e.g., 7 points in space for an octahedral complex CT). The code also supports custom molecular cores (see Custom Cores) by defining a backbone that includes all porphyrin atoms and the points in space for connecting atoms (e.g. of proximal and distal ligands). The user may define custom CTs (e.g. CN > 8) by adding them to the coordination dictionary in the Data folder and providing a geometry of the CT. Once added, this new CT will be automatically incorporated into the GUI and CLI. molSimplify flexibly modifies user-input structures, as outlined in the Custom Cores and Modify Function sections, and we recommend employing these features rather than adding CTs unless the user has frequent need for an alternate CT.

The default CTs correspond to the most symmetric forms for each CN and geometry, but the user may specify a distortion angle that measures the degree of displacement with respect to the original metal-ligand bond vector in spherical coordinates. For high-throughput screening, a randomized distortion feature may be selected to add increasing amounts of noise to the final geometry.
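To make the notion of a coordination template concrete, the sketch below builds the 7 points of an ideal octahedral CT (the metal plus six connection points) and checks the angles between connection points. The function names and the unit bond length are assumptions made for this illustration; this is not the format of the coordination dictionary in the Data folder.

```python
import math


def octahedral_template(bond_length=1.0):
    """Return 7 points: the metal at the origin, then six connection points
    on the positive and negative x, y, and z axes."""
    d = float(bond_length)
    metal = (0.0, 0.0, 0.0)
    sites = [(d, 0.0, 0.0), (-d, 0.0, 0.0),
             (0.0, d, 0.0), (0.0, -d, 0.0),
             (0.0, 0.0, d), (0.0, 0.0, -d)]
    return [metal] + sites


def angle_deg(a, center, b):
    """Angle a-center-b in degrees."""
    va = [a[k] - center[k] for k in range(3)]
    vb = [b[k] - center[k] for k in range(3)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


template = octahedral_template()
metal, sites = template[0], template[1:]
# Every pair of connection points is either cis (90 degrees) or trans (180).
angles = sorted({round(angle_deg(p, metal, q))
                 for i, p in enumerate(sites) for q in sites[i + 1:]})
```

A distorted template could be obtained from the same points by displacing each connection point by a chosen angle away from its ideal bond vector.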

Denticity-dependent ligand alignment

Strategies for ligand alignment to a CT differ based on the ligand’s denticity. The code identifies ligand denticity by the number of connecting atoms in the user-provided or internal library definition of the ligand. For instance, a carboxylate may be specified as either a monodentate or bidentate ligand, depending on the defined connecting atoms (i.e., 1 vs. 2). The code sorts all ligands and prioritizes the ones with highest denticity. The larger number of constraints in aligning multidentate ligands, as well as their typically bulky nature, necessitates preferential alignment in order to minimize overall steric repulsion in the final complex. However, the user may override this behavior, either by forcing molSimplify to place ligands in the coordination site specified or by additionally overriding the order in which ligands are aligned (Fig. 4-2).
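The default ordering described above can be sketched as a sort on the number of connecting atoms, with an optional user-supplied order taking precedence. The function name `order_ligands`, the `(name, connecting_atoms)` tuple layout, and the 0-based atom indices are assumptions made for this example.

```python
def order_ligands(ligands, forced_order=None):
    """Return ligands in the order in which they would be aligned.

    ligands: list of (name, connecting_atom_indices) tuples.
    forced_order: optional list of positions in `ligands` that overrides
    the default highest-denticity-first ordering.
    """
    if forced_order is not None:
        return [ligands[i] for i in forced_order]
    # sorted() is stable, so ligands of equal denticity keep their input order.
    return sorted(ligands, key=lambda lig: len(lig[1]), reverse=True)


ligands = [
    ("chloride", [0]),
    ("ethylenediamine", [0, 3]),  # both N atoms of NCCN
    ("ammonia", [0]),
]
default_order = [name for name, _ in order_ligands(ligands)]
user_order = [name for name, _ in order_ligands(ligands, forced_order=[2, 0, 1])]
```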

Figure 4-2: Example octahedral complex generation in GUI (left) or with input file (right).

The multiple connecting atoms in multidentate ligands simplify their alignment relative to non-chelating ligands (Figure 4-3). The code places the loaded ligand randomly with respect to the CT and translates the ligand to align the first connecting atom to the target connection point on the CT. The code then minimizes the distance of the second connecting atom to the additional CT connection point through rotations (Figure 4-3). The code determines the second connection point from a look-up dictionary of denticity-dependent connection point combinations on a CT. If alignment between a bidentate ligand and the CT is poor, the code stretches the ligand angles, provided that the MMFF94 force-field energy rises by no more than 5 kcal/mol per ligand. The final ligand structure is then aligned symmetrically. For higher-denticity ligands, remaining atoms are likewise aligned through rotation and retrieval from the look-up table. Select tetradentate ligands (e.g. EDTA) are supported, but very high-denticity ligands may be better described as a custom core with functionalization points (see Section 4.3.2). As an example, a metallocene (see Fig. 4-6) may be generated by loading a custom core from the pre-populated internal database or by specifying two cyclopentadienyl ligands with a custom ’CM’ connection atom representing the center of the ring to the metal.

Monodentate alignment is slightly more challenging due to fewer constraints on ligand placement. The code identifies a free connection point on the CT to align the connecting atom. If not defined by the user or internal library, the first atom in the monodentate ligand is the default connecting atom. As in the multidentate case, the code places the monodentate ligand randomly with respect to the CT and then translates the ligand to align the connecting atom to an available connection point (Figure 4-3).

The code then defines a fragment of the ligand consisting only of the connecting atom and ligand atoms directly bonded to the connecting atom as a ’sub-molecule’. The alignment routine rotates the ligand until the sub-molecule’s center of mass (COM) is aligned along the axis defined by the metal center and connection point (Figure 4-3). The sub-molecule COM alignment is necessary for suitable alignment of bulky and asymmetric ligands. When the COM coincides with the connecting atom, as in highly symmetric ligands, the code performs additional rotations to minimize overlap between atoms in the ligand and the metal core.
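The sub-molecule alignment step can be sketched as a single rigid rotation about the connecting atom. The example below is a simplified illustration written for Python 3: it uses the unweighted centroid in place of the mass-weighted COM, assumes the connecting atom already sits on the connection point, and the function names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(a[i] - b[i] for i in range(3))


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return tuple(x / n for x in a)


def rotate(point, origin, axis, angle):
    """Rodrigues rotation of `point` by `angle` (radians) about the unit
    vector `axis` passing through `origin`."""
    v = _sub(point, origin)
    c, s = math.cos(angle), math.sin(angle)
    kxv, kdv = _cross(axis, v), _dot(axis, v)
    return tuple(origin[i] + v[i] * c + kxv[i] * s + axis[i] * kdv * (1.0 - c)
                 for i in range(3))


def align_submolecule(ligand, conn, sub, metal, tol=1e-8):
    """Rotate all ligand atoms about the connecting atom (index `conn`) so
    that the centroid of the sub-molecule (indices `sub`) lies on the
    metal -> connecting-atom axis, on the side facing away from the metal."""
    anchor = ligand[conn]
    centroid = tuple(sum(ligand[i][k] for i in sub) / len(sub) for k in range(3))
    offset = _sub(centroid, anchor)
    if math.sqrt(_dot(offset, offset)) < tol:
        return list(ligand)  # centroid coincides with the connecting atom
    current = _unit(offset)
    target = _unit(_sub(anchor, metal))
    axis = _cross(current, target)
    sin_a = math.sqrt(_dot(axis, axis))
    cos_a = _dot(current, target)
    if sin_a < tol:
        if cos_a > 0:
            return list(ligand)  # already aligned
        # Antiparallel: rotate by 180 degrees about any perpendicular axis.
        helper = (1.0, 0.0, 0.0) if abs(current[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = _cross(current, helper)
    axis = _unit(axis)
    angle = math.atan2(sin_a, cos_a)
    return [rotate(p, anchor, axis, angle) for p in ligand]


metal = (0.0, 0.0, 0.0)
ligand = [(0.0, 0.0, 2.0), (0.9, 0.1, 2.3), (-0.2, 0.8, 2.6), (0.3, -0.4, 1.5)]
aligned = align_submolecule(ligand, conn=0, sub=[0, 1, 2, 3], metal=metal)
```

Because the rotation is rigid and centered on the connecting atom, all intraligand distances are preserved and the connecting atom itself does not move.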

Next, the code rotates the ligand around the metal-ligand axis to minimize steric repulsion with adjacent ligands (Figure 4-3) by simultaneously maximizing the average distance between the ligands’ centers of mass ($d_{\mathrm{com1-com2}}$) and the minimum distance between any two atoms ($d_{12\mathrm{min}}$). We employ an optimization scheme that varies rotations to maximize the objective function:

$F_{\mathrm{opt}} = d_{\mathrm{com1-com2}} + \alpha \log d_{12\mathrm{min}}$, (4.1)

where α is a constant chosen by trial and error.
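The rotation scan that maximizes the objective function of Eq. 4.1 can be sketched as follows. This is an illustration under stated assumptions, written for Python 3.8 or later: the metal-ligand axis is taken to be the z axis through the origin, a single set of neighboring atoms stands in for the adjacent ligands, unweighted centroids replace centers of mass, and the default `alpha` and the 5 degree step are arbitrary choices for the example.

```python
import math


def rotate_z(p, angle):
    """Rotate point p about the z axis (the metal-ligand axis in this sketch)."""
    c, s = math.cos(angle), math.sin(angle)
    return (p[0] * c - p[1] * s, p[0] * s + p[1] * c, p[2])


def centroid(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def objective(ligand, neighbors, alpha):
    """F = d_com + alpha * log(d_min), following the form of Eq. 4.1."""
    d_com = math.dist(centroid(ligand), centroid(neighbors))
    d_min = min(math.dist(a, b) for a in ligand for b in neighbors)
    if d_min <= 0.0:
        return float("-inf")  # overlapping atoms: reject this rotation
    return d_com + alpha * math.log(d_min)


def best_rotation(ligand, neighbors, alpha=1.0, step_deg=5):
    """Scan rotations about the axis and keep the one with the largest F."""
    best_score, best_angle, best_coords = float("-inf"), 0, list(ligand)
    for angle in range(0, 360, step_deg):
        trial = [rotate_z(p, math.radians(angle)) for p in ligand]
        score = objective(trial, neighbors, alpha)
        if score > best_score:
            best_score, best_angle, best_coords = score, angle, trial
    return best_angle, best_coords


# Connecting atom on the axis, one off-axis atom pointing toward +x,
# and a neighboring group that also sits along +x.
ligand = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.5)]
neighbors = [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
angle, rotated = best_rotation(ligand, neighbors)
```

In this toy geometry the scan turns the off-axis atom away from the neighboring group. The logarithm makes the objective fall steeply as the closest contact shrinks, so close contacts are penalized more strongly than a modest loss in centroid separation.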

Figure 4-3: Alignment procedure for bidentate (left) and monodentate (right) ligands including alignment of connecting atoms (CAs) to connection points (CPs), rotating submolecules (SMs) around bond vectors (BVs), and extra rotations to minimize steric repulsion for monodentate ligands.

Metal-ligand distances

The final step in alignment of ligands of any denticity is ligand translation to set the metal-ligand (ML) bond length. The ML distances are set based on an internal library of geometry-optimized bond lengths obtained with density functional theory (DFT). The chosen ML value is specific to the user-specified metal-ligand combination, charge state and spin state. If the code finds neither an exact match for the user-specified combination in the library nor a match to the metal and direct ligand elements only, it instead uses the sum of the covalent radii of the two atoms. The user may optionally specify custom values for the ML bond lengths or force use of covalent radii sums through the GUI or CLI. The user may use custom ML values for each ligand, for instance, to generate Jahn-Teller distorted octahedral complexes with longer axial bonds than equatorial bonds (M-L bond entry is shown in GUI in Figure 4-2). Once all operations on the ligand are completed, the code updates the core structure by combining the previous core with the aligned ligand into a single mol3D molecule object.
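The lookup order for the ML distance can be sketched as a chain of fallbacks. In the example below, the dictionary layout, the key format, and the function name are assumptions, and the two library entries are placeholder numbers included only to exercise the logic; they are not values from the DFT-trained library. The covalent radii are commonly tabulated values in Å, with the low-spin value used for Fe.

```python
# Commonly tabulated covalent radii in angstrom (illustrative subset).
COVALENT_RADII = {"Fe": 1.32, "C": 0.76, "N": 0.71, "O": 0.66}

# PLACEHOLDER entries for illustration only, not computed bond lengths.
# Exact-match key: (metal, oxidation state, spin multiplicity, ligand name).
ML_LIBRARY = {("Fe", 2, 1, "carbonyl"): 1.90}
# Fallback key: (metal, element of the ligand atom bonded to the metal).
ML_ELEMENT_LIBRARY = {("Fe", "N"): 2.00}


def ml_distance(metal, ox_state, spin_mult, ligand, donor,
                user_value=None, force_covalent=False):
    """Pick a metal-ligand distance: a user value first, then an exact
    library match, then a metal/donor-element match, then the sum of
    covalent radii."""
    if user_value is not None:
        return user_value
    covalent_sum = COVALENT_RADII[metal] + COVALENT_RADII[donor]
    if force_covalent:
        return covalent_sum
    exact = ML_LIBRARY.get((metal, ox_state, spin_mult, ligand))
    if exact is not None:
        return exact
    return ML_ELEMENT_LIBRARY.get((metal, donor), covalent_sum)
```

Per-ligand `user_value` overrides are what would allow, for example, longer axial than equatorial bonds in a Jahn-Teller distorted complex.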

Force field optimization

In order to produce an optimal starting structure for first-principles simulation, the molSimplify code carries out multiple partial force field optimizations during structure generation. As a ligand is generated from SMILES or loaded from 3D coordinates, the user may choose to pre-optimize the individual ligand structure. We have found that this procedure is particularly useful for building very bulky ligands that are challenging for the OpenBabel 2D SMILES to 3D coordinate transformation. The code performs this and other optimizations by default with the MMFF94 [206] force field, but we also support other force fields (e.g. the universal force field, UFF [207]) available via the OpenBabel module. Once the entire structure is built, the code will optionally optimize the entire structure. In this case, the core and all connecting atoms are held fixed, and the force field optimization serves to further minimize steric repulsion between ligands. This divide-and-conquer approach, which combines chemistry-specific DFT-trained metal-ligand distances with selective force field optimization, is valuable for producing near-optimal starting geometries (see Section 4.5).

4.3.2 Customized cores

Within molSimplify, the user may choose more complex initial structures around which new ligands or functional groups are attached, e.g. in studying the effect of functionalizing a porphyrin. We refer to this feature as the ’custom core’ option in which the user may specify a multi-atom structure obtained from 3D coordinates (.xyz or .mol) or a SMILES string. Such an approach is also useful in studying complexes with multiple metal centers. In the current version of molSimplify, the main distinction between functionalizing a standard or custom core is that in the latter case the structure may only be functionalized with singly-coordinating ligands. In this approach, the user specifies the custom core coordinates and the connection atoms in the structure that will be functionalized. If multiple functionalizations are desired at a single point in the structure, the connection atom should be listed each additional time. These connection atoms are the non-terminal atoms to which new functional groups should be added, and any existing excess hydrogen atoms on that connection atom in the custom core will be removed (Figure 4-4).

Figure 4-4: Phenyl structure generation from custom core functionalization (left) of porphine or modification (right) of a functionalized porphine with benzene. Connecting atoms (red), modification point and downstream atoms (green), and connection point (blue) are highlighted.

Ligand alignment is carried out in a similar manner to the standard procedure except for an alternative definition of the target alignment connecting point for each ligand. We determine the optimal position of the connecting atom by randomly placing a dummy carbon atom near the specified custom core connection atom and minimizing steric repulsion from adjacent atoms through dummy atom rotation in spherical coordinates and force field optimization. This approach is customized for the case where the connection atom is a metal atom itself including both minimizing sterics and setting the M-L bond from the database. In both cases, the goal is to minimize overlap between the newly placed ligand and the original structure. Once this optimum position for the connection atom is identified, the structure building proceeds as in the simple core case, including individual ligand optimization. We repeat this procedure for all ligands that must be added to the custom core and optionally carry out a final force field optimization of the completed structure to further minimize any ligand-ligand steric repulsion.

4.3.3 Modify function

The molSimplify code will also manipulate loaded custom core structures by replacing existing singly-bonded functional groups in a feature we refer to as the ’modify function’. As with a custom core, the user loads the structure and then selects atoms on the molecule as modification points with the ’replace’ checkbox selected. Each selected atom will be removed along with any downstream atoms connected to the modification point. In this process, the molecule is broken into sub-molecules as determined by connectivity of the structure obtained from identifying bond distances smaller than 90% of the sums of covalent radii. The code identifies the sub-molecule to replace by determining the one that has the smaller number of atoms (Fig. 4-4). The new user-specified ligand is then aligned along the bond vector of the previously removed sub-molecule, and the new functional group to core bond distance is assigned first by sums of covalent radii, followed by standard rotations and optional force field optimization to reduce steric repulsion. The user may screen initial guesses for transition states in catalysis by providing a custom core for functionalization or modification that is a good initial guess to the transition state.
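The selection of the sub-molecule to replace can be sketched as a graph search. The example below assumes that connectivity is already available as a list of bonded atom-index pairs, cuts one bond, and returns the smaller of the two resulting fragments; the function names and the toluene-like test molecule are illustrative choices, not part of the code base.

```python
from collections import deque


def _component(start, adjacency):
    """Breadth-first search returning all atoms reachable from `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def smaller_fragment(bonds, cut):
    """Cut the bond `cut` = (a, b) and return the sorted atom indices of the
    smaller resulting fragment (the side of `a` is returned on a tie)."""
    a, b = cut
    adjacency = {}
    for i, j in bonds:
        adjacency.setdefault(i, set())
        adjacency.setdefault(j, set())
        if {i, j} == {a, b}:
            continue  # this is the bond being cut
        adjacency[i].add(j)
        adjacency[j].add(i)
    side_a = _component(a, adjacency)
    if b in side_a:
        raise ValueError("bond is part of a ring; cutting it does not split the molecule")
    side_b = _component(b, adjacency)
    return sorted(min(side_a, side_b, key=len))


# Toluene-like connectivity: ring atoms 0-5, methyl carbon 6, hydrogens 7-9.
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
         (0, 6), (6, 7), (6, 8), (6, 9)]
to_replace = smaller_fragment(bonds, cut=(0, 6))
```

Here the methyl group is identified as the fragment to replace because it contains fewer atoms than the ring, mirroring the smaller-sub-molecule rule stated above.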

4.4 Additional features

4.4.1 Random generation

The constrained random generation module for building inorganic complexes is a unique feature of our framework. The user specifies at minimum the core of the complex and the coordination number but has the flexibility to further constrain the search. If the user specifies no ligand identities, ligands are extracted from the internal database of candidate ligands. One may specify a list of potential ligands and the ligand frequency, and may choose to randomize only over a subset of the total ligands (Figure 4-5). The user then specifies how many molecules are desired, and the code creates a randomized sample that falls inside the constraint space.

[Figure 4-5 workflow labels: database search (external DB (SDF); property constraints such as number of atoms, MW, and elements; pattern, e.g. C(=O)[O-]; number of results; multi-molecule file) and random generation (internal DB or list; metal, CN, number of ligands; random ligand set; align ligands; number of results; .xyz files).]

Figure 4-5: Chemical discovery workflow with optional constrained external database search (left) and constrained random structure generation (right).

In addition to generating coordinates, the code creates an input file so that each randomly generated structure may be reproduced at any time. The flexibility of this approach enables both i) randomized screening of ligands at a single coordination site (e.g., axial ligands in a porphyrin) and ii) high-throughput generation and discovery of transition-metal complexes beyond the limits of human intuition.
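Constrained random generation can be sketched as repeatedly drawing ligands from a pool until the coordination sphere is exactly filled. In the example below, the function name, the `{ligand name: denticity}` pool format, and the returned record layout are assumptions for illustration; the seed plays the role of the stored input that allows a sample to be regenerated.

```python
import random


def random_complexes(metal, coordination_number, ligand_pool, n_complexes,
                     seed=None, max_tries=1000):
    """Draw ligand sets whose total denticity equals the coordination number.

    ligand_pool maps ligand name -> denticity.
    """
    rng = random.Random(seed)
    names = sorted(ligand_pool)  # fixed order so a seed is reproducible
    results = []
    for _ in range(n_complexes):
        for _ in range(max_tries):
            chosen, free_sites = [], coordination_number
            while free_sites > 0:
                fitting = [n for n in names if ligand_pool[n] <= free_sites]
                if not fitting:
                    break
                pick = rng.choice(fitting)
                chosen.append(pick)
                free_sites -= ligand_pool[pick]
            if free_sites == 0:
                results.append({"metal": metal,
                                "cn": coordination_number,
                                "ligands": chosen})
                break
        else:
            raise RuntimeError("could not fill the coordination sphere")
    return results


pool = {"water": 1, "ammonia": 1, "bipyridine": 2}
sample = random_complexes("Fe", 6, pool, n_complexes=5, seed=42)
```

Restricting `ligand_pool` to a user-chosen subset corresponds to randomizing over only part of the ligand library.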

4.4.2 Database search

The molSimplify framework optionally interacts with freely available multi-million molecule chemical databases [80–85, 208]. In order to take advantage of this feature, the user must download the database of interest (e.g. ChemBL [84] or eMolecules [208]) in structure-data file (SDF) format [205]. The code guides the user through the setup of these databases during installation, as outlined in the user guide. If the user downloads additional databases and places them in the database folder, the code automatically includes them in the database menu. This feature is in addition to the built-in internal database to which users may add their own custom molecules in SMILES, .xyz, or .mol format with the help of the code interface.

We designed the external database feature to enable users to screen a large ligand set with a given metal-ligand connectivity in common and to survey indirect ligand effects, a frequent goal in catalyst and materials design. Here, we harness tools developed by the informatics community to search for and filter molecules from large databases. The module utilizes OpenBabel to match patterns in molecular structure specified as a SMILES string, as the more flexible SMILES extension known as the SMILES arbitrary target specification (SMARTS) [65], or as a 3D structure. In order to aid structure generation, the user should specify a SMILES or SMARTS pattern (Figure 4-5) and indicate the atom numbers in that pattern that are target connecting atoms. In order to accelerate the search, we suggest generating a fast search file using OpenBabel before using a new database in molSimplify.

The structures matching the search are retrieved and stored in a multimolecule file with SMILES strings in the first column and the appropriate connection atoms that match the original pattern in the second column. For instance, if the user wishes to search for carboxylate (e.g., C(=O)[O-]) ligands to be aligned in a bidentate fashion to a metal core, the user would specify connecting atoms as 2,3. If the search retrieves acetate (i.e., CC(=O)[O-]), then the code will report the connecting atoms for that result as 3,4 in the multimolecule file. If the user does not specify connecting atoms at runtime, the code defaults to reporting the connecting atom corresponding to the first atom in the pattern. The connecting atoms may be manually revised at any time.

The database feature supports constraints such as total number of desired results, range of numbers of atoms or molecular weight, number of bonds, or chemical properties encoded by flexible SMARTS strings. In addition to OpenBabel-driven filtering, our built-in routines filter out multi-fragment structures, salts, and duplicate structures. Once the multimolecule file is generated, the user may evaluate the search results and build molecules using the structure generation tool. The "draw ligands" feature generates a vector graphic (SVG) of all molecules in the test set with labeled atom indices to aid inspection of the data set and any revision of connecting atom selections prior to high-throughput screening. During structure generation, the user specifies the multimolecule file as a ligand, and the connecting atoms are read in from the file. The code loops through and generates an inorganic complex for each specified structure in the file. This approach is also compatible with the constrained random generation and user-defined or internal-database structures for remaining coordinating ligands. Although intended for high-throughput screening in inorganic chemistry, the database search feature has broad applicability for sampling chemical space.
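The bookkeeping around search results can be sketched without a cheminformatics library. The example below filters out multi-fragment entries (a '.' separates fragments in a SMILES string, which also catches simple salts) and exact duplicate strings, then writes and reads two-column records of a SMILES string and its connecting atoms. The function names and the single-space, comma-separated record layout are assumptions; the actual multimolecule file format may differ, and recognizing duplicates written with different SMILES strings would require canonicalization, e.g. with OpenBabel.

```python
def filter_hits(smiles_list, max_results=None):
    """Drop empty entries, multi-fragment SMILES, and repeated strings."""
    seen, kept = set(), []
    for smi in smiles_list:
        smi = smi.strip()
        if not smi or "." in smi or smi in seen:
            continue
        seen.add(smi)
        kept.append(smi)
        if max_results is not None and len(kept) >= max_results:
            break
    return kept


def format_record(smiles, connecting_atoms):
    """One line: SMILES in the first column, connecting atoms in the second."""
    return "{} {}".format(smiles, ",".join(str(i) for i in connecting_atoms))


def parse_record(line):
    smiles, atoms = line.split()
    return smiles, [int(i) for i in atoms.split(",")]


raw_hits = ["CC(=O)[O-]", "CC(=O)[O-].[Na+]", "CC(=O)[O-]", "OC(=O)CC(=O)[O-]"]
hits = filter_hits(raw_hits)
# Acetate matched by the carboxylate pattern: connecting atoms 3,4 (1-based).
line = format_record("CC(=O)[O-]", [3, 4])
```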

4.4.3 Supramolecular complex building

Non-covalent intermolecular interactions are of central interest in many areas of chemistry and catalysis. In order to aid screening of such interactions, the molSimplify code also generates supramolecular complexes. For this special case, the user specifies a base molecule, which may be simultaneously optionally functionalized. Then the user selects the "extra molecule" check box and defines an additional molecule to be placed around this base structure. This additional molecule may be any structure that can be represented by a SMILES string or 3D coordinates, including another inorganic complex originally built by molSimplify. If the user does not specify any input, random values are selected in the code’s spherical coordinate system centered at the center of mass of the base molecule for the radial distance (r), azimuthal angle (ϕ) and polar angle (θ) (Figure 4-6). As with constrained random generation, input files are generated alongside each structure so they may be regenerated at any time.


Figure 4-6: Supramolecular complex generation coordinates: distance (r) and angles (ϕ, θ) of the additional molecule COM with respect to the base molecule COM.

Alternatively, the code will generate a user-specified number of structures under constraint of specified lower and upper distance and angle limits. This supramolecular generation consists of up to 200 randomized attempts to satisfy user constraints while simultaneously avoiding overlap between the two molecules. The distance specified here is the additional distance beyond the sums of covalent radii between base molecule atoms and additional ligand atoms. Covalent ligands should be generated using the standard approach. If the user wishes to obtain more directional initial configurations of the supramolecular complex, one may specify a subset of ligand atoms to be aligned to the core. Once the final positions are determined, the two molecules are combined into a single coordinate file. The user may optionally request that a line dividing the two molecules is inserted into the coordinates, as is customary in symmetry adapted perturbation theory or energy decomposition analysis calculations.
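The randomized placement with overlap rejection can be sketched as follows. This is a simplified illustration written for Python 3.8 or later: atoms are `(x, y, z, covalent_radius)` tuples, unweighted centroids stand in for centers of mass, the additional molecule is translated but not reoriented, angle limits are omitted, and the function names and `min_gap` parameter are hypothetical.

```python
import math
import random


def spherical_to_cartesian(r, theta, phi):
    """theta is the polar angle from +z, phi the azimuthal angle from +x."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def centroid(atoms):
    n = len(atoms)
    return tuple(sum(a[k] for a in atoms) / n for k in range(3))


def place_molecule(base, extra, r_range, min_gap=0.0, max_attempts=200, seed=None):
    """Translate `extra` so its centroid sits at a random (r, theta, phi)
    around the centroid of `base`. A placement is accepted when every
    base/extra atom pair is farther apart than the sum of the two covalent
    radii plus `min_gap`. Returns the moved atoms, or None on failure."""
    rng = random.Random(seed)
    base_c, extra_c = centroid(base), centroid(extra)
    for _ in range(max_attempts):
        r = rng.uniform(*r_range)
        theta = rng.uniform(0.0, math.pi)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        target = spherical_to_cartesian(r, theta, phi)
        shift = tuple(base_c[k] + target[k] - extra_c[k] for k in range(3))
        moved = [(x + shift[0], y + shift[1], z + shift[2], rad)
                 for x, y, z, rad in extra]
        if all(math.dist(a[:3], b[:3]) > a[3] + b[3] + min_gap
               for a in base for b in moved):
            return moved
    return None


base = [(0.0, 0.0, 0.0, 1.32)]
extra = [(0.0, 0.0, 0.0, 0.66), (1.2, 0.0, 0.0, 0.66)]
placed = place_molecule(base, extra, r_range=(3.0, 5.0), seed=1)
```

Returning `None` after the attempt budget is exhausted lets a calling routine report that the requested distance limits cannot be satisfied without overlap.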

4.4.4 Simulation automation

Once the coordinates of the structures are generated, the program places the .xyz files into separate folders that are named with a unique identifier corresponding to the metal, ligands, and coordination geometry of the complex. A common next step in a catalyst or materials discovery workflow is to calculate properties of generated structures with first-principles methods. Our code incorporates a file generator that assists in the construction of input files and scripts for common quantum chemistry codes and high-performance computing queuing systems, respectively. Currently, molSimplify prepares input files for the quantum chemistry packages TeraChem [147, 209], GAMESS [191], and Q-Chem [210]. The user specifies the charge, spin state, type of calculation, method, basis set and any additional input that would be required for the calculation.

Importantly, the code "fact checks" some input choices in order to minimize user error. If the user-selected charge and spin state are incommensurate with the number of electrons in the associated structure, the code will present a warning and suggest an alternate charge assignment. Additionally, the code will suggest a total charge based on the charge state of selected ligands and a user-selected oxidation state, which aids in input generation. Another example of this fact-checking concerns non-closed-shell calculations in TeraChem, for which the user should employ an unrestricted method with optional level-shifting [211]: when a user chooses a non-singlet spin multiplicity, the input generation automatically defaults to these options. The user may also enter a series of common input commands that is stored for future use by selecting the "make default" option.
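The charge and spin consistency check rests on electron-count parity: a multiplicity of 2S+1 implies 2S unpaired electrons, and the remaining electrons must pair up. A minimal sketch is shown below; the function names and the small atomic number table are chosen for this example only.

```python
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "Fe": 26}


def electron_count(symbols, charge):
    """Total number of electrons for a list of element symbols and a net charge."""
    return sum(ATOMIC_NUMBER[s] for s in symbols) - charge


def spin_is_consistent(n_electrons, multiplicity):
    """True when (multiplicity - 1) unpaired electrons are possible."""
    unpaired = multiplicity - 1
    return (multiplicity >= 1
            and unpaired <= n_electrons
            and (n_electrons - unpaired) % 2 == 0)


def allowed_multiplicities(n_electrons, max_multiplicity=7):
    return [m for m in range(1, max_multiplicity + 1)
            if spin_is_consistent(n_electrons, m)]


# A bare Fe(II) ion: 26 - 2 = 24 electrons.
n_el = electron_count(["Fe"], charge=2)
options = allowed_multiplicities(n_el)
```

For an inconsistent request, changing the total charge by one flips the parity of the electron count, which is one simple way an alternate charge assignment could be suggested.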

Furthermore, in order to facilitate method benchmarking, convergence studies, and sensitivity analysis of the results, the user may specify multiple parameter values (e.g., exchange-correlation functionals or spin states), and the program will generate multiple input files, each corresponding to one set of input parameters. The molSimplify code generates job scripts for submission to the SGE [212] or SLURM [213] queuing systems, which are used in a majority of supercomputers and local computer clusters. The user can specify job identifiers, designate a queue, run wall time, required memory, modules that are loaded by the environment modules utility, and additional content through a text editor field. If the user specifies nothing, the code generates a basic queue script with job run command. Since run parameters vary by use case and cluster, the user is encouraged to employ the "make default" feature to store common settings. The integrated creation of structures, input files and job scripts promotes full automation, thus streamlining a procedure that would normally require significant time investment and human interaction.
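Generating one input per parameter combination, plus a basic SLURM script, can be sketched as below. The function names, the dictionary keys, and the run command are placeholders for this illustration and do not reproduce the input templates of the code; the `#SBATCH` options shown (`--job-name`, `--time`, `--mem`, `--partition`) are standard SLURM directives.

```python
from itertools import product


def sweep(base, **options):
    """Return one parameter dictionary per combination of the option values."""
    keys = sorted(options)
    jobs = []
    for values in product(*(options[k] for k in keys)):
        params = dict(base)
        params.update(zip(keys, values))
        jobs.append(params)
    return jobs


def slurm_script(job_name, run_command, partition=None,
                 walltime="24:00:00", mem_gb=8, modules=()):
    """Assemble a minimal SLURM submission script as a string."""
    lines = ["#!/bin/bash",
             "#SBATCH --job-name={}".format(job_name),
             "#SBATCH --time={}".format(walltime),
             "#SBATCH --mem={}G".format(mem_gb)]
    if partition:
        lines.append("#SBATCH --partition={}".format(partition))
    lines.extend("module load {}".format(m) for m in modules)
    lines.append(run_command)
    return "\n".join(lines) + "\n"


jobs = sweep({"charge": 2, "basis": "lacvps_ecp"},
             method=["b3lyp", "pbe0"], spinmult=[1, 5])
# The run command below is a placeholder, not a prescribed invocation.
script = slurm_script("fe_co6", "terachem job.in > job.out", partition="gpu")
```

Each dictionary in `jobs` would be written to its own input file, so a functional or spin-state comparison becomes a single submission loop.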

4.4.5 Structure-property correlation and analysis

The primary objective of any high-throughput catalyst and materials design screen is to unearth correlations between geometric or electronic structure and energetics. In order to realize this goal, molSimplify gathers output files, parses results, and carries out post-processing analyses (Table 4.1) intended to be general for a broad set of design challenges in inorganic chemistry. The code aids determination of structure-activity correlations by using our built-in routines and by leveraging external programs with established flexibility for wavefunction and electron density analysis [151,197]. The user is guided through the configuration of interfaces to external codes during molSimplify installation. The program streamlines large data set collection and analysis by providing an on-the-fly summary that includes calculation descriptors (the compound name, method employed, spin, and charge) alongside progress and results indicators (steps in a geometry optimization, convergence status, absolute energies, computing time, and ⟨S²⟩ values). These parsing routines are stored in the ’postparse.py’ script, and the user may modify this script to track other quantities for their specific application. All post-processing codes search recursively for .out and . files and therefore are independent of the user’s directory structure.

| Property | Software | Equation |
|---|---|---|
| Energy | molSimplify | - |
| General job information | molSimplify | - |
| Hirshfeld metal charge | Multiwfn | $Q_M = Z_M - \int \frac{\rho_M(r)}{\rho_{pro}(r)} \rho_{mol}(r)\,dr$ |
| VDD metal charge | Multiwfn | $Q_M = Z_M - \int_{\text{Voronoi cell}} [\rho(r) - \rho_{pro}(r)]\,dr$ |
| Mulliken metal charge | Multiwfn | $Q_M = Z_M - \sum_{\mu \in M} Q_\mu$ |
| Delocalization index of metal | Multiwfn | $\delta_M = \sum_B \sum_{k=\alpha,\beta} 2 \sum_{i \in k} \sum_{j \in k} \sqrt{\eta_i \eta_j}\, S_{ij}(M) S_{ij}(B)$ |
| Localization index of metal | Multiwfn | $\lambda_M = Z_M - \delta_M$ |
| Lowest eigenvalue | molSimplify | $e_0 = \min e_i,\ i = 1, 2, \ldots, \text{MOs}$ |
| d-band center | molSimplify | $e_d = \frac{\sum_i^{\text{MOs}} \eta_i \varepsilon_i \sum_j^{d\text{-AOs}} (C_j^i)^2}{\sum_i^{\text{MOs}} \sum_j^{d\text{-AOs}} (C_j^i)^2}$ |
| HOMO energy | molSimplify | $e_{HOMO} = \max_{\text{occupied MOs}} e_i$ |
| LUMO energy | molSimplify | $e_{LUMO} = \min_{\text{unoccupied MOs}} e_i$ |
| Energy gap | molSimplify | $e_{gap} = e_{HOMO} - e_{LUMO}$ |
| Fermi energy | molSimplify | $e_{fermi} = 0.5 \cdot (e_{HOMO} + e_{LUMO})$ |
| Average orbital occupation | molSimplify | $n_{av} = \frac{\sum_i^{\text{occupied MOs}} n_i}{\text{occupied MOs}}$ |
| Cubefiles | Multiwfn | - |
| HELP | molSimplify | $HELP = \int_V \rho(r)\,dr$, if $ELF(r) \geq 0.5$ |
| Average electron distance | molSimplify | $R_{av} = \int_V \rho(r)\, r_{e-M}\, d^3r / n_e$ |
| Standard deviation of electron distance | molSimplify | $\sigma_e = \sqrt{\int_V (\rho(r)\, r_{e-M} - R_{av})^2\, d^3r / n_e}$ |
| Average ELF | molSimplify | $ELF_{av} = \int_V \rho(r)\, ELF(r)\, d^3r / n_e$ |
| Standard deviation of ELF | molSimplify | $\sigma_e = \sqrt{\int_V (\rho(r)\, ELF(r) - ELF_{av})^2\, d^3r / n_e}$ |
| NBO metal charge | NBO | - |
| NBO metal orbitals contributions | NBO | - |
| NBO average d orbital occupations | NBO | - |
| NBO d-band center | NBO | - |

Table 4.1: Major properties reported or calculated with the molSimplify post-processing module.

External code post-processing analysis includes an interface to the natural bonding orbital (NBO) [151] v6.0 code for charge analysis and natural population analysis (NPA) [152]. From the output of NBO calculations, molSimplify summarizes the metal partial charge, the average contribution of the metal to NBOs shared with other elements, and the d-orbital occupation. We also interface with the Multiwfn [197] code to apply volume-oriented partitioning of the charge density, as exemplified by Bader’s quantum theory of atoms in molecules (QTAIM) [196] basin and population analysis. This added feature supports the use of Molden [214] format wavefunction files, which can typically be generated by any quantum chemistry software. In addition to Bader’s QTAIM charges, this interface extracts partial charges on the metal and direct ligand atoms with commonly employed charge analysis schemes including Voronoi deformation density (VDD) [215], Hirshfeld [216], and Mulliken charges (see Table 4.1). When employing QTAIM theory as implemented in Multiwfn, molSimplify identifies attractors and reports the delocalization index (DI) between two atom-centered attractors as a quantitative measure of the number of electron pairs delocalized (or shared) between two atomic basins for all cases where one atom is a metal atom. Importantly, for many inorganic chemistry applications, multiple attractors may be identified for a single atom. The molSimplify code uses a geometric clustering approach to group attractors and provide summarized localization index (LI) and DI results on a per-atom basis.

Our built-in codes carry out density topology analysis on cube format files (i.e., a 3D grid), which may be generated by Multiwfn from Molden format wavefunction files. By default, molSimplify generates the 3D grid for the total electron density, the spin density, the spin up and down electron densities, and the electron localization function (ELF) [217]. Our built-in routines also calculate 3D charge density properties including the average distance and standard deviation (i.e., spread) of the density from metal nuclei and the average value of the ELF. Following this analysis, the code retains the cube files for the user to further process or visualize.

The built-in routines parse Molden wavefunction files to analyze and report information about molecular orbital character and occupancy, including the energy of the highest occupied molecular orbital (HOMO), the energy of the lowest unoccupied molecular orbital (LUMO), the Fermi energy, and the d-band center [218]. Additionally, the code summarizes occupancies of individual atomic orbitals and total subshell contributions from s, p, and d atomic orbitals of the metal center. Extensions to characterize d-band width and shape as well as orbitals centered on other atoms or groups of atoms are straightforward to implement. Some users may prefer to employ other post-processing tools, such as cclib [219] or the recently introduced ORBKIT [220], which is a Python toolkit for postprocessing electronic structure calculations.
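The frontier orbital quantities listed in Table 4.1 follow directly from the orbital energies and occupations. The sketch below is illustrative and assumes the energies and occupations have already been read from a wavefunction file; `frontier_orbitals` is a hypothetical helper, not a molSimplify routine, and the gap uses the sign convention printed in Table 4.1.

```python
def frontier_orbitals(energies, occupations):
    """Return (e_HOMO, e_LUMO, e_gap, e_fermi) from MO energies and occupations."""
    occupied = [e for e, n in zip(energies, occupations) if n > 0]
    unoccupied = [e for e, n in zip(energies, occupations) if n == 0]
    e_homo = max(occupied)               # highest occupied eigenvalue
    e_lumo = min(unoccupied)             # lowest unoccupied eigenvalue
    e_gap = e_homo - e_lumo              # sign convention of Table 4.1
    e_fermi = 0.5 * (e_homo + e_lumo)    # midpoint of HOMO and LUMO
    return e_homo, e_lumo, e_gap, e_fermi


# Toy closed-shell example with four orbitals (energies in hartree).
e_homo, e_lumo, e_gap, e_fermi = frontier_orbitals(
    [-0.50, -0.30, -0.10, 0.05], [2, 2, 0, 0]
)
print(e_homo, e_lumo, e_gap, e_fermi)
```

For an unrestricted calculation the same function could be applied to each spin channel separately; how the two channels are then combined is a choice the text does not specify.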

4.5 Benchmarking molSimplify

In order to evaluate the overall robustness of molSimplify as a tool for automating the generation of inorganic complexes, we generated nearly two hundred transition metal complexes spanning a range of ligands and metal centers. Overall, 150 molecules are octahedral Fe(II) complexes randomly generated from a wide pool of ligands of varying coordinating element (C, N, O, P, Cl) and bulk or denticity, whereas the remaining 40 are Cr, Mn, Fe, Co, and Ni complexes [25] that have a similarly diverse set of ligands with both organic and halide coordinating atoms in coordination environments ranging from tetrahedral to octahedral. For each structure generated, we performed a single-point gradient calculation and computed the maximum and root mean squared (RMS) energy gradient, which serves as a metric of the quality of the generated structure by indicating the proximity of the structure to the closest stationary point in a DFT geometry optimization. All calculations presented here were carried out with the TeraChem [209] quantum chemistry package. The hybrid exchange-correlation functional B3LYP [9–11] was used, and in TeraChem the default definition employs the VWN1-RPA form for the LDA VWN [4] component of LYP [10] correlation. Heavy atoms (Fe) were treated with the LANL2DZ effective core potential basis while the 6-31G* basis was used for the remaining atoms.

In the first 150-molecule randomized test set, we used molSimplify to generate each structure in four ways: i) no force field optimization and the sum of covalent radii as metal-ligand distances, ii) no force field optimization with trained metal-ligand distances from our database, iii) MMFF94 [206] force field optimization of both the individual ligands and the final structure with the sum of covalent radii as metal-ligand distances, and iv) MMFF94 force field optimization of both the individual ligands and the final structure with trained metal-ligand bond lengths from our database. In order to evaluate the accuracy of each method, we compare total energies and maximum and RMS gradients of each molecule. We broadly group the 150 molecules in two ways: a) by denticity as monodentate (mono) and multidentate (multi) and b) by size as small or bulky ligands. Comparison of Case i against Case iii, or of Case ii against Case iv, reveals that force field optimization improves multidentate and bulky ligand placement but has no effect on smaller ligands, some of which are excluded from force field optimization due to poor bond distance estimation by MMFF94. We thus limit our comparisons to the most divergent Cases, i and iv (Fig. 4-7).
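The two gradient metrics used throughout this benchmark can be computed from the Cartesian gradient of a single-point calculation. This is a minimal sketch: it assumes the gradient is available as an array of shape (number of atoms, 3) in Ha/a.u., and it assumes that the RMS is taken over all Cartesian components, which the text does not state explicitly. The function name is hypothetical.

```python
import numpy as np


def gradient_metrics(gradient):
    """Return (RMS, max |component|) of an (n_atoms, 3) gradient in Ha/a.u."""
    g = np.asarray(gradient, dtype=float)
    rms = float(np.sqrt(np.mean(g ** 2)))        # RMS over all 3N components
    max_component = float(np.max(np.abs(g)))     # largest absolute component
    return rms, max_component


# Toy two-atom gradient with one nonzero component per atom.
rms, gmax = gradient_metrics([[0.03, 0.0, 0.0], [0.0, -0.04, 0.0]])
print(rms, gmax)
```

Smaller values of both metrics indicate a starting structure that lies closer to a stationary point of the DFT potential energy surface.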

[Figure 4-7 plot. Vertical axis: RMS gradient (Ha/a.u.), 0.00 to 0.12. Horizontal axis groups: mono, multi, small, bulky. Legend: no FF + CR, MMFF94 + ML.]

Figure 4-7: Comparison of RMS gradients with MMFF94 optimization and trained ML bond distances to no FF optimization and covalent radius (CR) bond distances for the 150-molecule semi-random test set grouped either by denticity (monodentate or multidentate) or by size (small or bulky). Whiskers represent the maximum and minimum of the data set.

Comparison of RMS gradients for Case i (no FF + CR in Fig. 4-7) and Case iv (MMFF94 + ML in Fig. 4-7) reveals that the median RMS gradient is systematically lowered for the 150-molecule set by 11-22%. The magnitude of reduction is largest for bulky or multidentate ligands, and comparison of maximum gradients reveals that the bulky grouping benefits from an 18% reduction in the median maximum gradient when force field optimization and trained ML distances are employed. As an example, the largest RMS gradient bulky structure without force field optimization is a tetrapyridyl complex that exhibits a much lower gradient when trained ML distances are employed. The MMFF94 equilibrium bond lengths of the smaller methylisocyanate ligand, on the other hand, are poor, leading to an increase in RMS gradient when the MMFF94 strategy is employed. Overall, RMS and maximum gradients are modest for all molSimplify-generated structures, and the user will likely see the largest benefit from selective force field optimization for bulky ligand alignment.

In addition to gradients, we compared the total energy of the structures generated in Case i and iv using the same denticity- and size-based groupings (Fig. 4-8). Both the median (∼ -20 kcal/mol) and first through third quartile energy ranges (the box shapes in Fig. 4-8) systematically favor the MMFF94 + ML strategy, especially for monodentate and bulky ligands. In addition to the previously identified outliers, trained ML bond lengths are in some cases overestimates, e.g. for an iron-quinoline complex (ΔE=59 kcal/mol), or force field optimization may generate sub-optimal ligand conformations, e.g. for an ethylamine and Schiff base ligand (ΔE=85 kcal/mol). Conversely, many outliers have strongly improved energies when employing the Case iv strategy, including the commonly employed bipyridinyl ligand.

We now focus on comparison of the most robust molSimplify strategy (MMFF94 partial optimization with trained ML bond lengths) to the universal force field (UFF) [207], which has been developed for characterization of inorganic complexes. We note that molSimplify also supports UFF force field optimization, but we focus on comparison of our divide-and-conquer ML+MMFF94 strategy against an all-UFF strategy. Other structure generation tools, such as OpenBabel, which supports UFF, cannot enable an automated workflow from SMILES to UFF-optimized 3D structure because the user must still manually specify the existence and order of metal-ligand bonds. Here, we have manually generated the correct bond order prior to UFF optimization in OpenBabel as our control comparison.

[Figure 4-8 plot. Vertical axis: ΔE(MMFF,ML - no FF,CR) (kcal/mol), -250 to 150. Horizontal axis groups: mono, multi, small, bulky.]

Figure 4-8: Energy differences with and without FF optimization and trained ML bond distances for the 150-molecule semi-random test set. Whiskers represent +/- 1.5 interquartile region, and outliers are shown as circles.

For the 40-molecule test set, single points could not be converged for 3 molecules optimized with UFF, and we assume that they were higher in energy than the molSimplify counterparts that did converge. In total, molSimplify generates a lower energy structure for 75% of the molecules (Table 4.2). The median RMS gradient and maximum component of the gradient are both lowered by around 40%.

| Metric | UFF | molSimplify MMFF94+ML |
|---|---|---|
| Lowest energy (#) | 10/40 | 30/40 |
| Median RMS grad (Ha/a.u.) | 0.032 | 0.019 |
| Median max. grad. (Ha/a.u.) | 0.054 | 0.033 |

Table 4.2: Comparison of UFF and molSimplify structure properties for 40 molecule test set.
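As a check on the percentages quoted in the text, the fraction of lowest-energy structures and the relative gradient reductions follow directly from the entries of Table 4.2:

```latex
\frac{30}{40} = 0.75, \qquad
\frac{0.032 - 0.019}{0.032} \approx 0.41, \qquad
\frac{0.054 - 0.033}{0.054} \approx 0.39
```

Both gradient reductions are therefore consistent with the stated value of around 40%.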

Separating the 40-molecule test set by metal center and maximum denticity of the substituent ligands reveals (Fig. 4-9) that UFF generally outperforms molSimplify for select multidentate ligands. The molSimplify code aligns multidentate ligands by weakly stretching the ligand following the MMFF94 force field in order to increase agreement with the desired alignment points for the ligand. The UFF strategy, on the other hand, is more flexible for select multidentate alignments. Notably, however, the MMFF94+ML molSimplify approach outperforms UFF by as much as 100 kcal/mol for relatively simple structures including a hexa-aqua Ni(II) complex, tetrabromo Co(II), and a tetraphenylporphyrin with axial pyridinyl ligands (Fig. 4-10).

[Figure 4-9 plot. Vertical axis: ΔE(UFF - MMFF94,ML) (kcal/mol), -100 to 100. Horizontal axis: Cr, Mn, Fe, Co, Ni. Legend: 1, 2, 4.]

Figure 4-9: Energy difference between molSimplify- and UFF-generated structures in kcal/mol sorted by metal and maximum ligand denticity: monodentate (gray circles), bidentate (red squares), or multidentate (green triangles).

The primary instances where UFF outperforms molSimplify by 50-100 kcal/mol margins are Co(III) complexes with bidentate carbonate or ethylenediamine ligands (Fig. 4-10). Here, we also note that the current database of ML trained bond lengths is limited to neutral ligands, and additional data for charged ligands, such as carbonate, will be added in the near future to improve molSimplify results further.

Lastly, we demonstrate the flexibility of molSimplify to generate complexes that have distorted coordination geometries. As an illustration, we consider a four-coordinate Ni(II) complex with two dipyrrole ligands that adopts a distorted tetrahedral geometry because the square planar form is sterically hindered. Standard implementations of UFF will only produce a see-saw geometry because the default Ni(II) atom type in UFF is octahedral. The molSimplify procedure, on the other hand, produces a more realistic distorted tetrahedral structure (Figure 4-11).

Figure 4-10: Representative complexes for which UFF (left) or molSimplify (MMFF94 and trained ML distances, right) generates a lower energy structure.

Figure 4-11: Comparison of UFF and molSimplify generated structures for distorted four-coordinate complex.

Here, we specified tetrahedral coordination for the two ligands, and molSimplify places ligands by maintaining the user-specified coordination environment as well as any trained metal-ligand bond lengths and minimizing steric repulsion between ligands through partial MMFF94 force field optimization. Alternatively, the user may generate the same geometry starting from a square planar configuration and then specifying custom angles to distort the geometry. Single-point gradient calculations (Table 4.3) confirm that the molSimplify-generated structure is lower in energy and has smaller maximum and RMS gradients.

| Metric | UFF | molSimplify |
|---|---|---|
| Relative energy (kcal/mol) | +28.5 | 0.0 |
| RMS grad (Ha/a.u.) | 0.031 | 0.029 |
| max. grad. (Ha/a.u.) | 0.094 | 0.078 |

Table 4.3: Comparison of UFF and molSimplify structural properties for Ni(II) complexes.

4.6 Conclusions

In this chapter, we presented the molSimplify toolkit, which i) automates the structure generation of inorganic complexes, ii) prepares input files and job scripts for electronic structure property determination, and iii) analyzes structure-property trends through post-processing analysis. Additional features in the structure generation module include database searching and randomized generation for chemical discovery, guided or random supramolecular complex building, and a feature to selectively alter molecular fragments. We have benchmarked the program over 190 molecules that were both randomly selected and representative of common inorganic complexes. Our trained metal-ligand bond distances and selective force field optimization reduced gradients by around 20% compared to unoptimized placement. We confirmed this best use strategy by comparison to UFF optimization results, where median molSimplify gradients were 40% lower as well. Overall, molSimplify provides a flexible but robust strategy to generate good starting structures for electronic structure characterization in high-throughput screening efforts.

We expect molSimplify to have wide applicability in fields ranging from bioinorganic chemistry to materials science and catalysis. The current version of the open source code is documented and published online under the GPL license (available at http://molsimplify.mit.edu).

Chapter 5

Selective anion binding by functionalized organometallics

Reprinted (adapted) with permission from [221]. Copyright 2016 American Chemical Society.

Selective binding and recognition of one species over another is a fundamental component of catalyst design. Ferrocene is a redox active inorganic complex that has promising applications in catalysis and organic synthesis [222, 223]. Ferrocene complexes have also been recently employed as inexpensive, mild catalysts for the etherification of alcohols and the photodecomposition of [224–226]. Ferrocenium, the oxidized state of ferrocene, can catalyze Michael addition reactions, while the reduced ferrocene state does not [227]. Ferrocenium is able to selectively bind more asymmetric anions such as carboxylates over more spherical anions such as perchlorate, despite the comparable size of the two species. In such cases where two oppositely charged species form an intermolecular complex, strong electrostatic attraction is expected to dominate binding interactions [228–231], but selective binding of one ion over another also suggests that more complex interactions may be involved.

5.1 Computational details

Density functional theory (DFT) calculations were carried out with the TeraChem [147, 209] quantum chemistry package. The hybrid exchange-correlation functional B3LYP [9–11] was used, and in TeraChem the default definition employs the VWN1-RPA form for the LDA VWN [4] component of LYP [10] correlation. Heavy atoms (Fe) were treated with the LANL2DZ effective core potential basis while the 6-31G* basis was used for the remaining atoms. Ferrocenium studies were carried out using unrestricted DFT with level shifting (spin up 1.6, spin down 0.1). The isolated ferrocenium core was assigned a +1 charge, while ferrocenium-anion complexes had a net neutral charge. Dispersion corrections (DFT-D3 [232,233]) were employed to account for the long range intermolecular interactions of the ferrocenium-anion complexes, while solvation effects were incorporated with the implicit solvation model COSMO [234–236] using a dielectric constant of ε = 78.39 that corresponds to water. Basis set superposition errors (BSSE) were calculated using the counterpoise method [237] but accounted for less than 0.5 kcal/mol. Geometry optimizations were carried out using the L-BFGS algorithm in Cartesian coordinates, as implemented in DL-FIND [150], to default thresholds of $4.5 \times 10^{-4}$ hartree/bohr for the maximum gradient and $10^{-6}$ hartree for the change in self-consistent energy between steps.

The natural bonding orbital (NBO) [151] code is employed for charge analysis, and the Multiwfn [197] code is used for quantum theory of atoms in molecules (QTAIM) [196] basin and population analysis. Within QTAIM, the delocalization index, $\delta_{A,B}$, between two atoms A and B is a quantitative measure of the number of electron pairs delocalized (or shared) between two atomic basins. It is calculated as:

$$\delta_{A,B} = 2 \sum_{k=\alpha,\beta} \sum_{i \in k} \sum_{j \in k} \sqrt{\eta_i \eta_j}\, S_{ij}(A)\, S_{ij}(B), \tag{5.1}$$

where the first sum is over spins and $\eta_i$ is the occupation of molecular orbital i. $S_{ij}(A)$ is the atomic overlap matrix (AOM) element defined as:

$$S_{ij}(A) = \int_A \phi_i(r)\, \phi_j(r)\, dr, \tag{5.2}$$

where the integration is over the atomic space of A and $\phi_i$ is the molecular orbital i. The total delocalization index for atom A can be calculated by summing the contributions from the different atoms as:

$$\delta_A = \sum_B \delta_{A,B}. \tag{5.3}$$

These metrics are used to relate properties of the complexes to relative affinities for anion binding.
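Eq. 5.1 can be evaluated with a few array operations once the atomic overlap matrices are available. The sketch below is illustrative only: it assumes the AOMs of Eq. 5.2 have already been integrated (e.g., by an external code) and are supplied per spin channel, and the function name and dictionary layout are hypothetical.

```python
import numpy as np


def delocalization_index(aom_a, aom_b, occupations):
    """Evaluate Eq. 5.1 for one pair of atoms A and B.

    aom_a, aom_b : dict mapping a spin label to an (n, n) atomic overlap matrix
    occupations  : dict mapping a spin label to a length-n occupation vector
    """
    total = 0.0
    for spin, eta in occupations.items():
        eta = np.asarray(eta, dtype=float)
        weights = np.sqrt(np.outer(eta, eta))          # sqrt(eta_i * eta_j)
        s_a = np.asarray(aom_a[spin], dtype=float)
        s_b = np.asarray(aom_b[spin], dtype=float)
        total += float(np.sum(weights * s_a * s_b))    # sum over i and j
    return 2.0 * total


# Toy two-electron bond: one occupied orbital per spin, shared equally
# between basins A and B (overlap of 0.5 in each basin).
occ = {"alpha": [1.0], "beta": [1.0]}
aom_a = {"alpha": [[0.5]], "beta": [[0.5]]}
aom_b = {"alpha": [[0.5]], "beta": [[0.5]]}
di = delocalization_index(aom_a, aom_b, occ)
print(di)
```

In this toy case the index equals one shared electron pair, as expected for a symmetric single bond. Summing the function over all partner atoms B gives the total index of Eq. 5.3.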

5.2 Binding modes

Here, we first employ the automatic random generation of initial positions of the anion with respect to the ferrocenium species by varying both distance and angles, enabling identification of all possible binding modes between the two species. In total, we generated 100 formate-ferrocenium initial configurations and carried out geometry optimizations to obtain the closest local minima (Figure 5-1). The strength of electrostatic attraction is so high that a local minimum corresponding to a binding interaction is found as long as the initial distance between the centers of mass of the two species is no more than 9 Å. These calculations are carried out in implicit solvent with a dielectric constant corresponding to water (ε = 78.39).

Anion adsorption energies were calculated by subtracting the total electronic energies of the optimized isolated ferrocenium, E(Fc), and anion, E(anion), from the energy of the optimized anion/ferrocenium complex, E(anion/Fc):

$$\Delta E_{ads} = E(\mathrm{anion/Fc}) - E(\mathrm{Fc}) - E(\mathrm{anion}). \tag{5.4}$$
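Eq. 5.4 translates into a one-line function once the three total energies are known. This is a minimal sketch: the energies in the example are made-up placeholders in hartree, not values from this work, and the only physical constant used is the standard hartree to kcal/mol conversion factor.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095


def adsorption_energy(e_complex, e_fc, e_anion):
    """Eq. 5.4 in kcal/mol; inputs are total electronic energies in hartree."""
    return (e_complex - e_fc - e_anion) * HARTREE_TO_KCAL_PER_MOL


# Placeholder energies chosen so that the complex is 0.020 hartree more stable
# than the separated fragments.
de_ads = adsorption_energy(e_complex=-700.520, e_fc=-511.300, e_anion=-189.200)
print(round(de_ads, 2))
```

With this sign convention, more negative values correspond to stronger binding.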

Figure 5-1: Initial guesses for formate positioning (lines representation) and final optimized positions (ball & stick representation) for the binding modes observed in the calculations.

A wide range of adsorption energies is observed, with the weakest binding configuration corresponding to an adsorption energy (ΔEads) of -4 kcal/mol and the strongest corresponding to -14 kcal/mol (Figure 5-2). While the range is large, the majority of data points fall within ΔEads = -10 to -14 kcal/mol. The strength of binding correlates, albeit weakly, with the distance between the iron center of ferrocenium and formate. If the binding were purely electrostatic, we would expect the binding energies to strictly follow a 1/r dependence on the iron-formate distance. Instead, analysis of the final optimized structures revealed that the binding energies clustered into six distinct binding modes. Four lateral binding modes are found, corresponding to an open ferrocenium (centered at 2.5 Å, green circles in Fig. 5-2, and labeled as lat-open) or a closed ferrocenium with both oxygen atoms (3.5 Å, red circles, lat-O2), one oxygen (3.8 Å, gray circles, lat-angle), or the hydrogen (3.7 Å, yellow circles, lat-H) oriented towards ferrocenium. Two vertical binding modes are also observed in which a formate oxygen binds to one of the cyclopentadienyl rings of the ferrocenium either with both oxygen atoms oriented down towards the iron center (4.5 Å, blue circles, vertical1) or with one oxygen oriented up away from the iron center (4.8 Å, magenta circles, vertical2).

Figure 5-2: Adsorption energies in kcal/mol versus iron-anion distance in Å for 100 different configurations. The data revealed 6 different groupings of binding modes of formate on ferrocenium. Three of the binding modes correspond to lateral binding with the oxygen atoms facing the metal center with ferrocenium having either closed or open cyclopentadienyl rings and the anion approaching either perpendicularly or at an angle. In another mode of lateral binding the anion binds with the hydrogen facing the metal center. The last 2 binding modes correspond to vertical binding where the formate ion approaches one of the cyclopentadienyl rings from the top and oxygen binds to one carbon atom with the rest of the anion oriented either horizontally or vertically.

In the shortest distance binding mode, the small formate ion causes the cyclopentadienyl rings to open, allowing formate to approach the metal core in what we denote as indirect binding. Relatively strong binding is observed in this mode, with adsorption energies close to -12 kcal/mol. The strongest binding (ΔEads = -11 to -14 kcal/mol) is observed instead for the lateral binding without deformation of the ferrocenium structure, centered around 3.5 Å. This lateral mode was the main binding mode observed, with most of the calculations converging to this relaxed structure. The widest distribution of ΔEads (-6 to -11 kcal/mol) and distances (3.6 to 4.0 Å) is observed for the lateral mode with a single oxygen oriented towards the metal. The weakest lateral binding mode, with hydrogen oriented towards the center, is occasionally observed with ΔEads = -4 kcal/mol. The difference in ΔEads observed for the two vertical binding modes (-9 kcal/mol for vertical1 and -6 kcal/mol for vertical2) appears to be largely due to the presence of a hydrogen bonding interaction for the downward facing oxygen in vertical1, which is absent in vertical2. Overall, the most competitive binding mode is observed to be the one in which both oxygen atoms of formate orient toward the ferrocenium core, followed by the open configuration.

As a point of comparison, we also investigated the binding energy of perchlorate with ferrocenium using the same approach. In this case, the same number of initial guesses was generated, but only a single lateral binding mode was observed. The range of adsorption energies was also very narrow, with all results falling within a 0.5 kcal/mol range centered at approximately -6 kcal/mol.

5.3 Selective binding

In order to identify whether functionalization can tune the relative binding energies of perchlorate and formate, we used molSimplify to generate a set of 44 functionalized ferrocenium structures where one hydrogen atom on the cyclopentadienyl ring was substituted by different functional groups (FGs) (Figure 5-3). We used the "custom core" feature in molSimplify, which aligns the new FG along the bond vector of the previously removed FG (in this case a hydrogen atom), assigns the new FG-core bond distance according to the sum of covalent radii, and performs pre-defined rotation routines to reduce steric repulsion. This approach ensures maximum coincidence for the initial position of the ion across FGs and ensures evaluation of direct anion-FG interactions.

[Figure 5-3 image: chemical structures of the 44 functional groups, numbered 1 to 44.]

Figure 5-3: Functional groups (FG) in the computational screening data set. The connection to the ferrocenium core is denoted by a gray bond, and the colors of FG indices represent polar (blue), slightly polar (green) and nonpolar (red) character.

For relative binding energy comparisons, 10 initial guesses were generated using the constraint feature to select distances that primarily lead to the lat-O2 binding mode in unfunctionalized ferrocenium. The choices for functionalization include bulky substituents (e.g., diphenylphosphine: -PPh2, benzene: -C6H5), halogens (e.g., -Br and -Cl), nonpolar (e.g., ethyl: -CH2CH3), electron withdrawing (e.g., carboxylic: -COOH, trifluoromethyl: -CF3, aldehyde: -CHO), sulfur-containing (e.g., thiocyanic acid: -SCN, isothiocyanic acid: -NCS) and nitrogen-containing (e.g., imino: -CHNH, ammonia: NH3) groups. The FGs were grouped by the largest difference in Pauling electronegativity (χ) between bonded atoms i and j:

$$\Delta\chi = \max |\chi_i - \chi_j|. \tag{5.5}$$

Using this relative polarity metric, the 44 FGs may be divided into three roughly equally sized groups: i) nonpolar: Δχ < 0.4 (red, 14 FGs), ii) slightly polar: 0.5 < Δχ < 0.8 (green, 14 FGs), and iii) polar: Δχ > 0.8 (blue, 16 FGs). We expect the functional groups to modulate the adsorption energetics of perchlorate and formate in a differing manner, thus allowing us to shift the relative binding affinity.
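The grouping by Eq. 5.5 can be sketched in a few lines. The example below is illustrative and makes several assumptions: the electronegativities are standard Pauling values for a handful of elements, the bond lists are hand-written rather than derived from a structure, and values of Δχ between 0.4 and 0.5, which the thresholds in the text leave unassigned, are placed in the slightly polar class. The resulting labels are outputs of this sketch and are not taken from Figure 5-3.

```python
# Standard Pauling electronegativities for the elements used below.
PAULING = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44, "F": 3.98, "Cl": 3.16}


def max_delta_chi(bonds):
    """Eq. 5.5: largest |chi_i - chi_j| over the bonded atom pairs of an FG."""
    return max(abs(PAULING[a] - PAULING[b]) for a, b in bonds)


def polarity_class(delta_chi):
    """Bin by the thresholds in the text (the 0.4 to 0.5 window is an assumption)."""
    if delta_chi < 0.4:
        return "nonpolar"
    if delta_chi > 0.8:
        return "polar"
    return "slightly polar"


# Hand-written bond lists, including the bond to the ring carbon.
methyl = [("C", "C"), ("C", "H"), ("C", "H"), ("C", "H")]
chloro = [("C", "Cl")]
trifluoromethyl = [("C", "C"), ("C", "F"), ("C", "F"), ("C", "F")]

for name, bonds in [("-CH3", methyl), ("-Cl", chloro), ("-CF3", trifluoromethyl)]:
    print(name, polarity_class(max_delta_chi(bonds)))
```

Whether the bond between the FG and the ring carbon should be included in the maximum is not specified in the text; it is included here as a further assumption.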

5.3.1 Hydrogen bonding

Adsorption energies calculated for both formate and perchlorate with functionalized ferrocenium span a large range (Figure 5-4). Formate binding is sensitive to the presence of functional groups, with ΔEads ranging from -11 to -22 kcal/mol. Functionalization in some cases reduces adsorption energies with respect to the unfunctionalized ferrocenium (ΔEads = -12 kcal/mol). The strongest binding is observed for the alkanolamine (43) functionalization, while the weakest binding corresponds to diisopropylamine (3) (Figure 5-5).

Figure 5-4: Histogram of adsorption energies in kcal/mol for formate (red) and perchlorate (green). Formate shows a wider range of values while perchlorate has a narrow range of adsorption energies. The dashed lines denote the adsorption energies for the non-functionalized ferrocenium.

Analysis of the lateral mode of binding of formate with unfunctionalized ferrocenium and comparison to the functionalized cases can be used to rationalize these differences in binding. Review of the structure of the unfunctionalized case reveals that two hydrogen bonds are formed between a cyclopentadienyl ring and the oxygen atoms of formate. This hydrogen bonding is always asymmetrical, and no cases in which four hydrogen bonds are formed are observed in any of the structures. The H···O distance ranges from 2.0-2.2 Å, and the angle for the C-H···O bond is around 140°. The height of the ferrocenium complex, as defined by the distance between eclipsed hydrogen atoms on the two cyclopentadienyl rings, is typically around 3.5 Å. By constraining the H···O distance to 2.1 Å, we can compute that a configuration that allows for four hydrogen bonds between formate and ferrocenium would require a less obtuse C-H···O angle of 125°, which would correspond to much weaker hydrogen bonds than the more obtuse value [238].

Turning now to the functionalized cases with strong formate binding, the strongest-binding FG, alkanolamine (43), forms an O-H···O hydrogen bond with a 1.6 Å H···O distance between the alcohol H and the formate O (Figure 5-5), which is shorter than typical 1.7 to 2.0 Å distances due to the negative charge on formate. In this complex, formate prefers an angled orientation to minimize the distance to the hydrogen bond donor on the FG rather than the lateral orientation preferred in pristine ferrocenium. In the weak binding case, the same geometric hydrogen bonding analysis also proves fruitful. The isopropyl hydrogen atoms that act as hydrogen bond donors in the weakest binding and least selective FG, diisopropylamine (3), are too far away from the ferrocenium core (Figure 5-5). These FG hydrogen bond donors pull formate away from the ferrocenium core (Fe to anion distance of 4.1 Å versus 3.5 Å in the unfunctionalized case), which simultaneously weakens hydrogen bonds with the cyclopentadienyl ring and reduces the distance-dependent electrostatic contribution to adsorption.
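The geometric hydrogen bond analysis used above relies on two quantities: the H···O distance and the C-H···O angle at the hydrogen atom. The following is a minimal sketch of how both can be measured from Cartesian coordinates in Å; the function name and the example coordinates are hypothetical and constructed to reproduce a 2.1 Å, 140° contact.

```python
import numpy as np


def hbond_geometry(donor, hydrogen, acceptor):
    """Return (H...A distance, D-H...A angle in degrees) from Cartesian coordinates."""
    d, h, a = (np.asarray(p, dtype=float) for p in (donor, hydrogen, acceptor))
    v_hd = d - h                                  # vector from H to the donor atom
    v_ha = a - h                                  # vector from H to the acceptor atom
    distance = float(np.linalg.norm(v_ha))
    cos_angle = np.dot(v_hd, v_ha) / (np.linalg.norm(v_hd) * distance)
    angle = float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))
    return distance, angle


# C-H bond of 1.09 Å along -x, with O placed 2.1 Å from H at a 140° angle.
theta = np.radians(140.0)
carbon = [-1.09, 0.0, 0.0]
hydrogen = [0.0, 0.0, 0.0]
oxygen = 2.1 * np.array([-np.cos(theta), np.sin(theta), 0.0])
dist, ang = hbond_geometry(carbon, hydrogen, oxygen)
print(round(dist, 2), round(ang, 1))
```

A perfectly linear hydrogen bond corresponds to 180°, so smaller angles such as the 125° discussed above indicate a more bent, weaker interaction.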

For perchlorate, ΔEads values range from -7 kcal/mol to -13 kcal/mol. The pre- dominant adsorption energies are centered around -8 to -9 kcal/mol, which is close to the unfunctionalized value of -7 kcal/mol. The fact that perchlorate is less sensitive to ferrocenium functionalization is supported by analysis of the structure of perchlorate

121 Figure 5-5: Hydrogen bonding interactions between (43) alkanolamine-functionalized ferrocenium and formate (top left) or perchlorate (top right) compared to (3) diisopropylamine-functionalized ferrocenium and formate (bottom left) or perchlo- rate (bottom right). The magenta and green dashed lines represent short (< 2.3 Å) and long (> 2.3 Å) hydrogen bonds, respectively.

when it binds to ferrocenium (see Figure 5-5). Three hydrogen bonding interactions may be nominally formed between perchlorate and ferrocenium with distances around 2.1-2.2 Å and angles ranging from 117 to 128°. Since we have already identified that hydrogen bond strength is poor for smaller hydrogen bonding angles, the predominant factor in adsorption energy of perchlorate with a functionalized ferrocenium is purely electrostatic, which is less dramatically modulated by the functionalization than hydrogen bond introduction or disruption.

The specific FGs that strengthened formate adsorption do not strengthen perchlorate adsorption. For the strongest formate-binding FG (alkanolamine, 43), there is no direct FG-perchlorate interaction, and the FG instead forms a 2.0 Å intramolecular O-H···N bond with itself. Most of the 7 longer C-H···O hydrogen bonds (2.3-2.7 Å) are instead between perchlorate and the FG ethyl chain. For the weak-formate-binding FG (diisopropylamine, 3), perchlorate's larger size enables interaction with the isopropyl hydrogen atoms through an additional 2.3 Å, 160° C-H···O bond while maintaining hydrogen bonds with the cyclopentadienyl ring, strengthening the overall ΔEads.

Our primary interest is in identifying the functionalizations that shift the relative binding strength of formate with respect to perchlorate. While the ranges of adsorption energies overlap between perchlorate and formate, the distributions are not identical (see Figure 5-4), suggesting there should be some cases that favor one ion over the other. The relative adsorption energy for a given structure i is defined as:

$$\Delta E_{\mathrm{ads},i}^{\mathrm{f\text{-}p}} = \Delta E_{\mathrm{ads},i}^{\mathrm{form}} - \Delta E_{\mathrm{ads},i}^{\mathrm{perc}}, \tag{5.6}$$

and values greater than 0 denote higher preference for perchlorate, while values less than 0 denote higher preference for formate. Comparison of $\Delta E_{\mathrm{ads}}^{\mathrm{f\text{-}p}}$ plotted against $\Delta E_{\mathrm{ads}}^{\mathrm{f}}$ confirms that relative affinity for formate or perchlorate correlates well (R2=0.9) to the formate binding energy (Figure 5-6). All the functionalizations still show binding preference for formate over perchlorate, although the largest $\Delta E_{\mathrm{ads}}^{\mathrm{f\text{-}p}}$ of around -9 kcal/mol is only a slight increase over the unfunctionalized case. Although there are a few outliers, overall energetics indicate that enhancing formate binding also increases relative affinity for formate.
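The sign convention of Eq. 5.6 can be illustrated with a minimal sketch. The function names and the two example structures below are hypothetical, and the energies are placeholder values rather than results from the data set.

```python
def relative_adsorption_energy(e_ads_formate, e_ads_perchlorate):
    """Eq. 5.6: formate minus perchlorate adsorption energy (kcal/mol)."""
    return e_ads_formate - e_ads_perchlorate


def preferred_anion(delta_e_fp):
    """Negative values favor formate, positive values favor perchlorate."""
    if delta_e_fp < 0:
        return "formate"
    if delta_e_fp > 0:
        return "perchlorate"
    return "neither"


# Hypothetical adsorption energies (kcal/mol) for two illustrative structures.
# These numbers are placeholders, not values from the screened data set.
examples = {
    "structure_A": {"formate": -20.0, "perchlorate": -9.0},
    "structure_B": {"formate": -12.0, "perchlorate": -8.0},
}

for name, energies in examples.items():
    delta = relative_adsorption_energy(energies["formate"], energies["perchlorate"])
    print(name, delta, preferred_anion(delta))
```

Because both adsorption energies are negative for bound anions, a more negative difference means that functionalization stabilizes formate more than it stabilizes perchlorate.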

5.3.2 Additional correlations

While hydrogen bonding explains the extrema in our data set well, the more subtle variations in relative binding affinity are likely due to other electronic structure variations. Including indirect adsorption modes that would be preferred for anion/FG-ferrocenium complexes with ΔEads less than unfunctionalized ferrocenium and constraining our data set to a subset of functionalizations that show increased indirect binding, we identified correlations between relative adsorption strength and different charge measures. The molSimplify code automatically evaluates several quantities that may be useful in extracting correlations in energetics between related molecules.

Figure 5-6: Relative adsorption energies in kcal/mol versus corresponding formate adsorption energy in kcal/mol on the same structure.

The quantities automatically computed included distance of the anion from the metal center, metal partial charge, electron localization on the metal with localization or delocalization index from QTAIM or the high electron localization domain population (HELP) [239], frontier orbital energies and the d-band center of the isolated species [218]. For any given screening effort, some of these properties may correlate more or less strongly to the energetics of interest, and it is potentially fruitful to consider multiple properties. In the case of ferrocenium complex screening, iron partial charge in the isolated ferrocenium complexes and differential delocalization indices between the two bound species are correlated to relative binding affinity. Interestingly, the d-band center, which is widely employed for screening in heterogeneous catalysis [218], did not correlate well to the relative anion binding affinities.

The iron partial charge for the functionalized ferrocenium complexes without the presence of ions was calculated using NBO [151] population analysis. More positive iron charge correlates (R2 = 0.51) to preferential perchlorate binding, while a more neutral iron charge correlates to preferential formate binding (Figure 5-7). Many functionalizations lead to a more positive partial charge on the iron with respect to the unfunctionalized ferrocenium, likely due to the large number of functional groups in the data set that act as electron-withdrawing groups. More positive partial charge on the iron likely enhances the electrostatic component of binding, which is the main factor in perchlorate binding since its larger size prevents the formation of strong hydrogen bonds. At the same time, an increase in partial charge on iron corresponds to an increase in electron density on the cyclopentadienyl rings, reducing the strength of hydrogen bonding interactions. Therefore, an increase in iron partial charge should preferentially benefit perchlorate binding. This observation suggests that further screening of functionalized ferrocenium complexes may be at least partially carried out without evaluating the anion binding energies.

A stronger correlation (R2 = 0.75) is observed for the difference in delocalization indices of iron between the formate and perchlorate cases ($\Delta\delta_{\mathrm{Fe}}^{\mathrm{f\text{-}p}}$, Figure 5-7). The iron delocalization index (see Sec. 5.1 for more details) provides a measure of how many electrons are shared between iron and neighboring atoms. A positive $\Delta\delta_{\mathrm{Fe}}^{\mathrm{f\text{-}p}}$ indicates more hybridization between iron and the neighboring atoms for formate, while a negative value indicates more delocalization for perchlorate binding. This trend can again be rationalized in part by invoking the observation that formate makes directional hydrogen bonds with the ferrocenium complex while perchlorate binding is more strongly determined by electrostatics. Reduced delocalization in the formate case likely indicates greater electrostatic binding and reduced hydrogen bonding. Conversely, an increase in delocalization suggests more electrons are participating in bonding and a spread-out electron density that may slightly reduce the favorable electrostatic interactions between the anion and ferrocenium. These results indicate that future screens of complexes that can selectively bind ions of comparable size should target higher differential electron delocalization in concert with directional hydrogen bonds.
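For reference, the differential delocalization index discussed in this paragraph can be written out explicitly. The sign convention is inferred from the statement that positive values correspond to more electron sharing in the formate complex; the superscripts form and perc are borrowed from Eq. 5.6 and are an assumption about notation.

```latex
\Delta\delta_{\mathrm{Fe}}^{\mathrm{f\text{-}p}}
  = \delta_{\mathrm{Fe}}^{\mathrm{form}} - \delta_{\mathrm{Fe}}^{\mathrm{perc}}
```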

Figure 5-7: (Top) Relative adsorption energies in kcal/mol versus the difference in the Delocalization Index between the two adsorbed intermolecular complexes that correspond to the two anions. (Bottom) Relative adsorption energies in kcal/mol versus NBO charge of iron in the isolated functionalized structures.

5.4 Conclusions

In this application, we unveiled 6 different binding modes that allow small ions to interact with a ferrocenium core directly and indirectly. We additionally identified that key hydrogen bonding interactions enhance selectivity of ferrocenium toward carboxylates over perchlorate. We also identified a reasonable correlation of enhanced selectivity for formate with more neutral iron partial charge in the isolated complex, suggesting that future screens of ferrocenium complexes could be carried out without directly computing binding energies. In both examples, new key interactions were unearthed, providing a path forward to tune properties of key materials and catalysts.

Chapter 6

CO binding on metalloporphyrins

The binding of small molecules to metalloporphyrins and heme groups is of considerable biological importance and has been studied extensively [240,241]. Heme cofactors (hemoglobin, myoglobin) are well-known oxygen carriers [242] and are also crucial for catalyzing important redox reactions in the body [243]. The origin of the interesting physical properties of these complexes is the partial filling of the central metal frontier d orbitals, which are responsible for their high reactivity. However, for a detailed understanding of these enzymatic functions it is necessary to consider not only the metal cofactor but also the surrounding environment, taking into account interactions with distal groups and conformational changes caused by the ligand binding [244]. Binding of small molecules on metalloporphyrins has been widely studied both experimentally [245] and computationally [244, 246] with primary focus on ferrous complexes [247]. The affinity of the ligand toward the receptor is largely determined by its binding energy to the transition metal site, and it can be tuned by the heme environment [248]. Although the process of ligand binding may seem simple and well understood from a computational perspective, this is not the case for the binding of CO ligands to open-shell transition metal sites. The existence of multiple possible electronic states poses a significant challenge in studying and understanding porphyrin properties and is still a subject of ongoing research [249,250].

In addition, despite the large amount of available experimental and theoretical data, the effect of ligand functionalization on the binding strength of CO is still unknown. Accurate estimation of binding energies in metalloporphyrins is particularly difficult mainly due to the multiple spin states involved, which can differ between bound and unbound species [251] and are strongly affected by the distal axial ligand.

Density functional theory has been the method of choice for studying these systems, mainly due to the combination of low computational cost and relatively good accuracy. However, it is well known that even within DFT, predictions from different exchange-correlation functionals regarding electronic (relative spin-state energetics), geometrical (bond lengths) or optical (band gaps) properties can vary substantially. In particular, DFT at the generalized gradient approximation (GGA) level of theory fails to accurately reproduce the electronic structure of metalloporphyrins, with its poor performance mainly attributed to the selective overstabilization of low-spin electronic states and overestimation of the high-spin/low-spin energy gap. Furthermore, practical DFT fails in general to accurately describe inorganic complexes, with one suggested reason being the inherent problem of unphysical delocalization of the valence d electrons due to self-interaction error (SIE) present in all pure density functionals. A popular approach for partially correcting SIE is including a fraction of exact, Hartree-Fock exchange (HFX) in hybrid density functionals, usually in the range of 20-50% HFX. The amount of exact exchange included in the functional strongly affects the predictions for inorganic complexes and is an important parameter of the calculation that the practitioner has to select.
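The role of the HFX fraction can be made concrete with the generic one-parameter global hybrid form, shown here only as an illustration. The symbol a_HF for the exact-exchange fraction is introduced for this sketch; B3LYP itself uses a three-parameter mixing with 20% exact exchange, so the expression below is a simplified stand-in rather than the functional definition used in this work.

```latex
E_{xc}^{\mathrm{hyb}}
  = a_{\mathrm{HF}}\, E_{x}^{\mathrm{HF}}
  + \left(1 - a_{\mathrm{HF}}\right) E_{x}^{\mathrm{GGA}}
  + E_{c}^{\mathrm{GGA}},
\qquad 0 \le a_{\mathrm{HF}} \le 1
```

Setting a_HF = 0 recovers the pure GGA, and increasing a_HF replaces a growing share of semilocal exchange with exact exchange.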

The present work includes a systematic theoretical study of CO binding in a series of functionalized metalloporphyrins using DFT. Rather than using a standard amount of HFX, we additionally aim to understand the way in which relative energetic, electronic and structural properties of metalloporphyrins in different spin states are affected by the amount of exact exchange included.

6.1 Computational details

We generated a total of 21 metal tetraphenyl porphyrin (MTPP-L) structures with a combination of 3 metal centers (Mn, Fe, and Co) in the +2 and +3 oxidation states and 7 distal axial ligands (CO, NO2–, NH3, imidazole, H2O, SH–, and F–) using the recently developed structure building toolkit molSimplify [195]. In molSimplify, the additional molecule placement feature was used to add a bound CO molecule as a proximal ligand (Figure 6-1). The distal ligand set was selected to span the spectrochemical series with strong-field ligands (π-acceptors, e.g., CO, NO2–), intermediate-field ligands (σ-donors, e.g., NH3, imidazole, H2O), and weak-field ligands (π-donors, e.g., SH–, F–). Notably, our set includes the biologically relevant imidazole (imd), representative of coordination of the heme cofactor in hemoglobin and myoglobin [252], and the thiolate (SH–) ligand that mimics the coordination environment in the enzyme cytochrome P450 [253], which have both been the focus of previous computational studies [254,255].

Figure 6-1: Structure of metal tetraphenyl porphyrin (MTPP) functionalized with one axial ligand, L (L-MTPP) and with CO bound on top. Three metals were used as the center atoms (Mn, Fe, Co) and 7 ligands were attached for the functionalization (CO, NO2–, NH3, imidazole, H2O, SH–, F–). The ligands are presented in the order of their ligand field strength (stronger to weaker) according to the spectrochemical series.

Mid-row transition metals are well-known to have a range of accessible spin states, which have also been observed experimentally when these metals are coordinated by the TPP macrocycle. Thus, for each MTPP-L with and without a proximal CO ligand, we characterize a high-spin (HS), intermediate-spin (IS), and low-spin (LS) state. For Mn(III), Fe(II) and Co(III) these correspond to a quintet, triplet and singlet spin state, respectively, whereas for Mn(II) and Fe(III) they correspond to a sextet, a quartet and a doublet spin state. For the Co(II) ion, the lowest energy sextet resides over 5 eV above the ground state, and we thus consider only the quartet and doublet, which we refer to as the HS and LS states, respectively.

Electronic structure calculations were carried out using the TeraChem [147, 209] graphical processing unit (GPU)-accelerated quantum chemistry package with the B3LYP [9–11] hybrid exchange-correlation functional augmented with empirical DFT-D3 dispersion [232,233]. The default definition of B3LYP in TeraChem employs the VWN1-RPA form for the LDA VWN [4] component of LYP [10] correlation. Additionally, the effect of exact exchange was studied by modifying Hartree-Fock exchange percentages to 15% and 25% following a previous procedure [94] and performing single-point energy calculations on the B3LYP optimized geometries. Metal atoms were described with the LANL2DZ effective core potential, and the 6-31G* basis set was used for the remaining atoms. Geometry optimizations both with and without the proximal CO bound were carried out in the gas phase using the L-BFGS algorithm in translation and rotation internal coordinates [256] to default thresholds of 4.5×10⁻⁴ hartree/bohr for the maximum gradient and 1×10⁻⁶ hartree for the change in energy between steps. Singlet spin state calculations were spin-restricted, and calculations in all other cases were spin-unrestricted. For unrestricted calculations, virtual and open-shell orbitals were level-shifted by 1.0 eV and 0.1 eV, respectively, to aid convergence to an unrestricted solution.
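The bookkeeping of metals, oxidation states, ligands and spin states described above can be sketched as a simple enumeration. This is an illustrative script, not part of molSimplify or TeraChem; all names are hypothetical, and the spin multiplicities are copied from the assignments in the text. Counting each metal, oxidation state and ligand combination separately gives the 42 MTPP-L entries that appear in Tables 6.4 and 6.5.

```python
from itertools import product

METALS = ("Mn", "Fe", "Co")
OXIDATION_STATES = (2, 3)
LIGANDS = ("CO", "NO2-", "NH3", "imidazole", "H2O", "SH-", "F-")

# Spin multiplicities (2S+1) as assigned in the text.
# Co(II) has no IS entry because only the quartet and doublet are considered.
SPIN_STATES = {
    ("Mn", 3): {"HS": 5, "IS": 3, "LS": 1},
    ("Fe", 2): {"HS": 5, "IS": 3, "LS": 1},
    ("Co", 3): {"HS": 5, "IS": 3, "LS": 1},
    ("Mn", 2): {"HS": 6, "IS": 4, "LS": 2},
    ("Fe", 3): {"HS": 6, "IS": 4, "LS": 2},
    ("Co", 2): {"HS": 4, "LS": 2},
}


def enumerate_calculations():
    """Return (metal, ox, ligand, co_bound, spin_label, multiplicity) tuples."""
    jobs = []
    for metal, ox, ligand in product(METALS, OXIDATION_STATES, LIGANDS):
        for co_bound in (False, True):  # 5-coordinate and CO-bound 6-coordinate
            for label, mult in SPIN_STATES[(metal, ox)].items():
                jobs.append((metal, ox, ligand, co_bound, label, mult))
    return jobs


jobs = enumerate_calculations()
n_complexes = len({(m, ox, lig) for m, ox, lig, *_ in jobs})
print(n_complexes, len(jobs))
```

Keeping the spin-state table as data makes the Co(II) exception explicit instead of hiding it in control flow.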

Partial charges and 3d and 4s subshell natural atomic orbital occupations (NAOs) were obtained from the TeraChem interface with the Natural Bond Orbital (NBO) v6.0 package [151].

6.2 Structures

Structural properties provide insight into the nature of bonding as metal and ligand are varied in MTPP-L structures. For instance, although the TPP macrocycle is expected to be rigid, strong electrostatic interactions with distal ligands are known to potentially produce significant displacement of the metal out of the TPP plane concomitant with elongated M-NTPP bonds. Out-of-plane bending is also favored electronically in high-spin complexes due to stabilization of high-energy dz2 and dx2–y2 orbitals and destabilization of low-energy dxz and dyz orbitals. Among the complexes studied in this work, this effect is most pronounced in quintet Co(III)TPP-F–, where the combined effect of high F– charge density and high-spin d6 configuration causes a 0.52 Å out-of-plane displacement of the Co atom and a resulting average M-NTPP bond length of 2.09 Å, which is 0.14 Å longer than cases with minimal distortion (e.g., singlet Co(III)TPP-imd).

The metal distal axial ligand (M-Lax) bond lengths in the 5-coordinate MTPP-L complexes exhibit even more pronounced variations with metal and ligand identity than the M-NTPP bond length variations. The primary factors in optimized M-Lax bond lengths are expected to be the occupancy and energy of the 3dz2 and 3dx2–y2 (eg) orbitals, which, when already filled in the four-coordinate structure, correspond to weaker and elongated M-Lax bonds. With the exception of Co(II), the MTPP-Ls show marked M-Lax bond length sensitivity to spin state, with shortened LS bond distances and elongated HS distances (Table 6.1). Bond length variation is reduced for Co(II)TPP-Ls due to the d7 electronic configuration, which ensures that at least one eg orbital is occupied in both spin states, whereas the bond length variation is preserved for all other metals and oxidation states due to higher eg occupation in the HS than the LS state. The difference in HS-LS M-Lax bond lengths increases with ligand field strength, with the largest variation occurring for Mn(II)TPP-CO from d(M-Lax) = 1.77 Å in the LS to 2.56 Å in the HS state. Shifts of similar magnitude are also observed for Mn(III)TPP-CO and both oxidation states of FeTPP-CO, in agreement with results obtained by Charkin et al. [257]. For the same ligand and spin state, complexes of isoelectronic metal centers (e.g., Mn(II)TPP-L and Fe(III)TPP-L) have similar M-Lax bond lengths within 0.1 Å. The major exceptions, such as HS CO complexes of Mn(II)/Fe(III) and Fe(II)/Co(III) that differ by 0.11 Å and 0.28 Å respectively, are due to differences in porphyrin ring deformation.

| Metal | CO | NO2– | NH3 | imd | OH2 | SH– | F– |
|---|---|---|---|---|---|---|---|
| Mn(III) | 1.80 / 2.57 | 1.85 / 2.15 | 2.01 / 2.26 | 1.93 / 2.19 | 1.98 / 2.23 | 2.14 / 2.45 | 1.73 / 1.82 |
| Mn(II) | 1.77 / 2.56 | 1.93 / 2.05 | 2.10 / 2.28 | 2.06 / 2.23 | 2.16 / 2.32 | 2.28 / 2.47 | 1.75 / 1.87 |
| Fe(III) | 1.74 / 2.47 | 1.90 / 2.24 | 2.05 / 2.21 | 1.91 / 2.08 | 2.26 / 2.20 | 2.22 / 2.41 | 1.73 / 1.76 |
| Fe(II) | 1.74 / 2.47 | 1.87 / 2.21 | 1.97 / 2.21 | 1.93 / 2.16 | 2.00 / 2.25 | 2.31 / 2.44 | 1.80 / 1.82 |
| Co(III) | 1.77 / 2.19 | 1.87 / 2.51 | 1.91 / 2.20 | 1.87 / 2.17 | 1.95 / 2.21 | 2.23 / 2.37 | 1.74 / 1.76 |
| Co(II) | 1.96 / 1.97 | 2.08 / 2.28 | 2.12 / 2.19 | 2.10 / 2.20 | 2.23 / 2.23 | 2.53 / 2.38 | 1.90 / 1.83 |

Table 6.1: Metal distal axial ligand bond lengths (in Å) for the optimized five-coordinate MTPP-L. Each entry lists the low-spin value followed by the high-spin value (LS / HS).
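The spin-state sensitivity discussed above can be checked directly from the Table 6.1 values. The sketch below transcribes the table into a dictionary (the variable and function names are hypothetical) and computes the HS minus LS shift for every metal and ligand pair.

```python
LIGANDS = ("CO", "NO2-", "NH3", "imd", "OH2", "SH-", "F-")

# Table 6.1 values in angstrom: (low-spin row, high-spin row) per metal ion.
BOND_LENGTHS = {
    "Mn(III)": ((1.80, 1.85, 2.01, 1.93, 1.98, 2.14, 1.73),
                (2.57, 2.15, 2.26, 2.19, 2.23, 2.45, 1.82)),
    "Mn(II)":  ((1.77, 1.93, 2.10, 2.06, 2.16, 2.28, 1.75),
                (2.56, 2.05, 2.28, 2.23, 2.32, 2.47, 1.87)),
    "Fe(III)": ((1.74, 1.90, 2.05, 1.91, 2.26, 2.22, 1.73),
                (2.47, 2.24, 2.21, 2.08, 2.20, 2.41, 1.76)),
    "Fe(II)":  ((1.74, 1.87, 1.97, 1.93, 2.00, 2.31, 1.80),
                (2.47, 2.21, 2.21, 2.16, 2.25, 2.44, 1.82)),
    "Co(III)": ((1.77, 1.87, 1.91, 1.87, 1.95, 2.23, 1.74),
                (2.19, 2.51, 2.20, 2.17, 2.21, 2.37, 1.76)),
    "Co(II)":  ((1.96, 2.08, 2.12, 2.10, 2.23, 2.53, 1.90),
                (1.97, 2.28, 2.19, 2.20, 2.23, 2.38, 1.83)),
}


def hs_ls_shifts(table):
    """Map (metal, ligand) to the HS minus LS M-Lax bond length in angstrom."""
    shifts = {}
    for metal, (ls_row, hs_row) in table.items():
        for ligand, ls, hs in zip(LIGANDS, ls_row, hs_row):
            shifts[(metal, ligand)] = round(hs - ls, 2)
    return shifts


shifts = hs_ls_shifts(BOND_LENGTHS)
largest = max(shifts, key=shifts.get)
print(largest, shifts[largest])
```

The largest shift found this way belongs to Mn(II) with the CO ligand, matching the 1.77 Å to 2.56 Å elongation quoted in the text.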

Invoking ligand field theory (LFT) arguments (see Chapter 1.3), we can approximate MTPP-L/CO bonding as a combination of σ-bonding between the CO 5σ orbital and the metal dz2 orbital and π back-bonding between the metal dxz and dyz orbitals and the CO 2π* orbital (Figure 6-2).

Figure 6-2: Molecular orbitals for Fe(II)TPP with strong metal 3dz2 (top), 3dxz (middle) and 3dyz (bottom) character as well as 3σg (top), 3π* (middle, bottom) CO molecular orbitals. The isosurfaces are colored with green indicating positive wavefunction phase and purple representing the negative wavefunction phase.

Due to the similar principles governing M-Lax and M-CO bonding, we can use the same framework to rationalize differences in bond lengths between the 5-coordinate MTPP-L and 6-coordinate MTPP-L/CO complexes, except that the eg (3dz2 and 3dx2–y2) orbitals are no longer degenerate in the 5-coordinate MTPP-L complexes. Instead, the 3dz2 orbital is lower in energy and thus more sensitive to axial ligand effects. Coordination of the second axial ligand further increases its occupancy in most LS complexes through σ bonding between the ligand σ orbital and the metal 3dz2 orbital, resulting in destabilization and concomitant lengthening of both axial M-L bonds. This effect applies for binding of CO to MTPP-L (Table 6.2) or of L to MTPP-CO (Table 6.3). In contrast, for LS complexes of Co(II) and HS complexes where the high-energy orbital occupation is already substantial, this effect is smaller, and other competing ligand-dependent effects become more relevant. For instance, high-energy lone pairs of π-donor ligands (e.g., SH–, F–) can stabilize 6-coordinate octahedral configurations over 5-coordinate configurations by interacting with partially unfilled low-lying t2g orbitals. This effect is sufficient to shorten the M-L/M-CO bonds in the majority of HS complexes.

6.3 Binding energies

| Metal | CO | NO2– | NH3 | imd | OH2 | SH– | F– |
|---|---|---|---|---|---|---|---|
| Mn(III) | ↑1.88 / ↓1.95 | ↑1.90 / ↓1.94 | ↑2.06 / ↑2.27 | ↑2.03 / ↑2.21 | ↑2.06 / ↑2.25 | ↑2.21 / ↓2.45 | ↑1.73 / ↑1.83 |
| Mn(II) | ↑1.88 / ↑2.66 | ↑2.04 / ↑2.17 | ↓2.10 / ↑2.28 | ↑2.08 / ↓1.99 | ↓2.11 / ↑2.32 | ↑2.45 / ↓2.47 | ↑1.76 / ↑1.87 |
| Fe(III) | ↑1.86 / ↓2.43 | ↑1.91 / ↓2.08 | ↓2.03 / ↑2.24 | ↑2.01 / ↑2.10 | ↓2.05 / ↑2.22 | ↑2.22 / ↓2.23 | ↑1.73 / ↓1.73 |
| Fe(II) | ↑1.85 / ↓2.46 | ↑2.00 / ↓1.92 | ↑2.07 / ↓2.23 | ↑2.02 / ↑2.20 | ↑2.08 / ↑2.32 | ↑2.39 / ↓2.23 | ↑1.82 / ↓1.74 |
| Co(III) | ↑1.87 / ↑2.38 | ↑1.90 / ↓1.91 | ↑1.95 / ↓2.19 | ↑1.93 / ↑2.21 | ↑2.00 / ↑2.25 | ↑2.26 / ↓2.36 | ↑1.76 / ↓1.76 |
| Co(II) | ↑2.43 / ↑2.35 | ↓1.91 / ↓1.92 | ↑2.20 / ↑2.20 | ↑2.20 / ↑2.22 | ↑2.42 / ↑2.46 | ↓2.28 / ↓2.28 | ↓1.76 / ↓1.84 |

Table 6.2: Metal distal axial ligand bond lengths (in Å) for the optimized 6-coordinate MTPP-L/CO. Each entry lists the low-spin value followed by the high-spin value (LS / HS). An upward pointing arrow indicates elongated M-Lax bonds compared to the 5-coordinate MTPP-L complex, whereas a downward pointing arrow indicates shortened bonds.

| Metal | CO | NO2– | NH3 | imd | OH2 | SH– | F– |
|---|---|---|---|---|---|---|---|
| Mn(III) | ↑1.89 / ↓1.95 | ↑1.88 / ↓2.07 | ↑2.73 / ↑1.87 | ↑1.88 / ↑2.75 | ↑2.69 / ↑1.84 | ↑1.90 / ↑3.06 | ↑2.06 / ↑3.06 |
| Mn(II) | ↑1.88 / ↑2.66 | ↑1.83 / ↑3.31 | ↑1.81 / ↑2.94 | ↑1.81 / ↓1.94 | ↑1.80 / ↑2.89 | ↑1.81 / ↑3.37 | ↑1.97 / ↑3.35 |
| Fe(III) | ↑1.86 / ↓2.43 | ↑2.09 / ↑2.97 | ↑1.81 / ↑2.48 | ↑1.81 / ↑2.63 | ↑1.78 / ↓2.45 | ↑2.00 / ↓1.95 | ↑1.99 / ↓1.97 |
| Fe(II) | ↑1.85 / ↓2.46 | ↑1.81 / ↓1.99 | ↑1.80 / ↑2.59 | ↑1.79 / ↑2.58 | ↑1.76 / ↑2.47 | ↑1.79 / ↓1.92 | ↑1.80 / ↓1.95 |
| Co(III) | ↑1.87 / ↑2.38 | ↑1.99 / ↓1.95 | ↑1.85 / ↑2.46 | ↑1.86 / ↑2.40 | ↑1.80 / ↑2.38 | ↑1.96 / ↑3.22 | ↑1.90 / ↑3.33 |
| Co(II) | ↑2.43 / ↑2.35 | ↓1.96 / ↓1.92 | ↑2.50 / ↑2.38 | ↑2.51 / ↑2.37 | ↑2.10 / ↑2.06 | ↓1.92 / ↓1.89 | ↓1.88 / ↑3.16 |

Table 6.3: Metal-proximal axial CO bond lengths (in Å) for the optimized six-coordinate MTPP-L/CO. Each entry lists the low-spin value followed by the high-spin value (LS / HS). An upward pointing arrow indicates elongated M-CO bonds compared to the 6-coordinate MTPP-CO complex, whereas a downward pointing arrow indicates shortened bonds.

As structural properties are well explained by variations in MTPP-L metal 3dz2 occupation, we would also expect the MTPP-L/CO binding strength to be explained by the same descriptor. Hence, we first define the CO dissociation energy (De) from MTPP-L as the difference between the summed electronic energies of the optimized, separated MTPP-L (E(MTPP-L)) and CO (E(CO)) and the electronic energy of the optimized MTPP-L/CO complex (E(MTPP-L/CO)):

$$D_e = E(\mathrm{MTPP\text{-}L}) + E(\mathrm{CO}) - E(\mathrm{MTPP\text{-}L/CO}). \tag{6.1}$$

A positive value of De corresponds to exothermic adsorption of CO. While we have taken into account dispersion contributions to binding via the empirical DFT-D3 correction, the effects of basis set superposition error as well as zero point vibrational energy and entropic contributions to dissociation have been neglected as they are expected to be comparable across all metals and distal ligands considered.
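Equation 6.1 can be sketched as a small helper. The function name and the example energies are hypothetical, and the sketch assumes the three electronic energies are available in hartree, converting the result to kcal/mol with an approximate factor.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509  # approximate conversion factor


def dissociation_energy(e_mtpp_l, e_co, e_mtpp_l_co):
    """Eq. 6.1: De in kcal/mol from electronic energies given in hartree."""
    return (e_mtpp_l + e_co - e_mtpp_l_co) * HARTREE_TO_KCAL_PER_MOL


# Placeholder energies in hartree, chosen only to illustrate the sign convention.
de = dissociation_energy(e_mtpp_l=-2500.000, e_co=-113.300, e_mtpp_l_co=-2613.350)
co_binds = de > 0  # positive De means the bound complex is lower in energy
print(round(de, 2), co_binds)
```

With this convention, a complex that is higher in energy than its separated fragments gives a negative De.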

Physically, De can be broken down into a positive electronic contribution from CO binding and a negative geometric contribution from relaxation of the MTPP framework from its equilibrium 5-coordinate geometry to its equilibrium 6-coordinate geometry. The relative importance of these two contributions can be assessed by computing the relaxation energy, Erelax:

$$E_{\mathrm{relax}} = \left(E(\mathrm{CO}) - E^{\mathrm{rigid}}(\mathrm{CO})\right) + \left(E(\mathrm{MTPP\text{-}L}) - E^{\mathrm{rigid}}(\mathrm{MTPP\text{-}L})\right), \tag{6.2}$$

where the rigid energies are computed by removing the proximal axial CO from the 6-coordinate species and calculating the energies of the two resulting fragments. The energy penalty associated with this geometric constraint can be large enough to energetically disfavor CO binding in some cases, particularly those where the displacement is large and binding is intrinsically weak (e.g., HS complexes of negatively charged distal axial ligands). This is confirmed by calculation of the relaxation energies (Table 6.4), which are all negative and, in HS MTPP-F–/CO, LS and HS Mn(II)TPP-imd/CO and LS and HS Mn(II)TPP-OH2/CO, large enough to not favor binding of CO (Table 6.5). Experimentally, it has been observed [258] that the latter complexes do not bind CO due to the strong σ-donor character of the ligands, and therefore we omit them from our subsequent analysis. Removal of these points yields a moderate distance-energy correlation (Figure 6-3), which suggests that the factors affecting M-CO bond length can partially but not fully explain variations in De.

| Metal | CO | NO2– | NH3 | imd | OH2 | SH– | F– |
|---|---|---|---|---|---|---|---|
| Mn(III) | -5 / -20 | -11 / -45 | -8 / 0 | -10 / 0 | -7 / 0 | -10 / 0 | -5 / -65 |
| Mn(II) | -8 / -1 | -7 / -7 | -3 / -45 | -7 / -21 | -25 / -18 | -10 / 0 | -31 / -27 |
| Fe(III) | -13 / -45 | -2 / -1 | -1 / -2 | -7 / -2 | -44 / -13 | -2 / -9 | -3 / -48 |
| Fe(II) | -6 / -2 | -6 / -1 | -5 / -1 | -4 / -2 | -4 / -1 | -4 / -10 | -4 / -29 |
| Co(III) | -6 / -5 | -3 / -9 | -4 / -4 | -4 / -1 | -6 / -1 | -2 / 0 | -3 / 0 |
| Co(II) | -26 / -2 | -10 / -12 | -9 / -46 | -29 / -46 | -2 / -1 | -25 / -24 | -23 / 0 |

Table 6.4: Relaxation energies of 42 MTPP-L in low-spin and high-spin states calculated with B3LYP/LACVP*. Each entry lists the low-spin value followed by the high-spin value (LS / HS).

We previously noted (Section 6.2) that for a given MTPP-L combination, the M-CO and M-L bond lengths increase as spin multiplicity increases, and the De (Table 6.5) also decreases as spin multiplicity increases. The strongest binding complexes are the LS singlet Fe(II)TPP-OH2/CO and doublet Mn(II)TPP-CO/CO with Des = 38 kcal/mol. High-spin complexes have the weakest M-CO bonds (De ca. 8-13 kcal/mol) for all complexes, with the only exception being the doublet Fe(III)TPP-OH2/CO with a De of 22 kcal/mol. For the stronger binding LS complexes, we observe that intermediate-field σ-donor ligands polarize the metal dz2 orbital toward the open axial position, strengthening the M-CO bond by promoting effective overlap with the CO 3σ orbital, consistent with previous observations by Rovira et al. [247] for the imd ligand. Within our dataset, we observe Des greater than 25 kcal/mol for all LS Fe(II/III)TPP-L/CO complexes with L=NH3, imd, OH2 and greater than 22 kcal/mol for all LS Co(III)TPP-L/CO complexes with the same ligands (Figure 6-4). Smaller polarization in the strong-

Figure 6-3: Dissociation energy of the MTPP-L/CO bond in the 6-coordinate complexes (in kcal/mol) versus bond length (in Å). Different colors represent the 3 metal centers of the MTPPs with gray corresponding to Mn, red to Fe and blue to Co. Dark colors represent the +2 and light colors the +3 oxidation state. Different symbols correspond to the 7 distal axial ligands in the MTPP-L/CO complexes with hollow shapes indicating low-spin states, semi-filled intermediate-spin states and filled high-spin states. A nonlinear fitting line is shown with its associated R2 value.

(CO, NO2–) and weak-field (SH–, F–) ligands (Figure 6-4) results in smaller Des, especially in the HS states, with a maximum De of 13 kcal/mol observed for the sextet Fe(III)TPP-CO/CO.

| Metal | CO | NO2– | NH3 | imd | OH2 | SH– | F– |
|---|---|---|---|---|---|---|---|
| Mn(III) | 26 / - | 14 / - | 25 / 9 | 22 / 9 | 28 / 10 | 12 / 8 | 14 / 8 |
| Mn(II) | 38 / 7 | 29 / 1 | - / 8 | - / - | - / - | 25 / 8 | - / - |
| Fe(III) | 27 / 13 | 11 / 9 | 33 / 11 | 25 / 7 | 30 / 22 | 11 / - | 16 / - |
| Fe(II) | 22 / - | 28 / - | 34 / - | 33 / - | 38 / - | 30 / - | 34 / - |
| Co(III) | 16 / 11 | 13 / - | 22 / 13 | 22 / 9 | 28 / 11 | 12 / 8 | 19 / 8 |
| Co(II) | 16 / 8 | - / - | 19 / 10 | 20 / 10 | 11 / 11 | - / - | 9 / 10 |

Table 6.5: Dissociation energies of CO (in kcal/mol) on 42 MTPP-L porphyrins in low-spin and high-spin states calculated with B3LYP/LACVP*. Each entry lists the low-spin value followed by the high-spin value (LS / HS). Dissociation energies of intermediate-spin states fall between the HS and LS values. Ground states of the corresponding 5-coordinate MTPP-L complexes are indicated in bold and negative dissociation energies with a dash.

Figure 6-4: Polarized unoccupied molecular orbitals for low-spin Fe(II)TPP-imd (right) and high-spin Co(III)TPP-F– (left). The higher polarization due to the imidazole ligand on the iron complex can facilitate larger overlap with the CO molecular orbitals, resulting in a stronger bond. The low polarization in Co(III)TPP-F– due to the fluoride ligand will result in a weak MO interaction.

Because the largest contribution to the M-CO bond comes from the σ-type interaction between the σg MO of CO and the dz2 metal character MO of the porphyrin, we expect minimizing dz2 occupation in the MTPP-L electron configuration to increase the MTPP-L/CO dissociation energy. Hence, to study the effect of the 3dz2 AO occupation on the interaction between the metal 3dz2 AO and the σg MO of CO, we computed the occupation of the metal dz2 NAO in the 5-coordinate MTPP-L complexes (Figure 6-5) and correlated the De with the 3dz2 NAO occupation. A moderate fit (R2=0.67) was obtained, with complexes of Co and Mn showing slightly weaker fits due to the smaller n(z2) and De ranges spanned. The occupation of the 3dz2 AO does not vary significantly for Co complexes, with a minimum occupation of 0.9 e– corresponding to LS Co(III)TPP-OH2 and a De of 28 kcal/mol and a maximum of 1.5 e– corresponding to HS Co(III)TPP-F– and a De of 8 kcal/mol. Similarly, 3dz2 AO occupation for Mn complexes shows relatively small variation, with a minimum of 0.7 e– corresponding to LS Mn(III)TPP-CO2 and a De of 38 kcal/mol and a maximum of 1.35 e– corresponding to HS Mn(III)TPP-NO2– and a De of 1

Figure 6-5: Bond dissociation energies of the MTPP-L/CO bond (in kcal/mol) versus NBO occupation of the 3dz2 orbital in the 5-coordinate MTPP-L complex. Different colors represent the 3 metal centers of the MTPPs with gray corresponding to Mn, red to Fe and blue to Co. Dark colors represent the +2 and light colors the +3 oxidation state. Different symbols correspond to the 7 distal axial ligands in the MTPP-L/CO complexes with hollow shapes indicating low-spin states, semi-filled intermediate-spin states and filled high-spin states. A linear fitting line is shown with its associated R2 value and 2 complexes falling on the extremes of the trendline labeled with text in the same color as the corresponding shapes.

kcal/mol. Stronger correlations are observed in Fe complexes, where the presence of the distal axial ligand has a strong effect on the distribution of electrons among the 3d AOs of the metal. The occupation of the 3dz2 AO varies by as much as 0.8 e– between LS Fe(II)TPP-NH3 (0.5 e– and De of 34 kcal/mol) and HS Fe(II)TPP-imd (1.3 e– and a De of 7 kcal/mol).
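The kind of occupation-energy fit described in this section can be sketched with an ordinary least-squares line. The six points below are only the extreme (occupation, De) pairs quoted in the text for Co, Mn and Fe, not the full data set, so the resulting slope and R2 are illustrative and are not expected to reproduce the reported R2 of 0.67. The helper name is hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# (3dz2 occupation in e-, De in kcal/mol) pairs quoted in the text as extremes.
points = [(0.9, 28), (1.5, 8), (0.7, 38), (1.35, 1), (0.5, 34), (1.3, 7)]
occupations = [p[0] for p in points]
energies = [p[1] for p in points]

slope, intercept, r_squared = linear_fit(occupations, energies)
print(slope, intercept, r_squared)
```

A negative slope is the expected outcome here, since higher 3dz2 occupation in the 5-coordinate complex is associated with weaker CO binding.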

6.4 Charge measures

The metal center partial charge decreased in all MTPP-L/CO complexes upon binding, and the degree of charge accumulation on the metal correlates with the M-CO bond De (Figure 6-6). The resulting linear-scaling relation from the least-squares fit is

De ≈ –31.2ΔqM + 10.9, (6.3)

where ΔqM is the NBO partial charge difference of the metal center between the 6-coordinate MTPP-L/CO complexes and the corresponding MTPP-L complexes. This relation suggests that large charge transfer in the metal corresponds to stronger binding.
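As a rough illustration of how Eq. 6.3 can be applied, the sketch below evaluates the fitted line for a given metal charge difference. The function name `estimate_de` is hypothetical and introduced only for this example; the two coefficients are taken from Eq. 6.3. Because the fit in Figure 6-6 has a modest R2, the output is better read as a trend estimate than as a prediction for any single complex.

```python
def estimate_de(delta_q_metal):
    """Estimate D_e (kcal/mol) from the linear relation of Eq. 6.3.

    delta_q_metal: NBO partial charge of the metal in MTPP-L/CO minus that
    in MTPP-L (in e). Negative values mean the metal partial charge drops,
    i.e. electron density accumulates on the metal upon CO binding.
    """
    slope = -31.2      # kcal/mol per e, coefficient from Eq. 6.3
    intercept = 10.9   # kcal/mol, constant from Eq. 6.3
    return slope * delta_q_metal + intercept


if __name__ == "__main__":
    # Illustrative charge differences spanning the range plotted in Figure 6-6.
    for dq in (-0.01, -0.50, -0.65):
        print(f"dq = {dq:+.2f} e -> D_e ~ {estimate_de(dq):.1f} kcal/mol")
```

For charge differences between -0.50 and -0.65 e the line gives roughly 26 to 31 kcal/mol, which sits in the strong-binding regime discussed in this section.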

[Figure 6-6 plot: De (kcal/mol) versus Δq(M) (e⁻), x-axis from 0.0 to -0.6; linear fit with R2 = 0.59; labeled extremes Mn(II)-CO and Fe(III)-imd.]

Figure 6-6: Bond dissociation energies of the MTPP-L/CO bond (in kcal/mol) versus NBO partial charge difference of the metal center between MTPP-L/CO and MTPP-L. Different colors represent the 3 metal centers of the MTPPs with gray corresponding to Mn, red to Fe and blue to Co. Dark colors represent the +2 and light colors the +3 oxidation state. Different symbols correspond to the 7 distal axial ligands in the MTPP-L/CO complexes with hollow shapes indicating low-spin states, semi-filled intermediate-spin states and filled high-spin states. A linear regression fit and associated R2 value are also shown on the plot with 2 complexes that fall on the extremes of the trendline labeled with text in the same color as the corresponding symbols.

The smallest charge transfer is observed for the HS Fe(III)TPP-imd/CO complex, with a value of 0.01 e⁻ that corresponds to weak binding with De = 8 kcal/mol. The complex contains an intermediate-field ligand that does not promote charge transfer for most HS and IS MTPP-L/CO complexes, with complexes of NH3 and OH2 showing similar trends. Maximal electron density accumulation is observed for LS Mn(II)TPP-CO and for Mn(III) and Fe(II/III) complexes of OH2 and imd. In the first case this is due to the π-backbonding interaction that increases the electron density around the metal center, and in the latter to the high metal partial charge of the MTPP-L complex, which decreases as CO binds. A ΔqM magnitude in the range of 0.5-0.65 e⁻ is observed for these complexes, corresponding to strong binding with De over 20 kcal/mol in all cases.

6.5 Sensitivity analysis

Leveraging earlier observations (see Chapter 2), we varied HF exchange from the default B3LYP value of 20% to lower (15%, αHF = 0.15) and higher (25%, αHF = 0.25) values in order to determine how energetic and electronic properties vary across this range of HF exchange. Based on previous trends observed for organometallic complexes [94, 119], the energies of the complexes both with and without the proximal CO were fit linearly against HF exchange with good accuracy. Using the linear regression results, we then calculated the dependence of De on HF exchange by approximating the partial derivative of De with respect to HF exchange (αHF) as

S_{D_e} = \frac{\Delta D_e}{\Delta \alpha_{\mathrm{HF}}} \approx \frac{\partial D_e}{\partial \alpha_{\mathrm{HF}}}. \qquad (6.4)
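The finite-difference estimate in Eq. 6.4 can be sketched as a least-squares slope over the three sampled values of αHF. The example below is a minimal sketch under two assumptions that are not stated in the text: De is taken as E(MTPP-L) + E(CO) - E(MTPP-L/CO), and all energies are already in kcal/mol. The helper names and the energy values are hypothetical and chosen only so the arithmetic is easy to follow. Because a least-squares fit is linear in the fitted values, fitting De directly gives the same slope as differencing the slopes of the individual energy fits.

```python
def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def de_sensitivity(alphas, e_complex, e_complex_co, e_co):
    """Slope of D_e versus HF exchange, in kcal/(mol*HFX).

    Assumes D_e = E(MTPP-L) + E(CO) - E(MTPP-L/CO) at each alpha,
    with every energy given in kcal/mol.
    """
    de = [a + c - b for a, b, c in zip(e_complex, e_complex_co, e_co)]
    return linear_slope(alphas, de)


# Made-up energies (kcal/mol), for illustration only.
alphas = [0.15, 0.20, 0.25]
e_complex = [-1000.0, -1010.0, -1020.0]      # 5-coordinate MTPP-L
e_complex_co = [-1130.0, -1139.0, -1148.0]   # 6-coordinate MTPP-L/CO
e_co = [-100.0, -101.5, -103.0]              # free CO

s_de = de_sensitivity(alphas, e_complex, e_complex_co, e_co)
print(f"S_De = {s_de:.1f} kcal/(mol*HFX)")
```

With these placeholder numbers De falls from 30 to 25 kcal/mol as αHF goes from 0.15 to 0.25, so the slope is -50 kcal/(mol·HFX).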

We use the unit notation "HFX" to indicate the change from αHF = 0 to 1. Focusing on the complexes in their corresponding MTPP-L ground state, the calculations indicate that an increase of αHF results in decreasing De (Figure 6-7), in agreement with previous studies that report underbinding for hybrid functionals [259, 260]. The sensitivity of De to HF exchange, SDe, depends on the metal, ligand and spin-state of the complex, with the spin-state having the strongest effect. High-spin ground states, which correspond to Mn and Fe complexes, are the most sensitive to HF exchange, with De decreasing by as much as 130 kcal/(mol·HFX) for the quintet Fe(II)TPP-F⁻/CO and by 120 kcal/(mol·HFX) for the sextet Mn(II)TPP-imd/CO. Lower sensitivity is observed for intermediate-spin ground states, with SDe values of -65 and -55 kcal/(mol·HFX) calculated for the triplet Fe(II)TPP-SH⁻/CO and quartet Fe(III)TPP-SH⁻/CO, respectively. In contrast to HS complexes, the De sensitivity to HF exchange for the LS Co(III) complexes shows less variation, with an SDe magnitude below 25 kcal/(mol·HFX) in all cases. The wide range of sensitivities among MTPP-L/CO complexes with different spin-states suggests that the ground spin-state dominates the effect of HF exchange on De, with high-spin ground states being the most sensitive and low-spin ground states the least sensitive to HF exchange.

[Figure 6-7 plot: box plot of ΔDe/ΔαHF (kcal/mol/HFX), axis from 0 to -140, for Mn3+, Mn2+, Fe3+, Fe2+, Co3+ and Co2+.]

Figure 6-7: Box plot of the sensitivity of MTPP-L/CO bond dissociation energy (ΔDe/ΔαHF) to Hartree-Fock exchange versus metal electron configuration. Results are shown for the ground state of the corresponding 5-coordinate MTPP-L complexes calculated with B3LYP.

Previous work has shown that the spin-state ordering for octahedral coordination complexes depends linearly on the amount of HF exchange included in the functional [94, 119]. The adiabatic electronic energy gap between high-spin and low-spin states is defined as:

\Delta E^{\mathrm{HS-LS}} = E_{\mathrm{HS}}(R_{\mathrm{HS}}) - E_{\mathrm{LS}}(R_{\mathrm{LS}}), \qquad (6.5)

where EHS(RHS) is the electronic energy of the high-spin state at its geometry-optimized coordinates and ELS(RLS) is the equivalent for the low-spin state. The partial derivative of the adiabatic electronic energy gap with respect to HF exchange is approximated as

S_{\Delta E(\mathrm{HS-LS})} = \frac{\Delta \Delta E^{\mathrm{HS-LS}}}{\Delta \alpha_{\mathrm{HF}}} \approx \frac{\partial \Delta E^{\mathrm{HS-LS}}}{\partial \alpha_{\mathrm{HF}}}. \qquad (6.6)

In order to assess the effect of HF exchange on both De and ΔEHS–LS, we define a corresponding measure of relative sensitivity (RSE) for De and ΔEHS–LS as

RS_E = \frac{S_E - S^{\min}_{\Delta E(\mathrm{HS-LS})}}{S^{\max}_{\Delta E(\mathrm{HS-LS})} - S^{\min}_{\Delta E(\mathrm{HS-LS})}}, \quad E \in \{D_e, \Delta E^{\mathrm{HS-LS}}\}, \qquad (6.7)

with a value of 0 corresponding to the least sensitive and 1 to the most sensitive complex from the dataset. The calculated relative sensitivities (Figure 6-8) indicate that SΔE(HS–LS) dominates SDe in 80% of the cases, with the exceptions being mainly low-spin complexes of strong-field (CO, NO2⁻) and weak-field (SH⁻, F⁻) ligands.
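The rescaling in Eq. 6.7 can be sketched in a few lines. The function name and the two bounds below are assumptions made for illustration: the text does not report the least and most sensitive spin-splitting values, and 0 and -140 kcal/(mol·HFX) are used here only because they reproduce the sensitivity and relative-sensitivity pairs quoted later in this section for complexes (2) and (5).

```python
def relative_sensitivity(s, s_least, s_most):
    """Eq. 6.7: rescale a sensitivity onto the [0, 1] range spanned by the
    least and most sensitive spin-state splittings in the data set."""
    return (s - s_least) / (s_most - s_least)


# Assumed bounds in kcal/(mol*HFX); these are NOT reported in the text.
S_LEAST = 0.0
S_MOST = -140.0

# S_De values quoted in the text for complexes (2) and (5).
for label, s in (("S_De, complex (2)", -49.0), ("S_De, complex (5)", -7.0)):
    print(f"{label}: RS = {relative_sensitivity(s, S_LEAST, S_MOST):.2f}")
```

Using the spin-splitting bounds as the common denominator for both quantities is what allows RS(De) and RS(ΔEHS–LS) to be compared directly on the same axes.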

[Figure 6-8 plot: RS(De) (y-axis, 0.00 to 1.00) versus RS(ΔEHS-LS) (x-axis, 0.00 to 1.00) with a y = x reference line; strongest-binding complexes indexed (1) to (10).]

Figure 6-8: Plot of the relative sensitivity of the results for spin-state splittings (x-axis) versus the relative sensitivity for the De (y-axis). A value of 0 corresponds to the least sensitive complex, whereas a value of 1 corresponds to the most sensitive. Larger symbols represent the 10 complexes that bind the strongest, with indices labeled with text in the same color as the corresponding symbols.

Restricting our focus to the 10 complexes with the highest De values (labeled with a corresponding index in Figure 6-8), we can identify 6 low-spin complexes of Fe, 3 of Mn and 1 of Co with De values in the range of 29 to 38 kcal/mol. Of the 10 complexes, only complexes (1), (2) and (5) correspond to the 5-coordinate MTPP-L ground state. Among these, complex (2) shows the highest De sensitivity to HF exchange, followed by complexes (1) and (5) with significantly smaller values. The RSDe for the 3 structures ranges from 0.35 for (2) to 0.05 for (5), corresponding to SDe values of -49 and -7 kcal/(mol·HFX), respectively. The RSΔE(HS–LS) values for all three complexes are in the range from 0.40 to 0.45, corresponding to SΔE(HS–LS) values of -56 and -63 kcal/(mol·HFX), respectively. However, complexes (2) and (5), which correspond to Fe(II)TPP-NH3 and Fe(II)TPP-imd, have a ΔEHS–LS of 10 and 2 kcal/mol, respectively, suggesting that small variations in the amount of HF exchange will result in qualitatively different spin-state predictions. In contrast, complex (1), which corresponds to Co(II)TPP-OH2, is a very stable LS complex (ΔEHS–LS = 43 kcal/mol), and therefore we can be more confident that both the spin-state ordering and the calculated dissociation energy values will not be significantly affected by the amount of HF exchange included in the hybrid functional.
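The argument about spin-state robustness can be made concrete with a back-of-the-envelope estimate of how far αHF would need to move before the HS–LS gap reaches zero. The sketch below assumes that the gap stays linear in αHF beyond the sampled 0.15 to 0.25 window and that a positive gap means a low-spin ground state; the function name is hypothetical. Since the text gives the splitting sensitivities only as a range for the three complexes, both ends of that range are used.

```python
def alpha_shift_to_crossover(gap, sensitivity):
    """Change in alpha_HF at which a linearly varying HS-LS gap reaches zero.

    gap: Delta E(HS-LS) at the reference functional, in kcal/mol.
    sensitivity: S_DeltaE(HS-LS), in kcal/(mol*HFX).
    """
    return -gap / sensitivity


# HS-LS gaps (kcal/mol) quoted in the text for complexes (5), (2) and (1).
gaps = {
    "(5) Fe(II)TPP-imd": 2.0,
    "(2) Fe(II)TPP-NH3": 10.0,
    "(1) Co(II)TPP-OH2": 43.0,
}

# The text reports sensitivities between -56 and -63 kcal/(mol*HFX) without
# assigning them to individual complexes, so both bounds are evaluated.
for name, gap in gaps.items():
    smallest = alpha_shift_to_crossover(gap, -63.0)
    largest = alpha_shift_to_crossover(gap, -56.0)
    print(f"{name}: crossover after +{smallest:.2f} to +{largest:.2f} in alpha_HF")
```

Under these assumptions complex (5) would change ground state after an increase of only about 0.03 to 0.04 in αHF, while complex (1) would require a shift of roughly 0.7 or more, far outside the range of commonly used hybrid functionals.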

6.6 Conclusions

In this test case, we used molSimplify's structure generation module along with the input file and jobscript generation utilities to study the effect of different axial ligands on the binding strength of carbon monoxide on metalloporphyrins. We reported a wide range of bond dissociation energies that depend on the spin-states, metals and axial ligands and identified that the ground spin-state dominates the binding strength. We observed that the distal axial ligand effect on the metal dz2 orbital can shift the dissociation energy and correlated the binding strength with the occupation of the dz2-type molecular orbital in the 5-coordinate complexes. A correlation between the binding strength and the charge transfer on the metal was identified, suggesting that high charge transfer on the metal center results in stronger bonds between the metal and the axial CO. Finally, we performed sensitivity analysis by calculating the dependence of both the HS-LS splitting and the binding strength on HF exchange. Restricting our focus to the complexes that bind the strongest, we were able to identify only a single complex, the low-spin Co(II)TPP-OH2, for which the uncertainties in both the binding strength and the spin-state ordering were minimized.

Chapter 7

Concluding remarks

We have quantified ranges of property prediction for transition metal complexes based on exchange-correlation functional choice over a range of the most commonly used functional classes (i.e. LDAs, GGAs, GGA hybrids and meta-GGAs). We observed qualitative agreement amongst functionals within the same class when comparing spin-state ordering across representative Fe(II/III) octahedral coordination complexes. We then observed that increasing HF exchange in a modified hybrid functional results in strong high-spin stabilization over low-spin complexes, on the order of 1-2 kcal/mol per 1% HF exchange. High values of HF exchange were able to change qualitative predictions for spin-state ordering, even in complexes with strong-field ligands, such as Fe(CO)6, that are normally stable low-spin complexes. The effect of HF exchange on spin-state energetics was linear in all cases, with the strength of the variation (slope) matching for ligands with the same element connected to the central metal. This observation suggests that the effect of HF exchange depends mainly on the directly connected element of the ligand and not on the outer ligand environment, thus allowing extrapolation of these results to a wider range of complexes.

We then quantified the extent to which incorporation of HF exchange changes the underlying charge distribution. With increasing HF exchange we observed that the metal partial charge increased, corresponding to delocalization of 3d electrons from the metal center towards the ligands. This observation is in contrast with the common view of how HF exchange corrects SIE, where it is usually assumed that electrons are localizing on the metal. Further correlations were identified between sensitivity of HS-LS splitting on HF exchange and HS-LS partial charge differences and their corresponding first derivatives. The results suggest that when charges are similar amongst HS and LS states, the spin-state splitting dependence on HF exchange is diminished.

Based on preliminary results that showed improved predictions for spin-state ordering of transition-metal complexes by meta-GGA functionals, we further investigated the effect of meta-GGA exchange on the energetics of a larger set of structures. We modified the common meta-GGA functional TPSS and studied the dependence of the HS-LS splitting energy across the 0-100% meta-GGA exchange range for metal ions and complexes of the first-row transition metals Ti-Cu. We observed very low sensitivity of early- and late-row transition metal ions and octahedral complexes, whereas energetics of mid-row transition metal ions and complexes showed higher dependence on meta-GGA exchange. In the case of octahedral complexes, coordination with weak-field ligands resulted in meta-GGA exchange favoring LS states, with the opposite trend observed for complexes with strong-field coordinated ligands. Results for the mid-row transition metal complexes showed that meta-GGA exchange reduces the absolute HS-LS energy splitting in all cases, thus favoring degeneracy among spin-states.

Further analysis was performed to unveil correlations between meta-GGA exchange and the underlying charge density properties on the octahedral complexes. Higher metal partial charge is always observed in the HS states compared to LS states, with increasing meta-GGA exchange further delocalizing electron density from the metal center towards the ligands. The magnitude of the charge difference depends on the ligand-field character, with strong-field ligands showing higher relative delocalization between HS and LS states, whereas this difference is smaller in the case of weak-field ligands. Similar correlations were observed by performing QTAIM analysis and calculating the electron density difference at the bond critical points, with the results indicating that in complexes with strong-field ligands electron density from the metal-ligand bond is transferred towards the ligands at a higher rate for HS complexes as we increase meta-GGA exchange.

The composite effect of HF exchange and meta-GGA exchange in calculating the spin-state splitting of octahedral coordination complexes was studied, with opposite behavior observed for strong- and weak-field ligands. In the case of the strong-field ligand complex Fe(II)(CO)6, meta-GGA exchange acts in synergy with HF exchange, indicating that meta-GGA exchange can be incorporated in this case to improve predictions of spin-state ordering. The opposite behavior is observed for weak-field ligands such as NH3, where inclusion of meta-GGA exchange effectively reduces the efficiency of HF exchange tuning even further.

We then presented the molSimplify toolkit that was developed to automate the discovery and generation of inorganic complex structures. The program was designed to efficiently generate a wide range of inorganic complexes, to prepare input files and job scripts for electronic structure calculations and to analyze the results of these calculations in order to unveil structure-property trends. The software is distributed in either a command-line version or with a user-friendly graphical user interface that makes it approachable to the wider scientific community. The program offers additional features including chemical database searching, allowing screening for particular molecules, as well as randomized generation for chemical discovery. The structure generation module allows modification of existing structures by selectively altering molecular fragments and random or guided building of supramolecular complex structures for studying intermolecular interactions.

We used over 190 random and common inorganic molecules to benchmark molSimplify, showing that the database of trained metal-ligand bond distances that the program utilizes, combined with selective force field optimization, is able to reduce energy gradients calculated with density functional theory by approximately 20% compared to unoptimized placement. Additionally, we confirmed that complexes generated with molSimplify using this strategy reduce the energy gradients by over 40% when compared to UFF-optimized complexes. Overall, molSimplify provides a flexible but robust approach for generating good starting structures for electronic structure characterization in high-throughput screening efforts that will substantially reduce computational time and improve convergence. Extended documentation is available for molSimplify, with the current version of the code and multiple examples available online. We expect molSimplify to have wide applicability in fields such as materials science and catalysis. Ongoing effort is geared toward extending molSimplify for the generation and study of hybrid molecular-crystalline and other periodic systems.

We then used molSimplify to study the binding of small ions on organometallic complexes and in particular ferrocenium. We used the molSimplify supramolecular builder to generate random conformations of formate-ferrocenium pairs and identified multiple modes of binding. We then used the modification module of molSimplify to functionalize the ferrocenium core with over 40 functional groups and identified that key hydrogen bonding interactions enhance selectivity of ferrocenium toward carboxylates over perchlorate. Reasonable correlations with charge measures were identified for indirect binding, with enhanced selectivity for formate with more neutral iron partial charge in the isolated complex.

Finally, we used molSimplify's structure generation module along with the input file and jobscript generation to study the effect of different axial ligands on the binding strength of carbon monoxide on metalloporphyrins. We reported a wide range of bond dissociation energies that depend on the spin-states, metals and axial ligands and correlated the binding strength with electronic descriptors such as the occupation of the dz2-type molecular orbital in the 5-coordinate complexes. Good correlations between binding strength and the charge transfer on the metal center were identified, suggesting that high charge transfer on the metal results in stronger bonds between the metal and the axial CO. We performed sensitivity analysis by calculating the dependence of both the HS-LS splitting and the binding strength on HF exchange. By identifying complexes that showed the strongest and weakest dependence of their electronic properties on HF exchange, we were able to provide suggestions about the robustness of our predictions.

Chapter 8

Market analysis of the Catalysis industry: Capstone paper

8.1 Introduction

Meeting rising energy requirements and protecting the environment are among the most important applications of catalyst technology. Broadly speaking, a catalyst is a substance that increases the rate of a reaction by reducing the required activation energy, but which is left unchanged by the reaction. The petroleum industry is the largest single user of catalysts, especially in the production of refined products such as gasoline and diesel fuel. Catalysts also contribute to increasing the supply of petroleum by making it commercially possible to produce oil from sources once regarded as uneconomical, such as tar sands and heavy oil deposits, and they are being used to produce increasing quantities of synthetic oil and gas from coal and oil shale.

Catalysts are also at the forefront of technologies such as fuel cells and photovoltaic cells, which are being developed to replace conventional fossil fuels. They indirectly contribute to increasing energy supplies by improving the efficiency with which hydrocarbon and other fuels are utilized.

Energy consumption is a major source of pollution (e.g., automobile and industrial emissions), along with other waste-generating activities. Many people consider the prevention of climate change and other forms of environmental degradation to be a greater priority than increasing energy supplies. Catalysts are indispensable to many types of environmental remediation, from vehicle emissions control systems to industrial effluent and municipal waste treatment. They also contribute indirectly to reducing pollution and other adverse environmental impacts, such as through cleaner-burning fuels and the production of products, including refrigerants, that pollute less than the substances they replace.

There are four major segments of the catalyst market:

1. Environmental

2. Refining and petrochemical

3. Chemical

4. Polymerization

8.2 Types of Catalysts

Based on the physical states of the catalysts and the reactants, catalysts are classified into two types: homogeneous and heterogeneous. Homogeneous catalysts are in the same physical state as the reactants and are hence miscible. Typically, both reactants and catalysts are in a liquid or gaseous medium, which allows interactions at the molecular level, and the two are separated after the reaction. Heterogeneous catalysts are in a physically different state than the reactants and products. Solid materials that catalyze a reaction in a gaseous medium for gaseous products are the most common examples of such catalysts. For heterogeneous catalysts, the surface area of contact is most important, as this determines how effective the catalyst can be, and a higher surface area of contact ensures faster reaction kinetics per weight of catalyst used. For solid catalysts, a porous structure is hence often preferred.

Figure 8-1: Catalyst market shares by technology (2017) [261].

Homogeneous catalysts, particularly Lewis acid catalysts, are well known and have been applied in Friedel-Crafts alkylation and acylation reactions. However, new policies have been introduced regarding the application of homogeneous catalysts as a result of the problems they cause, such as corrosion, loss of catalyst and environmental disruption. These policies focus on environmental protection, avoidance of environmentally unfriendly reactants and promotion of catalysts with better selectivity in order to minimize product waste and expensive separations and recycling.

Meanwhile, heterogeneous catalysts, such as molecular sieves, zeolites and porous materials for liquid-phase organic synthesis reactions, can provide many benefits, such as a clean product solution after simple physical separation steps such as filtration, ease of recovery and avoidance of corrosion. Therefore, development of efficient heterogeneous catalysts is increasing in popularity, especially in the production of fine chemicals and intermediates.

A third type of catalyst is biocatalysts, where enzymes or whole living organisms are used to facilitate reactions; the organisms take in certain chemicals and then produce the products desired by the reaction. This segment is small and is useful only for specific reactions, as the limitation is set by the nature of the enzyme used and its capabilities for certain reactions. The last segment involves all other types of catalysts that do not fall under these broad segments.

Heterogeneous catalysts account for the vast majority of industry use, and this is mainly attributed to the following two reasons:

• High stability and ease of regeneration.

• Easy removal from the reaction medium by physical separation.

Since surface area of contact is the most important aspect for solid heterogeneous catalysts, the catalyst material is often coated onto a carrier material for use. This is done to reduce the cost of the catalyst material, to use the available catalyst as efficiently as possible and to bring physical stability to the catalyst material phase. The carrier material, sometimes referred to as the support, is often made of alumina, silica, titania, zirconia or other inert materials that are strong, nonreactive and temperature stable. A list of common support materials and their properties is given in the tables below. The support materials are required to have several important properties to justify their use, such as stability, texture and thermal conductivity.

8.3 Catalyst market segments

The largest market segment for catalyst use by revenue is environmental applications. This segment includes applications such as emission control products in automobiles, industrial emission controls and removal of chemicals from waste streams. This segment represents roughly 39% of all catalyst usage worldwide and is estimated to be a $9.2 billion market globally. The refining segment is the second largest market for catalysts, and here the application is mainly for improving the processing of oils and gases that are drilled out from the Earth's crust. The crude material is processed or refined to make a multitude of products, and each of the reaction steps often utilizes a special catalyst designed for the purpose. This market segment represents roughly 29% of the catalyst market worldwide and is estimated to be a $6.6 billion market globally. The other two segments, namely polymers and chemicals, each represent about 16% of the catalyst market, or $3.8 billion each globally. The applications here are for reactions to produce polymers from monomers and in the production of various chemicals. Of the four market segments, environmental applications are expected to grow the fastest, with a projected growth rate of 4.3% annually in the next few years.

Figure 8-2: Catalyst market shares by application (2017) [261].

The polymer and chemical segments are expected to grow at an annual rate of 3.9%, and the refining segment is expected to grow the slowest of the four segments, with a projected growth rate of 3.5%. Since the growth rates of the four segments are all similar, the market share of each segment is expected to remain more or less the same, with a change of only about 1% in the segment shares over the next four to five years. The use of combinatorial catalysts for discovery and optimization of catalytic performance is also expected to have a strong impact on the rate at which new catalysts are developed.
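The claim that segment shares barely move can be checked with a short compounding calculation. The sketch below takes the segment revenues and growth rates quoted in this section and projects them forward; the five-year horizon and the assumption of constant annual growth are illustrative choices, and the helper names are hypothetical. Shares are computed from the dollar figures, so they differ slightly from the rounded percentages given in the text.

```python
# Segment revenues (billion USD) and annual growth rates quoted in Section 8.3.
segments = {
    "environmental": (9.2, 0.043),
    "refining": (6.6, 0.035),
    "polymers": (3.8, 0.039),
    "chemicals": (3.8, 0.039),
}


def shares(values):
    """Return each segment's share of the total, in percent."""
    total = sum(values.values())
    return {name: 100.0 * v / total for name, v in values.items()}


def project(segments, years):
    """Compound each segment's revenue forward at its constant annual rate."""
    return {name: v * (1.0 + g) ** years for name, (v, g) in segments.items()}


now = shares({name: v for name, (v, _) in segments.items()})
later = shares(project(segments, years=5))  # assumed five-year horizon
shift = {name: later[name] - now[name] for name in segments}

for name in segments:
    print(f"{name:14s} {now[name]:5.1f}% -> {later[name]:5.1f}% ({shift[name]:+.1f} pts)")
```

Under these assumptions every segment's share moves by less than one percentage point, with the environmental segment gaining slightly and refining losing slightly.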

8.4 Environmental catalysts

This market segment mainly comprises catalytic converters, which are used in vehicles to reduce emissions, with the latest technology being the three-way catalysts that can also reduce particulate matter. For the automotive industry, there are two technologies in use, one for light- and medium-duty vehicles and the other for heavy-duty vehicles. The technological difference is more pronounced between gasoline-powered engines and diesel-powered engines, with the light-duty market dominated by gasoline engines and the heavy-duty market by diesel engines. The exhaust chemistry of the two engine technologies is markedly different, and so are the environmental catalysts and technologies used to provide emission controls.

Figure 8-3: Catalyst market shares by application (2017) [262].

The global automotive market is a huge and important segment in the global marketplace and the most important market for the catalyst industry. Emissions from automotive exhaust are mostly cleaned by catalytic technologies. The size and nature of the catalyst market for this segment are directly connected to two factors:

• Automotive market size and growth.

• Regulations to control emissions in various markets.

Both of these factors are external to the catalyst market; hence, this segment of the catalyst market is rather unique in that it reacts to changes in them. Companies that can develop the cheapest and most efficient technologies to meet the regulations are the winners in this marketplace.

Because the regulations in place and the expected changes in regulations vary from country to country and from region to region, the technology in this marketplace is heavily driven by Europe, which is at the forefront of environmental regulation. Other countries tend to follow these regulations, some within a few years and others after a decade or two. The trickle-down effect of the regulations plays an important part in the catalyst market. Companies essentially develop new technology in the Western market and then transfer the production technology to markets in other countries as regulations are implemented there. Alternatively, they may simply add the product to their product portfolios for marketing and sales activities there.

Global light vehicle production is estimated at 87.6 million units in 2014, growing to 112.7 million units by 2020 for a CAGR of 4.3%. Of this, the strongest growth is in Asia, where light vehicle production is expected to increase from 45.2 million units in 2014 to 61.2 million units by 2020 for a CAGR of 5.2%. With the market itself growing and regulations becoming more stringent, the catalyst market for this industry is in a sweet spot with guaranteed growth in the coming years. However, it is the technologically advanced suppliers that stand to benefit from this trend.

Global heavy-duty vehicle production is estimated at 1.8 million units in 2014 and is expected to grow to 4.1 million by 2020 for a staggering CAGR of 15.2%. Of this, the strongest growth is in Asia and South America combined, where heavy-duty vehicle production is expected to increase from 762,000 units in 2014 to 2.7 million units by 2020 for a CAGR of 24.0%. However, the emission controls for heavy-duty vehicles, mostly diesel, are not as strictly regulated as those for light-duty vehicles. This is mainly due to the size of the market, as the number of vehicles in the heavy-duty segment is smaller and hence the total pollution output is relatively smaller. In contrast, the fuels used in heavy-duty vehicles are typically not as clean; therefore, the exhaust is more polluting than exhaust from light vehicles, which are predominantly gasoline-fueled.

Fuel emissions are tightly regulated around the world, with each country adopting new emission control standards every few years. Regulations are expected to become stricter over the next several years.
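The compound annual growth rates quoted for vehicle production follow from the standard formula CAGR = (end/start)^(1/years) - 1. The sketch below applies it to the 2014 and 2020 production figures given in this section, assuming six compounding years between the two dates; the function name is hypothetical.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1.0 / years) - 1.0


# Production figures (million units) quoted in the text for 2014 and 2020.
cases = {
    "light vehicles, global": (87.6, 112.7),
    "light vehicles, Asia": (45.2, 61.2),
    "heavy-duty, global": (1.8, 4.1),
}

for name, (units_2014, units_2020) in cases.items():
    rate = cagr(units_2014, units_2020, years=6)
    print(f"{name}: {100 * rate:.1f}% per year")
```

The two light-vehicle cases reproduce the quoted 4.3% and 5.2%. The rounded heavy-duty figures give a somewhat lower rate than the quoted 15.2%, which may simply reflect rounding in the published unit counts.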

8.5 Refining catalysts

In 2014, the refining industry accounted for 78.7% ($5.4 billion) of the total market for energy catalysts, with primary energy production (e.g., synfuels, biofuels, hydrogen, photovoltaic cells) accounting for most of the remainder. Catalysts used in energy conversion applications, chiefly fuel cells, accounted for only 0.4% ($27.3 million) of the 2014 energy catalyst market. The demand for catalysts in refining, however, is growing at a much slower rate (i.e., a CAGR of 3.7% from 2015 to 2020) than that for primary energy catalysts (18.7%) and energy conversion catalysts (38%). As a result, the refining industry's share of the total energy catalyst market is expected to drop from 78.7% in 2014 to 61.5% in 2020, whereas primary energy production's share rises from 20.9% to 36.9%, and energy conversion gains more than a percentage point.

The most commonly used raw materials are zeolites and precious and base metals. Zeolites and metals are most commonly used in petroleum refinery and emission reduction applications, while different types of chemical compounds and enzymes are used in chemical synthesis and polymerization reactions. Chemical compound catalysts are further segmented into polyolefins, adsorbents, chemical synthesis catalysts, and other materials such as enzymes and other bio-based catalysts.

Synthetic zeolites are the most commonly used catalyst material, owing to their affordable cost. However, increasing demand for natural zeolite due to its environmental benefits and decreasing product cost is likely to promote market growth over the next few years. Precious metals are expected to witness the highest growth among the three metal types. However, transition and base metals are also expected to witness substantial growth due to wider availability and lower product cost.

Precious metals are extremely costly and are preferred in alternate applications such as jewelry. Palladium is expected to have the fastest growth among its counterparts, although platinum is the most widely used product, owing to its superior properties. However, lower product availability coupled with complex mining and volatile prices is expected to restrain growth.

Refinery catalysts based on metals such as platinum, gold, rhodium and iridium are gaining higher demand in the oil and gas industry. Growing production of these metals in the emerging economies of Chile, China and South Africa, due to favorable regulatory support for Foreign Direct Investment (FDI) in mineral production, is expected to ensure the raw material supply for market players. However, growing demand for these metals in other applications is expected to restrict raw material availability and may challenge market growth.

Chemical compounds, such as sulfuric acid, calcium carbonate, hydrofluoric acid, organomagnesium, organoaluminum, metallocenes and triphenylphosphine, are used as catalyst materials for numerous chemical, polymer and petrochemical applications. In the petrochemical industry, catalysts find applications in catalytic cracking, isomerization and reforming processes. Similarly, acidic catalysts can be used in organic chemistry for hydration of carbon-carbon double bonds to produce alcohols, acid-catalyzed hydrolysis of esters, esterification, nitration of benzene and other such useful reactions.

The growing popularity of organometallic compounds for the production of polyethylene and polypropylene is expected to be a key market driver over the forecast period. The low prices of chemical compounds as compared with zeolites, metals and enzymes are expected to have a positive impact on market growth.

8.6 Polymer catalysts

The global polymer catalyst market is highly fragmented. Unlike the two segments discussed earlier, there are only two subsegments here that warrant separate mention: the polypropylene and polyethylene segments. Of these, the polypropylene market is growing, and a detailed discussion of this market is included later in this chapter. Other applications in the polymer catalysis market are many and too fragmented to warrant a separate detailed analysis.

The market for polymer catalysts is estimated to have been about $3.8 billion in 2016, commanding third position among the market segments for catalysts by size. The market is expected to grow at a five-year CAGR of 4.2% in the coming years to reach $4.35 billion by 2019 (Fig. 8-4).

Figure 8-4: Polymer catalyst market growth projection by subsegment [263].

Of this, 19% or $735 million was the 2016 market for Ziegler-Natta catalysts used in polypropylene manufacturing. This market is expected to grow at a five-year CAGR of 5% in the next few years to reach $866 million by 2019. The second largest segment is Ziegler-Natta catalysts for polyethylene, which is estimated to have been valued at $679 million in 2016. This segment is expected to grow at a five-year CAGR of 5% to reach $729 million by 2019.

The polypropylene catalyst market is dominated by LyondellBasell, which controls 38% of the market. BASF, Grace and Toho each control 10% to 15% of the market, and these four top companies together represent roughly 75% of the catalyst market for PP. LyondellBasell sells its catalysts under the brand Spheripol; Lummus Novolen Technology GmbH, a part of CB&I (Chicago Bridge & Iron Co.), uses the brand name Novolen; and Ineos products are marketed under the trade name Innovene.

As for the polyethylene market, it is dominated by Univation, which controls 45% of the market. Grace is second, with 18% of the market, and LyondellBasell and Sinopec each represent 8% of the market. Together, these four players have about 79% of the PE catalyst market.
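
The growth figures quoted in this section pair a starting value, an ending value and a compound annual growth rate, so any one of the three can be checked against the other two. The sketch below computes the growth rate implied by the 2016 and 2019 values given in the text. It assumes three annual compounding periods between those years, and the helper name `implied_cagr` is hypothetical; the dollar values are the only inputs taken from the text.

```python
def implied_cagr(start_value, end_value, periods):
    """Return the constant annual growth rate that links start_value to end_value."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# (2016 value, 2019 value) in $ millions, as quoted in the text.
segments = {
    "polymer catalysts (total)": (3800.0, 4350.0),
    "Ziegler-Natta, polypropylene": (735.0, 866.0),
    "Ziegler-Natta, polyethylene": (679.0, 729.0),
}

PERIODS = 3  # assumption: three annual periods, 2016 -> 2019

implied = {
    name: implied_cagr(start, end, PERIODS)
    for name, (start, end) in segments.items()
}
for name, rate in implied.items():
    print(f"{name}: {rate:.1%} per year")
```

Comparing the implied rates with the quoted five-year CAGRs is a quick consistency check. Any difference may simply reflect a different base year or compounding window in the underlying market reports.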

8.7 Chemical catalysts

The chemical catalyst market is much more fragmented than the polymer catalyst market. Table 8.1 provides the market size for the subsegments in this industry segment. The largest subsegment is the production of terephthalic acid, according to data from 2016. This market commands about $400 million in revenues. Major catalyst manufacturers feel that this is an attractive market segment because of its fragmented nature. The syngas-ammonia application and shale gas markets, which support olefin production technologies, are the segments that hold the highest potential for future growth.

Considering the varied markets and applications for catalysts in the chemicals industry, it is also difficult for companies to have the required human capital to develop new catalysts and processes for each end product and market. The catalyst development process requires an understanding of the manufacture of the carrier materials, of the catalyst chemical itself, and of the process in which it is used, including its conditions, limitations and requirements.

8.8 Global Catalysis Market Trends

There are numerous general trends in catalyst technology development that are driving the marketplace. These are:

• Toward higher activity and better selectivity of catalysts.

• Changes in feedstock and more effective use of feedstock.

• Toward lower operating temperatures for reactions.

Subsegment                        Market Size ($ millions)
Terephthalic acid                 352
Syngas-ammonia                    292
Acrylonitrile                     219
Ethylene oxide                    207
Syngas-hydrogen                   198
Methanol                          193
Xylenes                           182
Olefin purification               175
PTA                               135
Ammonia                           133
Edible oils and fats              109
Oxo-aldehyde                      106
Sulfuric acid                      90
Syngas-methanol                    85
Inedible oil                       83
Phthalic anhydride                 79
Solid phosphoric acid (SPA)        73
Styrene                            67
Dimethyl terephthalate             61
Maleic anhydride                   50
Formaldehyde                       45
Nitric acid                        44
Ethylbenzene zeolite               32
Other                             297

Table 8.1: Global chemical catalyst market in 2016 ([261,263]).

• Energy efficiency.

• Creation of processes around catalyst technologies.

• Gas-to-liquids (GTL) technologies.

The following sections provide a detailed look at each of these trends and at how the catalyst industry is participating in them.

8.8.1 Toward higher activity and selectivity of catalysts

One of the strongest trends in catalyst development is toward higher activity and higher selectivity of the catalyst for the objective reaction. Consider the example of hydrocracking, a process that breaks down low-value, highly aromatic, high-sulfur and high-nitrogen feedstock into a slate of desirable products such as liquefied petroleum gas (LPG), diesel fuel, hydrogen-rich FCC feed and ethylene cracker feed. Modern technology for hydrocracking was commercialized in the 1960s. Because of the improvements in catalysts and the process, the industry has been able to scale up the sizes of the reactors as well as produce very targeted and clean final products.

Figure 8-5: Trends in catalyst development in hydrocracking ([261,264]).

Current global capacity for hydrocracking is estimated at about 8 million barrels per day, up from a mere 4 million barrels per day in 2000. Hydrocracking capacity is growing in the range of 4% to 6% annually. This change is possible mainly because of the improvements in catalyst development. Catalysts are now capable of driving very specific reactions at higher rates than ever before.

8.8.2 Changes in feedstock and more effective use of feedstock

One of the long-term trends in the industry is dwindling feedstock supplies. Several previous-generation feedstocks are being depleted, and this is causing operators of established processing plants, especially in the refining industry, to take a harder look when new plants are being designed and established. If the local natural crude oil supply is limited, then having to transport the feedstock from other parts of the world often makes processing units unviable. The lives of new plants being established often rival the projected life of the local feedstock supplies. Even when local feedstock is available, with wells having to go deeper over time, the cost of supply often makes such sources unattractive.

8.8.3 Lower Operating Temperatures

Newer catalysts can operate at lower temperatures and still be more efficient in providing product output. Since the role of a catalyst is to provide the reactants an easier energy path to reach the final products, newer catalysts are bringing down the thermal energy required for the reactants to cross the activation barrier. This indirectly reduces the operating temperature of the reactor vessels.

Every drop in reaction temperature is associated with increased life for the catalyst, reactor vessel and all related equipment. The structural integrity of the catalyst and catalyst carrier is better maintained, for longer life, at reduced operating temperatures. Each generation of catalyst lowers the reaction temperature required to achieve the conversion target, hence extending run length.
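
The link between an easier energy path and a lower operating temperature can be made concrete with the Arrhenius expression, k = A exp(-Ea / (R T)). The sketch below is an illustration under stated assumptions, not a model of any specific process: it assumes simple Arrhenius kinetics with the same pre-exponential factor A for both catalysts, and every numerical value (temperature, activation energies, prefactor) is hypothetical. Under those assumptions, the temperature at which a new catalyst matches the old rate constant scales with the ratio of activation energies.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def arrhenius_rate(prefactor, activation_energy, temperature):
    """Arrhenius rate constant k = A * exp(-Ea / (R * T))."""
    return prefactor * math.exp(-activation_energy / (R * temperature))


def equal_rate_temperature(old_temperature, old_ea, new_ea):
    """Temperature at which the new catalyst matches the old rate constant.

    Assumes both catalysts share the same pre-exponential factor, so that
    Ea_new / T_new = Ea_old / T_old.
    """
    return old_temperature * new_ea / old_ea


# Hypothetical values chosen only for illustration.
T_OLD = 673.15   # K, operating temperature with the older catalyst
EA_OLD = 120e3   # J/mol, activation energy with the older catalyst
EA_NEW = 110e3   # J/mol, activation energy with the newer catalyst
A = 1e13         # 1/s, shared pre-exponential factor (assumption)

t_new = equal_rate_temperature(T_OLD, EA_OLD, EA_NEW)
print(f"Equal-rate temperature with the newer catalyst: {t_new:.1f} K")
print(f"Temperature reduction: {T_OLD - t_new:.1f} K")
```

In practice the pre-exponential factor also changes between catalysts, and deactivation and transport effects matter, so this scaling is only a first-order guide.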

8.8.4 Energy efficiency

Some of the newer reactors that are being commissioned focus on on-site catalyst regeneration, which not only reduces operational costs but is also attractive for optimal use of energy. Regeneration is often an exothermic process, in which the carbon deposited on the catalyst is burned off. The energy released in the process can be transferred into the reactor easily, either as hot catalyst input or by heating the feedstock before it enters the reactor. Studies have shown that such efficient usage of energy can reduce the energy costs of the reactor by between 10% and 20%.

While feedstock cost is the highest factor impacting product cost, energy cost is often the second highest factor, and the trend is to control the energy costs as much as possible. The changing feedstock situation discussed earlier is forcing older reactors to run more optimally to compete with new processes and feedstocks that can provide products at a lower cost to the customer markets.

The push to lower operating temperatures also indirectly supports lowering the energy costs involved in the process. It should be noted, however, that ex situ regeneration of catalyst is still the preferred option, as on-site regeneration technology is still in the process of being accepted into the marketplace. Further, the segments with the highest regeneration activity are the refining markets, where most customers prefer ex situ regeneration as they focus on their core competencies.

8.8.5 Creation of processes around catalyst technologies

Since the cost of the catalyst itself is a relatively small portion of the cost of the final product sold in almost all catalytic application markets (except perhaps for automotive catalysts, where PGM material cost is a significant factor), companies have found it highly attractive, and relatively cheap, to invest in catalyst R&D. This has, over the years, morphed into an approach where the catalyst and the process are co-developed for a specific market. The approach has resulted in existing processes becoming more competitive, as well as new processes being developed.

Examples are numerous in the refinery and the chemical markets, where companies such as Shell, Chevron, BP and ExxonMobil have invested in and developed catalyst technologies for processes that they have later licensed out. The approach also brings synergy with in-house catalyst capabilities, as companies that are strong in catalyst development, such as UOP and Haldor Topsoe, have later ventured into process development.

Because the development cycles for new technologies are long (several years or even decades), companies are investing in designing, building and operating mini-plants to commercialize the technologies developed. The markets always need proof that a new technology really works in large-scale production settings, and the "mini-plant" approach fulfills this demand before large-scale commercialization of the technology.

One of the unfortunate outcomes of this trend is that the entry barriers for newcomers to the catalyst and catalyst regeneration markets have become significantly higher, with larger companies able to protect their turf much better as the approach builds an even bigger moat around their business. While process developers such as Shell and Chevron could compete with catalyst manufacturers such as UOP and Haldor Topsoe, the marketplace has instead found partners, so that together the companies are creating new catalysts and processes, sharing the risks and rewards while keeping newcomers effectively off the playing field.

8.8.6 Gas-to-liquids (GTL) technologies

Gas-to-liquids (GTL) technology has been of great interest to the energy community and the subject of research and development over decades. There have been efforts to develop a process that involves fewer steps and is more efficient than the more established techniques based on the Fischer-Tropsch process. If successful, these may drive demand for natural gas, and hence its price.

The biggest advantage of the technology, apart from its price point, is that the fuels produced and the processes used are environmentally friendly compared with earlier processes and products. This allows new plants to meet the emerging environmental standards in various countries.

With the development of new-generation catalysts and processes, the cost curve for new GTL projects has been dropping and now appears attractive for the next several decades. Additionally, the growing production and market influence of shale gas are driving the future of the industry in ways not seen in the past. It would not be surprising to see new markets and industries develop and thrive with the establishment of these technologies.

A better way to convert natural gas into liquid fuels is to use small chemical plants that allow smaller natural gas fields to be tapped, as this reduces the contribution of gas transportation to the overall cost. Gas Reaction Technologies, based in California, claims to have a technology for this market. Alaskan gas sources are a target market for this technology.

For GTL projects, the microchannel technology being developed by Velocys may prove to be the awaited breakthrough. With some segments of the industry now using GTL-based kerosene (e.g., Qatar Airways) instead of the traditional crude-derived jet fuel or kerosene, the GTL-derived liquid fuel market is poised to expand.

8.9 Conclusions

The global industrial catalyst market reached a value of US $18 billion in 2017. The market is expected to reach a value of US $23 billion by 2023, exhibiting a CAGR of 4.5% during 2018-2023. Rising demand for chemical products and fuels and growing petroleum refining capacity are some of the major factors increasing the usage of industrial catalysts.

Demand for eco-friendly fuels further propels the growth of the market, as industrial catalysts help in meeting fuel standards, increase operational efficiency and promote clean fuel trends. Growing consumption of fuels and other chemical products has resulted in the rapid growth of the petroleum industry, thereby boosting the demand for industrial catalysts. Technological developments, rising urbanization and growing automation are other factors driving market growth.

Understanding the current needs of a rapidly evolving global market is crucial for the development of new catalytic materials that can accommodate complex industrial processes and highly selective chemical reactions. The global market for catalysts is growing fast and can be the driving force for the development of environmentally friendly, cost-efficient and effective new chemical processes.

Bibliography

[1] R. Withnall, B. Z. Chowdhry, S. Bell, and T. J. Dines. Journal of Chemical Education, 84(8):1364, (2007).

[2] P. Hohenberg and W. Kohn. Physical Review, 136(3B):B864–B871, (1964).

[3] W. Kohn and L. Sham. Physical Review, 140(4A):A1133, (1965).

[4] S. H. Vosko, L. Wilk, and M. Nusair. Canadian Journal of Physics, 58(8):1200–1211, (1980).

[5] D. Ceperley and B. Alder. Physical Review Letters, 45:566–569, (1980).

[6] J. Perdew et al. Physical Review B, 46:6671, (1992).

[7] A. D. Becke. Physical Review A, 38:3098–3100, (1988).

[8] J. Tao, J. P. Perdew, V. N. Staroverov, and G. E. Scuseria. Phys. Rev. Lett., 91:146401, (2003).

[9] A. D. Becke. The Journal of Chemical Physics, 98(7):5648–5652, (1993).

[10] C. Lee, W. Yang, and R. G. Parr. Physical Review B, 37:785–789, (1988).

[11] P. J. Stephens, F. J. Devlin, C. F. Chabalowski, and M. J. Frisch. The Journal of Physical Chemistry, 98(45):11623–11627, (1994).

[12] C. Adamo and V. Barone. The Journal of Chemical Physics, 110(13):6158–6170, (1999).

[13] M. Ernzerhof and G. E. Scuseria. The Journal of Chemical Physics, 110(11):5029–5036, (1999).

[14] O. Gunnarsson and B. I. Lundqvist. Phys. Rev. B, 13:4274–4298, (1976).

[15] Y. Li, S. H. Chan, and Q. Sun. Nanoscale, 7:8663–8683, (2015).

[16] F. Besenbacher, I. Chorkendorff, B. S. Clausen, B. Hammer, A. M. Molenbroek, J. K. Nørskov, and I. Stensgaard. Science, 279(5358):1913–1915, (1998).

[17] J. K. Norskov, F. Abild-Pedersen, F. Studt, and T. Bligaard. Proceedings of the National Academy of Sciences, 108(3):937–943, (2011).

[18] J. Norskov, T. Bligaard, J. Rossmeisl, and C. Christensen. Nature Chemistry, 1:37–46, (2009).

[19] S. Kozuch and S. Shaik. Journal of the American Chemical Society, 128(10):3355–3365, (2006).

[20] K. P. Jensen and U. Ryde. Journal of Biological Chemistry, 279(15):14561–14569, (2004).

[21] C. J. Cramer and D. G. Truhlar. Physical Chemistry Chemical Physics, 11:10757–10816, (2009).

[22] M. C. Gutzwiller. Physical Review, 134:A923–A941, (1964).

[23] J. N. Harvey. Annual Reports Section ’C’ (Physical Chemistry), 102:203–226, (2006).

[24] A. Borgogno, F. Rastrelli, and A. Bagno. Trans., 43:9486–9496, (2014).

[25] F. A. Cotton and G. Wilkinson. Advanced inorganic chemistry, 5th edition. Wiley, New York, (1988).

[26] N. F. Mott. 153(880):699–717, (1936).

[27] C. R. Jacob and M. Reiher. International Journal of Quantum Chemistry, 112(23):3661–3684, (2012).

[28] M. Cococcioni and S. De Gironcoli. Physical Review B, 71:35105, (2005).

[29] F. Furche and J. P. Perdew. The Journal of Chemical Physics, 124(4), (2006).

[30] A. Sorkin, M. Iron, and D. Truhlar. Journal of Chemical Theory and Computation, 4:307–315, (2007).

[31] J. Harvey. Structure and Bonding, 112:151–183, (2004).

[32] I. C. Gerber, J. G. Ángyán, M. Marsman, and G. Kresse. The Journal of Chemical Physics, 127(5), (2007).

[33] E. Burello and G. Rothenberg. International Journal of Molecular Sciences, 7(9):375, (2006).

[34] M. Boudart. Handbook of Heterogeneous Catalysis. Wiley-VCH, (1997).

[35] K. Karlin. Science, 261(5122):701–708, (1993).

[36] B. Xu, Y. Bhawe, and M. E. Davis. Chemistry of Materials, 25(9):1564–1571, (2013).

[37] B. G. Hashiguchi, S. M. Bischof, M. M. Konnick, and R. A. Periana. Accounts of Chemical Research, 45(6):885–898, (2012).

[38] L. Que and W. B. Tolman. Nature, 455(7211):333–340, (2008).

[39] J. R. Kitchin, J. K. Nørskov, M. A. Barteau, and J. G. Chen. The Journal of Chemical Physics, 120(21):10240–10246, (2004).

[40] J. K. Nørskov and T. Bligaard. Angewandte Chemie International Edition, 52(3):776–777, (2013).

[41] A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder, and K. A. Persson. APL Materials, 1(1):011002, (2013).

[42] C. W. Glass, A. R. Oganov, and N. Hansen. Computer Physics Communications, 175(11–12):713–720, (2006).

[43] A. O. Lyakhov, A. R. Oganov, H. T. Stokes, and Q. Zhu. Computer Physics Communications, 184(4):1172–1182, (2013).

[44] J. Loch and R. Crabtree. Pure and Applied Chemistry, 73:119–128, (2009).

[45] J. P. Stambuli and J. F. Hartwig. Current Opinion in Chemical Biology, 7(3):420–426, (2003).

[46] K. Houk and P. H.-Y. Cheong. Nature, 455:309–313, (2008).

[47] Y. Chu, W. Heyndrickx, G. Occhipinti, V. R. Jensen, and B. K. Alsberg. Journal of the American Chemical Society, 134(21):8885–8895, (2012).

[48] S. Keinan, M. J. Therien, D. N. Beratan, and W. Yang. The Journal of Physical Chemistry A, 112(47):12203–12207, (2008).

[49] B. D. Mar, H. W. Qi, F. Liu, and H. J. Kulik. The Journal of Physical Chemistry A, 119(24):6551–6562, (2015).

[50] Z. Tian, T. Saito, and D. en Jiang. The Journal of Physical Chemistry A, 119(16):3848–3852, (2015).

[51] R. Ma, P. Guo, L. Yang, L. Guo, X. Zhang, M. K. Nazeeruddin, and M. Grätzel. The Journal of Physical Chemistry A, 114(4):1973–1979, (2010).

[52] L. McEwen and Y. Li. Journal of Computer-Aided Molecular Design, 28(10):975–988, (2014).

[53] J. Xia, E. L. Tilahun, T. E. Reid, L. Zhang, and X. S. Wang. Methods, 71(Virtual Screening):146–157, (2015).

[54] L. Richter and G. F. Ecker. Drug Discovery Today, 14:37–41, (2015).

[55] L. Mak, D. Marcus, A. Howlett, G. Yarova, G. Duchateau, W. Klaffke, A. Bender, and R. C. Glen. Journal of Cheminformatics, 7(1):1–12, (2015).

[56] A. Iwaniak, P. Minkiewicz, M. Darewicz, M. Protasiewicz, and D. Mogut. Journal of Functional Foods, 16:334–351, (2015).

[57] S. Beisken, T. Meinl, B. Wiswedel, L. F. de Figueiredo, M. Berthold, and C. Steinbeck. BMC Bioinformatics, 14:257, (2013).

[58] M. J. Martinez, I. Ponzoni, M. F. Diaz, G. E. Vazquez, and A. J. Soto. Journal of Cheminformatics, 7(1):1–17, (2015).

[59] C. G. Thompson, A. Sedykh, M. R. Nicol, E. Muratov, D. Fourches, A. Tropsha, and A. D. Kashuba. AIDS Research & Human Retroviruses, 30(11):1058, (2014).

[60] W. Loging, R. Rodriguez-Esteban, J. Hill, T. Freeman, and J. Miglietta. Drug Discovery Today, 8(Drug repurposing):109–116, (2011).

[61] D. Weininger. Journal of Chemical Information and Computer Sciences, 28(1):31–36, (1988).

[62] Daylight chemical information systems. http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html. Accessed on November 20, 2015.

[63] Epam life sciences. http://lifescience.opensource.epam.com/indigo. Accessed on November 20, 2015.

[64] Y. Cao, A. Charisi, L. C. Cheng, T. Jiang, and T. Girke. Bioinformatics, 24(15):1733–1734, (2008).

[65] Daylight chemical information systems inc. http://www.daylight.com/products/toolkit.html. Accessed on November 20, 2015.

[66] C. Steinbeck, Y. Han, S. Kuhn, O. Horlacher, E. Luttmann, and E. Willighagen. Journal of Chemical Information and Computer Sciences, 43(2):493–500, (2003).

[67] Rdkit: Open-source cheminformatics. http://www.rdkit.org. Accessed on March 30, 2016.

[68] N. O’Boyle, M. Banck, C. James, C. Morley, T. Vandermeersch, and G. Hutchison. Journal of Cheminformatics, 3(1):33, (2011).

[69] H. E. Helson. Structure Diagram Generation, pages 313–398. John Wiley & Sons, Inc., (2007).

[70] D. M. Ball, C. Buda, A. M. Gillespie, D. P. White, and T. R. Cundari. Inorganic Chemistry, 41(1):152–156, (2002).

[71] C. Buda, A. Flores, and T. R. Cundari. Journal of Coordination Chemistry, 58(7):575–585, (2005).

[72] C. Buda, S. K. Burt, T. R. Cundari, and P. S. Shenkin. Inorganic Chemistry, 41(8):2060–2069, (2002).

[73] S. Bauerschmidt and J. Gasteiger. Journal of Chemical Information and Computer Sciences, 37(4):705–714, (1997).

[74] J. M. Blaney and J. S. Dixon. Distance Geometry in Molecular Modeling, pages 299–335. John Wiley & Sons, Inc., (2007).

[75] D. Lagorce, T. Pencheva, B. Villoutreix, and M. Miteva. BMC Chemical Biology, 9(1):6, (2009).

[76] Molecular networks. https://www.molecular-networks.com/online_demos/corina_demo. Accessed on November 20, 2015.

[77] Chemaxon. https://www.chemaxon.com/products/. Accessed on November 20, 2015.

[78] F. H. Allen. Acta Crystallographica Section B, 58(3 Part 1):380–388, (2002).

[79] G. Bergerhoff and I. Brown. International Union of Crystallography, (1987).

[80] X. Chen, M. Liu, and M. Gilson. Combinatorial Chemistry & High Throughput Screening, 4(8):719 – 725, (2002).

[81] E. E. Bolton, Y. Wang, P. A. Thiessen, and S. H. Bryant. Chapter 12-pubchem: Integrated platform of small molecules and biological activities. volume 4 of Annual Reports in Computational Chemistry, pages 217–241. Elsevier, (2008).

[82] V. Law, C. Knox, Y. Djoumbou, T. Jewison, A. C. Guo, Y. Liu, A. Maciejewski, D. Arndt, M. Wilson, V. Neveu, A. Tang, G. Gabriel, C. Ly, S. Adamjee, Z. T. Dame, B. Han, Y. Zhou, and D. S. Wishart. Nucleic Acids Research, 42(D1):D1091–D1097, (2014).

[83] J. J. Irwin, T. Sterling, M. M. Mysinger, E. S. Bolstad, and R. G. Coleman. Journal of Chemical Information and Modeling, 52(7):1757–1768, (2012).

[84] A. Gaulton, L. J. Bellis, A. P. Bento, J. Chambers, M. Davies, A. Hersey, Y. Light, S. McGlinchey, D. Michalovich, B. Al-Lazikani, and J. P. Overington. Nucleic Acids Research, 40(D1):D1100–D1107, (2012).

[85] Chemspider, search and share chemistry. http://www.chemspider.com/. Accessed on November 20, 2015.

[86] H. J. Kulik, S. E. Wong, S. E. Baker, C. A. Valdez, J. H. Satcher, R. D. Aines, and F. C. Lightstone. Acta Crystallographica Section C, 70(2):123–131, (2014).

[87] A. Andronico, A. Randall, R. W. Benz, and P. Baldi. Journal of Chemical Information and Modeling, 51(4):760–776, (2011).

[88] M. Foscato, G. Occhipinti, V. Venkatraman, B. K. Alsberg, and V. R. Jensen. Journal of Chemical Information and Modeling, 54(3):767–780, (2014).

[89] M. Foscato, V. Venkatraman, G. Occhipinti, B. K. Alsberg, and V. R. Jensen. Journal of Chemical Information and Modeling, 54(7):1919–1931, (2014).

[90] P. Baldi. Journal of Chemical Information and Modeling, 51(12):3029–3029, (2011).

[91] C. R. Groom. Journal of Chemical Information and Modeling, 51(11):2787–2787

[92] S. R. Bahn and K. W. Jacobsen. Computational Science and Engineering, 4(3):56–66, (2002).

[93] M. D. Hanwell, D. E. Curtis, D. C. Lonie, T. Vandermeersch, E. Zurek, and G. R. Hutchison. Journal of Cheminformatics, 4(1):17, (2012).

[94] E. I. Ioannidis and H. J. Kulik. The Journal of Chemical Physics, 143(3), (2015).

[95] K. Burke. The Journal of Chemical Physics, 136(15), (2012).

[96] M. Swart, F. M. Bickelhaupt, and M. Duran. (2014). http://www.marcelswart.eu/dft-poll.

[97] Y. Zhao and D. G. Truhlar. Chemical Physics Letters, 502(13):1 – 13, (2011).

[98] N. Mardirossian and M. Head-Gordon. Physical Chemistry Chemical Physics, 16:9904–9924, (2014).

[99] L. A. Curtiss, K. Raghavachari, P. C. Redfern, and J. A. Pople. The Journal of Chemical Physics, 106(3):1063–1079, (1997).

[100] A. J. Cohen, P. Mori-Sánchez, and W. Yang. Science, 321(5890):792–794, (2008).

[101] S. Lutfalla, V. Shapovalov, and A. T. Bell. Journal of Chemical Theory and Computation, 7(7):2218–2223, (2011).

[102] A. Jain, G. Hautier, S. P. Ong, C. J. Moore, C. C. Fischer, K. A. Persson, and G. Ceder. Physical Review B, 84:045115, (2011).

[103] J. N. Harvey, R. Poli, and K. M. Smith. Coordination Chemistry Reviews, 238–239:347–361, (2003). Theoretical and Computational Chemistry.

[104] P. Gütlich and A. Hauser. Coordination Chemistry Reviews, 97:1 – 22, (1990).

[105] H.-J. Lin, D. Siretanu, D. A. Dickie, D. Subedi, J. J. Scepaniak, D. Mitcov, R. Clérac, and J. M. Smith. Journal of the American Chemical Society, 136(38):13326–13332, (2014).

[106] P. Gütlich and H. A. Goodwin. Spin Crossover in Transition Metal Compounds I. Springer Science & Business Media, (2004).

[107] J. A. Real, E. Andrés, M. C. Muñoz, M. Julve, T. Granier, A. Bousseksou, and F. Varret. Science, 268(5208):265–267, (1995).

[108] L. Bogani and W. Wernsdorfer. Nature Materials, 7:179–186, (2008).

[109] S. Sanvito. Chemical Society Reviews, 40:3336–3355, (2011).

[110] D. Schröder, S. Shaik, and H. Schwarz. Accounts of Chemical Research, 33(3):139–145, (2000).

[111] R. Poli and J. N. Harvey. Chemical Society Reviews, 32:1–8, (2003).

[112] Y. H. Kwon, B. K. Mai, Y.-M. Lee, S. N. Dhuri, D. Mandal, K.-B. Cho, Y. Kim, S. Shaik, and W. Nam. The Journal of Physical Chemistry Letters, 6(8):1472–1476, (2015).

[113] K. Yoshizawa, Y. Shiota, and T. Yamabe. Chemistry A European Journal, 3(7):1160–1169, (1997).

[114] Y. Shiota and K. Yoshizawa. Journal of the American Chemical Society, 122(49):12317–12326, (2000).

[115] M. Filatov and S. Shaik. The Journal of Physical Chemistry A, 102(21):3835–3846, (1998).

[116] H. J. Kulik and N. Marzari. The Journal of Chemical Physics, 129(13), (2008).

[117] H. J. Kulik, M. Cococcioni, D. A. Scherlis, and N. Marzari. Physical Review Letters, 97:103001, (2006).

[118] H. J. Kulik and N. Marzari. Fuel Cell Science: Theory, Fundamentals, and Bio-Catalysis. edited by J. Norskov and A. Wiezcowski (Wiley Monograph), (2010).

[119] A. Droghetti, D. Alfè, and S. Sanvito. The Journal of Chemical Physics, 137(12), (2012).

[120] G. Ganzenmüller, N. Berkaïne, A. Fouqueau, M. E. Casida, and M. Reiher. The Journal of Chemical Physics, 122(23), (2005).

[121] S. R. Mortensen and K. P. Kepp. The Journal of Physical Chemistry A, 119(17):4041–4050, (2015).

[122] S. Zein, S. A. Borshch, P. Fleurat-Lessard, M. E. Casida, and H. Chermette. The Journal of Chemical Physics, 126(1), (2007).

[123] M. Swart, A. R. Groenhof, A. W. Ehlers, and K. Lammertsma. The Journal of Physical Chemistry A, 108(25):5479–5483, (2004).

[124] M. R. Pederson, A. Ruzsinszky, and J. P. Perdew. The Journal of Chemical Physics, 140(12), (2014).

[125] J. P. Perdew and A. Zunger. Physical Review B, 23:5048–5079, (1981).

[126] K. P. Jensen and J. Cirera. The Journal of Physical Chemistry A, 113(37):10033–10039, (2009).

[127] T. F. Hughes and R. A. Friesner. Journal of Chemical Theory and Computation, 7(1):19–32, (2011).

[128] F. Neese. JBIC Journal of Biological Inorganic Chemistry, 11(6):702–711, (2006).

[129] H. Paulsen, V. Schünemann, and J. A. Wolny. European Journal of Inorganic Chemistry, 2013(5-6):628–641, (2013).

[130] H. J. Kulik. The Journal of Chemical Physics, 142(24), (2015).

[131] A. Ghosh, B. J. Persson, and P. Taylor. JBIC Journal of Biological Inorganic Chemistry, 8:507, (2003).

[132] F. Aquilante, P.-A. Malmqvist, T. B. Pedersen, A. Ghosh, and B. O. Roos. Journal of Chemical Theory and Computation, 4(5):694–702, (2008).

[133] A. Ghosh, E. Gonzalez, E. Tangen, and B. O. Roos. The Journal of Physical Chemistry A, 112(50):12792–12798, (2008).

[134] K. Pierloot and S. Vancoillie. The Journal of Chemical Physics, 128(3), (2008).

[135] M. Radon, E. Broclawik, and K. Pierloot. The Journal of Physical Chemistry B, 114(3):1518–1528, (2010).

[136] S. Vancoillie, H. Zhao, M. Radon, and K. Pierloot. Journal of Chemical Theory and Computation, 6(2):576–582, (2010).

[137] L. M. L. Daku, F. Aquilante, T. W. Robinson, and A. Hauser. Journal of Chemical Theory and Computation, 8(11):4216–4231, (2012).

[138] A. Vargas, I. Krivokapic, A. Hauser, and L. M. Lawson Daku. Physical Chemistry Chemical Physics, 15:3752–3763, (2013).

[139] M. Reiher. Inorganic Chemistry, 41(25):6928–6935, (2002).

[140] M. Reiher, O. Salomon, and B. Artur Hess. Theoretical Chemistry Accounts, 107(1):48–55, (2001).

[141] A. Fouqueau, S. Mer, M. E. Casida, L. M. Lawson Daku, A. Hauser, T. Mineva, and F. Neese. The Journal of Chemical Physics, 120(20):9473–9486, (2004).

[142] A. Fouqueau, M. E. Casida, L. M. L. Daku, A. Hauser, and F. Neese. The Journal of Chemical Physics, 122(4), (2005).

[143] T. Stein, J. Autschbach, N. Govind, L. Kronik, and R. Baer. The Journal of Physical Chemistry Letters, 3(24):3740–3744, (2012).

[144] R. Baer, E. Livshits, and U. Salzner. Annual Review of Physical Chemistry, 61(1):85–109, (2010).

[145] J. H. Skone, M. Govoni, and G. Galli. Physical Review B, 89:195112, (2014).

[146] T. Weymuth and M. Reiher. International Journal of Quantum Chemistry, 115(2):90–98, (2015).

[147] Petachem. http://www.petachem.com. Accessed on November 20, 2015.

[148] J. P. Perdew. Phys. Rev. B, 33:8822–8824, (1986).

[149] A. D. Becke. The Journal of Chemical Physics, 107(20):8554–8560, (1997).

[150] J. Kästner, J. M. Carr, T. W. Keal, W. Thiel, A. Wander, and P. Sherwood. The Journal of Physical Chemistry A, 113(43):11856–11865, (2009).

[151] E. Glendening, J. Badenhoop, A. Reed, J. Carpenter, J. Bohmann, C. Morales, C. Landis, and F. Weinhold. Theoretical Chemistry Institute, University of Wisconsin, Madison, (2013).

[152] A. E. Reed, R. B. Weinstock, and F. Weinhold. The Journal of Chemical Physics, 83(2):735–746, (1985).

[153] J. P. Perdew, K. Burke, and M. Ernzerhof. Phys. Rev. Lett., 77:3865–3868, (1996).

[154] Y. Zhao and D. G. Truhlar. The Journal of Chemical Physics, 125(19), (2006).

[155] P. Gütlich, Y. Garcia, and H. A. Goodwin. Chemical Society Reviews, 29:419–427, (2000).

[156] D. N. Bowman and E. Jakubikova. Inorganic Chemistry, 51(11):6011–6019, (2012).

[157] A. Sorkin, M. A. Iron, and D. G. Truhlar. Journal of Chemical Theory and Computation, 4(2):307–315, (2008).

[158] C. Anthon and C. E. Schäffer. Coordination Chemistry Reviews, 226(12):17–38, (2002).

[159] C. Anthon, J. Bendix, and C. E. Schäffer. Inorganic Chemistry, 42(13):4088–4097, (2003).

[160] J. Moens, G. Roos, P. Jaque, F. DeProft, and P. Geerlings. Chemistry A European Journal, 13(33):9331–9343, (2007).

[161] W. Moffitt and C. J. Ballhausen. Annual Review of Physical Chemistry, 7(1):107–136, (1956).

[162] A. Kramida, Y. Ralchenko, J. Reader, and T. N. A. Team. (2014). NIST: Atomic Spectra Database, NIST, 2014.

[163] J. P. Perdew and K. Schmidt. AIP Conference Proceedings, 577(1):1–20, (2001).

[164] A. Seidl, A. Görling, P. Vogl, J. A. Majewski, and M. Levy. Phys. Rev. B, 53:3764–3774, (1996).

[165] A. D. Becke. The Journal of Chemical Physics, 98(2):1372–1377, (1993).

[166] J. Heyd, G. E. Scuseria, and M. Ernzerhof. The Journal of Chemical Physics, 118(18):8207–8215, (2003).

[167] E. A. C. Bushnell and J. W. Gauld. Journal of Computational Chemistry, 34(2):141–148, (2013).

[168] V. Polo, E. Kraka, and D. Cremer. Molecular Physics, 100(11):1771–1790, (2002).

[169] J. L. F. Da Silva, M. V. Ganduglia-Pirovano, J. Sauer, V. Bayer, and G. Kresse. Phys. Rev. B, 75:045121, (2007).

[170] M. Schlipf, M. Betzinger, C. Friedrich, M. Ležaić, and S. Blügel. Phys. Rev. B, 84:125142, (2011).

[171] J. Sun, A. Ruzsinszky, and J. P. Perdew. Phys. Rev. Lett., 115:036402, (2015).

[172] J. P. Perdew, S. Kurth, A. Zupan, and P. Blaha. Phys. Rev. Lett., 82:2544–2547, (1999).

[173] T. Van Voorhis and G. E. Scuseria. The Journal of Chemical Physics, 109(2):400–410, (1998).

[174] R. Peverati and D. G. Truhlar. The Journal of Physical Chemistry Letters, 3(1):117–124, (2012).

[175] H. S. Yu, X. He, and D. G. Truhlar. Journal of Chemical Theory and Computation, 12(3):1280–1293, (2016).

[176] J. P. Perdew, A. Ruzsinszky, G. I. Csonka, L. A. Constantin, and J. Sun. Phys. Rev. Lett., 103:026403, (2009).

[177] V. Anisimov, J. Zaanen, and O. Andersen. Physical Review B, 44:943, (1991).

[178] V. Anisimov and O. Gunnarsson. Physical Review B, 43:7570, (1991).

[179] H. Kulik et al. Journal of the American Chemical Society, 131:14426, (2009).

[180] H. Kulik and N. Marzari. Journal of Chemical Physics, 135:194105, (2011).

[181] Y. Zhao and D. G. Truhlar. Theoretical Chemistry Accounts, 120(1):215–241, (2008).

[182] D. A. Kitchaev, H. Peng, Y. Liu, J. Sun, J. P. Perdew, and G. Ceder. Phys. Rev. B, 93:045132, (2016).

[183] F. Tran, J. Stelzl, and P. Blaha. The Journal of Chemical Physics, 144(20), (2016).

[184] S. Moroni, D. M. Ceperley, and G. Senatore. Phys. Rev. Lett., 75:689–692, (1995).

[185] E. H. Lieb and S. Oxford. International Journal of Quantum Chemistry, 19(3):427–439, (1981).

[186] J. P. Perdew, A. Ruzsinszky, J. Tao, G. I. Csonka, and G. E. Scuseria. Phys. Rev. A, 76:042506, (2007).

[187] V. N. Staroverov, G. E. Scuseria, J. Tao, and J. P. Perdew. The Journal of Chemical Physics, 119(23):12129–12137, (2003).

[188] V. N. Staroverov, G. E. Scuseria, J. Tao, and J. P. Perdew. Phys. Rev. B, 69:075102, (2004).

[189] Z.-H. Yang, H. Peng, J. Sun, and J. P. Perdew. Phys. Rev. B, 93:205205, (2016).

[190] J. Sun, R. Remsing, and J. P. Perdew. Nature Chemistry, (2016).

[191] M. W. Schmidt, K. K. Baldridge, J. A. Boatz, S. T. Elbert, M. S. Gordon, J. H. Jensen, S. Koseki, N. Matsunaga, K. A. Nguyen, S. Su, T. L. Windus, M. Dupuis, and J. A. Montgomery. Journal of Computational Chemistry, 14(11):1347–1363, (1993).

[192] T. H. Dunning. The Journal of Chemical Physics, 90(2):1007–1023, (1989).

[193] R. Ditchfield, W. J. Hehre, and J. A. Pople. The Journal of Chemical Physics, 54(2):724–728, (1971).

[194] F. Weigend and R. Ahlrichs. Phys. Chem. Chem. Phys., 7:3297–3305, (2005).

[195] E. I. Ioannidis, T. Z. H. Gani, and H. J. Kulik. Journal of Computational Chemistry, 37(22):2106–2117, (2016).

[196] R. F. W. Bader. Chemical Reviews, 91(5):893–928, (1991).

[197] T. Lu and F. Chen. Journal of Computational Chemistry, 33(5):580–592, (2012).

[198] A. Domingo, M. Àngels Carvajal, and C. de Graaf. International Journal of Quantum Chemistry, 110(2):331–337, (2010).

[199] K. Pierloot and S. Vancoillie. The Journal of Chemical Physics, 125(12), (2006).

[200] C. Kupper, A. Schober, S. Demeshko, M. Bergner, and F. Meyer. Inorganic Chemistry, 54(7):3096–3098, (2015).

[201] N. M. O’Boyle, C. Morley, and G. R. Hutchison. Chemistry Central Journal, 2(1):1–5, (2008).

[202] K. Müller-Dethlefs and P. Hobza. Chemical Reviews, 100(1):143–168, (2000).

[203] E. R. Gillies and J. M. J. Fréchet. Drug Discovery Today, 10(1):35–43, (2005).

[204] Python Software Foundation. Python Language Reference. http://www.python.org. Accessed on March 30, 2016.

[205] A. Dalby, J. G. Nourse, W. D. Hounshell, A. K. I. Gushurst, D. L. Grier, B. A. Leland, and J. Laufer. Journal of Chemical Information and Computer Sciences, 32(3):244–255, (1992).

[206] T. A. Halgren. Journal of Computational Chemistry, 17(5-6):490–519, (1996).

[207] A. K. Rappe, C. J. Casewit, K. S. Colwell, W. A. Goddard, and W. M. Skiff. Journal of the American Chemical Society, 114(25):10024–10035, (1992).

[208] eMolecules database. http://www.emolecules.com. Accessed on March 30, 2016.

[209] I. S. Ufimtsev and T. J. Martinez. Journal of Chemical Theory and Computation, 5(10):2619–2628, (2009).

[210] Y. Shao, L. F. Molnar, Y. Jung, J. Kussmann, C. Ochsenfeld, S. T. Brown, A. T. Gilbert, L. V. Slipchenko, S. V. Levchenko, D. P. O’Neill, R. A. DiStasio Jr, R. C. Lochan, T. Wang, G. J. Beran, N. A. Besley, J. M. Herbert, C. Yeh Lin, T. Van Voorhis, S. Hung Chien, A. Sodt, R. P. Steele, V. A. Rassolov, P. E. Maslen, P. P. Korambath, R. D. Adamson, B. Austin, J. Baker, E. F. C. Byrd, H. Dachsel, R. J. Doerksen, A. Dreuw, B. D. Dunietz, A. D. Dutoi, T. R. Furlani, S. R. Gwaltney, A. Heyden, S. Hirata, C.-P. Hsu, G. Kedziora, R. Z. Khalliulin, P. Klunzinger, A. M. Lee, M. S. Lee, W. Liang, I. Lotan, N. Nair, B. Peters, E. I. Proynov, P. A. Pieniazek, Y. Min Rhee, J. Ritchie, E. Rosta, C. David Sherrill, A. C. Simmonett, J. E. Subotnik, H. Lee Woodcock III, W. Zhang, A. T. Bell, A. K. Chakraborty, D. M. Chipman, F. J. Keil, A. Warshel, W. J. Hehre, H. F. Schaefer III, J. Kong, A. I. Krylov, P. M. W. Gill, and M. Head-Gordon. Physical Chemistry Chemical Physics, 8:3172–3191, (2006).

[211] V. Saunders and I. Hillier. International Journal of Quantum Chemistry, 7(4):699–705, (1973).

[212] Grid engine. http://gridscheduler.sourceforge.net/. Accessed on November 20, 2015.

[213] A. B. Yoo, M. A. Jette, and M. Grondona. Job Scheduling Strategies for Parallel Processing: 9th International Workshop, JSSPP 2003, Seattle, WA, USA, June 24, 2003. Revised Paper, pages 44–60. Springer Berlin Heidelberg, (2003).

[214] G. Schaftenaar and J. H. Noordik. Journal of computer-aided molecular design, 14(2):123–134, (2000).

[215] C. Fonseca Guerra, J.-W. Handgraaf, E. J. Baerends, and F. M. Bickelhaupt. Journal of Computational Chemistry, 25(2):189–210, (2004).

[216] F. L. Hirshfeld. Theoretica Chimica Acta, 44(2):129–138, (1977).

[217] A. D. Becke and K. E. Edgecombe. The Journal of Chemical Physics, 92(9):5397–5403, (1990).

[218] B. Hammer, Y. Morikawa, and J. K. Nørskov. Physical Review Letters, 76(12):2141, (1996).

[219] N. M. O’Boyle, A. L. Tenderholt, and K. M. Langner. Journal of Computational Chemistry, 29(5):839–845, (2008).

[220] G. Hermann, V. Pohl, J. C. Tremblay, B. Paulus, H.-C. Hege, and A. Schild. Journal of Computational Chemistry, 37(16):1511–1520, (2016).

[221] T. Z. H. Gani, E. I. Ioannidis, and H. J. Kulik. Chemistry of Materials, (2016).

[222] S. Toma and R. Sebesta. Synthesis, 47:1683–1695, (2015).

[223] T. K. Hyster. Ferrocenium Salts. John Wiley & Sons, Ltd, (2001).

[224] M. J. Queensen, J. M. Rabus, and E. B. Bauer. Journal of Molecular Catalysis A: Chemical, 407:221–229, (2015).

[225] L. Pena, A. Seidl, L. Cohen, and P. Hoggard. Transition Metal Chemistry, 34(2):135–141, (2009).

[226] A. A. J. Torriero, M. J. A. Shiddiky, I. Burgar, and A. M. Bond. Organometallics, 32(20):5731–5739, (2013).

[227] X. Mao, W. Tian, J. Wu, G. C. Rutledge, and T. A. Hatton. Journal of the American Chemical Society, 137(3):1348–1355, (2015).

[228] Y. Yang and L. Yu. Physical Chemistry Chemical Physics, 15:2669–2683, (2013).

[229] D. L. Stone and D. K. Smith. Polyhedron, 22(5):763–768, (2003).

[230] S. J. Coles, G. Denuault, P. A. Gale, P. N. Horton, M. B. Hursthouse, M. E. Light, and C. N. Warriner. Polyhedron, 22(5):699–709, (2003).

[231] J.-B. Zhuo, C.-Y. Zhang, C.-X. Lin, S. Bai, L.-L. Xie, and Y.-F. Yuan. Journal of Organometallic Chemistry, 763-764:34–43, (2014).

[232] S. Grimme, J. Antony, S. Ehrlich, and H. Krieg. The Journal of Chemical Physics, 132(15):154104, (2010).

[233] S. Grimme, S. Ehrlich, and L. Goerigk. Journal of Computational Chemistry, 32(7):1456–1465, (2011).

[234] A. W. Lange and J. M. Herbert. The Journal of Chemical Physics, 133(24):244111, (2010).

[235] J. Herbert and A. Lange. The Polarizable Continuum Model for Molecular Electrostatics: Basic Theory, Recent Advances, and Future Challenges. Springer, (2014).

[236] F. Liu, N. Luehr, H. J. Kulik, and T. J. Martinez. Journal of Chemical Theory and Computation, 11(7):3131–3144, (2015).

[237] S. Boys and F. Bernardi. Molecular Physics, 19(4):553–566, (1970).

[238] G. R. Desiraju and T. Steiner. The weak hydrogen bond: in structural chemistry and biology, volume 9. Oxford university press, (2001).

[239] M. Rahm. Journal of Chemical Theory and Computation, 11(8):3617–3628, (2015).

[240] B. A. Springer, S. G. Sligar, J. S. Olson, and G. N. Phillips Jr. Chemical Reviews, 94(3):699–714, (1994).

[241] C. Rovira and M. Parrinello. Biophysical Journal, [78.

[242] C. C. Hsia. New England Journal of Medicine, 338(4):239–248, (1998).

[243] B. Meunier, S. P. de Visser, and S. Shaik. Chemical Reviews, 104(9):3947–3980, (2004).

[244] W. Zeng, N. J. Silvernail, D. C. Wharton, G. Y. Georgiev, B. M. Leu, W. R. Scheidt, J. Zhao, W. Sturhahn, E. E. Alp, and J. T. Sage. Journal of the American Chemical Society, 127(32):11200–11201, (2005).

[245] T. Karpuschkin, M. M. Kappes, and O. Hampe. Angewandte Chemie International Edition, 52(39):10374–10377, (2013).

[246] M.-S. Liao and S. Scheiner. The Journal of Chemical Physics, 117(1):205–219, (2002).

[247] C. Rovira, K. Kunc, J. Hutter, P. Ballone, and M. Parrinello. The Journal of Physical Chemistry A, 101(47):8914–8925, (1997).

[248] K. P. Kepp and P. Dasmeh. The Journal of Physical Chemistry B, 117(14):3755–3770, (2013).

[249] M. Radoń. Journal of Chemical Theory and Computation, 10(6):2306–2321, (2014).

[250] M. Radoń. Inorganic Chemistry, 54(12):5634–5645, (2015).

[251] G. G. Gibson and P. P. Tamburini. Xenobiotica, 14(1-2):27–47, (1984).

[252] C. C. Hsia. New England Journal of Medicine, 338(4):239–248, (1998). PMID: 9435331.

[253] B. Meunier, S. P. de Visser, and S. Shaik. Chemical Reviews, 104(9):3947–3980, (2004). PMID: 15352783.

[254] F. Ogliaro, S. P. de Visser, and S. Shaik. Journal of Inorganic Biochemistry, 91(4):554–567, (2002). Advances in the Inorganic Biochemistry of Cytochrome P450.

[255] C. Rovira. Journal of Physics: Condensed Matter, 15(18):S1809, (2003).

[256] L.-P. Wang and C. Song. The Journal of Chemical Physics, 144(21), (2016).

[257] O. P. Charkin, N. M. Klimenko, D. O. Charkin, and S. H. Lin. Russian Journal of Inorganic Chemistry, 52(8):1248–1261, (2007).

[258] K. M. Kadish, K. M. Smith, and R. Guilard. The Porphyrin Handbook, volume 3. Academic Press, (2002).

[259] T. E. Shubina and T. Clark. Journal of Coordination Chemistry, 63(14-16):2854–2867, (2010).

[260] M.-S. Liao, M.-J. Huang, and J. D. Watts. The Journal of Physical Chemistry A, 114(35):9554–9569, (2010).

[261] BCC Research, (2014).

[262] Grand View Research, (2017).

[263] Albemarle Corporation, (2016).

[264] Art Catalyst catalog, (2016).
