A generally applicable atomic-charge dependent London dispersion correction

Eike Caldeweyher,1 Sebastian Ehlert,1 Andreas Hansen,1 Hagen Neugebauer,1 Sebastian Spicher,1 Christoph Bannwarth,1,2 and Stefan Grimme1,a)
1) Mulliken Center for Theoretical Chemistry, Institut für Physikalische und Theoretische Chemie der Universität Bonn, Beringstr. 4, D-53115 Bonn, Germany
2) Department of Chemistry, Stanford University, Stanford, CA 94305, United States of America
(Dated: 25 January 2019)

The D4 model is presented for the accurate computation of London dispersion interactions in density functional theory approximations (DFT-D4) and generally for atomistic modeling methods. In this successor to the DFT-D3 model, the atomic coordination-dependent dipole polarizabilities are scaled based on atomic partial charges which can be taken from various sources. For this purpose, a new charge-dependent parameter-economic scaling function is designed. Classical charges are obtained from an atomic electronegativity equilibration procedure for which efficient analytical derivatives are developed. A numerical Casimir-Polder integration of the atom-in-molecule dynamic polarizabilities yields charge- and geometry-dependent dipole-dipole dispersion coefficients. Similar to the D3 model, the dynamic polarizabilities are pre-computed by time-dependent DFT and elements up to radon are covered. For a benchmark set of 1225 dispersion coefficients, the D4 model achieves an unprecedented accuracy with a mean relative deviation of 3.8% compared to 4.7% for D3. In addition to the two-body part, three-body effects are described by an Axilrod-Teller-Muto term. A common many-body dispersion expansion was extensively tested and an energy correction based on D4 polarizabilities is found to be advantageous for some larger systems. Becke-Johnson-type damping parameters for DFT-D4 are determined for more than 60 common functionals. For various energy benchmark sets DFT-D4 slightly outperforms DFT-D3. Especially for metal-containing systems, the introduced charge dependence improves thermochemical properties. We suggest (DFT-)D4 as a physically improved and more sophisticated dispersion model in place of DFT-D3 for DFT calculations as well as for other low-cost approaches like semi-empirical models.

Keywords: density functional theory, London dispersion, non-covalent interactions, thermochemistry

I. INTRODUCTION

Many computational studies have shown that dispersion-corrected Kohn-Sham density functional theory (abbreviated as DFT in the following) is currently the method of choice for the routine computation of the electronic and geometric structure of large systems [1,2], e.g., in the fields of supra-molecular chemistry [3–5], catalysis [6,7], or in materials science [8–10]. In contrast to more elaborate wave function theory (WFT) based methods, most density functional approximations (DFAs) are not able to describe long-ranged electron correlation effects [11–14]. Their treatment is, however, important to compute energetic properties with high accuracy (less than about 1 kcal mol^-1 error), particularly for non-covalently bonded or condensed phase systems. Therefore, various correction schemes have been developed to describe so-called London dispersion interactions in a DFT framework [15].

The most commonly used method for molecular applications is the so-called DFT-D3 scheme [16], which calculates the inter- and intramolecular dispersion interactions only by employing the given system coordinates (and atomic numbers). Similar atom pair-wise models, which additionally include information from the electron density, have been reviewed recently [15] and are only mentioned briefly here. Among them are the exchange hole dipole moment [14,17–19] (XDM) model and the Tkatchenko-Scheffler [20] (TS) model, both employing Hirshfeld partitioning. It is also possible to directly develop non-local density functionals, which are inherently capable of describing London dispersion interactions. This way, atomic partitioning, which always involves some arbitrariness, is avoided. Here, the family of van der Waals density functionals is to be mentioned, which are based on the fundamental adiabatic connection theory and offer a rigorous basis for the design of dispersion-inclusive exchange-correlation functionals [21,22]. A simplified construction scheme for the non-local correlation part has been introduced by Vydrov and Van Voorhis [23,24].

The computational costs of incorporating dispersion corrections into a standard DFT treatment are very method dependent but generally smaller compared to the actual DFT calculation. Consequently, it is reasonable to include them by default, and it has been demonstrated that the accuracy of dispersion-corrected DFAs for thermochemical properties on average follows the Jacob's ladder [25] classification. Note that for very low-cost atomistic models like semi-empirical molecular orbital or force-field methods, only non-density dependent schemes like DFT-D3 (and the here proposed D4) are computationally feasible.

The current work describes the further development of the widely used semi-classical DFT-D3 approach. Recently, it was shown that this "geometry-only" model can be further improved by addition of atomic charge information [26]. Therein we showed that computed atom-in-molecule dynamic polarizabilities can be scaled by means of an element specific function with Mulliken-type atomic charges as input.

Here we want to report on the final version of the D4 model and provide it in a usable form for a large number of density functionals as DFT-D4. Overall, we retained the general idea and strong points of the well established D3 scheme and introduced the charge dependence as well as some less important, mostly technical improvements. Compared to the scaling scheme in Ref. 26, a less empirical function is used (see below) and the Mulliken partial charges are replaced by default with classical electronegativity equilibration (EEQ) partial charges as recently described by Goedecker et al. [27]. The D4 scheme is, however, general in the sense that any type of atomic charges in addition to the geometric structure of the system can be used as input.

Previous studies have revealed that many-body dispersion interactions beyond a pair-wise picture are important, e.g., in supra-molecular [28,29], in cluster [30,31], and in condensed-phase systems [32–36] where so-called Dobson type B effects [37] play an important role. The use of a coupled-dipole based many-body dispersion (MBD) correction (cf. the MBD method in Refs. 38,39) in our implementation does not yield systematic improvements compared to the simple third-order Axilrod-Teller-Muto [40,41] (ATM) term. However, the MBD correction proves to be useful for large systems (see non-covalent benchmarks in section III), for which we recommend its application in terms of an energy correction.

The theory and technical details of the D4 method are described in section II. Subsequently, in section III, results for dispersion coefficients with D4 as well as for energies and structures obtained with DFT-D4 are compared directly to those of other established dispersion correction schemes for the same underlying DFA. Finally, a summary and an outlook on possible future work will be given.

a) Corresponding author: [email protected]

[Figure 1: flowchart. Input: xyz coordinates and total charge q_tot. Charges: EEQ (default) or, alternatively (optional), TB or DFT charges, giving z^A = Z^A + q^A. Stored references: α^{A,ref}(iω), z^{A,ref}, CN^{A,ref}. At runtime: 1) scaling to α^{A,ref}(iω, z^A), 2) averaging to α^A(iω, z^A, CN^A), integration to C_6^{AB} and C_9^{ABC}. Output: pair-wise, three-body, and (optional) many-body dispersion energy.]

FIG. 1. Schematic workflow of the D4 program for an example molecule (all definitions and steps are explained in detail in the text).

II. THEORY

Atomic units are used throughout in this work. The DFT-D4 workflow is simplified for an example molecule in Figure 1. The pair-wise dipole-dipole dispersion coefficients in D4 are calculated by numerical integration via the well-known Casimir-Polder relation

$$
C_6^{AB} = \frac{3}{\pi} \int_0^{\infty} d\omega \, \alpha^{A}(i\omega)\, \alpha^{B}(i\omega). \tag{1}
$$

At runtime, the isotropically averaged atomic dynamic polarizabilities at imaginary frequency α(iω) in equation 1 are obtained in two steps: The first step incorporates an atomic partial charge dependent scaling of atomic reference polarizabilities. For this purpose, a new charge-scaling function ζ has been designed in this work given by

$$
\zeta(z^{A}, z^{A,\mathrm{ref}}) = \exp\left[ \beta_1 \left\{ 1 - \exp\left[ \gamma^{A} \left( 1 - \frac{z^{A,\mathrm{ref}}}{z^{A}} \right) \right] \right\} \right], \tag{2}
$$

which is sketched exemplarily in Figure 2. In equation 2, the chemical hardness γ^A is taken from Ref. 42 and serves as a non-fitted element-specific parameter to control the steepness of the scaling function. The value of the global parameter β_1 = 3 was determined by inspection. The empirical charge-dependent scaling in D4 is intended to increase the magnitude of the atomic dynamic dipole polarizabilities α(iω) for larger number of electrons in proximity to the considered atom. Independent from the actual DFA which is to be corrected, environment dependent atomic partial charges are taken as input.
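As a minimal sketch of how the Casimir-Polder relation of equation 1 can be evaluated numerically, the following Python snippet applies a trapezoidal rule to tabulated polarizabilities. The single-oscillator `model_alpha`, its parameters, and the dense frequency grid are assumptions made purely for illustration; they are not the D4 reference polarizabilities or the D4 frequency grid. For two single oscillators the integral has London's closed form, which serves as a check.

```python
import numpy as np

def c6_casimir_polder(alpha_a, alpha_b, omega):
    """Trapezoidal estimate of C6 = (3/pi) * int_0^inf alpha_A(iw) * alpha_B(iw) dw."""
    alpha_a = np.asarray(alpha_a, dtype=float)
    alpha_b = np.asarray(alpha_b, dtype=float)
    omega = np.asarray(omega, dtype=float)
    integrand = alpha_a * alpha_b
    # Sum over intervals of (w[j+1] - w[j]) * (f[j+1] + f[j]) / 2.
    integral = np.sum(np.diff(omega) * (integrand[1:] + integrand[:-1])) / 2.0
    return 3.0 / np.pi * integral

def model_alpha(omega, alpha0, omega0):
    """Hypothetical single-oscillator model: alpha(iw) = alpha0 / (1 + (w / omega0)^2)."""
    return alpha0 / (1.0 + (omega / omega0) ** 2)

# Illustrative dense grid (an assumption, not the grid used in D4).
omega = np.concatenate(([0.0], np.logspace(-6, 3, 4000)))
alpha_a = model_alpha(omega, alpha0=10.0, omega0=0.5)
alpha_b = model_alpha(omega, alpha0=4.0, omega0=0.8)

c6_numeric = c6_casimir_polder(alpha_a, alpha_b, omega)
# London's closed form for two single oscillators, used only as a check.
c6_london = 1.5 * 10.0 * 4.0 * (0.5 * 0.8) / (0.5 + 0.8)
```

With a coarser grid the same function still applies; only the quadrature error grows.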

The advantage is that such external partial charges can be (independently from the actual calculation) made accurate and robust, and that the D4 dispersion coefficients remain functional independent. By default, classical EEQ type partial charges are used as descriptor for the electron density change (see section II A). Other common options for the choice of the charges are discussed below.

[Figure 2: plot of ζ against z^{A,ref}/z^A, with β_1 and γ^A set to unity.]

FIG. 2. The ζ-function is shown, which scales the time-dependent density functional theory (TD-DFT) computed dynamic polarizabilities α(iω) depending on the calculated effective nuclear charge z^A. The scaling depends on the quotient of the latter and the pre-calculated charge z^{A,ref} of the atom in the reference system for which the dynamic polarizability has been computed. If both effective charges are equal, no scaling is performed (crossing point of dotted lines) and the reference polarizability is taken.

In equation 2, we follow the definition of the effective nuclear charge z^A used also in Ref. 26 as the sum of the nuclear charge of atom A and the atomic partial charge q^A

$$
z^{A} = Z^{A} + q^{A}, \tag{3}
$$

and introduce effective nuclear charges for element specific reference systems z^{A,ref}, respectively. For all elements beyond krypton, we consistently employ modified nuclear charges Z' as defined in Table I, due to the use of effective core potentials (ECPs) in the computation of the dynamic polarizability reference values.

TABLE I. Effective nuclear charges Z' used in the D4 charge scaling function for elements beyond krypton. The subtracted number corresponds to the number of core electrons absorbed in the ECP used (default def2-ECPs in TURBOMOLE [43–45]).

Element    Z'
Rb–Xe      Z − 18
Cs–La      Z − 46
Ce–Lu      Z − 28
Hf–Rn      Z − 60

In comparison to the initially proposed charge-scaling function presented in Ref. 26, the two element-specific parameters could be discarded, resulting in a global scaling function with only one empirical parameter. The charge-dependent atomic dynamic polarizability for a single reference system of atom A is given by the product of α^{A,ref}(iω) and its scaling function as

$$
\alpha^{A,\mathrm{ref}}(i\omega, z^{A}) = \alpha^{A,\mathrm{ref}}(i\omega)\, \zeta(z^{A}, z^{A,\mathrm{ref}}). \tag{4}
$$

In equation 4, atom-in-molecule dynamic polarizabilities of element specific reference systems α^{A,ref}(iω) are utilized. Since such atom-in-molecule dynamic polarizabilities cannot be calculated directly, molecular dynamic polarizabilities of the reference systems A_mX_n (having m chemically equivalent atoms A and n chemically equivalent X atoms) are used as in the D3 model. From these molecular dynamic polarizabilities α^{A_mX_n}(iω), the contribution of the n X atoms is subtracted to obtain atom-in-molecule dynamic polarizabilities α^{A,ref}(iω) for atom A in different chemical environments. Here, the approximate additivity of polarizabilities [46] is exploited to generate atom-in-molecule dynamic polarizabilities according to the following partitioning scheme

$$
\alpha^{A,\mathrm{ref}}(i\omega) = \frac{1}{m} \left[ \alpha^{A_mX_n}(i\omega) - \frac{n}{l}\, \alpha^{X_l}(i\omega)\, \zeta(z^{X}, z^{X,\mathrm{ref}}) \right]. \tag{5}
$$

Equation 5 directly considers the charge scaling of all X atoms in the respective reference system. l is a stoichiometric factor specific to the reference molecule of element X. The effective nuclear charges z^{X,ref} entering equation 5 are constant values determined once for the respective reference system. By subtracting charge-scaled polarizabilities of the atoms X, the partitioning of molecular polarizabilities into atomic contributions changes with respect to D3, depending on the charge of the atoms X within each reference system. In D4 the polarizability of atom A is either increased (positive partial charge at X, subtract lower amount from A) or decreased (negative partial charge at X, subtract higher amount from A) when comparing with polarizabilities obtained by subtracting neutral X atoms.

The second step in the D4 procedure is the geometry-based interpolation over the charge-scaled element specific reference systems. In order to enable a geometric interpolation of all α^{A,ref}(iω, z^A) of atom A, a weighting procedure similar to the fractional coordination number (CN) based scheme in D3 is used. As already described in Ref. 26, the CN used within D4 is, however, slightly modified and includes an electronegativity difference dependence for the respective element-pair, i.e.,

$$
\begin{aligned}
CN^{A} &= \sum_{B \neq A} \frac{\delta_{AB}^{EN}}{2} \left[ 1 + \mathrm{erf}\left( -k_0\, \frac{R_{AB} - R_{AB}^{\mathrm{cov}}}{R_{AB}^{\mathrm{cov}}} \right) \right], \\
\delta_{AB}^{EN} &= k_1 \exp\left[ -\frac{\left( |EN_A - EN_B| + k_2 \right)^2}{k_3} \right].
\end{aligned} \tag{6}
$$

In equation 6, Pauling electronegativities [47] (EN), the inter-nuclear distance R_AB of the pair AB, and the covalent atomic radii [48] (R_AB^cov = R_A^cov + R_B^cov) are used. The parameters in equation 6 (k_0 = 7.5, k_1 = 4.1, k_2 = 19.09, and k_3 = 254.56) were obtained by fitting CN values to GFN2-xTB [49] derived Wiberg bond orders [50] of singly bonded diatomic molecules. The exponential used in D3 is replaced here by an error function to avoid a divergence of the CN in applications for dense systems under periodic boundary conditions [51]. A graphical comparison of CN values is shown for D4 and D3 in Figure 3(a) for a catalyst frequently used in organometallic synthesis as an example.

[Figure 3: panel (a) coordination numbers, panel (b) static polarizabilities; numerical atom labels not reproduced.]

FIG. 3. (a) Coordination numbers of selected atoms for the first generation Hoveyda-Grubbs catalyst [52,53] given as example. Black/red (brackets) values show CNs for D4/D3. (b) Example molecule with depicted static atom-in-molecule polarizabilities α(0) (in Bohr^3) from D4 (black) according to equation 5 and for D3 (red, in brackets). Values for carbon/hydrogen are given in bold/italic font.

All reference coordination numbers, termed CN^{A,ref}, are pre-calculated and stored in the code such that a weighting function W_A^{A,ref} can efficiently be used to generate system-specific charge- and geometry-dependent atom-in-molecule polarizabilities for atom A, abbreviated as α^A(iω) for clarity

$$
\alpha^{A}(i\omega) \equiv \alpha^{A,\mathrm{ref}}(i\omega, z^{A}, CN^{A}) = \sum_{A,\mathrm{ref}=1}^{N^{A,\mathrm{ref}}} \alpha^{A,\mathrm{ref}}(i\omega, z^{A})\, W_{A}^{A,\mathrm{ref}}. \tag{7}
$$

The contribution of each reference value α^{A,ref}(iω, z^A) to the final atom-in-molecule polarizability of atom A is given by

$$
W_{A}^{A,\mathrm{ref}}(CN^{A}, CN^{A,\mathrm{ref}}) = \frac{\sum_{j=1}^{N^{s}} \exp\left[ -\beta_2\, j \left( CN^{A} - CN^{A,\mathrm{ref}} \right)^2 \right]}{\sum_{A,\mathrm{ref}=1}^{N^{A,\mathrm{ref}}} \sum_{j=1}^{N^{s}} \exp\left[ -\beta_2\, j \left( CN^{A} - CN^{A,\mathrm{ref}} \right)^2 \right]}, \tag{8}
$$

for N^{A,ref} reference systems per element (note that $\sum_{A,\mathrm{ref}=1}^{N^{A,\mathrm{ref}}} W_{A}^{A,\mathrm{ref}} = 1$). The parameter β_2 = 6 is adjusted manually to guarantee a smooth weighting function. In contrast to D3, the Gaussian weighting is changed such that several Gaussian functions can be used for single reference systems as shown in equation 8. Generally the number of Gaussian functions N^s is obtained once for every reference system and used at runtime as described in equation 8. The procedure of setting N^s for different reference systems is exemplified in Figure 4 to explain the principle of the weighting scheme. Differences between CN values of reference systems are used to indicate when another more compact Gaussian is necessary. As indicated in Figure 4(a), reference systems with a sufficiently large CN difference can easily be distinguished from each other within the Gaussian weighting procedure in equation 8 for N^s = 1. This is no longer the case if the reference CN difference between two systems approaches smaller values, as present between reference system A,ref2 and A,ref3 (denoted as ∆CN_23 in Figure 4). The overlap between the Gaussian functions placed on A,ref2 and A,ref3 (shown in white) makes it considerably more difficult to differentiate between both reference systems. For this reason, the set of Gaussian functions is enlarged, by increasing N^s, for these near lying reference systems, as shown in Figure 4(b). This way, less overlapping functions are added, which makes both reference systems distinguishable within the Gaussian weighting procedure. By following this strategy it is possible to generate a smooth weighting scheme without any discontinuities.
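The two steps described above (charge scaling, then CN-based averaging) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the D4 implementation: the function names, the argument layout, and all numerical inputs in the usage note are hypothetical, while β_1 = 3, β_2 = 6, and the k parameters are the values quoted in the text for equations 2, 6, and 8.

```python
import math

BETA1 = 3.0  # global parameter of equation 2
BETA2 = 6.0  # Gaussian exponent of equation 8

def zeta(z, z_ref, gamma):
    """Charge-scaling function of equation 2."""
    return math.exp(BETA1 * (1.0 - math.exp(gamma * (1.0 - z_ref / z))))

def delta_en(en_a, en_b, k1=4.1, k2=19.09, k3=254.56):
    """Electronegativity-dependent prefactor of equation 6."""
    return k1 * math.exp(-((abs(en_a - en_b) + k2) ** 2) / k3)

def coordination_number(a, coords, en, rcov, k0=7.5):
    """Error-function based CN of atom a (equation 6)."""
    cn = 0.0
    for b in range(len(coords)):
        if b == a:
            continue
        r = math.dist(coords[a], coords[b])
        r_cov = rcov[a] + rcov[b]
        cn += 0.5 * delta_en(en[a], en[b]) * (1.0 + math.erf(-k0 * (r - r_cov) / r_cov))
    return cn

def gaussian_weights(cn, cn_refs, n_gauss):
    """Normalized weights of equation 8; n_gauss holds N^s per reference system."""
    raw = [
        sum(math.exp(-BETA2 * j * (cn - cn_ref) ** 2) for j in range(1, ns + 1))
        for cn_ref, ns in zip(cn_refs, n_gauss)
    ]
    # Note: if cn is very far from every reference, all terms underflow to zero
    # and a production code would need a fallback; this sketch does not handle it.
    norm = sum(raw)
    return [w / norm for w in raw]

def alpha_atom(alpha_refs, z, z_refs, gamma, cn, cn_refs, n_gauss):
    """Equations 4 and 7: scale each reference value by zeta, then average."""
    weights = gaussian_weights(cn, cn_refs, n_gauss)
    return sum(w * a * zeta(z, zr, gamma) for w, a, zr in zip(weights, alpha_refs, z_refs))
```

For example, `alpha_atom([12.0, 6.0], 6.0, [6.0, 6.0], 0.5, 4.0, [0.0, 4.0], [1, 1])` uses two made-up reference systems; since the atom's CN coincides with the second reference and z equals z^{ref}, the result is essentially the second reference value.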

The final charge- and coordination-dependent atom-in-molecule polarizabilities, as obtained from equation 7, hence include the dependence of α(iω) on the spatially closest binding partners (i.e., so-called Dobson type A effects [37]). They are used to calculate pair-wise dispersion coefficients via a numerical Casimir-Polder integration over a fixed number of 23 points (between iω_min = 10^-6 i and iω_max = 10.0i given in Hartree).

$$
\begin{aligned}
C_6^{AB} &\equiv C_6^{AB}(CN^{A}, z^{A}, CN^{B}, z^{B}) \\
&= \frac{3}{2\pi} \sum_{j=1}^{22} (\omega_{j+1} - \omega_{j}) \left[ \alpha^{A}(i\omega_{j+1})\, \alpha^{B}(i\omega_{j+1}) + \alpha^{A}(i\omega_{j})\, \alpha^{B}(i\omega_{j}) \right].
\end{aligned} \tag{9}
$$

The effect of charge scaling and Gaussian weighting on the α^A(0) values with variations in the partial charge (leading to different effective nuclear charges z) and in the coordination number is visualized for carbon and hydrogen in Figure 5.

Furthermore, the lower part of Figure 3 shows static atom-in-molecule polarizabilities for carbon atoms in different hybridization states for the (3Z)-hexen-1-yne molecule. Here, we follow the definition in equation 5 for D4 and the definition given in Ref. 16 for D3 to obtain those polarizabilities. As atomic partitioning schemes generally introduce some arbitrariness, also the individual atom-in-molecule polarizabilities from D3 and D4 differ (by about 10%). However, the physical observable, i.e., the total molecular dispersion coefficient, is similar with the two methods for this rather non-polar compound (C_{6,mol}^{AA}(TD-PBE38/daug-def2-QZVP) = 2103.1, C_{6,mol}^{AA}(D4) = 1949.5, and C_{6,mol}^{AA}(D3) = 1893.0, all given in Hartree Bohr^6). Here, the additivity of pairwise dispersion coefficients in molecules P (atoms p ∈ P) and Q (atoms q ∈ Q) has been used

$$
C_{6,\mathrm{mol}}^{PQ} = \sum_{p} \sum_{q} C_6^{pq}. \tag{10}
$$

[Figure 4: Gaussian weighting functions plotted against CN (0 to 5) for reference systems A,ref1, A,ref2, and A,ref3, with ∆CN_12 and ∆CN_23 marked. Panel (a): equal N^s for all three reference systems. Panel (b): increased set size for A,ref2 and A,ref3.]

FIG. 4. Example for setting N^s for different reference systems. The CN difference between system A,ref1 and A,ref2 is denoted as ∆CN_12. A,ref1 and A,ref2 are easily distinguishable within the Gaussian weighting procedure of equation 8 with N^s = 1, while A,ref2 and A,ref3 are not as could be seen in part (a) of the figure. To circumvent this behaviour the set of Gaussian functions is enlarged for A,ref2 and A,ref3 as it is shown in part (b) by varying the N^s value in equation 8 dynamically.

A. Classical environment dependent partial charges

For the scaling of atom-in-molecule polarizabilities an established classical charge model based on electronegativity equilibration of Gaussian type charge densities is used [27]. It allows the electronic charge to distribute itself in an optimal way over the whole system, includes penetration effects, and thus can describe both neutral and charged systems. Unlike environmental dependent partial charges that are determined by neural networks [54], the adapted method determines the total charge (as sum of all atomic charges) of the system exactly. For this purpose atomic charge densities are used within an isotropic electrostatics (IES) energy expression

$$
E_{IES} = \mathbf{q}^{T} \left( \frac{1}{2} \mathbf{A} \mathbf{q} - \mathbf{X} \right), \tag{11}
$$

where elements of the X vector and elements of the A matrix are given by

$$
X_{A} = -\chi^{A} \quad \text{and} \quad A_{AB} =
\begin{cases}
J^{AA} + \dfrac{2\gamma^{AA}}{\sqrt{\pi}} & A = B \\[2ex]
\dfrac{\mathrm{erf}\left( \gamma^{AB} R_{AB} \right)}{R_{AB}} & \text{otherwise.}
\end{cases} \tag{12}
$$

As atomic parameters appear atomic radii a^A (within γ^{AB} = ((a^A)^2 + (a^B)^2)^{-1/2}), element dependent atomic hardnesses J^{AA}, and the right-hand side (RHS) χ^A. The RHS consists of the atomic electronegativity EN^A which is scaled by the square-root of the error function modified D3 coordination number (termed mCN) that incorporates the environment dependency into the model including an element specific scaling parameter κ^A.

$$
\chi^{A} = EN^{A} - \kappa^{A} \sqrt{mCN^{A}} \tag{13}
$$

The mCN is given for atom A as

$$
mCN^{A} = \frac{1}{2} \sum_{\substack{B=1 \\ B \neq A}} \left[ 1 + \mathrm{erf}\left( -k \left( \frac{R_{AB}}{R_{AB}^{\mathrm{cov}}} - 1 \right) \right) \right] \tag{14}
$$

similar to equation 6 with R_AB^cov = R_A^cov + R_B^cov. Geometry-only dependent partial charges are obtained by solving a set of linear equations under the constraint that the atomic charges sum up to the correct overall charge

$$
\mathcal{L} = E_{IES} + \lambda \left( \sum_{A=1}^{N} q^{A} - q_{tot} \right), \tag{15}
$$

with ∂L/∂q = 0 and ∂L/∂λ = $\sum_{i=1}^{N} q_i - q_{tot}$ = 0. Adding this constraint in terms of a Lagrange multiplier leads to the modified linear system of equations

$$
\begin{pmatrix} \mathbf{A} & \mathbf{1} \\ \mathbf{1}^{T} & 0 \end{pmatrix}
\begin{pmatrix} \mathbf{q} \\ \lambda \end{pmatrix}
=
\begin{pmatrix} \mathbf{X} \\ q_{tot} \end{pmatrix}. \tag{16}
$$

This classical charge model requires five empirical parameters (J^{AA}, a^A, EN^A, κ^A, and R_cov) per element (of which the first four are fitted) and achieves for molecules across the entire periodic table of elements an average deviation of about 0.04 e^- (0.03 e^- for organic molecules) from PBE0 based Hirshfeld charges (see Supplementary Material). The main motivation to propose a classical charge model as default instead of a quantum chemistry based one is the higher robustness in electronically complicated cases for which simple tight-binding or DFT methods may fail to converge properly (see the Supplementary Material for a discussion). Furthermore it enables the use of D4 in combination with fast, non-quantum mechanical approaches.

[Figure 5: two colour maps of the static polarizability as a function of coordination number CN (horizontal axis) and effective nuclear charge z (vertical axis, from anionic to cationic). Panel (a): α^C(0) for carbon. Panel (b): α^H(0) for hydrogen.]

FIG. 5. Visualization of the two-dimensional dependence of static polarizabilities α(0) (in Bohr^3) on charge and coordination state in the D4 model for (a) carbon and for (b) hydrogen. White circles represent α^A(0) values for the reference systems.

For the construction of the analytical gradient of the D4 dispersion energy, the derivatives of the charges with respect to nuclear displacements are required.
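The constrained linear system of equation 16 is small enough to sketch directly. The following Python snippet is a minimal illustration, assuming the per-atom χ^A of equation 13 has already been evaluated; the function name `eeq_charges`, its argument layout, and the parameter values in the usage note are hypothetical and are not fitted EEQ parameters. A generic dense solver (`numpy.linalg.solve`) is used here for brevity.

```python
import numpy as np
from math import erf, sqrt, pi

def eeq_charges(coords, chi, hardness, radii, q_tot=0.0):
    """Solve equation 16 for the charges q and the Lagrange multiplier lambda.

    chi      : chi^A of equation 13 (mCN term already included)
    hardness : J^AA per atom
    radii    : a^A per atom
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    m = np.zeros((n + 1, n + 1))
    for a in range(n):
        # Diagonal element of equation 12, with gamma^AA = (2 * a_A^2)^(-1/2).
        gamma_aa = 1.0 / sqrt(2.0 * radii[a] ** 2)
        m[a, a] = hardness[a] + 2.0 * gamma_aa / sqrt(pi)
        for b in range(a):
            # Off-diagonal element of equation 12.
            gamma_ab = 1.0 / sqrt(radii[a] ** 2 + radii[b] ** 2)
            r = float(np.linalg.norm(coords[a] - coords[b]))
            m[a, b] = m[b, a] = erf(gamma_ab * r) / r
    # Border of ones enforces the total-charge constraint.
    m[:n, n] = 1.0
    m[n, :n] = 1.0
    rhs = np.append(-np.asarray(chi, dtype=float), q_tot)  # X = -chi
    solution = np.linalg.solve(m, rhs)
    return solution[:n], solution[n]
```

Called with two made-up atoms, e.g. `eeq_charges([(0, 0, 0), (0, 0, 2.0)], chi=[0.5, 1.0], hardness=[1.0, 1.0], radii=[1.0, 1.0])`, the charges sum to `q_tot` and the atom with the larger χ carries the negative charge.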

The partial derivative of the Lagrangian is derived with respect to inter-nuclear distances in complete analogy to, e.g., coupled-perturbed SCF equations, and the analytical partial charge derivatives were developed in this work for the first time as given in the Supplementary Material. Computer timings for the calculation of the charges and their derivatives are given in Table II for different protein structures. The formal scaling of the procedure is O(N^3) with number of atoms N and hence the same as for the dispersion energy in D3-ATM and D4 (for a two-body energy only approach, abbreviated by 2B, the formal scaling reduces to O(N^2)).

TABLE II. Computer timings in seconds for the calculation of energies (E), energies and analytical gradients (E+g), and analytical charge derivatives (∂q/∂R_j) for differently sized protein structures with their protein database (PDB) entry. All calculations have been conducted at four cores (each CPU: Intel(R) Core(TM) i7-7700K [email protected]) with DFT-D4-2B, DFT-D4-ATM, or DFT-D4-MBD. Timings for D4-MBD are excluded for proteins with more than 1500 atoms.

                   E(D4)                     E+g(D4)
#Atoms   PDB       2B       ATM      MBD      ATM      ∂q/∂R_j
373      2BEG      0.07     0.20     2.11     0.25     0.01
782      1R0I      0.22     1.32     37.08    2.18     0.04
1562     1MOL      1.13     9.75     401.00   13.28    0.40
1929     2ZOH      1.80     18.21    –        24.85    1.29
2489     1YMB      3.45     38.70    –        48.26    2.24
5988     1JS8      47.07    528.45   –        570.45   24.47

By using the definition of the Lagrangian given in equation 15 the analytical charge gradient is derived as

$$
\begin{pmatrix} \dfrac{\partial \mathbf{q}}{\partial R_j} \\[2ex] \dfrac{\partial \lambda}{\partial R_j} \end{pmatrix}
=
\begin{pmatrix} \mathbf{A} & \mathbf{1} \\ \mathbf{1}^{T} & 0 \end{pmatrix}^{-1}
\times
\left[
- \begin{pmatrix} \dfrac{\partial \mathbf{A}}{\partial R_j} & \mathbf{0} \\ \mathbf{0}^{T} & 0 \end{pmatrix}
\begin{pmatrix} \mathbf{q} \\ \lambda \end{pmatrix}
+ \begin{pmatrix} \dfrac{\partial \mathbf{X}}{\partial R_j} \\ 0 \end{pmatrix}
\right] \tag{17}
$$

where the inverse of the indefinite (N+1) × (N+1) matrix has been obtained by a Bunch-Kaufman factorization [55] and inversion.

B. Two-body dispersion energy

The pair-wise dispersion coefficients are then used to compute the corresponding dispersion energy in complete analogy to DFT-D3 by multiplying with a short-range damping function in order to apply the model in combination with standard DFAs. The DFT-D4 pair-wise dispersion energy is given by

$$
E_{disp}^{(6,8)} = - \sum_{AB} \sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}}\, f_{damp}^{(n)}(R_{AB}), \tag{18}
$$

where s_n scales the individual multi-polar contributions (s_6 and s_8 for the dipole-dipole and dipole-quadrupole term, respectively), and f_damp^{(n)} denotes the rational Becke-Johnson (BJ) damping function (denoted as BJ-damping (BJD) in the following). The damping function

$$
f_{BJD}^{(n)}(R_{AB}) = \frac{R_{AB}^{(n)}}{R_{AB}^{(n)} + \left( a_1 R_0^{AB} + a_2 \right)^{(n)}} \tag{19}
$$

has already become the default in DFT-D3. For alternatives, see Ref. 56 and for a general discussion of damping functions in dispersion corrected DFT models, see Ref. 57. Equation 19 incorporates the functional-specific parameters a_1 and a_2 and the cutoff-radii defined as

$$
R_0^{AB} = \sqrt{\frac{C_8^{AB}}{C_6^{AB}}}, \tag{20}
$$

where the recursive relation between dipole-dipole and dipole-quadrupole dispersion coefficients is used as in DFT-D3 [16]. Furthermore, we define the following expression for the rational damping term

$$
R_{0,BJ}^{AB} = a_1 R_0^{AB} + a_2. \tag{21}
$$

C. Three-body dispersion and efficient geometry optimizations

The simplest way to include three-body effects is to use the well-known ATM term

$$
E^{ABC} = \frac{C_9^{ABC} \left( 3 \cos\theta_a \cos\theta_b \cos\theta_c + 1 \right)}{\left( R_{AB} R_{BC} R_{CA} \right)^3}. \tag{22}
$$

Here, θ_a, θ_b, and θ_c are the internal angles of the triangle formed by R_AB, R_BC, and R_CA while C_9^{ABC} is the triple-dipole constant given by

$$
C_9^{ABC} = \frac{3}{\pi} \int_0^{\infty} d\omega\, \alpha^{A}(i\omega)\, \alpha^{B}(i\omega)\, \alpha^{C}(i\omega). \tag{23}
$$

The numerical integration of the triple-dipole dispersion coefficient is possible using D4 polarizabilities, but because the three-body energy contribution is rather small (at most 5-10% of E_disp), the coefficients can be reasonably approximated as in the D3-ATM model by a geometric mean of dipole-dipole dispersion coefficients, i.e.,

$$
C_9^{ABC} \approx \sqrt{C_6^{AB}\, C_6^{BC}\, C_6^{CA}}. \tag{24}
$$

This approximation was already tested within the DFT-D3-ATM scheme for different element combinations and a typically small deviation of about 10–20% to the exact expression has been found [16].
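The energy expressions of equations 18 to 22 and 24 can be illustrated for a single atom pair and a single atom triple. The following Python sketch assumes that the C_6 and C_8 coefficients are already available; the function names and every numerical value in the usage note (including s_6, s_8, a_1, and a_2) are hypothetical placeholders, not fitted DFT-D4 parameters, and the three-body term is shown without any damping.

```python
import math

def bj_damped_pair_energy(r, c6, c8, s6, s8, a1, a2):
    """Two-body energy of one atom pair (equations 18 to 21, atomic units)."""
    r0_bj = a1 * math.sqrt(c8 / c6) + a2  # equations 20 and 21
    # C_n / R^n times R^n / (R^n + R0_BJ^n) simplifies to C_n / (R^n + R0_BJ^n).
    e6 = s6 * c6 / (r ** 6 + r0_bj ** 6)
    e8 = s8 * c8 / (r ** 8 + r0_bj ** 8)
    return -(e6 + e8)

def atm_energy(r_ab, r_bc, r_ca, c6_ab, c6_bc, c6_ca):
    """Undamped ATM term of one atom triple (equations 22 and 24)."""
    c9 = math.sqrt(c6_ab * c6_bc * c6_ca)  # geometric-mean approximation
    # Internal angles of the triangle from the law of cosines.
    cos_a = (r_ab ** 2 + r_ca ** 2 - r_bc ** 2) / (2.0 * r_ab * r_ca)
    cos_b = (r_ab ** 2 + r_bc ** 2 - r_ca ** 2) / (2.0 * r_ab * r_bc)
    cos_c = (r_bc ** 2 + r_ca ** 2 - r_ab ** 2) / (2.0 * r_bc * r_ca)
    angular = 3.0 * cos_a * cos_b * cos_c + 1.0
    return c9 * angular / (r_ab * r_bc * r_ca) ** 3
```

As a usage example with made-up numbers, `bj_damped_pair_energy(100.0, 10.0, 300.0, 1.0, 1.0, 0.4, 5.0)` approaches the undamped −C_6/R^6 limit, while the same call at very small `r` stays finite. For the three-body term, an equilateral triangle gives a repulsive contribution and a collinear arrangement an attractive one.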

In the D4 model, the C_6^{AB} coefficients used to obtain the C_9^{ABC} are obtained from charge-neutral atomic polarizabilities (i.e., neutral atoms with z^A = Z^A). The finally used three-body dispersion energy expression is then given as

$$
E_{disp}^{(9),ATM} = \sum_{ABC} E^{ABC} f_{damp}^{(9)}\left( \bar{R}_{ABC} \right), \tag{25}
$$

where the sum is over all atom triples ABC applied with a zero-damping scheme proposed by Chai and Head-Gordon [56]

$$
f_{damp}^{(9)}\left( \bar{R}_{ABC} \right) = \frac{1}{1 + 6 \left( \bar{R}_{ABC} \right)^{-16}}. \tag{26}
$$

Equation 26 includes the averaged inter-atomic distance

$$
\bar{R}_{ABC} = \left( \frac{R_{AB}\, R_{BC}\, R_{CA}}{R_{0,BJ}^{AB}\, R_{0,BJ}^{BC}\, R_{0,BJ}^{CA}} \right)^{1/3}, \tag{27}
$$

which incorporates R_{0,BJ}^{AB/BC/CA} (cf. equation 21). The final energy expression used is therefore given as

$$
E_{disp}^{D4} = E_{disp}^{(6,8)} + E_{disp}^{(9),ATM}. \tag{28}
$$

Exact analytical gradients are available for this energy expression within the D4 implementation.

D. Many-body dispersion energy

Depending on the size and the geometrical arrangement of the atoms, higher-order dispersion contributions (larger than three-body) can be of similar magnitude as three-body contributions and hence, for consistency, terms up to infinite order should be included to achieve a consistent description of all dipole-dipole interaction orders [37]. Here, a conceptually simple but robust approach introduced originally by Cao and Berne [58] is adapted, which has been made popular by Tkatchenko et al. [36]. Physically it is based on a coupling of atomic dipole polarizabilities in terms of quantum harmonic oscillators (QHOs). The coupled dipole model of QHOs serves as an approximation to describe the density-density response functions, which would otherwise be calculated, e.g., via the random phase approximation (RPA). The atomic response functions in a coupled dipole model allow a considerable reduction in the degrees of freedom (i.e., three QHOs per atom) and the computational costs. With this in mind, an alternative energy expression is proposed which consists of two parts. The first is to compose the two-body dipole-dipole and dipole-quadrupole interaction. The second part includes all dipole-dipole interactions up to infinite order, E_disp^{(n),MBD} (n = 6, 9, 12, 15, ..., ∞). To avoid double counting of the two-body dipole-dipole energy, it is removed explicitly from the MBD energy according to

$$
E_{disp}^{D4-MBD} = E_{disp}^{(6,8)} + \left[ E_{disp}^{(n),MBD} - E_{disp}^{(6),MBD} \right]. \tag{29}
$$

In this equation we exploit that E_disp^{(6)} = E_disp^{(6),MBD} (see Supplementary Material for the complete derivation). Furthermore, re-arranging to E_disp^{D4-MBD} = E_disp^{(n),MBD} + E_disp^{(8)} is not possible in the general case, as for double hybrid density functionals (abbreviated as DHDF) s_6 ≠ 1, whereas this scaling cannot be applied to an individual term in the infinite-order MBD energy.

E. Definition of the D4 default model and use of alternative charges

At first, it should be emphasized that the D4 model only turns into a DFT-D4 method when used in combination with a specific density functional. If just polarizabilities or dispersion coefficients are calculated, the results are functional independent. The D4 default setting uses classical EEQ partial charges to scale atom-in-molecule polarizabilities because they are robust and efficient. The quality of the used charges is important but not essential for the finally obtained accuracy. The D4 model also works well with other charges, e.g., with partial charges obtained by the recently developed GFN2-xTB tight-binding method. This indicates its robustness and that under almost all circumstances D4 is similar or better than D3 but never worse (by construction). Generally, the use of three types of reference charges is currently implemented in the model: EEQ gas phase charges q (default), Mulliken-type partial charges q^TB from the GFN2-xTB tight-binding method [49], and DFT Hirshfeld charges q^DFT (PBE0 [59]/def2-TZVP [60] level). Hirshfeld partial charges at the PBE0/def2-TZVP level of theory were also used to parameterize the EEQ model. If DFT charges should be used they must be calculated separately such that they can be fed into the model. For a given type of charges (classical or TB or DFT) different reference effective nuclear charges z^{A,ref}, as introduced in equation 5, are stored and accordingly used in the D4 program.

The DFT-D4 default model always includes an ATM term as described in section II C (see equation 28 for the total energy expression). Nevertheless we offer the possibility to use DFT-D4 including MBD effects (termed DFT-D4-MBD) in terms of an energy correction. The rational BJ-damping function is always applied. The use of other damping functions is not supported since in many test calculations we always obtained the overall best results with the BJ-damping for several DFA classes. When using this so-defined method, it should be abbreviated as "method-D4" (where "method" represents either a DFA or Hartree-Fock), which allows simple and clear referencing in future publications. If charges other than the default are used, this could be indicated by adding e.g. "(TB)" or "(PBE0/def2-TZVP)". Unfortunately, the previous DFT-D3 method has partly lost a clear abbreviation over the years, since additional parameterizations or extensions to the method have been assigned with various nomenclatures (see, e.g., DFT-D3(0) [16], DFT-D3(BJ) [57], DFT-D3(CSO) [61,62], DFT-D3(op) [63], DFT-D3M [64]).

In the next sections, the technical details of the calculations are given first, followed by benchmarking of this finalized D4/DFT-D4 method for dispersion coefficients, interaction energies, conformational energies, as well as in general thermochemical applications. Last, optimized covalent as well as non-covalent geometries are discussed.

F. Technical details

a. Density functional theory calculations All ground state DFT calculations were performed with either TURBOMOLE 7.0.2, and TURBOMOLE 7.2.1 (for all SCAN [65,66] calculations) or ORCA 4.0.1 [67,68]. Standard exchange-correlation energy integration grids (TURBOMOLE: m4, ORCA: grid4, finalgrid5) and usual convergence criteria for the self-consistent field convergence (10^-7 Hartree) were used. The resolution of the identity (RI) approximation [69–71] was applied in all calculations for the electronic Coulomb energy contribution. Ahlrichs' type quadruple-zeta basis sets (def2-QZVP [72]) were used throughout if not stated otherwise.

b. BJ-damping function parameterization The functional specific parameters of the BJ-damping function have been determined using a Levenberg-Marquardt least-squares minimization to reference interaction energies of established non-covalent interaction benchmark sets (S66x8 [73], S22x5 [74], NCIBLIND10 [75]). In total, 98 dissociation curves with 718 reference data points of high accuracy were used for regression. The use of this new fitting set enabled parameterizations of such DFAs for which BJ-damping parameters could not be obtained successfully in earlier works due to over-binding tendencies (e.g., Minnesota functionals, see Ref. 25).

c. Molecular dispersion coefficients All molecular dynamic dipole polarizabilities α(iω) were calculated using TD-DFT [76,77]. As in D3, a variant of the PBE0 hybrid functional was used, with a Fock-exchange admixture of 37.5% (dubbed PBE38). This method has already proved its accuracy and robustness in previous works [16,26]. The atomic orbital (AO) basis sets used in calculations with the XDM model were conducted with the postg [78] program based on TURBOMOLE generated wave function wfn-files. Unfortunately, postg only supports a limited number of basis sets, since each basis set has its own parameterization linked to the applied density functional. Therefore, all postg input files were calculated with the PBE [79]/def2-TZVP setup. Furthermore, calculations were also carried out with the TS based MBD method of Tkatchenko et al. using the corresponding standalone code [80]. The required relative Hirshfeld volumes were taken from postg calculations performed at the PBE/def2-TZVP level.

All geometries of the TD-DFT derived organometallic molecular C_6 coefficient benchmark set (abbreviated as TOMC6 benchmark set) were obtained on the TPSS [81]-D3(BJ)-ATM/def2-TZVP level of theory and molecular reference dispersion coefficients are shared in the Supplementary Material.

d. Non-covalent reference interaction energies for L7 and S30L benchmark sets We use recently published reference values (see the Supplementary Material of Ref. 82) for the L7 [83] and the S30L [84] benchmark sets. They were obtained from local CCSD(T) calculations together with a special purpose CBS estimation scheme including the geometry deformation energy and Boys/Bernardi counter-poise [85] (CP) correction (DLPNO-CCSD(T) [86,87] in its sparse matrix implementation [88] employing the CBS* protocol as described in reference [89]).

e. Reference energies for thermochemical benchmarks The reference conformational energies and structures of SCONF [90], PCONF21 [91,92], ICONF [25], and UPU23 [89] subsets were extracted from the GMTKN55 database (see Ref. 25 for further information and our homepage [93] for the entire database). For each benchmark set, the respective reference data with the accompanying level of theory are listed in the Supplementary Material. The reference energies and structures for the MOR41 transition metal reaction benchmark set were taken from previous work (see Ref. 94 and our homepage [95] for the entire database).

f.
Tetrakis(isonitrile)rhodium(I) dimer and monomer the TD-DFT calculations are of doubly augmented def2- calculations For the generation of reference asso- QZVP quality very closely representing the complete ba- ciation energies of the tetrakis(isonitrile)rhodium(I) sis set (CBS) limit for this property. Here, for the respec- complex, CP-corrected DLPNO-CCSD(T) calculations tive systems, each hydrogen has been augmented with with tight thresholds and extended basis sets (def2- additional (2s/2p), each main group element with addi- TZVPP(VeryTightPNO96)/def2-QZVPP(TightPNO96)) tional (2s/2p/1d), and each transition metal with addi- were conducted using ORCA 4.0.1. A basis set ex- tional (2s/2p/1d/1f) Gaussian primitive functions. The trapolation was performed using optimized exponents exponent-extrapolation of the additional primitives was proposed by Neese and Valeev97. The deformation done with the subprogram define from TURBOMOLE energy of the monomers (0.42 kcal mol−1) was also 7.0.2. The following def2-ECPs are used: ECP-28 cov- taken into account. The error bar of the calculated ering 28 core electrons (for Rb, Sr, Y-Cd, In-SB, Te- interaction energy is estimated to about ±0.5 kcal mol−1. Xe, Ce-Lu), ECP-46 covering 46 core electrons (for Cs, Geometries have been obtained at the PBEh-3c98 level of Ba, La), and ECP-60 covering 60 core electrons (for Hf- theory and verified as minimum structures by frequency Hg, Tl-Bi, Po-Rn) as defined in Ref. 60. Comparative calculations. 10
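The step from dynamic polarizabilities to dispersion coefficients described in paragraph c can be illustrated with a small sketch of a Casimir-Polder integration, $C_6^{AB} = \frac{3}{\pi}\int_0^\infty \alpha^A(i\omega)\,\alpha^B(i\omega)\,d\omega$. The single-oscillator model polarizability, the function names, the midpoint quadrature, and the numerical values below are assumptions chosen for illustration only; they are not the TD-DFT reference data or the integration grid used in the D4 program. For this model the integral has a closed form (the London formula), which serves as a check.

```python
import math


def single_oscillator(alpha0, omega0):
    """Model polarizability alpha(i*w) = alpha0 / (1 + (w/omega0)^2) (illustrative assumption)."""
    return lambda w: alpha0 / (1.0 + (w / omega0) ** 2)


def casimir_polder_c6(alpha_a, alpha_b, n_points=2000):
    """Approximate C6 = (3/pi) * int_0^inf alpha_a(iw) * alpha_b(iw) dw.

    The semi-infinite range is mapped to (0, 1) via w = t / (1 - t) and
    integrated with a simple midpoint rule (a hypothetical quadrature choice).
    """
    h = 1.0 / n_points
    total = 0.0
    for k in range(n_points):
        t = (k + 0.5) * h           # midpoint, never reaches t = 1
        w = t / (1.0 - t)           # imaginary frequency
        jac = 1.0 / (1.0 - t) ** 2  # dw/dt
        total += alpha_a(w) * alpha_b(w) * jac * h
    return 3.0 / math.pi * total


def london_c6(alpha0_a, omega_a, alpha0_b, omega_b):
    """Closed-form C6 for two single-oscillator polarizabilities."""
    return 1.5 * alpha0_a * alpha0_b * omega_a * omega_b / (omega_a + omega_b)


if __name__ == "__main__":
    # Hypothetical static polarizabilities and effective frequencies (atomic units).
    a = single_oscillator(4.5, 0.5)
    b = single_oscillator(11.1, 0.4)
    print(casimir_polder_c6(a, b), london_c6(4.5, 0.5, 11.1, 0.4))
```

In an actual calculation the model function would be replaced by tabulated α(iω) values on a frequency grid, so only the quadrature step carries over.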

III. RESULTS

A. Molecular dispersion coefficients

We have taken reference molecular dispersion coefficients which were determined experimentally from dipole oscillator strength distributions (DOSD) as described in previous works by Meath and co-workers [99,100]. From this data, a benchmark set has been compiled in Ref. 20 consisting of 1225 molecular dispersion coefficients for systems ranging from di-hydrogen to octane and other non-organic molecules such as SF6 or O2. The respective statistical evaluation for D3 [101], D4, D4(TB), the local response dispersion method (LRD) [102], and TS [20] for all systems in the DOSD benchmark set are given in Table III.

TABLE III. Top (labeled DOSD): Statistical measures for the relative deviation (in %) of calculated molecular $C_6$ dispersion coefficients for different approaches with respect to experimental values for a molecular benchmark set consisting of 50 molecules. For D3, LRD, and TS values are taken from Refs. 20,101,102. Bottom (labeled TOMC6): Statistical measures for the relative deviation (in %) of calculated homo-molecular $C_6$ dispersion coefficients for different approaches with respect to theoretical TD-DFT values for a transition metal benchmark set of 25 complexes for the complete set and 15 complexes for the subset. The best values are highlighted in bold font.

DOSD
Measure    D4      D4(TB)   D3      LRD     TS
MAD        3.8     3.9      4.7     6.1     5.3
MD         -0.1    0.3      2.4     -2.5    -2.7
SD         5.1     5.3      5.2     7.7     7.3
AMAX       29.1    34.7     23.9    52.9    44.0

TOMC6 (subset)
Measure    D4      D4(TB)   D3      XDM     MBD
MAD        8.7     8.9      20.5    9.4     13.7
MD         -1.7    -2.3     -9.6    1.4     13.7
SD         11.2    11.4     28.1    13.3    6.1
AMAX       29.7    25.1     80.8    36.6    26.2

TOMC6 (complete)
Measure    D4      D4(TB)   D3
MAD        7.0     7.9      12.0
MD         -2.2    -3.6     -5.5
SD         9.3     9.4      15.7
AMAX       29.7    25.1     56.1

For medium-sized organic and inorganic molecules, D4/D4(TB) further improves upon the already accurate D3 model (mean absolute deviation, MAD, of 3.8% and 3.9% vs. 4.7%). Other statistical measures for D4/D4(TB) are also lowered in comparison to values for the competitors which indicates a robust and consistent improvement in the description of pair-wise dispersion coefficients for such systems. In the authors' opinion, the achieved MAD of 3.8% closely approaches the inherent accuracy of the underlying TD-DFT calculations and is thus difficult to improve further. For comparison, an assessment of the XDM method yielded an MAD of 10.0% for a similar small-molecule database [19], while a number of vdW density functionals yield even larger deviations for asymptotic molecular $C_6$ coefficients [103]. Note that the resulting DFT-D4 interaction energy errors in the asymptotic regime of about 4.0% are comparable or even smaller than, e.g., residual errors in WFT energy calculations employing large but finite triple- or quadruple-zeta basis sets.

Since the DOSD set excludes the important class of transition metal complexes, we have created a corresponding benchmark set consisting of computed molecular dipole-dipole dispersion coefficients derived from TD-DFT dynamic reference polarizabilities at the PBE38/daug-def2-QZVP level of theory. This benchmark set consists of 25 organometallic complexes (see upper part of Figure 6 for example structures) and is dubbed as TOMC6 which stands for TD-DFT derived organometallic molecular $C_6$ coefficient benchmark set. Results for the D3, D4, D4(TB), XDM, and MBD schemes are given in the lower part of Figure 6 together with their statistical evaluation listed in Table III. Here, the TOMC6 set was divided into two parts because several systems could not be treated with XDM and MBD for technical reasons. The resulting smaller set for which also XDM and MBD data are available is termed "subset" in the following. In accordance with the previously discussed DOSD benchmark set, the use of scaled polarizabilities in D4/D4(TB) improves upon the accuracy of the D3 method. For this set only XDM is able to achieve a comparable accuracy, however, at a substantially larger computational effort because a properly converged DFT electron density is required in XDM. Furthermore, XDM energies need to be integrated numerically over a fine grid to avoid numerical noise. The MBD model achieves a result which is in between of D4/D4(TB) and D3 (here the molecular dispersion coefficient changes for different range-separation parameters, we used $\beta_{\mathrm{PBE}} = 0.83$, and, therefore, for different DFAs within MBD). Notably, all methods yield larger errors for the electronically more complicated systems in the TOMC6 (sub)set than for the DOSD molecules. Nevertheless, the asymptotic D4 error of about 7.0% is still smaller than typical errors of DFAs for other correlation energy effects. This will be further discussed below for the thermochemistry of transition metal complexes. The application of Hirshfeld partial charges at the PBE0/def2-TZVP level of theory slightly worsens upon the D4(TB) result with an MAD of 9.5% for the complete set. The fact that calculated D4 molecular dispersion coefficients have the smallest errors with respect to reference values when applying the EEQ charges additionally motivates their use as default (as defined in section II E). An explicit discussion of the effect of using different charges is avoided, whereby DFT-D4 results with GFN2-xTB charges are shared in the Supplementary Material for interaction energies and structures.

[Figure 6. Panel (a): TOMC6 examples, structures (I)-(IV). Panel (b): statistical measures; correlation plot of $C_6$(ref.) / Hartree Bohr$^6$ against $C_6$(calc.) / Hartree Bohr$^6$. Legend, subset: D4, MAD = 8.7%; D4(TB), MAD = 8.9%; XDM, MAD = 9.4%; MBD, MAD = 13.7%; D3, MAD = 20.5%. Legend, complete set: D4, MAD = 7.0%; D4(TB), MAD = 7.9%; D3, MAD = 12.0%.]

FIG. 6. (a) Example complexes from the TOMC6 set including (I) aluminium, (II) nickel, (III) molybdenum, and (IV) cobalt to demonstrate the diversity of the benchmark set. Structures and reference molecular dispersion coefficients are listed in the Supplementary Material. (b) Correlation plot between reference molecular dispersion coefficients of transition metal complexes obtained by hybrid TD-DFT (numerically integrated from molecular α(iω) values according to equation 9) and molecular ones derived from pair-wise dispersion coefficients with D3, D4, D4(TB), XDM, and MBD models.

B. Non-covalent interactions and conformational energies

As shown in the previous section, the pair-wise dispersion coefficients for organic/inorganic molecules and transition metal complexes are improved with the D4 method. In general, it is assumed that improved dispersion coefficients in DFT-D type methods are associated with improved non-covalent interaction energies. This assumption is to be verified in this section for various small to large molecule benchmark sets. Reference interaction energies refer to the CCSD(T) or DLPNO-CCSD(T) level of theory with tight threshold settings and CBS extrapolation mostly taken from Ref. 82. The upper part of Figure 7 shows interaction energies in benchmark sets of increasing complex size which are mainly stabilized by London dispersion: S22 [105], L7 [83], and S30L [84]. Various typical non-covalent interaction motifs like hydrogen and halogen bonding, π-π stacking, non-polar dispersion, CH-π, and cation-dipolar interactions are represented. The description of non-covalent interactions in the larger systems is slightly (S30L) to largely (L7) improved at the DFT-D4-MBD level compared to the already well performing DFT-D3(BJ)-ATM method. The only exception is the TPSS-D4-MBD result for the L7 set which is worse than the D3(BJ)-ATM corrected one. Note that residual MAD values of about 1 kcal mol$^{-1}$ for L7 and 1-2 kcal mol$^{-1}$ for S30L are not far from the accuracy of the reference calculations. For the S22 set, we use averaged MAD values for different DFA classes arranged according to Jacob's ladder. Data for eight (meta) generalized gradient approximation DFAs, abbreviated as (m)GGA, (namely BLYP [106,107], BP86 [108], M06L [109,110], O-LYP [111,112], PBE, revPBE [113], RPBE [114], and SCAN), nine hybrid DFAs (M06 [115], B3LYP [116,117], BHLYP, M062X, O3LYP, PBE0, PW6B95 [118], TPSS0, and TPSSh), and three DHDF (DSD-BLYP [119], DSD-PBEB95 [120], and PWPB95 [120]) are used. For this important and prototypical non-covalent interaction (NCI) benchmark, D4 outperforms D3(BJ)-ATM especially for hybrids and DHDFs.

The lower part of Figure 7 shows an interesting example for strong London dispersion interactions in a doubly positively charged organometallic di-rhodium complex. Here, the dominant Coulomb repulsion can almost be compensated by dispersion interactions [104] in the gas phase (compare the association energy ∆E(PBE0) in Figure 7(b) with its corresponding dispersion corrected values). This and related complexes have been studied intensively and the property of oligomer formation in solution has been an "open case" for many years [121–123]. Quantum chemical calculations have been able to assign the main binding motif of this highly charged system to be the London dispersion thus making it an ideal test case for the DFT-D4 method. The theoretical reference association energy was calculated with an accurate local coupled-cluster protocol (see technical details section) and is compared to the dispersion and Boys-Bernardi CP-corrected PBE0/def2-QZVPP values.
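The statistical measures reported in Table III (MAD, MD, SD, AMAX of relative deviations in %) can be sketched as follows. This is a minimal illustration under stated assumptions: the sign convention (calculated minus reference), the use of the sample standard deviation for SD, and the function names are choices made here and are not confirmed by the text; the input numbers in the usage lines are made up and are not benchmark data.

```python
import statistics


def relative_deviations(calculated, reference):
    """Signed relative deviations in percent, 100 * (calc - ref) / ref (assumed convention)."""
    return [100.0 * (c - r) / r for c, r in zip(calculated, reference)]


def summarize(deviations):
    """Return MD, MAD, SD, and AMAX for a list of signed deviations."""
    absolute = [abs(d) for d in deviations]
    return {
        "MD": statistics.fmean(deviations),    # mean (signed) deviation
        "MAD": statistics.fmean(absolute),     # mean absolute deviation
        "SD": statistics.stdev(deviations),    # sample standard deviation (assumption)
        "AMAX": max(absolute),                 # largest absolute deviation
    }


if __name__ == "__main__":
    # Made-up C6 values purely for demonstration.
    calc = [110.0, 90.0, 105.0]
    ref = [100.0, 100.0, 100.0]
    print(summarize(relative_deviations(calc, ref)))
```

A negative MD together with a much larger MAD, as for D3 on the TOMC6 set, indicates systematic underestimation combined with large scatter, which is why both measures are listed side by side.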

[Figure 7. Panel (a): non-covalent benchmarks; MAD / kcal mol$^{-1}$ for L7, S30L, and S22 (S22 grouped as (m)GGA, hybrids, DHDF). Panel (b): organometallic example; schematic of Coulomb interaction (strong repulsion) and dispersion interaction (strong attraction) in the di-cation, with the following association energies ΔE:]

Method               ΔE
DLPNO-CCSD(T)        8.8
PBE0-D4              7.5
PBE0-D4(TB)          7.8
PBE0-D3(BJ)-ATM      10.9
PBE0                 38.9

FIG. 7. (a) Comparison of MAD values for three typical non-covalent interaction benchmark sets (L7, S30L, S22) and various standard DFAs. Density functionals corrected by D3(BJ)-ATM are shown in gray (bar width 1.0), DFT-D4 results are shown in blue (bar width 0.4), and DFT-D4-MBD results are shown in yellow (bar width 0.2). For the S22 benchmark set the scaling of the MAD axis was adjusted accordingly and values were obtained by averaging over several typical DFAs (eight meta-GGAs, nine hybrids, and three DHDFs). (b) Shown is a cationic di-rhodium complex with the respective association energy ∆E in the gas phase where the repulsive cation-cation interaction is almost compensated by attractive London dispersion interactions (schematic adapted from Ref. 104).

Counter-intuitively, the D4 corrected gas phase interaction is more favorable (less positive association energy) than the corresponding D3 value. This is initially surprising, since cationic compared to neutral complexes should feature smaller dispersion coefficients as accounted for in D4 but not in D3. According to a detailed analysis, the observed increased dispersion interaction in DFT-D4 is based on new reference systems that have been added to some elements within the D4 model (e.g., RhH5, CN$^{\mathrm{Rh}}$ = 4.70 and C6H6, CN$^{\mathrm{C}}$ = 2.92), increasing the homo-atomic dispersion coefficient for those elements (e.g., for rhodium atoms within the complex the difference between D3 ($C_6^{AA}$(D3) = 244.7 Hartree Bohr$^6$) and D4 ($C_6^{AA}$(D4) = 294.9 Hartree Bohr$^6$) is significant). Furthermore, the molecular charge is distributed over all atoms in the molecule and is not centered on the rhodium atoms. Thus, the partial charges of the rhodium atoms are only marginally changed when the neutral dimer complex is turned into the di-cation (the change in $q^{\mathrm{Rh}}$ is less than 0.1). In summary, the stronger binding in D4 which is in better agreement with the localized CCSD(T) reference value (for both types of partial charges, EEQ or TB) can be explained by larger atomic dispersion coefficients in combination with only a small decrease of atomic polarizabilities due to small partial charges on the central rhodium atoms.

DFT-D4 also yields a comparable or higher accuracy for (bio)chemically important conformational energies in various test systems. Figure 8 exemplifies this for sugar conformers (SCONF) as well as for tri- and tetrapeptide conformers (PCONF21) both taken from the GMTKN55 database. Additionally, results for conformational energies of inorganic molecules (ICONF), as well as for RNA backbone conformers (UPU23) are given in Table IV. Note that in contrast to DFT-D3, the DFT-D4 training set used for the determination of the BJ-damping parameters does not contain conformational energies. In Figure 8 we show DSD-BLYP based results as an example and provide statistical evaluations for other selected DFAs (PBE, B3LYP, and PW6B95) in Table IV. Overall, the conformational energies are generally improved for PBE, B3LYP, and PW6B95 by applying DFT-D4/DFT-D4(TB) compared to DFT-D3(BJ)-ATM. The reduction of the MAD is often noticeable (about 0.1 kcal mol$^{-1}$) but sometimes significant, e.g., 0.20–0.25 kcal mol$^{-1}$ for peptides and RNA structures with PBE or B3LYP. Especially for the peptide set the improvement is large when compared to the small average absolute relative energy of only 1.62 kcal mol$^{-1}$.

C. Thermochemistry

As discussed in section III A, the D4 model is particularly good for systems in which atomic partial charges show significant deviations from the reference systems used in the Gaussian weighting procedure (cf. equation 8). Typical examples are transition metal complexes, which contain d-block elements in varying oxidation states. We have discussed the improved description of pair-wise dispersion coefficients for transition metal complexes for the TOMC6 set (see section III A). In the following, the thermochemistry of transition metal compounds is investigated employing the recently composed MOR41 set consisting of 41 closed-shell transition metal reactions of uncharged molecules. The estimated maximum error of

the reference reaction energies is about 2 kcal mol$^{-1}$ (see Ref. 94 for further information). The reactions in this set typically occur in homogeneous catalysis, i.e., complexation reactions, oxidative additions, and ligand exchange reactions.

TABLE IV. Conformational energies calculated with three widely used dispersion corrected DFAs (PBE, B3LYP, and PW6B95) with the def2-QZVP basis set. All statistical measures are given in kcal mol$^{-1}$ relative to the reference energies. The best result for each measure is highlighted in bold, DFT-D3(BJ)-ATM is abbreviated by D3, DFT-D4 is abbreviated by D4, and DFT-D4(TB) by D4(TB). The average reference energy over all entries in the set |∆E| is taken from Ref. 25.

             PBE                       B3LYP                     PW6B95
Measure      D3      D4      D4(TB)    D3      D4      D4(TB)    D3      D4      D4(TB)

Sugar conformers (SCONF, |∆E| = 4.60)
MAD          0.78    0.88    0.77      0.29    0.33    0.23      0.24    0.28    0.20
MD           0.28    0.34    0.29      -0.14   -0.07   -0.14     -0.01   0.04    -0.02
SD           0.98    1.09    0.94      0.48    0.58    0.42      0.38    0.43    0.36
AMAX         2.73    2.96    2.67      1.73    2.07    1.64      1.21    1.33    1.12

Tri- and tetrapeptides (PCONF21, |∆E| = 1.62)
MAD          1.20    0.99    1.04      0.53    0.35    0.30      0.48    0.49    0.49
MD           -0.54   -0.51   -0.53     -0.04   0.03    0.01      0.34    0.36    0.38
SD           1.44    1.24    1.18      0.60    0.41    0.37      0.58    0.60    0.46
AMAX         2.51    2.31    2.36      1.11    0.90    0.88      1.05    0.98    0.98

Inorganic conformers (ICONF, |∆E| = 3.27)
MAD          0.31    0.31    0.31      0.28    0.28    0.27      0.22    0.22    0.22
MD           0.09    0.09    0.10      -0.06   -0.08   -0.07     0.04    0.02    0.02
SD           0.38    0.40    0.40      0.39    0.42    0.39      0.33    0.33    0.34
AMAX         0.84    1.12    1.11      0.99    1.15    0.98      0.90    0.83    0.85

RNA backbone conformers (UPU23, |∆E| = 5.72)
MAD          0.57    0.51    0.48      0.68    0.60    0.54      0.66    0.63    0.59
MD           0.36    0.27    0.20      0.55    0.42    0.31      0.52    0.49    0.42
SD           0.71    0.63    0.58      0.80    0.70    0.57      0.80    0.76    0.58
AMAX         1.67    1.62    1.50      1.57    1.50    1.33      1.65    1.55    1.44

For the MOR41 set, the D4 model outperforms D3(BJ)-ATM for all tested DFAs (except PW6B95). This is particularly noticeable for the well performing double hybrid DFAs DOD-PBE [124] and DSD-PBE [124] or hybrid DFAs (especially for PBE0) for which it is now possible to almost reach the estimated error level of the reference method. Hence, the combination of such DFAs with D4 represents an efficient route to obtain reaction energies in transition metal thermochemistry applications. During the D4 parameterization process, damping parameters for some DFAs could also be "repaired". The RPBE functional is given as example, where the previously published parameters were apparently not optimal (see Supplementary Material of Ref. 25 for original values).

For main group elements, the general performance of DFT-D4 for thermochemistry and kinetics is assessed using the large GMTKN55 database. This database allows an evaluation for a wide variety of chemical problems and provides a large number of 2462 systems resulting in 1505 relative energies. The lower part of Figure 9 shows results for several dispersion-corrected DFAs grouped according to Jacob's ladder plotted with their weighted mean total absolute deviations (WTMAD-2, see Ref. 25 for the exact definition). The figure is divided into two parts describing the performance for the entire GMTKN55 database (termed as GMTKN55 full) and for all its NCI subsets (shown inverted and termed as GMTKN55 NCIs).

Overall for this typical selection of DFAs, DFT-D4 represents a retention or small improvement over DFT-D3(BJ)-ATM for general thermochemistry. No single outlier in the huge number of systems in GMTKN55 was detected. The improvement obtained with DFT-D4 is more pronounced in the NCI subsets than for the entire database because in the former the dispersion contributions are relatively larger than for most "normal" chemical reactions. A direct comparison between DFT-D4 (ATM, MBD) and non-locally corrected DFT (DFT-NL) for these NCI subsets shows a clear improvement when applying D4 (see Table V).

Importantly, the remaining MAD of 2.0 kcal mol$^{-1}$ for the MOR41 set and the WTMAD-2 of 2.0–3.0 kcal mol$^{-1}$ for GMTKN55 with the best functionals is not far from the accuracy of the underlying reference methods.

The $s_6$ damping parameter for the double hybrid DFAs (which is unity for all lower-rung DFAs) has been constrained in the fitting according to the procedure of Ref. 16, i.e., we have not followed the construction

scheme of Martin and co-workers [124] for its determination. This constraint, however, does not affect their accuracy which is actually significantly increased with DFT-D4 compared to the already very accurate DFT-D3 versions. Overall, the performance of D4 and D4-MBD is similar for the whole GMTKN55 benchmark database which shows that an ATM treatment is sufficient for medium-sized molecules, i.e., below 50–100 atoms.

[Figure 8. Panel (a): SCONF, sugar conformers. Panel (b): PCONF21, tri- and tetrapeptide conformers. Axes: relative energy / kcal mol$^{-1}$ against conformer number. Curves: Reference, DSDBLYP-D4/QZ, DSDBLYP-D3(BJ)-ATM/QZ.]

FIG. 8. Conformational energies for two benchmark sets (SCONF and PCONF21) in comparison to accurate reference values at the DLPNO-CCSD(T)/TightPNO/CBS level of theory [25]. For all calculations a def2-QZVP basis set (abbreviated as QZ) has been applied.

[Figure 9. Upper panel: MOR41 set; MAD / kcal mol$^{-1}$ for double hybrids, hybrids, and (m)GGAs (DOD-PBE, DSD-PBE, B3LYP, PBE0, PW6B95, CAM-B3LYP, revPBE, M06L, PBE, RPBE). Lower panel: GMTKN55 full and GMTKN55 NCIs; WTMAD-2 / kcal mol$^{-1}$ (DSDPBEP86, DSDBLYP, B3LYP, PBE0, PW6B95, TPSS0, SCAN, M06L, PBE, RPBE).]

FIG. 9. (upper) Comparison of MAD values for MOR41 reaction energies. The inset shows one example reaction (reaction 28 in the set). Various density functionals from different rungs of Jacob's ladder are investigated. Shown are deviations of D3(BJ)-ATM corrected (gray, bar width 1.0), D4 corrected (blue, bar width 0.4), and D4-MBD corrected (yellow, bar width 0.2) DFT computed reaction energies to reference values with an estimated uncertainty of ≈ 2 kcal mol$^{-1}$. (lower) Comparison of weighted mean deviations (WTMAD-2 values) for the entire GMTKN55 set and its non-covalent interaction subsets. Shown are deviations of DFT-D3(BJ)-ATM (gray), DFT-D4 (blue), and DFT-D4-MBD (yellow) values to reference data (see technical details section for further information).

TABLE V. WTMAD-2 values over all NCI subsets of the GMTKN55 benchmark database given in kcal mol$^{-1}$ for DFT-D4 (ATM, MBD) or DFT-NL which were extracted from Ref. 125.

DFA          ATM     MBD     DFT-NL
B3LYP        5.6     5.8     6.0
PW6B95       5.3     5.3     5.9
DSD-BLYP     3.2     3.2     3.2

D. Covalent and non-covalent structures

In order to investigate the DFT-D4 accuracy more closely, various standard benchmark sets for bond distances in equilibrium geometries are discussed here. This is of particular importance because structure determination is a common application area of DFT methods. First, covalent bond lengths are to be investigated, whereby non-covalent distances, and entire equilibrium structures are also examined in this section. All geometries discussed here are fully optimized employing an Ahlrichs' type quadruple-zeta AO basis set (def2-QZVP [72]).

The sets used for testing covalent bond lengths consist of molecules formed by the first or second row main group elements of the periodic table (LMGB35), molecules which are composed of main group elements of the third or higher rows (HMGB11), and 3d-transition metal complexes with a total of 50 analyzed bond lengths as compiled by Bühl and Kabrede [126] (TMC32). For a detailed description of the former benchmark sets see Ref. 98. These sets are ideally suited also for "cross-checking", since many of the here treated elements were not part of the D4 parameterization process (only the elements H, C, N, and O are present in the BJ-damping function fit), which was furthermore based solely on energies.

Figure 10(a) shows mean absolute deviations from reference values for three typical D3(BJ)-ATM and D4 dispersion corrected DFAs and a statistical evaluation of all three benchmark sets is given in the Supplementary Material.

[Figure 10. Panel (a): LMGB35, HMGB11, TMC32; MAD / pm for PBE, TPSS, PBE0. Panel (b): ROT34 and S66x8; MAD / % for B3LYP, PBE, TPSS, PBE0 (ROT34) and B3LYP, O-LYP, revPBE, SCAN (S66x8).]

FIG. 10. (a) Mean absolute deviations from reference values for covalent bond lengths for the LMGB35, HMGB11, and TMC32 benchmark sets with three DFAs. Gray bars: DFT-D3(BJ)-ATM (bar width 1.0), and blue bars: DFT-D4 (bar width 0.4). (b) Mean relative deviations from reference values for two benchmark sets representing the quality of entire equilibrium structures (rotational constants, ROT34) and non-covalent center-of-mass equilibrium distances (derived from S66x8) for seven DFAs. Gray bars: DFT-D3(BJ)-ATM (bar width 1.0), blue bars: DFT-D4 (bar width 0.4).

Inspection of Figure 10(a) shows only marginal differences between DFT-D4 and DFT-D3(BJ)-ATM for bond lengths which is expected because dispersion corrections mostly affect the non-covalent distance regime which is not really covered in these relatively small molecules tested. The tiny worsening observed for HMGB11 and for the TMC32 benchmark (the MAD increases by less than 0.1 pm) is practically irrelevant.

Small differences between DFT-D4 and DFT-D3(BJ)-ATM are found for equilibrium rotational constants $B_e$ while larger differences occur for intermolecular distances in the S66 [127] non-covalent equilibrium complex benchmark (as derived from data for the S66x8 benchmark set [73]). The $B_e$ values can be considered as a measure for the quality of a complete molecular structure. Their accurate computation requires a consistently good description of covalent bond lengths as well as of non-bonded distances, and small changes of internal rotational degrees of freedom may result in rather large deviations of a few percent for $B_e$. As already discussed in the literature, dispersion corrections to DFT typically increase $B_e$ values (shrink molecular size) significantly by about 0.5-1.5% [128], thereby in general improving the agreement with the reference data.

The left side of Figure 10(b) shows the relative MAD in % of computed equilibrium rotational constants compared to back-corrected experimental values for the ROT34 benchmark set [129]. Here, dispersion corrected PBE0 yields the most accurate molecular structures within the uncertainty of the reference data while dispersion corrected TPSS and PBE on average overestimate molecular size (see the statistical evaluation in the Supplementary Material). Dispersion corrected B3LYP provides slightly enlarged molecular structures where both the D4 and the D3(BJ)-ATM correction perform equally well. All in all, the high quality of covalent equilibrium geometries obtained with DFT-D3(BJ)-ATM is retained with DFT-D4.

Non-covalent equilibrium geometries are considered on the right side of Figure 10(b). Reference coupled-cluster interaction energies taken from Ref. 73 for eight different shifted center-of-mass (CMA) distances were interpolated to obtain reference CMA distances for all complexes in the S66 set. The DFT computed CMA distances were obtained in the same way to allow a direct comparison to the reference values.

As can be seen from this plot, the D4 model yields a slight (O-LYP and SCAN) to enormous improvement (revPBE) compared to the already very accurate DFT-D3(BJ)-ATM geometries. For the standard DFAs PBE, TPSS, and PBE0 shown in Figure 10(a), D4 performs slightly better than or equally well as D3-ATM. Note that the relative error for calculated CMA distances without dispersion correction is about one order of magnitude larger and that the very small residual relative deviations of 1.0% or less are similar for mostly covalent (ROT34) and non-covalent (S66) sets, respectively.

IV. SUMMARY AND CONCLUSIONS

We presented the theory and main features of the final D4 method, in which charge-scaled atom-in-molecule dipole polarizabilities are interpolated by means of an atomic coordination number within a Gaussian weighting procedure to generate charge- and geometry-dependent atom-in-molecule polarizabilities α(iω). The partial charges used in the polarizability scaling are obtained by default with a classical electronegativity equilibration (EEQ) model. As fall-back levels, TB based Mulliken-type partial charges $q^{\mathrm{TB}}$ (GFN2-xTB) or DFT based Hirshfeld partial charges $q^{\mathrm{DFT}}$ (PBE0/def2-TZVP) are proposed. The polarizabilities are then numerically integrated at runtime of the D4 code to obtain dipole-dipole dispersion coefficients $C_6^{AB}$. For two corresponding benchmark sets they are more accurate than the charge-independent ones from the predecessor D3 scheme, and particularly, better coefficients are obtained for transition metal compounds.

The two-body dispersion energy is computed in the commonly applied form as a sum over pair interactions including dipole-dipole and approximate dipole-quadrupole terms. A well-known many-body dispersion model based on coupled harmonic oscillators is used for the determination of all higher-order dipole terms using the charge- and geometry-dependent α(iω) values. Exact analytical gradients including derivatives of the EEQ charges are available for the simplified default energy expression where the higher-order dipole-dipole interactions are truncated to the ATM term for computational efficiency. The final DFT-D4 default model uses the two-body and the ATM energy expression (cf. equation 28) in structure optimizations. In addition, a single-point MBD correction is recommended as a sanity check for larger systems (cf. equation 29). The degree of empiricism in the ad hoc more physical D4 is slightly higher than in D3 with the same number of functional-dependent damping parameters (three) and a few additional global parameters for the charge dependency and coordination numbers.

The accuracy of the D4 method has been extensively verified using various benchmark sets including dispersion coefficients (organic and organometallic systems), and with various standard density functionals as DFT-D4 for in[...] and 3d-transition metal elements), organic equilibrium structures (rotational constants), non-covalent equilibrium structures, and reaction energies of main group chemistry (extended GMTKN55 database) and transition metal complexes occurring in homogeneous catalysis (MOR41).

DFT-D4 in general improves slightly upon the already accurate and well-established DFT-D3(BJ)-ATM method and the new D4 model seems to be the most accurate and general molecular dispersion correction available. This mainly results from the inclusion of atomic charge effects into the atomic polarizabilities, the balanced treatment of higher-order dipole-dipole interactions, and most importantly the high accuracy of the underlying TD-DFT dispersion coefficients. In particular, systems involving large atomic charges or respective charge changes in chemical processes (e.g., oxidation/reduction) benefit from a DFT-D4 treatment compared to DFT-D3. Moreover, it should be emphasized that the generally good description of dispersion interactions in non-polar main group or organic molecules is retained compared to DFT-D3(BJ)-ATM, i.e., DFT-D4 represents a safe replacement of a widely used and well tested method. In the current form, the D4 method appears to have reached a plateau level within the DFT-D methodology in terms of physical sophistication and cost-accuracy ratio. Further improvements would most likely lead to significantly increased computational costs and moreover would require concomitant improvements of the applied DFAs or damping schemes. Importantly, the general black-box philosophy of the DFT-D4 approach and the coupling of a single dispersion model to various standard DFAs is analogous to DFT-D3. This and the availability for most of the periodic table (elements up to atomic number Z = 86) enables broad application to a wide variety of quantum chemical problems.

A list of all 67 currently parameterized DFAs together with their BJ-damping parameters is given in the Supplementary Material for different atomic partial charges in combination with an ATM or an MBD treatment.

The proposal of D4 with a robust and reasonably accurate classical charge model as basis allows its application not only in the DFT context. Simplified semi-empirical orbital models, sophisticated force-fields [130] or neural network potentials can easily be coupled with the D4 model simply because no electron density is required and the whole procedure including nuclear gradients still runs at "force-field speed". This opens a route for an efficient, physically sound, and basically non-empirical treatment of non-covalent interactions in large systems with thousands of atoms.

The here presented final molecular (non-periodic) version can be downloaded in coded form from the authors' website free of charge [131].
Furthermore, the DFT-D4 method teraction energies (intramolecular dispersion, and host- is available in the upcoming release of TURBOMOLE guest systems of medium to large size), conformational 7.3 software43–45, the next release of ORCA 4.1.067, and energies (sugars, tri- and tetrapeptides, inorganic sys- within an update of MOLPRO132. Ongoing work fo- tems, and RNA backbone models), covalent bond lengths cuses on the implementation of the D4 model for periodic (light main group elements, heavy main group elements, systems, which requires a cost-efficient determination of 17 partial charges under periodic boundary conditions. 5K. I. Assaf, M. Florea, J. Antony, N. M. Henriksen, J. Yin, A. Hansen, Z. W. Qu, R. Sure, D. Klapstein, M. K. Gilson, S. Grimme, and W. M. Nau. HYDROPHOBE Challenge: A Joint Experimental and Computational Study on the Host– V. ACKNOWLEDGMENTS Guest Binding of Hydrocarbons to Cucurbiturils, Allowing Ex- plicit Evaluation of Guest Hydration Free-Energy Contribu- This work was supported by the DFG in the framework tions. J. Phys. Chem. B, 121:11144–11162, 2017. 6M. P´erez,Z. W. Qu, C. B. Caputo, V. Podgorny, L. J. Hounjet, of the priority Program No. SPP 1807, ”Control of Dis- A. Hansen, R. Dobrovetsky, S. Grimme, and D. W. Stephan. Hy- persion Interactions in Chemistry”. E. C. thanks Philipp drosilylation of Ketones, Imines and Nitriles Catalysed by Elec- Pracht for the preparation of ∂q/∂R timings and Jakob trophilic Phosphonium Cations: Functional Group Selectivity Seibert for sharing several PDB structures. Furthermore, and Mechanistic Considerations. Chem. Eur. J., 21:6491–6500, 2015. E. C. thanks Markus Bursch and Dr. Jan Gerit Branden- 7T. Sperger, I. A. Sanhueza, I. Kalvet, and F. Schoenebeck. burg for fruitful discussions. Computational studies of synthetically relevant homogeneous organometallic catalysis involving Ni, Pd, Ir, and Rh: an overview of commonly employed DFT methods and mechanistic insights. Chem. 
Rev., 115:9532–9586, 2015. VI. SUPPLEMENTARY MATERIAL 8D. A. Bardwell, C. S. Adjiman, Y. A. Arnautova, E. Barta- shevich, S. X. M. Boerrigter, D. E. Braun, A. J. Cruz-Cabeza, In the Supplementary Material the classical charge model G. M. Day, R. G. Della Valle, G. R. Desiraju, B. P. van Eijck, is defined in more detail including an energy expres- J. C. Facelli, M. B. Ferraro, D. Grillo, M. Habgood, D. W. M. Hofmann, F. Hofmann, K. V. J. Jose, P. G. Karamertzanis, sion and analytical gradients. The theoretical foun- A. V. Kazantsev, J. Kendrick, L. N. Kuleshova, F. J. J. Leusen, dations of the many-body dispersion correction used A. V. Maleev, A. J. Misquitta, S. Mohamed, R. J. Needs, M. A. within the DFT-D4-MBD method is discussed in de- Neumann, D. Nikylov, A. M. Orendt, R. Pal, C. C. Pantelides, tail. The derivation of the two-body dispersion poten- C. J. Pickard, L. S. Price, S. L. Price, H. A. Scheraga, J. van de tial which is included into the GFN2-xTB Hamiltonian Streek, T. S. Thakur, S. Tiwari, E. Venuti, and I. K. Zhitkov. Towards crystal structure prediction of complex organic com- matrix is given. The definitions of the double hybrid pounds – a report on the fifth blind test. Acta Cryst. B, 67:535– density functionals used are given. The BJ-damping pa- 551, 2011. rameters are given for 67 DFAs and for Hartree-Fock. 9A. M. Reilly and et al. Report on the sixth blind test of organic We compare timings (single-point energies and gradi- crystal structure prediction methods. Acta Cryst. B, 72:439– 459, 2016. ents) between DFT-D4 and DFT-D3(BJ)-ATM for the 10S. L. Price and S. M. Reutzel-Edens. The potential of computed tetrakis(isonitrile)rhodium(I) dimer with 106 atoms and crystal energy landscapes to aid solid-form development. Drug for a diamond chunk with 430 atoms. Furthermore, all Discov. Today, 21:912–923, 2016. statistical quantities are defined and listed as used in 11S. Kristy´anand P. Pulay. 
Can (semi)local density functional the evaluation of the calculated data. Statistical eval- theory account for the London dispersion forces? Chem. Phys. Lett., 229:175–180, 1994. uations of the following benchmark systems are given 12J. M. P´erez-Jord´aand A. D. Becke. A density-functional study with all reference and computed data for S30L, L7, of van der Waals forces: rare gas diatomics. Chem. Phys. Lett., MOR41, SCONF, PCONF21, ICONF, UPU23, ROT34, 233:134–137, 1995. LMGB35, HMGB11, and TMC32. Structures of the 13M. D. Wodrich, C. Corminboeuf, and P. R. Schleyer. System- tetrakis(isonitrile)rhodium(I) monomer and dimer are atic errors in computed alkane energies using B3LYP and other popular DFT functionals. Org. Lett., 8:3631–3634, 2006. given as a tarball. The TOMC6 benchmark set (struc- 14E. R. Johnson and G. A. DiLabio. Structure and binding en- tures and reference molecular C6 coefficients) is given as a ergies in van der Waals dimers: Comparison between density tarball. Furthermore, we share the fitting set used for de- functional theory and correlated ab initio methods. Chem. Phys. termination of damping parameters, to ensure that new Lett., 419:333–339, 2006. 15 DFAs can be parameterized consistently. S. Grimme, A. Hansen, J. G. Brandenburg, and C. Bannwarth. Dispersion-Corrected Mean-Field Electronic Structure Meth- ods. Chem. Rev., 116:5105–5154, 2016. 16S. Grimme, J. Antony, S. Ehrlich, and H. Krieg. A consistent VII. REFERENCES and accurate ab initio parametrization of density functional dis- persion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys., 132:154104, 2010. 17A. D. Becke and E. R. Johnson. Exchange-hole dipole moment 1 R. G. Parr and W. Yang. Density-functional theory of the elec- and the dispersion interaction. J. Chem. Phys., 122:154104, tronic structure of molecules. Annu. Rev. Phys. Chem., 46:701– 2005. 728, 1995. 18A. D. Becke and E. R. Johnson. A density-functional model of 2 A. D. Becke. 
Perspective: Fifty years of density-functional the- the dispersion interaction. J. Chem. Phys., 123:154101, 2005. ory in chemical physics. J. Chem. Phys., 140:18A301, 2014. 19A. Otero-de-la Roza and E. R. Johnson. Many-body dispersion 3 R. Sure, J. Antony, and S. Grimme. Blind Prediction of Bind- interactions from the exchange-hole dipole moment model. J. ing Affinities for Charged Supramolecular HostGuest Systems: Chem. Phys., 138:054103, 2013. Achievements and Shortcomings of DFT-D3. J. Phys. Chem. 20A. Tkatchenko and M. Scheffler. Accurate Molecular Van Der B, 118:3431–3440, 2014. Waals Interactions from Ground-State Electron Density and 4 J. Yin, N. M. Henriksen, D. R. Slochower, M. R. Shirts, M. W. Free-Atom Reference Data. Phys. Rev. Lett., 102:073005, 2009. Chiu, D. L. Mobley, and M. K. Gilson. Overview of the SAMPL5 21M. Dion, H. Rydberg, E. Schr¨oder,D. C. Langreth, and B. I. host–guest challenge: Are we doing better? J. Comput. Aided Lundqvist. Van der Waals Density Functional for General Ge- Mol. Des., 31:1–19, 2017. 18

ometries. Phys. Rev. Lett., 92:246401, 2004. 44F. Furche, R. Ahlrichs, C. H¨attig,W. Klopper, M. Sierka, and 22K. Berland, V. R. Cooper, K. Lee, E. Schr¨oder,T. Thonhauser, F. Weigend. Turbomole. WIREs Comput. Mol. Sci., 4:91–100, P. Hyldgaard, and B. I. Lundqvist. van der Waals forces in 2014. density functional theory: a review of the vdW-DF method. 45R. Ahlrichs, M. B¨ar,M. H¨aser,H. Horn, and C. K¨olmel. Elec- Rep. Prog. Phys., 78:066501, 2015. tronic structure calculations on workstation computers: The 23O. A. Vydrov and T. Van Voorhis. Improving the accuracy program system . Chem. Phys. Lett., 162:165–169, of the nonlocal van der Waals density functional with minimal 1989. empiricism. J. Chem. Phys., 130:104105, 2009. 46K. J. Miller. Additivity methods in molecular polarizability. J. 24O. A Vydrov and T. Van Voorhis. Nonlocal van der Waals Am. Chem. Soc., 112:8533–8542, 1990. Density Functional Made Simple. Phys. Rev. Lett., 103:063004, 47L. Pauling. The Nature of the Chemical Bond, 3rd edit. Cornell 2009. University, Press, Ithaca, New York, page 17, 1960. 25L. Goerigk, A. Hansen, C. Bauer, S. Ehrlich, A. Najibi, and 48P. Pyykk¨oand M. Atsumi. Molecular Single-Bond Covalent S. Grimme. A look at the density functional theory zoo with the Radii for Elements 1–118. Chem.-Eur. J., 15:186–197, 2009. advanced GMTKN55 database for general main group thermo- 49C. Bannwarth, S. Ehlert, and S. Grimme. GFN2-xTB - chemistry, kinetics and noncovalent interactions. Phys. Chem. an Accurate and Broadly Parametrized Self-Consistent Tight- Chem. Phys., 19:32184–32215, 2017. Binding Quantum Chemical Method with Multipole Elec- 26E. Caldeweyher, C. Bannwarth, and S. Grimme. Exten- trostatics and Density-Dependent Dispersion Contributions sion of the D3 dispersion coefficient model. J .Chem. Phys., DOI:10.1021/acs.jctc.8b01176. 2019. 147:034112, 2017. 50K. B. Wiberg. Application of the pople-santry-segal CNDO 27S. A. Ghasemi, A. Hofstetter, S. Saha, and S. Goedecker. 
In- method to the cyclopropylcarbinyl and cyclobutyl cation and to teratomic potentials for ionic systems with density functional bicyclobutane. Tetrahedron, 24:1083–1096, 1968. accuracy based on charge densities obtained by a neural net- 51W. Reckien, F. Janetzko, M. F. Peintinger, and T. Bredow. Im- work. Phys. Rev. B, 92:045131, 2015. plementation of empirical dispersion corrections to density func- 28R. A. DiStasio, O. A. von Lilienfeld, and A. Tkatchenko. Col- tional theory for periodic systems. J. Comput. Chem., 33:2023– lective many-body van der Waals interactions in molecular sys- 2031, 2012. tems. Proc. Natl. Acad. Sci. USA, 109:14791–14795, 2012. 52T. M. Trnka and R. H. Grubbs. The development of L2X2Ru 29A. Ambrosetti, D. Alf`e,R. A. DiStasio Jr., and A. Tkatchenko. CHR olefin metathesis catalysts: an organometallic success Hard Numbers for Large Molecules: Toward Exact Energetics story. Acc. Chem. Res., 34:18–29, 2001. for Supramolecular Systems. J. Phys. Chem. Lett., 5:849–855, 53J. J. Van Veldhuizen, D. G. Gillingham, S. B. Garber, 2014. O. Kataoka, and A. H. Hoveyda. Chiral Ru-based complexes 30S. M. Gatica, M. W. Cole, and D. Velegol. Designing van der for asymmetric olefin metathesis: Enhancement of catalyst ac- Waals forces between nanocolloids. Nano Lett., 5:169–173, 2005. tivity through steric and electronic modifications. J. Am. Chem. 31H. Y. Kim, J. O. Sofo, D. Velegol, M. W. Cole, and A. A. Lucas. Soc., 125:12502–12508, 2003. Van der Waals dispersion forces between dielectric nanoclusters. 54N. Artrith, T. Morawietz, and J. Behler. High-dimensional Langmuir, 23:1735–1740, 2007. neural-network potentials for multicomponent systems: Appli- 32V. V. Gobre and A. Tkatchenko. Scaling laws for van der cations to zinc oxide. Phys. Rev. B, 83:153101, 2011. Waals interactions in nanostructured materials. Nat. Commun., 55J. R. Bunch and L. Kaufman. A computational method for the 4:2341, 2013. indefinite quadratic programming problem. 
Linear Algebra Its 33T. Buˇcko, S. Leb`egue, T. Gould, and J. G. Angy´an.´ Many- Appl., 34:341–370, 1980. body dispersion corrections for periodic systems: an efficient 56J. D. Chai and M. Head-Gordon. Long-range corrected double- reciprocal space implementation. J. Phys. Condens. Matter, hybrid density functionals with damped atom-atom dispersion 28:045201, 2016. corrections. Phys. Chem. Chem. Phys., 10:6615–6620, 2000. 34M. J. Elrod and R. J. Saykally. Many-body effects in intermolec- 57S. Grimme, S. Ehrlich, and L. Goerigk. Effect of the Damping ular forces. Chem. Rev., 94:1975–1997, 1994. Function in Dispersion Corrected Density Functional Theory. 35A. G. Donchev. Many-body effects of dispersion interaction. J. J. Comput. Chem., 32:1456–1465, 2011. Chem. Phys., 125:074713, 2006. 58J. Cao and B. J. Berne. Many-body dispersion forces of po- 36J. Hermann, R. A. DiStasio Jr, and A. Tkatchenko. First- larizable clusters and liquids. J. Chem. Phys., 97:8628–8636, principles models for van der Waals interactions in molecules 1992. and materials: Concepts, theory, and applications. Chem. Rev., 59C. Adamo and V. Barone. Toward reliable density functional 117:4714–4758, 2017. methods without adjustable parameters: The PBE0 model. J. 37J. F. Dobson. Beyond pairwise additivity in London dispersion Chem. Phys., 110:6158–6170, 1999. interactions. Int. J. Quantum Chem., 114:1157–1161, 2014. 60F. Weigend and R. Ahlrichs. Balanced basis sets of split valence, 38A. Tkatchenko, R. A. DiStasio Jr, R. Car, and M. Scheffler. triple zeta valence and quadruple zeta valence quality for H to Accurate and efficient method for many-body van der Waals Rn: Design and assessment of accuracy. Phys. Chem. Chem. interactions. Phys. Rev. Lett., 108:236402, 2012. Phys., 7:3297–3305, 2005. 39R. A. DiStasio, V. V. Gobre, and A. Tkatchenko. Many-Body 61H. Schr¨oder,A. Creon, and T. Schwabe. Reformulation of the van der Waals Interactions in Molecules and Condense Matter. 
D3 (Becke–Johnson) Dispersion Correction without Resorting J. Phys.: Condens. Matter, 26:213202, 2014. to Higher than C 6 Dispersion Coefficients. J. Chem. Theory 40B. M. Axilrod and E. Teller. Interaction of the van der Waals Comput., 11:3163–3170, 2015. type between three atoms. J. Chem. Phys., 11:299–300, 1943. 62H. Schr¨oder,J. H¨uhnert,and T. Schwabe. Evaluation of DFT- 41Y. Muto. Force between nonpolar molecules. In Proc. Phys. D3 dispersion corrections for various structural benchmark sets. Math. Soc. Jpn., volume 17, pages 629–631, 1943. J. Chem. Phys., 146:044115, 2017. 42D. C. Ghosh and N. Islam. Semiempirical evaluation of the 63J. Witte, N. Mardirossian, J. B. Neaton, and M. Head-Gordon. global hardness of the atoms of 103 elements of the periodic Assessing DFT-D3 Damping Functions Across Widely Used table using the most probable radii as their size descriptors. Density Functionals: Can We Do Better? J. Chem. Theory Int. J. Quantum Chem., 110:1206–1213, 2010. Comput., 13:2043–2052, 2017. 43TURBOMOLE V7.0 2015, a development of University of Karl- 64D. G. A. Smith, L. A. Burns, K. Patkowski, and C. D. Sherrill. sruhe and Forschungszentrum Karlsruhe GmbH, 1989-2007, Revised damping parameters for the D3 dispersion correction to TURBOMOLE GmbH, since 2007; available from density functional theory. J. Phys. Chem. Lett., 7:2197–2203, http://www.turbomole.com. 2016. 19

65 J. Sun, A. Ruzsinszky, and J. P. Perdew. Strongly constrained and appropriately normed semilocal density functional. Phys. Rev. Lett., 115:036402, 2015.
66 J. G. Brandenburg, J. E. Bates, J. Sun, and J. P. Perdew. Benchmark tests of a strongly constrained semilocal functional with a long-range dispersion correction. Phys. Rev. B, 94:115144, 2016.
67 F. Neese. The ORCA program system. WIREs Comput. Mol. Sci., 2:73–78, 2012.
68 F. Neese. Software update: the ORCA program system, version 4.0. WIREs Comput. Mol. Sci., 8, 2018.
69 O. Vahtras, J. Almlöf, and M. W. Feyereisen. Integral approximations for LCAO-SCF calculations. Chem. Phys. Lett., 213(5-6):514–518, 1993.
70 K. Eichkorn, F. Weigend, O. Treutler, and R. Ahlrichs. Auxiliary basis sets for main row atoms and transition metals and their use to approximate Coulomb potentials. Theor. Chem. Acc., 97(1):119–124, 1997.
71 F. Weigend. Accurate Coulomb-fitting basis sets for H to Rn. Phys. Chem. Chem. Phys., 8(9):1057–1065, 2006.
72 F. Weigend, F. Furche, and R. Ahlrichs. Gaussian basis sets of quadruple zeta quality for atoms H to Kr. J. Chem. Phys., 119:12753–12762, 2003.
73 B. Brauer, M. K. Kesharwani, S. Kozuch, and J. M. L. Martin. The S66x8 benchmark for noncovalent interactions revisited: explicitly correlated ab initio methods and density functional theory. J. Chem. Theory Comput., 18:20905–20925, 2016.
74 L. Gráfová, M. Pitonak, J. Řezáč, and P. Hobza. Comparative study of selected wave function and density functional methods for noncovalent interaction energy calculations using the extended S22 data set. J. Chem. Theory Comput., 6:2365–2376, 2010.
75 D. E. Taylor, J. G. Ángyán, G. Galli, C. Zhang, F. Gygi, K. Hirao, J. W. Song, K. Rahul, O. Anatole von Lilienfeld, R. Podeszwa, et al. Blind test of density-functional-based methods on intermolecular interaction energies. J. Chem. Phys., 145:124105, 2016.
76 S. J. A. van Gisbergen, J. G. Snijders, and E. J. Baerends. A density functional theory study of frequency-dependent polarizabilities and Van der Waals dispersion coefficients for polyatomic molecules. J. Chem. Phys., 103:9347–9354, 1995.
77 V. P. Osinga, S. J. A. van Gisbergen, J. G. Snijders, and E. J. Baerends. Density functional results for isotropic and anisotropic multipole polarizabilities and C6, C7 and C8 Van der Waals dispersion coefficients for molecules. J. Chem. Phys., 106:5091–5101, 1997.
78 A. Otero de la Roza, F. Kannemann, E. R. Johnson, R. M. Dickson, H. Schmider, and A. D. Becke. Exchange-Hole Dipole Moment Model standalone. http://schooner.chem.dal.ca/wiki/Postg, 2013. [Online; accessed 2/15/2018].
79 J. P. Perdew, K. Burke, and M. Ernzerhof. Generalized Gradient Approximation Made Simple. Phys. Rev. Lett., 77:3865–3868, 1996. Erratum: Phys. Rev. Lett. 78, 1396 (1997).
80 A. Tkatchenko. Many-Body Dispersion (MBD) standalone. http://www.fhi-berlin.mpg.de/~tkatchen/MBD/, 2014. [Online; accessed 2/15/2018].
81 J. Tao, J. P. Perdew, V. N. Staroverov, and G. E. Scuseria. Climbing the Density Functional Ladder: Nonempirical Meta Generalized Gradient Approximation Designed for Molecules and Solids. Phys. Rev. Lett., 91:146401, 2003.
82 J. G. Brandenburg, C. Bannwarth, A. Hansen, and S. Grimme. B97-3c: A revised low-cost variant of the B97-D density functional method. J. Chem. Phys., 148:064104, 2018.
83 R. Sedlak, T. Janowski, M. Pitoňák, J. Řezáč, P. Pulay, and P. Hobza. Accuracy of quantum chemical methods for large noncovalent complexes. J. Chem. Theory Comput., 9:3364–3374, 2013.
84 R. Sure and S. Grimme. Comprehensive Benchmark of Association (Free) Energies of Realistic Host–Guest Complexes. J. Chem. Theory Comput., 11:3785–3801, 2015.
85 S. F. Boys and F. Bernardi. The calculation of small molecular interactions by the differences of separate total energies. Some procedures with reduced errors. Mol. Phys., 19:553–566, 1970.
86 C. Riplinger, B. Sandhoefer, A. Hansen, and F. Neese. Natural triple excitations in local coupled cluster calculations with pair natural orbitals. J. Chem. Phys., 139:134101, 2013.
87 C. Riplinger and F. Neese. An efficient and near linear scaling pair natural orbital based local coupled cluster method. J. Chem. Phys., 138:034106, 2013.
88 C. Riplinger, P. Pinski, U. Becker, E. F. Valeev, and F. Neese. Sparse maps – A systematic infrastructure for reduced-scaling electronic structure methods. II. Linear scaling domain based pair natural orbital coupled cluster theory. J. Chem. Phys., 144:024109, 2016.
89 H. Kruse, A. Mladek, K. Gkionis, A. Hansen, S. Grimme, and J. Sponer. Quantum Chemical Benchmark Study on 46 RNA Backbone Families Using a Dinucleotide Unit. J. Chem. Theory Comput., 11:4972–4991, 2015.
90 G. I. Csonka, A. D. French, G. P. Johnson, and C. A. Stortz. Evaluation of Density Functionals and Basis Sets for Carbohydrates. J. Chem. Theory Comput., 5:679–692, 2009.
91 D. Řeha, H. Valdès, J. Vondrášek, P. Hobza, A. Abu-Riziq, B. Crews, and M. S. de Vries. Structure and IR Spectrum of Phenylalanyl–Glycyl–Glycine Tripetide in the Gas-Phase: IR/UV Experiments, Ab Initio Quantum Chemical Calculations, and Molecular Dynamic Simulations. Chem. Eur. J., 11, 2005.
92 L. Goerigk, A. Karton, J. M. L. Martin, and L. Radom. Accurate quantum chemical energies for tetrapeptide conformations: why MP2 data with an insufficient basis set should be handled with caution. Phys. Chem. Chem. Phys., 15:7028–7031, 2013.
93 L. Goerigk, A. Hansen, C. A. Bauer, S. Ehrlich, A. Najibi, and S. Grimme. GMTKN55 Database. https://www.chemie.uni-bonn.de/pctc/mulliken-center/software/GMTKN/gmtkn, 2017. [Online; accessed 6/28/2018].
94 S. Dohm, A. Hansen, M. Steinmetz, S. Grimme, and M. P. Checinski. Comprehensive thermochemical benchmark set of realistic closed-shell metal organic reactions. J. Chem. Theory Comput., 14:2596–2608, 2018.
95 S. Dohm, A. Hansen, M. Steinmetz, S. Grimme, and M. P. Checinski. MOR41 benchmark set. https://www.chemie.uni-bonn.de/pctc/mulliken-center/software/mor41, 2018. [Online; accessed 6/28/2018].
96 F. Pavošević, C. Peng, P. Pinski, C. Riplinger, F. Neese, and E. F. Valeev. SparseMaps – A systematic infrastructure for reduced scaling electronic structure methods. V. Linear scaling explicitly correlated coupled-cluster method with pair natural orbitals. J. Chem. Phys., 146:174108, 2017.
97 F. Neese and E. F. Valeev. Revisiting the atomic natural orbital approach for basis sets: Robust systematic basis sets for explicitly correlated and conventional correlated ab initio methods? J. Chem. Theory Comput., 7:33–43, 2010.
98 S. Grimme, J. G. Brandenburg, C. Bannwarth, and A. Hansen. Consistent structures and interactions by density functional theory with small atomic orbital basis sets. J. Chem. Phys., 143:054107, 2015.
99 G. D. Zeiss and W. J. Meath. Dispersion energy constants C6(A, B), dipole oscillator strength sums and refractivities for Li, N, O, H2, N2, O2, NH3, H2O, NO and N2O. Mol. Phys., 33:1155–1176, 1977.
100 D. J. Margoliash and W. J. Meath. Pseudospectral dipole oscillator strength distributions and some related two body interaction coefficients for H, He, Li, N, O, H2, N2, O2, NO, N2O, H2O, NH3, and CH4. J. Chem. Phys., 68:1426–1431, 1978.
101 S. Grimme. Density functional theory with London dispersion corrections. WIREs Comput. Mol. Sci., 1:211–228, 2011.
102 T. Sato and H. Nakai. Local Response Dispersion Method. II. Generalized Multicenter Interactions. J. Chem. Phys., 133:194101, 2010.
103 O. A. Vydrov and T. Van Voorhis. Dispersion interactions from a local polarizability model. Phys. Rev. A, 81:062708, 2010.

104 S. Grimme and J. P. Djukic. Cation-Cation Attraction: When London Dispersion Attraction Wins over Coulomb Repulsion. Inorg. Chem., 50:2619–2628, 2011.
105 P. Jurečka, J. Šponer, J. Cerny, and P. Hobza. Benchmark database of accurate (MP2 and CCSD(T) complete basis set limit) interaction energies of small model complexes, DNA base pairs, and amino acid pairs. Phys. Chem. Chem. Phys., 8:1985–1993, 2006.
106 A. D. Becke. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys. Rev. A, 38:3098, 1988.
107 C. Lee, W. Yang, and R. G. Parr. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B, 37:785, 1988.
108 J. P. Perdew. Density-functional approximation for the correlation energy of the inhomogeneous electron gas. Phys. Rev. B, 33:8822, 1986.
109 Y. Zhao and D. G. Truhlar. A new local density functional for main-group thermochemistry, transition metal bonding, thermochemical kinetics, and noncovalent interactions. J. Chem. Phys., 125:194101, 2006.
110 Y. Zhao and D. G. Truhlar. Density functionals with broad applicability in chemistry. Acc. Chem. Res., 41:157–167, 2008.
111 N. C. Handy and D. J. Tozer. The development of new exchange-correlation functionals: 3. Mol. Phys., 94:707–715, 1998.
112 N. C. Handy and A. J. Cohen. Left-right correlation energy. Mol. Phys., 99:403–412, 2001.
113 Y. Zhang and W. Yang. Comment on "Generalized Gradient Approximation Made Simple". Phys. Rev. Lett., 80:890–890, 1998.
114 B. Hammer, L. B. Hansen, and J. K. Nørskov. Improved adsorption energetics within density-functional theory using revised Perdew-Burke-Ernzerhof functionals. Phys. Rev. B, 59:7413, 1999.
115 Y. Zhao and D. G. Truhlar. The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states, and transition elements: two new functionals and systematic testing of four M06-class functionals and 12 other functionals. Theor. Chem. Acc., 120:215–241, 2008.
116 A. D. Becke. Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys., 98:5648–5652, 1993.
117 P. J. Stephens, F. J. Devlin, C. F. N. Chabalowski, and M. J. Frisch. Ab initio calculation of vibrational absorption and circular dichroism spectra using density functional force fields. J. Phys. Chem., 98(45):11623–11627, 1994.
118 Y. Zhao and D. G. Truhlar. Design of density functionals that are broadly accurate for thermochemistry, thermochemical kinetics, and nonbonded interactions. J. Phys. Chem. A, 109:5656–5667, 2005.
119 S. Kozuch, D. Gruzman, and J. M. L. Martin. DSD-BLYP: A general purpose double hybrid density functional including spin component scaling and dispersion correction. J. Phys. Chem. C, 114:20801–20808, 2010.
120 A. Karton, A. Tarnopolsky, J. F. Lamére, G. C. Schatz, and J. M. L. Martin. Highly accurate first-principles benchmark data sets for the parametrization and validation of density functional and other approximate methods. Derivation of a robust, generally applicable, double-hybrid functional for thermochemistry and thermochemical kinetics. J. Phys. Chem. A, 112:12868–12886, 2008.
121 M. M. Olmstead and A. L. Balch. Rhodium (II) dimers: The preparation and structure of [(p-CH3C6H4NC)8Rh2I2]-[PF6]2. J. Organomet. Chem., 148:C15–C18, 1978.
122 H. Endres, N. Gottstein, H. J. Keller, R. Martin, W. Rodemer, and W. Steiger. Kristall- und Molekülstruktur von Tetrakis (4-fluorophenylisonitril) rhodium (I) chloridhydrat und Tetrakis (4-nitrophenylisonitril) rhodium (I) chlorid/Crystal and Molecular Structure of Tetrakis (4-fluorophenylisonitrile) rhodium (I) Chloride Hydrate and Tetrakis (4-nitrophenylisonitrile) rhodium (I) Chloride. Z. Naturforsch. B, 34:827–833, 1979.
123 N. T. Tran, J. R. Stork, D. Pham, M. M. Olmstead, J. C. Fettinger, and A. L. Balch. Variation in crystallization conditions allows the isolation of trimeric as well as dimeric and monomeric forms of [(alkyl isocyanide)4Rh^I]+. Chem. Comm., 10:1130–1132, 2006.
124 S. Kozuch and J. M. L. Martin. Spin-component-scaled double hybrids: An extensive search for the best fifth-rung functionals blending DFT and perturbation theory. J. Comput. Chem., 34:2327–2344, 2013.
125 A. Najibi and L. Goerigk. The Nonlocal Kernel in van der Waals Density Functionals as an Additive Correction: An Extensive Analysis with Special Emphasis on the B97M-V and ωB97M-V Approaches. J. Chem. Theory Comput., 14:5725–5738, 2018.
126 M. Bühl and H. Kabrede. Geometries of transition-metal complexes from density-functional theory. J. Chem. Theory Comput., 2:1282–1290, 2006.
127 J. Řezáč, K. E. Riley, and P. Hobza. S66: A Well-balanced Database of Benchmark Interaction Energies Relevant to Biomolecular Structures. J. Chem. Theory Comput., 7:2427, 2011.
128 S. Grimme and M. Steinmetz. Effects of London Dispersion Correction in Density Functional Theory on the Structures of Organic Molecules in the Gas Phase. Phys. Chem. Chem. Phys., 15:16031–16042, 2013.
129 T. Risthaus, M. Steinmetz, and S. Grimme. Implementation of Nuclear Gradients of Range-Separated Hybrid Density Functionals and Benchmarking on Rotational Constants for Organic Molecules. J. Comput. Chem., 35:1509–1516, 2014.
130 S. Grimme, C. Bannwarth, E. Caldeweyher, J. Pisarek, and A. Hansen. A general intermolecular force field based on tight-binding quantum chemical calculations. J. Chem. Phys., 147:161708, 2017.
131 E. Caldeweyher, S. Ehlert, and S. Grimme. DFT–D4 standalone. https://www.chemie.uni-bonn.de/pctc/mulliken-center/software/dft-d4/dft-d4, 2018. [Online; accessed XX/XX/2018].
132 H. J. Werner, P. J. Knowles, G. Knizia, F. R. Manby, and M. Schütz. Molpro: a general-purpose quantum chemistry program package. WIREs Comput. Mol. Sci., 2:242–253, 2012.

A generally applicable atomic-charge dependent London dispersion correction

Eike Caldeweyher, Sebastian Ehlert, Andreas Hansen, Hagen Neugebauer, Sebastian Spicher, Christoph Bannwarth, and Stefan Grimme

Supplementary Material

Part I: Additional theory, parameters, and timings
  Classical partial charges
  Many-body dispersion theory
  Tight-binding two-body dispersion potential (GFN2-xTB)
  Double hybrid density functionals
  BJ-damping parameters
  Timings of energy and gradient calls

Part II: Statistical measures and evaluations
  Extended statistical measures
  Statistical evaluation: S30L
  Statistical evaluation: L7
  Statistical evaluation: MOR41
  Statistical evaluation: SCONF
  Statistical evaluation: PCONF21
  Statistical evaluation: ICONF
  Statistical evaluation: UPU23
  Statistical evaluation: ROT34
  Statistical evaluation: LMGB35
  Statistical evaluation: HMGB11
  Statistical evaluation: TMC32

References

Part I: Additional theory, parameters, and timings

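A. Classical partial charges

Classical electronegativity equilibration (EEQ) partial charges are determined by minimizing the following energy expression

\begin{equation}
E_{\mathrm{IES}} = \sum_{i=1}^{N} \left[ \chi_i q_i + \frac{1}{2} \left( J_{ii} + \frac{2\gamma_{ii}}{\sqrt{\pi}} \right) q_i^2 \right] + \frac{1}{2} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} q_i q_j \frac{\mathrm{erf}(\gamma_{ij} R_{ij})}{R_{ij}}
\tag{1}
\end{equation}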

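where $\gamma_{ij}$ is given as $(a_i^2 + a_j^2)^{-1/2}$ with $a_i$ being the van der Waals radius of atom $i$. For a more compact representation we rewrite the above expression in matrix notation

\begin{equation}
E_{\mathrm{IES}} = \mathbf{q}^{\top} \left( \frac{1}{2} \mathbf{A} \mathbf{q} - \mathbf{X} \right)
\tag{2}
\end{equation}

where we define the $\mathbf{A}$ matrix and the $\mathbf{X}$ vector by

\begin{equation}
X_i = -\chi_i \quad \text{and} \quad A_{ij} =
\begin{cases}
J_{ii} + \dfrac{2\gamma_{ii}}{\sqrt{\pi}} & i = j \\[2ex]
\dfrac{\mathrm{erf}(\gamma_{ij} R_{ij})}{R_{ij}} & \text{otherwise}
\end{cases}
\tag{3}
\end{equation}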

Note that $\mathbf{X}$ is defined following the work of Goedecker et al. from 2015 [1], and we keep the original notation to aid comparability. To obtain EEQ partial charges from these equations under the constraint that the partial charges conserve the total charge $q_{\mathrm{total}}$ of the system, the method of Lagrange multipliers is used as

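\begin{equation}
L = E_{\mathrm{IES}} + \lambda \left( \sum_{k=1}^{N} q_k - q_{\mathrm{total}} \right)
\quad \text{with} \quad
\frac{\partial L}{\partial \mathbf{q}} = 0 \;\wedge\; \frac{\partial L}{\partial \lambda} = \sum_{i=1}^{N} q_i - q_{\mathrm{total}} = 0
\tag{4}
\end{equation}

which leads to the following set of $(N + 1)$ linear equations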

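\begin{equation}
\begin{pmatrix} \mathbf{A} & \mathbf{1} \\ \mathbf{1}^{\top} & 0 \end{pmatrix}
\cdot
\begin{pmatrix} \mathbf{q} \\ \lambda \end{pmatrix}
=
\begin{pmatrix} \mathbf{X} \\ q_{\mathrm{total}} \end{pmatrix}
\tag{5}
\end{equation}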
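The linear system of equation 5 can be illustrated with a short sketch. The following Python example is a minimal illustration under stated assumptions, not the reference implementation: the function names `solve_linear` and `eeq_charges`, the plain Gaussian elimination, and all numerical inputs (a diatomic with made-up electronegativities, hardness terms, and radii) are hypothetical. It builds A and X as defined in equation 3 and solves for the charges and the Lagrange multiplier.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def eeq_charges(coords, chi, hardness, radii, q_total=0.0):
    """Return (charges, lambda) from the bordered linear system of eq. 5."""
    n = len(coords)
    big = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            gamma = 1.0 / math.sqrt(radii[i] ** 2 + radii[j] ** 2)
            if i == j:
                # Diagonal: J_ii + 2 * gamma_ii / sqrt(pi)
                big[i][j] = hardness[i] + 2.0 * gamma / math.sqrt(math.pi)
            else:
                r = math.dist(coords[i], coords[j])
                big[i][j] = math.erf(gamma * r) / r
        big[i][n] = 1.0
        big[n][i] = 1.0
    # X_i = -chi_i; the last entry enforces the total charge.
    rhs = [-c for c in chi] + [q_total]
    solution = solve_linear(big, rhs)
    return solution[:n], solution[n]


# Hypothetical diatomic (all numbers are illustrative, not fitted parameters).
charges, multiplier = eeq_charges(
    coords=[(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)],
    chi=[1.0, 1.5], hardness=[0.3, 0.3], radii=[1.0, 1.0])
print(charges, multiplier)
```

In this toy case the atom with the larger electronegativity takes up the negative charge, and the charges sum to the requested total charge by construction.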

In contrast to Goedecker's approach, we do not determine $\chi_i$ by a neural network but use a modified variant of the coordination number (mCN), similar to the DFT-D3 model [2]. For this EEQ charge model we suggest

\begin{equation}
\chi_i = \mathrm{EN}_i - \kappa_i \sqrt{\mathrm{mCN}_i}
\tag{6}
\end{equation}

where $\mathrm{EN}_i$ is the electronegativity, $\kappa_i$ is a scaling factor for the geometry dependency, and $\mathrm{mCN}_i$ is the coordination number defined as

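\begin{equation}
\mathrm{mCN}_i = \sum_{\substack{j=1 \\ j \neq i}}^{N} \frac{1}{2} \cdot \left( 1 + \mathrm{erf}\left( -k_1 \cdot \left( \frac{R_{ij}}{R_{ij}^{\mathrm{cov}}} - 1 \right) \right) \right)
\tag{7}
\end{equation}

where $k_1$ is an ad hoc parameter which is set to 7.5 to reproduce the short-range behaviour of the original DFT-D3 CN as closely as possible while having a better long-range behaviour. $R_{ij}^{\mathrm{cov}} = R_i^{\mathrm{cov}} + R_j^{\mathrm{cov}}$ is the sum of the covalent radii published by Pyykkö et al. in 2010 [3], which are used to be consistent with the DFT-D3 CN. As we arrived at a stationary point in the constrained optimization, we can derive the expression needed to calculate the analytical partial charge derivative by

\begin{equation}
\begin{aligned}
\frac{\partial L}{\partial q_k} = 0 \;&\Rightarrow\; \frac{\mathrm{d}}{\mathrm{d} R_j} \frac{\partial L}{\partial q_k} = 0 = \frac{\partial^2 L}{\partial q_k \partial R_j} + \frac{\partial^2 L}{\partial q_k^2} \cdot \frac{\partial q_k}{\partial R_j} \\
&\Longleftrightarrow\; \frac{\partial^2 L}{\partial q_k^2} \cdot \frac{\partial q_k}{\partial R_j} = -\frac{\partial^2 L}{\partial q_k \partial R_j}
\end{aligned}
\tag{8}
\end{equation}

Plugging in the expression for $L$ from equation 4 we get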

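\begin{equation}
\frac{\partial^2 L}{\partial R_j \partial \mathbf{q}} =
\begin{pmatrix} \dfrac{\partial \mathbf{A}}{\partial R_j} & \mathbf{0} \\ \mathbf{0}^{\top} & 0 \end{pmatrix}
\cdot
\begin{pmatrix} \mathbf{q} \\ \lambda \end{pmatrix}
+
\begin{pmatrix} \mathbf{A} & \mathbf{1} \\ \mathbf{1}^{\top} & 0 \end{pmatrix}
\cdot
\begin{pmatrix} \dfrac{\partial \mathbf{q}}{\partial R_j} \\[1.5ex] \dfrac{\partial \lambda}{\partial R_j} \end{pmatrix}
-
\begin{pmatrix} \dfrac{\partial \mathbf{X}}{\partial R_j} \\ 0 \end{pmatrix}
\tag{9}
\end{equation}

We do the same to obtain the electronic Hessian from equation 4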

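\begin{equation}
\frac{\partial^2 L}{\partial \mathbf{q}^2} = \begin{pmatrix} \mathbf{A} & \mathbf{1} \\ \mathbf{1}^{\top} & 0 \end{pmatrix}
\tag{10}
\end{equation}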

Plugging everything back into equation 8 we get

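\begin{equation}
\begin{pmatrix} \dfrac{\partial \mathbf{q}}{\partial R_j} \\[1.5ex] \dfrac{\partial \lambda}{\partial R_j} \end{pmatrix}
=
\begin{pmatrix} \mathbf{A} & \mathbf{1} \\ \mathbf{1}^{\top} & 0 \end{pmatrix}^{-1}
\cdot
\left[
- \begin{pmatrix} \dfrac{\partial \mathbf{A}}{\partial R_j} & \mathbf{0} \\ \mathbf{0}^{\top} & 0 \end{pmatrix}
\cdot
\begin{pmatrix} \mathbf{q} \\ \lambda \end{pmatrix}
+
\begin{pmatrix} \dfrac{\partial \mathbf{X}}{\partial R_j} \\ 0 \end{pmatrix}
\right]
\tag{11}
\end{equation}

To invert the indefinite but symmetric $(N + 1) \times (N + 1)$ matrix we apply a Bunch–Kaufman factorization. Overall, four parameters are fitted for each element $i$: $\mathrm{EN}_i$, $J_{ii}$, $\kappa_i$, and $a_i$ (namely the atomic electronegativity, atomic hardness term, element-specific scaling parameter, and atomic van der Waals radius).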
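Equation 11 can be checked numerically on a small model. The sketch below is a hedged illustration, not the authors' code: it treats a hypothetical diatomic on one axis, assumes geometry-independent electronegativities (κ = 0, so that ∂X/∂R vanishes), uses plain Gaussian elimination instead of a Bunch–Kaufman factorization, and compares the analytical charge derivative with a central finite difference. All names and numbers are illustrative.

```python
import math


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting (stand-in for Bunch-Kaufman)."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def bordered_matrix(r, hardness, radii):
    """Bordered matrix of eq. 5 for two atoms at distance r, plus gamma_12."""
    g12 = 1.0 / math.sqrt(radii[0] ** 2 + radii[1] ** 2)
    off = math.erf(g12 * r) / r
    diag = [hardness[i] + math.sqrt(2.0 / math.pi) / radii[i] for i in range(2)]
    return [[diag[0], off, 1.0], [off, diag[1], 1.0], [1.0, 1.0, 0.0]], g12


def charges(r, chi, hardness, radii, q_total=0.0):
    """Return [q1, q2, lambda] at distance r."""
    matrix, _ = bordered_matrix(r, hardness, radii)
    return solve_linear(matrix, [-chi[0], -chi[1], q_total])


def charge_derivative(r, chi, hardness, radii, q_total=0.0):
    """Analytical dq/dr from eq. 11 with dX/dr = 0 (kappa = 0 assumption)."""
    matrix, g12 = bordered_matrix(r, hardness, radii)
    q1, q2, _ = solve_linear(matrix, [-chi[0], -chi[1], q_total])
    # d/dr [erf(g r) / r]; the diagonal of A does not depend on r.
    d_off = (2.0 * g12 / math.sqrt(math.pi) * math.exp(-(g12 * r) ** 2) / r
             - math.erf(g12 * r) / r ** 2)
    rhs = [-d_off * q2, -d_off * q1, 0.0]
    return solve_linear(matrix, rhs)[:2]


# Illustrative parameters only.
args = ([1.0, 1.5], [0.3, 0.4], [1.0, 1.2])
step = 1.0e-5
plus, minus = charges(2.0 + step, *args), charges(2.0 - step, *args)
numerical = [(p - m) / (2.0 * step) for p, m in zip(plus[:2], minus[:2])]
print(charge_derivative(2.0, *args), numerical)
```

Because the same bordered matrix appears in equations 5 and 11, a single factorization can in principle be reused for the charges and for all derivative right-hand sides.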

Table A1: Atomic electronegativities EN, element-dependent atomic hardness terms J, element-specific scaling parameters κ, and atomic van der Waals radii a for all elements up to radon (Z = 86).

Atomic number   EN_i         J_ii          κ_i           a_i
 1              1.23695041   -0.35015861    0.04916110   0.55159092
 2              1.26590957    1.04121227    0.10937243   0.66205886
 3              0.54341808    0.09281243   -0.12349591   0.90529132
 4              0.99666991    0.09412380   -0.02665108   1.51710827
 5              1.26691604    0.26629137   -0.02631658   2.86070364
 6              1.40028282    0.19408787    0.06005196   1.88862966
 7              1.55819364    0.05317918    0.09279548   1.32250290
 8              1.56866440    0.03151644    0.11689703   1.23166285
 9              1.57540015    0.32275132    0.15704746   1.77503721
10              1.15056627    1.30996037    0.07987901   1.11955204
11              0.55936220    0.24206510   -0.10002962   1.28263182
12              0.72373742    0.04147733   -0.07712863   1.22344336
13              1.12910844    0.11634126   -0.02170561   1.70936266
14              1.12306840    0.13155266   -0.04964052   1.54075036
15              1.52672442    0.15350650    0.14250599   1.38200579
16              1.40768172    0.15250997    0.07126660   2.18849322
17              1.48154584    0.17523529    0.13682750   1.36779065
18              1.31062963    0.28774450    0.14877121   1.27039703
19              0.40374140    0.42937314   -0.10219289   1.64466502
20              0.75442607    0.01896455   -0.08979338   1.58859404
21              0.76482096    0.07179178   -0.08273597   1.65357953
22              0.98457281   -0.01121381   -0.01754829   1.50021521
23              0.96702598   -0.03093370   -0.02765460   1.30104175
24              1.05266584    0.02716319   -0.02558926   1.46301827
25              0.93274875   -0.01843812   -0.08010286   1.32928147
26              1.04025281   -0.15270393   -0.04163215   1.02766713
27              0.92738624   -0.09192645   -0.09369631   1.02291377
28              1.07419210   -0.13418723   -0.03774117   0.94343886
29              1.07900668   -0.09861139   -0.05759708   1.14881311
30              1.04712861    0.18338109    0.02431998   1.47080755
31              1.15018618    0.08299615   -0.01056270   1.76901636
32              1.15388455    0.11370033   -0.02692862   1.98724061
33              1.36313743    0.19005278    0.07657769   2.41244711
34              1.36485106    0.10980677    0.06561608   2.26739524
35              1.39801837    0.12327841    0.08006749   2.95378999
36              1.18695346    0.25345554    0.14139200   1.20807752
37              0.36273870    0.58615231   -0.05351029   1.65941046
38              0.58797255    0.16093861   -0.06701705   1.62733880
39              0.71961946    0.04548530   -0.07377246   1.61344972
40              0.96158233   -0.02478645   -0.02927768   1.63220728
41              0.89585296    0.01909943   -0.03867291   1.60899928
42              0.81360499    0.01402541   -0.06929825   1.43501286
43              1.00794665   -0.03595279   -0.04485293   1.54559205
44              0.92613682    0.01137752   -0.04800824   1.32663678
45              1.09152285   -0.03697213   -0.01484022   1.37644152
46              1.14907070    0.08009416    0.07917502   1.36051851
47              1.13508911    0.02274892    0.06619243   1.23395526
48              1.08853785    0.12801822    0.02434095   1.65734544
49              1.11005982   -0.02078702   -0.01505548   1.53895240
50              1.12452195    0.05284319   -0.03030768   1.97542736
51              1.21642129    0.07581190    0.01418235   1.97636542
52              1.36507125    0.09663758    0.08953411   2.05432381
53              1.40340000    0.09547417    0.08967527   3.80138135
54              1.16653482    0.07803344    0.07277771   1.43893803
55              0.34125098    0.64913257   -0.02129476   1.75505957
56              0.58884173    0.15348654   -0.06188828   1.59815118
57              0.68441115    0.05054344   -0.06568203   1.76401732
58              0.56999999    0.11000000   -0.11000000   1.63999999
59              0.56999999    0.11000000   -0.11000000   1.63999999
60              0.56999999    0.11000000   -0.11000000   1.63999999
61              0.56999999    0.11000000   -0.11000000   1.63999999
62              0.56999999    0.11000000   -0.11000000   1.63999999
63              0.56999999    0.11000000   -0.11000000   1.63999999
64              0.56999999    0.11000000   -0.11000000   1.63999999
65              0.56999999    0.11000000   -0.11000000   1.63999999
66              0.56999999    0.11000000   -0.11000000   1.63999999
67              0.56999999    0.11000000   -0.11000000   1.63999999
68              0.56999999    0.11000000   -0.11000000   1.63999999
69              0.56999999    0.11000000   -0.11000000   1.63999999
70              0.56999999    0.11000000   -0.11000000   1.63999999
71              0.56999999    0.11000000   -0.11000000   1.63999999
72              0.87936784   -0.02786741   -0.03585873   1.47055223
73              1.02761808    0.01057858   -0.03132400   1.81127084
74              0.93297476   -0.03892226   -0.05902379   1.40189963
75              1.10172128   -0.04574364   -0.02827592   1.54015481
76              0.97350071   -0.03874080   -0.07606260   1.33721475
77              1.16695666   -0.03782372   -0.02123839   1.57165422
78              1.23997927   -0.07046855    0.03814822   1.04815857
79              1.18464453    0.09546597    0.02146834   1.78342098
80              1.14191734    0.21953269    0.01580538   2.79106396
81              1.12334192    0.02522348   -0.00894298   1.78160840
82              1.01485321    0.15263050   -0.05864876   2.47588882
83              1.12950808    0.08042611   -0.01817842   2.37670734
84              1.30804834    0.01878626    0.07721851   1.76613217
85              1.33689961    0.08715453    0.07936083   2.66172302
86              1.27465977    0.10500484    0.05849285   2.82773085
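To illustrate how the tabulated parameters enter equations 6 and 7, the following Python sketch evaluates the modified coordination number and the resulting geometry-dependent electronegativity for a water-like geometry. It is a minimal sketch under assumptions: the EN and κ values for hydrogen and oxygen are taken from Table A1, whereas the geometry and the covalent radii are illustrative placeholders (not necessarily the values used in the reference implementation), and the function names are hypothetical.

```python
import math

K1 = 7.5  # value of k1 stated in the text


def mcn(index, coords, rcov):
    """Modified coordination number of atom `index` following eq. 7."""
    total = 0.0
    for j, xyz in enumerate(coords):
        if j == index:
            continue
        r = math.dist(coords[index], xyz)
        ratio = r / (rcov[index] + rcov[j])
        total += 0.5 * (1.0 + math.erf(-K1 * (ratio - 1.0)))
    return total


def chi(en, kappa, cn):
    """Geometry-dependent electronegativity following eq. 6."""
    return en - kappa * math.sqrt(cn)


# EN and kappa for O (Z = 8) and H (Z = 1) from Table A1.
EN = {"O": 1.56866440, "H": 1.23695041}
KAPPA = {"O": 0.11689703, "H": 0.04916110}
# Placeholder covalent radii and geometry in Angstrom (assumptions).
RCOV = {"O": 0.63, "H": 0.32}
symbols = ["O", "H", "H"]
coords = [(0.0, 0.0, 0.0), (0.76, 0.59, 0.0), (-0.76, 0.59, 0.0)]

radii = [RCOV[s] for s in symbols]
for i, s in enumerate(symbols):
    cn = mcn(i, coords, radii)
    print(s, cn, chi(EN[s], KAPPA[s], cn))
```

Each neighbour contributes exactly 0.5 when its distance equals the sum of the covalent radii, and the contribution decays smoothly to zero at larger distances.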

The quality of these classical partial charges can be seen in Figures 1 and 2, where we correlate PBE0/def2-TZVP Hirshfeld partial charges with classical EEQ charges and with GFN2-xTB charges. As can be seen from Table A2, GFN2-xTB converges in some cases to the wrong electronic solution, so that huge deviations can occur (maximum deviation of 19.92 e−). The EEQ model, on the other hand, proves to be quite robust, with a maximum deviation of 0.56 e− over more than 20000 calculated data points for the Z = 1–86 case.

Table A2: Statistical measures calculated for the comparison between calculated partial charges and reference PBE0/def2-TZVP Hirshfeld partial charges. Deviations are given in e−.

Measure   EEQ (Z = 1–86)   GFN2-xTB (Z = 1–86)   EEQ (Z = 1–17)   GFN2-xTB (Z = 1–17)
MAD       0.04             0.13                  0.03             0.18
MD        0.00             0.00                  0.00             -0.01
SD        0.06             0.36                  0.05             0.65
AMAX      0.56             19.27                 0.33             19.92

[Figure 1: correlation plot of q(ref) against q(calc), both axes ranging from −1.5 to 1.5, with the data series qTB and qEEQ.]

Figure 1: EEQ versus GFN2-xTB partial charges in direct correlation with Hirshfeld partial charges calculated at the PBE0/def2-TZVP level of theory for all elements with Z = 1–86.

[Figure 2: correlation plot of q(ref) against q(calc), both axes ranging from −1.5 to 1.5, with the data series qTB and qEEQ.]

Figure 2: EEQ versus GFN2-xTB partial charges in direct correlation with Hirshfeld partial charges calculated at the PBE0/def2-TZVP level of theory for all elements with Z = 1–17 (excluding helium and neon).

B. Many-body dispersion theory

Tkatchenko et al. [4] have shown that the dispersion energy can be written as

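\begin{equation}
E_{\mathrm{disp}}^{(n),\mathrm{MBD}} = \int_0^{\infty} \frac{\mathrm{d}\omega}{2\pi} \, \mathrm{Tr} \left\{ \ln \left( \mathbf{1} - \mathbf{A}(i\omega) \mathbf{T} \right) \right\},
\tag{12}
\end{equation}

when neglecting intra-oscillator interactions [5], which within the matrix formulation corresponds to $\mathrm{Tr}\{\mathbf{A}(i\omega)\mathbf{T}\} = 0$. In DFT-D4, the frequency-dependent polarizability matrix $\mathbf{A}(i\omega)$ is obtained from the previously generated atom-in-molecule dynamic polarizabilities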

\begin{equation}
\begin{aligned}
A_{KP}^{\beta\gamma}(i\omega) &= \alpha^{K}(i\omega) \, \delta_{KP} \delta_{\gamma\beta} \\
&= \alpha^{K}(i\omega, z^{K}, \mathrm{CN}^{K}) \, \delta_{KP} \delta_{\gamma\beta}.
\end{aligned}
\tag{13}
\end{equation}

In equation 13, $K$ and $P$ label atoms, and $\beta$ and $\gamma$ refer to the Cartesian components of their internuclear distance. The use of D4 atom-in-molecule dynamic polarizabilities offers advantages. Different from the TS-based polarizabilities, the D4 polarizabilities already contain information about the molecular environment and no self-consistent screening needs to be performed, which can jeopardize the stability of the method [6]. The generation of the D4 polarizabilities is simple and robust, since only the geometry and atomic partial charges are needed and no additional information from DFT is required. $\mathbf{T}$ is the interaction tensor describing the coupling between the oscillators. The matrix elements of the damped interaction tensor $\mathbf{T}$ are given by

\begin{equation}
T_{KP}^{\beta\gamma} = \sqrt{f_{\mathrm{BJD}}^{(6)}} \, \frac{\partial}{\partial R_{KP}^{\beta}} \frac{\partial}{\partial R_{KP}^{\gamma}} \left( \frac{1}{R_{KP}} \right).
\tag{14}
\end{equation}

It should be noted that the BJ-damping function is used here as well to screen the elements of the tensor. A motivation for this choice is given below. The MBD energy can be viewed as a series of n-body dipole-dipole terms, and hence, the n-body energy can be obtained directly via a Casimir-Polder-like integration of the coupled atom-in-molecule polarizabilities. Because the contributions of the terms in the series tend to oscillate and the series converges slowly with $n$, the value of the limit of the series is used here as computed in equation 12. The astute reader will note that the evaluation of the logarithmic trace in equation 12 is not directly possible since the product $\mathbf{A}(i\omega)\mathbf{T}$ is a traceless matrix. To obtain the logarithmic trace, the matrix created by subtraction, $(\mathbf{1} - \mathbf{A}(i\omega)\mathbf{T})$, is diagonalized and the logarithms of its eigenvalues are summed to calculate all many-body dispersion terms.
Furthermore, splitting the diagonal polarizability matrix $\mathbf{A}(i\omega)$ into the product of its square roots, which is possible due to the invariance regarding cyclic permutation, simplifies the problem to symmetric matrices only, which makes the calculation of eigenvalues much simpler

\begin{equation}
\left( \mathbf{1} - \mathbf{A}^{1/2}(i\omega) \, \mathbf{T} \, \mathbf{A}^{1/2}(i\omega) \right) \mathbf{U} = \mathbf{U} \boldsymbol{\Lambda}.
\tag{15}
\end{equation}

Here, $\boldsymbol{\Lambda}$ represents the matrix of eigenvalues with elements $\lambda$. The eigenvalues are then used analogously to equation 12, and hence the final expression for the MBD energy reads

\begin{equation}
E_{\mathrm{disp}}^{(n),\mathrm{MBD}} = \int_0^{\infty} \frac{\mathrm{d}\omega}{2\pi} \ln \left( \prod_{l=1}^{3N} \lambda_l \right).
\tag{16}
\end{equation}

Semi-local DFAs already include short-ranged electron correlation within the exchange-correlation functional. Along with avoiding singularities, this is why the dispersion energy is always damped at short range. Likewise, the interaction tensor in the MBD model needs to be damped. Ideally, the second-order term of the MBD energy should be exactly equivalent to the D4 two-body dipole-dipole energy, i.e.,

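\begin{equation}
\begin{aligned}
E_{\mathrm{disp}}^{(6),\mathrm{MBD}} &= -\int_0^{\infty} \frac{\mathrm{d}\omega}{2\pi} \, \frac{1}{2} \mathrm{Tr} \left\{ \left( \mathbf{A}(i\omega) \mathbf{T} \right)^2 \right\} \\
&= -\int_0^{\infty} \frac{\mathrm{d}\omega}{2\pi} \, \frac{1}{2} \sum_{K}^{N} \sum_{P}^{N} \frac{\alpha^{K}(i\omega) \alpha^{P}(i\omega)}{R_{KP}^{10}} f_{\mathrm{damp}}^{2} \times \sum_{\beta}^{3} \sum_{\gamma}^{3} \left( 3 R_{KP}^{\beta} R_{KP}^{\gamma} - \delta_{\beta\gamma} R_{KP}^{2} \right)^2 \\
&= -\frac{1}{2} \sum_{K}^{N} \sum_{P}^{N} \frac{C_6^{KP}}{R_{KP}^{6}} f_{\mathrm{damp}}^{2} \\
&= -\frac{1}{2} \sum_{K}^{N} \sum_{P}^{N} \frac{C_6^{KP}}{R_{KP}^{6}} f_{\mathrm{BJD}}^{(6)} = E_{\mathrm{disp}}^{(6)}.
\end{aligned}
\tag{17}
\end{equation}

Hence, the square root of the BJ-damping function is used to damp the MBD interaction tensor. Nevertheless, it should be noted that for higher interaction orders (higher powers of $f_{\mathrm{damp}}$), the respective MBD energy contributions become damped more strongly also in the mid-range distance regime. However, this effect is considered to be small, since the higher-order ($n > 2$) MBD energies represent a smaller fraction of the total dispersion energy (usually one to two orders of magnitude smaller than the two-body contributions). The final D4-MBD dispersion energy expression consists of two parts. The first comprises the two-body dipole-dipole and dipole-quadrupole interactions (denoted as $E_{\mathrm{disp}}^{(6,8)}$). The second part includes all dipole-dipole interactions up to infinite order, $E_{\mathrm{disp}}^{(n),\mathrm{MBD}}$ ($n = 6, 9, 12, 15, \ldots, \infty$). To avoid double counting of the two-body dipole-dipole energy, it is removed explicitly from the MBD energy according to

\begin{equation}
E_{\mathrm{disp}}^{\mathrm{D4-MBD}} = E_{\mathrm{disp}}^{(6,8)} + \left( E_{\mathrm{disp}}^{(n),\mathrm{MBD}} - E_{\mathrm{disp}}^{(6),\mathrm{MBD}} \right).
\tag{18}
\end{equation}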
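The relation between the infinite-order MBD energy (equations 12 and 16) and its second-order term (equation 17) can be made concrete with a small numerical experiment. The Python sketch below is illustrative only and rests on several assumptions that are not part of the D4 model: damping is switched off ($f = 1$), each atom carries a hypothetical single-pole polarizability $\alpha_0 / (1 + (\omega/\omega_0)^2)$, the frequency integral uses a simple midpoint rule on a mapped grid, and the logarithm of the eigenvalue product is obtained as a log-determinant from a Cholesky factorization rather than from an explicit diagonalization (both give $\ln \prod_l \lambda_l$ for a positive-definite matrix).

```python
import math


def dipole_tensor(ri, rj):
    """Undamped T^{bg} = (3 R_b R_g - delta_bg R^2) / R^5 between two atoms."""
    d = [rj[k] - ri[k] for k in range(3)]
    r2 = sum(x * x for x in d)
    r5 = r2 ** 2.5
    return [[(3.0 * d[b] * d[g] - (r2 if b == g else 0.0)) / r5
             for g in range(3)] for b in range(3)]


def log_det_spd(m):
    """ln det of a symmetric positive-definite matrix via Cholesky."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    logdet = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s)
                logdet += 2.0 * math.log(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return logdet


def mbd_energy(coords, alpha0, omega0, npts=2000):
    """Eq. 16 with the symmetrized matrix of eq. 15 (no damping, model alpha)."""
    n = len(coords)
    scale = sum(omega0) / n  # frequency scale of the mapping w = scale*t/(1-t)
    energy = 0.0
    for p in range(npts):
        t = (p + 0.5) / npts
        w = scale * t / (1.0 - t)
        weight = scale / (1.0 - t) ** 2 / npts
        root = [math.sqrt(alpha0[k] / (1.0 + (w / omega0[k]) ** 2))
                for k in range(n)]
        m = [[1.0 if r == c else 0.0 for c in range(3 * n)]
             for r in range(3 * n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                tij = dipole_tensor(coords[i], coords[j])
                for b in range(3):
                    for g in range(3):
                        m[3 * i + b][3 * j + g] = -root[i] * tij[b][g] * root[j]
        energy += weight * log_det_spd(m)
    return energy / (2.0 * math.pi)


# Two identical model atoms (atomic units, illustrative values).
alpha, omega, dist = 10.0, 0.5, 10.0
e_mbd = mbd_energy([(0.0, 0.0, 0.0), (0.0, 0.0, dist)], [alpha] * 2, [omega] * 2)
c6 = 0.75 * alpha ** 2 * omega  # Casimir-Polder C6 for this model
print(e_mbd, -c6 / dist ** 6)
```

For this well-separated pair the infinite-order energy lies very close to $-C_6/R^6$, in line with the statement that the higher-order terms are a small fraction of the total; for a dimer, the leading correction is the fourth-order term in $\mathbf{A}\mathbf{T}$, which makes the MBD energy slightly more negative.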

Exploiting that $E_{\mathrm{disp}}^{(6)} = E_{\mathrm{disp}}^{(6),\mathrm{MBD}}$ and re-arranging to $E_{\mathrm{disp}}^{\mathrm{D4-MBD}} = E_{\mathrm{disp}}^{(n),\mathrm{MBD}} + E_{\mathrm{disp}}^{(8)}$ is not possible in the general case, because for double hybrid density functionals (abbreviated as DHDF) $s_6 \neq 1$, and this scaling cannot be applied to an individual term in the infinite-order MBD energy. Hence, the dispersion energy in DFT-D4-MBD is always calculated as shown in Eq. 18. Similar to Figure 16 of Ref. [7], the contributions to the dispersion energy considered in D4 are put into context with other correction schemes in Figure 3.

[Figure 3: chart of the many-body order n (vertical axis, 2 to 5) against the angular momenta Σi li (horizontal axis, 2 to 5), showing the coefficients C6, C8, C10, C12 (n = 2), C9, C11, C13 (n = 3), C12, C14 (n = 4), and C15 (n = 5), with regions marked for D3, D4, MBD, and XDM.]

Figure 3: Asymptotic dispersion coefficients from different many-body orders and increasing number of terms in the multipole expansion. The contributions covered by the D3 (including ATM term), D4-MBD, MBD, and XDM methods are highlighted. This figure is generated in analogy to Figure 16 in Ref. [7].

C. Tight-binding two-body dispersion potential (GFN2-xTB)

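We developed the GFN2-xTB dispersion potential in terms of density fluctuations (see Ref. [8])

\begin{equation*}
\frac{\partial}{\partial c_{\nu i}} \left[ E_{\mathrm{disp}}^{(6,8)} - \sum_j n_j \varepsilon_j \left( \sum_{A,B} \sum_{\kappa \in A} \sum_{\lambda \in B} c_{\kappa j} c_{\lambda j} S_{\kappa\lambda} - 1 \right) \right] = 0
\end{equation*}

Take the derivative of $E_{\mathrm{disp}}^{(6,8)}$ with respect to the AO coefficient

\begin{align*}
\frac{\partial E_{\mathrm{disp}}^{(6,8)}}{\partial c_{\nu i}}
&= \frac{\partial}{\partial c_{\nu i}} \frac{1}{2} \sum_A \sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}}
\underbrace{\zeta(z^{A}, z^{A,\mathrm{ref}})}_{\zeta_A^{A,\mathrm{ref}}}
\underbrace{\zeta(z^{B}, z^{B,\mathrm{ref}})}_{\zeta_B^{B,\mathrm{ref}}}
W_A^{A,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AB}) \\
&= \frac{1}{2} \sum_A \sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}}
\frac{\partial \zeta_A^{A,\mathrm{ref}}}{\partial c_{\nu i}} \zeta_B^{B,\mathrm{ref}} W_A^{A,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AB}) \\
&\quad + \frac{1}{2} \sum_A \sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}}
\frac{\partial \zeta_B^{B,\mathrm{ref}}}{\partial c_{\nu i}} \zeta_A^{A,\mathrm{ref}} W_A^{A,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AB}) \\
&= \frac{1}{2} \sum_A \sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}} \sum_D
\frac{\partial \zeta_A^{A,\mathrm{ref}}}{\partial q_D} \frac{\partial q_D}{\partial c_{\nu i}} \zeta_B^{B,\mathrm{ref}} W_A^{A,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AB}) \\
&\quad + \frac{1}{2} \sum_A \sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}} \sum_D
\frac{\partial \zeta_B^{B,\mathrm{ref}}}{\partial q_D} \frac{\partial q_D}{\partial c_{\nu i}} \zeta_A^{A,\mathrm{ref}} W_A^{A,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AB}) \\
&= \frac{1}{2} \sum_A \sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}}
\frac{\partial \zeta_A^{A,\mathrm{ref}}}{\partial q_A} \frac{\partial q_A}{\partial c_{\nu i}} \zeta_B^{B,\mathrm{ref}} W_A^{A,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AB}) \\
&\quad + \frac{1}{2} \sum_A \sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}}
\frac{\partial \zeta_B^{B,\mathrm{ref}}}{\partial q_B} \frac{\partial q_B}{\partial c_{\nu i}} \zeta_A^{A,\mathrm{ref}} W_A^{A,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AB}) \\
&= \frac{1}{2} \sum_A \sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}}
\frac{\partial \zeta_A^{A,\mathrm{ref}}}{\partial q_A} \, \delta_{AD} \left( \sum_C \sum_{\kappa \in C} n_i c_{\kappa i} S_{\nu\kappa} + \sum_{\mu \in A} n_i c_{\mu i} S_{\nu\mu} \right)
\zeta_B^{B,\mathrm{ref}} W_A^{A,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AB}) \\
&\quad + \frac{1}{2} \sum_A \sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}}
\frac{\partial \zeta_B^{B,\mathrm{ref}}}{\partial q_B} \, \delta_{BD} \left( \sum_C \sum_{\kappa \in C} n_i c_{\kappa i} S_{\nu\kappa} + \sum_{\mu \in B} n_i c_{\mu i} S_{\nu\mu} \right)
\zeta_A^{A,\mathrm{ref}} W_A^{A,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AB}) \\
&= \frac{1}{2} \sum_{D,\mathrm{ref}=1}^{N_{D,\mathrm{ref}}} \frac{\partial \zeta_D^{D,\mathrm{ref}}}{\partial q_D}
\sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}} \zeta_B^{B,\mathrm{ref}} W_D^{D,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{DB}}{R_{DB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{DB})
\sum_C \sum_{\kappa \in C} n_i c_{\kappa i} S_{\nu\kappa} \\
&\quad + \frac{1}{2} \sum_{D,\mathrm{ref}=1}^{N_{D,\mathrm{ref}}} \frac{\partial \zeta_D^{D,\mathrm{ref}}}{\partial q_D}
\sum_A \sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \zeta_A^{A,\mathrm{ref}} W_A^{A,\mathrm{ref}} W_D^{D,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AD}}{R_{AD}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AD})
\sum_C \sum_{\kappa \in C} n_i c_{\kappa i} S_{\nu\kappa}
\end{align*}

\begin{equation*}
\frac{\partial E_{\mathrm{disp}}^{(6,8)}}{\partial c_{\nu i}} =
\underbrace{\sum_{A,\mathrm{ref}=1}^{N_{A,\mathrm{ref}}} \frac{\partial \zeta_A^{A,\mathrm{ref}}}{\partial q_A}
\sum_B \sum_{B,\mathrm{ref}=1}^{N_{B,\mathrm{ref}}} \zeta_B^{B,\mathrm{ref}} W_A^{A,\mathrm{ref}} W_B^{B,\mathrm{ref}}
\sum_{n=6,8} s_n \frac{C_{(n)}^{AB}}{R_{AB}^{(n)}} f_{\mathrm{BJD}}^{(n)}(R_{AB})}_{d_A}
\sum_C \sum_{\kappa \in C} n_i c_{\kappa i} S_{\nu\kappa}
\end{equation*}

which leads to the two-body DFT-D4 potential used within the GFN2-xTB method

\begin{equation*}
F_{\kappa\lambda}^{\mathrm{D4}} = \frac{1}{2} S_{\kappa\lambda} (d_A + d_B), \quad \forall \kappa \in A, \lambda \in B
\end{equation*}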
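The final expression for $F_{\kappa\lambda}^{\mathrm{D4}}$ amounts to scaling each overlap matrix element by the average of two atomic quantities. A minimal Python sketch of this assembly step is given below; the function name, the overlap matrix, the AO-to-atom mapping, and the $d_A$ values are all hypothetical inputs chosen for illustration and are not taken from GFN2-xTB.

```python
def d4_potential(overlap, ao_to_atom, d_atom):
    """Build F_kl = 0.5 * S_kl * (d_A + d_B) for AO k on atom A and AO l on atom B."""
    n = len(overlap)
    return [[0.5 * overlap[k][l] * (d_atom[ao_to_atom[k]] + d_atom[ao_to_atom[l]])
             for l in range(n)] for k in range(n)]


# Three AOs: two on atom 0 and one on atom 1 (illustrative numbers).
overlap = [[1.0, 0.2, 0.1],
           [0.2, 1.0, 0.3],
           [0.1, 0.3, 1.0]]
ao_to_atom = [0, 0, 1]
d_atom = [-0.02, -0.05]

fock_d4 = d4_potential(overlap, ao_to_atom, d_atom)
for row in fock_d4:
    print(row)
```

Since the overlap matrix is symmetric and the prefactor is symmetric in A and B, the resulting contribution is a symmetric matrix.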

D. Double hybrid density functionals

In the following we give the construction scheme to build the double hybrid density functionals (DHDF) as

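\begin{equation*}
E_{\mathrm{DHDF}} = (1 - a_x^{\mathrm{Fock}}) E_x^{\mathrm{DFT}} + a_x^{\mathrm{Fock}} E_x^{\mathrm{Fock}} + a_c^{\mathrm{DFT}} E_c^{\mathrm{DFT}} + a_c^{\mathrm{PT2,OS}} E_{c,\mathrm{OS}}^{\mathrm{PT2}} + a_c^{\mathrm{PT2,SS}} E_{c,\mathrm{SS}}^{\mathrm{PT2}}.
\end{equation*}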

Table A3: Double hybrid functional definitions as given in the literature.

Name         Exchange   Correlation   a_x^Fock   a_c^DFT   a_c^PT2,OS   a_c^PT2,SS   Ref.
B2PLYP       B88        LYP           0.5300     0.7300    0.2700       0.2700       [9]
mPW2PLYP     mPW        LYP           0.5500     0.7500    0.2500       0.2500       [10]
PWPB95       PW         B95           0.5000     0.7310    0.2690       0.0000       [11]
DSD-BLYP     B88        LYP           0.6900     0.5400    0.4600       0.3700       [12]
DSD-PBE      PBE        PBE           0.6800     0.4900    0.5500       0.1300       [13]
DSD-PBEB95   PBE        B95           0.6600     0.5500    0.4600       0.0900       [13]
DSD-PBEP86   PBE        P86           0.7000     0.4300    0.5300       0.2500       [13]
DSD-SVWN     Slater     VWN5          0.7200     0.3300    0.5900       0.1200       [13]
DOD-BLYP     B88        LYP           0.6500     0.5800    0.5300       0.0000       [13]
DOD-PBE      PBE        PBE           0.6400     0.5400    0.4200       0.0000       [13]
DOD-PBEB95   PBE        B95           0.6400     0.5700    0.4600       0.0000       [13]
DOD-PBEP86   PBE        P86           0.6500     0.4700    0.5400       0.0000       [13]
DOD-SVWN     Slater     VWN5          0.6900     0.3400    0.5800       0.0000       [13]
PBE0-2       PBE        PBE           0.7937     0.5000    0.5000       0.5000       [14]
PBE0-DH      PBE        PBE           0.5000     0.8750    0.1250       0.1250       [15]
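The construction scheme can be written down directly as a weighted sum. The following Python sketch shows this for the B2PLYP coefficients listed in Table A3; it assumes that the Fock exchange energy is the quantity weighted by $a_x^{\mathrm{Fock}}$, and the component energies passed in are hypothetical placeholder numbers, not results of any calculation.

```python
def dhdf_energy(ex_dft, ex_fock, ec_dft, ec_pt2_os, ec_pt2_ss,
                ax_fock, ac_dft, ac_pt2_os, ac_pt2_ss):
    """Combine exchange and correlation components into a DHDF energy."""
    return ((1.0 - ax_fock) * ex_dft
            + ax_fock * ex_fock
            + ac_dft * ec_dft
            + ac_pt2_os * ec_pt2_os
            + ac_pt2_ss * ec_pt2_ss)


# Mixing coefficients for B2PLYP as listed in Table A3.
B2PLYP = {"ax_fock": 0.53, "ac_dft": 0.73, "ac_pt2_os": 0.27, "ac_pt2_ss": 0.27}

# Placeholder component energies in Hartree (illustrative only).
energy = dhdf_energy(ex_dft=-8.90, ex_fock=-8.95, ec_dft=-0.36,
                     ec_pt2_os=-0.25, ec_pt2_ss=-0.08, **B2PLYP)
print(energy)
```

Setting the same-spin coefficient to zero, as in the DOD-type entries of Table A3, removes the same-spin PT2 contribution without any other change to the expression.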

E. BJ-damping parameters

Within this section we neglect explicit notation for EEQ charges and denote GFN2-xTB Mulliken-type charges as “TB”.

Different parametrizations are created for the application of either ATM or MBD for higher-order dipole-dipole interactions within the DFT-D4 treatment.

Table A4: BJ-damping parameters (DFT-D4-ATM, default model also abbreviated as DFT-D4) for various DFAs as derived by fitting to reference data (S66x8 [16], S22x5 [17], NCIBLIND10 [18]).

DFA               s6       s8            a1           a2
B1B95             1.0000    1.27701162   0.40554715   4.63323074
B1LYP             1.0000    1.98553711   0.39309040   4.55465145
B1P               1.0000    3.36115015   0.48665293   5.05219572
B1PW              1.0000    3.02227550   0.47396846   4.49845309
B2PLYP            0.6400    1.16888646   0.44154604   4.73114642
B3LYP             1.0000    2.02929367   0.40868035   4.53807137
B3P               1.0000    3.08822155   0.47324238   4.98682134
B3PW              1.0000    2.88364295   0.46990860   4.51641422
B97               1.0000    0.87854260   0.29319126   4.51647719
BHLYP             1.0000    1.65281646   0.27263660   5.48634586
BLYP              1.0000    2.34076671   0.44488865   4.09330090
BPBE              1.0000    3.64405246   0.52905620   4.11311891
BP                1.0000    3.35497927   0.43645861   4.92406854
BPW               1.0000    3.24571506   0.50050454   4.12346483
CAMB3LYP          1.0000    1.66041301   0.40267156   5.17432195
DODBLYP           0.4700    1.31146043   0.43407294   4.27914360
DODPBEB95         0.5600    0.01574635   0.43745720   3.69180763
DODPBE            0.4800    0.92051454   0.43037052   4.38067238
DODPBEP86         0.4600    0.71405681   0.42408665   4.52884439
DODSVWN           0.4200    0.94500207   0.47449026   5.05316093
DSDBLYP           0.5400    0.63018237   0.47591835   4.73713781
DSDPBEB95         0.5400   -0.14668670   0.46394587   3.64913860
DSDPBE            0.4500    0.70584116   0.45787085   4.44566742
DSDPBEP86         0.4700    0.37586675   0.53698768   5.13022435
DSDSVWN           0.4100    0.72914436   0.51347412   5.11858541
GLYP              1.0000    4.23798924   0.38426465   4.38412863
HF                1.0000    1.61679827   0.44959224   3.35743605
LB94              1.0000    2.59538499   0.42088944   3.28193223
LCBLYP            1.0000    1.60344180   0.45769839   7.86924893
Lh07s-SVWN        1.0000    3.16675531   0.35965552   4.31947614
Lh07t-SVWN        1.0000    2.09333001   0.35025189   4.34166515
Lh12ct-SsifPW92   1.0000    2.68467610   0.34190416   3.91039666
Lh12ct-SsirPW92   1.0000    2.48973402   0.34026075   3.96948081
Lh14t-calPBE      1.0000    1.28130770   0.38822021   4.92501211
M06               1.0000    0.16366729   0.53456413   6.06192174
M06L              1.0000    0.59493760   0.71422359   6.35314182
mPW1B95           1.0000    0.50093024   0.41585097   4.99154869
mPW1LYP           1.0000    1.15591153   0.25603493   5.32083895
mPW1PW            1.0000    1.80841716   0.42961819   4.68892341
mPW2PLYP          0.7500    0.45788846   0.42997704   5.07650682
mPWB1K            1.0000    0.57338313   0.44687975   5.21266777
mPWLYP            1.0000    1.25842942   0.25773894   5.02319542
mPWPW             1.0000    1.82596836   0.34526745   4.84620734
O3LYP             1.0000    1.75762508   0.10348980   6.16233282
OLYP              1.0000    2.74836820   0.60184498   2.53292167
OPBE              1.0000    3.06917417   0.68267534   2.22849018
PBE0-2            0.5000    0.64299082   0.76542115   5.78578675
PBE0              1.0000    1.20065498   0.40085597   5.02928789
PBE0-DH           0.8750    0.96811578   0.47592488   5.08622873
PBE               1.0000    0.95948085   0.38574991   4.80688534
PW1PW             1.0000    0.96850170   0.42427511   5.02060636
PW6B95            1.0000   -0.31926054   0.04142919   5.84655608
PW86PBE           1.0000    1.21362856   0.40510366   4.66737724
PW91              1.0000    0.77283111   0.39581542   4.93405761
PWP1              1.0000    0.60492565   0.46855837   5.76921413
PWPB95            0.8200   -0.34639127   0.41080636   3.83878274
PWP               1.0000    0.32801227   0.35874687   6.05861168
revPBE0           1.0000    1.57185414   0.38705966   4.11028876
revPBE0-DH        0.8750    1.24456037   0.36730560   4.71126482
revPBE38          1.0000    1.66597472   0.39476833   4.39026628
revPBE            1.0000    1.74676530   0.53634900   3.07261485
revTPSS0          1.0000    1.54664499   0.45890964   4.78426405
revTPSS           1.0000    1.53089454   0.44880597   4.64042317
revTPSSH          1.0000    1.52740307   0.45161957   4.70779483
RPBE              1.0000    1.31183787   0.46169493   3.15711757
RPW86PBE          1.0000    1.12624034   0.38151218   4.75480472
SCAN              1.0000    1.46126056   0.62930855   6.31284039
TPSS0             1.0000    1.62438102   0.40329022   4.80537871
TPSS              1.0000    1.76596355   0.42822303   4.54257102
TPSSH             1.0000    1.85897750   0.44286966   4.60230534
ωB97              1.0000    6.55792598   0.76666802   8.36027334
ωB97X             1.0000   -0.07519516   0.45094893   6.78425255
X3LYP             1.0000    1.54701429   0.20318443   5.61852648
XLYP              1.0000    1.62972054   0.11268673   5.40786417
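To show how the tabulated parameters are used, the following Python sketch evaluates a BJ-damped two-body energy for a single atom pair with the PBE parameters from Table A4. The rational damping form $-\sum_{n=6,8} s_n C_n / (R^n + (a_1 R_0 + a_2)^n)$ with $R_0 = \sqrt{C_8/C_6}$ is assumed here, since it is not restated in this section; the function name and the $C_6$ and $C_8$ coefficients are hypothetical, and all quantities are assumed to be in consistent atomic units.

```python
import math

# s6, s8, a1, a2 for PBE as listed in Table A4 (DFT-D4-ATM).
PBE = {"s6": 1.0, "s8": 0.95948085, "a1": 0.38574991, "a2": 4.80688534}


def bj_pair_energy(r, c6, c8, s6, s8, a1, a2):
    """BJ-damped dipole-dipole plus dipole-quadrupole energy of one atom pair."""
    r0 = math.sqrt(c8 / c6)      # assumed definition of the pair cutoff radius
    cutoff = a1 * r0 + a2
    e6 = s6 * c6 / (r ** 6 + cutoff ** 6)
    e8 = s8 * c8 / (r ** 8 + cutoff ** 8)
    return -(e6 + e8)


# Hypothetical pair coefficients (illustrative values, not D4 output).
c6_pair, c8_pair = 20.0, 500.0
for distance in (0.0, 3.0, 6.0, 12.0):
    print(distance, bj_pair_energy(distance, c6_pair, c8_pair, **PBE))
```

With this form the energy stays finite at vanishing distance and approaches $-s_6 C_6 / R^6$ at large separation, which is the behaviour the damping parameters are fitted to control.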

Table A5: BJ-damping parameters (DFT-D4-MBD) for various DFAs as derived by fitting to reference data (S66x8 [16], S22x5 [17], NCIBLIND10 [18]).

DFA               s6       s8            a1           a2
B1B95             1.0000    1.19549420   0.39241474   4.60397611
B1LYP             1.0000    1.94609514   0.38643351   4.54135968
B1P               1.0000    3.38693011   0.48478615   5.04361224
B1PW              1.0000    2.98402204   0.46862950   4.48637849
B2PLYP            0.6400    1.15117773   0.42666167   4.73635790
B3LYP             1.0000    2.00246246   0.40276191   4.52778320
B3P               1.0000    3.14456298   0.47187947   4.98624258
B3PW              1.0000    2.85656268   0.46491801   4.50601452
B97               1.0000    0.81171211   0.28461283   4.48691468
BHLYP             1.0000    1.68082973   0.26835837   5.48847218
BLYP              1.0000    2.33971306   0.44733688   4.06583931
BPBE              1.0000    3.65322996   0.49933501   4.24294852
BP                1.0000    3.33728176   0.43220330   4.91443061
BPW               1.0000    3.23137432   0.49955226   4.10411084
CAMB3LYP          1.0000    1.74407961   0.40137870   5.18731225
DODBLYP           0.4700    1.17809956   0.40252428   4.25096555
DODPBEB95         0.5400   -0.15702803   0.30629389   3.69170956
DODPBE            0.4800    0.83908332   0.40655901   4.33601239
DODPBEP86         0.4600    0.68309910   0.40600975   4.50011772
DODSVWN           0.4200    1.01890345   0.46167459   5.11121382
DSDBLYP           0.5400    0.65438817   0.46549574   4.73449899
DSDPBEB95         0.5400   -0.24336862   0.32697409   3.69767540
DSDPBE            0.4500    0.66116783   0.43565915   4.41110670
DSDPBEP86         0.4700    0.51157821   0.53889789   5.18645943
DSDSVWN           0.4100    0.90084457   0.51106529   5.22490148
GLYP              1.0000    3.83861584   0.36343954   4.32875183
HF                1.0000    1.46001146   0.43186901   3.34116014
LB94              1.0000    2.36461524   0.41518379   3.19365471
LCBLYP            1.0000    2.40109962   0.47867438   8.01038424
Lh07s-SVWN        1.0000    2.92498406   0.34173988   4.28404951
Lh07t-SVWN        1.0000    1.95389300   0.33511515   4.31853958
Lh12ct-SsifPW92   1.0000    2.41356607   0.31391316   3.88935769
Lh12ct-SsirPW92   1.0000    2.24917162   0.31446575   3.95070925
Lh14t-calPBE      1.0000    1.27677253   0.38128670   4.91698883
M06               1.0000    0.22948274   0.52927285   6.06516782
M06L              1.0000    0.40077779   0.69611405   6.29092087
mPW1B95           1.0000    0.53791835   0.41016913   4.99284176
mPW1LYP           1.0000    1.19986100   0.25502469   5.32301304
mPW1PW            1.0000    1.80656973   0.42456967   4.68132317
mPW2PLYP          0.7500    0.61161179   0.43748316   5.12540364
mPWB1K            1.0000    0.62221146   0.44216745   5.21324659
mPWLYP            1.0000    1.18243337   0.38968985   4.30835285
mPWPW             1.0000    1.79674014   0.33870479   4.83442213
O3LYP             1.0000    1.77793802   0.09961745   6.16089304
OLYP              1.0000    2.58717041   0.59759271   2.48760353
OPBE              1.0000    2.93544102   0.67903933   2.19810071
PBE0-2            0.5000    0.98834859   0.77911062   5.90389569
PBE0              1.0000    1.26829475   0.39907098   5.03951304
PBE0-DH           0.8750    1.19306002   0.46106784   5.25210480
PBE               1.0000    0.99924614   0.38142528   4.81839284
PW1PW             1.0000    1.09759050   0.42759830   5.04559572
PW6B95            1.0000   -0.31629935   0.03999357   5.83690254
PW86PBE           1.0000    1.22842987   0.39998824   4.66739111
PW91              1.0000    0.81406882   0.34094706   5.18568823
PWP1              1.0000    0.95936222   0.48552982   5.84956411
PWPB95            0.8200   -0.46453780   0.29884136   3.87641255
PWP               1.0000    0.66056055   0.37768052   6.14787138
revPBE0           1.0000    1.47198256   0.37471756   4.08904369
revPBE0-DH        0.8750    1.22494188   0.35904781   4.70216012
revPBE38          1.0000    1.60423529   0.38938475   4.35557832
revPBE            1.0000    1.62543693   0.54031831   2.97965648
revTPSS0          1.0000    1.55321888   0.45355319   4.77588598
revTPSS           1.0000    1.51858035   0.44243222   4.62881620
revTPSSH          1.0000    1.52542064   0.44570207   4.69883717
RPBE              1.0000    1.11793696   0.44632488   3.08890917
RPW86PBE          1.0000    1.13795871   0.37636536   4.75236384
SCAN              1.0000    1.75408315   0.63571334   6.35690748
TPSS0             1.0000    1.66752698   0.40074746   4.80927196
TPSS              1.0000    1.91130849   0.43332851   4.56986797
TPSSH             1.0000    1.88783525   0.43968167   4.60342700
ωB97              1.0000    7.11022468   0.76423345   8.44559334
ωB97X             1.0000    0.38815338   0.47448629   6.91367384
X3LYP             1.0000    1.55067492   0.19818545   5.61262748
XLYP              1.0000    1.51577878   0.10026585   5.37506460

Table A6: BJ-damping parameters (DFT-D4(TB)-MBD) for various DFAs as derived by fitting to reference data (S66x8 [16], S22x5 [17], NCIBLIND10 [18]).

DFA              s6       s8            a1           a2
HF               1.0000    1.45828683   0.44712742   3.26487734
BLYP             1.0000    2.08117058   0.41711642   4.03955128
BPBE             1.0000    3.64259175   0.47063878   4.34712279
BP               1.0000    3.11112473   0.40995387   4.91005330
BPW              1.0000    2.52744727   0.40402782   4.22084057
LB94             1.0000    2.09141891   0.30128051   3.45788060
MPWLYP           1.0000    1.36460200   0.28610246   4.91028062
MPWPW            1.0000    1.73130752   0.32547973   4.81372663
OLYP             1.0000    2.30187644   0.54154721   2.53287278
OPBE             1.0000    2.47862243   0.59805792   2.26671322
PBE              1.0000    0.95159605   0.40436318   4.65010856
RPBE             1.0000    1.05401423   0.42599648   3.10146307
REVPBE           1.0000    1.52850098   0.49314034   3.10441225
PW86PBE          1.0000    1.46497296   0.42635774   4.67070001
RPW86PBE         1.0000    1.23782981   0.39785399   4.69412260
PW91             1.0000    0.89648532   0.39592418   4.96320977
PWP              1.0000    0.63806944   0.47390787   5.66515208
XLYP             1.0000    1.50994082   0.09051215   5.36205296
B97              1.0000    0.89799738   0.30819088   4.44324265
TPSS             1.0000    1.88901638   0.42775015   4.55379980
REVTPSS          1.0000    1.50883062   0.43017530   4.65783611
SCAN             1.0000    0.46990209   0.61436450   5.89911495
B1LYP            1.0000    1.83074938   0.38493543   4.45592640
B3LYP            1.0000    1.93642773   0.40445381   4.45704639
BHLYP            1.0000    1.51896770   0.28192218   5.29427469
B1P              1.0000    3.41675121   0.48253511   5.03389354
B3P              1.0000    3.18279035   0.46992325   4.97650253
B3PW             1.0000    2.72363274   0.44377256   4.52215574
O3LYP            1.0000    1.73874942   0.10638982   6.04981736
REVPBE0          1.0000    1.49890714   0.35819541   4.15947955
REVPBE38         1.0000    1.57382508   0.37838702   4.35632432
PBE0             1.0000    1.19661978   0.41734308   4.88432030
PWP1             1.0000    0.64888926   0.55564809   5.32212639
PW1PW            1.0000    1.18364244   0.46953724   4.88272276
MPW1PW           1.0000    1.62788471   0.41557675   4.59973489
MPW1LYP          1.0000    1.33170909   0.29830906   5.17345035
PW6B95           1.0000   -0.16443919   0.07904989   5.94439646
TPSSH            1.0000    2.16468907   0.45254189   4.65553922
TPSS0            1.0000    1.25285163   0.38223499   4.61593529
X3LYP            1.0000    1.50562853   0.21152728   5.47901628
M06L             1.0000    0.01347697   0.70834664   6.03315516
M06              1.0000    0.50785008   0.58953157   5.97317057
M062X            1.0000    0.04672618   0.87098156   7.32988630
WB97             1.0000    1.12736363   0.75396590   7.31052961
WB97X            1.0000    0.35040501   0.56974796   6.44327794
CAMB3LYP         1.0000    1.65213437   0.42676206   4.99450582
LCBLYP           1.0000    1.67459038   0.64772566   7.02691022
LH07TSVWN        1.0000    1.64716468   0.36027550   3.94884094
LH07SSVWN        1.0000    2.54773475   0.37196719   3.89864094
LH12CTSSIRPW92   1.0000    1.90023851   0.33513581   3.53724635
LH12CTSSIFPW92   1.0000    2.04371641   0.33238788   3.47234711
LH14TCALPBE      1.0000    0.96217113   0.40809799   4.53911955
B2PLYP           0.7800    0.97090085   0.41849225   4.59286243
MPW2PLYP         0.7500    0.55506801   0.48148834   4.88179316
PWPB95           0.8200   -0.02640853   0.43744768   4.53884724
DSDBLYP          0.5400    0.62642144   0.45589598   4.73062294
DSDPBE           0.4500    0.69229105   0.41584408   4.52896960
DSDPBEB95        0.5400   -0.02535683   0.43117570   4.31724907
DSDPBEP86        0.4700    0.40437239   0.52692625   5.08678249
DSDSVWN          0.4100    0.73668521   0.50541252   5.06078930
DODBLYP          0.4700    1.04384962   0.37001761   4.22041649
DODPBE           0.4800    0.80824428   0.38386476   4.37573221
DODPBEB95        0.5600    0.02781676   0.38100406   4.18729280
DODPBEP86        0.4600    0.71163846   0.40907164   4.51396886
DODSVWN          0.4200    0.82959503   0.45957776   4.89671368
PBE0-2           0.5000    0.12481539   0.66150525   5.70164463
PBE0-DH          0.8750    0.65674732   0.47131118   4.82816982

Table A7: BJ-damping parameters (DFT-D4(TB)-ATM) for various DFAs as derived by fitting to reference data (S66x8 [16], S22x5 [17], NCIBLIND10 [18]).

| DFA | s6 | s8 | a1 | a2 |
|---|---|---|---|---|
| HF | 1.0000 | 1.55736644 | 0.44217952 | 3.32410441 |
| BLYP | 1.0000 | 2.19020080 | 0.42913071 | 4.05110479 |
| BPBE | 1.0000 | 3.62974920 | 0.47179311 | 4.34832782 |
| BP | 1.0000 | 3.08647246 | 0.41162112 | 4.91954319 |
| BPW | 1.0000 | 2.85109094 | 0.45463214 | 4.14345106 |
| LB94 | 1.0000 | 2.39809364 | 0.36347155 | 3.34195390 |
| MPWLYP | 1.0000 | 1.36395783 | 0.28602441 | 4.91054588 |
| MPWPW | 1.0000 | 1.73185625 | 0.32534435 | 4.81318452 |
| OLYP | 1.0000 | 2.30087126 | 0.54144859 | 2.53332300 |
| OPBE | 1.0000 | 2.70962530 | 0.61800352 | 2.27621123 |
| PBE | 1.0000 | 0.93625094 | 0.40790049 | 4.65135944 |
| RPBE | 1.0000 | 1.05164427 | 0.42592627 | 3.10037133 |
| REVPBE | 1.0000 | 1.61138201 | 0.51215200 | 3.04718355 |
| PW86PBE | 1.0000 | 1.19254614 | 0.39745489 | 4.66150128 |
| RPW86PBE | 1.0000 | 1.22289266 | 0.39946065 | 4.70231415 |
| PW91 | 1.0000 | 0.78221005 | 0.39097390 | 4.94408451 |
| PWP | 1.0000 | 0.61218826 | 0.47778963 | 5.64605664 |
| XLYP | 1.0000 | 1.50968143 | 0.09056306 | 5.36223975 |
| B97 | 1.0000 | 0.89818005 | 0.30891258 | 4.44379503 |
| TPSS | 1.0000 | 2.22336684 | 0.44903931 | 4.65080532 |
| REVTPSS | 1.0000 | 1.49374036 | 0.43199183 | 4.66405759 |
| SCAN | 1.0000 | 0.45775648 | 0.61669342 | 5.90523468 |
| B1LYP | 1.0000 | 1.82819641 | 0.38561062 | 4.46105123 |
| B3LYP | 1.0000 | 1.93077774 | 0.40520781 | 4.46255249 |
| BHLYP | 1.0000 | 1.50655502 | 0.28355060 | 5.30354638 |
| B1P | 1.0000 | 3.39400623 | 0.48389119 | 5.03982146 |
| B3P | 1.0000 | 3.16190735 | 0.47103271 | 4.98137363 |
| B3PW | 1.0000 | 2.71273965 | 0.44631895 | 4.52517962 |
| O3LYP | 1.0000 | 1.72321198 | 0.10802598 | 6.06126661 |
| REVPBE0 | 1.0000 | 1.50046346 | 0.35837424 | 4.15979987 |
| REVPBE38 | 1.0000 | 1.61994900 | 0.38456295 | 4.37487340 |
| PBE0 | 1.0000 | 1.18497326 | 0.41918588 | 4.89170085 |
| PWP1 | 1.0000 | 0.61368682 | 0.55682803 | 5.33266814 |
| PW1PW | 1.0000 | 1.16002822 | 0.47078518 | 4.89243094 |
| MPW1PW | 1.0000 | 1.61991331 | 0.41709790 | 4.60728322 |
| MPW1LYP | 1.0000 | 1.32539433 | 0.30123436 | 5.17574956 |
| PW6B95 | 1.0000 | −0.24364276 | 0.06861369 | 5.89370310 |
| TPSSH | 1.0000 | 1.81699305 | 0.43708555 | 4.57679351 |
| TPSS0 | 1.0000 | 1.46938802 | 0.39751411 | 4.71014742 |
| X3LYP | 1.0000 | 1.49493575 | 0.21310866 | 5.48746009 |
| M06L | 1.0000 | 0.02539965 | 0.71110772 | 6.05063504 |
| M06 | 1.0000 | 0.50295755 | 0.58875642 | 5.96557487 |
| M062X | 1.0000 | −0.12770286 | 0.86289908 | 7.30761622 |
| WB97 | 1.0000 | 1.26204557 | 0.75437695 | 7.31527780 |
| WB97X | 1.0000 | 0.34783580 | 0.57488291 | 6.41921802 |

Table A7 (continued): BJ-damping parameters (DFT-D4(TB)-ATM) for various DFAs as derived by fitting to reference data (S66x8 [16], S22x5 [17], NCIBLIND10 [18]).

| DFA | s6 | s8 | a1 | a2 |
|---|---|---|---|---|
| CAMB3LYP | 1.0000 | 1.63966917 | 0.42427808 | 5.03109815 |
| LCBLYP | 1.0000 | 1.67838379 | 0.64705435 | 7.02883375 |
| LH07TSVWN | 1.0000 | 3.40858218 | 0.53218598 | 3.55068620 |
| LH07SSVWN | 1.0000 | 2.01742030 | 0.49983199 | 3.53449278 |
| LH12CTSSIRPW92 | 1.0000 | 2.46688356 | 0.56783603 | 2.83126177 |
| LH12CTSSIFPW92 | 1.0000 | 2.70376807 | 0.58623258 | 2.72103381 |
| LH14TCALPBE | 1.0000 | 1.23827287 | 0.43537537 | 4.63938635 |
| B2PLYP | 0.7800 | 1.00468553 | 0.42737183 | 4.62624158 |
| MPW2PLYP | 0.7500 | 0.54318070 | 0.48472756 | 4.89674342 |
| PWPB95 | 0.8200 | −0.35342155 | 0.37278086 | 4.03580081 |
| DSDBLYP | 0.5400 | 0.60151254 | 0.46091302 | 4.75348449 |
| DSDPBE | 0.4500 | 0.69865539 | 0.42508371 | 4.56518930 |
| DSDPBEB95 | 0.5400 | −0.05097431 | 0.42967019 | 4.32398958 |
| DSDPBEP86 | 0.4700 | 0.38271706 | 0.53397308 | 5.11687101 |
| DSDSVWN | 0.4100 | 0.50904643 | 0.49413232 | 4.92092377 |
| DODBLYP | 0.4700 | 1.17101452 | 0.39833737 | 4.25809811 |
| DODPBE | 0.4800 | 0.80761267 | 0.38873738 | 4.40171191 |
| DODPBEB95 | 0.5600 | 0.00959016 | 0.38866713 | 4.09462693 |
| DODPBEP86 | 0.4600 | 0.71327951 | 0.41631367 | 4.53851973 |
| DODSVWN | 0.4200 | 0.62186246 | 0.45590032 | 4.74298602 |
| PBE0-2 | 0.5000 | 0.10740034 | 0.66706819 | 5.73936118 |
| PBE0-DH | 0.8750 | 0.74864713 | 0.47598257 | 4.90910090 |
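To illustrate how the four parameters of Table A7 enter the two-body energy, the following minimal sketch evaluates the rational (Becke-Johnson) damping expression for a single atom pair, E = −Σ_{n=6,8} s_n C_n / (R^n + (a1 R0 + a2)^n) with R0 = sqrt(C8/C6). The function name `bj_pair_energy` and the C6/C8 values in the usage lines are hypothetical and chosen only for illustration; in D4 the coefficients come from the model itself, and the ATM three-body term is not included here.

```python
import math

# B3LYP row of Table A7 (DFT-D4(TB)-ATM)
B3LYP = {"s6": 1.0000, "s8": 1.93077774, "a1": 0.40520781, "a2": 4.46255249}


def bj_pair_energy(c6, c8, r, s6, s8, a1, a2):
    """Two-body dispersion energy of one atom pair with rational (BJ) damping.

    All quantities are assumed to be in atomic units. c6 and c8 are plain
    inputs in this sketch; they are not computed from polarizabilities.
    """
    r0 = math.sqrt(c8 / c6)      # pair-specific critical radius
    cutoff = a1 * r0 + a2        # damping radius a1*R0 + a2
    e6 = s6 * c6 / (r**6 + cutoff**6)
    e8 = s8 * c8 / (r**8 + cutoff**8)
    return -(e6 + e8)


# Hypothetical pair coefficients, for illustration only
C6, C8 = 40.0, 1200.0
energies = [bj_pair_energy(C6, C8, r, **B3LYP) for r in (0.0, 5.0, 100.0)]
```

Because the damping radius is added in the denominator, the energy stays finite at R = 0 and approaches the undamped −s6 C6 / R^6 limit at large separation.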

F. Timings of energy and gradient calls

We compare timings for energy and gradient calls between DFT–D4 and DFT–D3(BJ)-ATM for the tetrakis(isonitrile)rhodium(I) dimer with 106 atoms (doubly positively charged) and a diamond chunk with 430 atoms (286 carbon atoms and 144 hydrogen atoms). All timings were obtained on four Intel(R) Core(TM) i7-6700 CPU cores (3.40 GHz).

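| System | Property | CPU time (DFT–D4) / s | CPU time (DFT–D3(BJ)-ATM) / s |
|---|---|---|---|
| Tetrakis(isonitrile)rhodium(I) dimer (106 atoms, charge +2) | single-point | 0.01 | 0.03 |
| Tetrakis(isonitrile)rhodium(I) dimer (106 atoms, charge +2) | gradient | 0.01 | 0.03 |
| Diamond chunk (430 atoms, charge 0) | single-point | 0.34 | 0.41 |
| Diamond chunk (430 atoms, charge 0) | gradient | 0.51 | 1.02 |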

Part II: Statistical measures and evaluations

G. Extended statistical measures

As statistical measures for a set $\{x_1, \dots, x_n\}$ of data points with references $\{r_1, \dots, r_n\}$ we use

• Average: $\bar{x} = \frac{1}{n} \sum_i x_i$
• Mean deviation (MD): $\mathrm{MD} = \frac{1}{n} \sum_i (x_i - r_i)$
• Mean absolute deviation (MAD): $\mathrm{MAD} = \frac{1}{n} \sum_i |x_i - r_i|$
• Root mean square deviation (RMSD): $\mathrm{RMSD} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - r_i)^2}$
• Bessel corrected variance (Var): $\mathrm{Var} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
• Bessel corrected standard deviation (SD): $\mathrm{SD} = \sqrt{\frac{1}{n-1} \sum_i (x_i - r_i - \mathrm{MD})^2}$
• Maximum deviation (Max): $\mathrm{Max} = \max\{x_i - r_i\}$
• Minimum deviation (Min): $\mathrm{Min} = \min\{x_i - r_i\}$
• Maximum absolute deviation (AMax): $\mathrm{AMax} = \max\{|x_i - r_i|\}$

H. Statistical evaluation: S30L
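The measures defined in Section G can be sketched in a few lines of Python. This is an illustrative implementation only, not the script used to generate the tables: the function name `error_statistics` and the toy data are hypothetical, and the variance is taken over the signed deviations $x_i - r_i$ so that SD is its square root, which is an assumption of this sketch.

```python
import math


def error_statistics(values, references):
    """Return deviation-based error measures for paired values and references."""
    if len(values) != len(references) or len(values) < 2:
        raise ValueError("need two equally long sequences with at least 2 entries")
    n = len(values)
    dev = [x - r for x, r in zip(values, references)]  # signed deviations x_i - r_i
    md = sum(dev) / n
    mad = sum(abs(d) for d in dev) / n
    rmsd = math.sqrt(sum(d * d for d in dev) / n)
    # Assumption of this sketch: Bessel-corrected variance of the deviations
    var = sum((d - md) ** 2 for d in dev) / (n - 1)
    return {
        "MD": md,
        "MAD": mad,
        "RMSD": rmsd,
        "SD": math.sqrt(var),
        "Var": var,
        "Max": max(dev),
        "Min": min(dev),
        "AMax": max(abs(d) for d in dev),
    }


# Hypothetical toy data (not taken from any table in this document)
stats = error_statistics([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])
```

By construction AMax equals the larger of |Max| and |Min|, which gives a quick consistency check on any column of the tables that follow.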

Table A8: Extended statistical evaluation of the S30L [19] benchmark set for various DFAs. For each functional we directly compare DFT–D3(BJ)-ATM (abbreviated as D3) corrected values with DFT–D4-ATM (abbreviated as D4) corrected values, given in kcal mol⁻¹. For further details see Refs. [20, 21]. The numbering of the systems follows Ref. [19].

| # | DLPNO-CCSD(T)/CBS* | PW6B95 D4 | PW6B95 D3 | SCAN D4 | SCAN D3 | PBE D4 | PBE D3 |
|---|---|---|---|---|---|---|---|
| 1 | −31.0 | −28.2 | −30.1 | −30.2 | −29.7 | −29.6 | −28.0 |
| 2 | −20.7 | −19.0 | −20.4 | −19.8 | −19.4 | −19.6 | −18.2 |
| 3 | −23.3 | −19.1 | −19.2 | −22.5 | −21.5 | −20.8 | −18.0 |
| 4 | −18.6 | −20.9 | −20.3 | −21.4 | −21.0 | −20.7 | −18.6 |
| 5 | −27.9 | −31.6 | −32.9 | −33.2 | −32.2 | −32.7 | −28.7 |
| 6 | −25.2 | −21.2 | −22.8 | −25.6 | −25.2 | −24.8 | −22.2 |
| 7 | −31.0 | −32.1 | −33.4 | −33.2 | −33.2 | −34.8 | −29.9 |
| 8 | −35.6 | −36.9 | −38.7 | −38.1 | −38.0 | −40.2 | −34.5 |
| 9 | −33.7 | −31.6 | −32.7 | −33.6 | −29.4 | −35.1 | −27.0 |
| 10 | −35.0 | −30.5 | −32.0 | −34.7 | −30.2 | −36.2 | −27.8 |
| 11 | −35.8 | −32.8 | −36.7 | −40.1 | −34.2 | −42.5 | −32.5 |
| 12 | −36.9 | −33.6 | −37.5 | −40.1 | −34.1 | −42.5 | −32.4 |
| 13 | −27.3 | −24.2 | −25.7 | −27.1 | −25.5 | −24.0 | −22.9 |
| 14 | −28.6 | −24.2 | −26.7 | −28.3 | −26.6 | −25.6 | −23.9 |
| 15 | −17.5 | −17.3 | −18.1 | −21.8 | −21.9 | −21.2 | −21.1 |
| 16 | −21.6 | −21.4 | −23.9 | −24.2 | −24.2 | −25.3 | −24.7 |
| 17 | −34.3 | −32.6 | −31.7 | −36.0 | −35.9 | −33.4 | −32.3 |
| 18 | −22.8 | −20.8 | −20.4 | −24.4 | −24.0 | −22.5 | −21.3 |
| 19 | −15.3 | −14.4 | −14.7 | −17.0 | −16.1 | −15.4 | −15.3 |
| 20 | −18.5 | −17.2 | −17.9 | −20.1 | −19.1 | −18.0 | −18.2 |
| 21 | −28.0 | −22.8 | −24.9 | −27.3 | −25.1 | −23.9 | −23.5 |
| 22 | −35.3 | −35.2 | −33.9 | −38.6 | −38.8 | −39.5 | −38.7 |
| 23 | −62.1 | −63.0 | −61.8 | −66.5 | −66.9 | −68.7 | −68.1 |
| 24 | −136.3 | −130.3 | −133.1 | −138.0 | −134.7 | −126.9 | −126.5 |
| 25 | −28.7 | −27.4 | −31.2 | −30.2 | −28.9 | −29.8 | −27.1 |
| 26 | −28.6 | −26.4 | −30.3 | −27.9 | −26.5 | −29.8 | −27.1 |
| 27 | −83.4 | −80.6 | −82.0 | −84.3 | −83.5 | −80.6 | −80.7 |
| 28 | −80.0 | −77.4 | −78.7 | −80.7 | −80.1 | −77.2 | −77.4 |
| 29 | −52.8 | −56.0 | −54.2 | −58.1 | −58.2 | −55.8 | −54.2 |
| 30 | −49.6 | −51.1 | −49.9 | −53.6 | −53.4 | −51.5 | −49.7 |
| MD | | 1.5 | 0.3 | −1.7 | −0.4 | −0.8 | 1.8 |
| MAD | | 2.5 | 1.8 | 2.0 | 2.3 | 2.9 | 3.1 |
| RMSD | | 2.9 | 2.2 | 2.6 | 2.7 | 3.6 | 3.8 |
| SD | | 13.3 | 11.7 | 10.5 | 14.8 | 19.5 | 18.4 |
| Var | | 6.1 | 4.7 | 3.8 | 7.6 | 13.1 | 11.7 |
| Max | | 6.0 | 4.1 | 0.9 | 4.8 | 9.4 | 9.8 |
| Min | | −3.7 | −5.0 | −5.3 | −5.4 | −6.7 | −6.0 |
| AMax | | 6.0 | 5.0 | 5.3 | 5.4 | 9.4 | 9.8 |

Table A9: Extended statistical evaluation of the S30L [19] benchmark set for various DFAs. For each functional we directly compare DFT–D3(BJ)-ATM (abbreviated as D3) corrected values with DFT–D4-MBD (abbreviated as D4) corrected values, given in kcal mol⁻¹. For further details see Refs. [20, 21]. The numbering of the systems follows Ref. [19].

| # | DLPNO-CCSD(T)/CBS* | PW6B95 D4 | PW6B95 D3 | SCAN D4 | SCAN D3 | PBE D4 | PBE D3 |
|---|---|---|---|---|---|---|---|
| 1 | −31.0 | −29.8 | −30.1 | −30.1 | −29.7 | −30.2 | −28.0 |
| 2 | −20.7 | −20.2 | −20.4 | −19.7 | −19.4 | −20.0 | −18.2 |
| 3 | −23.3 | −20.8 | −19.2 | −22.6 | −21.5 | −21.8 | −18.0 |
| 4 | −18.6 | −21.5 | −20.3 | −21.5 | −21.0 | −21.1 | −18.6 |
| 5 | −27.9 | −33.4 | −32.9 | −33.2 | −32.2 | −33.4 | −28.7 |
| 6 | −25.2 | −22.9 | −22.8 | −25.6 | −25.2 | −25.5 | −22.2 |
| 7 | −31.0 | −33.2 | −33.4 | −33.3 | −33.2 | −35.2 | −29.9 |
| 8 | −35.6 | −38.2 | −38.7 | −38.1 | −38.0 | −40.5 | −34.5 |
| 9 | −33.7 | −35.2 | −32.7 | −34.1 | −29.4 | −37.3 | −27.0 |
| 10 | −35.0 | −34.6 | −32.0 | −35.3 | −30.2 | −38.6 | −27.8 |
| 11 | −35.8 | −38.7 | −36.7 | −40.9 | −34.2 | −46.0 | −32.5 |
| 12 | −36.9 | −39.4 | −37.5 | −40.9 | −34.1 | −45.9 | −32.4 |
| 13 | −27.3 | −26.4 | −25.7 | −27.0 | −25.5 | −25.3 | −22.9 |
| 14 | −28.6 | −26.8 | −26.7 | −28.2 | −26.6 | −27.0 | −23.9 |
| 15 | −17.5 | −17.9 | −18.1 | −21.9 | −21.9 | −21.6 | −21.1 |
| 16 | −21.6 | −22.4 | −23.9 | −24.3 | −24.2 | −25.9 | −24.7 |
| 17 | −34.3 | −33.3 | −31.7 | −36.0 | −35.9 | −33.9 | −32.3 |
| 18 | −22.8 | −21.6 | −20.4 | −24.4 | −24.0 | −23.1 | −21.3 |
| 19 | −15.3 | −15.6 | −14.7 | −17.1 | −16.1 | −16.3 | −15.3 |
| 20 | −18.5 | −19.0 | −17.9 | −20.2 | −19.1 | −19.3 | −18.2 |
| 21 | −28.0 | −25.9 | −24.9 | −27.3 | −25.1 | −25.9 | −23.5 |
| 22 | −35.3 | −35.5 | −33.9 | −38.7 | −38.8 | −39.8 | −38.7 |
| 23 | −62.1 | −63.0 | −61.8 | −66.6 | −66.9 | −68.8 | −68.1 |
| 24 | −136.3 | −135.9 | −133.1 | −138.5 | −134.7 | −130.8 | −126.5 |
| 25 | −28.7 | −30.4 | −31.2 | −30.2 | −28.9 | −31.0 | −27.1 |
| 26 | −28.6 | −29.5 | −30.3 | −27.8 | −26.5 | −31.0 | −27.1 |
| 27 | −83.4 | −82.8 | −82.0 | −84.3 | −83.5 | −81.9 | −80.7 |
| 28 | −80.0 | −79.1 | −78.7 | −80.6 | −80.1 | −78.2 | −77.4 |
| 29 | −52.8 | −55.9 | −54.2 | −58.1 | −58.2 | −55.8 | −54.2 |
| 30 | −49.6 | −51.3 | −49.9 | −53.6 | −53.4 | −51.6 | −49.7 |
| MD | | −0.5 | 0.3 | −1.8 | −0.4 | −1.9 | 1.8 |
| MAD | | 1.5 | 1.8 | 2.1 | 2.3 | 3.1 | 3.1 |
| RMSD | | 1.9 | 2.2 | 2.7 | 2.7 | 3.9 | 3.8 |
| SD | | 10.2 | 11.7 | 10.8 | 14.8 | 18.8 | 18.4 |
| Var | | 3.6 | 4.7 | 4.0 | 7.6 | 12.2 | 11.7 |
| Max | | 2.5 | 4.1 | 1.0 | 4.8 | 5.5 | 9.8 |
| Min | | −5.5 | −5.0 | −5.3 | −5.4 | −10.2 | −6.0 |
| AMax | | 5.5 | 5.0 | 5.3 | 5.4 | 10.2 | 9.8 |

Table A10: Extended statistical evaluation of the S30L [19] benchmark set for various DFAs. For each functional we directly compare DFT–D3(BJ)-ATM (abbreviated as D3) corrected values with DFT–D4(TB)-ATM (abbreviated as D4) corrected values, given in kcal mol⁻¹. For further details see Refs. [20, 21]. The numbering of the systems follows Ref. [19].

| # | DLPNO-CCSD(T)/CBS* | PW6B95 D4 | PW6B95 D3 | SCAN D4 | SCAN D3 | PBE D4 | PBE D3 |
|---|---|---|---|---|---|---|---|
| 1 | −31.0 | −29.0 | −30.1 | −29.5 | −29.7 | −29.6 | −28.0 |
| 2 | −20.7 | −19.5 | −20.4 | −19.3 | −19.4 | −19.4 | −18.2 |
| 3 | −23.3 | −19.5 | −19.2 | −21.9 | −21.5 | −20.6 | −18.0 |
| 4 | −18.6 | −20.9 | −20.3 | −21.1 | −21.0 | −20.5 | −18.6 |
| 5 | −27.9 | −32.1 | −32.9 | −32.4 | −32.2 | −31.9 | −28.7 |
| 6 | −25.2 | −21.5 | −22.8 | −24.9 | −25.2 | −24.0 | −22.2 |
| 7 | −31.0 | −32.5 | −33.4 | −32.3 | −33.2 | −33.8 | −29.9 |
| 8 | −35.6 | −37.4 | −38.7 | −37.0 | −38.0 | −38.9 | −34.5 |
| 9 | −33.7 | −31.7 | −32.7 | −31.3 | −29.4 | −33.2 | −27.0 |
| 10 | −35.0 | −30.8 | −32.0 | −32.3 | −30.2 | −34.2 | −27.8 |
| 11 | −35.8 | −33.3 | −36.7 | −37.0 | −34.2 | −40.1 | −32.5 |
| 12 | −36.9 | −34.2 | −37.5 | −36.9 | −34.1 | −40.1 | −32.4 |
| 13 | −27.3 | −25.5 | −25.7 | −26.7 | −25.5 | −24.8 | −22.9 |
| 14 | −28.6 | −25.4 | −26.7 | −27.6 | −26.6 | −26.0 | −23.9 |
| 15 | −17.5 | −17.6 | −18.1 | −21.5 | −21.9 | −21.1 | −21.1 |
| 16 | −21.6 | −22.0 | −23.9 | −23.8 | −24.2 | −25.0 | −24.7 |
| 17 | −34.3 | −32.3 | −31.7 | −35.5 | −35.9 | −32.9 | −32.3 |
| 18 | −22.8 | −20.6 | −20.4 | −23.8 | −24.0 | −22.1 | −21.3 |
| 19 | −15.3 | −14.8 | −14.7 | −16.6 | −16.1 | −15.7 | −15.3 |
| 20 | −18.5 | −18.0 | −17.9 | −19.6 | −19.1 | −18.7 | −18.2 |
| 21 | −28.0 | −25.4 | −24.9 | −27.1 | −25.1 | −26.6 | −23.5 |
| 22 | −35.3 | −35.0 | −33.9 | −38.3 | −38.8 | −39.4 | −38.7 |
| 23 | −62.1 | −62.5 | −61.8 | −66.3 | −66.9 | −68.4 | −68.1 |
| 24 | −136.3 | −132.4 | −133.1 | −136.7 | −134.7 | −128.6 | −126.5 |
| 25 | −28.7 | −28.3 | −31.2 | −29.2 | −28.9 | −29.1 | −27.1 |
| 26 | −28.6 | −27.3 | −30.3 | −26.8 | −26.5 | −29.0 | −27.1 |
| 27 | −83.4 | −81.5 | −82.0 | −84.1 | −83.5 | −81.2 | −80.7 |
| 28 | −80.0 | −78.0 | −78.7 | −80.4 | −80.1 | −77.5 | −77.4 |
| 29 | −52.8 | −56.2 | −54.2 | −58.3 | −58.2 | −56.3 | −54.2 |
| 30 | −49.6 | −51.4 | −49.9 | −53.5 | −53.4 | −52.0 | −49.7 |
| MD | | 1.0 | 0.3 | −0.9 | −0.4 | −0.5 | 1.8 |
| MAD | | 2.0 | 1.8 | 1.8 | 2.3 | 2.4 | 3.1 |
| RMSD | | 2.4 | 2.2 | 2.3 | 2.7 | 3.0 | 3.8 |
| SD | | 11.8 | 11.7 | 11.5 | 14.8 | 16.1 | 18.4 |
| Var | | 4.8 | 4.7 | 4.6 | 7.6 | 9.0 | 11.7 |
| Max | | 4.2 | 4.1 | 2.7 | 4.8 | 7.7 | 9.8 |
| Min | | −4.2 | −5.0 | −5.5 | −5.4 | −6.3 | −6.0 |
| AMax | | 4.2 | 5.0 | 5.5 | 5.4 | 7.7 | 9.8 |

I. Statistical evaluation: L7

Table A11: Extended statistical evaluations of different DFAs with respect to DLPNO-CCSD(T)/CBS* data. For each functional we directly compare DFT–D3(BJ)-ATM (abbreviated as D3) corrected values with DFT–D4-ATM (abbreviated as D4) corrected values, given in kcal mol⁻¹. The numbering of the systems follows Ref. [22].

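| # | DLPNO-CCSD(T)/CBS* | PW6B95 D4 | PW6B95 D3 | PBE0 D4 | PBE0 D3 | TPSS D4 | TPSS D3 |
|---|---|---|---|---|---|---|---|
| CBH | −11.6 | −8.2 | −9.0 | −10.8 | −11.7 | −10.8 | −11.6 |
| C2C2PD | −21.3 | −18.8 | −20.4 | −20.1 | −18.3 | −23.9 | −20.1 |
| C3A | −17.0 | −14.7 | −14.9 | −16.0 | −14.4 | −18.4 | −15.3 |
| C3GC | −29.1 | −26.2 | −26.6 | −27.0 | −24.2 | −31.1 | −25.9 |
| GCGC | −12.8 | −13.5 | −12.4 | −14.1 | −12.1 | −15.8 | −12.5 |
| GGG | −1.9 | −1.8 | −1.4 | −2.1 | −1.3 | −3.0 | −1.6 |
| PHE | −23.0 | −23.7 | −23.7 | −25.4 | −25.3 | −24.2 | −23.9 |
| MD | | 1.4 | 1.2 | 0.2 | 1.3 | −1.5 | 0.8 |
| MAD | | 1.8 | 1.4 | 1.3 | 2.0 | 1.7 | 1.1 |
| RMSD | | 2.2 | 1.7 | 1.5 | 2.6 | 1.9 | 1.5 |
| SD | | 4.3 | 3.1 | 3.9 | 5.8 | 3.1 | 3.3 |
| Var | | 3.1 | 1.6 | 2.5 | 5.7 | 1.6 | 1.8 |
| Max | | 3.4 | 2.6 | 2.1 | 4.9 | 0.8 | 3.2 |
| Min | | −0.7 | −0.7 | −2.4 | −2.3 | −3.0 | −0.9 |
| AMax | | 3.4 | 2.6 | 2.4 | 4.9 | 3.0 | 3.2 |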

Table A12: Extended statistical evaluations of different DFAs with respect to DLPNO-CCSD(T)/CBS* data. For each functional we directly compare DFT–D3(BJ)-ATM (abbreviated as D3) corrected values with DFT–D4-MBD (abbreviated as D4) corrected values, given in kcal mol⁻¹. The numbering of the systems follows Ref. [22].

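| # | DLPNO-CCSD(T)/CBS* | PW6B95 D4 | PW6B95 D3 | PBE0 D4 | PBE0 D3 | TPSS D4 | TPSS D3 |
|---|---|---|---|---|---|---|---|
| CBH | −11.6 | −8.6 | −9.0 | −11.0 | −11.7 | −10.8 | −11.6 |
| C2C2PD | −21.3 | −20.5 | −20.4 | −20.3 | −18.3 | −23.6 | −20.1 |
| C3A | −17.0 | −15.6 | −14.9 | −16.1 | −14.4 | −18.1 | −15.3 |
| C3GC | −29.1 | −27.8 | −26.6 | −27.2 | −24.2 | −30.7 | −25.9 |
| GCGC | −12.8 | −13.9 | −12.4 | −14.1 | −12.1 | −15.6 | −12.5 |
| GGG | −1.9 | −2.0 | −1.4 | −2.1 | −1.3 | −2.8 | −1.6 |
| PHE | −23.0 | −24.0 | −23.7 | −25.6 | −25.3 | −24.3 | −23.9 |
| MD | | 0.6 | 1.2 | 0.1 | 1.3 | −1.3 | 0.8 |
| MAD | | 1.2 | 1.4 | 1.2 | 2.0 | 1.5 | 1.1 |
| RMSD | | 1.5 | 1.7 | 1.4 | 2.6 | 1.7 | 1.5 |
| SD | | 3.6 | 3.1 | 3.7 | 5.8 | 2.8 | 3.3 |
| Var | | 2.1 | 1.6 | 2.3 | 5.7 | 1.3 | 1.8 |
| Max | | 3.0 | 2.6 | 1.9 | 4.9 | 0.8 | 3.2 |
| Min | | −1.1 | −0.7 | −2.6 | −2.3 | −2.8 | −0.9 |
| AMax | | 3.0 | 2.6 | 2.6 | 4.9 | 2.8 | 3.2 |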

Table A13: Extended statistical evaluations of different DFAs with respect to DLPNO-CCSD(T)/CBS* data. For each functional we directly compare DFT–D3(BJ)-ATM (abbreviated as D3) corrected values with DFT–D4(TB)-ATM (abbreviated as D4) corrected values, given in kcal mol⁻¹. The numbering of the systems follows Ref. [22].

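| # | DLPNO-CCSD(T)/CBS* | PW6B95 D4 | PW6B95 D3 | PBE0 D4 | PBE0 D3 | TPSS D4 | TPSS D3 |
|---|---|---|---|---|---|---|---|
| CBH | −11.6 | −9.0 | −9.0 | −11.8 | −11.7 | −12.1 | −11.6 |
| C2C2PD | −21.3 | −18.7 | −20.4 | −20.0 | −18.3 | −23.7 | −20.1 |
| C3A | −17.0 | −14.6 | −14.9 | −16.0 | −14.4 | −18.3 | −15.3 |
| C3GC | −29.1 | −26.2 | −26.6 | −26.9 | −24.2 | −31.0 | −25.9 |
| GCGC | −12.8 | −13.4 | −12.4 | −14.1 | −12.1 | −15.7 | −12.5 |
| GGG | −1.9 | −1.8 | −1.4 | −2.2 | −1.3 | −3.0 | −1.6 |
| PHE | −23.0 | −23.8 | −23.7 | −25.6 | −25.3 | −24.4 | −23.9 |
| MD | | 1.3 | 1.2 | 0.0 | 1.3 | −1.7 | 0.8 |
| MAD | | 1.7 | 1.4 | 1.2 | 2.0 | 1.7 | 1.1 |
| RMSD | | 2.0 | 1.7 | 1.5 | 2.6 | 1.8 | 1.5 |
| SD | | 4.1 | 3.1 | 4.0 | 5.8 | 2.0 | 3.3 |
| Var | | 2.8 | 1.6 | 2.6 | 5.7 | 0.7 | 1.8 |
| Max | | 2.9 | 2.6 | 2.2 | 4.9 | −0.5 | 3.2 |
| Min | | −0.8 | −0.7 | −2.6 | −2.3 | −2.9 | −0.9 |
| AMax | | 2.9 | 2.6 | 2.6 | 4.9 | 2.9 | 3.2 |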

J. Statistical evaluation: MOR41

The numbering of the systems follows Ref. [23].

Table A14: Extended statistical evaluation of DOD-PBE and DSD-PBE in kcal mol⁻¹. We abbreviate D3(BJ)-ATM by D3.

| # | Ref. | DOD-PBE D4-ATM | DOD-PBE D4-MBD | DOD-PBE D3 | DSD-PBE D4-ATM | DSD-PBE D4-MBD | DSD-PBE D3 |
|---|---|---|---|---|---|---|---|
| 1 | −43.1 | −46.2 | −46.3 | −46.4 | −48.4 | −48.5 | −49.7 |
| 2 | −46.6 | −51.6 | −51.8 | −51.9 | −54.2 | −54.3 | −55.4 |
| 3 | −27.6 | −35.8 | −36.0 | −36.0 | −39.4 | −39.4 | −40.3 |
| 4 | −62.5 | −60.3 | −60.5 | −60.6 | −58.5 | −58.6 | −59.4 |
| 5 | 3.7 | 2.9 | 2.7 | 2.6 | 3.7 | 3.6 | 3.0 |
| 6 | −23.1 | −21.7 | −21.7 | −21.8 | −22.9 | −22.9 | −21.9 |
| 7 | −16.2 | −14.3 | −14.2 | −14.5 | −14.8 | −14.8 | −13.3 |
| 8 | −17.2 | −13.5 | −13.7 | −14.0 | −13.2 | −13.3 | −13.3 |
| 9 | −18.8 | −14.5 | −14.6 | −15.0 | −14.2 | −14.3 | −13.2 |
| 10 | −22.6 | −21.9 | −22.1 | −22.1 | −23.3 | −23.4 | −24.6 |
| 11 | 27.0 | 24.7 | 24.7 | 24.5 | 23.0 | 23.0 | 21.4 |
| 12 | −29.8 | −33.4 | −33.5 | −33.6 | −35.5 | −35.6 | −35.2 |
| 13 | −43.2 | −45.8 | −46.0 | −46.0 | −47.7 | −47.8 | −49.2 |
| 14 | −52.0 | −53.7 | −54.0 | −54.5 | −55.1 | −55.4 | −56.7 |
| 15 | −4.1 | 7.5 | 7.3 | 7.3 | 11.8 | 11.6 | 12.1 |
| 16 | −39.8 | −40.3 | −40.8 | −41.1 | −40.3 | −40.8 | −41.7 |
| 17 | −16.1 | −13.9 | −14.0 | −14.0 | −14.0 | −14.1 | −11.8 |
| 18 | −34.2 | −34.0 | −34.4 | −34.8 | −33.7 | −33.9 | −36.2 |
| 19 | −40.1 | −39.9 | −40.3 | −40.7 | −39.4 | −39.6 | −41.9 |
| 20 | −30.2 | −29.5 | −29.8 | −30.1 | −28.8 | −28.9 | −30.9 |
| 21 | −15.1 | −17.2 | −17.7 | −18.1 | −15.8 | −16.1 | −17.3 |
| 22 | −35.9 | −39.0 | −39.4 | −39.9 | −40.3 | −40.6 | −42.6 |
| 23 | −55.0 | −55.1 | −55.6 | −56.1 | −54.8 | −55.1 | −57.9 |
| 24 | −41.6 | −40.2 | −41.1 | −41.4 | −39.9 | −40.7 | −42.7 |
| 25 | −45.9 | −45.2 | −46.3 | −46.4 | −45.4 | −46.3 | −48.1 |
| 26 | −36.4 | −34.9 | −35.2 | −35.3 | −33.3 | −33.5 | −34.7 |
| 27 | −21.8 | −21.1 | −21.2 | −21.3 | −19.8 | −19.9 | −21.0 |
| 28 | −36.3 | −35.9 | −36.1 | −36.2 | −34.6 | −34.7 | −36.0 |
| 29 | −28.3 | −28.7 | −28.9 | −28.8 | −27.9 | −28.0 | −29.2 |
| 30 | −14.9 | −16.0 | −16.2 | −16.1 | −14.9 | −15.1 | −15.8 |
| 31 | −29.9 | −29.4 | −29.8 | −30.1 | −28.6 | −28.9 | −30.1 |
| 32 | −1.9 | −2.0 | −2.0 | −1.9 | −2.3 | −2.2 | −2.1 |
| 33 | −10.7 | −6.7 | −6.7 | −6.7 | −4.2 | −4.2 | −2.8 |
| 34 | −25.6 | −22.9 | −23.0 | −23.1 | −21.0 | −21.1 | −21.3 |
| 35 | −30.9 | −28.3 | −28.4 | −28.7 | −26.6 | −26.7 | −27.4 |
| 36 | −39.8 | −40.2 | −40.4 | −40.7 | −40.0 | −40.0 | −41.1 |
| 37 | −14.0 | −16.9 | −17.4 | −17.3 | −16.0 | −16.4 | −16.6 |
| 38 | −64.4 | −68.1 | −68.6 | −69.0 | −67.4 | −67.9 | −74.0 |
| 39 | −63.9 | −63.3 | −63.9 | −64.5 | −63.7 | −64.0 | −66.3 |
| 40 | −65.8 | −65.2 | −65.4 | −65.5 | −64.9 | −65.0 | −67.7 |
| 41 | −3.2 | −2.4 | −2.3 | −2.4 | −2.1 | −2.1 | −0.7 |
| MD | | 0.1 | −0.2 | −0.3 | 0.2 | 0.0 | −0.9 |
| MAD | | 2.1 | 2.1 | 2.2 | 2.9 | 2.9 | 3.7 |
| RMSD | | 3.1 | 3.1 | 3.1 | 4.3 | 4.3 | 5.0 |
| SD | | 19.7 | 19.9 | 20.0 | 27.5 | 27.6 | 31.8 |
| Var | | 9.7 | 9.9 | 10.0 | 18.9 | 19.0 | 25.3 |
| Max | | 11.5 | 11.4 | 11.4 | 15.8 | 15.7 | 16.2 |
| Min | | −8.2 | −8.4 | −8.4 | −11.8 | −11.9 | −12.7 |
| AMax | | 11.5 | 11.4 | 11.4 | 15.8 | 15.7 | 16.2 |

Table A15: Extended statistical evaluation of B3LYP and PBE0 in kcal mol⁻¹. We abbreviate D3(BJ)-ATM by D3.

| # | Ref. | B3LYP D4-ATM | B3LYP D4-MBD | B3LYP D3 | PBE0 D4-ATM | PBE0 D4-MBD | PBE0 D3 |
|---|---|---|---|---|---|---|---|
| 1 | −43.1 | −40.5 | −40.6 | −40.7 | −44.6 | −44.6 | −44.7 |
| 2 | −46.6 | −41.9 | −42.0 | −42.0 | −47.4 | −47.4 | −47.4 |
| 3 | −27.6 | −22.8 | −22.9 | −22.8 | −26.4 | −26.5 | −26.4 |
| 4 | −62.5 | −66.2 | −66.4 | −66.5 | −71.3 | −71.4 | −71.4 |
| 5 | 3.7 | −0.1 | −0.1 | −0.2 | −2.6 | −2.7 | −2.4 |
| 6 | −23.1 | −17.4 | −17.4 | −17.5 | −21.0 | −21.0 | −20.3 |
| 7 | −16.2 | −12.1 | −12.0 | −12.8 | −15.7 | −15.6 | −15.3 |
| 8 | −17.2 | −12.8 | −12.9 | −13.3 | −16.0 | −16.0 | −15.3 |
| 9 | −18.8 | −14.3 | −14.3 | −15.1 | −16.1 | −16.1 | −15.0 |
| 10 | −22.6 | −16.4 | −16.6 | −16.4 | −19.4 | −19.4 | −19.1 |
| 11 | 27.0 | 30.0 | 30.1 | 29.8 | 30.9 | 30.9 | 30.2 |
| 12 | −29.8 | −24.6 | −24.6 | −24.7 | −29.9 | −29.9 | −28.9 |
| 13 | −43.2 | −38.0 | −38.1 | −37.9 | −43.9 | −43.9 | −43.6 |
| 14 | −52.0 | −43.8 | −44.0 | −44.7 | −50.9 | −51.1 | −50.5 |
| 15 | −4.1 | −5.0 | −5.1 | −5.1 | −4.8 | −4.9 | −4.4 |
| 16 | −39.8 | −39.3 | −39.8 | −39.9 | −40.4 | −40.7 | −40.0 |
| 17 | −16.1 | −14.5 | −14.6 | −14.5 | −13.7 | −13.7 | −11.2 |
| 18 | −34.2 | −31.8 | −32.0 | −31.9 | −33.3 | −33.4 | −32.8 |
| 19 | −40.1 | −37.2 | −37.4 | −37.3 | −38.7 | −38.7 | −38.0 |
| 20 | −30.2 | −29.2 | −29.4 | −29.3 | −28.6 | −28.6 | −27.8 |
| 21 | −15.1 | −18.5 | −18.8 | −19.0 | −18.7 | −18.8 | −17.1 |
| 22 | −35.9 | −32.3 | −32.6 | −32.6 | −33.5 | −33.7 | −32.1 |
| 23 | −55.0 | −51.2 | −51.4 | −51.3 | −52.6 | −52.8 | −52.3 |
| 24 | −41.6 | −40.8 | −41.7 | −41.5 | −41.3 | −42.0 | −41.7 |
| 25 | −45.9 | −45.4 | −46.4 | −46.0 | −45.6 | −46.4 | −45.6 |
| 26 | −36.4 | −42.4 | −42.6 | −42.5 | −38.5 | −38.7 | −39.0 |
| 27 | −21.8 | −26.8 | −27.0 | −27.1 | −25.1 | −25.3 | −25.5 |
| 28 | −36.3 | −41.7 | −41.9 | −41.8 | −39.2 | −39.4 | −39.5 |
| 29 | −28.3 | −32.8 | −33.0 | −32.6 | −30.0 | −30.1 | −30.0 |
| 30 | −14.9 | −21.8 | −22.1 | −21.6 | −16.3 | −16.6 | −16.1 |
| 31 | −29.9 | −32.8 | −33.2 | −33.5 | −29.4 | −29.8 | −29.7 |
| 32 | −1.9 | 0.1 | 0.1 | 0.5 | 0.2 | 0.2 | 0.7 |
| 33 | −10.7 | −18.9 | −19.0 | −19.1 | −9.4 | −9.5 | −8.8 |
| 34 | −25.6 | −30.3 | −30.5 | −30.6 | −25.1 | −25.2 | −25.0 |
| 35 | −30.9 | −34.6 | −34.7 | −34.9 | −29.8 | −29.9 | −29.9 |
| 36 | −39.8 | −39.4 | −39.5 | −39.7 | −34.9 | −34.9 | −35.3 |
| 37 | −14.0 | −33.0 | −33.4 | −33.4 | −23.3 | −23.5 | −23.4 |
| 38 | −64.4 | −72.2 | −72.6 | −72.6 | −69.1 | −69.4 | −71.8 |
| 39 | −63.9 | −58.2 | −58.5 | −59.0 | −60.7 | −60.9 | −59.5 |
| 40 | −65.8 | −66.4 | −66.5 | −66.3 | −69.0 | −69.1 | −69.2 |
| 41 | −3.2 | −6.3 | −6.3 | −6.6 | −3.0 | −3.0 | −3.4 |
| MD | | −0.1 | −0.3 | −0.4 | −0.3 | −0.4 | 0.0 |
| MAD | | 4.2 | 4.2 | 4.2 | 2.3 | 2.3 | 2.6 |
| RMSD | | 5.3 | 5.3 | 5.2 | 3.1 | 3.1 | 3.4 |
| SD | | 33.7 | 33.9 | 33.5 | 19.7 | 19.8 | 21.8 |
| Var | | 28.3 | 28.8 | 28.0 | 9.7 | 9.8 | 11.9 |
| Max | | 8.2 | 7.9 | 7.3 | 5.0 | 4.9 | 4.9 |
| Min | | −19.0 | −19.4 | −19.3 | −9.2 | −9.5 | −9.4 |
| AMax | | 19.0 | 19.4 | 19.3 | 9.2 | 9.5 | 9.4 |

Table A16: Extended statistical evaluation of PW6B95 and CAM-B3LYP in kcal mol⁻¹. We abbreviate D3(BJ)-ATM by D3.

| # | Ref. | PW6B95 D4-ATM | PW6B95 D4-MBD | PW6B95 D3 | CAM-B3LYP D4-ATM | CAM-B3LYP D4-MBD | CAM-B3LYP D3 |
|---|---|---|---|---|---|---|---|
| 1 | −43.1 | −41.8 | −41.8 | −41.8 | −40.1 | −40.1 | −40.1 |
| 2 | −46.6 | −42.9 | −43.0 | −42.9 | −42.7 | −42.8 | −42.7 |
| 3 | −27.6 | −22.6 | −22.6 | −22.5 | −23.2 | −23.2 | −23.1 |
| 4 | −62.5 | −63.0 | −63.0 | −63.0 | −64.4 | −64.4 | −64.2 |
| 5 | 3.7 | 1.0 | 1.1 | 1.1 | 2.7 | 2.7 | 3.0 |
| 6 | −23.1 | −19.8 | −19.7 | −19.7 | −19.5 | −19.4 | −18.6 |
| 7 | −16.2 | −16.5 | −16.0 | −16.0 | −14.1 | −14.1 | −13.4 |
| 8 | −17.2 | −12.2 | −12.4 | −12.6 | −15.8 | −15.8 | −14.8 |
| 9 | −18.8 | −12.5 | −12.3 | −12.2 | −17.8 | −17.8 | −16.5 |
| 10 | −22.6 | −15.5 | −15.6 | −15.6 | −15.8 | −15.9 | −15.3 |
| 11 | 27.0 | 32.8 | 33.0 | 32.7 | 33.6 | 33.7 | 33.1 |
| 12 | −29.8 | −28.4 | −28.2 | −28.2 | −24.3 | −24.3 | −23.0 |
| 13 | −43.2 | −40.9 | −40.7 | −40.7 | −38.4 | −38.4 | −37.9 |
| 14 | −52.0 | −47.9 | −48.1 | −48.6 | −44.1 | −44.3 | −43.5 |
| 15 | −4.1 | −3.9 | −4.0 | −3.8 | −9.6 | −9.7 | −9.1 |
| 16 | −39.8 | −40.0 | −40.5 | −40.8 | −39.0 | −39.3 | −38.2 |
| 17 | −16.1 | −15.2 | −15.2 | −15.0 | −16.2 | −16.3 | −13.1 |
| 18 | −34.2 | −29.4 | −29.3 | −31.0 | −30.2 | −30.3 | −29.8 |
| 19 | −40.1 | −34.5 | −34.5 | −36.2 | −36.4 | −36.5 | −35.9 |
| 20 | −30.2 | −27.1 | −26.9 | −27.8 | −28.2 | −28.3 | −27.3 |
| 21 | −15.1 | −16.0 | −16.1 | −17.5 | −14.4 | −14.6 | −12.8 |
| 22 | −35.9 | −26.1 | −26.6 | −29.1 | −27.4 | −27.6 | −26.5 |
| 23 | −55.0 | −46.8 | −46.7 | −49.2 | −49.0 | −49.1 | −49.2 |
| 24 | −41.6 | −39.7 | −41.2 | −42.0 | −40.7 | −41.3 | −41.0 |
| 25 | −45.9 | −43.0 | −44.7 | −45.4 | −45.0 | −45.6 | −44.7 |
| 26 | −36.4 | −33.5 | −34.2 | −34.8 | −42.1 | −42.3 | −42.9 |
| 27 | −21.8 | −21.4 | −22.1 | −22.7 | −27.4 | −27.5 | −28.2 |
| 28 | −36.3 | −34.3 | −34.9 | −35.6 | −41.8 | −41.9 | −42.6 |
| 29 | −28.3 | −26.1 | −26.7 | −27.0 | −32.1 | −32.2 | −32.4 |
| 30 | −14.9 | −9.9 | −10.6 | −11.0 | −17.4 | −17.7 | −17.7 |
| 31 | −29.9 | −25.4 | −26.4 | −26.8 | −33.3 | −33.6 | −34.1 |
| 32 | −1.9 | −1.2 | −1.1 | −1.2 | 0.5 | 0.5 | 0.8 |
| 33 | −10.7 | −10.5 | −10.7 | −10.7 | −18.5 | −18.6 | −18.1 |
| 34 | −25.6 | −24.1 | −24.4 | −24.9 | −28.6 | −28.7 | −28.8 |
| 35 | −30.9 | −28.7 | −29.0 | −30.0 | −32.7 | −32.9 | −33.3 |
| 36 | −39.8 | −33.5 | −33.5 | −35.2 | −36.5 | −36.6 | −37.7 |
| 37 | −14.0 | −19.6 | −20.2 | −19.6 | −26.7 | −26.9 | −26.9 |
| 38 | −64.4 | −59.6 | −60.5 | −61.7 | −67.0 | −67.3 | −70.0 |
| 39 | −63.9 | −61.0 | −61.2 | −62.4 | −60.8 | −60.9 | −59.8 |
| 40 | −65.8 | −66.2 | −66.2 | −66.8 | −65.9 | −66.0 | −66.2 |
| 41 | −3.2 | −1.0 | −1.4 | −1.4 | −5.9 | −6.0 | −6.6 |
| MD | | 2.7 | 2.4 | 1.9 | 0.5 | 0.4 | 0.7 |
| MAD | | 3.2 | 3.0 | 2.6 | 3.7 | 3.7 | 4.3 |
| RMSD | | 4.0 | 3.8 | 3.3 | 4.6 | 4.6 | 5.0 |
| SD | | 18.9 | 19.1 | 17.2 | 28.9 | 29.1 | 31.4 |
| Var | | 8.9 | 9.1 | 7.4 | 21.0 | 21.2 | 24.7 |
| Max | | 9.8 | 9.2 | 7.1 | 8.5 | 8.3 | 9.3 |
| Min | | −5.6 | −6.2 | −5.6 | −12.7 | −12.8 | −12.8 |
| AMax | | 9.8 | 9.2 | 7.1 | 12.7 | 12.8 | 12.8 |

Table A17: Extended statistical evaluation of revPBE and M06L in kcal mol⁻¹. We abbreviate D3(BJ)-ATM by D3.

| # | Ref. | revPBE D4-ATM | revPBE D4-MBD | revPBE D3 | M06L D4-ATM | M06L D4-MBD | M06L D3 |
|---|---|---|---|---|---|---|---|
| 1 | −43.1 | −44.5 | −44.6 | −45.0 | −41.0 | −41.0 | −40.7 |
| 2 | −46.6 | −47.6 | −47.6 | −48.0 | −45.1 | −45.1 | −44.9 |
| 3 | −27.6 | −29.0 | −29.1 | −29.2 | −27.1 | −27.1 | −26.9 |
| 4 | −62.5 | −73.6 | −73.8 | −74.5 | −64.9 | −64.9 | −64.7 |
| 5 | 3.7 | −6.8 | −6.7 | −7.1 | −2.7 | −2.7 | −2.5 |
| 6 | −23.1 | −17.9 | −17.9 | −18.1 | −18.1 | −18.1 | −18.1 |
| 7 | −16.2 | −12.9 | −12.9 | −13.3 | −16.8 | −16.8 | −16.8 |
| 8 | −17.2 | −14.2 | −14.3 | −15.3 | −7.7 | −7.7 | −7.5 |
| 9 | −18.8 | −14.0 | −13.8 | −15.0 | −7.2 | −7.2 | −8.2 |
| 10 | −22.6 | −24.6 | −24.7 | −25.0 | −13.2 | −13.2 | −12.7 |
| 11 | 27.0 | 24.2 | 24.5 | 23.9 | 30.7 | 30.7 | 31.3 |
| 12 | −29.8 | −30.6 | −30.4 | −30.9 | −24.7 | −24.7 | −24.3 |
| 13 | −43.2 | −44.8 | −44.7 | −45.0 | −39.8 | −39.8 | −39.4 |
| 14 | −52.0 | −51.2 | −51.3 | −52.9 | −50.0 | −50.0 | −49.2 |
| 15 | −4.1 | 1.9 | 1.7 | 1.6 | 5.4 | 5.4 | 5.5 |
| 16 | −39.8 | −41.7 | −42.0 | −43.0 | −42.3 | −42.3 | −40.6 |
| 17 | −16.1 | −14.8 | −14.9 | −14.7 | −18.2 | −18.2 | −17.7 |
| 18 | −34.2 | −34.8 | −34.5 | −35.7 | −24.7 | −24.7 | −23.9 |
| 19 | −40.1 | −39.0 | −38.5 | −39.9 | −28.8 | −28.8 | −27.7 |
| 20 | −30.2 | −31.2 | −30.9 | −31.9 | −24.6 | −24.6 | −23.6 |
| 21 | −15.1 | −23.9 | −23.7 | −25.2 | −13.7 | −13.7 | −12.7 |
| 22 | −35.9 | −37.5 | −37.1 | −38.5 | −21.8 | −21.7 | −20.7 |
| 23 | −55.0 | −52.2 | −51.6 | −53.4 | −40.8 | −40.8 | −39.8 |
| 24 | −41.6 | −39.1 | −40.0 | −41.0 | −39.0 | −38.9 | −36.8 |
| 25 | −45.9 | −43.5 | −44.7 | −45.3 | −44.0 | −43.9 | −41.1 |
| 26 | −36.4 | −35.5 | −35.6 | −36.1 | −33.7 | −33.7 | −33.0 |
| 27 | −21.8 | −21.5 | −21.7 | −22.2 | −22.2 | −22.2 | −22.1 |
| 28 | −36.3 | −36.0 | −36.2 | −36.6 | −33.9 | −33.9 | −33.5 |
| 29 | −28.3 | −28.9 | −29.1 | −28.8 | −25.7 | −25.7 | −25.4 |
| 30 | −14.9 | −23.2 | −23.2 | −22.9 | −12.4 | −12.4 | −12.2 |
| 31 | −29.9 | −28.0 | −28.4 | −29.3 | −23.8 | −23.8 | −23.9 |
| 32 | −1.9 | −0.5 | −0.3 | 0.2 | 1.4 | 1.4 | 1.4 |
| 33 | −10.7 | −10.5 | −10.6 | −10.4 | −12.1 | −12.1 | −11.8 |
| 34 | −25.6 | −25.1 | −25.1 | −25.5 | −22.1 | −22.1 | −21.7 |
| 35 | −30.9 | −29.1 | −29.0 | −29.8 | −23.6 | −23.5 | −23.2 |
| 36 | −39.8 | −33.6 | −33.2 | −34.5 | −29.8 | −29.8 | −29.6 |
| 37 | −14.0 | −29.7 | −30.4 | −30.5 | −26.4 | −26.5 | −26.1 |
| 38 | −64.4 | −73.0 | −73.0 | −74.4 | −59.1 | −59.1 | −57.4 |
| 39 | −63.9 | −56.7 | −56.8 | −59.0 | −56.1 | −56.1 | −55.6 |
| 40 | −65.8 | −66.7 | −66.5 | −66.9 | −65.0 | −65.0 | −64.7 |
| 41 | −3.2 | −1.3 | −1.4 | −1.4 | −4.7 | −4.7 | −4.9 |
| MD | | −0.6 | −0.6 | −1.3 | 3.6 | 3.6 | 4.2 |
| MAD | | 3.3 | 3.3 | 3.2 | 5.1 | 5.1 | 5.4 |
| RMSD | | 4.8 | 4.9 | 4.9 | 6.4 | 6.4 | 6.8 |
| SD | | 30.7 | 31.0 | 30.5 | 34.1 | 34.1 | 34.5 |
| Var | | 23.6 | 24.1 | 23.3 | 29.1 | 29.1 | 29.7 |
| Max | | 7.2 | 7.2 | 5.7 | 14.2 | 14.2 | 15.2 |
| Min | | −15.7 | −16.3 | −16.5 | −12.4 | −12.5 | −12.1 |
| AMax | | 15.7 | 16.3 | 16.5 | 14.2 | 14.2 | 15.2 |

Table A18: Extended statistical evaluation of PBE and RPBE in kcal mol⁻¹. We abbreviate D3(BJ)-ATM by D3.

| # | Ref. | PBE D4-ATM | PBE D4-MBD | PBE D3 | RPBE D4-ATM | RPBE D4-MBD | RPBE D3 |
|---|---|---|---|---|---|---|---|
| 1 | −43.1 | −46.7 | −46.8 | −46.7 | −44.4 | −44.6 | −49.0 |
| 2 | −46.6 | −50.0 | −50.0 | −50.0 | −47.3 | −47.6 | −51.6 |
| 3 | −27.6 | −31.5 | −31.5 | −31.4 | −28.8 | −29.0 | −31.8 |
| 4 | −62.5 | −75.1 | −75.2 | −75.2 | −74.0 | −74.5 | −76.4 |
| 5 | 3.7 | −7.4 | −7.4 | −7.4 | −7.2 | −7.2 | −8.6 |
| 6 | −23.1 | −20.4 | −20.4 | −20.4 | −17.7 | −17.8 | −11.6 |
| 7 | −16.2 | −14.9 | −14.8 | −15.3 | −12.1 | −11.7 | −4.3 |
| 8 | −17.2 | −14.4 | −14.5 | −14.7 | −14.1 | −14.4 | −13.6 |
| 9 | −18.8 | −16.0 | −15.9 | −16.4 | −15.3 | −15.2 | −10.5 |
| 10 | −22.6 | −24.5 | −24.6 | −24.4 | −24.6 | −24.9 | −28.1 |
| 11 | 27.0 | 25.0 | 25.0 | 25.0 | 24.4 | 24.9 | 18.2 |
| 12 | −29.8 | −30.8 | −30.7 | −30.6 | −30.8 | −30.8 | −26.0 |
| 13 | −43.2 | −45.2 | −45.2 | −45.0 | −44.9 | −45.1 | −48.9 |
| 14 | −52.0 | −52.3 | −52.5 | −52.6 | −51.2 | −51.5 | −55.5 |
| 15 | −4.1 | 1.2 | 1.1 | 1.1 | 1.2 | 0.8 | 3.8 |
| 16 | −39.8 | −39.5 | −39.9 | −39.7 | −40.5 | −41.0 | −42.2 |
| 17 | −16.1 | −14.4 | −14.5 | −14.4 | −14.6 | −14.6 | −2.1 |
| 18 | −34.2 | −32.5 | −32.6 | −32.1 | −36.0 | −36.0 | −44.7 |
| 19 | −40.1 | −36.3 | −36.4 | −35.8 | −39.9 | −39.9 | −48.6 |
| 20 | −30.2 | −28.1 | −28.1 | −27.7 | −32.2 | −32.1 | −38.0 |
| 21 | −15.1 | −20.4 | −20.6 | −20.3 | −25.2 | −25.3 | −28.0 |
| 22 | −35.9 | −35.2 | −35.4 | −34.7 | −38.5 | −38.6 | −48.8 |
| 23 | −55.0 | −50.2 | −50.3 | −49.6 | −53.8 | −53.6 | −65.6 |
| 24 | −41.6 | −36.8 | −37.6 | −37.2 | −38.1 | −39.4 | −47.5 |
| 25 | −45.9 | −41.4 | −42.3 | −41.7 | −41.6 | −43.3 | −50.6 |
| 26 | −36.4 | −34.9 | −35.1 | −34.9 | −35.6 | −35.9 | −41.8 |
| 27 | −21.8 | −21.9 | −22.1 | −22.1 | −21.8 | −22.3 | −30.1 |
| 28 | −36.3 | −35.5 | −35.7 | −35.5 | −36.3 | −36.8 | −44.2 |
| 29 | −28.3 | −27.7 | −27.8 | −27.5 | −29.0 | −29.4 | −35.5 |
| 30 | −14.9 | −19.9 | −20.1 | −19.7 | −23.6 | −23.8 | −28.7 |
| 31 | −29.9 | −26.6 | −26.9 | −27.0 | −28.9 | −29.6 | −36.5 |
| 32 | −1.9 | 0.9 | 0.9 | 1.2 | −0.5 | −0.4 | −0.4 |
| 33 | −10.7 | −9.5 | −9.6 | −9.7 | −10.1 | −10.1 | −5.2 |
| 34 | −25.6 | −23.8 | −23.9 | −23.8 | −25.0 | −25.0 | −27.0 |
| 35 | −30.9 | −28.1 | −28.2 | −28.1 | −29.2 | −29.2 | −34.1 |
| 36 | −39.8 | −32.8 | −32.9 | −32.7 | −34.3 | −34.0 | −41.5 |
| 37 | −14.0 | −29.8 | −30.0 | −30.0 | −30.3 | −31.3 | −31.1 |
| 38 | −64.4 | −68.5 | −68.8 | −68.4 | −73.9 | −74.3 | −99.4 |
| 39 | −63.9 | −57.0 | −57.2 | −57.1 | −58.6 | −59.5 | −68.8 |
| 40 | −65.8 | −68.4 | −68.5 | −68.2 | −67.4 | −67.5 | −79.4 |
| 41 | −3.2 | −2.8 | −2.9 | −3.2 | −1.1 | −1.0 | 2.6 |
| MD | | −0.1 | −0.3 | −0.1 | −0.9 | −1.1 | −4.7 |
| MAD | | 3.5 | 3.4 | 3.5 | 3.4 | 3.4 | 8.3 |
| RMSD | | 4.8 | 4.7 | 4.8 | 5.0 | 5.1 | 10.1 |
| SD | | 30.5 | 30.4 | 30.6 | 31.4 | 31.7 | 57.5 |
| Var | | 23.3 | 23.0 | 23.4 | 24.6 | 25.2 | 82.6 |
| Max | | 7.0 | 6.9 | 7.1 | 5.5 | 5.9 | 14.0 |
| Min | | −15.7 | −16.0 | −16.0 | −16.2 | −17.3 | −35.0 |
| AMax | | 15.7 | 16.0 | 16.0 | 16.2 | 17.3 | 35.0 |

K. Statistical evaluation: SCONF

Table A19: Reference values are calculated at the DLPNO-CCSD(T)/TightPNO/CBS(aug-cc-pVTZ/aug-cc-pVQZ) level of theory. We follow the nomenclature of the GMTKN55 [24] database. We abbreviate D3(BJ)-ATM by D3 and D4-ATM by D4.

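| # | Ref. | DSDBLYP D4 | DSDBLYP D3 | B3LYP D4 | B3LYP D3 | PW6B95 D4 | PW6B95 D3 | PBE D4 | PBE D3 |
|---|---|---|---|---|---|---|---|---|---|
| ANGOL15 | | | | | | | | | |
| C1–C2 | 0.9 | 0.9 | 0.9 | 0.8 | 0.8 | 0.9 | 0.9 | 0.8 | 0.8 |
| C1–C3 | 2.3 | 2.2 | 2.2 | 2.5 | 2.6 | 2.5 | 2.5 | 3.2 | 3.2 |
| C1–C4 | 3.1 | 3.1 | 3.1 | 3.2 | 3.3 | 3.3 | 3.3 | 3.9 | 3.9 |
| C1–C5 | 4.6 | 4.5 | 4.5 | 4.5 | 4.4 | 4.7 | 4.7 | 5.1 | 5.1 |
| C1–C6 | 4.9 | 4.8 | 4.8 | 4.7 | 4.7 | 5.1 | 5.1 | 5.4 | 5.4 |
| C1–C7 | 4.2 | 4.2 | 4.2 | 4.6 | 4.5 | 4.3 | 4.2 | 5.3 | 5.3 |
| C1–C8 | 4.4 | 4.3 | 4.3 | 4.7 | 4.5 | 4.4 | 4.3 | 5.3 | 5.2 |
| C1–C9 | 6.2 | 6.2 | 6.1 | 6.2 | 6.0 | 6.4 | 6.3 | 6.8 | 6.6 |
| C1–C10 | 6.2 | 6.1 | 6.1 | 6.3 | 6.1 | 6.5 | 6.4 | 7.1 | 6.9 |
| C1–C11 | 5.7 | 5.6 | 5.6 | 5.9 | 5.7 | 5.8 | 5.7 | 6.5 | 6.4 |
| C1–C12 | 5.6 | 5.6 | 5.6 | 6.0 | 5.8 | 5.8 | 5.7 | 6.7 | 6.6 |
| C1–C13 | 5.9 | 5.8 | 5.8 | 6.0 | 5.7 | 6.7 | 6.5 | 6.8 | 6.7 |
| C1–C14 | 6.3 | 6.3 | 6.3 | 6.4 | 6.2 | 6.5 | 6.4 | 6.8 | 6.7 |
| C1–C15 | 6.2 | 6.3 | 6.2 | 6.5 | 6.0 | 6.2 | 6.0 | 6.8 | 6.5 |
| GLC4 | | | | | | | | | |
| G1–G2 | 0.2 | 0.2 | 0.2 | 0.1 | 0.1 | 0.2 | 0.2 | 0.2 | 0.2 |
| G1–G3 | 6.2 | 6.2 | 6.3 | 5.3 | 5.6 | 4.8 | 5.0 | 4.6 | 4.8 |
| G1–G4 | 5.5 | 5.1 | 5.2 | 3.5 | 3.8 | 4.8 | 5.0 | 2.6 | 2.8 |
| MD | | 0.0 | 0.0 | −0.1 | −0.1 | 0.0 | 0.0 | 0.3 | 0.3 |
| MAD | | 0.1 | 0.1 | 0.3 | 0.3 | 0.3 | 0.2 | 0.9 | 0.8 |
| RMSD | | 0.1 | 0.1 | 0.6 | 0.5 | 0.4 | 0.4 | 1.1 | 1.0 |
| SD | | 0.5 | 0.4 | 2.4 | 1.9 | 1.8 | 1.6 | 4.3 | 3.9 |
| Var | | 0.0 | 0.0 | 0.3 | 0.2 | 0.2 | 0.2 | 1.1 | 0.9 |
| Max | | 0.1 | 0.1 | 0.4 | 0.3 | 0.7 | 0.6 | 1.2 | 1.1 |
| Min | | −0.4 | −0.4 | −2.1 | −1.7 | −1.3 | −1.2 | −3.0 | −2.7 |
| AMax | | 0.4 | 0.4 | 2.1 | 1.7 | 1.3 | 1.2 | 3.0 | 2.7 |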

L. Statistical evaluation: PCONF21

Table A20: The reference energies were generated at the DLPNO-CCSD(T)/TightPNO/CBS(aug-cc-pVTZ/aug-cc-pVQZ) level of theory while the original geometries were kept. We follow the nomenclature of the GMTKN55 [24] database. We abbreviate D3(BJ)-ATM by D3 and D4-ATM by D4.

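| # | Ref. | DSDBLYP D4 | DSDBLYP D3 | B3LYP D4 | B3LYP D3 | PW6B95 D4 | PW6B95 D3 | PBE D4 | PBE D3 |
|---|---|---|---|---|---|---|---|---|---|
| Tripeptides | | | | | | | | | |
| 99–444 | 0.0 | 0.2 | 0.1 | −0.3 | −0.7 | 0.9 | 0.7 | −2.2 | −2.5 |
| 99–357 | 1.0 | 1.0 | 0.8 | 0.9 | 0.1 | 0.9 | 0.6 | −1.0 | −1.5 |
| 99–366 | 0.7 | 1.1 | 1.0 | 1.2 | 1.1 | 1.7 | 1.7 | 0.6 | 0.5 |
| 99–215 | 0.8 | 1.1 | 0.9 | 0.6 | 0.2 | 1.6 | 1.4 | −1.2 | −1.4 |
| 99–300 | 0.8 | 1.3 | 1.2 | 1.5 | 1.2 | 1.6 | 1.5 | 0.8 | 0.6 |
| 99–114 | 1.9 | 1.8 | 1.6 | 1.7 | 1.1 | 1.6 | 1.3 | 0.1 | −0.3 |
| 99–412 | 2.2 | 2.2 | 2.1 | 2.0 | 1.9 | 2.1 | 2.1 | 1.4 | 1.4 |
| 99–691 | 1.6 | 1.9 | 1.8 | 1.9 | 1.8 | 2.3 | 2.3 | 1.1 | 1.1 |
| 99–470 | 1.9 | 2.0 | 1.9 | 2.3 | 1.7 | 2.6 | 2.3 | 1.4 | 1.0 |
| 99–224 | 2.1 | 1.8 | 1.7 | 1.2 | 1.0 | 2.9 | 2.8 | −0.2 | −0.4 |
| GLY | | | | | | | | | |
| GLY ab–GLY aR | 1.1 | 1.1 | 1.2 | 0.9 | 1.3 | 1.2 | 1.4 | 1.3 | 1.6 |
| GLY ab–GLY pII | 1.2 | 1.5 | 1.5 | 1.8 | 1.9 | 2.2 | 2.3 | 2.3 | 2.4 |
| GLY ab–GLY aL | 2.4 | 2.5 | 2.7 | 2.4 | 3.0 | 2.2 | 2.5 | 2.8 | 3.2 |
| GLY ab–GLY b | 2.1 | 1.8 | 1.9 | 1.8 | 1.8 | 2.1 | 2.1 | 1.3 | 1.3 |
| SER | | | | | | | | | |
| SER ab–SER aR | 1.5 | 1.6 | 1.6 | 1.6 | 1.9 | 1.6 | 1.7 | 2.0 | 2.2 |
| SER ab–SER pII | 2.8 | 3.1 | 3.2 | 3.2 | 3.5 | 3.6 | 3.7 | 3.9 | 4.0 |
| SER ab–SER aL | 2.3 | 2.7 | 2.8 | 2.6 | 3.2 | 1.8 | 2.1 | 3.4 | 3.7 |
| SER ab–SER b | 2.7 | 2.5 | 2.6 | 2.5 | 2.5 | 2.9 | 2.9 | 2.3 | 2.3 |
| MD | | 0.1 | 0.1 | 0.0 | 0.0 | 0.4 | 0.3 | −0.5 | −0.5 |
| MAD | | 0.2 | 0.2 | 0.3 | 0.5 | 0.5 | 0.5 | 1.0 | 1.2 |
| RMSD | | 0.3 | 0.3 | 0.4 | 0.6 | 0.6 | 0.6 | 1.2 | 1.4 |
| SD | | 1.0 | 1.2 | 1.7 | 2.6 | 2.0 | 2.0 | 4.8 | 5.7 |
| Var | | 0.1 | 0.1 | 0.2 | 0.4 | 0.2 | 0.2 | 1.4 | 1.9 |
| Max | | 0.5 | 0.6 | 0.7 | 0.9 | 1.0 | 1.0 | 1.1 | 1.5 |
| Min | | −0.3 | −0.4 | −0.9 | −1.1 | −0.5 | −0.6 | −2.3 | −2.5 |
| AMax | | 0.5 | 0.6 | 0.9 | 1.1 | 1.0 | 1.0 | 2.3 | 2.5 |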

M. Statistical evaluation: ICONF

Table A21: Reference energies are obtained with the W1–F12 protocol on TPSS-D3(BJ)/def2-TZVP optimised geometries without spin–orbit and DBOC. We follow the nomenclature of the GMTKN55 [24] database. We abbreviate D3(BJ)-ATM by D3 and D4-ATM by D4.

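| # | Ref. | DSDBLYP D4 | DSDBLYP D3 | B3LYP D4 | B3LYP D3 | PW6B95 D4 | PW6B95 D3 | PBE D4 | PBE D3 |
|---|---|---|---|---|---|---|---|---|---|
| N3H5 1–N3H5 2 | 0.9 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.2 | 1.2 |
| N3H5 1–N3H5 3 | 5.3 | 5.3 | 5.4 | 5.2 | 5.3 | 5.2 | 5.2 | 5.6 | 5.6 |
| N4H6 1–N4H6 2 | 0.1 | 0.5 | 0.5 | 0.6 | 0.6 | 0.8 | 0.9 | 0.4 | 0.4 |
| N4H6 1–N4H6 3 | 2.3 | 2.7 | 2.7 | 3.0 | 3.1 | 3.2 | 3.2 | 3.0 | 3.0 |
| N3P3H12 1–N3P3H12 2 | 12.2 | 12.5 | 12.5 | 12.2 | 12.3 | 12.2 | 12.2 | 11.9 | 11.9 |
| SI5H12 1–SI5H12 2 | 0.1 | −0.1 | 0.0 | 0.1 | 0.2 | −0.1 | 0.0 | 0.1 | 0.2 |
| SI5H12 1–SI5H12 3 | 1.0 | 1.0 | 1.0 | 1.0 | 0.9 | 0.9 | 0.9 | 0.9 | 0.9 |
| SI5H12 1–SI5H12 4 | 3.5 | 3.7 | 3.7 | 3.3 | 3.3 | 3.5 | 3.4 | 3.1 | 3.1 |
| SI6H12 1–SI6H12 2 | 1.7 | 1.8 | 1.8 | 1.7 | 1.5 | 1.4 | 1.4 | 1.6 | 1.4 |
| P7H7 1–P7H7 2 | 1.4 | 1.6 | 1.6 | 1.6 | 1.6 | 1.5 | 1.6 | 1.2 | 1.2 |
| S4O4 1–S4O4 2 | 4.4 | 4.5 | 4.5 | 4.2 | 4.3 | 3.8 | 3.9 | 5.0 | 5.0 |
| S8 1–S8 2 | 9.2 | 9.1 | 9.1 | 9.4 | 9.1 | 9.1 | 9.2 | 10.3 | 10.0 |
| H2S2O7 1–H2S2O7 2 | 0.6 | 0.5 | 0.5 | 0.6 | 0.6 | 0.6 | 0.6 | 0.5 | 0.4 |
| H2S2O7 1–H2S2O7 3 | 3.5 | 3.2 | 3.2 | 3.1 | 3.2 | 3.6 | 3.6 | 3.1 | 3.2 |
| H4P2O7 1–H4P2O7 2 | 1.3 | 1.4 | 1.4 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 |
| H4P2O7 1–H4P2O7 3 | 3.7 | 3.4 | 3.4 | 3.0 | 3.0 | 3.6 | 3.6 | 3.8 | 3.8 |
| H4P2O7 1–H4P2O7 4 | 4.3 | 3.7 | 3.7 | 3.2 | 3.4 | 4.5 | 4.4 | 4.2 | 4.3 |
| MD | | 0.0 | 0.0 | −0.1 | −0.1 | 0.0 | 0.0 | 0.1 | 0.1 |
| MAD | | 0.2 | 0.2 | 0.3 | 0.3 | 0.2 | 0.2 | 0.3 | 0.3 |
| RMSD | | 0.3 | 0.3 | 0.4 | 0.4 | 0.3 | 0.3 | 0.4 | 0.4 |
| SD | | 1.1 | 1.1 | 1.7 | 1.6 | 1.4 | 1.4 | 1.6 | 1.5 |
| Var | | 0.1 | 0.1 | 0.2 | 0.2 | 0.1 | 0.1 | 0.2 | 0.1 |
| Max | | 0.4 | 0.4 | 0.7 | 0.8 | 0.8 | 0.9 | 1.1 | 0.8 |
| Min | | −0.7 | −0.6 | −1.1 | −1.0 | −0.6 | −0.5 | −0.4 | −0.4 |
| AMax | | 0.7 | 0.6 | 1.1 | 1.0 | 0.8 | 0.9 | 1.1 | 0.8 |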

N. Statistical evaluation: UPU23

Table A22: Reference values are obtained at the DLPNO-CCSD(T)/CBS*//TPSS-D3(BJ)/def2-TZVP(COSMO) level of theory. We follow the nomenclature of the GMTKN55 [24] database. We abbreviate D3(BJ)-ATM by D3 and D4-ATM by D4.

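| # | Ref. | DSDBLYP D4 | DSDBLYP D3 | B3LYP D4 | B3LYP D3 | PW6B95 D4 | PW6B95 D3 | PBE D4 | PBE D3 |
|---|---|---|---|---|---|---|---|---|---|
| 2p–1a | 4.9 | 5.4 | 5.5 | 5.6 | 6.0 | 6.0 | 6.1 | 5.5 | 5.7 |
| 2p–1b | 3.0 | 3.7 | 3.8 | 3.6 | 4.1 | 4.2 | 4.3 | 4.0 | 4.2 |
| 2p–1c | 8.9 | 9.4 | 9.6 | 9.7 | 10.1 | 10.4 | 10.6 | 9.6 | 9.8 |
| 2p–1g | 2.2 | 2.6 | 2.6 | 2.7 | 2.7 | 3.2 | 3.2 | 2.9 | 2.9 |
| 2p–1p | 2.0 | 2.7 | 2.8 | 2.4 | 2.7 | 3.0 | 3.0 | 2.5 | 2.6 |
| 2p–2a | 3.1 | 3.0 | 3.0 | 3.5 | 3.5 | 2.9 | 2.9 | 3.1 | 3.1 |
| 2p–5z | 0.6 | 0.2 | 0.4 | −0.3 | 0.1 | 0.9 | 0.9 | 0.8 | 1.0 |
| 2p–6p | 3.3 | 3.2 | 3.2 | 3.1 | 3.2 | 3.0 | 3.1 | 2.9 | 2.9 |
| 2p–7a | 7.3 | 8.4 | 8.4 | 8.8 | 8.8 | 8.4 | 8.4 | 8.9 | 8.9 |
| 2p–aa | 4.0 | 4.4 | 4.4 | 4.8 | 4.9 | 4.2 | 4.2 | 5.0 | 5.2 |
| 2p–1e | 11.1 | 11.8 | 11.8 | 11.8 | 12.0 | 12.2 | 12.3 | 11.7 | 11.9 |
| 2p–0a | 4.8 | 5.9 | 6.0 | 6.0 | 6.2 | 5.7 | 5.8 | 5.6 | 5.7 |
| 2p–1f | 14.4 | 14.1 | 14.1 | 14.3 | 14.4 | 14.1 | 14.2 | 13.8 | 13.9 |
| 2p–9a | 5.2 | 5.5 | 5.5 | 5.8 | 5.7 | 5.5 | 5.4 | 5.7 | 5.7 |
| 2p–4b | 5.5 | 5.6 | 5.6 | 5.5 | 5.6 | 5.7 | 5.7 | 5.3 | 5.4 |
| 2p–3a | 6.8 | 7.2 | 7.2 | 7.3 | 7.3 | 7.6 | 7.6 | 6.7 | 6.7 |
| 2p–7p | 3.9 | 3.7 | 3.7 | 3.6 | 3.6 | 3.5 | 3.5 | 3.5 | 3.5 |
| 2p–8d | 6.4 | 6.4 | 6.5 | 6.7 | 6.8 | 6.4 | 6.5 | 6.5 | 6.6 |
| 2p–3d | 5.4 | 5.5 | 5.5 | 5.7 | 5.8 | 5.7 | 5.8 | 5.6 | 5.7 |
| 2p–0b | 6.7 | 6.5 | 6.5 | 6.1 | 6.2 | 6.3 | 6.3 | 5.9 | 6.0 |
| 2p–1m | 5.6 | 6.8 | 6.8 | 6.7 | 6.7 | 6.7 | 6.7 | 5.9 | 5.9 |
| 2p–2h | 10.4 | 10.9 | 10.9 | 11.5 | 11.4 | 10.7 | 10.8 | 10.4 | 10.4 |
| 2p–3b | 6.1 | 6.5 | 6.5 | 6.4 | 6.4 | 6.6 | 6.6 | 6.2 | 6.2 |
| MD | | 0.3 | 0.4 | 0.4 | 0.6 | 0.5 | 0.5 | 0.3 | 0.4 |
| MAD | | 0.4 | 0.5 | 0.6 | 0.7 | 0.6 | 0.7 | 0.5 | 0.6 |
| RMSD | | 0.6 | 0.6 | 0.7 | 0.8 | 0.8 | 0.8 | 0.6 | 0.7 |
| SD | | 2.1 | 2.2 | 2.7 | 2.8 | 2.8 | 2.9 | 2.8 | 2.9 |
| Var | | 0.2 | 0.2 | 0.3 | 0.3 | 0.4 | 0.4 | 0.3 | 0.4 |
| Max | | 1.2 | 1.2 | 1.5 | 1.6 | 1.5 | 1.7 | 1.6 | 1.7 |
| Min | | −0.3 | −0.3 | −0.9 | −0.5 | −0.4 | −0.4 | −0.8 | −0.8 |
| AMax | | 1.2 | 1.2 | 1.5 | 1.6 | 1.5 | 1.7 | 1.6 | 1.7 |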

O. Statistical evaluation: ROT34

Table A23: Statistical data for the results of the ROT34 test set for three DFAs using the def2-QZVP basis set. Anharmonic corrections have been performed at the HF/DZ level of theory as described in the literature [25]. All values are given in MHz. We abbreviate D3(BJ)-ATM by D3 and D4-ATM by D4.

| # | Rot. const. | Ref. | PBE0 D3 | PBE0 D4 | PBE D3 | PBE D4 | TPSS D3 | TPSS D4 |
|---|---|---|---|---|---|---|---|---|
| 1 | A | 4293.9 | 4299.8 | 4298.1 | 4240.8 | 4238.8 | 4235.9 | 4235.8 |
| 1 | B | 1395.9 | 1400.7 | 1400.7 | 1383.9 | 1382.9 | 1384.5 | 1384.1 |
| 1 | C | 1130.2 | 1133.1 | 1132.9 | 1119.3 | 1118.3 | 1119.5 | 1119.1 |
| 2 | A | 3322.5 | 3309.3 | 3307.7 | 3247.0 | 3247.8 | 3239.0 | 3240.4 |
| 2 | B | 719.8 | 718.8 | 719.0 | 707.2 | 707.5 | 709.5 | 709.4 |
| 2 | C | 698.0 | 697.0 | 697.2 | 686.0 | 686.0 | 687.7 | 687.6 |
| 3 | A | 3071.1 | 3071.9 | 3071.3 | 3023.0 | 3022.8 | 3021.0 | 3022.3 |
| 3 | B | 1285.0 | 1289.9 | 1290.4 | 1271.9 | 1270.9 | 1271.1 | 1270.2 |
| 3 | C | 1248.7 | 1249.0 | 1249.3 | 1232.4 | 1231.7 | 1231.0 | 1230.7 |
| 4 | A | 2755.9 | 2765.6 | 2765.8 | 2731.3 | 2731.7 | 2729.3 | 2730.3 |
| 4 | B | 2675.6 | 2689.5 | 2689.3 | 2652.4 | 2652.1 | 2653.3 | 2653.1 |
| 4 | C | 2653.3 | 2666.5 | 2666.8 | 2631.5 | 2631.8 | 2633.1 | 2634.1 |
| 5 | A | 2336.9 | 2339.1 | 2339.9 | 2307.0 | 2306.5 | 2306.0 | 2307.1 |
| 6 | A | 1464.2 | 1471.0 | 1471.1 | 1440.0 | 1439.7 | 1439.1 | 1439.9 |
| 6 | B | 768.2 | 767.6 | 768.3 | 756.4 | 757.1 | 762.1 | 763.0 |
| 6 | C | 580.6 | 580.9 | 581.4 | 572.3 | 572.7 | 576.1 | 576.8 |
| 7 | A | 1165.7 | 1170.2 | 1170.4 | 1152.1 | 1153.5 | 1154.6 | 1155.9 |
| 7 | B | 661.2 | 660.6 | 661.3 | 653.3 | 653.8 | 654.0 | 654.6 |
| 7 | C | 454.0 | 454.6 | 454.9 | 448.9 | 449.4 | 449.6 | 450.2 |
| 8 | A | 1166.3 | 1167.7 | 1168.3 | 1147.9 | 1148.4 | 1153.1 | 1155.3 |
| 8 | B | 767.6 | 766.4 | 767.0 | 752.7 | 753.0 | 754.3 | 755.0 |
| 8 | C | 513.0 | 512.5 | 512.9 | 504.3 | 504.5 | 505.6 | 506.4 |
| 9 | A | 862.5 | 865.9 | 866.0 | 852.4 | 852.4 | 853.2 | 853.8 |
| 9 | B | 754.2 | 752.8 | 752.8 | 741.8 | 741.7 | 742.6 | 742.9 |
| 9 | C | 513.7 | 513.6 | 513.7 | 505.7 | 505.6 | 506.5 | 506.8 |
| 10 | A | 3086.2 | 3101.0 | 3100.5 | 3060.2 | 3059.9 | 3061.3 | 3061.6 |
| 10 | B | 723.7 | 725.3 | 725.2 | 716.2 | 715.9 | 715.8 | 716.0 |
| 10 | C | 685.0 | 686.7 | 686.6 | 678.0 | 677.8 | 677.7 | 677.9 |
| 11 | A | 1432.1 | 1436.0 | 1435.5 | 1416.5 | 1416.5 | 1418.6 | 1418.9 |
| 11 | B | 820.5 | 822.8 | 822.9 | 810.9 | 811.4 | 812.1 | 813.3 |
| 11 | C | 679.4 | 683.0 | 682.9 | 674.1 | 675.0 | 675.4 | 676.1 |
| 12 | A | 1523.2 | 1523.3 | 1521.2 | 1496.3 | 1495.8 | 1497.2 | 1497.4 |
| 12 | B | 1070.5 | 1075.0 | 1076.0 | 1059.8 | 1060.6 | 1060.9 | 1061.9 |
| 12 | C | 719.9 | 721.1 | 721.5 | 709.3 | 709.7 | 711.0 | 711.9 |
| MD | | | 2.6 | 2.7 | −18.1 | −18.1 | −17.6 | −17.0 |
| MAD | | | 3.8 | 3.9 | 18.1 | 18.1 | 17.6 | 17.0 |
| RMSD | | | 5.7 | 5.8 | 23.3 | 23.4 | 24.1 | 23.7 |
| SD | | | 29.4 | 30.0 | 85.5 | 86.5 | 96.5 | 95.9 |
| Var | | | 26.1 | 27.3 | 221.8 | 226.5 | 282.2 | 278.9 |
| Max | | | 14.8 | 14.3 | −5.1 | −4.4 | −4.0 | −3.3 |
| Min | | | −13.2 | −14.8 | −75.5 | −74.7 | −83.5 | −82.1 |
| AMax | | | 14.8 | 14.8 | 75.5 | 74.7 | 83.5 | 82.1 |

Table A24: Statistical data for the results of the ROT34 test set for three DFAs using the def2-QZVP basis set. Anharmonic corrections have been performed at the HF/DZ level of theory as described in the literature [25]. All values are given in MHz. We abbreviate D3(BJ)-ATM by D3 and D4(TB)-ATM by D4.

#   Const.     Ref.  PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
1   A        4293.9   4299.8   4300.5   4240.8   4239.3   4235.9   4239.0
1   B        1395.9   1400.7   1400.3   1383.9   1383.6   1384.5   1384.9
1   C        1130.2   1133.1   1132.8   1119.3   1118.8   1119.5   1119.9
2   A        3322.5   3309.3   3308.4   3247.0   3247.9   3239.0   3243.4
2   B         719.8    718.8    718.9    707.2    708.0    709.5    710.5
2   C         698.0    697.0    697.1    686.0    686.3    687.7    688.3
3   A        3071.1   3071.9   3072.6   3023.0   3023.1   3021.0   3024.1
3   B        1285.0   1289.9   1290.0   1271.9   1271.9   1271.1   1272.3
3   C        1248.7   1249.0   1248.8   1232.4   1232.0   1231.0   1231.8
4   A        2755.9   2765.6   2766.1   2731.3   2732.3   2729.3   2732.0
4   B        2675.6   2689.5   2689.8   2652.4   2652.8   2653.3   2654.7
4   C        2653.3   2666.5   2667.2   2631.5   2632.2   2633.1   2635.1
5   A        2336.9   2339.1   2339.1   2307.0   2307.1   2306.0   2308.6
6   A        1464.2   1471.0   1471.4   1440.0   1440.2   1439.1   1440.7
6   B         768.2    767.6    768.5    756.4    757.3    762.1    764.5
6   C         580.6    580.9    581.5    572.3    572.8    576.1    577.9
7   A        1165.7   1170.2   1170.1   1152.1   1153.4   1154.6   1156.2
7   B         661.2    660.6    661.0    653.3    653.8    654.0    654.7
7   C         454.0    454.6    454.8    448.9    449.4    449.6    450.2
8   A        1166.3   1167.7   1168.3   1147.9   1149.6   1153.1   1158.8
8   B         767.6    766.4    766.9    752.7    753.3    754.3    755.8
8   C         513.0    512.5    512.8    504.3    504.8    505.6    507.2
9   A         862.5    865.9    866.0    852.4    852.6    853.2    854.7
9   B         754.2    752.8    752.8    741.8    741.9    742.6    743.6
9   C         513.7    513.6    513.7    505.7    505.8    506.5    507.3
10  A        3086.2   3101.0   3100.8   3060.2   3060.5   3061.3   3063.5
10  B         723.7    725.3    725.3    716.2    716.1    715.8    716.6
10  C         685.0    686.7    686.6    678.0    677.9    677.7    678.5
11  A        1432.1   1436.0   1435.6   1416.5   1416.4   1418.6   1421.2
11  B         820.5    822.8    822.6    810.9    810.9    812.1    813.2
11  C         679.4    683.0    682.7    674.1    674.1    675.4    676.3
12  A        1523.2   1523.3   1520.1   1496.3   1495.5   1497.2   1499.0
12  B        1070.5   1075.0   1076.4   1059.8   1060.6   1060.9   1062.5
12  C         719.9    721.1    721.5    709.3    710.1    711.0    712.7

Statistic   PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
MD              2.6      2.7    −18.1    −17.8    −17.6    −15.8
MAD             3.8      4.0     18.1     17.8     17.6     15.8
RMSD            5.7      5.9     23.3     23.1     24.1     22.4
SD             29.4     30.6     85.5     86.0     96.5     92.7
Var            26.1     28.3    221.8    224.3    282.2    260.3
Max            14.8     14.6     −5.1     −4.6     −4.0     −2.7
Min           −13.2    −14.1    −75.5    −74.6    −83.5    −79.1
AMax           14.8     14.6     75.5     74.6     83.5     79.1

P. Statistical evaluation: LMGB35

Table A25: The LMGB35 benchmark set consists of systems from the first and second rows of the periodic table. All bond lengths are given in pm. Reference distances are taken from Ref. [26]. We abbreviate D3(BJ)-ATM by D3 and D4-ATM by D4.

system(bond)     Ref.  PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
H2(H–H)          74.1     74.5     74.4     75.0     75.0     74.3     74.2
HF(H–F)          91.7     91.8     91.7     93.0     93.0     92.9     92.9
H2O(HO)          95.7     95.7     95.7     96.9     96.9     96.7     96.7
HOF(OH)          96.6     96.6     96.6     97.9     97.9     97.7     97.7
OH(O–H)          97.0     97.0     97.0     98.3     98.3     98.2     98.1
NH3(N–H)        101.2    101.1    101.1    102.1    102.1    101.9    101.8
OH+(O–H)        102.9    103.2    103.2    104.7    104.7    104.0    103.9
NH(N–H)         103.6    103.7    103.7    105.0    104.9    104.4    104.4
C2H2(C–H)       106.2    106.4    106.4    107.0    107.0    106.5    106.5
NO+(N–O)        106.3    105.3    105.3    106.9    106.9    106.6    106.6
HCN(H–C)        106.5    106.8    106.8    107.5    107.5    107.0    107.0
NH+(N–H)        107.0    107.6    107.6    109.1    109.1    108.3    108.3
C2H4(C–H)       108.1    108.3    108.3    109.1    109.1    108.6    108.6
CH4(CH)         108.6    108.8    108.8    109.5    109.5    109.1    109.1
N2(N–N)         109.8    108.9    108.9    110.2    110.2    109.9    109.9
CH2O(O–H)       109.9    110.7    110.7    111.7    111.7    111.0    111.0
N2+(N–N)        111.6    110.1    110.1    111.4    111.4    111.2    111.2
O2+(O–O)        111.6    109.8    109.8    112.1    112.1    112.0    112.0
CH(C–H)         112.0    112.4    112.4    113.6    113.6    112.9    112.9
CO(C–O)         112.8    112.2    112.2    113.5    113.5    113.3    113.3
HCN(C–N)        115.3    114.5    114.5    115.7    115.7    115.4    115.4
CO2(C–O)        116.0    115.6    115.6    117.0    117.0    116.8    116.8
C2H2(C–C)       120.3    119.6    119.6    120.6    120.6    120.2    120.2
CH2O(C–O)       120.3    119.5    119.5    120.8    120.8    120.7    120.7
BO(B–O)         120.5    119.9    119.9    121.3    121.3    121.2    121.2
O2(O–O)         120.8    119.2    119.2    121.8    121.8    121.9    121.9
BH(B–H)         123.2    124.0    124.0    125.1    125.1    123.6    123.6
BF(B–F)         126.3    125.9    125.9    127.3    127.3    127.3    127.3
CF(C–F)         127.2    126.7    126.7    128.5    128.5    128.9    128.9
NF(N–F)         131.7    130.3    130.3    132.7    132.7    133.3    133.3
F2+(F–F)        132.2    127.2    127.3    131.6    131.6    131.7    131.7
C2H4(C–C)       133.4    132.2    132.2    133.2    133.2    133.0    133.0
F2(F–F)         141.2    137.5    137.6    141.4    141.4    141.6    141.6
HOF(O–F)        143.5    140.5    140.6    144.5    144.5    145.0    145.0
B2(B–B)         159.0    161.3    161.3    161.8    161.8    161.9    161.9

Statistic     PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
MD               −0.5     −0.6      1.0      1.0      0.7      0.7
MAD               0.9      0.9      1.0      1.0      0.8      0.8
RMSD              1.4      1.4      1.2      1.2      1.0      1.0
SD                7.8      7.7      4.0      3.9      3.9      3.9
Var               1.8      1.7      0.5      0.5      0.5      0.5
Max               2.3      2.3      2.8      2.8      2.9      2.9
Min              −5.0     −4.9     −0.6     −0.6     −0.5     −0.5
AMax              5.0      4.9      2.8      2.8      2.9      2.9

Table A26: The LMGB35 benchmark set consists of systems from the first and second rows of the periodic table. All bond lengths are given in pm. Reference distances are taken from Ref. [26]. We abbreviate D3(BJ)-ATM by D3 and D4(TB)-ATM by D4.

system(bond)     Ref.  PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
H2(H–H)          74.1     74.5     74.4     75.0     74.2     74.3     74.2
HF(H–F)          91.7     91.8     91.7     93.0     92.9     92.9     92.9
H2O(HO)          95.7     95.7     95.7     96.9     96.7     96.7     96.7
HOF(OH)          96.6     96.6     96.6     97.9     97.7     97.7     97.7
OH(O–H)          97.0     97.0     97.0     98.3     98.1     98.2     98.1
NH3(N–H)        101.2    101.1    101.1    102.1    101.9    101.9    101.9
OH+(O–H)        102.9    103.2    103.2    104.7    103.9    104.0    103.9
NH(N–H)         103.6    103.7    103.7    105.0    104.4    104.4    104.4
C2H2(C–H)       106.2    106.4    106.4    107.0    106.5    106.5    106.5
NO+(N–O)        106.3    105.3    105.3    106.9    106.6    106.6    106.6
HCN(H–C)        106.5    106.8    106.8    107.5    107.0    107.0    107.0
NH+(N–H)        107.0    107.6    107.6    109.1    108.3    108.3    108.3
C2H4(C–H)       108.1    108.3    108.3    109.1    108.6    108.6    108.6
CH4(CH)         108.6    108.8    108.8    109.5    109.1    109.1    109.1
N2(N–N)         109.8    108.9    108.9    110.2    109.9    109.9    109.9
CH2O(O–H)       109.9    110.7    110.7    111.7    111.0    111.0    111.0
N2+(N–N)        111.6    110.1    110.1    111.4    111.2    111.2    111.2
O2+(O–O)        111.6    109.8    109.8    112.1    112.0    112.0    112.0
CH(C–H)         112.0    112.4    112.4    113.6    112.9    112.9    112.9
CO(C–O)         112.8    112.2    112.2    113.5    113.3    113.3    113.3
HCN(C–N)        115.3    114.5    114.5    115.7    115.4    115.4    115.4
CO2(C–O)        116.0    115.6    115.6    117.0    116.8    116.8    116.8
C2H2(C–C)       120.3    119.6    119.6    120.6    120.2    120.2    120.2
CH2O(C–O)       120.3    119.5    119.5    120.8    120.7    120.7    120.7
BO(B–O)         120.5    119.9    119.9    121.3    121.2    121.2    121.2
O2(O–O)         120.8    119.2    119.2    121.8    121.9    121.9    121.9
BH(B–H)         123.2    124.0    124.0    125.1    123.6    123.6    123.6
BF(B–F)         126.3    125.9    125.9    127.3    127.3    127.3    127.3
CF(C–F)         127.2    126.7    126.7    128.5    128.9    128.9    128.9
NF(N–F)         131.7    130.3    130.3    132.7    133.3    133.3    133.3
F2+(F–F)        132.2    127.2    127.3    131.6    131.7    131.7    131.7
C2H4(C–C)       133.4    132.2    132.2    133.2    133.0    133.0    133.0
F2(F–F)         141.2    137.5    137.6    141.4    141.6    141.6    141.6
HOF(O–F)        143.5    140.5    140.6    144.5    145.0    145.0    145.0
B2(B–B)         159.0    161.3    161.3    161.8    161.9    161.9    161.9

Statistic     PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
MD               −0.5     −0.6      1.0      0.7      0.7      0.7
MAD               0.9      0.9      1.0      0.8      0.8      0.8
RMSD              1.4      1.4      1.2      1.0      1.0      1.0
SD                7.8      7.7      4.0      3.9      3.9      3.9
Var               1.8      1.7      0.5      0.5      0.5      0.5
Max               2.3      2.3      2.8      2.9      2.9      2.9
Min              −5.0     −4.9     −0.6     −0.5     −0.5     −0.5
AMax              5.0      4.9      2.8      2.9      2.9      2.9

Q. Statistical evaluation: HMGB11

Table A27: Experimental reference bond distances for 11 molecules from Ref. [27] containing third-row or higher main group elements. All distances are given in pm. We abbreviate D3(BJ)-ATM by D3 and D4-ATM by D4.

system(bond)           Ref.  PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
Cl2(Cl–Cl)            198.8    197.9    197.9    200.4    200.5    200.8    200.8
S2H2(S–S)             205.5    204.4    204.4    206.3    206.3    206.4    206.4
P2(CH3)4(P–P)         221.2    219.2    219.2    221.3    221.4    220.9    221.0
Br2(Br–Br)            228.1    227.8    227.8    230.8    230.8    230.5    230.5
Se2H2(Se–Se)          234.6    232.4    232.4    235.0    235.0    234.4    234.4
Ge2H6(Ge–Ge)          241.0    242.1    242.1    243.3    243.3    242.5    242.5
As2(CH3)4(As–As)      242.9    243.8    243.8    247.3    247.2    246.1    246.1
Te2(CH3)2(Te–Te)      268.6    267.3    267.3    269.7    269.6    268.5    269.1
Sn2(CH3)6(Sn–Sn)      277.6    277.7    278.0    279.6    280.1    278.4    278.9
Sb2(CH3)4(Sb–Sb)      281.8    282.5    282.6    286.2    286.1    284.9    285.0
Pb2(CH3)6(Pb–Pb)      288.0    287.1    287.2    292.0    291.9    289.8    290.3

Statistic           PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
MD                     −0.5     −0.5      2.2      2.2      1.4      1.5
MAD                     1.0      1.1      2.2      2.2      1.5      1.6
RMSD                    1.2      1.2      2.6      2.6      1.8      1.9
SD                      3.6      3.6      4.9      4.8      4.0      3.8
Var                     1.3      1.3      2.4      2.3      1.6      1.5
Max                     1.1      1.1      4.4      4.3      3.2      3.2
Min                    −2.2     −2.2      0.1      0.2     −0.3     −0.2
AMax                    2.2      2.2      4.4      4.3      3.2      3.2

Table A28: Experimental reference bond distances for 11 molecules from Ref. [27] containing third-row or higher main group elements. All distances are given in pm. We abbreviate D3(BJ)-ATM by D3 and D4(TB)-ATM by D4.

system(bond)           Ref.  PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
Cl2(Cl–Cl)            198.8    197.9    197.9    200.4    200.5    200.8    200.8
S2H2(S–S)             205.5    204.4    204.4    206.3    206.4    206.4    206.4
P2(CH3)4(P–P)         221.2    219.2    219.2    221.3    221.7    220.9    221.3
Br2(Br–Br)            228.1    227.8    227.8    230.8    230.8    230.5    230.6
Se2H2(Se–Se)          234.6    232.4    232.4    235.0    235.0    234.4    234.5
Ge2H6(Ge–Ge)          241.0    242.1    242.1    243.3    243.3    242.5    242.6
As2(CH3)4(As–As)      242.9    243.8    243.8    247.3    247.4    246.1    246.3
Te2(CH3)2(Te–Te)      268.6    267.3    267.3    269.7    269.8    268.5    269.2
Sn2(CH3)6(Sn–Sn)      277.6    277.7    278.0    279.6    280.7    278.4    279.3
Sb2(CH3)4(Sb–Sb)      281.8    282.5    282.6    286.2    286.8    284.9    285.5
Pb2(CH3)6(Pb–Pb)      288.0    287.1    287.2    292.0    292.7    289.8    291.0

Statistic           PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
MD                     −0.5     −0.5      2.2      2.4      1.4      1.8
MAD                     1.0      1.1      2.2      2.4      1.5      1.8
RMSD                    1.2      1.2      2.6      2.9      1.8      2.1
SD                      3.6      3.6      4.9      5.3      4.0      4.1
Var                     1.3      1.3      2.4      2.9      1.6      1.7
Max                     1.1      1.1      4.4      5.0      3.2      3.7
Min                    −2.2     −2.2      0.1      0.4     −0.3     −0.1
AMax                    2.2      2.2      4.4      5.0      3.2      3.7
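As an illustration of how the deviation measures in these tables can be read, the following minimal Python sketch recomputes MD, MAD, RMSD, Max, Min, and AMax for the PBE0-D3 column of the HMGB11 set from the tabulated values. The sign convention (calculated minus reference) and the function name `error_statistics` are assumptions made for this sketch; the convention is inferred from the tabulated numbers rather than stated in the text. SD and Var are omitted because their exact definitions are not given here.

```python
import math

# Reference and PBE0-D3 bond lengths in pm, copied from the HMGB11 table
# (order: Cl2, S2H2, P2(CH3)4, Br2, Se2H2, Ge2H6, As2(CH3)4, Te2(CH3)2,
#  Sn2(CH3)6, Sb2(CH3)4, Pb2(CH3)6).
ref = [198.8, 205.5, 221.2, 228.1, 234.6, 241.0, 242.9, 268.6, 277.6, 281.8, 288.0]
calc = [197.9, 204.4, 219.2, 227.8, 232.4, 242.1, 243.8, 267.3, 277.7, 282.5, 287.1]


def error_statistics(calc, ref):
    """Return deviation measures; deviation = calc - ref (assumed convention)."""
    dev = [c - r for c, r in zip(calc, ref)]
    n = len(dev)
    return {
        "MD": sum(dev) / n,                             # mean deviation
        "MAD": sum(abs(d) for d in dev) / n,            # mean absolute deviation
        "RMSD": math.sqrt(sum(d * d for d in dev) / n), # root-mean-square deviation
        "Max": max(dev),                                # largest signed deviation
        "Min": min(dev),                                # smallest signed deviation
        "AMax": max(abs(d) for d in dev),               # largest absolute deviation
    }


stats = error_statistics(calc, ref)
for name, value in stats.items():
    print(f"{name:5s} {value:6.1f}")
```

Rounded to one decimal place, these values agree with the PBE0-D3 entries of the statistics block (MD −0.5, MAD 1.0, RMSD 1.2, Max 1.1, Min −2.2, AMax 2.2).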

R. Statistical evaluation: TMC32

Table A29: Diverse set of 32 metal complexes from the first transition row, for which precise gas-phase geometries are known from electron diffraction or microwave spectroscopy. Compilation by Bühl et al. [28]. We abbreviate D3(BJ)-ATM by D3 and D4-ATM by D4.

system(bond)                  Ref.  PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
Sc(acac)3(Sc–O)              207.6    209.0    208.8    210.4    210.3    210.0    209.8
TiCl4(Ti–Cl)                 216.9    216.4    216.4    218.3    218.2    218.3    218.3
Ti(CH3)Cl3(Ti–C)             204.7    202.3    202.3    204.5    204.7    205.3    205.3
Ti(CH3)Cl3(Ti–Cl)            218.5    218.0    217.9    219.3    219.2    219.5    219.4
Ti(CH3)2Cl2(Ti–C)            205.8    203.5    203.6    205.5    205.5    206.3    206.3
Ti(CH3)2Cl2(Ti–Cl)           219.6    219.6    219.6    220.5    220.4    220.7    220.6
Ti(BD4)3(Ti–B)               217.5    214.3    214.3    214.7    214.6    214.9    214.8
Ti(BD4)3(Ti–D^br)            198.4    193.4    193.5    193.7    193.7    193.2    193.1
VOF3(V=O)                    157.0    154.1    154.1    157.5    157.5    157.6    157.5
VOF3(V–F)                    172.9    171.6    171.6    173.7    173.7    173.4    173.4
VF5(V–F^ax)                  173.4    173.5    173.5    176.4    176.4    175.9    175.9
VF5(V–F^eq)                  170.8    169.9    169.9    172.7    172.6    172.3    172.3
VOCl3(V=O)                   157.3    153.9    153.9    157.2    157.2    157.3    157.3
VOCl3(V–Cl)                  213.8    213.0    213.0    215.0    214.9    214.9    214.9
V(N(CH3)2)4(V–N)             187.9    186.1    186.1    188.2    188.1    188.1    187.9
V(Cp)(CO)4(V–C^CO)           196.3    192.3    192.3    192.7    192.7    194.4    194.3
CrO2F2(Cr=O)                 157.4    153.7    153.6    157.1    157.1    157.0    157.0
CrO2F2(Cr–F)                 171.9    170.2    170.2    172.4    172.3    172.0    172.0
CrO2Cl2(Cr=O)                157.7    153.8    153.8    157.2    157.2    157.2    157.1
CrO2Cl2(Cr–Cl)               212.2    210.6    210.6    212.4    212.4    212.4    212.3
CrO2(NO3)2(Cr=O)             158.4    153.8    153.8    157.4    157.4    157.4    157.4
CrO2(NO3)2(Cr–O)             195.4    191.0    191.0    193.2    193.2    193.0    192.9
Cr(C6H6)2(Cr–C)              215.0    213.0    212.9    213.8    213.7    213.6    213.5
Cr(C6H6)(CO)3(Cr–C^Ar)       220.8    219.1    219.2    221.0    221.1    220.2    220.2
Cr(C6H6)(CO)3(Cr–C^CO)       186.3    183.8    183.8    183.9    183.9    185.2    185.0
Cr(NO)4(Cr–N)                175.0    171.5    171.5    174.1    174.1    174.1    174.0
MnO3F(Mn=O)                  158.6    154.2    154.2    159.2    157.7    157.6    157.6
MnO3F(Mn–F)                  172.4    170.1    170.1    171.0    172.0    171.6    171.6
MnCp(CO)3(Mn–C^Cp)           214.7    213.7    213.7    215.2    215.1    214.3    214.2
MnCp(CO)3(Mn–C^CO)           180.6    178.3    178.3    178.4    178.3    179.4    179.3
Fe(CO)5(Fe–C)^mean           182.9    179.7    179.7    180.2    180.1    181.0    180.9
Fe(CO)3(tmm)(Fe–C^CO)        181.0    177.9    177.9    178.0    178.0    178.8    178.8
Fe(CO)3(tmm)(Fe–C^cent)      193.8    192.2    192.2    194.6    194.6    194.3    194.3
Fe(CO)3(tmm)(Fe–C^CH2)       212.3    209.8    209.9    213.1    213.2    212.0    212.2
Fe(CO)2(NO)2(Fe–C)^mean      187.2    180.8    180.8    181.5    181.3    182.3    182.2
Fe(CO)2(NO)2(Fe–N)           167.4    164.1    164.1    167.1    167.0    166.9    166.9
FeCp2(Fe–C)                  206.4    204.2    204.2    204.0    204.1    203.7    203.5
Fe(C2H4)(CO)4(Fe–C^et)       211.7    209.7    216.2    213.0    216.7    212.3    216.7
Fe(C2H4)(CO)4(Fe–C^ax)       181.5    180.0    180.1    179.9    179.9    180.7    180.7
Fe(C2H4)(CO)4(Fe–C^eq)       180.6    178.1    177.8    178.9    178.7    179.6    179.3
Fe(C5(CH3)5)(P5)(Fe–P)       237.7    235.7    235.7    236.3    236.1    234.9    234.7
CoH(CO)4(Co–C^eq)            181.8    178.4    178.4    179.0    179.0    179.5    179.5
Co(CO)3(NO)(Co–N)            165.8    162.9    162.9    166.1    166.1    165.8    165.8
Co(CO)3(NO)(Co–C)            183.0    180.1    180.1    180.5    180.4    181.1    181.0
Ni(CO)4(Ni–C)                182.5    182.1    182.0    182.3    182.2    182.7    182.6
Ni(acac)2(Ni–O)              187.6    185.1    184.3    185.8    185.3    185.4    184.6
Ni(PF3)4(Ni–P)               209.9    209.0    208.8    210.6    210.3    210.0    209.5
CuCH3(Cu–C)                  188.4    190.1    190.1    189.3    189.4    189.6    189.6
CuCN(Cu–C)                   183.2    184.3    184.3    182.0    182.0    182.3    182.3
Cu(acac)2(Cu–O)              191.4    191.9    191.8    194.3    194.2    193.2    193.1

Statistic                  PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
MD                            −2.1     −2.0     −0.6     −0.5     −0.6     −0.5
MAD                            2.3      2.4      1.5      1.6      1.3      1.4
RMSD                           2.7      2.8      1.9      2.1      1.7      1.9
SD                            11.6     13.3     13.0     14.0     11.3     12.8
Var                            2.7      3.6      3.5      4.0      2.6      3.3
Max                            1.7      4.5      3.0      5.0      2.5      5.0
Min                           −6.4     −6.4     −5.7     −5.9     −5.2     −5.3
AMax                           6.4      6.4      5.7      5.9      5.2      5.3

Table A30: Diverse set of 32 metal complexes from the first transition row, for which precise gas-phase geometries are known from electron diffraction or microwave spectroscopy. Compilation by Bühl et al. [28]. We abbreviate D3(BJ)-ATM by D3 and D4(TB)-ATM by D4.

system(bond)                  Ref.  PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
Sc(acac)3(Sc–O)              207.6    209.0    208.9    210.4    210.3    210.0    209.8
TiCl4(Ti–Cl)                 216.9    216.4    216.4    218.3    218.2    218.3    218.3
Ti(CH3)Cl3(Ti–C)             204.7    202.3    202.3    204.5    204.5    205.3    205.4
Ti(CH3)Cl3(Ti–Cl)            218.5    218.0    217.9    219.3    219.3    219.5    219.5
Ti(CH3)2Cl2(Ti–C)            205.8    203.5    203.6    205.5    205.5    206.3    206.4
Ti(CH3)2Cl2(Ti–Cl)           219.6    219.6    219.5    220.5    220.5    220.7    220.7
Ti(BD4)3(Ti–B)               217.5    214.3    214.3    214.7    214.6    214.9    214.8
Ti(BD4)3(Ti–D^br)            198.4    193.4    193.4    193.7    193.7    193.2    193.1
VOF3(V=O)                    157.0    154.1    154.1    157.5    157.5    157.6    157.5
VOF3(V–F)                    172.9    171.6    171.6    173.7    173.7    173.4    173.4
VF5(V–F^ax)                  173.4    173.5    173.5    176.4    176.4    175.9    175.9
VF5(V–F^eq)                  170.8    169.9    169.8    172.7    172.6    172.3    172.4
VOCl3(V=O)                   157.3    153.9    153.9    157.2    157.2    157.3    157.3
VOCl3(V–Cl)                  213.8    213.0    213.0    215.0    214.9    214.9    214.9
V(N(CH3)2)4(V–N)             187.9    186.1    186.2    188.2    188.1    188.1    188.0
V(Cp)(CO)4(V–C^CO)           196.3    192.3    192.3    192.7    192.7    194.4    194.3
CrO2F2(Cr=O)                 157.4    153.7    153.7    157.1    157.1    157.0    157.0
CrO2F2(Cr–F)                 171.9    170.2    170.2    172.4    172.4    172.0    172.0
CrO2Cl2(Cr=O)                157.7    153.8    153.8    157.2    157.2    157.2    157.2
CrO2Cl2(Cr–Cl)               212.2    210.6    210.6    212.4    212.4    212.4    212.4
CrO2(NO3)2(Cr=O)             158.4    153.8    153.8    157.4    157.4    157.4    157.4
CrO2(NO3)2(Cr–O)             195.4    191.0    191.0    193.2    193.2    193.0    192.9
Cr(C6H6)2(Cr–C)              215.0    213.0    213.0    213.8    213.7    213.6    213.6
Cr(C6H6)(CO)3(Cr–C^Ar)       220.8    219.1    219.2    221.0    221.1    220.2    220.3
Cr(C6H6)(CO)3(Cr–C^CO)       186.3    183.8    183.8    183.9    183.9    185.2    185.1
Cr(NO)4(Cr–N)                175.0    171.5    171.5    174.1    174.1    174.1    174.1
MnO3F(Mn=O)                  158.6    154.2    154.2    159.2    157.7    157.6    157.6
MnO3F(Mn–F)                  172.4    170.1    170.1    171.0    172.0    171.6    171.6
MnCp(CO)3(Mn–C^Cp)           214.7    213.7    213.7    215.2    215.1    214.3    214.3
MnCp(CO)3(Mn–C^CO)           180.6    178.3    178.3    178.4    178.4    179.4    179.4
Fe(CO)5(Fe–C)^mean           182.9    179.7    179.7    180.2    180.2    181.0    181.0
Fe(CO)3(tmm)(Fe–C^CO)        181.0    177.9    177.9    178.0    178.0    178.8    178.8
Fe(CO)3(tmm)(Fe–C^cent)      193.8    192.2    192.2    194.6    194.7    194.3    194.4
Fe(CO)3(tmm)(Fe–C^CH2)       212.3    209.8    209.8    213.1    213.1    212.0    212.2
Fe(CO)2(NO)2(Fe–C)^mean      187.2    180.8    180.7    181.5    181.4    182.3    182.3
Fe(CO)2(NO)2(Fe–N)           167.4    164.1    164.2    167.1    167.1    166.9    166.9
FeCp2(Fe–C)                  206.4    204.2    204.2    204.0    204.1    203.7    203.6
Fe(C2H4)(CO)4(Fe–C^et)       211.7    209.7    209.9    213.0    213.1    212.3    212.6
Fe(C2H4)(CO)4(Fe–C^ax)       181.5    180.0    180.0    179.9    179.9    180.7    180.7
Fe(C2H4)(CO)4(Fe–C^eq)       180.6    178.1    178.1    178.9    178.8    179.6    179.6
Fe(C5(CH3)5)(P5)(Fe–P)       237.7    235.7    244.7    236.3    236.2    234.9    234.8
CoH(CO)4(Co–C^eq)            181.8    178.4    178.4    179.0    179.0    179.5    179.5
Co(CO)3(NO)(Co–N)            165.8    162.9    162.9    166.1    166.1    165.8    165.8
Co(CO)3(NO)(Co–C)            183.0    180.1    180.1    180.5    180.5    181.1    181.1
Ni(CO)4(Ni–C)                182.5    182.1    182.0    182.3    182.3    182.7    182.7
Ni(acac)2(Ni–O)              187.6    185.1    184.3    185.8    185.3    185.4    184.6
Ni(PF3)4(Ni–P)               209.9    209.0    208.4    210.6    209.8    210.0    209.1
CuCH3(Cu–C)                  188.4    190.1    190.1    189.3    189.4    189.6    189.6
CuCN(Cu–C)                   183.2    184.3    184.3    182.0    182.0    182.3    182.3
Cu(acac)2(Cu–O)              191.4    191.9    191.8    194.3    194.2    193.2    193.1

Statistic                  PBE0-D3  PBE0-D4   PBE-D3   PBE-D4  TPSS-D3  TPSS-D4
MD                            −2.1     −2.0     −0.6     −0.6     −0.6     −0.6
MAD                            2.3      2.5      1.5      1.5      1.3      1.3
RMSD                           2.7      2.9      1.9      1.9      1.7      1.7
SD                            11.6     14.7     13.0     12.9     11.3     11.5
Var                            2.7      4.4      3.5      3.4      2.6      2.7
Max                            1.7      7.0      3.0      3.0      2.5      2.5
Min                           −6.4     −6.5     −5.7     −5.8     −5.2     −5.3
AMax                           6.4      7.0      5.7      5.8      5.2      5.3

References

[1] S. A. Ghasemi, A. Hofstetter, S. Saha, S. Goedecker, Phys. Rev. B 2015, 92, 045131.
[2] S. Grimme, J. Antony, S. Ehrlich, H. Krieg, J. Chem. Phys. 2010, 132, 154104.
[3] P. Pyykkö, M. Atsumi, Chem.-Eur. J. 2009, 15, 186–197.
[4] A. Tkatchenko, A. Ambrosetti, R. A. DiStasio Jr., J. Chem. Phys. 2013, 138, 074106.
[5] A. Ambrosetti, A. M. Reilly, R. A. DiStasio Jr., A. Tkatchenko, J. Chem. Phys. 2014, 140, 18A508.
[6] T. Gould, S. Lebègue, J. G. Ángyán, T. Bučko, J. Chem. Theory Comput. 2016, 12, 5920–5930.
[7] S. Grimme, A. Hansen, J. G. Brandenburg, C. Bannwarth, Chem. Rev. 2016, 116, 5105–5154.
[8] M. Gaus, Q. Cui, M. Elstner, J. Chem. Theory Comput. 2011, 7, 931–948.
[9] S. Grimme, J. Chem. Phys. 2006, 124, 034108.
[10] T. Schwabe, S. Grimme, Phys. Chem. Chem. Phys. 2006, 38, 4398.
[11] L. Goerigk, S. Grimme, J. Chem. Theory Comput. 2011, 7, 291.
[12] S. Kozuch, D. Gruzman, J. M. L. Martin, J. Phys. Chem. C 2010, 114, 20801.
[13] S. Kozuch, J. M. L. Martin, J. Comput. Chem. 2013, 34, 2327.
[14] J.-D. Chai, S.-P. Mao, Chem. Phys. Lett. 2012, 538, 121.
[15] É. Brémond, C. Adamo, J. Chem. Phys. 2011, 135, 024106.
[16] B. Brauer, M. K. Kesharwani, S. Kozuch, J. M. L. Martin, J. Chem. Theory Comput. 2016, 18, 20905–20925.
[17] L. Gráfová, M. Pitoňák, J. Řezáč, P. Hobza, J. Chem. Theory Comput. 2010, 6, 2365–2376.
[18] D. E. Taylor, J. G. Ángyán, G. Galli, C. Zhang, F. Gygi, K. Hirao, J. W. Song, K. Rahul, O. A. von Lilienfeld, R. Podeszwa, J. Chem. Phys. 2016, 145, 124105.
[19] R. Sure, S. Grimme, J. Chem. Theory Comput. 2015, 11, 3785–3801.
[20] J. G. Brandenburg, C. Bannwarth, A. Hansen, S. Grimme, J. Chem. Phys. 2018, 148, 064104.
[21] H. Kruse, A. Mladek, K. Gkionis, A. Hansen, S. Grimme, J. Šponer, J. Chem. Theory Comput. 2015, 11, 4972–4991.
[22] R. Sedlak, T. Janowski, M. Pitoňák, J. Řezáč, P. Pulay, P. Hobza, J. Chem. Theory Comput. 2013, 9, 3364–3374.
[23] S. Dohm, A. Hansen, M. Steinmetz, S. Grimme, M. P. Checinski, J. Chem. Theory Comput. 2018, 14, 2596–2608.
[24] L. Goerigk, A. Hansen, C. Bauer, S. Ehrlich, A. Najibi, S. Grimme, Phys. Chem. Chem. Phys. 2017, 19, 32184–32215.
[25] T. Risthaus, M. Steinmetz, S. Grimme, J. Comput. Chem. 2014, 35, 1509–1516.
[26] G. Herzberg, Molecular Spectra and Molecular Structure 1979, 158–169.
[27] P. Pyykkö, M. Atsumi, Chem.-Eur. J. 2009, 15, 186–197.
[28] M. Bühl, H. Kabrede, J. Chem. Theory Comput. 2006, 2, 1282–1290.
