Graph Invariants – A Tool to Analyze Hydrogen Bonding in Ice and Clusters

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the

Graduate School of The Ohio State University

By

Jer-Lai Kuo, B.S., M.S.

* * * * *

The Ohio State University

2003

Dissertation Committee:

Prof. Sherwin J. Singer, Adviser
Prof. Anne B. McCoy
Prof. James V. Coe
Prof. Eric Herbst

Approved by

Adviser
Chemical Physics Program

ABSTRACT

We have studied a wide range of aqueous systems, from order/disorder in the hydrogen bond network of ordinary ice, a fundamental problem in ice physics for over 70 years, to the properties of water clusters. A common theoretical difficulty in these systems is the enormous number of hydrogen bond arrangements that must be considered, for which no systematic treatment has been available. Our analytical method, based on graph theory and the introduction of graph invariants, provides a means for efficient and reliable analysis. Recent investigations of water clusters have demonstrated that graph invariants provide a powerful tool for capturing very complex behavior using only a small number of parameters, and some of our findings have uncovered previously unknown aspects of the behavior of water clusters. Applications of this analytic method provide insight into the nature of hydrogen bond disorder in ice, and impact our understanding of chemical reactions in water clusters important to environmental chemistry.

To My Parents

ACKNOWLEDGMENTS

I would like to thank my advisor, Prof. Sherwin J. Singer, for his guidance and support throughout my graduate study. I also wish to thank Prof. Anne B. McCoy, Prof. James V. Coe and Prof. Eric Herbst for kindly serving on my dissertation examination committee.

I owe a debt of thanks to Dr. Cristian V. Ciobanu, Prof. Lars P. Ojamäe and Prof. Isaiah Shavitt, all of whom have contributed in some way to this thesis. I would like to acknowledge the Graduate School of The Ohio State University for the award of a Presidential Fellowship.

All my friends at Ohio State made my life in Columbus wonderful and memorable. My deepest thanks go to my family, who have supported me throughout my life with their love and encouragement.

VITA

January 12, 1973 ...... Born - Kinmen, Taiwan

1991-1995 ...... B.S. Department of Physics, National Taiwan University, Taiwan

1995-1997 ...... M.S. Department of Physics, National Taiwan University, Taiwan

1997-2001 ...... Graduate Teaching and Research Associate, The Ohio State University

2001-present ...... Presidential Fellowship, The Ohio State University

PUBLICATIONS

Research Publications

Jer-Lai Kuo, James V. Coe, Sherwin Singer, Yehuda B. Band and Lars Ojamäe, “On the Use of Graph Invariants for Efficiently Generating Hydrogen Bond Topologies and Predicting Physical Properties of Water Clusters and Ice”. J. Chem. Phys. 114, 2527-2540 (2001)

FIELDS OF STUDY

Major Field: Chemical Physics

TABLE OF CONTENTS

Abstract

Dedication

Acknowledgments

Vita

List of Tables

List of Figures

Chapters:

1. Introduction

2. The Use of Graph Invariants for Efficiently Generating Hydrogen Bond Topologies and Predicting Physical Properties of Water Clusters

   2.1 Graph invariants
       2.1.1 Generation of graph invariants
       2.1.2 Physical interpretation of graph invariants
   2.2 Graph invariants as a tool for enumerating H-bond topologies
       2.2.1 Sorting strategy
       2.2.2 Performance of the sorting algorithm for realistic calculations
   2.3 Correlation and prediction of physical properties from H-bond topology using graphical invariants
       2.3.1 (H$_2$O)$_6$ and (H$_2$O)$_{20}$
       2.3.2 Using invariants to calculate phase transitions: (H$_2$O)$_{20}$ dodecahedral as a dry run
   2.4 Discussion
   2.5 Appendix: Necessary and sufficient condition for a graphical invariant to be identically zero

3. Graph Invariants for Periodic Systems: Predicting Physical Properties from the Hydrogen Bond Topology of Ice

   3.1 A gentle introduction to oriented graphs and graph invariants
   3.2 Graph invariants for periodic systems
       3.2.1 Graph invariants and space groups
       3.2.2 Invariants for arbitrary unit cell choice
       3.2.3 An illustration for square ice
   3.3 Graph invariants and graph enumeration for ice-Ih
       3.3.1 Invariants for the 8-water orthorhombic unit cell
       3.3.2 Enumeration of H-bond arrangements in ice-Ih
       3.3.3 Analysis of enumeration results
   3.4 Conclusion
   3.5 Appendix: Graph invariants of the Orth cell

4. Effects of H-Bond Topology on Energetics, Structure and Chemistry of Water Clusters

   4.1 Cluster stability and H-bond topology
   4.2 Self-dissociation and zwitterionic structures
   4.3 Short H-bonds
   4.4 Discussion

Bibliography

LIST OF TABLES

2.1 Group of vertex permutations for the triangle graph shown in Fig. 2.1, and the induced group of signed permutations on the bonds of the triangle graph. Each vertex permutation is denoted by the images of vertices 1, 2 and 3. It is also common to indicate permutations in terms of independent cycles. Our notation for signed bond permutations follows that for vertices.

2.2 First and second order invariants for the (H$_2$O)$_6$ cage structure shown in Fig. 2.2. The invariants are calculated using a permutation symmetry group on the vertices isomorphic to the point group of the cage. In the text we refer to the invariants by any of the bond products that generate the invariant by application of a projection operator.

3.1 Value of the bond variables and graph invariants associated with each of the graphs depicted in Fig. 3.2.

LIST OF FIGURES

1.1 Examples of the H-bond network. (a) The simplest H-bonded system. (b) An example of how the H-bond network can be summarized by an oriented graph. By convention, H-bonds point from the H-bond donor to the H-bond acceptor. An H-bond arrangement of a protonated water cluster is shown on the left, and the directions of the H-bonds are summarized by the oriented graph on the right.

1.2 Three H-bond isomers of dodecahedral (H$_2$O)$_{20}$. They have similar oxygen positions and differ only in the direction of H-bonds. Among the 3,600,000 H-bond isomers, many are related by a symmetry operation, and there exist 30026 symmetry-unique representatives.

1.3 Trans and cis configurations in the water dimer. From a purely electrostatic point of view, the trans configuration is energetically more stable than cis because of the reduced repulsion between the non-H-bonded hydrogen atoms.

2.1 A simple example of an oriented graph, which might represent the configuration of a (H$_2$O)$_3$ cluster, is shown on the left. The directions shown on the edges indicate the orientation of the edges if all the bond variables $b_r$ were taken equal to $+1$, canonical orientations chosen arbitrarily for each bond. Two different H-bond topologies for (H$_2$O)$_3$ are shown on the right, along with the values of the bond variables, as referenced to the canonical orientations of the graph on the left.

2.2 The cage structure of (H$_2$O)$_6$. One of the 27 possible symmetry-distinct H-bonding arrangements for the cage structure is shown. The arrows and bond labels indicate the directions of the bonds when the bond variables are equal to $+1$.

2.3 Enumeration of all symmetry-distinct H-bond topologies for a dodecahedral (H$_2$O)$_{20}$ was performed by considering a sequence of structures containing fewer bonds than the full dodecahedron. Additional H-bonds were added to the structures after all symmetry-related duplicates were eliminated. This process furnishes data on the computational cost of eliminating symmetry-related structures as a function of the number of graphs. The data shown is for the calculation as performed in Ref. [1], without the use of the sorting method introduced in this work. The computational cost per graph edge is plotted as a function of the number of graphs $N$ before symmetry comparisons were made. Least squares fits of CPU time to $N$, $N^2$ and $N^3$ clearly show that the computational cost scales as $N^2$ or worse without the sorting method.

2.4 Data for the same calculation as in Fig. 2.3, this time employing the sorting method introduced in this work. CPU time per graph edge for sorting the graphs is plotted against the number of graphs $N$ in the bottom panel. The total CPU time for symmetry comparisons within groups limited to a size of 500 is shown in the top panel. Least squares fits clearly show that the computational cost scales as either $N$ or $N \log N$ in each case, and definitely not like $N^2$ as in Fig. 2.3. On the basis of arguments presented in section 2.2 we expect $N \log N$ scaling in the bottom panel and linear scaling in the top panel.

2.5 In the upper panel, we test the degree to which the energies of the 27 isomers of the (H$_2$O)$_6$ cage are correlated with H-bond topology, and the effectiveness of graphical invariants in capturing that trend. The $x$-coordinate is the energy of the isomers using the PM3 semi-empirical theory. The $y$-coordinate is the result of a least squares fit to these energies using all 9 linearly independent first and second order graph invariants (filled symbols), or just 4 out of those 9 that proved to be most important (open symbols). If the fit were perfect, all points would lie along the straight line. The lower panel exhibits the fit of the squared dipole moment to 9 linearly independent first and second order invariants.

2.6 This plot evaluates the degree to which the energies of the 30026 isomers of the dodecahedral (H$_2$O)$_{20}$ cage are correlated with H-bond topology, and the effectiveness of graphical invariants in capturing that trend. The $x$-coordinate is the energy of the isomers using the OSS2 empirical potential [2]. The $y$-coordinate is the result of a least squares fit to these energies using the 7 linearly independent second order graph invariants. If the fit were perfect, all points would lie along the straight line. A training set of only 20 randomly selected configurations was used to parameterize the energy as a linear combination of invariants.

2.7 Configurational energy and heat capacity of a model (H$_2$O)$_{20}$ dodecahedral cluster. The “exact” curve in the top panel is calculated from the partition function in Eqs. (2.21-2.22) using the energy of all 30026 isomers of the (H$_2$O)$_{20}$ dodecahedron. The vibrational contribution to the energy, which would only add a linear term to the average energy and a constant to the heat capacity under the assumption of harmonic fluctuations about each local minimum, is not included. The curve labelled “converged fit” is obtained using an arbitrarily large number of isomers in the training set. It represents the best fit possible using only second order invariants. The curves with thin lines in the bottom panel give the results of the invariant fitting procedure, as fully explained in the text. The heat capacity was calculated using only 30 of the 30026 isomers as input for a fit to invariants, after which the energies $E_i$ in Eqs. (2.21-2.22) were calculated from the fit. To portray the variability arising from fitting to randomly selected points, we give results for 9 independent trials.

3.1 A square ice lattice used to illustrate graph invariants. The molecular configuration, shown on the left, is summarized by the directed graph appearing on the right. The H-bond arrangement shown here is adopted as the canonical bond orientation. Other periodic H-bond arrangements are possible, as illustrated in Fig. 3.2.

3.2 Graphs which lead to periodic H-bond patterns satisfying the Bernal-Fowler ice rules in the square ice lattice depicted in Fig. 3.1. In graph A the bonds are arranged in their canonical orientation, the same one shown in Fig. 3.1. The eight bonds associated with the $2 \times 2$ unit cell are numbered according to the scheme indicated on graph A. In some graphs the bonds associated with unit cells neighboring the primary unit cell are shown to make it more apparent how the orientations of complete water molecules are indicated by the graphs. For example, in graph B the periodic image of bond 4 is actually drawn to the left of bond 3. In graph B, bond variables $b_1$, $b_2$, $b_5$, $b_6$, $b_7$ and $b_8$ all have value $+1$, while bonds $b_3$ and $b_4$ have value $-1$, all defined relative to the canonical orientations of graph A.

3.3 Labeling scheme for bonds in the $4 \times 4$ unit cell of “square ice”.

3.4 The labeling of H-bonds, and their canonical orientation, are shown here for the Orth unit cell. In the canonical orientation, all of the H-bonds are cis.

3.5 These two configurations of the Orth cell are symmetry distinct, yet are related to each other by reversal of all H-bonds. The left hand structure is converted into the right hand one by first reflecting through a horizontal plane that bisects the figure midway between the two $a$-$b$ bilayers, followed by reversing all the H-bonds.

3.6 Each unit cell is accompanied by three numbers, which from left to right are the number of second order graph invariants, the number of linearly independent second order graph invariants for graphs that satisfy the ice rules for neutral water, and the number of symmetry-distinct H-bond configurations for that unit cell.

3.7 The top panel shows the distribution of dipole moment magnitude arising from H-bonds along the $c$-direction in a 48-water unit cell (Hex) of ice-Ih. Measured in bond dipoles, the maximum dipole moment is 24, the number of H-bonds along the $c$-axis. The bottom panel shows the distribution of trans H-bonds among the 96 H-bonds of the unit cell.

3.8 Scatter plots (top row) and three-dimensional representations (bottom row) of the distribution of H-bond isomers in ice-Ih resolved according to dipole moment and percent trans H-bonds. The three-dimensional plots best convey where the bulk of the distribution is located, while the scatter plots depict the locus of possible structures, regardless of their frequency. From left to right are the distribution of total dipole moment of the unit cell, dipole moment generated by H-bonds along the $c$-axis, and dipole moment generated by H-bonds lying within the puckered hexagonal sheets parallel to the $a$- and $b$-crystallographic axes. The data was generated for the 48-water Hex unit cell, for which there are 2404144962 isomers satisfying periodic ice rules, of which 8360361 are symmetry-distinct. The dipole moment is reported in units of OH bond dipoles.

3.9 Two examples of small unit cells with complete ferroelectric order along the $c$-axis coexisting with a high percentage of trans bonds: a) a 16-water Orth cell with 62% trans bonds, b) a 12-water Hex cell with 75% trans bonds.

4.1 Isomers of the (H$_2$O)$_{20}$ dodecahedron: a) most stable isomer, b) least stable isomer yet observed, c,d) two isomers with 5 nearest neighbor 2AW’s (discussed below in relation to Fig. 4.2), e,f) isomers which have formed zwitterions.

4.2 Energy of (H$_2$O)$_{20}$ isomers plotted on the left against the number of nearest neighbor dangling hydrogen pairs or 2AW’s. The energies are calculated using the OSS2 empirical model [2, 3] (light gray points), or B3LYP electronic density functional theory [4, 5] with the cc-pvdz basis set [6, 7] (black dots). The B3LYP energy points marked c) and d) correspond to isomers with those labels in Fig. 4.1. From the 30026 symmetry-distinct structures, we selected for electronic density functional theory calculations predominantly those structures that are predicted by the OSS2 model to be either near the top or the bottom of the energy range, with fewer cases in between. On the right, the OSS2 energy is compared with the B3LYP energy (dots). The gray line indicates perfect agreement. When the number of nearest neighbor 2AW’s is large, we encountered isomers in which the H-bond topology changed significantly upon B3LYP optimization, or for which a dangling hydrogen rotated to point toward the interior of the cage. These isomers were excluded from the plot.

4.3 Spontaneous self-dissociation in (H$_2$O)$_{20}$. Structures shown were calculated using B3LYP density functional theory, as described in Fig. 4.2. The structure on the left is from a very flat portion of the potential surface where there is a short H-bond (2.425 Å), outlined in the figure. An H$_2$O self-dissociates in this cluster with no barrier and, following charge migration through the cluster, yields the locally stable zwitterionic structure shown on the right, which lies below the starting energy and contains a hydroxide ion and an excess proton held in an H$_5$O$_2^+$-like unit. At other levels of theory, there is a small barrier to self-dissociation (shown schematically).

4.4 Hydrogen bonds between three-coordinate waters fall into three major classes (top panel) in which (a) a 2AW donates to a 2DW, (b) a 2DW donates to a 2AW, or either (c$_1$) a 2AW donates to a 2AW or (c$_2$) a 2DW donates to a 2DW. Each major class is further broken into minor classes according to the topological index $n$ [Eq. (4.1)]. Examples a$_1$, a$_2$, a$_3$, a$_4$ and a$_5$ (middle panel) illustrate H-bonds of type (a) for which $n = 0, 1, 2, 3$ and $4$, respectively. The normalized bond length distribution accumulated for H-bonds of type (a) is shown in the bottom panel. The bond lengths were obtained from the B3LYP/DFT optimized structures whose energies are given in Fig. 4.2.

CHAPTER 1

INTRODUCTION

The importance of water and the role of hydrogen bonds (H-bonds) cannot be overemphasized. The molecules of life – enzymes, lipids and nucleic acids – are designed to function in the special chemical environment of aqueous solution. Chemical reactions taking place in water clusters in the atmosphere determine the size of the ozone hole, the temperature of our planet, and other critical features of our environment [8]. Because of its importance, water remains one of the most intensely studied materials, and the H-bond has become common textbook knowledge. It is therefore surprising to find that scientists are still puzzled by many of its fascinating properties.

H-bonding in aqueous systems is the tendency of water molecules to arrange themselves so that the OH bond of one water (the donor) points directly to the oxygen atom of another (the acceptor). The H-bond was first identified more than 100 years ago and was symbolized then by a few dots [9], O–H···O, a notation still in use today. Due to its directional nature, the H-bond network is often described by an oriented graph (see figure 1.1).

Modeling water at the molecular level dates back to 1974 [10], but even today detailed and accurate modeling of H-bonding in aqueous systems remains notoriously difficult. Strong cooperative effects suggest that purely pair-wise additive water-water potentials will not produce satisfactory results under many circumstances [11, 12]. Many more sophisticated empirical potentials have been designed to better fit experimental data and/or ab initio calculations. While these empirical potentials may work very well under the conditions for which they were parameterized, their transferability is limited. The development of ab initio molecular simulation methods has undergone exciting advances in the past decade [13] and has had much success in modeling semiconductors and metals. Unfortunately, the strong electron correlation effects in aqueous systems limit the system size amenable to accurate electronic structure calculations, because ab initio and density functional methods with adequate electron correlation (such as MP2 [14] and B3LYP [15]) are computationally very demanding.

Figure 1.1: Examples of the H-bond network. (a) The simplest H-bonded system. (b) An example of how the H-bond network can be summarized by an oriented graph. By convention, H-bonds point from the H-bond donor to the H-bond acceptor. An H-bond arrangement of a protonated water cluster is shown on the left, and the directions of the H-bonds are summarized by the oriented graph on the right.

Aqueous systems are often characterized by an enormous number of H-bond isomers: water clusters that have similar oxygen positions but differ in the direction of their H-bonds. For example, in the dodecahedral (H$_2$O)$_{20}$ cluster (see figure 1.2), there are 3,600,000 different ways to arrange the H-bonds, assuming all water molecules are neutral. The key advance presented here is the invention of an analytical tool called graph invariants that enables us to efficiently generate and analyze the immense number of possible H-bond arrangements. Without this tool, a systematic study of the H-bond isomers is impossible. As a result, the differences among H-bond isomers are often ignored by assuming that the H-bond topology plays a minor role, and only one or a few arbitrary structures are considered. We will show that this common wisdom is not always valid. In particular, we will examine two aqueous systems where the pattern of H-bond topology has a profound influence: H-bond ordering/disordering in ordinary ice (ice-Ih), a fundamental problem in ice physics since the 1930s, and auto-ionization of water clusters, a previously unknown effect in water clusters.
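To make the counting concrete, the following minimal Python sketch enumerates bond orientations by brute force for a much smaller three-coordinate cage, a cube-shaped (H$_2$O)$_8$ cluster, rather than the dodecahedron. It assumes that neutrality means each three-coordinate water donates either one or two H-bonds and accepts the rest. The vertex labelling and the function names are illustrative choices, not notation from this work.

```python
from itertools import product

# Cube graph: vertices are 3-bit labels; edges join labels differing in one bit.
EDGES = [(u, u ^ (1 << k)) for u in range(8) for k in range(3) if u < u ^ (1 << k)]


def is_neutral(orientation):
    """True if every water donates exactly one or two H-bonds.

    orientation[i] is True when edge i points from EDGES[i][0] (donor)
    to EDGES[i][1] (acceptor), and False for the reverse direction.
    """
    donated = [0] * 8
    for (u, v), forward in zip(EDGES, orientation):
        donated[u if forward else v] += 1
    return all(d in (1, 2) for d in donated)


def count_neutral_arrangements():
    """Count all edge orientations of the cube that keep every water neutral."""
    return sum(is_neutral(o) for o in product((True, False), repeat=len(EDGES)))


n_total = 2 ** len(EDGES)            # all orientations, allowed or not
n_neutral = count_neutral_arrangements()
print(n_total, n_neutral)
```

The same brute-force idea applied to the 30 bonds of the dodecahedron would have to visit about $10^9$ orientations before any symmetry reduction, which is why a more systematic approach becomes attractive.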

Figure 1.2: Three H-bond isomers of dodecahedral (H$_2$O)$_{20}$. They have similar oxygen positions and differ only in the direction of H-bonds. Among the 3,600,000 H-bond isomers, many are related by a symmetry operation, and there exist 30026 symmetry-unique representatives.

Unlike most substances, ordinary ice, when cooled to nearly absolute zero, does not reach a single ground state with a specific ordered H-bond arrangement. The residual entropy of ice has attracted enormous attention from physicists and chemists, including three Nobel laureates: Giauque, Pauling and Onsager. That ordinary ice has non-zero residual entropy was realized when Giauque and his co-workers carried out a series of classical thermodynamic measurements in the 1930s [16, 17]. Pauling calculated the residual entropy assuming that different H-bond isomers, analogous to the ones for (H$_2$O)$_{20}$ shown in figure 1.2, were all equally likely [18]. Because of the close agreement between Pauling’s theoretical estimate and the experimental measurements of Giauque and Stout, the ice lattice has for many years been considered a perfectly random arrangement of H-bonds.

However, there is continuing speculation as to whether the H-bond arrangements should be random at all temperatures, or whether there exists a phase transition from an H-bond disordered phase to a fully ordered form, recently termed ice-XI. In 1982, Suga’s group demonstrated that small amounts of KOH appeared to catalyze the rearrangement of H-bonds in ice and that there was indeed a phase transition at 72 K [19]. However, the nature of the ice-Ih/ice-XI phase transition is a matter of current debate and of significant interest to the experimental community [20–22]. One of the theoretical difficulties in solving this mystery is that the number of H-bond topologies allowed in a sample of $N$ water molecules grows roughly as $(3/2)^N$, a result first derived by Pauling and later refined by Onsager and Dupuis [23], DiMarzio and Stillinger [24] and Nagle [25]. Even in a small ice sample of 100 water molecules, $(3/2)^{100} \approx 4.1 \times 10^{17}$ different H-bond arrangements can be found. The analytical method we developed based on graph theory provides a means of describing the properties of this otherwise intractable number of arrangements using only a handful of parameters.
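The size of Pauling's estimate is easy to check numerically. The short Python sketch below evaluates $(3/2)^N$ for a few system sizes, together with the corresponding residual entropy per molecule in units of Boltzmann's constant, $\ln(3/2)$, which follows from treating all arrangements as equally likely. The function names are illustrative only.

```python
import math


def pauling_count(n_waters):
    """Pauling's estimate (3/2)**N of the number of H-bond arrangements."""
    return 1.5 ** n_waters


def residual_entropy_per_molecule():
    """Residual entropy per water in units of k_B: ln[(3/2)**N] / N = ln(3/2)."""
    return math.log(1.5)


for n in (10, 100, 1000):
    print(f"N = {n:5d}: about {pauling_count(n):.3e} arrangements")
print(f"S / (N k_B) = {residual_entropy_per_molecule():.4f}")
```

Even at $N = 100$ the count is far beyond what can be enumerated one structure at a time, which motivates describing the whole set with a small number of parameters.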


Among studies of water clusters, it is common knowledge in the field that each H-bond stabilizes a structure by about 5 kcal/mol. Hence, researchers tend to assume that the arrangement of H-bonds has very little effect on the properties of water clusters if the number of H-bonds is constant. However, a previous study in our group [1] on dodecahedral (H$_2$O)$_{20}$ shows that this belief is far from true. The energies of H-bond isomers of (H$_2$O)$_{20}$ can differ by tens of kcal/mol, even though they all have the same number of H-bonds. We will show that the arrangement of H-bonds affects not only the stability of water clusters but also their structure and chemistry. For example, the length of an H-bond is strongly influenced by the H-bond network surrounding it. Probing deeper, we also find that special arrangements of H-bonds can lead to auto-ionization in (H$_2$O)$_{20}$, a rare event that affects only about two in every billion water molecules in bulk water at any given moment.

The notion that the energetics and other physical properties of water are tied to the topology of the H-bond network is not entirely new in the literature. For example, Bjerrum suggested that H-bonds in ice can be divided into two categories, depending on whether the non-H-bonded hydrogens fall on opposite sides or on the same side of the H-bond [26]. Following Buch et al. [21], we denote them as trans and cis, respectively (see figure 1.3). Based on a qualitative argument (see the caption of figure 1.3), some researchers believed that trans bonds are energetically more stable than cis bonds. The conclusion drawn from this hypothesis would be that ice structures with the highest fraction of trans bonds are most stable. Unfortunately, Bjerrum’s suggestion does not agree with current experiments, which favor a structure with only 25% trans bonds [19, 20].

Figure 1.3: Trans and cis configurations in the water dimer. From a purely electrostatic point of view, the trans configuration is energetically more stable than cis because of the reduced repulsion between the non-H-bonded hydrogen atoms.

Even though Bjerrum’s unrealistically simple cis and trans H-bond model disagrees with experiments, it does illustrate how a feature of the H-bond network can be used to predict physical properties. The number of cis and trans H-bonds is an example of what in this work is called a graph invariant: a property of the H-bond topology that does not change upon subjecting the system to any symmetry operation, such as a rotation or reflection. Physical properties like the energy are invariant to symmetry operations. If these properties are controlled by the H-bond network, they must therefore depend on invariant features of the network, namely the graph invariants.
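As a small illustration of what such an invariant looks like in practice, the Python sketch below treats a three-membered ring of waters as a triangle graph. It assigns each bond a variable equal to $+1$ or $-1$ relative to an arbitrarily chosen canonical orientation, and averages products of bond variables over all vertex permutations of the triangle. The choice of canonical orientation, the use of the full permutation group and every function name are assumptions made for this example only.

```python
from itertools import permutations, product

CANONICAL = [(0, 1), (1, 2), (2, 0)]      # canonical direction of each bond
GROUP = list(permutations(range(3)))      # all vertex permutations of the triangle


def orient(b):
    """Directed edges (donor, acceptor) described by bond variables b."""
    return [(u, v) if s == 1 else (v, u) for (u, v), s in zip(CANONICAL, b)]


def bond_variables(directed_edges):
    """Bond variables (+1/-1) of a directed graph relative to CANONICAL."""
    edges = set(directed_edges)
    return tuple(1 if e in edges else -1 for e in CANONICAL)


def transform(b, g):
    """Bond variables after relabelling every vertex v as g[v]."""
    return bond_variables([(g[u], g[v]) for u, v in orient(b)])


def invariant(b, bonds):
    """Group average of the product of the bond variables listed in `bonds`."""
    total = 0
    for g in GROUP:
        gb = transform(b, g)
        term = 1
        for r in bonds:
            term *= gb[r]
        total += term
    return total / len(GROUP)


cyclic = (1, 1, 1)        # every water donates one H-bond around the ring
non_cyclic = (1, 1, -1)   # one double donor and one double acceptor
for b in (cyclic, non_cyclic):
    print(b, invariant(b, (0,)), invariant(b, (0, 1)))
```

In this toy case the first order average vanishes for every configuration, while the second order average takes different values for the cyclic and non-cyclic arrangements, so it can tell the two topologies apart.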

A more comprehensive way to link physical properties with H-bond topology is to use a complete set of graph invariants, not just the count of cis and trans bonds, and express the dependence of the energy ($E$) on the H-bond topology as a linear combination of graph invariants until convergence is obtained. Symmetry-invariant combinations of oriented graphs are generated using the method of projection operators from the mathematical theory of groups. The result is a hierarchy of graph invariants $I_r, I_{rs}, I_{rst}, \ldots$ where each subscript represents an additional level of complexity. This idea can be summarized by the following equation.

$$E = \sum_{r} \alpha_{r} I_{r} + \sum_{rs} \alpha_{rs} I_{rs} + \sum_{rst} \alpha_{rst} I_{rst} + \cdots \qquad (1.1)$$

In our experience to date, the convergence is rapid. The advantages of graph invariants can be illustrated for the 30026 symmetry-distinct H-bond isomers of dodecahedral (H$_2$O)$_{20}$. As an alternative to calculating the energy of all these isomers, we calculated the energy of only 30 randomly selected H-bond isomers. These 30 energies can be used to fit the coefficients ($\alpha_r, \alpha_{rs}, \ldots$) via equation 1.1. With these coefficients in hand, we can predict the energy of every H-bond isomer from its graph invariants ($I_r, I_{rs}, \ldots$). A comparison of the energy from the 30-point fit with the exact energy for all 30026 isomers is shown in figure 2.6 (for details see chapter 2). Our theory is both efficient and systematic. The efficiency is illustrated by our treatment of the (H$_2$O)$_{20}$ dodecahedron, where calculating the energy of roughly one out of a thousand H-bond isomers was sufficient to characterize the entire set of isomers. It is systematic because no prejudice is introduced in selecting the important features of the H-bond network that dominate the energetics.
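The fit-then-predict workflow can be sketched on a toy system. The Python example below uses a six-membered ring of bonds, takes the ring averages of bond-pair products at separations 1, 2 and 3 as second order invariants, fits a linear model to six hand-picked training configurations, and then predicts the energy of all 64 configurations. Everything here is an assumption made for illustration: the ring, the synthetic energy function and its coefficients are invented, and the training set is chosen by hand rather than at random so that the example is deterministic. The small pure Python solver keeps the sketch dependency-free; `numpy.linalg.lstsq` would be the usual choice.

```python
from itertools import product

N_BONDS = 6  # bonds of a six-membered ring; b[i] is +1 or -1


def invariants(b):
    """Constant term plus three second order ring invariants."""
    return [1.0] + [
        sum(b[i] * b[(i + k) % N_BONDS] for i in range(N_BONDS)) / N_BONDS
        for k in (1, 2, 3)
    ]


def model_energy(b):
    """Synthetic stand-in for an expensive energy calculation (made-up numbers)."""
    c0, c1, c2, c3 = -30.0, 1.5, -0.4, 0.2
    _, i1, i2, i3 = invariants(b)
    return c0 + c1 * i1 + c2 * i2 + c3 * i3


def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit(configs):
    """Least squares coefficients from the normal equations A^T A x = A^T y."""
    rows = [invariants(b) for b in configs]
    energies = [model_energy(b) for b in configs]
    m = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    aty = [sum(r[i] * e for r, e in zip(rows, energies)) for i in range(m)]
    return solve(ata, aty)


TRAINING = [
    (1, 1, 1, 1, 1, 1),
    (1, -1, 1, -1, 1, -1),
    (1, 1, 1, -1, -1, -1),
    (1, 1, -1, 1, 1, -1),
    (1, 1, 1, 1, 1, -1),
    (1, 1, -1, -1, 1, -1),
]

coeffs = fit(TRAINING)
all_configs = list(product((1, -1), repeat=N_BONDS))
max_error = max(
    abs(sum(c * x for c, x in zip(coeffs, invariants(b))) - model_energy(b))
    for b in all_configs
)
print(len(TRAINING), len(all_configs), max_error)
```

Because the synthetic energy is constructed to be exactly linear in the invariants, the fit reproduces every configuration to rounding error. Real energies would leave a residual whose size measures how well a given order of invariants captures the physics.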

This thesis is organized as follows. Each chapter is self-contained and may be read independently. In chapter 2, we first introduce the notion of graph invariants and its application to finite (cluster) systems. It is shown that graph invariants can be used to change the enumeration of symmetry-distinct H-bond topologies from a nominally $N^2$ process, where $N$ is the number of H-bond topologies, to an $N \log N$ process. The ability to link physical properties with H-bond topologies by using graph invariants is discussed by considering the (H$_2$O)$_6$ cage. The predictive power of graph invariants is demonstrated by considering the (H$_2$O)$_{20}$ dodecahedron. In chapter 3, we develop a formalism for graph invariants of periodic (ice) systems. First, we show that graph invariants in small unit cells are a subset of the graph invariants of larger unit cells, providing a hierarchy of approximations by which detailed calculations for small unit cells can be used to parameterize the energy of the astronomical number of H-bond arrangements present in large unit cells. Second, we present graph enumeration results for ice-Ih, analyze conflicting results that have appeared previously in the literature, and furnish information on the statistical properties of the H-bond network of ice-Ih in the thermodynamic limit. In chapter 4, we present evidence of the profound influence of H-bond topologies on the energetics, structure and chemistry of water clusters.
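The idea behind the reduction in cost can be sketched as follows: symmetry-related graphs necessarily share the same invariants, so sorting the graphs by their invariants and comparing only graphs with equal invariant values avoids most pairwise symmetry checks. The Python example below is a toy version for a six-membered ring of bonds and is not the algorithm of chapter 2. The dihedral symmetry action, the choice of invariants as sort key and all function names are assumptions made for this illustration.

```python
from itertools import groupby, product

N = 6  # bonds in a six-membered ring; b[i] is +1 or -1


def images(b):
    """All symmetry images of configuration b under the ring's dihedral group."""
    result = set()
    for k in range(N):
        rotated = [0] * N
        reflected = [0] * N
        for i in range(N):
            rotated[(i + k) % N] = b[i]          # a rotation keeps bond direction
            reflected[(k - i - 1) % N] = -b[i]   # a reflection reverses it
        result.add(tuple(rotated))
        result.add(tuple(reflected))
    return result


def invariant_key(b):
    """Second order invariants (unnormalised sums of bond-pair products)."""
    return tuple(sum(b[i] * b[(i + k) % N] for i in range(N)) for k in (1, 2, 3))


def distinct_naive(configs):
    """Keep one representative per symmetry class, comparing against all kept ones."""
    reps, comparisons = [], 0
    for b in configs:
        imgs = images(b)
        for r in reps:
            comparisons += 1
            if r in imgs:
                break
        else:
            reps.append(b)
    return reps, comparisons


def distinct_sorted(configs):
    """Sort by invariants first, then compare only within groups of equal key."""
    reps, comparisons = [], 0
    ordered = sorted(configs, key=invariant_key)
    for _, group in groupby(ordered, key=invariant_key):
        group_reps, cost = distinct_naive(list(group))
        reps.extend(group_reps)
        comparisons += cost
    return reps, comparisons


configs = list(product((1, -1), repeat=N))
naive_reps, naive_cost = distinct_naive(configs)
sorted_reps, sorted_cost = distinct_sorted(configs)
print(len(naive_reps), naive_cost, len(sorted_reps), sorted_cost)
```

Both routes find the same number of symmetry-distinct configurations, and the sorted route never performs more pairwise comparisons than the naive one. For large sets the sort itself costs $N \log N$ operations, and the comparisons stay cheap as long as the groups sharing a key remain small.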

CHAPTER 2

THE USE OF GRAPH INVARIANTS FOR EFFICIENTLY GENERATING HYDROGEN BOND TOPOLOGIES AND PREDICTING PHYSICAL PROPERTIES OF WATER CLUSTERS

Hydrogen bonds (H-bonds) are long-lived structures in ice and cold water clusters, and, to a lesser extent, in liquid water. Thanks to the strong tendency of water to form H-bonds in a tetrahedral arrangement, our understanding of the 3-dimensional structure and dynamics of aqueous systems has long been couched in terms of a reduced description based on H-bond connectivity [27]. Many phenomena illustrate how H-bond topology serves as a critical structural descriptor. The zero-point entropy of ice-Ih, “ordinary” ice at atmospheric pressure, is thought to be a manifestation of frozen-in complete disorder among the possible H-bond topologies of the ice lattice [17, 18, 24, 25] (but see below). Transport properties of ice are understood in terms of defects in H-bond connectivity [28]. The language used to name the structural isomers of water clusters – “cage” [29], for example, or “cube” [30] – reflects the correspondence between H-bond topology and water cluster structure [1, 31]. The strength of H-bonds within the ice-Ih lattice has been conjectured to fall into strong or weak classes¹ based on the local H-bond topology [26, 32–35], although this distinction has recently been questioned [21, 36].

¹ In some of the literature, the strong and weak classes are referred to as trans and cis, respectively.

To determine the lowest energy H-bonding arrangement or construct a statistical average requires, in general, a sampling of the H-bond topologies. The number of available topologies grows exponentially with system size. In 1935 Pauling [18] estimated that the number of H-bond arrangements available to $M$ water molecules in the ice-Ih crystal structure is $(3/2)^M$, an estimate that has been shown to be accurate within a few percent [24, 25]. The number of structures in a simulation cell of even one hundred water molecules is of order $(3/2)^{100} \approx 4 \times 10^{17}$, so it might appear that enumeration or sampling of topologies for a system of this size will fall exclusively in the province of either Monte Carlo [21, 37] or more sophisticated variants of Monte Carlo [38, 39].

The purpose of this chapter is to provide analytic techniques for complicated H-bonded systems, such as the myriad arrangements of ice-Ih, which would seem to be tractable only by numerical simulations. The problem of H-bond structures in ice-Ih also happens to be a particularly fascinating one. The experimental residual entropy of ice-Ih at 0 K is close to Pauling’s estimate [17, 18], leading to the conclusion that H-bond topological disorder becomes frozen into the ice-Ih lattice as temperature is lowered from the freezing point to absolute zero. Within the last decade, experiments have detected a phase transition in KOH-doped ice-Ih, weakly dependent on KOH concentration and tending toward 72 K in the limit of vanishing impurity concentration [19, 40–43]. This seems to indicate that the KOH impurity catalyzes the rearrangement of H-bonds, and that ordinary ice, if equilibrium could be attained, would undergo a proton ordering transition at 72 K. Neutron scattering [20, 44–46] and thermal depolarization experiments [47, 48] on KOH-doped ice-Ih suggest that the proton-ordered form of ice-Ih, known as ice-XI, is an orthorhombic ferroelectric crystal, although the interpretation of these experiments has been debated [49–51]. Most common potential models for water do not predict this structure as the ground state, which has forced a re-appraisal of such models [21]. More recently, Antarctic ice cores have been investigated with Raman spectroscopy [52]. These samples are believed to have been equilibrated at temperatures controlled by their depth beneath the surface for tens of thousands of years. The Raman spectra indicate a phase transition at 237 K that has characteristics similar to, but lies far above, the phase transition in KOH-doped ice-Ih. Studies on Greenland ice failed to find similar evidence for a phase transition [22].

Therefore, the current understanding of ordinary ice is ripe for further experimental and theoretical insight.

H-bonds are directional, so H-bond topologies are in one-to-one correspondence with oriented simple graphs, that is, collections of vertices connected by at most one directed edge. The direction of an H-bond points from hydrogen donor to hydrogen acceptor. Enumeration of H-bond topologies becomes an exercise in graph theory, to list all possible graphs consistent with the so-called “ice rules” [53]. These rules allow at most two edges emanating from a vertex because $\mathrm{H_2O}$ molecules can donate at most two hydrogens, and at most two edges incident upon a vertex since at most two H-bonds can be accepted at the lone pairs. The ice rules are modified in an obvious way to accommodate the presence of species like $\mathrm{OH^-}$ and $\mathrm{H_3O^+}$ [1].

In anything but the smallest water clusters or unit cells of ice, one is faced with huge numbers of configurations. In the more rigid ice clusters – the $(\mathrm{H_2O})_6$ cage, the $(\mathrm{H_2O})_8$ cube, the $(\mathrm{H_2O})_{20}$ dodecahedron, to name a few – and in ice-Ih, local minima of the potential energy surface are, to a good approximation, in one-to-one correspondence with oriented graphs. How does one find the ground state or construct a thermal average with anything but numerical sampling techniques? Perhaps the most useful result of the current work is a strategy by which the physical properties, including but not necessarily confined to the energy, of a large number of H-bonded structures can be summarized and predicted in terms of a handful of parameters, each associated with a special linear combination of variables defined for oriented graphs called a graph invariant. The procedure by which physical properties are correlated with, and predicted by, graph invariants is automatic and does not rely on special physical insight, although graph invariants will certainly facilitate deeper physical interpretation. The number of graph invariants used to fit and predict physical properties can be systematically enlarged, leading to a hierarchy of approximations. To determine the energy or find the ground state among a vast number of structures, we can therefore use graph invariants to avoid explicit and often costly calculations for all but a small training set of structures from which the parameters can be extracted. In this chapter, the procedure is tested for two water clusters, the $(\mathrm{H_2O})_6$ cage and the $(\mathrm{H_2O})_{20}$ dodecahedron, for which 27 and 30026 symmetry-distinct H-bonding arrangements are possible, respectively.

Computationally, explicit enumeration of allowed graphs for H-bonded systems is relatively straightforward, but eliminating structures that are related to each other by a symmetry operation is not. Nominally, elimination of symmetry-related graphs is a process of order $M^2$, where $M$ is the number of graphs, because it involves comparison of pairs of graphs. Moreover, each of the order $M^2$ comparisons can be rather expensive when the symmetry group is large. In this work we show how use of graph invariants [54, 55] can change the scaling of computational effort with $M$ from $M^2$ to $M \ln M$. The desirability of eliminating symmetry-related structures is illustrated by our calculation for a 48-member hexagonal unit cell of an ice lattice: there are 2404144962 graphs possible in total, but only 8360361 symmetry-distinct structures.² With such large numbers of configurations, $M \ln M$ scaling is an enormous improvement. Elimination of symmetry-related configurations has been attempted previously “by hand” for small ice unit cells [56]. The results of this effort are in apparent conflict with attempts to group symmetry-related structures based on energetic criteria [21, 57]. Therefore, efficient and reliable computational methods of generating H-bond graphs are needed.

²A detailed discussion of periodic systems is given in chapter 3.

Counting total numbers of allowed graphs on an infinite periodic lattice (or regular finite structures) can be addressed by series expansion methods [24, 25]. Even though series expansions have only been used, to our knowledge, for counting total numbers of H-bond arrangements in regular structures, these methods can presumably be extended to calculate certain averages with generating function techniques [58]. However, the graphs themselves, not just total numbers or averages, are needed to construct explicit molecular structures for further study, as would be needed as input for an ab initio or empirical potential calculation.

The notion of a graph invariant is introduced in section 2.1. Graph invariants often have a simple physical interpretation, as we illustrate in that section. The use of graph invariants to change the enumeration of symmetry-distinct graphs from an $M^2$ to an $M \ln M$ process is described in section 2.2. This section may be omitted by readers who are not interested in the numerical problems associated with enumeration of H-bonded structures. When the energy and other physical properties can be correlated with topological properties of the H-bonded network, as first noted for water clusters by Radhakrishnan and Herndon [59] and also by McDonald, Ojamäe, and Singer [1], then the physical properties, themselves being invariant with respect to symmetry operations, should be expressible in terms of the values of graph invariants.

The link between physical properties and H-bond topology suggests the following strategy for global optimization, put forward in section 2.3, which is useful even when the number of such isomers is too large to enumerate or analyze: A Monte Carlo procedure can be used to generate a training set of structures [21, 37]. The training set can be used to establish a relationship between physical properties and the values of graph invariants, either by least-squares fitting or more sophisticated methods. The tentative relationship between physical properties and topology can be refined by further sampling. In this way, low-entropy structures, which may tend to be overlooked in Monte Carlo methods, can be identified by selective enumeration. Invariants can concisely parameterize the energy of a large number of H-bond arrangements, thereby permitting the calculation of physical properties that involve the entire ensemble of H-bond topologies, like phase transitions, with a minimum of input. We illustrate this capability with a model calculation of a cluster phase transition for the $(\mathrm{H_2O})_{20}$ dodecahedron in section 2.3.2.

2.1 Graph invariants

H-bonds between water molecules are directional. One water “donates” its covalently bonded hydrogen to the bond while the second water “accepts” that hydrogen in the vicinity of the lone pair electronic cloud. The H-bonded network within a water cluster or ice crystal is summarized by graphs in which the vertices represent oxygen atoms and directed bonds (or edges) connecting vertices represent H-bonds. By convention, the directed bonds point from donor to acceptor. The so-called “ice rules” stipulate that each neutral water vertex has a maximum of 4 neighbors. At each vertex there are a maximum of two outgoing edges, the covalently bonded hydrogens, and two incoming edges, where the two lone pairs can accept an H-bond. The formalism can easily accommodate excess protons or hydroxide by allowing the appropriate local deviations from these rules [1].
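The ice rules stated above are easy to check mechanically. The following is a minimal sketch, assuming a topology is supplied as a list of (donor, acceptor) vertex pairs; the function name `satisfies_ice_rules` and the two example clusters are illustrative and do not come from the original work.

```python
from collections import Counter

def satisfies_ice_rules(directed_bonds, max_donate=2, max_accept=2):
    """Return True if no water donates or accepts more H-bonds than allowed.

    directed_bonds: iterable of (donor, acceptor) vertex pairs.
    """
    bonds = list(directed_bonds)  # allow generators to be passed in
    donated = Counter(donor for donor, _ in bonds)
    accepted = Counter(acceptor for _, acceptor in bonds)
    return (all(n <= max_donate for n in donated.values())
            and all(n <= max_accept for n in accepted.values()))

# Cyclic trimer: every water donates once and accepts once.
cyclic = [(1, 2), (2, 3), (3, 1)]
# Hypothetical star in which water 0 donates three times (not allowed).
triple_donor = [(0, 1), (0, 2), (0, 3)]
print(satisfies_ice_rules(cyclic), satisfies_ice_rules(triple_donor))
```

The limits are keyword arguments so that the local deviations needed for excess protons or hydroxide could be expressed by the caller; how the original programs handled those species is not specified here.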

The H-bond topology of a finite or infinite system of water molecules is summarized by a collection of variables $b_{ij}$, one for each vertex pair $(i, j)$, whose value is

$$ b_{ij} = \begin{cases} +1 & \text{if water } i \text{ donates to water } j, \\ -1 & \text{if water } j \text{ donates to water } i, \\ \phantom{+}0 & \text{if there is no H-bond between } i \text{ and } j. \end{cases} \tag{2.1} $$

The order of the indices on $b_{ij}$ is meaningful, so to describe the same physical configuration $b_{ji} = -b_{ij}$. It is convenient to let a single index $r$ replace the dual index for bond pairs $(i, j)$. It is arbitrary whether $r$ stands for $ij$ or $ji$, but some canonical ordering of the bond pairs must be specified. For example, in the triangle graph of Fig. 2.1 the canonical direction chosen for $b_1$ is from vertex 1 to vertex 2. With this convention, $b_1 = +1$ indicates that molecule 1 donates to molecule 2, while $b_1 = -1$ specifies that molecule 1 accepts an H-bond from molecule 2. In Fig. 2.1 we give some examples of H-bond arrangements in a triangular cluster of three water molecules, and the values of the bond variables that specify these physical configurations relative to the canonical orientations given to the left in Fig. 2.1. In general, once the canonical orientation for each directed edge is specified, the physical meaning of $b_r$ is clear: If $r$ stands for $ij$, then $b_r = +1$ indicates that water $i$ donates to water $j$, while if $r$ stands for $ji$, then $b_r = -1$ must be used to indicate the same arrangement.

Symmetry properties of a cluster or crystal are manifested by a group of permutation operations mapping the set of vertices onto itself. The list of adjacent vertices (vertices connected by a bond, irrespective of the bond’s direction) is preserved by each of the symmetry operations. It is important to note that the symmetry group pertains to the oxygen atom “scaffold”, and is not dependent on particular orientations of H-bonds. The group of symmetry operations for the vertices of the triangular graph shown in Fig. 2.1 consists of the $3!$ vertex permutations given in the second column of Table 2.1. The group members are labelled by corresponding real-space point group operations in Table 2.1 to provide a useful mnemonic, but fundamentally our formalism deals with connectivity, or topology, and not geometry.

[Figure: triangle graph with vertices 1, 2, 3 and bonds b1, b2, b3, followed by two example H-bond topologies labelled with their bond-variable values]

Figure 2.1: A simple example of an oriented graph, which might represent the configuration of a $(\mathrm{H_2O})_3$ cluster, is shown on the left. The directions shown on the edges indicate the orientation of the edges if all the bond variables $b_r$ were taken equal to $+1$, canonical orientations chosen arbitrarily for each bond. Two different H-bond topologies for $(\mathrm{H_2O})_3$ are shown on the right, along with the values of the bond variables, as referenced to the canonical orientations of the graph on the left.
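For small graphs the permutation group of the oxygen scaffold can be found by brute force. The sketch below is illustrative only: the function name is hypothetical, and an exhaustive search over all vertex permutations is an assumption that is practical only for a handful of vertices. It keeps every permutation that maps the undirected edge list onto itself, using the same tuple notation as the text, in which the tuple (x, y, z) takes vertices 1, 2, 3 to x, y, z.

```python
from itertools import permutations

def adjacency_preserving_permutations(vertices, edges):
    """All vertex permutations mapping the undirected edge set onto itself."""
    edge_set = {frozenset(e) for e in edges}
    group = []
    for image in permutations(vertices):
        mapping = dict(zip(vertices, image))
        mapped = {frozenset((mapping[u], mapping[v])) for u, v in edges}
        if mapped == edge_set:
            group.append(image)
    return group

triangle = adjacency_preserving_permutations(
    (1, 2, 3), [(1, 2), (2, 3), (1, 3)])
ring5 = adjacency_preserving_permutations(
    (1, 2, 3, 4, 5), [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)])
print(len(triangle), len(ring5))
```

Bond directions play no role here, which mirrors the statement that the symmetry group belongs to the scaffold and not to a particular H-bond arrangement.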

We stress that the utility of the graph formalism is not at all dependent on the physical isomers having the full symmetry of the vertex and bond permutation groups. Consider two H-bond topologies that are symmetry related. They each correspond to distorted local minima, possibly having low or no symmetry. As long as the topologies are symmetry related, then the energy and any other physical property of the distorted local minima corresponding to those topologies will be identical. Therefore, parameterization of the energy of distorted structures in terms of invariants based on the higher symmetry of the oxygen scaffold is appropriate. While the values of bond lengths and angles are irrelevant in the graph theoretical formalism, it is not true that physical geometry is entirely irrelevant. Firstly, adjacent bonds reflect physical proximity. Secondly, the symmetry group may be chosen to reflect expected physical geometry, which may lower the symmetry from that based solely on connectivity.

$g_\alpha$      vertex permutation    signed bond permutation
$E$             (123)                 $(b_1,\; b_2,\; b_3)$
$C_3$           (231)                 $(b_2,\; -b_3,\; -b_1)$
$C_3$           (312)                 $(-b_3,\; b_1,\; -b_2)$
$\sigma_v$      (132)                 $(b_3,\; -b_2,\; b_1)$
$\sigma_v$      (213)                 $(-b_1,\; b_3,\; b_2)$
$\sigma_v$      (321)                 $(-b_2,\; -b_1,\; -b_3)$

Table 2.1: Group of vertex permutations for the triangle graph shown in Fig. 2.1, and the induced group of signed permutations on the bonds of the triangle graph. For the vertex permutations, we denote the permutation taking vertices 1, 2 and 3 to $i$, $j$ and $k$ as $(ijk)$. It is also common to indicate permutations in terms of independent cycles, in terms of which, for example, the $(132)$ permutation would be written as $(1)(23)$. Our notation for signed bond permutations follows that for vertices.
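The signed bond permutations of Table 2.1 can be generated from the vertex permutations alone. The sketch below assumes the canonical orientations b1: 1→2, b2: 2→3 and b3: 1→3. The first and third are stated in the text; the second is inferred from the entries of Table 2.1 and should be treated as an assumption, as should the function name.

```python
# Canonical bond orientations (b2 is inferred from Table 2.1, see lead-in).
CANONICAL = {1: (1, 2), 2: (2, 3), 3: (1, 3)}

def signed_bond_permutation(vertex_image, canonical=CANONICAL):
    """Image of each bond variable under a vertex permutation.

    vertex_image: tuple (x, y, z), meaning vertices 1, 2, 3 go to x, y, z.
    Returns {r: (sign, s)}: bond variable b_r is carried to sign * b_s.
    """
    lookup = {}
    for s, (i, j) in canonical.items():
        lookup[(i, j)] = (+1, s)   # same direction as canonical bond s
        lookup[(j, i)] = (-1, s)   # reversed relative to canonical bond s
    mapping = {v: vertex_image[v - 1] for v in (1, 2, 3)}
    return {r: lookup[(mapping[i], mapping[j])]
            for r, (i, j) in canonical.items()}

for perm in [(1, 2, 3), (2, 3, 1), (3, 1, 2), (1, 3, 2), (2, 1, 3), (3, 2, 1)]:
    images = signed_bond_permutation(perm)
    row = ", ".join(("-" if sign < 0 else "") + f"b{s}"
                    for sign, s in (images[r] for r in (1, 2, 3)))
    print(perm, row)
```

Each printed row lists the images of b1, b2 and b3 in that order, which is the layout used in the third column of the table.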

As a hypothetical example, consider a ring of five vertices which represents a geometry in which four vertices are coplanar and the fifth lies far outside the plane. The symmetry group determined by vertex adjacency is $D_{5h}$, but one may choose a smaller symmetry group, perhaps $C_s$, to reflect the non-planarity of the ring. However, use of the higher symmetry group may still be appropriate. Consider geometry optimizations initiated from a planar starting structure. If two planar initial structures which are equivalent within $D_{5h}$ symmetry optimize to the same distorted structure, then the energy of the isomers will be described by the more compact invariants of the $D_{5h}$ symmetry group. The $C_s$ symmetry group will generate a larger set of initial structures and (if they exist on the potential energy surface) will enumerate more physical isomers. In this situation, the symmetry group should be chosen to suit the goals of the calculation and the properties of the potential energy surface.

The symmetry group on vertices induces a group of signed permutations on the bonds. We signify the image of group operation $g_\alpha$ on bond $b_r$ with the notation $g_\alpha(b_r)$. For example, the $C_3$ operation with vertex permutation (312) on the triangle graph in Fig. 2.1 brings vertices 1 and 2 to vertices 3 and 1, respectively. (Also see Table 2.1.) Therefore, bond $b_1$ is moved to the location of bond $b_3$. By the convention chosen in Fig. 2.1, $b_1 = +1$ indicates an H-bond from 1 to 2, and $b_3 = +1$ indicates an H-bond from 1 to 3. However, the $C_3$ operation takes a bond from vertex 1 to vertex 2 to another bond from vertex 3 to vertex 1, not from 1 to 3. Therefore, the image of $b_1$ under the $C_3$ operation is $-b_3$, not $b_3$, and $C_3(b_1) = -b_3$.

2.1.1 Generation of graph invariants

We seek functions of the bond variables $\{b\} \equiv \{b_1, b_2, \ldots, b_r, \ldots\}$ that, like physical properties, are unchanged by application of symmetry operations. These special functions, the invariants, have the property

$$ g_\alpha\big(I(\{b\})\big) \equiv I\big(g_\alpha(b_1), g_\alpha(b_2), \ldots\big) = I(\{b\}). \tag{2.2} $$

The functions $I(\{b\})$ transform as the totally symmetric representation of the induced group on bonds. According to standard group theory, invariants can be constructed using a projection operator, which takes a particularly simple form because the characters for the totally symmetric representation are all unity. For example, the application of a projection operator to a single bond variable takes the form

$$ I_r = C_r \sum_{\alpha} g_\alpha(b_r), \tag{2.3} $$

where the sum runs over all group operations and $C_r$ is a normalization constant chosen for convenience. Now consider the application of a group operation $g_\beta$ to $I_r$:

$$ g_\beta(I_r) = C_r \sum_{\alpha} g_\beta\big(g_\alpha(b_r)\big). \tag{2.4} $$

According to the requirements for group operations, the composition $g_\beta g_\alpha$, with $\alpha$ running over the whole group, generates each of the group operations once and only once. [If $g_\beta g_\alpha$ and $g_\beta g_{\alpha'}$ gave the same resultant operation, then $g_\beta$ would fail to have a unique inverse.] Therefore $g_\beta(I_r) = I_r$.

By precisely the same reasoning, $I_{rs}, I_{rst}, \ldots$, as defined below, are also invariants.

$$ I_{rs} = C_{rs} \sum_{\alpha} g_\alpha(b_r b_s) \tag{2.5} $$

$$ I_{rst} = C_{rst} \sum_{\alpha} g_\alpha(b_r b_s b_t) \tag{2.6} $$

$$ \vdots $$

We refer to $I_r$ as a first order invariant, $I_{rs}$ as a second order invariant, and so on. It is obvious that the order of the indices of bond generators does not change the invariants (i.e., $I_{rs} = I_{sr}$).

The number of first order invariants is generally less than the number of bonds for several reasons. Many invariants may turn out to be identically zero. The necessary and sufficient condition for any invariant $I_{rs\cdots}$ to vanish identically is the existence of a group operation that takes the product of bond variables $b_r b_s \cdots$ into minus itself, as shown in section 2.5. In many cases, it is possible to find an operation which takes a single bond $b_r$ into $-b_r$, and therefore many first order bond invariants vanish identically. The number of first order invariants may be less than the number of bonds for another reason. If $g_\alpha(b_r) = \pm b_s$, application of the projection operator in Eq. 2.3 to $b_r$ and $b_s$ yields the same result within an overall constant, in which case $I_r$ and $I_s$ are equivalent. The number of different first order invariants depends on the number of independent orbits of the bond group, which in turn depends on the structure of the induced group on bonds. Similar considerations apply to the higher order invariants. When all bonds are filled, and therefore all bond variables $b_r = \pm 1$, $I_{rrst\cdots} = I_{st\cdots}$. There can be no linearly independent invariants of order greater than the number of bonds when all bonds are filled. Given a group of symmetry operations expressed as permutations of vertices, we have found symbolic algebra programs convenient for computationally generating the induced group on bonds, and then a table of independent invariants.

Let us use the triangle graph of Fig. 2.1 as an example to illustrate the properties of invariants mentioned above. Directly from Table 2.1, it can be seen that all the first order invariants are identically zero. As for second order invariants, after convenient normalization we have that

$$ I_{11} = I_{22} = I_{33} = b_1^2 + b_2^2 + b_3^2, \tag{2.7} $$

$$ I_{12} = I_{23} = I_{13} = b_1 b_2 - b_2 b_3 - b_1 b_3. \tag{2.8} $$

Since the image of $b_1 b_2 b_3$ under each of the reflections is $-b_1 b_2 b_3$, the third order invariant $I_{123}$ vanishes identically. However, the third order invariant

$$ I_{112} = -(b_2 + b_3)\, b_1^2 + (b_1 + b_3)\, b_2^2 - (b_1 - b_2)\, b_3^2 \tag{2.9} $$

may take non-zero values if one of the bonds is empty.
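The projection operator of Eq. 2.3 and its higher order analogues can be applied mechanically. The sketch below is illustrative, not the symbolic algebra programs mentioned in the text: it hard-codes the signed bond permutations of the triangle graph as read from Table 2.1, represents a monomial by a tuple of bond indices, and omits the normalization constants. The names `GROUP` and `project` are hypothetical.

```python
from collections import defaultdict

# Induced signed bond permutations for the triangle graph:
# entry r -> (sign, s) means b_r is carried to sign * b_s.
GROUP = [
    {1: (+1, 1), 2: (+1, 2), 3: (+1, 3)},   # (123), identity
    {1: (+1, 2), 2: (-1, 3), 3: (-1, 1)},   # (231)
    {1: (-1, 3), 2: (+1, 1), 3: (-1, 2)},   # (312)
    {1: (+1, 3), 2: (-1, 2), 3: (+1, 1)},   # (132)
    {1: (-1, 1), 2: (+1, 3), 3: (+1, 2)},   # (213)
    {1: (-1, 2), 2: (-1, 1), 3: (-1, 3)},   # (321)
]

def project(monomial, group=GROUP):
    """Sum the images of a bond monomial over the group (unnormalized).

    monomial: tuple of bond indices, e.g. (1, 1, 2) for b1*b1*b2.
    Returns {sorted index tuple: integer coefficient}, zero terms dropped.
    """
    total = defaultdict(int)
    for op in group:
        sign, image = 1, []
        for r in monomial:
            s_sign, s = op[r]
            sign *= s_sign
            image.append(s)
        total[tuple(sorted(image))] += sign
    return {m: c for m, c in total.items() if c != 0}

print(project((1,)))        # first order: empty, vanishes identically
print(project((1, 2)))      # second order: b1b2 - b2b3 - b1b3, times 2
print(project((1, 2, 3)))   # third order, distinct bonds: empty
print(project((1, 1, 2)))   # third order with a repeated bond: six terms
```

An empty dictionary signals an invariant that vanishes identically, which is the situation described above for every first order invariant of the triangle.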

Products of invariants are also invariant. Therefore, products of two first order invariants can be expanded as a linear combination of second order invariants, products of first and second order invariants are a linear combination of third order invariants, and so on.

$$ I_r I_s = \sum_{tu} a^{r,s}_{tu}\, I_{tu} \tag{2.10} $$

$$ I_r I_{st} = \sum_{uvw} a^{r,st}_{uvw}\, I_{uvw} \tag{2.11} $$

$$ \vdots $$

Standard group representation theory governs the resolution of invariant products.

2.1.2 Physical interpretation of graph invariants

The invariants can be interpreted in terms of physical quantities, often in terms of several such quantities. For example, second order invariants of the triangle graph (Fig. 2.1) with identical indices like $I_{11}$ in Eq. 2.7 simply count the number of non-empty bonds within a group theoretical orbit. If we write the dipole moment of the water molecules in terms of bond dipoles of magnitude $\mu_{\rm bond}$, the total dipole moment due to ring dipoles would be

$$ \mu_x = \tfrac{1}{2}\, \mu_{\rm bond}\, (-b_1 + 2 b_2 + b_3), \tag{2.12} $$

$$ \mu_y = -\tfrac{\sqrt{3}}{2}\, \mu_{\rm bond}\, (b_1 + b_3). \tag{2.13} $$

The squared magnitude of the dipole moment is

$$ \mu_x^2 + \mu_y^2 = \tfrac{1}{4}\, \mu_{\rm bond}^2 \left[ (-b_1 + 2 b_2 + b_3)^2 + 3\,(b_1 + b_3)^2 \right] = \mu_{\rm bond}^2 \left( b_1^2 + b_2^2 + b_3^2 - b_1 b_2 + b_2 b_3 + b_1 b_3 \right) = \mu_{\rm bond}^2 \left( I_{11} - I_{12} \right), $$

a quantity invariant to symmetry operations. It therefore comes as no surprise that second order invariants nicely capture the dependence of the squared magnitude of the dipole moment on H-bonded configurations.
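The relation between the squared dipole moment and the second order invariants can be confirmed by exhaustive evaluation. This sketch takes the component formulas to be μ_x = (μ_bond/2)(−b1 + 2 b2 + b3) and μ_y = −(√3/2) μ_bond (b1 + b3), and the invariants to be I_11 = b1² + b2² + b3² and I_12 = b1 b2 − b2 b3 − b1 b3. These forms are assumptions read from the equations of this section, and the function names are illustrative.

```python
from itertools import product
from math import sqrt, isclose

def dipole_squared(b1, b2, b3):
    """Squared dipole moment in units of mu_bond**2, from the components."""
    mu_x = 0.5 * (-b1 + 2 * b2 + b3)
    mu_y = -0.5 * sqrt(3.0) * (b1 + b3)
    return mu_x ** 2 + mu_y ** 2

def invariant_form(b1, b2, b3):
    """I_11 - I_12 built from the assumed second order invariants."""
    i11 = b1 * b1 + b2 * b2 + b3 * b3
    i12 = b1 * b2 - b2 * b3 - b1 * b3
    return i11 - i12

# Each bond may point either way (+1, -1) or be empty (0): 27 assignments.
all_configs = list(product((-1, 0, 1), repeat=3))
agree = all(isclose(dipole_squared(*b), invariant_form(*b), abs_tol=1e-12)
            for b in all_configs)
print(agree)
```

The check covers empty bonds as well as filled ones, so it also exercises configurations with fewer than three H-bonds.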

Second order invariants can always be understood in terms of counting non-empty bonds and the magnitude of the dipole moment. However, these are not the only possible interpretations of the second order invariants. For example, the invariants $I_{3,7} = I_{5,8}$ of the $(\mathrm{H_2O})_6$ cage structure shown in Fig. 2.2 are connected to the number of single-donor/single-acceptor molecules, a feature which strongly affects the total energy of the cage [31]. The precise relation is

$$ \text{number of single-donor/single-acceptors} = 1 - \tfrac{1}{2}\, I_{3,7}. \tag{2.14} $$

Incidentally, the $(\mathrm{H_2O})_6$ cage is an example of a structure which has non-zero first order invariants. The complete list of invariants for the $(\mathrm{H_2O})_6$ cage is given in Table 2.2. The first order invariant $I_3 = b_3 + b_5 + b_7 + b_8$ counts the number of H-bonds among bonds 3, 5, 7, and 8 pointing away from the center of the cage, while the invariant $I_9 = b_9 + b_{10} + b_{11} + b_{12}$ has a similar interpretation for bonds 9, 10, 11 and 12. (See Fig. 2.2 for the definition of the bond variables. Later we explain the rationale for including four bonds incident on the apical vertices, even though the apical waters participate in only two H-bonds and have only one dangling hydrogen.)

[Figure: $(\mathrm{H_2O})_6$ cage with bonds labelled b1 through b12; panel title “E= -8.28775002457822296 (Cage)”]

Figure 2.2: The cage structure of $(\mathrm{H_2O})_6$. One of the 27 possible symmetry-distinct H-bonding arrangements for the cage structure is shown. The arrows and bond labels indicate the directions of the bonds when the bond variables are equal to $+1$.

While we have stressed that invariants possess physical interpretations, their power lies in the fact that their generation and use can be automated. Trends can be deduced without reliance on physical insight or a tedious trial-and-error process. Better yet, the use of invariants can guide the discovery of physical interpretation, or show that several interpretations are equivalent. We will show that certain critical physical properties are captured by surprisingly low order invariants because the constraints of the ice rules link properties in non-obvious ways.

1st order invariants
$b_3 + b_5 + b_7 + b_8$
$b_9 + b_{10} + b_{11} + b_{12}$

2nd order invariants
$b_1^2 + b_2^2 + b_4^2 + b_6^2$
$b_3^2 + b_5^2 + b_7^2 + b_8^2$
$b_9^2 + b_{10}^2 + b_{11}^2 + b_{12}^2$
$b_2 b_4 - b_1 b_6$
$b_3 b_7 + b_5 b_8$
$b_9 b_{10} + b_{11} b_{12}$
$b_1 b_2 - b_1 b_4 + b_2 b_6 - b_4 b_6$
$b_3 b_5 + b_5 b_7 + b_3 b_8 + b_7 b_8$
$b_5 b_9 + b_8 b_{10} + b_3 b_{11} + b_7 b_{12}$
$b_8 b_9 + b_5 b_{10} + b_7 b_{11} + b_3 b_{12}$
$b_9 b_{11} + b_{10} b_{11} + b_9 b_{12} + b_{10} b_{12}$
$b_1 b_3 + b_2 b_3 - b_1 b_5 + b_4 b_5 - b_4 b_7 + b_6 b_7 - b_2 b_8 - b_6 b_8$
$b_1 b_9 - b_4 b_9 + b_2 b_{10} + b_6 b_{10} - b_1 b_{11} - b_2 b_{11} + b_4 b_{12} - b_6 b_{12}$
$b_2 b_9 + b_6 b_9 + b_1 b_{10} - b_4 b_{10} + b_4 b_{11} - b_6 b_{11} - b_1 b_{12} - b_2 b_{12}$
$b_3 b_4 + b_2 b_5 - b_3 b_6 + b_5 b_6 - b_1 b_7 - b_2 b_7 + b_1 b_8 - b_4 b_8$
$b_3 b_9 + b_7 b_9 + b_3 b_{10} + b_7 b_{10} + b_5 b_{11} + b_8 b_{11} + b_5 b_{12} + b_8 b_{12}$

Table 2.2: First and second order invariants for the $(\mathrm{H_2O})_6$ cage structure shown in Fig. 2.2. The invariants are calculated using a permutation symmetry group on the vertices isomorphic to the $D_{2d}$ point group. In the text we refer to the invariants by any of the bond products that generate the invariant by application of a projection operator. For example, $I_3 = I_5 = \cdots$ is at the top of the list of 1st order invariants, while $I_{1,1} = I_{2,2} = \cdots$ heads the list of 2nd order invariants and $I_{3,9} = I_{7,9} = \cdots$ is at the bottom of the list.
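The automated use of invariants amounts to fitting a property to invariant values on a small training set and then predicting it elsewhere. The sketch below is a toy illustration under stated assumptions, not the fitting procedure of the original work. The property is the squared length of the summed bond vectors of an equilateral triangle, computed from coordinates. The two features are the triangle invariants I_11 and I_12 in the forms assumed in this section. The training set of five configurations is chosen arbitrarily, and the least-squares problem is solved by hand through its 2×2 normal equations.

```python
from itertools import product
from math import sqrt

# Equilateral triangle with unit sides; canonical bonds b1: 1->2, b2: 2->3, b3: 1->3.
POS = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.5, sqrt(3.0) / 2.0)}
CANONICAL = {1: (1, 2), 2: (2, 3), 3: (1, 3)}

def squared_dipole(b):
    """Squared length of the sum of bond vectors, each weighted by b_r."""
    x = y = 0.0
    for r, (i, j) in CANONICAL.items():
        x += b[r - 1] * (POS[j][0] - POS[i][0])
        y += b[r - 1] * (POS[j][1] - POS[i][1])
    return x * x + y * y

def invariants(b):
    """Assumed second order invariants (I_11, I_12) of the triangle graph."""
    b1, b2, b3 = b
    return (b1 * b1 + b2 * b2 + b3 * b3, b1 * b2 - b2 * b3 - b1 * b3)

def fit_two_parameters(samples):
    """Least-squares coefficients (c1, c2) for property ~ c1*I_11 + c2*I_12."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for b in samples:
        f1, f2 = invariants(b)
        y = squared_dipole(b)
        s11 += f1 * f1
        s12 += f1 * f2
        s22 += f2 * f2
        t1 += f1 * y
        t2 += f2 * y
    det = s11 * s22 - s12 * s12
    return ((t1 * s22 - t2 * s12) / det, (s11 * t2 - s12 * t1) / det)

training = [(1, 1, 1), (1, 1, -1), (1, 0, 0), (1, -1, 0), (0, 1, 1)]
c1, c2 = fit_two_parameters(training)
worst = max(abs(c1 * invariants(b)[0] + c2 * invariants(b)[1] - squared_dipole(b))
            for b in product((-1, 0, 1), repeat=3))
print(round(c1, 6), round(c2, 6), worst < 1e-9)
```

Five training configurations suffice to predict the property for all 27 assignments because the property is exactly linear in the two invariants; for energies of real clusters the fit would only be approximate and more invariants would normally be needed.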

2.2 Graph invariants as a tool for enumerating H-bond topologies

Graph invariants can be used to change the computation of all symmetry distinct H-bond topologies for a cluster or crystal unit cell from a process scaling as $M^2$, where $M$ is the total number of topologies, to an $M \ln M$ process. Readers who are not interested in the computational aspects of enumerating H-bond arrangements can proceed to the next section.

Enumerating all possible H-bond arrangements for a cluster is accomplished by considering each of the $N_{\rm bond}$ H-bonds in turn. After completing the assignment of $n_{\rm bond}$ out of the total $N_{\rm bond}$ H-bonds, a list of all H-bond topologies for the first $n_{\rm bond}$ bonds consistent with the ice rules is in hand. Each entry of this list is a sequence of $1$’s, $-1$’s, or $0$’s of length $n_{\rm bond}$ for each configuration. The $\pm 1$’s stand for the two orientations of an H-bond, and the 0 means the bond is left empty. Usually there are just the two orientations signified by $\pm 1$, but in certain cases, discussed below, we will see that also allowing a bond to be empty is useful. Instead of using bond variables, the H-bond configuration is sometimes parameterized by the arrangement of edges incident at each vertex [57, 60] (e.g., the 6 bond arrangements at a vertex of a 4-coordinate water). The advantage furnished by invariants does not depend on how the H-bond configurations are parameterized.

The addition of bond $n_{\rm bond} + 1$ means making a new list of H-bond topologies by attempting to fit in all orientations of the $(n_{\rm bond} + 1)$th H-bond with each member of the old list. When an orientation of the new bond is allowed by the ice rules, that configuration of $n_{\rm bond} + 1$ bonds is added to the new list. Eventually a new list containing all possible orientations of H-bonds among $n_{\rm bond} + 1$ bonds is completed, and the process of adding bond $n_{\rm bond} + 2$ is started. After some or all of the bonds are added, the list of configurations is checked to eliminate configurations that are related by a symmetry operation. Symmetry-related configurations can be safely eliminated from the list when only a fraction of the $N_{\rm bond}$ bonds have been added, for if two partial (that is, containing $n_{\rm bond}$ out of the full $N_{\rm bond}$ bonds) configurations are symmetry-related, then the configurations grown from these two by filling in the rest of the bonds will also be symmetry related. Eliminating symmetry related graphs is the difficult step in enumerating H-bond topologies. It involves comparing pairs of graphs and is therefore nominally an $M^2$ process, where $M$ is the number of graphs before symmetry comparison. Symmetry comparison involves applying each group operation in turn and checking if a match is found. Two graphs are determined to be symmetry distinct only after all group operations are applied. Therefore the pairwise comparisons are a lengthy process, more so when the symmetry group is large and symmetry reduction offers the most benefit. By comparison, checking the ice rules after adding an H-bond, a process that scales as $M$, is far less costly. The cost of the symmetry comparisons can be reduced by only performing this check after certain of the H-bonds have been added. It is best to order the bonds into cycles or orbits, groups of bonds that are related by a symmetry operation, and check for symmetry only after all bonds of an orbit are added. However, even with this strategy, the computational cost of checking for symmetry grows rapidly with system size and would quickly render most large calculations infeasible if there were not some way around the $M^2$ scaling.
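The bond-by-bond growth and the conventional pairwise symmetry check can be sketched for very small graphs. The code below is an illustration under simplifying assumptions, not the original program: every bond is filled (values ±1 only), the symmetry group is found by brute force over vertex permutations, and the symmetry check is applied once, after all bonds are added, instead of after each orbit. All function names are hypothetical, and the fully connected four-vertex graph is a made-up test case.

```python
from itertools import permutations

def automorphisms(vertices, edges):
    """Vertex mappings that preserve the undirected edge set."""
    edge_set = {frozenset(e) for e in edges}
    result = []
    for image in permutations(vertices):
        m = dict(zip(vertices, image))
        if {frozenset((m[u], m[v])) for u, v in edges} == edge_set:
            result.append(m)
    return result

def bond_action(mapping, edges):
    """Entry r is (s, sign): bond variable b_r is carried to sign * b_s."""
    index = {}
    for s, (i, j) in enumerate(edges):
        index[(i, j)] = (s, +1)
        index[(j, i)] = (s, -1)
    return [index[(mapping[i], mapping[j])] for i, j in edges]

def apply(action, config):
    """Image of a complete configuration under a signed bond permutation."""
    image = [0] * len(config)
    for r, (s, sign) in enumerate(action):
        image[s] = sign * config[r]
    return tuple(image)

def obeys_ice_rules(config, edges, vertices):
    """At most two donated and two accepted H-bonds per vertex."""
    out = dict.fromkeys(vertices, 0)
    inc = dict.fromkeys(vertices, 0)
    for value, (i, j) in zip(config, edges):  # zip stops at the partial length
        donor, acceptor = (i, j) if value > 0 else (j, i)
        out[donor] += 1
        inc[acceptor] += 1
    return all(out[v] <= 2 and inc[v] <= 2 for v in vertices)

def enumerate_topologies(vertices, edges):
    """Grow the list one bond at a time, discarding ice-rule violations."""
    configs = [()]
    for _ in edges:
        configs = [c + (o,) for c in configs for o in (+1, -1)
                   if obeys_ice_rules(c + (o,), edges, vertices)]
    return configs

def symmetry_distinct(configs, actions):
    """Conventional pairwise elimination: cost grows quadratically."""
    kept = []
    for c in configs:
        images = {apply(a, c) for a in actions}
        if not any(k in images for k in kept):
            kept.append(c)
    return kept

tri_v, tri_e = (1, 2, 3), [(1, 2), (2, 3), (1, 3)]
tri_all = enumerate_topologies(tri_v, tri_e)
tri_actions = [bond_action(m, tri_e) for m in automorphisms(tri_v, tri_e)]
tri_distinct = symmetry_distinct(tri_all, tri_actions)

k4_v = (1, 2, 3, 4)
k4_e = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
k4_all = enumerate_topologies(k4_v, k4_e)
k4_actions = [bond_action(m, k4_e) for m in automorphisms(k4_v, k4_e)]
k4_distinct = symmetry_distinct(k4_all, k4_actions)
print(len(tri_all), len(tri_distinct), len(k4_all), len(k4_distinct))
```

For the triangle the ice rules exclude nothing and the eight orientations reduce to two classes (cyclic and non-cyclic). For the four-vertex example the ice rules remove every orientation containing a triple donor or triple acceptor.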

In this section we demonstrate that, using graph invariants, the nominally $M^2$ process of symmetry comparison can be turned into a calculation that scales like $M \ln M$. The strategy is one of divide and conquer. The set of $M$ graphs is sorted into groups of target size $O$, such that graphs in different groups must be symmetry distinct. Within each group a conventional ($O^2$-scaling) symmetry comparison method is used. Since there are $M/O$ such groups, the total work associated with conventional symmetry comparison scales like $(M/O)\, O^2 = M O$, linear with the number of graphs. If sorting can be made to scale more efficiently than $M^2$, the overall efficiency of symmetry comparisons will be improved.

2.2.1 Sorting strategy

The value of all invariants of two symmetry related graphs must be identical. There- × fore, if we divide M graphs into groups, each one with a different value of a particu-

lar invariant, symmetry comparisons need only be done within each group. The work of ×

calculating the value of that particular invariant for all M graphs and sorting them into

£

Ø x

Œ ‹ groups scales like M , while the work of symmetry comparisons now scales like within

each group. After separation into × groups, the work of symmetry comparison scales like

£

x xmÙ

Ø Ø

Œ

× , × ‹ , an improvement by a factor in efficiency.

Instead of using just one invariant to sort the graphs into smaller subsets, imagine using Ú Ú different invariants to sort the graphs times. For simplicity, we will assume that the

graphs are sorted into × equal piles according to each new invariant. After employing the

first invariant the graphs are divided into × groups. Then the next invariant divides each of

£

Ú ×

those groups into × subgroups, making a total of groups. Finally, after such sorts, k

the graphs are partitioned into × groups of size

M

ON, 0

k (2.15) ×

The goal is to reduce the groups to a target size O which is small enough to employ a conventional symmetry comparison method. From Eq. 2.15, the number of sorts required

to reach a target size O is

v §

SU M O

Ú 0

, (2.16) SU×

26 Each time the graphs are sorted, an invariant is calculated for each of the graphs, and

that graph is either labelled or moved to another location in memory or on disk. The cost Ú

of each sort is proportional to M , the total number of graphs. The computational cost of

sorts is proportional to

v §

SU M O Ú

Mp,‚M (2.17) SU×

Associating a coefficient $a$ with the computational cost of sorting the graphs into groups of target size $n$, and another coefficient $b$ with the conventional symmetry comparison within each group, the total work of eliminating symmetry related graphs scales like

$$ a\,N\,\frac{\ln(N/n)}{\ln \nu} + b\,\frac{N}{n}\,n^2 = \frac{a}{\ln \nu}\,N \ln N + \left( b\,n - \frac{a \ln n}{\ln \nu} \right) N. \tag{2.18} $$

The total work contains components that scale as $N \ln N$ and as $N$, far more efficient than conventional $N^2$-scaling symmetry comparison.

We arrived at $N \ln N$ scaling by assuming that each sort breaks the graphs into groups of equal size. Actual computations are more complicated. The number of groups into which the graphs are sorted is the number of different values an invariant takes over the set of graphs. This varies from invariant to invariant, so invariants differ in their ability to resolve the graphs into smaller groups. Moreover, in each sort the graphs are, in general, broken into groups of unequal size. Therefore, the parameter $\nu$ used in Eqs. (2.15-2.18) must be taken as an average or effective number of groups. The basic idea is confirmed, and evidence presented below shows $N \ln N$ scaling in realistic calculations.
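The sort-then-compare strategy can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the enumeration code used in this work: configurations are rings of bond variables $\pm 1$, the symmetry group is taken to be cyclic rotation only, and the two invariants (the sum of the bond variables and the sum of products of neighboring bond variables) are simple stand-ins for first and second order graph invariants. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def related(a, b):
    """True if b is a cyclic rotation of a (the toy symmetry group)."""
    return any(a[s:] + a[:s] == b for s in range(len(a)))


def invariants(cfg):
    """Two rotation-invariant functions of the bond variables."""
    n = len(cfg)
    first = sum(cfg)
    second = sum(cfg[i] * cfg[(i + 1) % n] for i in range(n))
    return first, second


def prune(configs):
    """Keep one representative per symmetry class by direct comparison.
    Returns (representatives, number of symmetry comparisons)."""
    reps, comparisons = [], 0
    for c in configs:
        for r in reps:
            comparisons += 1
            if related(c, r):
                break
        else:
            reps.append(c)
    return reps, comparisons


def prune_with_sorting(configs):
    """Sort into groups by invariant values, then compare within groups."""
    groups = defaultdict(list)
    for c in configs:
        groups[invariants(c)].append(c)
    reps, comparisons = [], 0
    for members in groups.values():
        group_reps, group_comparisons = prune(members)
        reps.extend(group_reps)
        comparisons += group_comparisons
    return reps, comparisons


configs = list(product((-1, 1), repeat=8))
brute_reps, brute_cmp = prune(configs)
sorted_reps, sorted_cmp = prune_with_sorting(configs)
print(len(brute_reps), "classes,", brute_cmp, "comparisons without sorting")
print(len(sorted_reps), "classes,", sorted_cmp, "comparisons with sorting")
```

Symmetry-related configurations always share the same invariant values, so sorting never separates members of one class; it only removes comparisons between configurations that could not have been related.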

2.2.2 Performance of sorting algorithm for realistic calculations

We have previously enumerated the 30026 symmetry distinct H-bond topologies of the $(\mathrm{H_2O})_{20}$ dodecahedral clathrate [1]. After the addition of each H-bond to the structure, we determined which graphs were related to others by a symmetry operation and eliminated all but one representative from each set of symmetry related structures. If symmetry-related graphs can be eliminated from smaller sub-structures, then we avoid the work of adding and testing redundant new structures built from the symmetry related structures. The process of adding occupied bonds one at a time furnishes a data set on which we can compare different methods of graph enumeration. Our previous calculation [1] did use a crude version of what we now call invariants to speed up symmetry comparisons, but ultimately it was an $N^2$-scaling method because the graphs were not sorted as described in the previous section. The CPU time needed to eliminate symmetry-related duplicate structures is plotted against the number of graphs after each of the 30 bond additions in Fig. 2.3. The CPU time clearly increases faster than linearly with the number of graphs, and the old method even appears to scale with $N$ more steeply than $N^2$.

The same $(\mathrm{H_2O})_{20}$ dodecahedron calculation was repeated using graph invariants to sort the graphs until each group contained no more than $n =$ […] structures. The CPU time for sorting is plotted against the number of graphs in the top panel of Fig. 2.4. It is difficult to distinguish whether the computational cost is actually scaling as $N$ or $N \ln N$ (actually a happy state of affairs!), and clearly the calculation no longer scales as $N^2$ or worse. The CPU time for symmetry comparisons within the groups of size $n \le$ […] is shown in the bottom panel of Fig. 2.4, clearly compatible with linear scaling.

At the time we first published the enumeration of the 30026 H-bond topologies of the $(\mathrm{H_2O})_{20}$ dodecahedron [1] it was a challenging calculation. With the help of graphical invariants, this calculation is rather quick. The most challenging example we have tackled to date is the enumeration of the 8360361 symmetry-distinct H-bond topologies of a 48-water hexagonal unit cell from the ice-Ih lattice. There are 2404144962 H-bond topologies of this system, including symmetry-related configurations, indicating how valuable the symmetry-reduction can be.

[Figure 2.3 plot: CPU time/$n_{\mathrm{bond}}$ (seconds) versus $N$ (number of graphs), with fitted curves labelled $N$, $N^2$ and $N^3$.]

Figure 2.3: Enumeration of all symmetry-distinct H-bond topologies for a dodecahedral $(\mathrm{H_2O})_{20}$ was performed by considering a sequence of structures containing fewer bonds than the full dodecahedron. Additional H-bonds were added to the structures after all symmetry-related duplicates were eliminated. This process furnishes data on the computational cost of eliminating symmetry-related structures as a function of the number of graphs. The data shown is for the calculation as performed in Ref. [1], without the use of the sorting method introduced in this work. The computational cost per graph edge is plotted as a function of the number of graphs $N$ before symmetry comparisons were made. Least squares fits of CPU time to $N$, $N^2$ and $N^3$ clearly show that the computational cost scales as $N^2$ or worse without the sorting method.

[Figure 2.4 plots: two panels of CPU time/$n_{\mathrm{bond}}$ (seconds) versus $N$ (number of graphs), each with fitted curves labelled $N$, $N \ln N$ and $N^2$.]

Figure 2.4: Data for the same calculation as in Fig. 2.3, this time employing the sorting method introduced in this work. CPU time per graph edge for sorting the graphs is plotted against the number of graphs $N$ in the bottom panel. The total CPU time for symmetry comparisons within groups of size $n \le$ […] is shown in the top panel. Least squares fits clearly show that the computational cost scales as either $N$ or $N \ln N$ in each case, and definitely not like $N^2$ as in Fig. 2.3. On the basis of arguments presented in section 2.2 we expect $N \ln N$ scaling in the bottom panel and linear scaling in the top panel.

2.3 Correlation and prediction of physical properties from H-Bond topology using graphical invariants

The number of hydrogen-bonded arrangements for a given water framework, increasing exponentially with system size, quickly grows beyond the point where it is practical to calculate the energy, or other properties, of each structure by ab initio, or even semi-empirical or empirical potential methods. If physical properties correlate with H-bonding topology, then connecting physical properties with features of the H-bonding network provides a new means of understanding H-bonded structures, and a new route to predicting their properties based on limited input. Physical properties are themselves invariant to symmetry operations. If a correlation exists between physical properties and H-bonding topology, the graph invariants furnish the required connection.

2.3.1 $(\mathrm{H_2O})_6$ and $(\mathrm{H_2O})_{20}$

Our first test of the correlation between H-bond topology and physical properties is the $(\mathrm{H_2O})_6$ cage. Using the semiempirical PM3 method [61], Tissandier et al. recently determined optimized geometries of isomers corresponding to each of the 27 H-bond topologies possible for the $(\mathrm{H_2O})_6$ cage structure (Fig. 2.2), and calculated the energy and dipole moment at each optimized geometry. Where comparison could be made, they found the semiempirical energies consistent with previous ab initio results [62–66]. Here we take those results and test whether these properties correlate with H-bond topology, and, if so, how effective invariants are in fitting those properties. Of course, what we are testing is not dependent on the absolute accuracy of the PM3 method. As long as both PM3 properties and those from more accurate methods exhibit a similar level of dependence on H-bond topology, then PM3 properties can be used to gauge the effectiveness of our graph invariant technique.

Before discussing how graph invariants apply to the $(\mathrm{H_2O})_6$ cage, we pause to discuss the use of both filled and empty bonds in this case. The canonical orientations we arbitrarily choose for the cage structure are shown in Fig. 2.2. The actual bond orientations are specified relative to the canonical orientations. For example, for the physical configuration of H-bonds shown in Fig. 2.2, $b_{[\ldots]} = -1$ while $b_{[\ldots]} = +1$. The apical water molecules (top and bottom of the cage structure in Fig. 2.2) are only 2-coordinate. When these molecules are single-donor/single-acceptors, ab initio calculations on $(\mathrm{H_2O})_6$ have shown there are two minimum energy positions of the apical waters that arise from these molecules accepting a hydrogen bond at either of their two lone pairs. The isomers that arise in this case are conveniently enumerated by adding two “ghost” atoms for each of the 2-coordinate waters, to which the 2-coordinate waters can be treated as donating a hydrogen bond [31]. Of course, we only allow the ghost atoms to be H-bond acceptors, not donors. Bonds 9, 10, 11 and 12 in Fig. 2.2 involve ghost atoms. The variables $b_9$, $b_{10}$, $b_{11}$ and $b_{12}$ can sometimes take the value 0, while $b_1$–$b_8$ only assume the values $\pm 1$.

The vertex and bond permutation group of the $(\mathrm{H_2O})_6$ cage was taken to be isomorphic to the $D_{2d}$ point group, although each of the 27 isomers is distorted from perfect $D_{2d}$ symmetry. As discussed earlier in section 2.1, use of graph theory and invariants for the $(\mathrm{H_2O})_6$ cage in no way requires that the isomers have $D_{2d}$ symmetry. We regard the value of the energy or other physical properties for each of the isomers as a 27-dimensional vector that we wish to express, in a least squares sense, as a linear combination of several 27-dimensional vectors which contain the value of one of the invariants of Table 2.2 for each of the 27 isomers. There are a total of 18 first and second order invariants, 2 first and 16 second order, for the $(\mathrm{H_2O})_6$ cage structure. However, the number of independent vectors from among the first and second order invariants available to fit physical properties turned out to be much less than 18. Because of constraints among the bonds imposed by the ice rules, some invariants evaluate to be linearly dependent on others. Second order invariants with repeated indices ($I_{rr}$) merely count the number of filled bonds ($b_r \neq 0$) within a group theoretical orbit or cycle. Some cycles contain bonds which are always filled bonds, and therefore they give rise to an invariant that is represented by a 27-dimensional vector all of whose components are identical. The existence of more than one such “bond-counting” invariant is another source of linear dependence. For the $(\mathrm{H_2O})_6$ cage structure, the sources of linear dependence just mentioned conspire to limit the number of independent first and second order invariants available to fit physical properties to 9.
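Counting the independent invariants amounts to finding the rank of the set of vectors, one per invariant, that hold the invariant's value for each isomer. The following is a minimal sketch of that count using exact rational arithmetic. The four short vectors are invented for illustration (two of them are constant “bond-counting” vectors) and are not the invariants of Table 2.2.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, by Gaussian elimination
    in exact rational arithmetic."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


# Hypothetical invariant values over four isomers.
alternating = [1, -1, 1, -1]
count_a = [2, 2, 2, 2]      # "bond-counting" invariant: same value for every isomer
count_b = [3, 3, 3, 3]      # a second bond-counting invariant
combination = [3, 1, 3, 1]  # equals alternating + count_a

print(rank([alternating, count_a, count_b, combination]))
```

Only two of the four vectors are independent here: the two constant vectors are proportional to each other, and the last is a sum of two others.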

The ability of the first and second order invariants to capture the trends in energy and squared total dipole for the $(\mathrm{H_2O})_6$ cage is confirmed in Fig. 2.5. The root-mean-squared deviation of the invariant fit from the actual energy is […] kcal/mol and the maximum deviation is […] kcal/mol, compared to a range of […] kcal/mol between least and most stable isomers. Specifically, the 9-invariant fit portrayed in Fig. 2.5 is given as

[Eq. (2.19): the fitted energy $E$ as a linear combination of nine first and second order invariants $I_r$ and $I_{rs}$; the numerical coefficients and indices are not legible.]

where the invariants $I_r$, $I_{rs}$ for the $(\mathrm{H_2O})_6$ cage are defined in Table 2.2. Since there are linear dependencies among the invariants, the linear fit in Eq. (2.19) could be written in many different ways, although the fit itself is uniquely defined. The root-mean-squared and maximum deviations for the squared dipole moment are […] D$^2$ and […] D$^2$ respectively, compared to a range of […] D$^2$ between least and greatest squared dipole moment. Further invariants – third, fourth, and even higher order – could have been used to fit physical properties. It is enlightening that physical properties of the $(\mathrm{H_2O})_6$ cage do correlate well with the H-bond topology, and encouraging that first and second order invariants of the H-bond topology can reasonably capture the trends in physical properties. Actually, in Fig. 2.5 we show that only 4 out of the 9 available first and second order invariants are really important in describing the cluster energy.

Similar encouraging results are obtained upon examination of a dodecahedral cage of 20 water molecules. We have calculated the optimized geometry, and energy at that geometry, for each of the 30026 symmetry-distinct isomers of the $(\mathrm{H_2O})_{20}$ dodecahedron using the OSS2 empirical potential [2]. The data in Fig. 2.6 confirms that the energies of $(\mathrm{H_2O})_{20}$ isomers are correlated with H-bond topology, and that the trend is captured well by the 7 linearly independent second order graph invariants for the dodecahedron. (All first order invariants for the dodecahedron are identically zero. There are 8 second order invariants, but one linear dependence among this set caused by factors mentioned above in the discussion of the $(\mathrm{H_2O})_6$ cage.) The energy as a linear combination of the 7 independent graph invariants was determined by least squares fit using 20 isomers randomly selected from the full set of 30026 as a training set. Fig. 2.6 shows that the second order graph invariants can be effectively employed to predict the energy of the remaining isomers based on data from a small training set. The fit could be further improved by using still more isomers in the training set, or by including third and higher order invariants. We did not pursue these refinements since even the lowest level calculation seemed quite adequate for estimating thermal properties or selecting candidates for the lowest energy isomer.
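The training-set procedure can be sketched as follows. This is an illustration on synthetic data, not the fit reported here: the invariant values and the “true” coefficients are made up, the energies are constructed to be exactly linear in the invariants, and the least squares problem is solved through the normal equations with a small hand-written solver so that the example needs only the standard library.

```python
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit(rows, energies):
    """Least squares coefficients from the normal equations X^T X c = X^T E.
    A constant column is prepended to set the zero of the energy scale."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    XtX = [[sum(x[a] * x[b] for x in X) for b in range(p)] for a in range(p)]
    XtE = [sum(x[a] * e for x, e in zip(X, energies)) for a in range(p)]
    return solve_linear(XtX, XtE)


def predict(coeffs, row):
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


rng = random.Random(0)
true_coeffs = [5.0, 0.7, -1.2, 0.3]  # made-up constant and invariant coefficients
isomers = [[rng.randint(-4, 4) for _ in range(3)] for _ in range(500)]
energies = [predict(true_coeffs, row) for row in isomers]

# Fit on 20 randomly chosen isomers, then predict all of them.
training = rng.sample(range(len(isomers)), 20)
coeffs = fit([isomers[i] for i in training], [energies[i] for i in training])
worst = max(abs(predict(coeffs, r) - e) for r, e in zip(isomers, energies))
print("fitted coefficients:", [round(c, 6) for c in coeffs])
print("largest prediction error:", worst)
```

With real energies the relation is only approximately linear, so the prediction error on isomers outside the training set measures how well the chosen invariants capture the trend.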

In our original work on the $(\mathrm{H_2O})_{20}$ dodecahedron [1] we noted that the energy of various isomers largely depended on a single topological feature, the number of nearest neighbor pairs of double acceptor waters in the clathrate structure. One hydrogen of each of the 10 double acceptor waters is a “dangling” hydrogen and does not participate in a H-bond. Therefore, the energy of the dodecahedral clathrate is determined by the number of nearest neighbor dangling hydrogen pairs. The number of such pairs is indeed a topological feature captured by graph invariants of the dodecahedron at second order. Therefore, the success of the second order graph invariants in Fig. 2.6 is not surprising. However, the procedure used in this work is entirely automatic. It can be applied in situations (we have ice-Ih in mind) where a serendipitous discovery of an important topological feature, like the dangling hydrogen pairs of the dodecahedron, is lacking. The graph invariants provide a non-serendipitous route for discovery of such important topological features.

[Figure 2.5 plots: upper panel, invariant fit versus PM3 energy (kcal/mol); lower panel, invariant fit versus PM3 $\mu^2$ (D$^2$).]

Figure 2.5: In the upper panel, we test the degree to which the energies of the 27 isomers of the $(\mathrm{H_2O})_6$ cage are correlated with H-bond topology, and the effectiveness of graphical invariants in capturing that trend. The $x$-coordinate is the energy of the isomers using the PM3 semi-empirical theory. The $y$-coordinate is the result of a least squares fit to these energies using all 9 linearly independent first and second order graph invariants (filled symbols), or just 4 out of those 9 that proved to be most important (open symbols). If the fit were perfect all points would lie along the straight line. The lower panel exhibits the fit of the squared dipole moment to 9 linearly independent first and second order invariants.

[Figure 2.6 plot: invariant fit versus OSS2 energy (kcal/mol).]

Figure 2.6: This plot evaluates the degree to which the energies of the 30026 isomers of the dodecahedral $(\mathrm{H_2O})_{20}$ cage are correlated with H-bond topology, and the effectiveness of graphical invariants in capturing that trend. The $x$-coordinate is the energy of the isomers using the OSS2 empirical potential [2]. The $y$-coordinate is the result of a least squares fit to these energies using the 7 linearly independent second order graph invariants. If the fit were perfect all points would lie along the straight line. A training set of only 20 randomly selected configurations was used to parameterize the energy as a linear combination of invariants.

2.3.2 Using invariants to calculate phase transitions: the $(\mathrm{H_2O})_{20}$ dodecahedron as a dry run

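The potential energy surface for ice or cold water clusters consists of a number of deep minima, each corresponding to a different hydrogen bond topology. Working within the framework of classical statistical mechanics, the classical configuration integral for these systems can be written as a sum of contributions from each of the $M$ symmetry-distinct local minima of the potential energy surface [67–70].

$$ Z_{\mathbf{N}} = \int d\mathbf{r}^{\mathbf{N}}\, e^{-\beta V(\mathbf{r}^{\mathbf{N}})} = \sum_{i=1}^{M} f_i\, e^{-\beta E_i} \int_{\Omega_i} d\mathbf{r}^{\mathbf{N}}\, e^{-\beta \left[ V(\mathbf{r}^{\mathbf{N}}) - V(\mathbf{r}_i^{\mathbf{N}}) \right]} \tag{2.20} $$

We use a bold face $\mathbf{N}$ to stand for $(N_{\mathrm{H}}, N_{\mathrm{O}})$, the number of hydrogen and oxygen atoms. The position of the atoms at the $i$th local minimum is denoted as $\mathbf{r}_i^{\mathbf{N}}$, $\Omega_i$ is a $3(N_{\mathrm{H}} + N_{\mathrm{O}})$-dimensional integration domain about the $i$th minimum, $E_i \equiv V(\mathbf{r}_i^{\mathbf{N}})$ is the potential energy at the $i$th minimum, and $f_i$ is the number of symmetry-related configurations which are represented by one symmetry-distinct configuration. The canonical partition function of the system is given as

$$ Q_{\mathbf{N}} = \frac{1}{\Lambda^{3\mathbf{N}}\, \mathbf{N}!} \sum_{i=1}^{M} f_i\, e^{-\beta E_i} \int_{\Omega_i} d\mathbf{r}^{\mathbf{N}}\, e^{-\beta \left[ V(\mathbf{r}^{\mathbf{N}}) - V(\mathbf{r}_i^{\mathbf{N}}) \right]} \tag{2.21} $$

$$ \equiv \sum_{i=1}^{M} f_i\, e^{-\beta \left( E_i + A_{\mathrm{vib},i} \right)}. \tag{2.22} $$

In keeping with the notation of Eq. (2.20), $\Lambda^{3\mathbf{N}}\, \mathbf{N}!$ stands for $\Lambda_{\mathrm{H}}^{3N_{\mathrm{H}}} \Lambda_{\mathrm{O}}^{3N_{\mathrm{O}}}\, N_{\mathrm{H}}!\, N_{\mathrm{O}}!$.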

The contribution of each isomer to the partition function is determined by the potential energy $E_i$ of the isomer, and an integral over “vibrational” or “phonon” fluctuations about the $i$th local minimum of the potential energy surface whose contribution we call the vibrational free energy $A_{\mathrm{vib},i}$. In a harmonic approximation, $V(\mathbf{r}^{\mathbf{N}}) - V(\mathbf{r}_i^{\mathbf{N}})$ would be taken as a quadratic function in deviations from $\mathbf{r}_i^{\mathbf{N}}$ and the range of integration over $\Omega_i$ could be safely extended to all space. It also might be a reasonable assumption to replace $A_{\mathrm{vib},i}$ by an average value $A_{\mathrm{vib}}$ for each of the isomers, in which case $Q_{\mathbf{N}} \approx e^{-\beta A_{\mathrm{vib}}} \sum_{i=1}^{M} f_i\, e^{-\beta E_i}$.

Consider now the calculation of the average energy as a function of temperature. Under the framework of classical statistical mechanics, by equipartition the vibrational contribution to the average configurational energy is $\frac{1}{2} k_B T$ for each degree of freedom, regardless of how the spectrum of harmonic frequencies may change among the isomers. At this level of approximation, the average energy $\langle E \rangle$ and the heat capacity $C_V$ are unaffected by the details of vibrational structure. Classical statistical mechanics may be acceptable, even for the ice-Ih/ice-XI transition: the heat capacity peak assigned to this transition only shifts from 72 K to 76 K upon replacing $\mathrm{H_2O}$ by $\mathrm{D_2O}$ [42]. If the vibrational contribution to the heat capacity does not vary widely among the isomers, then an expression in the same form as Eq. (2.22) applies even in the quantum regime.
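When the vibrational free energy is replaced by a common average, the configurational heat capacity follows from the energy fluctuations of the sum over isomers in Eq. (2.22). The sketch below illustrates this under that assumption. The function name, the three energy levels and their degeneracies are hypothetical; the value of $k_B$ is the molar gas constant expressed in kcal/(mol K).

```python
import math

K_B = 0.0019872041  # Boltzmann constant per mole, kcal / (mol K)


def heat_capacity(energies, degeneracies, temperature):
    """Configurational heat capacity from energy fluctuations,
    C_V = (<E^2> - <E>^2) / (k_B T^2),
    with isomer weights f_i * exp(-E_i / (k_B T))."""
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)  # shift energies to avoid overflow; C_V is unchanged
    shifted = [e - e_min for e in energies]
    weights = [f * math.exp(-beta * e) for e, f in zip(shifted, degeneracies)]
    q = sum(weights)
    mean = sum(w * e for w, e in zip(weights, shifted)) / q
    mean_sq = sum(w * e * e for w, e in zip(weights, shifted)) / q
    return (mean_sq - mean * mean) / (K_B * temperature ** 2)


# Hypothetical isomer energies (kcal/mol) and symmetry degeneracies.
energies = [0.0, 0.1, 1.2]
degeneracies = [1, 10, 1000]
for T in (25.0, 50.0, 100.0, 300.0, 600.0):
    print(f"T = {T:6.1f} K   C_V = {heat_capacity(energies, degeneracies, T):.5f} kcal/(mol K)")
```

A few low-lying levels well separated from a large number of higher states give a small low-temperature feature and a larger peak at higher temperature, which is the qualitative pattern a sum of this form can produce.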

Calculating the $E_i$ and $A_{\mathrm{vib},i}$ for all possible H-bond topologies is a daunting task, yet it is what would be needed to, say, predict proton ordering phase transitions in ice-Ih. The graph invariants we introduce in this work provide a way to circumvent the need to calculate all the $E_i$ and $A_{\mathrm{vib},i}$. Provided these quantities can be fit, using a relatively small training set, as a linear combination of graph invariants, we would have all information in hand to calculate thermodynamic properties and phase transitions. This idea is tested here for a hypothetical, unphysical situation: We calculate proton ordering phase transitions for the $(\mathrm{H_2O})_{20}$ dodecahedron. This calculation is not experimentally relevant because it may never be possible to study the $(\mathrm{H_2O})_{20}$ dodecahedral cluster, and it is even more unlikely that its properties could be obtained as a function of temperature. At higher energies the dodecahedron will transform into a more stable isomer or dissociate into smaller fragments. The calculation of proton ordering transitions in the $(\mathrm{H_2O})_{20}$ dodecahedron is presented here as a dry run for an especially physically relevant situation, proton ordering transitions in ice.

The top panel of Fig. 2.7 shows the heat capacity of the $(\mathrm{H_2O})_{20}$ dodecahedron calculated using the energy of all 30026 isomers in Eqs. (2.21-2.22), which were optimized using the OSS2 potential energy surface [2]. These curves are labelled as “exact” in Fig. 2.7, but they are only exact under the reasonable assumptions (harmonic fluctuations about local minima of the potential energy surface or similar $A_{\mathrm{vib},i}$ among the many isomers) and unreasonable assumptions (that the $(\mathrm{H_2O})_{20}$ dodecahedron will reach several hundred degrees Kelvin without transforming into another structure or dissociating into smaller fragments) that underlie the application of Eqs. (2.21-2.22) to dodecahedral $(\mathrm{H_2O})_{20}$. All these assumptions would be reasonable for ice-Ih. The fit labelled “converged fit” in Fig. 2.7 is the heat capacity derived from cluster energy expressed as a linear combination of second order invariants with the linear coefficients determined by least squares fit to all 30026 isomers of the dodecahedron.

The curves in Fig. 2.7 indicate that there are the cluster analogs of two phase transitions for this model system, one near 50 K and another near 600 K. The fit using only 7 second order invariants (recall that all first order invariants of the dodecahedron are zero) accurately describes the main heat capacity peak. The smaller peak at low temperature is slightly displaced. This peak is more difficult to reproduce because it involves relatively few configurations. Using higher order invariants would narrow the gap between the “exact” and “converged fit”. Of course, even if an ensemble of $(\mathrm{H_2O})_{20}$ dodecahedra could be prepared, the high temperature transition would certainly never be observed since these clusters would fragment long before the temperature reached this value.

[Figure 2.7 plots: two panels of $C_V$ (kcal/mol K) versus $T$ (K), from 0 to 3000 K; curves in the top panel are labelled “exact” $C_V$ and “converged fit”.]

Figure 2.7: Configurational energy and heat capacity of a model $(\mathrm{H_2O})_{20}$ dodecahedral cluster. The “exact” curve in the top panel is calculated from the partition function in Eqs. (2.21-2.22) using the energy of all 30026 isomers of the $(\mathrm{H_2O})_{20}$ dodecahedron. The vibrational contribution to the energy, which would only add a linear term to the average energy and a constant to the heat capacity under the assumption of harmonic fluctuations about each local minimum, is not included. The curve labelled “converged fit” is obtained using an arbitrarily large number of isomers in the training set. It represents the best fit possible using only second order invariants. The curves with thin lines in the bottom panel give the results of the invariant fitting procedure, as fully explained in the text. The heat capacity was calculated using only 30 of the 30026 isomers as input for a fit to invariants, after which the energies $E_i$ in Eqs. (2.21-2.22) were calculated from the fit. To portray the variability arising from fitting to randomly selected points, we give results for 9 independent trials.

Next, just 20 randomly selected isomers from the full set of 30026 were used as an initial training set to parameterize the energies as a function of graph invariants. The resulting fit gave a good representation of the overall behavior of the energy and heat capacity, including the heat capacity peak around 600 K, and indicated some structure in the low ($<$ […] K) energy range. However, it gave a poor representation of the low energy peak in the heat capacity. This is not surprising, since the H-bond isomers range in energy up to […] kcal/mol, or […] K, above the lowest energy isomer. Therefore, the structure at low temperature involves an extremely small fraction of the isomers. We added 10 more randomly selected isomers to the fitting set, this time chosen from isomers that, according to the initial fit, were within 350 K of the lowest energy configuration. The resulting fit, obtained by randomly sampling from 0.1% of the total number of isomers, reproduces the qualitative features of the heat capacity function. We repeated the entire procedure 9 times and show the results as the thin lines in the bottom panel of Fig. 2.7 to illustrate the errors incurred at this level. It is very encouraging that using the lowest order non-vanishing invariants and a sparse sample of configurations gives a qualitatively correct description of the 30026 H-bond isomers of the $(\mathrm{H_2O})_{20}$ dodecahedron.

The energy difference among the H-bond isomers is about […] in the OSS2 empirical potential and […] under B3LYP/DFT. See chapter 4 for a detailed discussion. For a quick reference, see Figure 4.2.

2.4 Discussion

In the past, it has seemed natural to assume some correlation between the H-bond topology and physical properties like energy and dipole moment [26, 32–35]. Graph invariants provide a means to quantify the correlation between H-bond arrangement and physical properties, and systematize the process of determining which features of the H-bond network most affect the energy and other physical properties. Such correlations would prove exceedingly useful in resolving outstanding questions regarding thermal properties and possible phase transitions in common ice-Ih. The verdict on the degree to which H-bond topology predicts energy in ice must await further investigation. This work furnishes the tools needed to analyze such putative relationships. We also provide a highly efficient method for either exhaustive or selective enumeration of H-bond topologies of clusters and crystals.

The utility of graph invariants rests on the extent to which physical properties depend on H-bond topology. The basic idea underlying the formalism is that scalar physical properties, like energy or magnitude of the dipole moment, are invariant to symmetry transformations. If two H-bonding arrangements are symmetry related, then physical properties associated with these arrangements must also be identical. For example, the association might be pictured as arising from geometry optimization from a high-symmetry initial structure.

The symmetry properties of the oriented graph and the geometry-optimized physical isomers may be different. Remarkably, as long as symmetry-equivalent initial structures optimize to the same distorted structure, invariants based on the high-symmetry structure are still appropriate for describing the relationship between hydrogen bonding topology and physical properties of the distorted final structures.

2.5 Appendix: Necessary and sufficient condition for a graphical invariant to be identically zero

Here we demonstrate a necessary and sufficient condition for a graphical invariant to be identically zero. We first show that a sufficient condition for the first order invariant,

$$ I_r = C_r \sum_{\alpha=1}^{G} g_\alpha(b_r), \tag{2.23} $$

to vanish is the existence of a group element $g_\alpha$ that takes the bond $b_r$ into minus itself,

$$ g_\alpha(b_r) = -b_r. \tag{2.24} $$

Apply the projection operator for the totally symmetric representation as in Eq. 2.3 to both sides of the above equation.

$$ C_r \sum_{\beta=1}^{G} g_\beta\, g_\alpha(b_r) = -C_r \sum_{\beta=1}^{G} g_\beta(b_r) = -I_r \tag{2.25} $$

Let $g_\gamma(b_r) = g_\beta\, g_\alpha(b_r)$ and realize that, from the group requirement of a unique inverse, no two elements $g_\beta$ can give rise to the same $g_\gamma$. It follows that

$$ I_r = C_r \sum_{\gamma=1}^{G} g_\gamma(b_r) = -C_r \sum_{\beta=1}^{G} g_\beta(b_r) = -I_r \tag{2.26} $$

and therefore the existence of a group element that takes $b_r$ into minus itself is a sufficient condition for $I_r = 0$. To show that this is a necessary condition note that if $I_r$ vanishes identically, then the result of applying the projector of Eq. 2.3 onto $b_r$ can be grouped into pairs of terms that cancel each other,

$$ I_r = C_r \sum_{\alpha=1}^{G} g_\alpha(b_r) = C_r \left[ \ldots + g_\beta(b_r) + g_\gamma(b_r) + \ldots \right] = C_r \left[ \ldots + b_s - b_s + \ldots \right], \tag{2.27} $$

for if $g_\beta(b_r) = b_s$ there must be another group operation $g_\gamma$ such that $g_\gamma(b_r) = -b_s$ if $I_r$ vanishes identically. However, if $g_\gamma(b_r) = -g_\beta(b_r)$, then there must be a group element $g_\alpha$ with the property that

$$ g_\alpha(b_r) = g_\beta^{-1}\, g_\gamma(b_r) = -b_r. \tag{2.28} $$

This shows that condition (2.24) is both necessary and sufficient for $I_r$ to vanish identically.

The same reasoning can be applied to show that any invariant $I_{rst\cdots}$ vanishes identically if and only if there exists a group element $g_\alpha$ such that

$$ g_\alpha(b_r b_s b_t \cdots) = -b_r b_s b_t \cdots. \tag{2.29} $$

Condition (2.29) is sufficient because applying the projector of the totally symmetric representation to both sides of (2.29) leads to

$$ I_{rst\cdots} = C_{rst\cdots} \sum_{\beta=1}^{G} g_\beta\, g_\alpha(b_r b_s b_t \cdots) = -C_{rst\cdots} \sum_{\beta=1}^{G} g_\beta(b_r b_s b_t \cdots) = -I_{rst\cdots}. \tag{2.30} $$

Condition (2.29) is necessary because if $I_{rst\cdots}$ vanishes it must be possible to group its terms into pairs that cancel each other.

$$ I_{rst\cdots} = C_{rst\cdots} \sum_{\alpha=1}^{G} g_\alpha(b_r b_s b_t \cdots) = C_{rst\cdots} \left[ \ldots + g_\beta(b_r b_s b_t \cdots) + g_\gamma(b_r b_s b_t \cdots) + \ldots \right] = C_{rst\cdots} \left[ \ldots + b_{r'} b_{s'} b_{t'} \cdots - b_{r'} b_{s'} b_{t'} \cdots + \ldots \right] \tag{2.31} $$

Therefore there must exist a group element $g_\alpha$ that satisfies condition (2.29).
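The first order condition can be checked on a small example. The sketch below is a toy model, not one of the systems studied here: the bonds form an oriented ring, each group element is stored as a signed permutation of the bonds, and the projection is left unnormalized. The function names are hypothetical. With rotations only, no element reverses a bond and the invariant is non-zero; adding reflections, which reverse every bond, provides an element that takes each bond into minus itself and the invariant vanishes.

```python
def ring_group(n, with_reflections):
    """Signed permutations of the n bonds of an oriented ring.
    Bond i points from vertex i to vertex i + 1 (mod n).
    Each element is a list whose entry i is (target bond, sign)."""
    elements = []
    for s in range(n):
        # Rotation by s: bond i -> bond i + s, direction preserved.
        elements.append([((i + s) % n, +1) for i in range(n)])
        if with_reflections:
            # Vertex map v -> s - v: bond i -> bond s - i - 1, direction reversed.
            elements.append([((s - i - 1) % n, -1) for i in range(n)])
    return elements


def first_order_invariant(group, r, n):
    """Unnormalized projection sum over g of g(b_r), returned as the
    coefficients multiplying b_0 ... b_{n-1}."""
    coeffs = [0] * n
    for g in group:
        target, sign = g[r]
        coeffs[target] += sign
    return coeffs


def maps_to_minus_itself(group, r):
    """True if some group element takes b_r into -b_r (condition 2.24)."""
    return any(g[r] == (r, -1) for g in group)


n = 4
rotations = ring_group(n, with_reflections=False)
dihedral = ring_group(n, with_reflections=True)

print(first_order_invariant(rotations, 0, n), maps_to_minus_itself(rotations, 0))
print(first_order_invariant(dihedral, 0, n), maps_to_minus_itself(dihedral, 0))
```

In both cases the invariant vanishes exactly when an element mapping the bond to minus itself exists, as the argument above requires.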

CHAPTER 3

GRAPH INVARIANTS FOR PERIODIC SYSTEMS: PREDICTING PHYSICAL PROPERTIES FROM THE HYDROGEN BOND TOPOLOGY OF ICE

Hydrogen bond (H-bond) order and disorder in ice-Ih is an old problem that is still the subject of controversy. Since the work of Pauling, Giauque and Stout in the 1930’s [17, 18], it is believed that the protons in ice-Ih are disordered, subject to the constraints of the Bernal-Fowler “ice rules” [53], that each water donates to two H-bonds and accepts from two other H-bonds. The number of allowed H-bonding arrangements in an infinite periodic lattice has been well-established [18, 24, 25]. However, conflicting results for periodically replicated unit cells, candidates for a possible low-temperature ordered phase of ice, have appeared in the literature [21, 56, 57]. Small energy differences between H-bond arrangements in ice may induce a phase transition to a proton-ordered crystal, although it is likely that, under normal conditions, the transition is kinetically inaccessible. Most experimental reports center around a first-order transition at 72 K to a ferroelectric structure [19, 20, 40–46]. However, a substantially different transition temperature has been reported [52], and the ferroelectric nature of the low temperature structure has been questioned as well [49, 71].

The purpose of this work is to extend our graph theoretical techniques to periodically replicated systems, providing analytic tools to address the issues mentioned above. We develop the concept of graph invariants for periodically replicated systems. Graph invariants are functions of H-bond variables which are invariant to the symmetry operations of the system. In other words, graph invariants are symmetry-adapted functions that are appropriate for describing how the scalar properties, such as the total energy, depend on the arrangement of H-bonds.

Graph invariants serve two important functions: 1) They enable the enumeration of all symmetry distinct H-bond arrangements by changing it from an $\mathcal{O}(N^2)$ to an $\mathcal{O}(N \ln N)$ problem, and 2) they provide a means to systematically parameterize physical properties of water cluster or ice configurations that differ in their H-bond topology. In this chapter we use graph invariants to facilitate the listing of all the hydrogen bonding arrangements accessible in several unit cells of the ice-Ih lattice. We resolve some discrepancies in the literature [21, 56, 57] for small unit cells. We also provide results for larger unit cells containing billions of different H-bond arrangements which are of sufficient size to serve as simulation cells for statistical calculations. With this data in hand, we analyze properties of H-bonding in ice-Ih that depend exclusively on the topological constraints of the crystal lattice and satisfying the ice rules.

In chapter 2, we demonstrated that graph invariants furnish a very useful set of symmetry-adapted functions for capturing the dependence of scalar physical properties (energy, magnitude of the dipole moment, ...) on the arrangement of H-bonds. We also developed a hierarchy of invariants – first order, second order, and higher order invariants – and showed that physical properties of water clusters could be described quite well in terms of the simplest of the invariants, the first and second order invariants. The first order invariants are often identically zero, as they are for the ice-Ih lattice. Using graph invariants to parameterize the dependence of cluster energy on its H-bond topology, we successfully described the energy of 30026 different H-bond isomers of a $(\mathrm{H_2O})_{20}$ cluster, spanning a range of […] kcal/mol, with only 7 numbers, one of which sets the zero of the energy scale. In this chapter, we show how the same procedure may be adopted for crystal lattices. At the same level of approximation that was successful for clusters, we show here that on the order of […] numbers should parameterize the energies and other scalar physical quantities of the billions of H-bond isomers that are possible in a large unit cell of ice-Ih.

The implication of these results is that a relatively small number of calculations should suffice to predict the lowest energy structure and phase transitions in ice-Ih. This is significant because the energies of different H-bond arrangements in ice are closely spaced and it is difficult to predict their energetic ordering. Buch, Sandler and Sadlej have demonstrated that empirical water potentials give inconsistent predictions of the relative stability of these arrangements [21]. This information will have to be obtained from costly periodic electronic structure calculations. It is currently not feasible to perform electronic structure calculations for every symmetry-distinct H-bond arrangement for a unit cell large enough for statistical simulations. Even such calculations for a Monte Carlo sample would be quite taxing. Instead of such expensive routes, our techniques offer the possibility of extracting parameters from calculations on small unit cells and bootstrapping to estimate the energy of H-bond arrangements in much larger cells. To accomplish this task, we require the relationship between the graph invariants of small unit cells and those of large unit cells, which is developed in this chapter.

The full formalism requires some rather abstract expressions to describe what are, nevertheless, basic concepts. Therefore, in section 3.1 we provide a “gentle introduction” to graph invariants, using an artificial two-dimensional “square ice” lattice as an example to illustrate the basic ideas with a minimum of formalism. In section 3.2 we precisely define graph invariants and show how they are generated by group theoretical projection operators (section 3.2.1). In section 3.2.2 we develop the relation between graph invariants for small and large unit cells, the key concept that permits the physical properties of large unit cells to be parameterized by calculations on small unit cells. The concepts are illustrated in section 3.2.3 by returning to the example of “square ice”. The reader not interested in the details of the formalism can gather the basic ideas by reading sections 3.1 and 3.2.3.

Section 3.3 contains applications to ice-Ih. Graph invariants for the common 8-water orthorhombic unit cell of ice-Ih are presented in section 3.3.1 and section 3.5. Graph invariants also provide an efficient means for complete enumeration of all symmetry-distinct H-bond arrangements for either a cluster or periodic system. Applications to ice-Ih are presented in section 3.3.2. Graph invariants permit complete enumeration of unit cells large enough to approximate the infinite-system limit. Our largest example is a 48-water hexagonal unit cell for which there are 2404144962 possible H-bond arrangements permitted by the ice rules, of which 8360361 are symmetry-distinct. In section 3.3.3 we analyze selected statistical properties of these arrangements. Finally, we present some concluding remarks in section 3.4.

3.1 A gentle introduction to oriented graphs and graph invariants

Each H-bond in ice or water clusters consists of a hydrogen covalently bonded to one oxygen, the donor, and hydrogen-bonded to a second oxygen, the acceptor. Hence H-bonds are directional, and are conventionally taken to point from donor to acceptor. Proton arrangements in ice are summarized by oriented graphs, a set of vertices linked by directed edges [54,72,73]. The symbol $b_r$ stands for the orientation of the $r$th H-bond with respect to a canonical orientation: $b_r = +1$ if the H-bond points in the same direction as the canonical orientation, $b_r = -1$ if the direction is opposite.

To illustrate the theory, let us take a simple example, “square ice”, which, like ordinary ice, consists of 4-coordinate water molecules. (Of course, applications to the real ice-Ih lattice are presented below in section 3.3.) Part of the square ice lattice and the direction of bonds, all in an arbitrarily chosen canonical bond orientation, are shown in Fig. 3.1. Six possible graphs within the $2\times2$ unit cell of square ice, shown in Fig. 3.2, when periodically replicated realize an H-bond topology in agreement with the Bernal-Fowler ice rules. The 8 bonds of the $2\times2$ unit cell are given an arbitrary index ranging from 1 to 8, as indicated in graph A of Fig. 3.2. The values of the bond variables $b_1, b_2, \ldots, b_8$ for the graphs in Fig. 3.2 are given in Table 3.1.

Figure 3.1: A square ice lattice used to illustrate graph invariants. The molecular configuration, shown on the left, is summarized by the directed graph appearing on the right. The H-bond arrangement shown here is adopted as the canonical bond orientation. Other periodic H-bond arrangements are possible, as illustrated in Fig. 3.2.

Figure 3.2: Graphs which lead to periodic H-bond patterns satisfying the Bernal-Fowler ice rules in the square ice lattice depicted in Fig. 3.1. In graph A the bonds are arranged in their canonical orientation, the same one shown in Fig. 3.1. The eight bonds associated with the $2\times2$ unit cell are numbered according to the scheme indicated on graph A. In some graphs the bonds associated with unit cells neighboring the primary unit cell are shown to make it more apparent how the orientations of complete water molecules are indicated by the graphs. For example, in graph B the periodic image of bond 4 is actually drawn to the left of bond 3. In graph B bond variables $b_1$, $b_2$, $b_5$, $b_6$, $b_7$ and $b_8$ all have value $+1$, while bonds $b_3$ and $b_4$ have value $-1$, all defined relative to the canonical orientations of graph A.

Some of the graphs shown in Fig. 3.2 are related to each other by a symmetry operation. Graph D is obtained from graph A by either a $C_4$ rotation or reflection operation. Therefore, the energy and other scalar properties of the two configurations should be identical. The same is true for graphs B and E, and graphs C and F. If the energy depends on the topological features of the H-bond network, then it must depend upon functions of the bond variables $b_r$ that are identical for configurations related by a symmetry operation.

graph                    A    B    C    D    E    F
$b_1$                    1    1    1    1    1   -1
$b_2$                    1    1   -1    1    1    1
$b_3$                    1   -1   -1    1    1    1
$b_4$                    1   -1    1    1    1   -1
$b_5$                    1    1   -1   -1    1    1
$b_6$                    1    1    1   -1   -1   -1
$b_7$                    1    1    1   -1    1   -1
$b_8$                    1    1   -1   -1   -1    1
$I^{2\times2}_{1,3}$     1    0   -1    1    0   -1
$I^{2\times2}_{2,3}$     1    0    1    1    0    1
$I^{2\times2}_{1,2}$     1    1   -1    1    1   -1
$I^{2\times2}_{1,5}$     0    0   -1    0    0   -1
$I^{2\times2}_{1,1}$     1    1    1    1    1    1

Table 3.1: Value of the bond variables and graph invariants associated with each of the graphs depicted in Fig. 3.2.

Consider the combination of bond variables,

\[ I^{2\times2}_{1,3} = \tfrac{1}{4}\left( b_1 b_3 + b_2 b_4 + b_5 b_6 + b_7 b_8 \right), \tag{3.1} \]

which is an example of a graph invariant. (The origin of the notation $I^{2\times2}_{1,3}$ and the normalization constant will be explained later.) Notice in Table 3.1 that $I^{2\times2}_{1,3}$ has exactly the same value within each of the three pairs of graphs related by symmetry operations. $I^{2\times2}_{1,3}$ also has a clear physical interpretation. It is a sum of dot products of four pairs of parallel bonds. $I^{2\times2}_{1,3}$ effectively counts the number of H-bonded pairs in which non-participating hydrogens lie on the same side of the H-bond. Bjerrum postulated that this type of bond is higher in energy than those in which the non-bonded hydrogens are more distant [26, 32]. Instead of the more complicated notation of Bjerrum, which is only meaningful for the three-dimensional ice-Ih lattice, we will follow Buch et al. [21] and refer to bonds with non-bonded hydrogens on the same side as “cis”, and others as “trans”. In Fig. 3.2, all four bonds of graphs A and D are cis. In graphs B and E the bonds connecting waters along the $x$-axis are cis, while those connecting waters along the $y$-axis are trans. None of the bonds are cis in graphs C and F.

Clearly, the fraction of cis H-bonds equals $\tfrac{1}{2}\bigl(I^{2\times2}_{1,3} + 1\bigr)$, as follows from the dot product nature of Eq. (3.1) and can be verified from Fig. 3.2. Hence, if Bjerrum’s conjecture turns out to be correct then the graph invariant $I^{2\times2}_{1,3}$ will be the appropriate link between the energy, a scalar physical property, and the topology of the H-bond network. The validity of Bjerrum’s notion of strong and weak H-bonds has been debated for many years in the literature [21,32,35,36]. While certainly appropriate for the water dimer [74], it is not clear that H-bonds in ice-Ih fall into strong and weak groups according to their cis/trans nature. The reliable way to identify which topological features of the H-bond lattice are most relevant to its stability is to systematically identify all symmetry-invariant features of the H-bond topology upon which scalar physical properties may depend. For the $2\times2$ unit cell of our square ice example, there are four other graph invariants that depend upon pairs of bond variables.

\[ I^{2\times2}_{2,3} = \tfrac{1}{4}\left( b_2 b_3 + b_1 b_4 + b_6 b_7 + b_5 b_8 \right), \tag{3.2} \]

\[ I^{2\times2}_{1,2} = \tfrac{1}{4}\left( b_1 b_2 + b_3 b_4 + b_5 b_7 + b_6 b_8 \right), \tag{3.3} \]

\[ \begin{aligned} I^{2\times2}_{1,5} = \tfrac{1}{16}\bigl( & b_1 b_5 - b_3 b_5 - b_1 b_6 + b_2 b_6 + b_3 b_6 - b_4 b_6 + b_3 b_7 - b_3 b_8 \\ & + b_4 b_8 + b_1 b_8 - b_2 b_8 - b_1 b_7 + b_4 b_5 - b_2 b_5 - b_4 b_7 + b_2 b_7 \bigr), \end{aligned} \tag{3.4} \]

\[ I^{2\times2}_{1,1} = \tfrac{1}{8}\left( b_1^2 + b_2^2 + b_3^2 + b_4^2 + b_5^2 + b_6^2 + b_7^2 + b_8^2 \right). \tag{3.5} \]

For the ice lattice, both real ice-Ih and our illustrative example square ice, all invariant linear combinations of single bond variables (first order invariants) are identically zero. The graph invariants in Eqs. (3.1-3.5) are a complete set of invariant bond combinations for the $2\times2$ unit cell of square ice that can be constructed from products of two bond variables. We call such combinations of pairs of bond variables second order invariants. Procedures for generating graph invariants are described in section 3.2. More complicated invariants, made from products of three or more bond variables (third and higher order graph invariants) are possible as well, although one may hope for convergence with respect to the description of physical properties as more complicated invariants are included. We have been able to document that second order invariants adequately describe the dependence of energy and other scalar properties on H-bond topology in clusters.
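The second order invariants of the $2\times2$ cell can be evaluated numerically as a sanity check. The following is a minimal Python sketch, not the author's code. It assumes a particular reading of the bond labeling (bonds 1-4 horizontal, bonds 5-8 vertical, all pointing along $+x$ or $+y$ in the canonical orientation), and the names `PAIRS` and `invariants` are hypothetical. The bond values for graph B are those quoted in the caption of Fig. 3.2.

```python
from fractions import Fraction

# Signed bond pairs (r, s, sign) entering each second order invariant of the
# 2x2 square-ice cell. ASSUMPTION: bonds 1-4 are horizontal, bonds 5-8 vertical.
PAIRS = {
    "I13": [(1, 3, 1), (2, 4, 1), (5, 6, 1), (7, 8, 1)],
    "I23": [(2, 3, 1), (1, 4, 1), (6, 7, 1), (5, 8, 1)],
    "I12": [(1, 2, 1), (3, 4, 1), (5, 7, 1), (6, 8, 1)],
    "I15": [(1, 5, 1), (3, 5, -1), (1, 6, -1), (2, 6, 1),
            (3, 6, 1), (4, 6, -1), (3, 7, 1), (3, 8, -1),
            (4, 8, 1), (1, 8, 1), (2, 8, -1), (1, 7, -1),
            (4, 5, 1), (2, 5, -1), (4, 7, -1), (2, 7, 1)],
    "I11": [(r, r, 1) for r in range(1, 9)],
}

def invariants(b):
    """Return the five second order invariants for bond values b[1..8] (+1 or -1)."""
    return {
        name: Fraction(sum(sgn * b[r] * b[s] for r, s, sgn in terms), len(terms))
        for name, terms in PAIRS.items()
    }

# Graph A: canonical orientation. Graph B: bonds 3 and 4 reversed.
graph_A = {r: 1 for r in range(1, 9)}
graph_B = {**graph_A, 3: -1, 4: -1}

inv_A = invariants(graph_A)
inv_B = invariants(graph_B)
print("A:", {k: str(v) for k, v in inv_A.items()})
print("B:", {k: str(v) for k, v in inv_B.items()})
```

Each invariant is normalized by its number of terms, which keeps it an intensive quantity between $-1$ and $+1$.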

The four additional invariants presented in Eqs. (3.2-3.5) can be assigned physical interpretations, just as we discussed for $I^{2\times2}_{1,3}$ with relation to Bjerrum’s conjecture regarding cis and trans H-bonds. For example, $I^{2\times2}_{1,2}$ measures the degree to which chains of H-bonds along the $x$ or $y$ directions align in the same direction. Because of the constraints of the ice rules, this also measures the number of water molecules whose OH bonds are both parallel to the $x$ or $y$ directions. (Only graphs C and F contain such waters. All other graphs contain waters with one bond pointing along $x$ and one pointing along $y$.) $I^{2\times2}_{1,5}$ can be seen to measure this same property. In fact, with regard to the graphs shown in Fig. 3.2, $I^{2\times2}_{1,2}$ and $I^{2\times2}_{1,5}$ are linearly dependent upon each other: $I^{2\times2}_{1,2} = 2 I^{2\times2}_{1,5} + 1$. It often happens that, when evaluated for graphs that satisfy constraints like the ice rules, invariants are linearly dependent upon each other. Relaxing the ice rules, for example by allowing hydronium or hydroxide to appear in the lattice, will break the linear dependence of the invariants. The invariant $I^{2\times2}_{1,1}$ is rather trivial for the graphs shown in Fig. 3.2, merely giving the fraction of filled H-bonds in a unit cell.

Let us return to Bjerrum’s conjecture that the energy of different H-bond topologies can be linked to the number of cis or trans H-bonds present in the lattice. The beauty of Bjerrum’s simple conjecture is that it can be applied to both regular, periodic patterns of H-bonds, as well as disordered arrangements. Put another way, the number of cis and trans H-bonds is a topological invariant for a periodically replicated lattice of arbitrary size: for small unit cells, for cells large enough for numerical simulations, or for cells whose size tends toward infinity in the true thermodynamic limit. We will demonstrate in this manuscript that this property of cis and trans H-bonds is shared by all the invariants we generate: Invariants like the ones we presented in Eqs. (3.1-3.5) for the $2\times2$ unit cell of the square ice lattice are also invariants of larger unit cells.

Larger unit cells will also generate new invariants that have no counterpart in small unit cells. However, these new invariants involve bond combinations more distant from each other than in a small unit cell. As a result, one may expect that at a certain point these new, long range invariants will not be important in capturing physical properties of the system. This sets up a strategy for describing the properties of large unit cells, those large enough for statistical simulations, in terms of properties derived from small unit cells. Even though the large unit cells admit millions or billions of H-bond topologies, the energy of each of these topologies, if our previous calculations for clusters (see chapter 2) are any guide, depends upon the value of a handful of invariants. The parameterization of the energy in terms of invariants may be adjusted to match experiment, or may be obtained from ab initio calculations which are feasible for small unit cells.
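The remark made earlier in this section, that invariants can become linearly dependent once the ice rules are imposed, can be checked by brute force for the $2\times2$ cell. The sketch below is illustrative only. It assumes the same hypothetical bond labeling as before (bonds 1-4 horizontal, 5-8 vertical) and an assumed assignment of the four bonds meeting at each water. It enumerates all $2^8$ labeled bond assignments, so symmetry-equivalent graphs are counted separately.

```python
from itertools import product

# ASSUMED connectivity: for each of the four waters, the labels of the bonds
# to its right, left, upper and lower neighbor in the periodic 2x2 cell.
WATERS = [(1, 2, 5, 7), (2, 1, 6, 8), (3, 4, 7, 5), (4, 3, 8, 6)]

# Signed pairs of perpendicular bonds that share a water (the terms of I_{1,5}).
I15_TERMS = [(1, 5, 1), (3, 5, -1), (1, 6, -1), (2, 6, 1),
             (3, 6, 1), (4, 6, -1), (3, 7, 1), (3, 8, -1),
             (4, 8, 1), (1, 8, 1), (2, 8, -1), (1, 7, -1),
             (4, 5, 1), (2, 5, -1), (4, 7, -1), (2, 7, 1)]

def obeys_ice_rules(b):
    # Canonically the right/up bonds leave a water and the left/down bonds
    # arrive, so "two donated, two accepted" means this signed sum vanishes.
    return all(b[r] - b[l] + b[u] - b[d] == 0 for r, l, u, d in WATERS)

def i12(b):
    return (b[1] * b[2] + b[3] * b[4] + b[5] * b[7] + b[6] * b[8]) / 4

def i15(b):
    return sum(sgn * b[r] * b[s] for r, s, sgn in I15_TERMS) / 16

all_graphs = [dict(zip(range(1, 9), vals)) for vals in product((1, -1), repeat=8)]
ice_graphs = [b for b in all_graphs if obeys_ice_rules(b)]

# The relation I_{1,2} = 2 I_{1,5} + 1 holds on every ice-rule graph ...
dependent_on_ice = all(i12(b) == 2 * i15(b) + 1 for b in ice_graphs)
# ... but fails once arbitrary bond assignments are allowed.
dependent_everywhere = all(i12(b) == 2 * i15(b) + 1 for b in all_graphs)

print(len(ice_graphs), dependent_on_ice, dependent_everywhere)
```

The comparison in floating point is exact here because every value is a multiple of 1/16.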

3.2 Graph invariants for periodic systems

Graph invariants, functions of bond variables that are unchanged under any symmetry operation, can be constructed using standard group theoretical projection operators. The application of a projection operator to a single bond variable, $b_r$, takes the form,

\[ I_r = C_r \sum_{\alpha} g_\alpha(b_r), \tag{3.6} \]

where $C_r$ is a normalization constant chosen for convenience, $g_\alpha$ is a member of the symmetry group of the system, and the sum runs over the entire symmetry group. The characters of the totally symmetric representation are identical for all symmetry operations. Therefore, to construct a linear combination that transforms according to the totally symmetric representation of the group, the terms $g_\alpha(b_r)$ are combined in Eq. (3.6) with equal coefficients. The appropriate group for a crystal lattice is the space group. We assume that the crystal is large and periodic, so the translation subgroup is of order $N_x N_y N_z$ (or obviously $N_x N_y$ for a two dimensional lattice like square ice). We use $x, y, z$ to designate the crystal axes, but nothing in our formalism requires that these axes be orthogonal.

Other invariants can be constructed similarly:

\[ I_{rs} = C_{rs} \sum_{\alpha} g_\alpha(b_r b_s), \tag{3.7} \]

\[ I_{rst} = C_{rst} \sum_{\alpha} g_\alpha(b_r b_s b_t), \quad \ldots \tag{3.8} \]

Throughout this chapter we conveniently take the normalization constant to be the inverse of the order of the group, making the invariants intensive quantities.

\[ C_{rs\cdots} = \frac{1}{|\mathcal{G}|}. \tag{3.9} \]

Scalar physical properties that depend on H-bond topology will be a function of graph invariants. The simplest relationship, linear dependence,

\[ P = \sum_{r} a_r I_r + \sum_{rs} a_{rs} I_{rs} + \sum_{rst} a_{rst} I_{rst} + \cdots, \tag{3.10} \]

will always be valid if the physical differences between H-bond arrangements are not too great. We have shown that the linear expansion can still be quite successful for water clusters, even when energetic differences between H-bond isomers are rather great. In Eq. (3.10) the graph invariants provide a vector space over H-bond topologies. In particular, the graph invariants are symmetry-adapted combinations that span the symmetry invariant sub-space.

The linear expansion of Eq. (3.10) is not the most general relation between scalar properties and H-bond topology, and in certain situations we may expect non-linear dependence of physical properties on the invariants. To give an example, in a simple model where the total dipole moment arises from bond dipoles $\boldsymbol{\mu}_r$, the total dipole moment could be expressed in terms of our bond variables as

\[ \boldsymbol{\mu} = \sum_{r} b_r \boldsymbol{\mu}_r, \tag{3.11} \]

and we expect the squared magnitude of the total dipole moment to be well described by a linear expansion in second order graph invariants, $|\boldsymbol{\mu}|^2 = \sum_{rs} a_{rs} I_{rs}$, and indeed find this to hold nicely for H-bond topologies of the $(\mathrm{H_2O})_6$ cage cluster. This implies that a linear expansion of $|\boldsymbol{\mu}|$ itself through second order invariants would not be as successful, unless a series expansion of the square root of $|\boldsymbol{\mu}|^2$ converged rapidly. Instead, the non-linear function $\sqrt{\sum_{rs} a_{rs} I_{rs}}$ would be the expansion of choice for $|\boldsymbol{\mu}|$. (For non-linear functions, the classification of invariants into first, second, and higher orders loses its significance.) An expansion of $|\boldsymbol{\mu}|$ in the form of Eq. (3.10) would eventually converge but might require higher order terms.
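The claim that the squared dipole moment is quadratic in the bond variables can be illustrated with a toy calculation. This is only a sketch under stated assumptions: every bond of the $2\times2$ square ice cell is given a unit bond dipole along its canonical direction (bonds 1-4 along $x$, bonds 5-8 along $y$, a hypothetical labeling), and the function names are ours.

```python
# ASSUMED toy model: unit bond dipoles along the canonical bond directions.
MU = {1: (1, 0), 2: (1, 0), 3: (1, 0), 4: (1, 0),
      5: (0, 1), 6: (0, 1), 7: (0, 1), 8: (0, 1)}

def total_dipole(b):
    """Total dipole as a sum of bond dipoles weighted by the bond variables."""
    return (sum(b[r] * MU[r][0] for r in MU),
            sum(b[r] * MU[r][1] for r in MU))

def squared_dipole_from_pairs(b):
    """|mu|^2 written as a double sum over bond pairs, i.e. second order in b."""
    return sum(b[r] * b[s] * (MU[r][0] * MU[s][0] + MU[r][1] * MU[s][1])
               for r in MU for s in MU)

graph_A = {r: 1 for r in range(1, 9)}      # canonical orientation
graph_B = {**graph_A, 3: -1, 4: -1}        # bonds 3 and 4 reversed

for name, b in (("A", graph_A), ("B", graph_B)):
    mx, my = total_dipole(b)
    print(name, mx * mx + my * my, squared_dipole_from_pairs(b))
```

Because every term of the double sum is a product of two bond variables, $|\boldsymbol{\mu}|^2$ in this model is a candidate for a linear expansion in second order invariants, whereas $|\boldsymbol{\mu}|$ itself requires a square root.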

3.2.1 Graph invariants and space groups

The space group of a crystal can be treated as a finite group by invoking periodic boundary conditions. Consider a lattice with possibly non-orthogonal unit cell vectors $\{\mathbf{a}_x, \mathbf{a}_y, \mathbf{a}_z\}$. The full space group is designated as $\mathcal{G}$. $\mathcal{T}$, the crystallographic translational group, is generated by the elementary translation operators $\{T_x, T_y, T_z\}$, where $T_x^i T_y^j T_z^k\,\mathbf{r} = \mathbf{r} + i\,\mathbf{a}_x + j\,\mathbf{a}_y + k\,\mathbf{a}_z$. That is,

\[ \mathcal{T} = \left\{ T_x^i T_y^j T_z^k \;\middle|\; i, j, k = 0, \pm 1, \pm 2, \ldots \right\}. \tag{3.12} \]

We will always assume a large but finite crystal with periodic boundary conditions.

\[ T_x^{i+N_x} = T_x^{i}, \quad T_y^{j+N_y} = T_y^{j}, \quad \text{and} \quad T_z^{k+N_z} = T_z^{k}. \tag{3.13} \]

Hence, $\mathcal{T}$ becomes a finite group and $|\mathcal{T}|$, the order of $\mathcal{T}$, is $N_x N_y N_z$.

As is well known in the theory of space groups [75], $\mathcal{G}$ can be decomposed into a sum of cosets of $\mathcal{T}$:

\[ \mathcal{G} = \mathcal{T} h_1 \cup \mathcal{T} h_2 \cup \mathcal{T} h_3 \cup \cdots, \tag{3.14} \]

where the $h_\beta$ are coset representatives and $\cup$ stands for a summation of two sets, which is the set of all objects that are contained in at least one of the sets. The set of cosets forms the factor group $\mathcal{G}/\mathcal{T}$. Conventionally, the coset representative $h_\beta$ is chosen to be a pure point group operation if possible, or a space group operation with a minimal translation if a screw or glide operation is involved.

The projection operation for the totally symmetric representation of $\mathcal{G}$, denoted here as $\hat{P}_{\mathcal{G}}$, is generated by applying all operations of the group with coefficients proportional to the characters of the totally symmetric representation, that is, with equal coefficients. The projection operator for the totally symmetric representation of the pure translation group, denoted here as $\hat{P}_{\mathcal{T}}$, is simply

\[ \hat{P}_{\mathcal{T}} = \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} \sum_{k=1}^{N_z} T_x^i T_y^j T_z^k, \tag{3.15} \]

and for the full space group the projection operator is

\[ \hat{P}_{\mathcal{G}} = \sum_{g_\alpha \in \mathcal{G}} g_\alpha = \sum_{h_\beta \in \mathcal{G}/\mathcal{T}} \hat{P}_{\mathcal{T}}\, h_\beta. \tag{3.16} \]

The first sum is over all elements in $\mathcal{G}$, while the second sum is over the coset representatives. Our previous equations (3.6,3.7) for graph invariants can be rewritten in terms of projection operators:

\[ I_r = C_r \hat{P}_{\mathcal{G}}\, b_r, \quad I_{rs} = C_{rs} \hat{P}_{\mathcal{G}}\, b_r b_s, \quad I_{rst} = C_{rst} \hat{P}_{\mathcal{G}}\, b_r b_s b_t, \tag{3.17} \]

and so on for higher-order invariants.

The graph invariants of Eqs. (3.1-3.5) were generated using projection operators. Before explicitly presenting the procedure, we have to recognize that expressions like (3.17) are still not adapted to bond patterns consisting of periodically replicated unit cells.
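To make the projection procedure concrete, the sketch below averages a product of bond variables over the symmetry group of a $2\times2$ periodic square ice cell. It is a minimal illustration, not the author's implementation. The geometry is an assumption: waters sit at integer points $(x, y)$ with $x, y \in \{0, 1\}$, each canonical bond leaves a water along $+x$ or $+y$, and the group consists of the eight point operations of $C_{4v}$ combined with the four lattice translations. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

L = 2  # side of the periodic cell (assumed 2x2 square ice)
# Bond label -> (x, y, axis): the canonical bond leaves water (x, y)
# along +x (axis 0) or +y (axis 1). ASSUMED labeling.
BONDS = {1: (0, 0, 0), 2: (1, 0, 0), 3: (0, 1, 0), 4: (1, 1, 0),
         5: (0, 0, 1), 6: (1, 0, 1), 7: (0, 1, 1), 8: (1, 1, 1)}
LABEL = {v: k for k, v in BONDS.items()}

def point_group():
    """The eight integer matrices of C4v: four rotations, each with and without a mirror."""
    rot = ((0, -1), (1, 0))
    ops, m = [], ((1, 0), (0, 1))
    for _ in range(4):
        ops.append(m)
        ops.append(((m[0][0], -m[0][1]), (m[1][0], -m[1][1])))  # m times mirror y -> -y
        m = tuple(tuple(sum(rot[i][k] * m[k][j] for k in range(2))
                        for j in range(2)) for i in range(2))
    return ops

def act(label, mat, shift):
    """Image of a bond under r -> mat.r + shift, returned as (new label, sign)."""
    x, y, axis = BONDS[label]
    d = (1, 0) if axis == 0 else (0, 1)
    px = mat[0][0] * x + mat[0][1] * y + shift[0]
    py = mat[1][0] * x + mat[1][1] * y + shift[1]
    dx = mat[0][0] * d[0] + mat[0][1] * d[1]
    dy = mat[1][0] * d[0] + mat[1][1] * d[1]
    sign = 1
    if dx + dy < 0:                  # image points along -x or -y,
        px, py = px + dx, py + dy    # so relabel it from its other end
        dx, dy, sign = -dx, -dy, -1  # and record the reversed orientation
    return LABEL[(px % L, py % L, 0 if dx == 1 else 1)], sign

def project(labels):
    """Average the product of the given bond variables over the whole group."""
    ops = [(m, t) for m in point_group() for t in product(range(L), repeat=2)]
    terms = {}
    for m, t in ops:
        images = [act(r, m, t) for r in labels]
        key = tuple(sorted(lab for lab, _ in images))
        sign = 1
        for _, s in images:
            sign *= s
        terms[key] = terms.get(key, 0) + Fraction(sign, len(ops))
    return {k: c for k, c in terms.items() if c != 0}

print(project((1,)))    # first order: all terms cancel
print(project((1, 3)))  # second order: four pairs of parallel bonds
```

Dividing by the order of the group corresponds to the normalization convention of taking the constant as the inverse group order. In this toy setting the first order projection vanishes identically, in line with the statement that first order invariants are zero for the ice lattice.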

3.2.2 Invariants for arbitrary unit cell choice

Practical statistical simulations of proton ordering in ice-Ih require unit cells large enough to approximate the properties of an infinite system. However, cells much beyond 10 water molecules allow an astronomical number of different hydrogen-bonded arrangements, seemingly making Monte Carlo sampling [37] the only feasible approach for unit cells large enough to approach the thermodynamic limit. Graph invariants provide a link between the properties of large unit cells and cells small enough to allow accurate ab initio studies, thereby providing an alternative to numerical simulations for larger unit cells. The key is the link between graph invariants for unit cells of different size, which we derive in this section.

Consider the smallest unit cell, defined by the translation subgroup $\mathcal{T}$. Since the H-bonding pattern is repeated in all unit cells,

\[ \forall\, T_k \in \mathcal{T}: \quad \text{value of } T_k b_r = \text{value of } b_r. \tag{3.18} \]

The above equation applies to the value of the bonds, not the bond variables themselves. The translated bond $T_k b_r$ resides in a different unit cell from $b_r$, and the image of $T_k b_r$ under a symmetry operation is different from the image of $b_r$, even though they might share the same value of either $+1$ or $-1$. Periodic replication of the H-bonding pattern, as expressed in Eq. (3.18), implies that only the coset representatives need to be projected to generate invariants.

\[ \begin{aligned} I_{rs\cdots} = C_{rs\cdots}\, \hat{P}_{\mathcal{G}}\, b_r b_s \cdots &= C_{rs\cdots} \sum_{h_\beta \in \mathcal{G}/\mathcal{T}} \hat{P}_{\mathcal{T}}\, h_\beta\, b_r b_s \cdots \\ &= C_{rs\cdots}\, |\mathcal{T}| \sum_{h_\beta \in \mathcal{G}/\mathcal{T}} h_\beta\, b_r b_s \cdots \\ &= \frac{1}{|\mathcal{G}/\mathcal{T}|} \sum_{h_\beta \in \mathcal{G}/\mathcal{T}} h_\beta\, b_r b_s \cdots \end{aligned} \tag{3.19} \]

The second equality is a consequence of the periodicity of the lattice, as expressed in Eq. (3.18). The third equality is obtained by invoking the normalization condition, (3.9). The action of $\hat{P}_{\mathcal{G}}$ on $b_r b_s \cdots$ produces a linear combination of bonds spanning the entire lattice. Reduction of $\hat{P}_{\mathcal{G}}\, b_r b_s \cdots$ in the second line of Eq. (3.19) to a few terms over a single unit cell multiplied by $|\mathcal{T}|$ is only true for the value of the invariant, given that the pattern of bond variables is periodically replicated.

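For a unit cell $L_x \times L_y \times L_z$ with basis $\{L_x \mathbf{a}_x, L_y \mathbf{a}_y, L_z \mathbf{a}_z\}$, the translation group, denoted as $\mathcal{T}_{L_x \times L_y \times L_z}$, can be written as

\[ \mathcal{T}_{L_x \times L_y \times L_z} = \left\{ T_x^{iL_x} T_y^{jL_y} T_z^{kL_z} \;\middle|\; i = 0, 1, \ldots, \tfrac{N_x}{L_x} - 1;\; j = 0, 1, \ldots, \tfrac{N_y}{L_y} - 1;\; k = 0, 1, \ldots, \tfrac{N_z}{L_z} - 1 \right\}. \tag{3.20} \]

It is elementary to see that $\mathcal{T}$ is equivalent to $\mathcal{T}_{1 \times 1 \times 1}$ and $\mathcal{T}_{L_x \times L_y \times L_z} \subseteq \mathcal{T}_{1 \times 1 \times 1} = \mathcal{T}$. For any $\mathcal{T}_{L_x \times L_y \times L_z}$, we have

\[ |\mathcal{T}_{L_x \times L_y \times L_z}| = \frac{|\mathcal{T}|}{L_x L_y L_z} = \frac{N_x N_y N_z}{L_x L_y L_z}. \tag{3.21} \]

For graphs satisfying the periodic boundary condition of unit cell $L_x \times L_y \times L_z$, the value of a bond variable at a position translated by one of the members of $\mathcal{T}_{L_x \times L_y \times L_z}$ is equal to the bond variable at the original position:

\[ \forall\, T_k \in \mathcal{T}_{L_x \times L_y \times L_z}: \quad \text{value of } T_k b_r = \text{value of } b_r. \tag{3.22} \]

Note that the above equation provides fewer constraints on the H-bonds than Eq. (3.18) for the smaller unit cell $1 \times 1 \times 1$. As the periodic cell is enlarged, a greater variety of H-bonding patterns is permitted until, as the cell size approaches the thermodynamic limit, it is capable of describing all manner of disorder in an ice-Ih crystal.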

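The full space group can be decomposed into cosets of the translation subgroup $\mathcal{T}_{L_x \times L_y \times L_z}$. While the pure translation group for the crystal with unit cell $L_x \times L_y \times L_z$ is smaller than for $1 \times 1 \times 1$, the set of coset representatives for $L_x \times L_y \times L_z$ is correspondingly enlarged by a factor of $L_x L_y L_z$. The set of coset representatives $\{h^{L_x \times L_y \times L_z}_\beta\}$ for the larger cell is given by

\[ \left\{ T_x^i T_y^j T_z^k\, h_\beta \;\middle|\; h_\beta \in \mathcal{G}/\mathcal{T};\; i = 0, \ldots, L_x - 1;\; j = 0, \ldots, L_y - 1;\; k = 0, \ldots, L_z - 1 \right\}. \tag{3.23} \]

The space group $\mathcal{G}$ may be decomposed into cosets appropriate for either the cells $1 \times 1 \times 1$ or $L_x \times L_y \times L_z$.

\[ \begin{aligned} \mathcal{G} &= \mathcal{T} h_1 \cup \mathcal{T} h_2 \cup \cdots \\ &= \mathcal{T}_{L_x \times L_y \times L_z}\, h^{L_x \times L_y \times L_z}_1 \cup \mathcal{T}_{L_x \times L_y \times L_z}\, h^{L_x \times L_y \times L_z}_2 \cup \cdots \end{aligned} \tag{3.24} \]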

In the above equation we have decomposed $\mathcal{G}$ into right cosets. For the full translation subgroup the choice between left and right cosets is irrelevant because $\mathcal{T}$ is a normal subgroup of $\mathcal{G}$, for which left and right cosets are identical. However, $\mathcal{T}_{L_x \times L_y \times L_z}$ might not be a normal subgroup of $\mathcal{G}$, and the left and right cosets may be distinct. In this case, decomposition into right cosets is the most convenient choice because, according to Eq. (3.22), following the action of a coset representative with any member of $\mathcal{T}_{L_x \times L_y \times L_z}$ leaves the value of the bond expression unchanged, as explained in the discussion accompanying Eqs. (3.18) and (3.19).

The application of $\hat{P}_{\mathcal{G}}$ on a product of bond variables can be simplified in several ways. Following from the unit cell conditions of $L_x \times L_y \times L_z$ expressed in Eq. (3.22) and in analogy to Eq. (3.19), application of $\hat{P}_{\mathcal{G}}$ only needs to involve the coset representatives.

\[ \begin{aligned} \hat{P}_{\mathcal{G}}\, b_r b_s \cdots &= \sum_{\beta} \hat{P}_{\mathcal{T}_{L_x \times L_y \times L_z}}\, h^{L_x \times L_y \times L_z}_\beta\, b_r b_s \cdots \\ &= |\mathcal{T}_{L_x \times L_y \times L_z}| \sum_{\beta} h^{L_x \times L_y \times L_z}_\beta\, b_r b_s \cdots \end{aligned} \tag{3.25} \]

The last line of Eq. (3.25) applies to the value of the expression when evaluated for a periodically replicated system, not the actual bond variables. [See discussion accompanying Eqs. (3.18) and (3.19).] Using the definition of the coset representatives given in Eq. (3.23), we can further simplify invariants $I_{rs\cdots}$ in unit cell $L_x \times L_y \times L_z$.

\[ \begin{aligned} I^{L_x \times L_y \times L_z}_{rs\cdots} &= C_{rs\cdots}\, |\mathcal{T}_{L_x \times L_y \times L_z}| \sum_{i=0}^{L_x - 1} \sum_{j=0}^{L_y - 1} \sum_{k=0}^{L_z - 1} T_x^i T_y^j T_z^k \Biggl[ \sum_{h_\beta \in \mathcal{G}/\mathcal{T}} h_\beta\, b_r b_s \cdots \Biggr] \\ &= \frac{1}{L_x L_y L_z} \sum_{i=0}^{L_x - 1} \sum_{j=0}^{L_y - 1} \sum_{k=0}^{L_z - 1} T_x^i T_y^j T_z^k \Biggl[ \frac{1}{|\mathcal{G}/\mathcal{T}|} \sum_{h_\beta \in \mathcal{G}/\mathcal{T}} h_\beta\, b_r b_s \cdots \Biggr] \end{aligned} \tag{3.26} \]

Let us analyze result (3.26) in two different situations. First consider when all the bonds within the product $b_r b_s \cdots$ lie close to each other within the cell $L_x \times L_y \times L_z$, in fact so close they would fit within a smaller unit cell $1 \times 1 \times 1$. Then, for terms of this type the quantity in square brackets in Eq. (3.26) for $I^{L_x \times L_y \times L_z}_{rs\cdots}$ is, within a constant, an invariant $I_{rs\cdots}$ for $1 \times 1 \times 1$, evaluated for a portion of the larger unit cell $L_x \times L_y \times L_z$. Hence each invariant for the small unit cell appears as an invariant of the larger unit cell. [The converse, that each invariant in Eq. (3.26) generated from $b_r b_s \cdots$ lying within a small unit cell is an invariant of the lattice with a smaller unit cell, is not true. This is because periodicity imposes fewer constraints for the larger lattice. This point is illustrated in section 3.2.3.] Of course, when the unit cell is enlarged to $L_x \times L_y \times L_z$, the small cell H-bond pattern is, in general, not periodically replicated within the large cell. Therefore, the value of $I^{L_x \times L_y \times L_z}_{rs\cdots}$ is not simply a multiple of the value of $I_{rs\cdots}$ for a smaller unit cell.

Second, let us consider the case where the bonds within the product $b_r b_s \cdots$ in Eq. (3.26) do not all lie within a region the size of the small cell $1 \times 1 \times 1$. The invariants generated for $L_x \times L_y \times L_z$ in this case have no analog from $1 \times 1 \times 1$. These new invariants arise from the greater variety of H-bond topologies permitted when the unit cell is enlarged. Since this class of invariants involves products of bonds separated by distances greater than the dimension of the small unit cell $1 \times 1 \times 1$, these invariants will describe longer ranged physical interactions, which would be expected to be weaker.

In summary, upon enlarging the unit cell from $1 \times 1 \times 1$ to $L_x \times L_y \times L_z$, we find two types of invariants. The first type are simply invariants of the small cell evaluated for portions of the large cell and then summed. When decomposing the dependence of energy (or other scalar physical quantities) on H-bond topology in terms of invariants, we expect these invariants to provide the dominant contribution. The second type are fundamentally new invariants involving products of bonds separated by distances greater than the dimension of $1 \times 1 \times 1$.

Our discussion of enlarging the cell from $1 \times 1 \times 1$ to $L_x \times L_y \times L_z$ applies equally when going from $L_x \times L_y \times L_z$ to even larger unit cell dimensions $L'_x \times L'_y \times L'_z$. This provides a natural hierarchy of approximations for decomposing the dependence of scalar physical properties on H-bond topology. The most local and dominant effects would be captured by fitting to invariants at the level of the small cell $1 \times 1 \times 1$. If these effects are completely dominant, then physical properties for cell $L_x \times L_y \times L_z$ would be accurately predicted in terms of invariants that are from the $1 \times 1 \times 1$ cell, summed over all portions of $L_x \times L_y \times L_z$. Deviations from this picture are used to parameterize physical properties in terms of the invariants for $L_x \times L_y \times L_z$ that involve longer range interactions. This improved characterization could, in principle, be tested at a still larger level $L'_x \times L'_y \times L'_z$ until convergence is achieved.

For simplicity, the transition from small to large unit cells has been discussed here as a mere rescaling of the unit vectors by factors of $L_x$, $L_y$ and $L_z$. Quite often, convenient choices of larger unit cells involve linear combinations of primitive lattice vectors. Our conclusions apply to these cases as well, and we illustrate such unit cells in our treatment of ice below. Whatever our choice of unit cell, the translations generated by the unit cell vectors form a subgroup of the full translation group $\mathcal{T}$. This translation subgroup can be used to decompose the full space group into right cosets, and the link made between invariants for small and large unit cells.

3.2.3 An illustration for square ice

In section 3.1 we exhibited the five second order graph invariants $I^{2\times2}_{rs}$ associated with the $2\times2$ unit cell of our “square ice” example. The formalism of sections 3.2.1 and 3.2.2 explained how those graph invariants were generated with projection operators, and exposed relations between graph invariants for unit cells of arbitrary size. The very practical consequence of these relations is that calculations feasible for only small unit cells, such as ab initio energetic calculations, can be applied to larger unit cells appropriate for statistical simulations. Since the formalism may be forbidding at first glance, we illustrate the relationship between graph invariants for unit cells of different size for square ice. The entire set of second order graph invariants $I^{4\times4}_{rs}$ for the $4\times4$ unit cell (Fig. 3.3) is presented in this section, and we discuss the connections with the graph invariants of the smaller $2\times2$ unit cell.

[Figure 3.3: bonds carry the labels 1-8 of the $2\times2$ cell followed by a sector letter; sectors a and b form the lower row of the $4\times4$ cell, sectors c and d the upper row.]

Figure 3.3: Labeling scheme for bonds in the $4\times4$ unit cell of “square ice”.

We begin by examining the result of projecting onto bonds $b_{1a}$ and $b_{3a}$ of the $4\times4$ unit cell.

\[ \begin{aligned} I^{4\times4}_{1a,3a} = \tfrac{1}{32} \bigl\{ & b_{1a} b_{3a} + b_{2a} b_{4a} + b_{5a} b_{6a} + b_{7a} b_{8a} \\ {}+{} & b_{1b} b_{3b} + b_{2b} b_{4b} + b_{5b} b_{6b} + b_{7b} b_{8b} \\ {}+{} & b_{1c} b_{3c} + b_{2c} b_{4c} + b_{5c} b_{6c} + b_{7c} b_{8c} \\ {}+{} & b_{1d} b_{3d} + b_{2d} b_{4d} + b_{5d} b_{6d} + b_{7d} b_{8d} \\ {}+{} & b_{1c} b_{3a} + b_{2c} b_{4a} + b_{5b} b_{6a} + b_{7b} b_{8a} \\ {}+{} & b_{1a} b_{3c} + b_{2a} b_{4c} + b_{5a} b_{6b} + b_{7a} b_{8b} \\ {}+{} & b_{1b} b_{3d} + b_{2b} b_{4d} + b_{5c} b_{6d} + b_{7c} b_{8d} \\ {}+{} & b_{1d} b_{3b} + b_{2d} b_{4b} + b_{5d} b_{6c} + b_{7d} b_{8c} \bigr\} \end{aligned} \tag{3.27} \]

Each of the first four lines is clearly recognizable as $I^{2\times2}_{1,3}$ of Eq. (3.1) evaluated for each $2\times2$ sector of the $4\times4$ unit cell. The terms in the last four lines would be identical in value to those of the first four lines if the lattice still had $2\times2$ periodicity. Put another way, if the letters were removed from the subscripts in the last four lines, thereby enforcing $2\times2$ periodicity, the last four lines would duplicate the first four lines. These terms are indeed part of $I^{2\times2}_{1,3}$, but they do not appear explicitly in Eq. (3.1) because their value is identical to terms already present in that expression. In the $4\times4$ setting these terms must be included as distinct contributions. Provided the additional invariants [Eqs. (3.32-3.38) below] introduced at the $4\times4$ level do not make a significant contribution, the contribution of an invariant like $I^{4\times4}_{1a,3a}$ to a scalar physical property like the energy could be estimated from ab initio calculations for the $2\times2$ unit cell.

B

\ \

h h

)

5 *

Just like 4 in Eq. (3.27), each of the graph invariants given below in Eqs. (3.28-3.31) o!

has a counterpart in among those of the ! unit cell, specifically in Eqs. (3.2-3.5).



[Eqs. (3.28)-(3.31): explicit expressions for the four remaining $4\times4$ invariants that have $2\times2$ counterparts; not recoverable from the extracted text]

Each of the invariants listed so far for the $4\times4$ unit cell involves products of bonds that lie sufficiently close to each other so that they also generate an invariant for the smaller $2\times2$ cell, and their contribution to scalar physical properties can be estimated from calculations for the smaller $2\times2$ cell.

If the energy of the $2\times2$ unit cell were parameterized according to the values of its five second order invariants, Eqs. (3.1-3.5), then a first guess for the energy of configurations of the $4\times4$ cell would be in terms of the invariants in Eqs. (3.27-3.31). Perhaps comparison with calculations for the $4\times4$ cell would indicate reasonable convergence of the energy. If not, use of invariants involving bond pairs further separated from each other would be an option to improve the description. This would involve invariants for the $4\times4$ cell which have no counterpart in the $2\times2$ cell and are listed below.



[Eqs. (3.32)-(3.38): explicit expressions for the seven $4\times4$ invariants that have no $2\times2$ counterpart; not recoverable from the extracted text]

Eqs. (3.32-3.38), like those in Eqs. (3.27-3.31), reduce to graph invariants found for the $2\times2$ unit cell when the letter subscripts are removed, thereby enforcing the periodicity of the smaller cell. However, it is crucial to realize that the seven invariants in Eqs. (3.32-3.38) are fundamentally different in nature because they involve products of bonds further separated in the lattice than the five invariants of Eqs. (3.27-3.31).
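The two classes of $4\times4$ invariants can be summarized schematically as follows. This is only a sketch under stated assumptions: the constant $E_0$, the coefficients $\alpha_k$ and $\beta_k$, and the index sets $S$ (invariants with a $2\times2$ counterpart, Eqs. 3.27-3.31) and $L$ (invariants without one, Eqs. 3.32-3.38) are notation introduced here for illustration, and the expansion is assumed to be truncated at second order.

```latex
% Assumed second-order expansion of a scalar property (here the energy)
% for a bond configuration \mathbf{b} of the 4x4 cell.
E^{4\times4}(\mathbf{b}) \;\approx\; E_0
  \;+\; \sum_{k \in S} \alpha_k \, I^{4\times4}_k(\mathbf{b})
  \;+\; \sum_{k \in L} \beta_k \, I^{4\times4}_k(\mathbf{b})

% First guess described in the text: keep only the short-range class,
% with the \alpha_k fixed by calculations on the 2x2 cell
% (up to the normalization convention chosen for the invariants).
\beta_k = 0 \quad \text{for all } k \in L
```

If comparison with calculations for the $4\times4$ cell showed poor convergence, the $\beta_k$ would be the additional parameters to fit.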

3.3 Graph invariants and graph enumeration for Ice-Ih

In this section, we report graph invariants and graph enumeration results for ice-Ih. Historically, both orthorhombic and hexagonal unit cells have been used in the study of ice-Ih, most often the former due to the convenience of using orthogonal unit vectors. The symmetry of the oxygen lattice of ice-Ih has been identified as $P6_3/mmc$ since the late 1920’s [76, 77]. Some recent experiments reveal a transition to a low-temperature, proton-ordered phase, ice-XI, in which the hexagonal symmetry of the oxygen lattice of ice-Ih is broken by ordering of the hydrogen bonds [19, 20, 40–46]. The space group of ice-XI is $Cmc2_1$.

3.3.1 Invariants for the 8-water orthorhombic unit cell

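The series of ice-Ih unit cells formed by the cell vectors,

$$\mathbf{a} = \sqrt{\tfrac{8}{3}}\,R\,\hat{\mathbf{x}}, \qquad \mathbf{b} = \sqrt{8}\,R\,\hat{\mathbf{y}}, \qquad \mathbf{c} = \tfrac{8}{3}\,R\,\hat{\mathbf{z}}, \tag{3.39}$$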

has been popular because these unit cell vectors are conveniently orthogonal. In Eq. (3.39), $R$ is the distance between nearest neighbor oxygens, and $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$ and $\hat{\mathbf{z}}$ are Cartesian unit vectors. The smallest orthorhombic unit cell is an 8-molecule unit cell (Fig. 3.4). Extending our notation to distinguish several different choices of unit cell vectors for ice-Ih, we use Orth($n_a \times n_b \times n_c$), rather than the dimensions alone, to designate a unit cell obtained by extending the smallest orthorhombic unit cell $n_a$ times along the $a$-axis, $n_b$ times along the $b$-axis, and $n_c$ times along the $c$-axis. Hence the 8-water orthorhombic cell is labeled Orth($1\times1\times1$), and there are three unit cells, Orth($2\times1\times1$), Orth($1\times2\times1$), and Orth($1\times1\times2$), having 16 water molecules, but their geometries are very different.

[Figure 3.4 image: the Orth($1\times1\times1$) unit cell with H-bonds labeled b1a, b1b, ..., b8a, b8b]

Figure 3.4: The labeling of H-bonds, and their canonical orientation, are shown here for the Orth($1\times1\times1$) unit cell. In the canonical orientation, all of the H-bonds are cis.

All first order graph invariants for the ice lattice are identically zero. A set of 13 second order graph invariants is obtained by projecting on all bond pairs from the Orth($1\times1\times1$) unit cell according to Eq. (3.25). If the projection operator acts on a pair of bonds that lie along the $c$-axis, then all symmetry operations will produce other pairs that also lie along the $c$-axis. Hence there is a subset of second order invariants, 3 in all, that are comprised totally of bond pairs along the $c$-direction. Similarly, there is another subset of 2 second order invariants comprised totally of bond pairs that lie within the $a$-$b$ bilayers. Finally, there is a third subset of 8 invariants coupling bonds along the $c$-axis with bonds in the $a$-$b$ bilayers.
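The projection step can be illustrated on a much smaller system than ice. The sketch below is not the Orth($1\times1\times1$) calculation: it assumes a hypothetical toy ring of `n` directed bonds whose symmetry group consists of rotations and reflections, and all function names are invented for this illustration. It averages a single bond variable, or a product of two bond variables, over the group, which is the sense in which an invariant is "projected" from a bond or bond pair.

```python
from itertools import product


def ring_symmetries(n):
    """Symmetry operations of a toy ring of n directed bonds (hypothetical model).

    Each operation is a list of (source_index, sign) pairs: after the
    operation, bond variable i equals sign * b[source_index].
    """
    ops = []
    for k in range(n):
        # Rotation by k sites keeps the sense of every bond.
        ops.append([((i + k) % n, +1) for i in range(n)])
        # Reflection maps bond i onto bond k - i and reverses its sense.
        ops.append([((k - i) % n, -1) for i in range(n)])
    return ops


def apply_op(op, b):
    """Return the image of bond configuration b under one operation."""
    return tuple(sign * b[src] for src, sign in op)


def first_order_invariant(r, b, ops):
    """Group average of the single bond variable b_r."""
    return sum(apply_op(op, b)[r] for op in ops) / len(ops)


def second_order_invariant(r, s, b, ops):
    """Group average of the bond pair product b_r * b_s."""
    total = 0
    for op in ops:
        image = apply_op(op, b)
        total += image[r] * image[s]
    return total / len(ops)


ops = ring_symmetries(4)
b = (1, 1, -1, 1)
print("first order :", first_order_invariant(0, b, ops))
print("second order:", second_order_invariant(0, 1, b, ops))
```

In this toy model the first order average vanishes for every configuration because each reflection reverses the bond sense, while the pair averages survive and take the same value on all symmetry-related configurations.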

The following set of invariants is constructed exclusively from bonds along the $c$-direction.

[Eqs. (3.40)-(3.42): explicit expressions for the three invariants $I_{4a,4b}$, $I_{4a,8a}$ and $I_{4a,4a}$; not recoverable from the extracted text]

$I_{4a,4b}$ describes interactions between $c$-axis bonds that are nearest neighbors above the same bilayer. (The squared term $b_{4a}^2$ appears because $b_{4a}$ is nearest neighbor to its own image in an adjacent unit cell, and similarly for the other squared terms in $I_{4a,4b}$.) This invariant is what one would obtain if one were to map the energy of a two-dimensional sheet of $c$-axis bonds onto a two-dimensional triangular lattice. The possible dependence of energy upon ferroelectricity or antiferroelectricity of $c$-axis bonds will be captured by these invariants. Notice how the possible coupling between $c$-axis bonds between two bilayers would be described at this level by invariants $I_{4a,8a}$.

The remainder of the invariants for the Orth($1\times1\times1$) cell are presented in section 3.5. We have also generated invariants and determined the number of linearly independent invariants for much larger unit cells. The total number of second order invariants and the number of linearly independent invariants are reported below in Fig. 3.6, but we do not report the explicit form of invariants for larger unit cells in this work. Properties that have been postulated to affect the energy of the ice lattice, such as the number of cis or trans H-bonds in the lattice or the degree of ferroelectricity in the lattice, can be expressed as linear combinations of these second order invariants. In addition, the invariants must also describe other topological features which have not been discussed in the literature, but which have not been ruled out as possible factors affecting the energy. Having the full set of second order invariants allows an unbiased analysis of which topological features are most significant.

The graph invariants can be viewed as forming a set of basis vectors in a space of H-bond configurations. For example, there are 16 symmetry-distinct H-bond arrangements possible for the Orth($1\times1\times1$) unit cell which we picture as forming a 16-dimensional vector space. (Further discussion of the enumeration of the actual H-bond arrangements is given below in section 3.3.2.) Each invariant is mapped to a vector whose 16 components are the values of that invariant evaluated for each of the H-bond arrangements. Since there are only 13 second order invariants, they cannot possibly span the 16-dimensional space of symmetry-distinct H-bond arrangements. In fact, for H-bond configurations that obey the ice rules, there is a high degree of linear dependence among the second order invariants. Consequently, the space spanned by the second order invariants is only of dimension 6.
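The dimension quoted above is the rank of a matrix whose rows are invariants and whose columns are symmetry-distinct configurations. A minimal sketch of that rank calculation follows, using exact rational arithmetic; the matrix entries are made-up toy numbers (not values for ice) and the function name is chosen for this illustration.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix by Gauss-Jordan elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    rank = 0
    for col in range(ncols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Toy data: three "invariants" evaluated on four "configurations".
# The third row is the sum of the first two, so only two are independent.
toy_invariants = [
    [1, 0, 2, 1],
    [0, 1, 1, 3],
    [1, 1, 3, 4],
]
print(matrix_rank(toy_invariants))
```

Exact fractions are used here so that linear dependence is detected without a numerical tolerance; a floating point rank routine would be the usual choice for large matrices.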

Of course, adding third and higher order invariants would incorporate further flexibility and more fitting parameters. Our experience to date for clusters indicates that energetics are well described at the level of second order invariants, but this conclusion will have to be tested for ice. If truncation at second order invariants is found to be a reasonable

approximation for ice, this provides strong constraints on how scalar physical properties

might depend on H-bond topology.

3.3.2 Enumeration of H-bond arrangements in ice-Ih

In addition to their usefulness in describing physical properties, graph invariants also provide an efficient means to generate all symmetry-distinct H-bond arrangements of a finite or periodic system. Eliminating symmetry equivalent configurations from a list of $M$ graphs is nominally an $O(M^2)$ operation, because all pairs should be compared for symmetry equivalence. However, graphs with different values of any invariant must be symmetry-distinct. This suggests an efficient scheme for eliminating symmetry equivalent graphs. The list of graphs is partitioned into subsets such that each subset collects the graphs sharing one set of values of one or more invariants, and no two subsets share the same set of values. Hence, a graph in one subset cannot be symmetry-equivalent to another graph in a different subset. As a result, symmetry equivalence need only be tested among graphs in the same subset, reducing the operation count from $O(M^2)$ to $O(M \log M)$. Details are furnished in section 2.2.
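The partitioning idea can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of section 2.2: the graphs are configurations of a hypothetical toy ring of directed bonds, the invariant key is a tuple of bond pair sums chosen for this example, and all function names are invented here.

```python
from collections import defaultdict
from itertools import product


def images(b):
    """All symmetry images of configuration b on a toy ring of directed bonds."""
    n = len(b)
    out = set()
    for k in range(n):
        out.add(tuple(b[(i + k) % n] for i in range(n)))    # rotations
        out.add(tuple(-b[(k - i) % n] for i in range(n)))   # reflections
    return out


def invariant_key(b):
    """Cheap invariants: sums of bond pair products at each separation s."""
    n = len(b)
    return tuple(
        sum(b[i] * b[(i + s) % n] for i in range(n))
        for s in range(1, n // 2 + 1)
    )


def distinct_graphs(graphs):
    """Keep one representative per symmetry class, comparing only within buckets."""
    buckets = defaultdict(list)
    for g in graphs:
        buckets[invariant_key(g)].append(g)

    representatives = []
    comparisons = 0
    for bucket in buckets.values():
        reps = []
        for g in bucket:
            g_images = images(g)
            duplicate = False
            for r in reps:
                comparisons += 1
                if r in g_images:
                    duplicate = True
                    break
            if not duplicate:
                reps.append(g)
        representatives.extend(reps)
    return representatives, comparisons


graphs = list(product((1, -1), repeat=6))
reps, comparisons = distinct_graphs(graphs)
print(len(graphs), "graphs,", len(reps), "symmetry-distinct,", comparisons, "comparisons")
```

Graphs that land in different buckets are never compared, which is where the saving over the all-pairs comparison comes from.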

Results of this efficient enumeration scheme are presented here for several unit cells of ice-Ih. We provide results for small unit cells because, as mentioned in the introduction, there are conflicting results in the literature for the symmetry-distinct H-bond arrangements of cells like the 8-member orthorhombic cell. We also provide results for large unit cells, cells whose size can be considered appropriate for statistical simulations, to identify topological properties of the ice lattice in the statistical limit, and to demonstrate the feasibility of large-scale enumeration.

Enumeration results for the 8-member orthorhombic cell

In 1987 Howe reported that 17 symmetry-distinct H-bond arrangements were possible for the smallest orthorhombic cell, Orth($1\times1\times1$) [56]. In 1997, Lekner enumerated the 114 H-bond arrangements possible for Orth($1\times1\times1$) before symmetry reduction, and then eliminated redundant structures according to the functional form of the Coulomb interaction [57]. This is not necessarily the same as reduction by symmetry equivalence. Lekner observed 17 distinct forms of the Coulomb potential function. Among the 17 functional forms, two pairs had the same value, leading to only 15 distinct Coulomb energies. In 1998 Buch, Sandler and Sadlej sought to enumerate the distinct configurations of the Orth($1\times1\times1$) cell [21]. They employed a Monte Carlo scheme to generate H-bond configurations for Orth($1\times1\times1$), and then eliminated redundant configurations according to physical properties like total energy and dipole moment. Buch and coworkers found 16 distinct arrangements.

In this work, we use symmetry properties of the H-bond topology to eliminate redundant configurations. This is the same criterion used by Howe, but we obtain different results. Using the functional form of a potential function is problematic because symmetry-distinct structures may have the same energy for certain potential functions, but the degeneracy may be lifted for other potentials. To give an elementary example originally noted by Lekner [57], the total oxygen-oxygen and oxygen-hydrogen Coulomb interaction is identical for all H-bond topologies in an idealized ice-Ih lattice where all covalent and nearest neighbor bond lengths are equal. Differences among the H-bond topologies arise exclusively from hydrogen-hydrogen Coulomb interactions. We will encounter a more subtle example below, in which two structures have identical bond lengths, and therefore are degenerate with respect to all pairwise additive potentials, yet are symmetry distinct. Finally, Monte Carlo methods may be used exhaustively for the smallest unit cells, but would be highly impractical for exhaustive enumeration of some of the larger unit cells we present below.

Following the enumeration procedure described in chapter 2, we obtain 16 symmetry-distinct H-bond topologies for Orth($1\times1\times1$). Two configurations out of the 16 are of particular note (Fig. 3.5). These configurations are related to each other by reversal of all H-bonds, yet are symmetry distinct. In an idealized structure where covalent and nearest neighbor oxygen-oxygen distances are the same for all molecules, we have verified that the

! S

two structures shown in Fig. 3.5 have identical distributions of pair distances out to 0ñì .

From the structural information given in Lekner’s work, it is clear that these two structures are the pair that Lekner found to be energetically degenerate with respect to the Coulomb potential, as they would be for any pairwise additive potential.

118(SJ) 117(SJ)

Figure 3.5: These two configurations of the Orth($1\times1\times1$) cell are symmetry distinct, yet are related to each other by reversal of all H-bonds. The left hand structure is converted into the right hand one by first reflecting through a horizontal plane that bisects the figure midway between the two $a$-$b$ bilayers, followed by reversing all the H-bonds.

Sequences of unit cells for ice-Ih

The primitive unit cell of ice-Ih is defined by the following unit vectors.

[Eq. (3.43): the three primitive hexagonal cell vectors of ice-Ih; not recoverable from the extracted text]

We use hex($n_a \times n_b \times n_c$) to label unit cells built from multiples of the primitive unit cell.

We will also consider another hexagonal system constructed from the following unit vectors.

[Eq. (3.44): the three cell vectors of the larger hexagonal system; not recoverable from the extracted text]

These unit cells, designated here as Hex($n_a \times n_b \times n_c$), form a convenient sequence of unit cells when $n_a = n_b = n$. The hex($n \times n \times n_c$) and Hex($n \times n \times n_c$) cells can be taken to be prisms with the full hexagonal symmetry of the ice-Ih lattice. (An example of the Hex($1\times1\times1$) unit cell is shown below in Fig. 3.9.)

The Orth($n_a \times n_b \times n_c$), hex($n_a \times n_b \times n_c$), and Hex($n \times n \times n_c$) cells stand in relation to each other as shown in Fig. 3.6. The arrows in that diagram represent a membership relation between the oriented graphs of the cells linked by an arrow. When the set of oriented graphs of a smaller unit cell is a subset of the graphs of a larger unit cell, the two cells are joined by an arrow. In effect, the arrows represent a chain by which invariants from smaller unit cells can predict the properties of larger cells. Also shown in Fig. 3.6 are the number of symmetry-distinct graphs that would arise from the ice rules for neutral water, second order graph invariants, and linearly independent second order graph invariants for neutral water graphs. The number of linearly independent graph invariants appears to level off and remain quite small, reaching only 14 for the Orth($3\times1\times1$) and Hex($2\times2\times1$) cells.

3.3.3 Analysis of enumeration results

While energetic calculations are beyond the scope of this work, constraints on ice structures, as revealed by enumeration of H-bond topologies, do provide some insights into the behavior of ice-Ih and possible low temperature phases. Assuming a random distribution of H-bond topologies, the standard model going back to the work of Pauling [18], we are able here to predict some statistical properties of ice-Ih, such as dipole moment (in a bond dipole model). In most previous simulations of ice-Ih, the H-bond topology in the simulation cell has been chosen to have zero dipole moment, and often minimum higher multipoles [37, 60, 78]. Here we report on how likely or unlikely these low multipole configurations will be. Our explorations will also categorize possible candidates for the low temperature phase of ordinary ice. Calculations on the water dimer indicate that the lowest energy topology would contain the maximum fraction of trans bonds, yet recent experiments have been interpreted to favor a ferroelectric structure, where the fraction of trans bonds is 25%, far from optimal according to Bjerrum’s conjecture [26,32]. We will explore the correlation between ferroelectricity and fraction of trans bonds.

[Figure 3.6 diagram: the arrows linking the cells are not reproduced; the three numbers attached to each cell are tabulated below]

Unit cell      Second order    Linearly independent    Symmetry-distinct
               invariants      second order invariants H-bond configurations
Hex(2x2x1)     36              14                      8360361
Orth(3x1x1)    36              14                      2275
hex(2x2x1)     17              6                       55
Hex(1x1x1)     13              5                       14
Orth(1x1x1)    13              6                       16
hex(1x1x1)     7               2                       2

Figure 3.6: Each unit cell is accompanied by three numbers, which from left to right are the number of second order graph invariants, the number of linearly independent second order graph invariants for graphs that satisfy the ice rules for neutral water, and the number of symmetry-distinct H-bond configurations for that unit cell.

We have accumulated data on H-bond geometry and dipole moments for the variety of unit cells shown in Fig. 3.6. The results are all qualitatively the same, both for the orthorhombic and hexagonal cells. Therefore, we present the results for the largest unit cell, Hex($2\times2\times1$), whose 2404144962 configurations (8360361 symmetry-distinct) best approximate the infinite system limit. The dipole moment is calculated in a bond dipole approximation, and the bond dipoles are assumed to be parallel to the oxygen-oxygen vector of the H-bonds. We report dipoles in units of the bond dipoles. The dipole moment arising from bonds along the $c$-axis, the origin of ferroelectricity in the proposed $Cmc2_1$ structure of ice-XI, is of particular interest. The distribution, shown in the top of Fig. 3.7, indicates that complete alignment of H-bonds along the $c$-axis is extremely rare. Zero dipole moment, as often imposed in computer simulations of ice-Ih [37,60,78], is relatively frequent, but still is not typical, only occurring in 27.5% of H-bond arrangements. The bottom panel of Fig. 3.7 shows that nearly all H-bond arrangements contain a percentage of trans H-bonds between 40% and 80%, with the maximum near 60%.
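The bond dipole approximation described above amounts to summing one unit vector per H-bond. A minimal sketch follows; the function names, the donor-to-acceptor sign convention, and the straight-line toy geometry are assumptions made for this illustration, and a real periodic cell would additionally need minimum-image bond vectors.

```python
import math


def bond_dipole_moment(bonds):
    """Sum unit vectors along each H-bond, in units of one bond dipole.

    bonds: iterable of (donor_xyz, acceptor_xyz) oxygen positions. Pointing
    each unit vector from donor to acceptor is an assumed sign convention.
    """
    total = [0.0, 0.0, 0.0]
    for donor, acceptor in bonds:
        delta = [a - d for a, d in zip(acceptor, donor)]
        length = math.sqrt(sum(c * c for c in delta))
        for axis in range(3):
            total[axis] += delta[axis] / length
    return tuple(total)


def magnitude(vector):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in vector))


# Toy geometry: 24 parallel bonds along z, mimicking full alignment of the
# 24 c-axis bonds of a 48-water cell.
aligned = [((0.0, 0.0, float(k)), (0.0, 0.0, float(k) + 1.0)) for k in range(24)]
# The same bonds with every second one reversed.
alternating = [(a, d) if k % 2 else (d, a) for k, (d, a) in enumerate(aligned)]

print(magnitude(bond_dipole_moment(aligned)))      # close to 24
print(magnitude(bond_dipole_moment(alternating)))  # close to 0
```

The fully aligned toy case reproduces the maximum of 24 bond dipoles quoted for the $c$-axis bonds, and the alternating case gives the zero dipole moment often imposed in simulations.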

The proposed ferroelectric $Cmc2_1$ structure of ice-XI is rather unusual among ferroelectric structures in that it contains a small fraction of trans H-bonds. In the Orth($1\times1\times1$) unit cell, the H-bond arrangements with H-bonds completely aligned along the $c$-axis tend to have a small percentage of trans bonds: 1 isomer has 50% trans, 4 (including $Cmc2_1$) have 25% trans, 1 has 12.5% trans, and 1 has 0% trans. This is an artifact of the small unit cell size. In larger unit cells the completely aligned structures tend to have a larger percentage of trans bonds. This property is illustrated in the two-dimensional distributions of dipole moment and fraction of trans bonds shown in Fig. 3.8. Looking at the top, center panel of Fig. 3.8, we find that the percentage of trans H-bonds in arrangements with maximum ferroelectricity along the $c$-axis extends from a minimum of 0% to a maximum of 75%.

[Figure 3.7 plots: top panel, $P(|\mu_c|)$ versus $|\mu_c|$ (bond dipole); bottom panel, P(%-trans) versus %-trans]

Figure 3.7: The top panel shows the distribution of dipole moment magnitude arising from H-bonds along the $c$-direction in a 48-water unit cell (Hex($2\times2\times1$)) of ice-Ih. Measured in bond dipoles, the maximum dipole moment is 24, the number of H-bonds along the $c$-axis. The bottom panel shows the distribution of trans H-bonds among the 96 H-bonds of the unit cell.

Contrary to what one might expect by only considering the Orth($1\times1\times1$) unit cell, there exist several examples of slightly larger unit cells with complete ferroelectric order in the $c$-direction and a high percentage of trans bonds, as shown in Fig. 3.9. As one can see from Fig. 3.8, the $Cmc2_1$ structure is very atypical, at least with respect to dipole moment and fraction of trans H-bonds. Optimization of a large unit cell by Monte Carlo methods [21, 37] would be unlikely to uncover either the $Cmc2_1$ structure or the examples of Fig. 3.9. Enumeration is an important complement to Monte Carlo search.

3.4 Conclusion

Larger unit cells of ice-Ih, needed to simulate thermal properties and phase transitions, can be arranged in astronomically large numbers of H-bond configurations. Energy differences among these configurations are rather small, so accurate and expensive ab initio methods are likely to be required to understand the low temperature behavior of ice-Ih.

Graph invariants provide a means of describing the energy, free energy, and other scalar physical properties of the large number of configurations using only a handful of parameters. It is significant to note that, even though the number of H-bond arrangements grows exponentially with system size, the number of linearly independent invariants grows quite slowly and appears to approach a finite limit. Features of the H-bond topology in ice-Ih previously suggested as determinants of the energy, such as trans and cis H-bonds [26,32], emerge in our theory as some of the possible low-order invariants. However, graph invariants provide many other possible links between H-bond topology and scalar physical properties which have not been considered, but may turn out to be significant.

[Figure 3.8 plots: top row, scatter plots of %-trans versus $|\mu|$, $|\mu_c|$ and $|\mu_{ab}|$ (bond dipole); bottom row, the corresponding three-dimensional distributions]

Figure 3.8: Scatter plot (top row) and three-dimensional representations (bottom row) of the distribution of H-bond isomers in ice-Ih resolved according to dipole moment and percent trans H-bonds. The three-dimensional plots best convey where the bulk of the distribution is located, while the scatter plots depict the locus of possible structures, regardless of their frequency. From left to right are the distribution of total dipole moment of the unit cell, dipole moment generated by H-bonds along the $c$-axis, and dipole moment generated by H-bonds lying within the puckered hexagonal sheets parallel to the $a$- and $b$-crystallographic axes. The data was generated for the 48-water Hex($2\times2\times1$) unit cell, for which there are 2404144962 isomers satisfying periodic ice rules, of which 8360361 are symmetry-distinct. The dipole moment is reported in units of OH bond dipoles.

[Figure 3.9 image: panels a) and b)]

Figure 3.9: Two examples of small unit cells with complete ferroelectric order along the $c$-axis coexisting with a high percentage of trans bonds: a) a 16-water Orth cell with 62% trans bonds, b) a 12-water Hex($1\times1\times1$) cell with 75% trans bonds.

The hierarchy of approximations provided by graph invariants can be arranged on a two-dimensional grid. On one axis, the level of approximation is distinguished by the number of bond variables multiplied together in each term. Invariants can be constructed as linear combinations of single bond variables (first order invariants), products of two bond variables (second order invariants), three bond variables (third order invariants), and so on.

The hope is that the expansion of scalar physical properties in terms of invariants converges with relatively low order invariants, a property we have demonstrated for finite clusters of water molecules. The second axis of the grid of approximations is unique to periodic systems. The crystal can be constructed from unit cells of different size, with large unit cells needed to describe a disordered system such as the H-bond disordered phase of ice-Ih. Graph invariants can be ordered according to whether they are a linear combination of products of bonds from a small unit cell, or whether the bonds only belong to a larger unit cell. We have shown that graph invariants of larger unit cells fall into two categories. The first class involves bonds that are close enough to be part of a small unit cell. The dependence of the energy on these invariants can be determined from smaller unit cells, and used to determine the energy of the much larger number of H-bond arrangements of the large cell. The second class of invariants involves products of bonds that only occur in the large cell. Eventually, as the unit cell is progressively enlarged, the contribution arising from far-away bonds will become negligible. The rate of convergence for this second of the two axes in the grid of approximations has not yet been tested, and awaits the results of periodic ab initio calculations.

3.5 Appendix: Graph invariants of the Orth($1\times1\times1$) cell

There are 13 independent invariants for the orthorhombic $1\times1\times1$ unit cell of ice-Ih. Three of those, Eqs. (3.40-3.42), are discussed in section 3.3.1. The remainder are presented in this section.

The following invariants involve products of bonds that lie within the same bilayer.

[Eqs. (3.45)-(3.48): explicit expressions for the four invariants built from bonds within the same bilayer; not recoverable from the extracted text]

The next set of invariants involves bonds that lie within adjacent bilayers.

[Eqs. (3.49)-(3.52): explicit expressions for the four invariants built from bonds in adjacent bilayers; not recoverable from the extracted text]

Finally, there is a set of invariants that couple bonds that lie along the $c$ axis with bonds in a bilayer.

[Eqs. (3.53)-(3.54): explicit expressions for the two invariants coupling $c$-axis bonds with bilayer bonds; not recoverable from the extracted text]

CHAPTER 4

EFFECTS OF H-BOND TOPOLOGY ON ENERGETICS, STRUCTURE AND CHEMISTRY ON WATER CLUSTERS

Commonly held wisdom asserts that each hydrogen bond between water molecules stabilizes a structure by a roughly fixed amount of energy [27]. Therefore aqueous clusters that differ only by the direction of hydrogen bonds, but otherwise have the same number of H-bonds and placement of oxygen atoms, should have approximately the same energy. This belief implicitly lies behind several calculations performed to date [79–83] for the (H$_2$O)$_{20}$ dodecahedron and the associated H$_3$O$^+$(H$_2$O)$_{20}$ formed by adding a hydronium ion. In each of those studies, only one or a handful of arbitrarily chosen hydrogen bond arrangements were considered. The H-bond arrangement in the dodecahedral cage was presumed to have a minor effect on the properties of this cluster.

In this chapter we show that H-bond topology strongly affects the structure and energy of (H$_2$O)$_{20}$. Furthermore, we find that the H-bond arrangement strongly affects the chemistry of (H$_2$O)$_{20}$. In fact, we report that certain seemingly unremarkable arrangements of the H-bonds in (H$_2$O)$_{20}$, with no unusually strained bond lengths or angles, lead to spontaneous self-dissociation of a water molecule, producing spatially separated excess proton and hydroxide ion units in the cluster. (The terms autoionization and autoprotolysis are also used in the literature to designate the self-dissociation of water into ionic fragments.) This is the first report of water cluster configurations that can proceed to a lower energy state through self-dissociation.

The (H$_2$O)$_{20}$ dodecahedron is a well-known building block of type I ice clathrates. The (H$_2$O)$_{20}$ cage containing a hydronium ion in its interior has been proposed by Kassner and Hagen [84] to explain the exceptional abundance of H$^+$(H$_2$O)$_{21}$ in molecular beam experiments. [85,86] Several key pieces of experimental evidence support this hypothesis. Most notably, Castleman and co-workers found that 10 trimethylamine molecules bind preferentially to H$^+$(H$_2$O)$_{21}$ clusters, in accord with the 10 dangling hydrogens characteristic of the dodecahedral cage. [86] However, no theoretical calculation has found that either dodecahedral (H$_2$O)$_{20}$ or H$^+$(H$_2$O)$_{21}$ is exceptionally stable [79–82, 87–92]. All calculations performed to date for H$^+$(H$_2$O)$_{21}$ find that placing a hydronium inside a dodecahedral cage leads to highly distorted structures [80–82]. Therefore, the structures of (H$_2$O)$_{20}$ and H$^+$(H$_2$O)$_{21}$ remain issues of great interest in the study of water clusters.

We have previously determined that 30026 symmetry-distinct H-bond arrangements are possible in the (H$_2$O)$_{20}$ dodecahedron [1]. Several of the many possible isomers are shown in Fig. 4.1. Each of the water molecules in the dodecahedron is either a double-donor water (2DW), in which both of its hydrogens are involved in an H-bond, or a double-acceptor water (2AW), in which one hydrogen participates in an H-bond and the other hydrogen dangles from the cage. It would seem that the 30026 distinct isomers of the (H$_2$O)$_{20}$ dodecahedron, all with 20 three-coordinate waters in a similar geometry and 30 H-bonds, should have similar characteristics, but we will show that this is hardly the case. Long before water clusters were studied, multiple H-bond arrangements for a given lattice of oxygen atoms were recognized to be a key feature of ice-Ih, ordinary ice. In 1935 Pauling estimated [18], and later work confirmed with remarkable accuracy [24,25], that $(3/2)^N$ H-bond arrangements are possible for $N$ water molecules in ordinary hexagonal ice, which is consistent with the experimental residual entropy of ice at 0 K [17]. The question of whether H-bond arrangements are completely random and the nature of a phase transition to an ordered structure are matters of current interest [19, 20, 22, 40, 44, 45, 49–52]. Therefore, understanding how H-bond topology affects water clusters is also relevant to our understanding of ice and liquid water. Finite cluster studies, like ours, are most relevant to the surface of ice, where water molecules with fewer than four neighbors exist.
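As a minimal numerical sketch of Pauling's estimate, the count $(3/2)^N$ can be converted into a residual entropy through Boltzmann's relation $S = k_B \ln W$, which gives $R \ln(3/2)$ per mole. The function names and the unit conversion below are illustrative assumptions and are not taken from the original work.

```python
import math

# Molar gas constant in J/(mol K).
R_GAS = 8.314462618
# One thermochemical calorie in joules.
CAL_TO_J = 4.184


def log_arrangements(n_molecules):
    """Natural log of Pauling's estimate W = (3/2)^N for N water molecules."""
    return n_molecules * math.log(1.5)


def molar_residual_entropy():
    """Residual entropy per mole implied by W = (3/2)^N, in J/(mol K)."""
    return R_GAS * math.log(1.5)


if __name__ == "__main__":
    s = molar_residual_entropy()
    print(f"R ln(3/2) = {s:.3f} J/(mol K) = {s / CAL_TO_J:.3f} cal/(mol K)")
```

Working with the logarithm of the count, rather than the count itself, keeps the calculation stable for macroscopic values of N.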

4.1 Cluster stability and H-bond topology

Studies of Cs$^+$(H$_2$O)$_{20}$ by Smith and Dang [91], as well as our studies [1] of the 30026 isomers of the (H$_2$O)$_{20}$ dodecahedron, have previously suggested that nearest neighbor 2AW’s, that is, nearest neighbor dangling hydrogens, are strongly disfavored. In these previous studies, either empirical potential models or semi-empirical electronic structure methods were employed. Here we confirm this effect with electronic density functional theory (DFT) calculations. [93] In the left panel of Fig. 4.2 both the OSS2 empirical potential [2, 3] and B3LYP density functional theory calculations [4, 5] exhibit a generally increasing trend in energy when plotted against the number of nearest neighbor 2AW’s.

Even when the number of neighboring 2AW’s is fixed, there is considerable variation of the energy, indicating that other significant factors determine the stability of these clusters, factors that we identify below. In chapter 2, we calculated the energy of all 30026 isomers using the OSS2 model. We demonstrated that graph invariants, symmetry-invariant functions of bond variables, provide a more sophisticated and successful way to link energy to H-bond topology in the isomers of the (H$_2$O)$_{20}$ dodecahedron. [94] While it is not feasible to perform electronic structure calculations for all 30026 isomers of the dodecahedron, the right panel of Fig. 4.2 does verify the general agreement between quantum calculations and the OSS2 model, although it appears that the OSS2 model underestimates the energy gap between the least and most stable structures.

Figure 4.1: Isomers of the (H$_2$O)$_{20}$ dodecahedron: a) most stable isomer, b) least stable isomer yet observed, c,d) two isomers with 5 nearest neighbor 2AW’s (discussed below in relation to Fig. 4.2), e,f) isomers which have formed zwitterions.

The topology of the dodecahedron allows a minimum of 3 nearest neighbor 2AW’s (Fig. 4.1a), which comprise the most stable class of structures, and a maximum of 10 (Fig. 4.1b), generally the least stable group [1]. Both the OSS2 model and quantum chemical studies indicate that the range in energy between the most and least stable H-bond isomers is very large, about 50 kcal/mol in the OSS2 model and 70 kcal/mol in B3LYP calculations. This is a surprisingly large difference, considering that previous theoretical treatments of the (H$_2$O)$_{20}$ dodecahedron considered only one or a few arbitrary structures, assuming they were roughly equivalent in their properties. [79–83] For example, the dodecahedron pictured in work by Laasonen and Klein [82] is a member of the least stable class of structures, those with 10 nearest neighbor dangling hydrogen pairs. The structure pictured by Lee et al. [83] contains 4 nearest neighbor dangling hydrogen pairs, only one pair away from the optimum value. Wales and Hodges, [95] using the TIP4P empirical potential model [96] and the powerful basin-hopping search method, [97] found a lowest energy (H$_2$O)$_{20}$ dodecahedron with just 3 nearest neighbor dangling hydrogen pairs.
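The bookkeeping behind the count of nearest neighbor 2AW’s can be sketched as follows. This is an illustrative example only: it represents an H-bond arrangement as a list of (donor, acceptor) pairs, assumes every water is three-coordinate, and uses two hypothetical orientations of the eight-vertex water cube rather than the dodecahedron to keep the example short. The function names are assumptions, not the enumeration code used in this work.

```python
from collections import Counter


def classify_waters(bonds):
    """Label each water '2DW' (donates two H-bonds) or '2AW' (donates one).

    `bonds` is a list of (donor, acceptor) pairs. All waters are assumed to be
    three-coordinate, so each one donates either one or two H-bonds.
    """
    donated = Counter(donor for donor, _ in bonds)
    waters = {w for bond in bonds for w in bond}
    labels = {}
    for w in waters:
        if donated[w] == 2:
            labels[w] = "2DW"
        elif donated[w] == 1:
            labels[w] = "2AW"
        else:
            raise ValueError(f"water {w} donates {donated[w]} H-bonds")
    return labels


def count_adjacent_2aw(bonds):
    """Number of H-bonds that join two 2AW's (neighboring dangling hydrogens)."""
    labels = classify_waters(bonds)
    return sum(1 for d, a in bonds if labels[d] == "2AW" and labels[a] == "2AW")


# Cube vertices 0..7; two four-membered rings joined by four vertical bonds.
RING_BONDS = [(0, 1), (1, 3), (3, 2), (2, 0), (4, 6), (6, 7), (7, 5), (5, 4)]
cube_up = RING_BONDS + [(0, 4), (1, 5), (2, 6), (3, 7)]   # all 2AW's in one ring
cube_alt = RING_BONDS + [(0, 4), (5, 1), (6, 2), (3, 7)]  # 2AW's kept apart

if __name__ == "__main__":
    print(count_adjacent_2aw(cube_up), count_adjacent_2aw(cube_alt))
```

The same two functions apply unchanged to a 30-bond list describing a dodecahedral isomer.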

4.2 Self-dissociation and zwitterionic structures

Not only are some of the isomers of dodecahedral (H$_2$O)$_{20}$ surprisingly high in energy, they are chemically reactive. An example of such behavior is shown in Fig. 4.3. This isomer can lower its energy upon dissociation of a water molecule to H$^+$ and OH$^-$. According to both the OSS2 empirical potential [2, 3] and Hartree-Fock level ab initio calculations carried out by us, the neutral structure in Fig. 4.3 is a local minimum with one unusually short H-bond, representing a partially ionized structure. Displacing the hydrogen of that bond slightly in the direction of ionization leads to charge separation. At other levels of theory, such as B3LYP density functional theory, [4, 5] the gradients are small but not zero at the geometry of the neutral isomer, so the isomer dissociates with no barrier. In the example shown in Fig. 4.3, self-dissociation of a water molecule results in an energy drop of [value illegible] kcal/mol according to B3LYP-DFT with a double-zeta, correlation consistent (cc-pvdz) basis set. [6, 7] Since this result was surprising, we checked the energy difference between the same two structures using second order Møller-Plesset [98–101] (MP2) ab initio electronic structure theory with the cc-pvdz basis set. We still found a strong energy drop of [value illegible] kcal/mol upon self-dissociation. The energy change upon self-dissociation is quite large. While the precise value will depend upon level of theory and basis set, we do not expect the qualitative conclusion to depend upon the electronic structure method. After the dissociation, the excess charge does not remain fixed but migrates until the excess proton, symmetrically solvated as an H$_5$O$_2^+$ unit in the example of Fig. 4.3, and the hydroxide are in favorable positions, [102] which we observed to be 2DW’s adjacent to the excess proton and 2AW’s adjacent to OH$^-$.

[Figure 4.2 axes. Left panel: energy (kcal/mol) versus number of nearest neighbor 2AW’s (3 to 10), with OSS2 and B3LYP points and the B3LYP points c) and d) marked. Right panel: B3LYP energy (kcal/mol) versus OSS2 energy (kcal/mol).]

Figure 4.2: Energy of (H$_2$O)$_{20}$ isomers plotted on the left against the number of nearest neighbor dangling hydrogen pairs or 2AW’s. The energies are calculated using the OSS2 empirical model [2, 3] (light gray points), or B3LYP electronic density functional theory [4,5] with the cc-pvdz basis set [6,7] (black dots). The B3LYP energy points marked c) and d) correspond to isomers with those labels in Fig. 4.1. From the 30026 symmetry-distinct structures, we selected for electronic density functional theory calculations predominantly those structures that are predicted by the OSS2 model to be either near the top or the bottom of the energy range, with fewer cases in between. On the right, the OSS2 energy is compared with the B3LYP energy (dots). The gray line indicates perfect agreement. When the number of nearest neighbor 2AW’s is large, we encountered isomers in which the H-bond topology changed significantly upon B3LYP optimization, or for which a dangling hydrogen rotated to point toward the interior of the cage. These isomers were excluded from the plot.

This observation then suggests that the most stable (H$_2$O)$_{20}$ zwitterions contain the H$_3$O$^+$ and OH$^-$ linked by H-bonds pointing (according to the usual sign convention, from donor to acceptor) as much as possible from the H$_3$O$^+$ to the OH$^-$. This trend has previously been noted by Anick for zwitterions of the (H$_2$O)$_8$ cube. [103] One highly symmetric class of such structures contains the H$_3$O$^+$ and OH$^-$ at opposite ends of the dodecahedron. This considerably restricts the candidates for the most stable zwitterion, which we can easily list using the enumeration methods we have described elsewhere [1,94]. The most stable zwitterion of this class (Fig. 4.1e) found using B3LYP density functional theory [4, 5] and the cc-pvdz basis set is only [value illegible] kcal/mol above the most stable neutral dodecahedral structure. The energy difference for single point calculations at the same structures changes to [value illegible] kcal/mol and [value illegible] kcal/mol, respectively, for B3LYP-DFT and MP2 calculations with the aug-cc-pvdz basis set. [6, 7] We also find other, less symmetric, zwitterionic minima, such as the one shown in Fig. 4.1f. In this example, only [value illegible] kcal/mol above the most stable neutral structure according to B3LYP-DFT with a cc-pvdz basis, the excess proton and hydroxide are contained in the configuration characteristic of the isolated H$_5$O$_2^+$ and H$_3$O$_2^-$ ions. The energy difference becomes [value illegible] kcal/mol when the calculation is repeated for the same structures with an aug-cc-pvdz basis.

[Figure 4.3 panel labels: neutral structure (left), zwitterion (right); energy difference marked 11.8 kcal/mol.]

Figure 4.3: Spontaneous self-dissociation in (H$_2$O)$_{20}$. Structures shown were calculated using B3LYP density functional theory, as described in Fig. 4.2. The structure on the left is from a very flat portion of the potential surface where there is a short H-bond (2.425 Å), outlined in the figure. An H$_2$O self-dissociates in this cluster with no barrier and, following charge migration through the cluster, yields the locally stable zwitterionic structure shown on the right, [value illegible] kcal/mol below the starting energy, containing a hydroxide ion and an excess proton held in an H$_5$O$_2^+$-like unit. At other levels of theory, there is a small barrier to self-dissociation (shown schematically).

The enthalpy of dissociation of water to H$^+$ and OH$^-$ in bulk water is 13.5 kcal/mol [27]. In small aqueous clusters, local zwitterionic minima containing spatially separated H$_3$O$^+$ and OH$^-$ units have been discovered, all lying roughly 20 kcal/mol above the neutral ground state [103–107]. Therefore, it is surprising that self-dissociation in a small cluster can either be spontaneous (Fig. 4.3) or, relative to the most stable neutral isomer, require less energy than in bulk water (Fig. 4.1e). The surprising feature may alternatively be considered to be the wide energy range of the H-bond isomers, allowing neutral local minima to be interspersed with zwitterionic structures. Again, while basis set and level of theory do have a significant quantitative impact on our results, the major conclusions are not sensitive to these factors.

4.3 Short H-bonds

Analysis of the H-bond topology of various (H$_2$O)$_{20}$ isomers reveals a pattern governing the occurrence of exceptionally short H-bonds, and a connection between this pattern and the stability of protons or hydroxides embedded in the dodecahedral cage. The H-bonds of the dodecahedron fall into three major classes according to the two water molecules involved in the H-bond, and each major class is further broken into 5 minor classes according to the four water molecules adjoining the H-bond. In the major class with the shortest H-bonds (2.43-2.61 Å), a 2AW donates to a 2DW (Fig. 4.4a). The class with the longest H-bonds (2.73-2.82 Å) has the opposite arrangement: a 2DW donates to a 2AW (Fig. 4.4b). Finally, the class with intermediate bond lengths (2.58-2.70 Å) consists of bonds between two 2DW’s or two 2AW’s (Figs. 4.4c$_1$ and 4.4c$_2$).
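The three major classes amount to a lookup on the donor and acceptor types. The short sketch below encodes that lookup together with the O-O length ranges quoted above; the table and function names are illustrative assumptions, and the sub-labels c1 and c2 follow the caption of Fig. 4.4.

```python
# Major H-bond classes keyed by (donor type, acceptor type).
# Each entry holds the class label and the quoted O-O length range in angstroms.
MAJOR_CLASSES = {
    ("2AW", "2DW"): ("a", (2.43, 2.61)),   # shortest bonds
    ("2DW", "2AW"): ("b", (2.73, 2.82)),   # longest bonds
    ("2AW", "2AW"): ("c1", (2.58, 2.70)),  # intermediate
    ("2DW", "2DW"): ("c2", (2.58, 2.70)),  # intermediate
}


def major_class(donor_type, acceptor_type):
    """Return (label, (min_length, max_length)) for an H-bond between two waters."""
    try:
        return MAJOR_CLASSES[(donor_type, acceptor_type)]
    except KeyError:
        raise ValueError("water types must be '2AW' or '2DW'") from None


if __name__ == "__main__":
    label, (lo, hi) = major_class("2AW", "2DW")
    print(f"class ({label}): {lo:.2f}-{hi:.2f} A")
```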

[Figure 4.4 panels. Top: major classes (a), (b), (c$_1$), (c$_2$). Middle: examples a$_1$ to a$_5$ with neighboring waters labeled 2A or 2D. Bottom: bond length distributions for $\xi$ = 0 to 4 versus the O-O distance $R_{OO}$ (Å), 2.45 to 2.6.]

Figure 4.4: Hydrogen bonds between three-coordinate waters fall into three major classes (top panel) in which (a) a 2AW donates to a 2DW, (b) a 2DW donates to a 2AW, or either (c$_1$) a 2AW donates to a 2AW or (c$_2$) a 2DW donates to a 2DW. Each major class is further broken into minor classes according to the topological index $\xi$ [Eq. (4.1)]. Examples a$_1$, a$_2$, a$_3$, a$_4$ and a$_5$ (middle panel) illustrate H-bonds of type (a) for which $\xi$ = 0, 1, 2, 3 and 4, respectively. The normalized bond length distribution accumulated for H-bonds of type (a) is shown in the bottom panel. The bond lengths were obtained from the B3LYP/DFT optimized structures whose energies are given in Fig. 4.2.

Each of the three classes of H-bond described in the previous paragraph can be further refined into minor classes according to the neighbors of the two water molecules participating in the H-bond in question. Let $\xi$ be a topological index defined as follows:

\[
\xi = (\text{number of 2DW neighbors of the H-bond donor}) + (\text{number of 2AW neighbors of the H-bond acceptor}) \tag{4.1}
\]

Figs. 4.4a$_1$, 4.4a$_2$, ..., 4.4a$_5$ give examples of a 2AW donating to a 2DW with $\xi$ equal to 0, 1, 2, 3, and 4, respectively. Only bonds with $\xi = 0$ and $\xi = 4$ are unique. The others admit different placements of the neighbors for which $\xi$ has the same value. As shown in Fig. 4.4, there is a strong correlation between the topological index $\xi$ and the length of the H-bond for bonds in which a 2AW donates to a 2DW, the types shown in Figs. 4.4a$_1$-4.4a$_5$. The same trend is observed for bonds of all the major classes (Figs. 4.4a-c), and it was previously pointed out for smaller water clusters by Anick. [108]
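A minimal sketch of how the topological index of Eq. (4.1) could be evaluated on a directed H-bond graph is given below. It assumes that every water is three-coordinate, that the two waters forming the bond are themselves excluded from each other's neighbor counts (so the index runs from 0 to 4), and it uses a hypothetical orientation of the eight-vertex water cube as test input. The function name and data layout are illustrative assumptions.

```python
from collections import Counter


def topological_index(bonds, donor, acceptor):
    """Topological index of Eq. (4.1) for the H-bond donor -> acceptor.

    `bonds` is a list of (donor, acceptor) pairs. Waters donating two H-bonds
    are 2DW's; all others are treated as 2AW's (three-coordinate waters assumed).
    """
    if (donor, acceptor) not in bonds:
        raise ValueError("no H-bond from donor to acceptor")
    donated = Counter(d for d, _ in bonds)
    kind = {w: "2DW" if donated[w] == 2 else "2AW" for bond in bonds for w in bond}
    neighbors = {}
    for d, a in bonds:
        neighbors.setdefault(d, set()).add(a)
        neighbors.setdefault(a, set()).add(d)
    # Count only the two other neighbors on each side of the bond.
    n_2dw_near_donor = sum(kind[w] == "2DW" for w in neighbors[donor] - {acceptor})
    n_2aw_near_acceptor = sum(kind[w] == "2AW" for w in neighbors[acceptor] - {donor})
    return n_2dw_near_donor + n_2aw_near_acceptor


# Hypothetical cube orientation: 2DW's are waters 0, 3, 5, 6; 2AW's are 1, 2, 4, 7.
cube = [(0, 1), (1, 3), (3, 2), (2, 0), (4, 6), (6, 7), (7, 5), (5, 4),
        (0, 4), (5, 1), (6, 2), (3, 7)]

if __name__ == "__main__":
    # Bond 1 -> 3 is a 2AW donating to a 2DW, i.e. major class (a).
    print(topological_index(cube, 1, 3))
```

In this orientation the class (a) bond 1 -> 3 has two 2DW's next to its donor and two 2AW's next to its acceptor, which places it at the long-bond end of the class.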

According to our criterion for stability of the zwitterion, H-bonds of the type shown in Fig. 4.4a$_1$ are the best candidates for proton transfer. Before proton transfer, the dangling hydrogen of the donor is surrounded by two other dangling hydrogens of neighboring 2AW’s, shown in Fig. 4.2 to be a high energy arrangement. Transfer of a proton from donor to acceptor creates an H$_3$O$^+$ surrounded by two 2DW’s and an OH$^-$ surrounded by two 2AW’s, the most stable configuration for a zwitterion. Indeed, the reactivity of the bond pattern of Fig. 4.4a$_1$ is borne out in numerous calculations, in which the pattern of Fig. 4.4a$_1$ produces either outright proton transfer and the spontaneous creation of a zwitterion, or an exceptionally short H-bond, which can now be understood as incipient proton transfer.

Understanding the role of proton transfer in relieving the instability of neighboring 2AW’s explains some of the large energy differences observed in Fig. 4.2. The two configurations marked c) and d) in Fig. 4.2 are shown in Figs. 4.1c,d. In both structures there are 5 nearest neighbor dangling hydrogen pairs. In c), those neighboring 2AW’s lie in a ring, shown at the top of the figure. All the dangling hydrogens within that ring are part of H-bonds which are unfavorable for proton transfer. There is no pathway for structure c) to partially or fully transfer a proton to relieve the strain of neighboring 2AW’s. By contrast, in structure d) there is an H-bond (middle foreground of Fig. 4.1d) with $\xi = 0$ of the type shown in Fig. 4.4a$_1$, the most favorable for partial or full proton transfer. In structure d), a proton is partially transferred: the original covalent OH bond has stretched to 1.14 Å, the length of the H-bond OH is 1.29 Å, and the OO distance is 2.43 Å. Many of the structures whose DFT energy is shown in Fig. 4.2 are either zwitterions or exhibit partial proton transfer. Some are even double zwitterions, containing two hydroxide ions and two excess protons. The OSS2 potential is able to capture these effects because it is a dissociating water potential.

The relationship of energy and bond lengths to H-bond topology has been considered in two recent papers by Anick. [108, 109] He proposes an interesting H-bond enumeration scheme, [109] different from the one we have employed. [1, 94] The features of the H-bond topology he correlates with physical properties are chosen in an apparently ad hoc fashion, and in this respect we prefer the graph invariants we have introduced, which can be arranged in a hierarchy of increasing complexity. [94] Anick analyzed the dependence of bond length on H-bond topology in terms of the arrangement of 2AW’s and 2DW’s. Although he approaches the classification of H-bond types in a different manner, the end result is a trend very similar to the one we have found. In particular, the pattern of fitting coefficients Anick gives for the (H$_2$O)$_8$ cube in Table 4 of Ref. 108 suggests that 2AW’s (2DW’s) adjacent to the H-bond donor (acceptor) modify the length of the H-bond to approximately the same degree. [110] Hence, the overall trend is well summarized by the topological parameter $\xi$. The trend first discovered by Anick for the (H$_2$O)$_8$ cube is more dramatic in the (H$_2$O)$_{20}$ dodecahedron. Anick reports H-bond lengths as short as [value illegible] Å, while for (H$_2$O)$_{20}$ the bonds are as short as [value illegible] Å and self-dissociation occurs.

4.4 Discussion

The central finding of this chapter, that the H-bond topology has a major effect on the energy, and even on the chemical reactivity, of a small water cluster, has implications in other situations where broken H-bonds are also present in significant numbers. The surface of ice is characterized by dangling hydrogens and variable H-bond topology. If local variations in the H-bond topology at the surface of ice also imply local variations in the chemical reactivity even fractionally as powerful as on the surface of the (H$_2$O)$_{20}$ dodecahedron, then the H-bond topology cannot be ignored. Since the local H-bond topology is seen to control the propensity for self-dissociation in aqueous clusters, this naturally suggests that the H-bond topology be monitored as a possible reaction coordinate for self-dissociation in liquid water, although we recognize that chemical processes in water clusters may be quite different from liquid water at 300 K. Recent simulations have not revealed a link between self-dissociation and H-bond topology. [111] Perhaps this work will provide new ways to formulate the connection between H-bond topology and the behavior of the ice surface and liquid water. In both cases, the key issue is what role the topology of the surrounding H-bond network plays in the activation of aqueous chemical reactions.

BIBLIOGRAPHY

[1] Shannon McDonald, Lars Ojamäe, and Sherwin J. Singer. Graph theoretical generation and analysis of hydrogen-bonded structures with applications to the neutral and protonated water cube and dodecahedral clusters. J. Phys. Chem., A102(17):2824, 1998.

[2] Lars Ojamäe, Isaiah Shavitt, and Sherwin J. Singer. Potential models for simulations of the solvated proton in water. J. Chem. Phys., 109(13):5547, 1998.

[3] Lars Ojamäe, Isaiah Shavitt, and Sherwin J. Singer. Potential energy surfaces and vibrational spectra of H$_5$O$_2^+$ and larger hydrated proton complexes. Int. J. Quant. Chem., Quantum Chem. Symp., 29:657, 1995.

[4] Axel D. Becke. Density-functional thermochemistry. III. the role of exact exchange. J. Chem. Phys., 98(7):5648, 1993.

[5] C. Lee, W. Yang, and R. G. Parr. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev., B37:785, 1988.

[6] T. H. Dunning, Jr. J. Chem. Phys., 90:1007, 1989.

[7] R. A. Kendall, T. H. Dunning Jr., and R. J. Harrison. J. Chem. Phys., 96:6796, 1992.

[8] Mario J. Molina, Tai-Ly Tso, Luisa T. Molina, and Frank C.-Y. Wang. Antarctic stratospheric chemistry of chlorine nitrate, hydrogen chloride, and ice: release of active chlorine. Science, 238:1253, 1987.

[9] A. Werner. Liebigs Ann., 322:261, 1902.

[10] Frank H. Stillinger and Aneesur Rahman. Improved simulation of liquid water by molecular dynamics. J. Chem. Phys., 60:1545, 1974.

[11] Sotiris S. Xantheas. Cooperativity and hydrogen bonding network in water clusters. Chem. Phys., 258:225, 2000.

[12] Huafeng Xu and B. J. Berne. Can water polarizability be ignored in hydrogen bond kinetics? J. Phys. Chem. B, 106:2054, 2002.

[13] R. Car and M. Parrinello. Unified approach for molecular dynamics and density-functional theory. Phys. Rev. Lett., 55(22):2471, 1985.

[14] C. Møller and M.S. Plesset. Phys. Rev., 46:618, 1934.

[15] Axel D. Becke. A new mixing of Hartree-Fock and local density-functional theories. J. Chem. Phys., 98(2):1372, 1993.

[16] W. F. Giauque and Muriel F. Ashley. Molecular rotation in ice at 10°K. Free energy of formation and entropy of water. Phys. Rev., 43:81, 1933.

[17] W. F. Giauque and J. W. Stout. The entropy of water and the third law of thermodynamics. The heat capacity of ice from 15 to 273°K. J. Amer. Chem. Soc., 58:1144, 1936.

[18] Linus Pauling. The structure and entropy of ice and of other crystals with some randomness of atomic arrangement. J. Amer. Chem. Soc., 57:2680, 1935.

[19] Y. Tajima, T. Matsuo, and H. Suga. Phase transition in KOH-doped hexagonal ice. Nature, 299:810, 1982.

[20] S. M. Jackson, V. M. Nield, R. W. Whitworth, M. Oguro, and C. C. Wilson. Single-crystal neutron diffraction studies of the structure of ice XI. J. Phys. Chem., B101(32):6142, 1997.

[21] V. Buch, P. Sandler, and J. Sadlej. Simulations of H$_2$O solid, liquid, and clusters, with emphasis on ferroelectric ordering transition in hexagonal ice. J. Phys. Chem., B102(44):8641, 1998.

[22] Y. Wang, J. C. Li, A. I. Kolesnikov, S. Parker, and S. J. Johnsen. Inelastic neutron scattering investigation of Greenland ices. Physica, B276-278:282, 2000.

[23] L. Onsager and M. Dupuis. The electrical properties of ice. Rend. Scuola Intern. FS., X Corso, page 294, 1960.

[24] E. A. DiMarzio and F. H. Stillinger. Residual entropy of ice. J. Chem. Phys., 40(6):1577, 1964.

[25] J. F. Nagle. Lattice statistics of hydrogen bonded crystals. I. the residual entropy of ice. J. Math. Phys., 7(8):1484, 1966.

[26] N. Bjerrum. Structure and properties of ice. Science, 115:386, 1952.

[27] D. Eisenberg and W. Kauzmann. The Structure and Properties of Water. Oxford, New York, 1969.

[28] Peter V. Hobbs. Ice Physics. Oxford, New York, 1974.

[29] J. K. Gregory, D. C. Clary, K. Liu, M. G. Brown, and R. J. Saykally. The water dipole moment in water clusters. Science, 275:814, 1997.

[30] Christopher J. Gruenloh, Joel R. Carney, Caleb A. Arrington, Timothy S. Zwier, Sharon Y. Fredericks, and Kenneth D. Jordan. Infrared spectrum of a molecular ice cube: the S$_4$ and D$_{2d}$ water octamers in benzene-(water)$_8$. Science, 276(5319):1678, 1997.

[31] Michael D. Tissandier, Sherwin J. Singer, and James V. Coe. Enumeration and evaluation of the water hexamer cage structure. J. Phys. Chem., A104(4):752, 2000.

[32] Kenneth S. Pitzer and Jan Polissar. The order-disorder problem for ice. J. Amer. Chem. Soc., 60:1140, 1956.

[33] Ernest R. Davidson and Keiji Morokuma. A proposed antiferroelectric structure for proton ordered ice Ih. J. Chem. Phys., 81(8):3741, 1984.

[34] Byoung Jip Yoon, Keiji Morokuma, and Ernest R. Davidson. Structure of ice Ih. Ab initio two- and three-body water-water potentials and geometry optimization. J. Chem. Phys., 83(3):1223, 1985.

[35] J. C. Li and D. K. Ross. Evidence for two kinds of hydrogen bond in ice. Nature, 365:327, 1993.

[36] J. S. Tse and D. D. Klug. Comments on “Further evidence for the existence of two kinds of H-bonds in ice Ih” by Li, et al. Phys. Lett., A198(5,6):464, 1995.

[37] Aneesur Rahman and Frank H. Stillinger. Proton distribution in ice and the kirkwood correlation factor. J. Chem. Phys., 57(9):4009, 1972.

[38] W. J. Pullan. Genetic operators for the atomic cluster problem. Comput. Phys. Commun., 107(1-3):137, 1997.

[39] Richard Judson. Genetic algorithms and their use in chemistry. Rev. Comput. Chem., 10:1, 1997.

[40] Shuji Kawada. Dielectric dispersion and phase transition of KOH doped ice. J. Phys. Soc. Japan, 32, 1972.

[41] Yoshimitsu Tajima, Takasuke Matsuo, and Hiroshi Suga. Calorimetric study of phase transition in hexagonal ice doped with alkali hydroxides. J. Phys. Chem. Solids, 45(11-12):1135, 1984.

[42] Takasuke Matsuo, Yoshimitsu Tajima, and Hiroshi Suga. Calorimetric study of a phase transition in deuterated ice Ih doped with deuterated potassium hydroxide: ice XI. J. Phys. Chem. Solids, 47(2):165, 1986.

[43] Takasuke Matsuo and Hiroshi Suga. Calorimetric study of ices Ih doped with alkali hydroxides and other impurities. J. Phys., Colloq., C1:477, 1987.

[44] Rachel Howe and R. W. Whitworth. A determination of the crystal structure of ice XI. J. Chem. Phys., 90(8):4450, 1989.

[45] A. J. Leadbetter, R. C. Ward, J. W. Clark, P. A. Tucker, T. Matsuo, and H. Suga. The equilibrium low-temperature structure of ice. J. Chem. Phys., 82(1):424, 1985.

[46] Christina M. B. Line and R. W. Whitworth. A high resolution neutron powder diffraction study of D$_2$O ice XI. J. Chem. Phys., 104(24):10008, 1996.

[47] S. M. Jackson and R. W. Whitworth. Evidence for ferroelectric ordering in ice Ih. J. Chem. Phys., 103(17):7647, 1995.

[48] S. M. Jackson and R. W. Whitworth. Thermally-stimulated depolarization studies of the ice XI-ice Ih phase transition. J. Phys. Chem., 101(32):6177, 1997.

[49] M. J. Iedema, M. J. Dresser, D. L. Doering, J. B. Rowland, W. P. Hess, A. A. Tsekouras, and J. P. Cowin. Ferroelectricity in water ice. J. Phys. Chem., B102(46):9203, 1998.

[50] R. W. Whitworth. Comment on “ferroelectricity in water ice”. J. Phys. Chem., 103(38):8192, 1999.

[51] J. P. Cowin and M. J. Iedema. Reply to comment on “ferroelectricity in water ice”. J. Phys. Chem., 103(38):8194, 1999.

[52] Hiroshi Fukazawa, Shinji Mae, Susumu Ikeda, and Okitsugu Watanabe. Proton ordering in Antarctic ice observed by Raman and neutron scattering. Chem. Phys. Lett., 294:554, 1998.

[53] J. D. Bernal and R. H. Fowler. A theory of water and ionic solution, with particular reference to hydrogen and hydroxyl ions. J. Chem. Phys., 1(8):515, 1933.

[54] Frank Harary. Graph theory. Addison-Wesley, Reading, Mass., 1969.

[55] Frank Harary and Edgar M. Palmer. Graphical enumeration. Academic, New York, 1973.

[56] R. Howe. The possible ordered structures of ice Ih. J. Physique (Paris), 48(3):C1-599, 1987. Colloque C1.

[57] John Lekner. Energetics of hydrogen ordering in ice. Physica, B252:149, 1998.

[58] C. Domb. Graph theory and embeddings. volume 3 of Phase Transitions and Critical Phenomena, page 1. Academic, New York, 1974.

[59] T. P. Radhakrishnan and William C. Herndon. Graph theoretical analysis of water clusters. J. Phys. Chem., 95:10609, 1991.

[60] J. A. Hayward and J. R. Reimers. Unit cells for the simulation of hexagonal ice. J. Chem. Phys., 106(4):1518, 1997.

[61] J. J. P. Stewart, J. Comp. Chem., 10, 209 (1989), ibid., 10, 221 (1989).

[62] C. J. Tsai and K. D. Jordan. Theoretical study of the (H$_2$O)$_6$ cluster. Chem. Phys. Lett., 213(1,2):181, 1993.

[63] Chengteh Lee, Han Chen, and George Fitzgerald. Structures of the water hexamer using density functional methods. J. Chem. Phys., 101(5):4472, 1994.

[64] Kyungsun Kim, Kenneth D. Jordan, and Timothy S. Zwier. Low-energy structures and vibrational frequencies of the water hexamer: Comparison with benzene-(H$_2$O)$_6$. J. Am. Chem. Soc., 116(25):11568, 1994.

[65] J. Marc Pedulla, K. Kim, and K. D. Jordan. Theoretical study of the n-body interaction energies of the ring, cage and prism forms of (H$_2$O)$_6$. Chem. Phys. Lett., 291(1,2):78, 1998.

[66] Jongseob Kim and Kwang S. Kim. Structures, binding energies, and spectra of isoenergetic water hexamer clusters: Extensive ab initio studies. J. Chem. Phys., 109(14):5886, 1998.

[67] Frank H. Stillinger and Thomas A. Weber. Hidden structure in liquids. Physical Review, A25(2):978, 1982.

[68] Frank H. Stillinger and Thomas A. Weber. Dynamics of structural transitions in liquids. Physical Review, 28(4):2408, 1983.

[69] David J. Wales. Coexistence in small inert gas clusters. Mol. Phys., 78(1):151, 1993.

[70] Jonathan P. K. Doye and David J. Wales. Calculation of thermodynamic properties of small Lennard-Jones clusters incorporating anharmonicity. J. Chem. Phys., 102(24):9659, 1995.

[71] G. P. Johari and S. J. Jones. Study of the low-temperature transition in ice Ih by thermally stimulated depolarization measurements. J. Chem. Phys., 62(10):4213, 1975.

[72] Frank Harary. The number of oriented graphs. Michigan Math. J., 4:221, 1957.

[73] Frank Harary and Ed Palmer. Enumeration of locally restricted digraphs. Canad. J. Math., 18:853, 1966.

[74] I. Morrison, J.-C. Li, S. Jenkins, S. S. Xantheas, and M. C. Payne. Ab-initio total energy studies of the static and dynamical properties of ice-Ih. J. Phys. Chem., B101(32):6146, 1997.

[75] M. B. Boisen, Jr. and G. V. Gibbs. Mathematical Crystallography, volume 15 of Reviews in Mineralogy. Mineralogical Soc. of America, Washington, D.C., 2nd edition, 1990.

[76] W. H. Bragg. The crystal structure of ice. Proc. Phys. Soc., 34:98, 1922.

[77] W. H. Barnes. The crystal structure of ice between 0°C and −183°C. Proc. R. Soc., A324:127, 1929.

[78] Ernesto Cota and William G. Hoover. Computer simulation of hexagonal ice. J. Chem. Phys., 67(8):3839, 1977.

[79] M. W. Jurema, K. N. Kirschner, and G. C. Shields. Modeling of magic water clusters (H$_2$O)$_{20}$ and (H$_2$O)$_{21}$H$^+$ with the PM3 quantum-mechanical method. J. Comp. Chem., 14(11):1326, 1993.

[80] Arshad Khan. Examining the cubic, fused cubic, and cage structures of (H$_2$O)$_n$ for n=8,9,12,16,20, and 21: Do fused cubic structures form? J. Phys. Chem., 99(33):12450, 1995.

[81] Arshad Khan. Ab initio studies of (H$_2$O)$_{20}$H$^+$ and (H$_2$O)$_{21}$H$^+$ prismic, fused cubic and dodecahedral clusters: can H$_3$O$^+$ ion remain in cage cavity? Chem. Phys. Lett., 319(5-6):440, 2000.

[82] Kari Laasonen and Michael L. Klein. Structural study of (H$_2$O)$_{20}$ and (H$_2$O)$_{21}$H$^+$ using density functional methods. J. Phys. Chem., 98(40):10079, 1994.

[83] Sang-Won Lee, Patrick Freivogel, Thomas Schindler, and J. L. Beauchamp. Freeze-dried biomolecules: FT-ICR studies of the specific solvation of functional groups and clathrate formation observed by the slow evaporation of water from hydrated peptides and model compounds in the gas phase. J. Am. Chem. Soc., 120(45):11758, 1998.

[84] J. L. Kassner and D. E. Hagen. Comment on “clustering of water on hydrated protons in a supersonic free jet expansion”. J. Chem. Phys., 64(4):1861, 1976.

[85] T. F. Magnera, D. E. David, and J. Michl. Chem. Phys. Lett., 182:363, 1991.

[86] Z. Shi, J. V. Ford, S. Wei, and A. W. Castleman, Jr. Water clusters: contributions of binding energy and entropy to stability. J. Chem. Phys., 99(3):8009, 1993.

[87] P. L. M. Plummer and T. S. Chen. A molecular dynamics study of water clathrates. J. Phys. Chem., 87(21):4190, 1983.

[88] Umpei Nagashima, Hisanori Shinohara, Nobuyuki Nishi, and Hideki Tanaka. Enhanced stability of ion clathrate structures for magic number water clusters. J. Chem. Phys., 84(1):209, 1986.

[89] Hisanori Shinohara, Umpei Nagashima, Hideki Tanaka, and Nobuyuki Nishi. Magic numbers for water-ammonia binary clusters: enhanced stability of ion clathrate structures. J. Chem. Phys., 83(8):4183, 1985.

[90] R. E. Kozack and P. C. Jordan. Structure of H$^+$(H$_2$O)$_n$ clusters near the magic number $n = 21$. J. Chem. Phys., 99(4):2978, 1993.

[91] David E. Smith and Liem X. Dang. Computer simulations of cesium-water clusters: Do ion-water clusters form gas-phase clathrates? J. Chem. Phys., 101(9):7873, 1994.

[92] M. P. Hodges and D. J. Wales. Global minima of protonated water clusters. Chem. Phys. Lett., 324(4):279, 2000.

[93] M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, V. G. Zakrzewski, J. A. Montgomery, Jr., R. E. Stratmann, J. C. Burant, S. Dapprich, J. M. Millam, A. D. Daniels, K. N. Kudin, M. C. Strain, O. Farkas, J. Tomasi, V. Barone, M. Cossi, R. Cammi, B. Mennucci, C. Pomelli, C. Adamo, S. Clifford, J. Ochterski, G. A. Petersson, P. Y. Ayala, Q. Cui, K. Morokuma, D. K. Malick, A. D. Rabuck, K. Raghavachari, J. B. Foresman, J. Cioslowski, J. V. Ortiz, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, R. Gomperts, R. L. Martin, D. J. Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, C. Gonzalez, M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, J. L. Andres, C. Gonzalez, M. Head-Gordon, E. S. Replogle, and J. A. Pople. Gaussian98 (Revision A.6). Gaussian, Inc., Pittsburgh, PA, 1998.

[94] Jer-Lai Kuo, James V. Coe, Sherwin J. Singer, Yehuda B. Band, and Lars Ojamäe. On the use of graph invariants for efficiently generating hydrogen bond topologies and predicting physical properties of water clusters and ice. J. Chem. Phys., 114(6):2527, 2001.

[95] David J. Wales and Matthew P. Hodges. Global minima of (H$_2$O)$_n$, $n \le 21$, described by an empirical potential. Chem. Phys. Lett., 286:65, 1998.

[96] William L. Jorgensen and Jeffry D. Madura. Temperature and size dependence for Monte Carlo simulations of TIP4P water. Mol. Phys., 56:1381, 1985.

[97] Z. Li and H. A. Scheraga. Monte Carlo-minimization approach to the multiple-minima problem in protein folding. Proc. Natl. Acad. Sci. USA, 84(19):6611, 1987.

[98] R. J. Bartlett and D. M. Silver. Int. J. Quantum Chem. Symp., 8:271, 1974.

[99] R. J. Bartlett and D. M. Silver. Int. J. Quantum Chem. Symp., 9:183, 1975.

[100] J. S. Binkley and J. A. Pople. Int. J. Quantum Chem., 9:229, 1975.

[101] J. A. Pople, J. S. Binkley, and R. Seeger. Int. J. Quantum Chem. Symp., 10:1, 1976.

[102] In Car-Parrinello [13] type simulations of an excess proton in bulk water, Tuckerman and co-workers [112] found that the path of proton transfer is directed to the neighboring water that is deficient in the number of hydrogen bonds, being 3-coordinate while most of the bulk water molecules are 4-coordinate. The case we study is not analogous because all waters in the (H$_2$O)$_{20}$ dodecahedron are identically 3-coordinate. Furthermore, there are fundamental differences between migration of an excess proton and self-dissociation.

[103] David J. Anick. Comparison of octameric and decameric water zwitterions. J. Mol. Struct. (Theochem), 574(1-3):109, 2001.

[104] Chengteh Lee, Carlos Sosa, and Juan J. Novoa. Evidence of the existence of disso- ciated water molecules in water clusters. J. Chem. Phys., 103(10):4360, 1995.

[105] James O. Jensen, Alan C. Samuels, P. N. Krishnan, and Luke A. Burke. Ion pair formation in water clusters: a theoretical study. Chem. Phys. Lett., 276(1-2):145, 1997.

[106] Angela Smith, Mark A. Vincent, and Ian H. Hillier. Mechanism of acid dissociation in water clusters: Electronic structure studies of (H$_2$O)$_n$HX ($n = 4, 7$; X = OH, F, HS, HSO$_3$, OOSO$_2$H, OOH$\cdot$SO$_2$). J. Phys. Chem., A103(8):1132, 1999.

[107] M. I. Bernal-Uruchurtu and I. Ortega-Blake. On the molecular basis of water hydrolysis. A detailed ab initio study. J. Phys. Chem., A103(7):884, 1999.

[108] David J. Anick. Polyhedral water clusters, ii: correlations of connectivity parameters with electronic energy and hydrogen bond lengths. J. Mol. Struct. (Theochem), 587(1-3):97, 2002.

[109] David J. Anick. Polyhedral water clusters, i: formal consequences of the ice rules. J. Mol. Struct. (Theochem), 587(1-3):87, 2002.

[110] The parameters … and … in Table 4 of Ref. 108 are all negative and of similar magnitude, indicating that 2AW's adjacent to the H-bond donor shorten the bond to the same degree. Also, the … and … parameters in the same table are both positive and similar in magnitude to the other parameters, indicating that 2DW's adjacent to the acceptor lengthen the H-bond to the same degree.

[111] Phillip L. Geissler, Christoph Dellago, David Chandler, Jürg Hutter, and Michele Parrinello. Autoionization in liquid water. Science, 291(5513):2121, 2001.

[112] M. Tuckerman, K. Laasonen, M. Sprik, and M. Parrinello. Ab initio molecular dynamics simulation of the solvation and transport of hydronium and hydroxyl ions in water. J. Chem. Phys., 103(1):150, 1995.
