The Measurement of Free Energy by

Monte-Carlo Computer Simulation

Graham R Smith

A thesis submitted in fulfilment of the requirements

for the degree of Doctor of Philosophy

to the

University of Edinburgh

Abstract

One of the most important problems in statistical mechanics is the measurement of free energies, these being the quantities that determine the direction of chemical reactions and (the concern of this thesis) the location of phase transitions. While Monte Carlo (MC) computer simulation is a well-established and invaluable aid in statistical mechanical calculations, it is well known that, in its most commonly practised form (where samples are generated from the Boltzmann distribution), it fails if applied directly to the free energy problem. This failure occurs because the measurement of free energies requires a much more extensive exploration of the system's configuration space than do most statistical mechanical calculations: configurations which have a very low Boltzmann probability make a substantial contribution to the free energy, and the important regions of configuration space may be separated by potential barriers.

We begin the thesis with an introduction, and then give a review of the very substantial literature that the problem of the MC measurement of free energy has produced, explaining and classifying the various different approaches that have been adopted. We then proceed to present the results of our own investigations.

First, we investigate methods in which the configurations of the system are sampled from a distribution other than the Boltzmann distribution, concentrating in particular on a recently developed technique known as the multicanonical ensemble. The principal difficulty in using the multicanonical ensemble is the difficulty of constructing it: implicit in it is at least partial knowledge of the very free energy that we are trying to measure, and so to produce it requires an iterative process. Therefore we study this iterative process, using Bayesian inference to extend the usual method of MC data analysis, and introducing a new MC method in which inferences are made based not on the macrostates visited by the simulation but on the transitions made between them. We present a detailed comparison between the multicanonical ensemble and the traditional method of free energy measurement, thermodynamic integration, and use the former to make a high-accuracy investigation of the critical magnetisation distribution of the 2d Ising model from the scaling region all the way to saturation. We also make some comments on the possibility of going beyond the multicanonical ensemble to 'optimal' MC sampling.

Second, we investigate an isostructural solid-solid phase transition in a system consisting of hard spheres with a square-well attractive potential. Recent work, which we have confirmed, suggests that this transition exists when the range of the attraction is very small (the width of the attractive potential being a small fraction of the hard core diameter). First we study this system using a method of free energy measurement in which the square-well potential is smoothly transformed into that of the Einstein solid. This enables a direct comparison of a multicanonical-like method with thermodynamic integration. Then we perform extensive simulations using a different, purely multicanonical approach, which enables the direct connection of the two coexisting phases. It is found that the measurement of transition probabilities is again advantageous for the generation of the multicanonical ensemble, and can even be used to produce the final estimators.

Some of the work presented in this thesis has been published or accepted for publication; the references are:

G. R. Smith & A. D. Bruce, A Study of the Multicanonical Monte Carlo Method, J. Phys. A.

G. R. Smith & A. D. Bruce, Multicanonical Monte Carlo Study of a Structural Phase Transition, to be published in Europhys. Lett.

G. R. Smith & A. D. Bruce, Multicanonical Monte Carlo Study of Solid-Solid Phase Coexistence in a Model Colloid, to be published in Phys. Rev. E.

Declaration

This thesis has been composed by myself and it has not been submitted in any previous application for a degree. The work reported within was executed by me, unless otherwise stated.

March

for Christina and Ken

Acknowledgements

I would like to thank the following people: Alastair Bruce, for all his guidance, help and encouragement, and for never shouting at me even when I richly deserved it; Stuart Pawley and Nigel Wilding, for many useful and pleasant discussions; David Greig, Stuart Johnson, Stephen Bond and Stephen Ilett, for carefully reading and commenting on the final draft of this thesis; Peter Bolhuis, for making available the results of [...]; my flatmates, and all my other friends in Edinburgh and elsewhere.

I also gratefully acknowledge the support of a SERC/EPSRC research studentship.

Contents

Introduction
Thermodynamics, Statistical Mechanics, Free Energy and Phase Transitions
Phase Transitions
The Ising Model
Statistical Mechanics
Off-Lattice Systems
Calculation in Statistical Mechanical Problems
Analytic Methods
Monte-Carlo Simulation
Monte-Carlo Simulation at Phase Transitions
Discussion
Review
Integration-Perturbation Methods
Thermodynamic Integration
Multistage Sampling
The Acceptance Ratio Method
Mon's Finite-Size Method
Widom's Particle-Insertion Method
Histogram Methods
Non-Canonical Methods
Umbrella Sampling
Multicanonical Ensemble
The Expanded Ensemble
Valleau's Density-Scaling Monte Carlo
The Dynamical Ensemble
Grand Canonical Monte-Carlo
The Gibbs Ensemble
Other Methods
Coincidence Counting
Local States Methods
Rickman and Philpot's Methods
The Partitioning Method of Bhanot et al.
Discussion
Multicanonical and Related Methods
Introduction
The Multicanonical Distribution over Energy Macrostates
An Alternative: The Ground State Method
The Multicanonical Distribution over Magnetisation Macrostates
Techniques for Obtaining and Using the Multicanonical Ensemble
Methods Using Visited States
Incorporating Prior Information
Methods Using Transitions
Finite-Size Scaling
Using Transitions for Final Estimators; Parallelism and Equilibration
Results
Free Energy and Canonical Averages of the 2d Ising Model
A Comparison Between the Multicanonical Ensemble and Thermodynamic Integration
$P(M)$ at $\beta_c$
Beyond Multicanonical Sampling
The Multicanonical and Expanded Ensembles
The Problem
Optimal Sampling
Use of the Transition Matrix: Prediction of the 'Optimal' Distribution
Discussion
A Study of an Isostructural Phase Transition
Introduction
Comparison of Thermodynamic Integration and the Expanded Ensemble: Use of an Einstein Solid Reference System
Thermodynamic Integration
Expanded Ensemble with Einstein Solid Reference System
Other Issues
Direct Method: Multicanonical Ensemble with Variable V
The Multicanonical NpT Ensemble and its Implementation
The Pathological Nature of the Square-Well System
Finding the Preweighting Function
The Production Stage
Canonical Averages
Finite-Size Scaling and the Interfacial Region
Mapping the Coexistence Curve
The Physical Basis of the Phase Transition
Discussion
Conclusion
A Exact Finite-Size Scaling Results for the Ising Model
B The Double-Tangent Construction
C Statistical Errors and Correlation Times
D Jackknife Estimators
E Details of the Square-Well Solid Simulation

Chapter

Introduction

We begin by giving necessary background to the work carried out in this thesis. We shall deal with the thermodynamical and statistical mechanical notions that underpin our understanding of phase transitions, in particular the ideas of entropy and free energy. We shall describe the role of computer simulation, especially Monte-Carlo simulation, and explain why the measurement of free energy presents particular challenges.

Thermodynamics, Statistical Mechanics, Free Energy and Phase Transitions

Who could ever calculate the path of a molecule? How do we know that the creation of worlds are not determined by falling grains of sand?

FROM Les Misérables

VICTOR HUGO

Thermodynamics and statistical mechanics are theories which describe the behaviour of a bulk material comprising large numbers of interacting particles distributed in space, for example the molecules of a solid, liquid or gas.

Thermodynamics, as its name implies, is concerned with the bulk energy of the material, work done on it and heat flows in and out of it. It does not acknowledge explicitly the microscopic interactions of the particles that compose the bulk; indeed, the theory evolved from empirical observations and experiment at a time when the microscopic nature of materials was not understood at all. For this reason the development of the theory was itself a painful process, and it was not connected with the physical principles of mechanics until the work of Boltzmann and Gibbs at the end of the last century, and of Einstein at the start of this, led to the development of statistical mechanics. For a traditional exposition of thermodynamics see [...] or the first chapters of [...].

How might we attempt to use a model of the microscopic structure of a macroscopic sample of a material to calculate its properties? In a Classical Mechanical framework it is clear that, given the initial positions and velocities of all the particles that compose the system (i.e. the initial microstate of the system) and knowledge of the interactions, it is possible in principle to calculate the state at any later time from the known laws of dynamics. But it is equally clear that such a calculation will never be possible in practice, for two reasons: firstly, because the number of particles in a macroscopic system is $O(10^{23})$, which is simply too large for any existing or foreseeable computer; and secondly, because it is likely that the dynamics are in detail chaotic, that is to say, the later evolution depends with enormous sensitivity on the initial state.

Nevertheless, it is apparent from our observations of real systems that this microscopic complexity does not result in macroscopic complexity. Materials have well-defined bulk properties that depend only on the values of a few thermodynamic control parameters, such as pressure, temperature, magnetic field etc., not on the continually varying positions and velocities of all their innumerable constituent particles. Moreover, these bulk properties seem (to all but the most precise measurements) not to vary with time, provided the control parameters remain fixed, even though there is continuous microscopic activity; and, when suitably normalised, they are the same for all macroscopic samples, large or small (footnote 1). This gives a hint that it might be possible to construct a theory which predicts the bulk properties without any details of the kinetics, and which depends in only a simple way on the number of particles: an enormous simplification. This is indeed what statistical mechanics achieves: on the basis of a knowledge only of the interactions between the constituent particles, it provides expressions for the bulk properties. We shall not give a detailed derivation of the formalism of statistical mechanics here, or connect it fully with thermodynamics, and results will often be stated without proof. A full exposition can be found in [...], and other useful references are [...].

Footnote 1: As long as they are large enough that the number of particles near the boundary of the sample is small compared with the number in the bulk. The bulk properties of a material are normally proportional to the quantity of material present: they are extensive. When we refer to a quantity as a density it means that we have divided an extensive quantity by the number of particles. The result is intensive, or effectively independent of the quantity of material. The control parameters are naturally intensive.

Phase Transitions

We have just noted that the bulk properties of a material (such as the internal energy density, or the magnetisation density if the material is magnetic) depend on a few control parameters like the temperature. Normally they vary smoothly with the control parameters, but there exist a few points where they may jump suddenly from one value to another, i.e. points where the material changes its properties dramatically. Where this happens we say that the material undergoes a phase transition. The property that jumps can be used as a so-called order parameter of the transition: by a suitable choice of origin we can make it zero in one phase and finite in the other, or of the same magnitude and opposite sign in the two phases, which is how it is naturally defined for the model we shall introduce below. Examples of phase transitions that are found in almost all simple atomic or molecular materials are the melting and boiling transitions. Here there are two obvious order parameters: one is the internal energy density, which changes by $L_H - p\Delta V$, where $L_H$ is the latent heat of the transition, $p$ the pressure and $\Delta V$ the change in volume; the other is the specific volume $v = V/N$, where $N$ is the number of particles in the system, which changes more than a thousandfold at the liquid-vapour transition of water at atmospheric pressure. Transitions of this kind, where the order parameter is discontinuous, are called first order phase transitions. Exactly at the transition values of the control parameters, volumes of the material in states characteristic of the two sides of the phase transition can exist in contact with one another: these are coexisting phases.

There is another class of phase transitions, known as second order or continuous transitions, in which there is no latent heat and the order parameter itself does not change discontinuously, but instead experiences large fluctuations, and its derivative with respect to one of the control parameters diverges. An example of this is the ferromagnetic transition of iron, in which the spontaneous magnetisation, which is the order parameter for the transition, disappears continuously at a temperature of about 1043 K, while the susceptibility diverges. The point in the space of control parameters where a continuous phase transition occurs is called a critical point. Critical points occur as the limit of a series of first order transitions in which the change in the order parameter has been getting progressively smaller, though not all lines of first order phase transitions terminate in critical points (solid-liquid melting curves do not appear to do so). The unusual phenomena associated with critical points have been the subject of a huge amount of study in the past thirty years or so.

Clearly we would like to know which phase will be found at particular values of the control parameters, and particularly at what values of these control parameters phase transitions will occur. We would also like to be able to calculate the bulk mechanical and thermodynamic properties of the material in its various phases. Statistical mechanics provides answers to these questions in principle; however, we shall later see that the expressions that it enables us to write down, particularly those relating to the location of phase transitions, are difficult to evaluate accurately, either analytically or computationally. The main concern of this thesis has been the investigation of various computational techniques designed to overcome these difficulties. We shall explain our intentions in more detail when we have provided more background, to bring the particular problems associated with phase transitions more clearly into focus. Computer simulations are introduced later in this chapter and reviewed in more detail in the following chapter. In the remainder of this section we shall introduce relevant aspects of statistical mechanics itself. We shall illustrate our explanation of statistical mechanics using one of the models that will be widely used in the investigations of this thesis: the Ising model.

The Ising Model

First the gypsies brought the magnet. A corpulent gypsy, who introduced himself as Melquíades, made a showy public demonstration of what he himself described as the eighth marvel of the wise alchemists of Macedonia.

FROM One Hundred Years of Solitude

GABRIEL GARCIA MARQUEZ

The Ising model is a simple model of a magnetic system, where the particles are classical 'spins' on a lattice. It is defined by the characteristic that each spin $s_i$ may exist in one of two states, called 'up' and 'down', or $+1$ and $-1$. There is a coupling $J_{ij}$ which defines an interaction energy between the spins at sites $i$ and $j$: $E_{ij} = -J_{ij} s_i s_j$. The choice of sign is a convention. Aside from this, the system may have any dimensionality and any type of lattice, and there may also be an external magnetic field. However, we shall be particularly concerned with the Ising model on an $L \times L$ two-dimensional square lattice with the choice

$$ J_{ij} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are nearest neighbours} \\ 0 & \text{otherwise} \end{cases} $$

There is no loss of generality in choosing $J = 1$ since, as will become apparent later, any other choice just results in a scaling of temperature. We shall impose periodic boundary conditions (PBC), meaning that spins $(i, L)$ and $(i, 1)$ are neighbours for all $i$, as are spins $(L, j)$ and $(1, j)$ for all $j$. Thus all spins are equivalent and each interacts with its four nearest neighbours only. The coupling is positive, so that the spin-spin energy is lower when spins are aligned parallel and higher when they are antiparallel. The ground state is therefore a state where all spins are parallel, for which reason the model with positive $J_{ij}$ is known as the ferromagnetic Ising model. The total energy due to spin-spin interactions, which we shall call configurational energy (footnote 2) and represent by $E(\sigma)$, is therefore given by

$$ E(\sigma) = -\sum_{\langle ij \rangle} s_i s_j $$

where $\sum_{\langle ij \rangle}$ denotes a sum over nearest-neighbour pairs and $\sigma$ represents a particular arrangement of all the spins, also called a configuration or microstate. The number of spins in the system is $N = L^2$. The magnetisation $M$ is simply

$$ M(\sigma) = \sum_i s_i $$

which interacts with an external magnetic field $H$ to give an extra term $-HM(\sigma)$ in the energy (footnote 3).

The total energy, often called the Hamiltonian, is therefore $E_{TOT}$:

$$ E_{TOT}(\sigma) = E(\sigma) - H M(\sigma) $$
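As a concrete illustration of these definitions, the following is a minimal Python sketch (using NumPy) that evaluates $E(\sigma)$, $M(\sigma)$ and $E_{TOT}(\sigma)$ for a configuration stored as an $L \times L$ array of $\pm 1$ values, with $J = 1$ and periodic boundary conditions. The function names and the array representation are illustrative assumptions of ours, not part of the original text.

```python
import numpy as np

def configurational_energy(spins):
    """E(sigma) = -sum over nearest-neighbour pairs of s_i s_j (J = 1, PBC)."""
    # np.roll shifts the lattice by one site with wrap-around, which implements
    # the periodic boundary conditions. One shift per axis counts each of the
    # 2N nearest-neighbour bonds exactly once (this assumes L >= 3).
    bonds = spins * np.roll(spins, 1, axis=0) + spins * np.roll(spins, 1, axis=1)
    return -int(bonds.sum())

def magnetisation(spins):
    """M(sigma) = sum over sites of s_i."""
    return int(spins.sum())

def total_energy(spins, H=0.0):
    """E_TOT(sigma) = E(sigma) - H M(sigma)."""
    return configurational_energy(spins) - H * magnetisation(spins)

L = 4
all_up = np.ones((L, L), dtype=int)                        # a ground state
checkerboard = 2 * (np.indices((L, L)).sum(axis=0) % 2) - 1  # fully antiparallel

print(configurational_energy(all_up), magnetisation(all_up))
print(total_energy(all_up, H=0.5))
print(configurational_energy(checkerboard), magnetisation(checkerboard))
```

The fully aligned configuration attains the lowest possible energy, $-2N$, while the checkerboard (possible without frustration only for even $L$) attains the highest, $+2N$, with zero magnetisation.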

Despite the simplicity of the Ising model, it exhibits a phase transition driven by the external field $H$. At temperatures below the critical temperature $T_c$ this is a first-order transition; at $T_c$ it becomes continuous, and for $T > T_c$ it disappears. The order parameter is the magnetisation $M$, or magnetisation density $m = M/N$. This phase behaviour is qualitatively (and in some respects quantitatively) very similar to that of a real ferromagnet like iron. We have therefore to look at the phase behaviour as a function of the two variables $H$ and $T$; however, it makes for slightly tidier notation if instead of $T$ we use the inverse temperature $\beta \equiv 1/(k_B T)$, so we shall do this from the start. First consider the dependence of $m$ on $H$ for three different values of $\beta$ (see the figure below).

Footnote 2: In the literature the symbols $H$, $U$ and $V$ are also frequently used for configurational energy.

Footnote 3: The choice of sign is once again conventional: one may characterise the energy of a magnetic material either in terms of the energy stored in the field or the work done on the solid, and both conventions are in use. We have chosen to consider the second case.

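[Figure: three panels, labelled 1, 2 and 3, each plotting $m$ against $H$, for $\beta > \beta_c$, $\beta = \beta_c$ and $\beta < \beta_c$ respectively; the values $\pm m_0$ are marked.]

Figure: Schematic diagrams of magnetisation density vs. external magnetic field for the Ising model, for three different values of inverse temperature.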

The qualitative behaviour at high $|H|$ is the same in all cases: $m \to \mathrm{sign}(H)$. However, the behaviour as $H \to 0$ depends on $\beta$. At high temperatures, or low $\beta$, as in graph 3 of the figure, $m \to 0$ as $H \to 0$, while at high $\beta$ (graph 1) there is a residual magnetisation left after the field has gone to zero: $m \to m_0$ as $H \to 0^+$ and $m \to -m_0$ as $H \to 0^-$. Therefore, if at inverse temperature $\beta > \beta_c$ the field is taken from just below zero to just above it, the magnetisation jumps discontinuously by $2m_0$. We have therefore a first order phase transition. The point of crossover between these two regimes occurs at $\beta = \beta_c$ (graph 2). Here there is no residual magnetisation at $H = 0$ but $(\partial m / \partial H)|_{H=0}$ diverges, so there is a continuous phase transition, and $\beta_c$ is the inverse critical temperature. If the system is cooled from $\beta < \beta_c$ to $\beta > \beta_c$ at $H = 0$ then the system chooses either the $+m_0$ or $-m_0$ state with equal probability, a phenomenon known as spontaneous symmetry breaking.
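The dependence of $m$ on $H$ described above can be explored on a very small lattice by summing exactly over every configuration. The sketch below does this for a $3 \times 3$ lattice, weighting each configuration by $\exp(-\beta[E - HM])$, the canonical weight that this chapter goes on to introduce. The lattice size, the values of $\beta$ and $H$, and the function name `mean_m` are illustrative assumptions of ours.

```python
import itertools
import numpy as np

L = 3          # 2**9 = 512 configurations, small enough to enumerate exactly
N = L * L

# Tabulate E(sigma) and M(sigma) once for every configuration (J = 1, PBC).
E_list, M_list = [], []
for bits in itertools.product((-1, 1), repeat=N):
    s = np.array(bits).reshape(L, L)
    bonds = s * np.roll(s, 1, axis=0) + s * np.roll(s, 1, axis=1)
    E_list.append(-bonds.sum())
    M_list.append(s.sum())
E = np.array(E_list, dtype=float)
M = np.array(M_list, dtype=float)

def mean_m(beta, H):
    """Exact average magnetisation density under weights exp(-beta (E - H M))."""
    log_w = -beta * (E - H * M)
    w = np.exp(log_w - log_w.max())   # subtract the maximum to avoid overflow
    return float(np.sum(w * M) / np.sum(w)) / N

for beta in (0.2, 0.6):
    print(beta, [round(mean_m(beta, H), 3) for H in (-2.0, -0.05, 0.0, 0.05, 2.0)])
```

For a finite lattice the average is a smooth, odd function of $H$ that vanishes exactly at $H = 0$; the discontinuity appears only in the limit of large $N$. What the small system does show is the saturation at large $|H|$ and a response near $H = 0$ that steepens as $\beta$ increases.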

The usual way of summarising phase behaviour of this kind is by way of phase diagrams. The two most usual kinds are shown in the figure below. We may plot $m$, the order parameter for the transition, against temperature or inverse temperature; this is done in the diagram on the left. Note that if $m$ is free to vary then those points lying between $+m_0$ and $-m_0$ for $\beta > \beta_c$ do not represent equilibrium points of the system; however, if we constrain the total magnetisation to be constant then this region of the phase diagram contains states which we call mixed states, which describe two-phase coexistence: the system under these conditions separates into large domains, each with a magnetisation equal to $+m_0$ or $-m_0$, the domains having that volume which keeps the total magnetisation at its constrained value. The other kind of phase diagram is shown in the diagram on the right of the figure. Here the axes are the two fields $H$ and $\beta$; the phase transitions then appear as a single line, which in the Ising case lies along the axis $H = 0$ and ends at the critical point $\beta_c$. This line is called the coexistence curve. All the mixed states also lie on this line, because they differ only in their value of the total magnetisation. In mathematical terms the coexistence curve is the line on which $m$ is not a single-valued function of $H$ and $\beta$. In fact the two phase diagrams are really projections of a surface in $m$-$H$-$\beta$ space, on which each point represents the value of $m$ produced by the fields $H$ and $\beta$. This surface is drawn in a later figure. The $m$-$H$ graphs in the earlier figure are constant-$\beta$ sections through it.

[Figure: two panels, labelled 1 and 2; panel 1 plots $m$ against $\beta$ and panel 2 plots $H$ against $\beta$, with $m_0$ and $\beta_c$ marked.]

Figure: Schematic $m$-$\beta$ and $H$-$\beta$ phase diagrams for the Ising model.

We should remark here that both a computer-simulated Ising model and a real ferromagnet, particularly one made of a pure metal, are liable to show the phenomenon of hysteresis instead of the sudden first order phase transition at $H = 0$. If the system has a positive $m$ and $H$ is decreased and then made negative, $m$ remains positive at first, and requires the applied field to be appreciably negative to drive it to its negative equilibrium value. The same thing happens if $H$ is made positive and $m$ is initially negative. This phenomenon, generally called metastability, occurs widely in nature. Metastability interferes with the measurement of the true location of phase transitions, both in experiments and in computer simulation; it will be a frequent concern of ours to try to develop simulation techniques that are as little affected by it as possible.

[Figure: a surface plotted over axes $m$, $H$ and $\beta$, with $\beta_c$ marked.]

Figure: The surface of state points in $m$-$H$-$\beta$ space for the Ising model.

The two-dimensional Ising model with $H = 0$ is particularly interesting because many of its properties, including the phase behaviour, can be exactly calculated analytically, so it is an ideal test-bed for computer simulation methods which can then be applied to other systems. There exist exact solutions for the internal energy, the heat capacity and the free energy (which we define in the next section), both for finite systems and in the $N \to \infty$ limit. In the latter case $m$ can be evaluated too. A detailed exposition of many of the exactly-known properties of the 2d Ising model can be found in [...]. A typical configuration (typical in the sense that its magnetisation lies near the peak of the probability density function (p.d.f.) of $M$) of the 2d Ising model at the critical point $H = 0$, $\beta = \beta_c = \frac{1}{2} \ln(1 + \sqrt{2})$ is shown in the figure below. This system is just large enough for the self-similar structure of the critical clusters of different magnetisations to be becoming apparent; this behaviour is one of the most interesting features of the critical point, and is central to the renormalisation group theory of continuous phase transitions. Further detail and references can be found elsewhere in this thesis.

Figure: A typical configuration of the critical 2d Ising model. White corresponds to one value of the spin $s_i$ and black to the other.

Now, in order to explain this phase behaviour of the Ising model, we must introduce some statistical mechanics. The initial discussion is general, but we shall return to the Ising model as a specific example to clarify the discussion of phase transitions.

Statistical Mechanics

We shall begin by stating the equation which is the basis of statistical mechanics:

$$ P^{can}(\sigma) = \frac{\exp(-\beta E_{TOT}(\sigma))}{Z} $$

The probability distribution given by this equation is called the Boltzmann distribution. For a standard derivation of it see [...], or for an alternative and simpler derivation using information theory see [...]. We shall motivate it here by showing that it is consistent with, and illuminates connections between, the simple physical properties of bulk material that we described at the start of this chapter. We shall start by considering materials away from phase transitions, moving on to phase transitions later.

The Boltzmann distribution relates the total energy $E_{TOT}(\sigma)$ to $P^{can}(\sigma)$, the probability that the system in equilibrium will be observed in configuration $\sigma$. Note that only the energy of the configuration appears in this expression; there are no momentum terms and no dynamics. In the sense that a probability can be interpreted as a time-average, the dynamics can be regarded as having been 'averaged out', though this interpretation is not necessary to the truth of the Boltzmann distribution, as is discussed in [...].

The normalising factor $Z$, the partition function, is given by

$$ Z = \sum_{\{\sigma\}} \exp(-\beta E_{TOT}(\sigma)) $$

where $\sum_{\{\sigma\}}$ is a sum over all the microstates of the system, which for the 2d Ising model means over all configurations of the spins. The logarithm of the partition function gives the Gibbs free energy, which is defined by

$$ G(\beta, H) = -\beta^{-1} \ln Z $$

It is also a very important quantity: we shall find later that it is $g \equiv G/N$ that determines the location of phase transitions. The set of microstates available to the system, over which the sum runs, is called the ensemble; here we are considering an ensemble where the system's volume, temperature and number of particles are constant. The choice of ensemble will in general affect how many terms there are in $E_{TOT}(\sigma)$: in the Ising model it is in general necessary to include the $HM$ term as well as the configurational energy $E$, though in most of the cases we shall consider $H$ will be zero. However, in an ensemble where $M$ was constant (i.e. we considered only those configurations with a particular magnetisation) the $HM$ term would be the same for all configurations, and so irrelevant, cancelling between the numerator and denominator of the Boltzmann distribution.
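For a very small lattice the sum over microstates can be carried out explicitly, which gives a direct check on these definitions. The following sketch enumerates all $2^9$ configurations of a $3 \times 3$ Ising lattice and evaluates $Z$, $P^{can}(\sigma)$ and $G = -\beta^{-1} \ln Z$. The lattice size, the value of $\beta$ and the function names are illustrative assumptions of ours; brute-force enumeration is feasible only for a handful of spins.

```python
import itertools
import numpy as np

L = 3
N = L * L

def total_energy(s, H):
    """E_TOT(sigma) = E(sigma) - H M(sigma), with J = 1 and PBC."""
    bonds = s * np.roll(s, 1, axis=0) + s * np.roll(s, 1, axis=1)
    return -float(bonds.sum()) - H * float(s.sum())

def canonical_ensemble(beta, H=0.0):
    """Return (Z, P_can, E_TOT), the last two with one entry per microstate."""
    configs = [np.array(b).reshape(L, L)
               for b in itertools.product((-1, 1), repeat=N)]
    E_tot = np.array([total_energy(s, H) for s in configs])
    weights = np.exp(-beta * E_tot)      # unnormalised Boltzmann factors
    Z = weights.sum()                    # partition function
    return Z, weights / Z, E_tot

beta = 0.4
Z, P, E_tot = canonical_ensemble(beta)
G = -np.log(Z) / beta                    # Gibbs free energy
print("Z =", Z, " G =", G, " g = G/N =", G / N)
print("most probable microstate has E_TOT =", E_tot[np.argmax(P)])
```

At $\beta = 0$ every Boltzmann factor is unity and $Z$ reduces to the number of microstates, $2^N$. At $H = 0$ and $\beta > 0$ the two fully aligned configurations are the most probable microstates and are equally likely.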

Now let us consider what predictions the theory makes for the values of the bulk properties, or 'observables', of the system. These observables, such as internal energy, are canonical averages (averages over the Boltzmann distribution): we define an operator which acts on a configuration to give a number which is the value of the property for that configuration. Then the observed bulk property will be the average of the operator over the Boltzmann distribution of the configurations of the system. For a general operator $O$ we have

$$ \langle O \rangle = \sum_{\{\sigma\}} O(\sigma) P^{can}(\sigma) = Z^{-1} \sum_{\{\sigma\}} O(\sigma) \exp(-\beta E(\sigma)) $$

so for configurational energy (internal energy) we have

$$ \langle E \rangle = Z^{-1} \sum_{\{\sigma\}} E(\sigma) \exp(-\beta E(\sigma)) $$

If the interactions are fairly short-ranged, as for most molecular materials and the Ising model, then we expect $\langle E \rangle \propto N$; this is in fact a general property of what are called 'normal' systems in statistical mechanics ([...], chapter [...]). Important systems that are not normal include electrically charged and gravitationally bound ones. Now let us consider the heat capacity $C_H = \partial \langle E \rangle / \partial T$. This is also known to be extensive for short-ranged interactions (if it were not we could violate conservation of energy by cutting up a system, heating it, reassembling it and cooling it). However, by differentiating the expression for $\langle E \rangle$ it is easy to show ([...], chapter [...]) that the r.m.s. size of fluctuations in $E$ is given by $\delta E = \sqrt{k_B T^2 C_H}$. Thus $\delta E \sim \sqrt{N}$, and so the fluctuations in the internal energy density $e = E/N$ die away as $1/\sqrt{N}$. For macroscopically-sized samples of material the fluctuations will be too small to observe and $\langle E \rangle$ is effectively identical to $E^*$, the most probable value of the energy: $P^{can}(E)$ is thus very sharply peaked about its maximum or mode at $E^*$ (in the single-phase region there is only one mode). The large-$N$ limit is also called the thermodynamic limit, since here the predictions of statistical mechanics become fully consistent with those of thermodynamics.
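The relation between energy fluctuations and heat capacity can be checked numerically on a small lattice. The sketch below computes $\langle E \rangle$ and the variance of $E$ by exact enumeration of a $3 \times 3$ Ising lattice at $H = 0$, and compares $C_H$ obtained from the fluctuation formula, $(\langle E^2 \rangle - \langle E \rangle^2) / (k_B T^2)$, with a central finite-difference estimate of $\partial \langle E \rangle / \partial T$. Units with $k_B = 1$, the temperature, the step size and the function name are illustrative assumptions of ours.

```python
import itertools
import numpy as np

L = 3
N = L * L

# Configurational energy E(sigma) of every microstate (J = 1, PBC, H = 0).
energies = []
for bits in itertools.product((-1, 1), repeat=N):
    s = np.array(bits).reshape(L, L)
    bonds = s * np.roll(s, 1, axis=0) + s * np.roll(s, 1, axis=1)
    energies.append(-float(bonds.sum()))
energies = np.array(energies)

def energy_moments(T):
    """Return (<E>, <E^2> - <E>^2) at temperature T, in units with k_B = 1."""
    w = np.exp(-(energies - energies.min()) / T)   # shifted for numerical safety
    p = w / w.sum()
    mean = np.sum(p * energies)
    return mean, np.sum(p * (energies - mean) ** 2)

T, h = 2.5, 1e-4
mean_E, var_E = energy_moments(T)
C_from_fluctuations = var_E / T**2
C_from_derivative = (energy_moments(T + h)[0] - energy_moments(T - h)[0]) / (2 * h)
print(C_from_fluctuations, C_from_derivative)
```

The two estimates should agree up to the discretisation error of the finite difference. The fluctuation form is the one usually preferred in simulation, since it needs data at a single temperature only.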

This is a convenient time to introduce the notion of the density of states $\Omega(O)$, where $O$ is an operator on the microstates as before. We define

$$ \Omega(O) = \sum_{\{\sigma\}} \delta(O - O(\sigma)) $$

so that it is just the number of states for which the operator has the value $O$. The possible values of $O$ are called the $O$-macrostates of the system. The total number of microstates is

$$ \Omega_{TOT} = \sum_{\{\sigma\}} 1 $$

which may or may not be finite, though it is in the Ising case. Generally $\Omega_{TOT}$ increases exponentially with system size: $\Omega_{TOT} \sim \exp(cN)$ for some constant $c$. Let us first look at the particularly simple case of $O = E$, so we are considering the configurational energy macrostates. The Boltzmann distribution, where $P^{can}(\sigma)$ is a function of $E(\sigma)$ only, implies that here all microstates within a macrostate are equally probable, so the probability of an $E$-macrostate is

$$ P^{can}(E) = Z^{-1} \Omega(E) \exp(-\beta E) $$

which may also be written as

$$ P^{can}(E) = Z^{-1} \exp(\beta [T S(E) - E]) $$

or

$$ P^{can}(E) = Z^{-1} \exp(-\beta F(E)) $$

where $S(E) = k_B \ln \Omega(E)$ is the entropy and $F(E) = E - T S(E)$ is the Helmholtz free energy, also sometimes called the free energy functional. These quantities are also extensive, and densities $f = F/N$ and $s = S/N$ are defined as usual. $F$ and $S$ defined here are in fact identical, in the $L \to \infty$ limit, to the quantities for which we use the same symbols in thermodynamics. The identity can be shown rigorously ([...], chapters [...]).
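The density of states and the macrostate free energy can be tabulated directly for a small system. The following sketch counts $\Omega(E)$ for a $3 \times 3$ Ising lattice by enumeration, then forms $S(E)$, $F(E)$ and $P^{can}(E)$ and confirms that the two expressions for $P^{can}(E)$ coincide. Units with $k_B = 1$, the lattice size, the value of $\beta$ and the variable names are illustrative assumptions of ours.

```python
import itertools
from collections import Counter
import numpy as np

L = 3
N = L * L

# Omega(E): number of microstates in each energy macrostate (J = 1, PBC).
omega = Counter()
for bits in itertools.product((-1, 1), repeat=N):
    s = np.array(bits).reshape(L, L)
    bonds = s * np.roll(s, 1, axis=0) + s * np.roll(s, 1, axis=1)
    omega[-int(bonds.sum())] += 1

beta = 0.4                      # inverse temperature; k_B = 1, so T = 1/beta
T = 1.0 / beta
E_vals = np.array(sorted(omega))
Omega = np.array([omega[int(E)] for E in E_vals], dtype=float)

S = np.log(Omega)               # S(E) = k_B ln Omega(E)
F = E_vals - T * S              # F(E) = E - T S(E)
Z = np.sum(Omega * np.exp(-beta * E_vals))
P_E = Omega * np.exp(-beta * E_vals) / Z    # Z^-1 Omega(E) exp(-beta E)
P_E_from_F = np.exp(-beta * F) / Z          # Z^-1 exp(-beta F(E))

for E, n, p in zip(E_vals, Omega, P_E):
    print(int(E), int(n), round(float(p), 4))
```

Summing over macrostates needs far fewer terms than summing over microstates, which is why knowledge of $\Omega(E)$ is so valuable. The macrostate that minimises $F(E)$ is, by construction, the one that maximises $P^{can}(E)$.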

The expression for $P^{can}(E)$ in terms of $S(E)$ makes it clear that the probability of a macrostate is controlled by an interplay of its energy and entropy. At low temperatures (large $\beta$) the energy term dominates, and the most probable macrostates are those of low energy. At high temperatures (small $\beta$) the energy has less effect, and those macrostates which comprise the largest number of microstates, that is, have the largest entropy, are the most probable. It is interesting to consider what the typical microstates of low-temperature and high-temperature macrostates are like. At low temperature low energies are favoured, and these are best achieved (for attractive forces) by surrounding each particle with as many neighbours as possible, or, for the Ising model, by surrounding each spin with neighbours having the same orientation. These low-energy configurations are typically highly ordered and few in number: there are only a few close-packed crystal lattices, and only two ways of arranging the Ising model's spins so that they are all parallel. Conversely, at high temperatures all microstates are equally probable whatever their energy, so those macrostates having large numbers of microstates are favoured. However, consideration of producing configurations by placing particles or orienting spins randomly, which would generate all microstates with equal likelihood, shows that these configurations will almost always have rather high energy. Thus the notion of entropy as a measure of order begins to emerge. It is interesting to see how such an apparently unquantifiable idea becomes associated with the multiplicity of configurations of a particular energy. This happens because the nature of the force is such that low energy demands very ordered configurations, and these are geometrically restricted to be few in number, while the overwhelming majority of the much more numerous disordered configurations have high energy.

We can now rewrite the canonical averages as averages over the macrostates, for example

$$\langle E \rangle = Z^{-1} \sum_E \Omega(E)\, E \exp(-\beta E) = Z^{-1} \sum_E E \exp(-F(E)).$$

$F(E)$ and $P^{\mathrm{can}}(E)$ are shown in the accompanying figure. We showed above that the fractional fluctuations in $E$ about $\langle E \rangle$ die away like $1/\sqrt{N}$, so $P^{\mathrm{can}}(E)$ is very sharply peaked about its maximum at $E^*$ (the study of the scaling of probability distribution functions and the related canonical averages with system size is known as finite-size scaling theory). Since $P^{\mathrm{can}}(E) \propto \exp(-F(E))$, the behaviour of $P^{\mathrm{can}}(E)$ leads to the thermodynamic principle that a system seeks out the value of $E$ that minimises $F(E)$.
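The macrostate description above can be made concrete on a system small enough to enumerate. The following is a minimal sketch, not the method used in the thesis: it assumes a 3 x 3 periodic Ising lattice with coupling J = 1 in zero field, and the function names are hypothetical. It counts the density of states $\Omega(E)$ by brute force and forms $F(E) = \beta E - \ln \Omega(E)$, so that $P^{\mathrm{can}}(E) = \exp(-F(E))/Z$.

```python
import itertools
import math
from collections import Counter

L = 3      # linear lattice size (illustrative: 2**9 = 512 microstates)
J = 1.0    # ferromagnetic coupling (assumed units)


def energy(spins):
    """Nearest-neighbour Ising energy on an L x L periodic lattice, zero field."""
    e = 0.0
    for x in range(L):
        for y in range(L):
            s = spins[x * L + y]
            right = spins[((x + 1) % L) * L + y]
            down = spins[x * L + (y + 1) % L]
            e -= J * s * (right + down)   # each bond is counted exactly once
    return e


def density_of_states():
    """Omega(E): number of microstates at each energy, by full enumeration."""
    return Counter(energy(s) for s in itertools.product((-1, 1), repeat=L * L))


def macrostate_probabilities(beta, omega):
    """Return F(E) = beta*E - ln Omega(E) and P(E) = exp(-F(E)) / Z."""
    F = {E: beta * E - math.log(n) for E, n in omega.items()}
    Z = sum(math.exp(-f) for f in F.values())
    P = {E: math.exp(-f) / Z for E, f in F.items()}
    return F, P


omega = density_of_states()
for beta in (0.1, 2.0):
    F, P = macrostate_probabilities(beta, omega)
    E_star = max(P, key=P.get)   # most probable energy macrostate
    print(f"beta = {beta}: most probable E = {E_star}, P = {P[E_star]:.3f}")
```

Enumeration is only feasible for very small lattices, since the number of microstates grows as $2^N$; this is precisely why the sampling methods discussed later are needed.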

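In fact the extensivity of $F(E)$ is another confirmation of the fact that the width of $P^{\mathrm{can}}(e)$ scales as $1/\sqrt{N}$: if we write $P^{\mathrm{can}}_L(e)$ in terms of the bulk free energy density $f(e) = \lim_{L \to \infty} L^{-d} F_L(eL^d)$ (writing $eL^d$ for $E$), it is found that

$$P^{\mathrm{can}}_L(e) \simeq \exp(-L^d f(e)) \propto \exp\left(-\frac{(e - e^*)^2}{2\sigma_e^2}\right).$$

Thus $\sigma_e \sim L^{-d/2}$ and so $\sigma_E \sim L^{d/2}$, i.e. $P^{\mathrm{can}}(E)$ is a Gaussian with half-width $\sim L^{d/2}$.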

Figure: A schematic diagram of the free energy functional $F(E)$ and the macrostate probability $P^{\mathrm{can}}(E) \sim \exp(-F(E))$. Note that $E^* \sim N$ while the half-width $\delta E \sim \sqrt{N}$.

We can describe the Ising model's magnetisation in a similar way to that in which we described the energy:

$$\langle M \rangle = Z^{-1} \sum_{\{\sigma\}} M(\sigma) \exp[-\beta(E(\sigma) - H M(\sigma))].$$

As was the case with $E$, it can be shown that, away from the critical point (see footnote 4), consideration of the extensivity of $\langle M \rangle$ and of the magnetic susceptibility $\chi = \partial \langle M \rangle / \partial H$ leads to the conclusion that the fluctuations in $M$ grow only as $\sqrt{N}$, so the fluctuations in $m = M/L^d$ about its maximum $m^*$ disappear as $L \to \infty$. Thus $\langle M \rangle \simeq M^*$, where $M^*$ is the most probable value of $M$. It will be noted that there is an obvious similarity in the behaviour of the fluctuations in $E$ and $M$: in each case the fluctuation is related to the response function $\chi_Y = \partial \langle Y \rangle / \partial y$, where $Y$ is $E$ or $M$ and $y$ is the field that couples to it, either $T$ or $H$. Such behaviour is in fact extremely general; it is described by the fluctuation-dissipation theorem ( chapter ), which relates the fluctuation in a quantity $Y$ to the response function $\chi_Y$.

The density of magnetisation states is now

$$\Omega(M) = \sum_{\{\sigma\}} \delta(M - M(\sigma)).$$

This time all microstates within an $M$-macrostate are not equally probable, because they may have different configurational energies. Therefore the appropriate free energy functional $F(M)$ is now defined by

$$\sum_{\{\sigma\}} \delta(M - M(\sigma)) \exp(-\beta E(\sigma)) = \exp(-F(M)),$$

so that the probability of an $M$-macrostate becomes

$$P^{\mathrm{can}}(M) = Z^{-1} \sum_{\{\sigma\}} \delta(M - M(\sigma)) \exp[-\beta(E(\sigma) - HM)] = Z^{-1} \exp[-(F(M) - \beta H M)],$$

and we can write

$$\langle M \rangle = Z^{-1} \sum_M M \exp[-(F(M) - \beta H M)],$$

$$Z = \sum_M \exp[-(F(M) - \beta H M)],$$

and

$$G(H) = -\ln \sum_M \exp[-(F(M) - \beta H M)].$$

4. At the critical point, however, we have already said that $\chi$ diverges. This implies that the fluctuations in $M$ become large, and this is indeed the case: they are large both in absolute size ($\sigma_M$ is almost extensive) and in spatial extent, extending over the entire system. In this case $\langle M \rangle \neq M^*$. See appendix A for a further discussion.
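The definitions of $F(M)$, $P^{\mathrm{can}}(M)$ and $G(H)$ can be checked directly on a system small enough to enumerate. The sketch below is illustrative only: it assumes a 3 x 3 periodic Ising lattice with J = 1, and all function names are hypothetical. It builds $F(M)$ from the restricted sum over microstates and then obtains $G(H)$ and $P^{\mathrm{can}}(M)$ from $F(M)$ alone.

```python
import itertools
import math
from collections import defaultdict

L, J = 3, 1.0   # illustrative 3 x 3 periodic lattice with coupling J = 1 (assumed)


def energy(spins):
    """Nearest-neighbour Ising configurational energy (field term excluded)."""
    e = 0.0
    for x in range(L):
        for y in range(L):
            s = spins[x * L + y]
            e -= J * s * (spins[((x + 1) % L) * L + y] + spins[x * L + (y + 1) % L])
    return e


def free_energy_functional(beta):
    """F(M), from exp(-F(M)) = sum of exp(-beta*E) over microstates with magnetisation M."""
    weight = defaultdict(float)
    for s in itertools.product((-1, 1), repeat=L * L):
        weight[sum(s)] += math.exp(-beta * energy(s))
    return {M: -math.log(w) for M, w in weight.items()}


def gibbs(beta, H, F):
    """G(H) = -ln sum_M exp(-(F(M) - beta*H*M))."""
    return -math.log(sum(math.exp(-(f - beta * H * M)) for M, f in F.items()))


def p_can(beta, H, F):
    """P(M) = exp(-(F(M) - beta*H*M)) / Z, using Z = exp(-G)."""
    G = gibbs(beta, H, F)
    return {M: math.exp(-(f - beta * H * M) + G) for M, f in F.items()}


for beta in (0.1, 1.0):
    F = free_energy_functional(beta)
    M_min = min(F, key=F.get)
    print(f"beta = {beta}: F(M) is lowest at |M| = {abs(M_min)}")
```

On this tiny lattice the minimum of $F(M)$ moves from the smallest available $|M|$ at high temperature to the fully aligned states at low temperature, which is the qualitative change of shape discussed in the text that follows.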

Now let us extend the discussion from the single phase to phase transitions. Taking the d Ising model as a paradigm, we shall now use equation along with the results for the behaviour of $P^{\mathrm{can}}(M)$ as a function of system size, and physical arguments about the nature of the favoured configurations at various temperatures, to explain the appearance of the dramatic jumps in canonical averages that we know are characteristic of first-order phase transitions.

We now write $F_L(M)$ instead of $F(M)$ to make clear its dependence on the finite size of the system. Consider, therefore, the shapes of $F_L(M)$ for some finite $L$ above and below $\beta_c$. For the d Ising model these can be shown ( chapters , chapter ) to be as illustrated schematically in figure .

Diagram 1 describes the situation at high temperatures. In this regime the probability of a macrostate is dominated by its multiplicity, and the effect of the average energy of the configurations is small. Thus the favoured macrostates are those around $M = 0$, which correspond to spins being chosen with random orientation. Thus $P^{\mathrm{can}}_L(M)$ has a single maximum and $F_L(M)$ has a single minimum at $M = 0$, describing the single phase that exists there. The limiting bulk free energy density $f_b(m) = \lim_{L \to \infty} L^{-d} F_L(mL^d)$ is approached quite quickly as $L$ increases, and thus, when viewed on the right length scale, looks very similar to $F_L(M)$, as shown in diagram 1 of figures and .

Figure: Schematic diagram of the free energy functional $F_L(M)$ for (1) $\beta < \beta_c$ and (2) $\beta > \beta_c$.

Figure: Free energy density $f(m)$ in the limit $N \to \infty$ for (1) $\beta < \beta_c$ and (2) $\beta > \beta_c$.

The behaviour at low temperature is different, as shown in the second diagram of figure . $F_L(M)$ now has two minima at $\pm M^*$, which will describe the two phases of the system, and correspondingly $f_L(m)$ has two minima at $\pm m^*$. These come from the dominance of energy over entropy at low temperatures: low energy configurations are favoured even though their multiplicity is low, and these configurations tend to have almost all their spins aligned either up or down, that is to say, they have finite magnetisation. The symmetry of $F_L(M)$ follows from the symmetry between positive and negative magnetisation here. Correspondingly, $P^{\mathrm{can}}_L(M)$ must be symmetric with two maxima.

However, the scaling this time is

$$F_L(M) \sim \begin{cases} L^{d} & \text{for } M \approx -M^* \text{ and } M \approx +M^* \\ L^{d-1} & \text{for } -M^* < M < +M^* \end{cases}$$

The first line here is normal scaling behaviour: the microstates that dominate $F_L(M)$ have a fairly uniform magnetisation. The second line should be read as the scaling of the excess of $F_L(M)$ over its value at the minima. We shall digress for a while to explain the scaling in the region between the modes of $P^{\mathrm{can}}(M)$.

The Interfacial Region

The $L^{d-1}$ scaling in the region between the two minima occurs because, although there are of course very many microstates where the magnetisation is roughly uniform throughout the configuration (these are the typical microstates at high temperature), these have relatively high energy and are strongly suppressed at low temperature. The best way the system can achieve low energy when $-M^* < M < +M^*$ is to go into mixed states (states of phase coexistence), where configurations consist of regions of magnetisation density $+m^*$ and $-m^*$ separated by interfaces; these are thus also known as interfacial states. The overwhelming low-temperature contribution to $F_L(M)$ comes from such microstates. These regions have a free energy density $f_b(m^*)$ which is typical of the bulk states, while the interfaces introduce an interfacial free energy $f_s$. ($f_s$ is a free energy because there is an internal energy necessary to create a particular interfacial area, and an entropic part related to the number of ways that such an interface can be positioned in the system in order that the phases of different magnetisations are present in the right proportions to produce the overall magnetisation $M$.)

For a finite-sized system in the interface region one can therefore write

$$F_L(M) = L^{d} f_b + c(M) L^{d-1} f_s.$$

The second term is the effect of the interface. Its area in $d$ space dimensions is $\sim L^{d-1}$, while $c(M)$ is a geometrical factor determined by the shape of the interface. As well as $M$, $c(M)$ also depends on the boundary conditions of the system. It is given by the Wulff construction.

The existence of the interface has interesting effects on the behaviour of the system. Its contribution to the total free energy goes to infinity with $L$, but more slowly than that of the bulk free energy. Therefore its fractional contribution to the total free energy is $cL^{d-1}f_s / (L^{d} f_b) = cf_s/(Lf_b) \to 0$ as $L \to \infty$, and so in the thermodynamic limit it has no effect on bulk properties, nor on the location of phase transitions. This is reflected in diagram 2 of figure , which looks qualitatively different from the corresponding diagram of figure . We see that the limiting $f_b(m)$ as $L \to \infty$ is constant between $-m^*$ and $+m^*$, lacking the central maximum of $F_L(M)$, which disappears as $L \to \infty$. Thus the limiting $f_b(m)$ is convex, as required by thermodynamical arguments (see, for example, chapter ). However, we would not be right to conclude that the interface has no effect at all on the system in the thermodynamic limit. At coexistence

$$\frac{P^{\mathrm{can}}(M_{\mathrm{interf}})}{P^{\mathrm{can}}(M^*)} \sim \exp(-cL^{d-1}f_s) \to 0 \quad \text{as } L \to \infty,$$

so that the interfacial macrostates are exponentially suppressed compared to the pure states. Therefore, in order to see the coexistence of phases in the Ising system, it is necessary to constrain the total system at constant $M$, so that the system is forced into one of the mixed states. If we had looked only at $f(m)$ to try to predict this behaviour, that is, if we had taken only the leading term $L^{d} f(m)$ in the expansion of $F_L(M)$ in $L$, we would have concluded, wrongly, that $P^{\mathrm{can}}(M_{\mathrm{interf}}) \approx P^{\mathrm{can}}(M^*)$, implying that even in the thermodynamic limit a system with an unconstrained magnetisation should be able to pass freely between $-m^*$ and $+m^*$.
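The two competing statements above (the interface is negligible in the free energy density, yet decisive for the probability of the mixed states) can be seen numerically. The following is a minimal sketch under stated assumptions: the values of $c$, $f_s$ and $f_b$ are hypothetical placeholders chosen only for illustration, and the function names are not from the thesis.

```python
import math


def interface_fraction(L, c=2.0, f_s=0.5, f_b=1.0):
    """Fractional interfacial contribution c*L**(d-1)*f_s / (L**d * f_b) = c*f_s / (L*f_b)."""
    return c * f_s / (L * f_b)


def interface_suppression(L, d=2, c=2.0, f_s=0.5):
    """Relative weight exp(-c * L**(d-1) * f_s) of interfacial versus pure macrostates."""
    return math.exp(-c * L ** (d - 1) * f_s)


for L in (4, 8, 16, 32):
    print(f"L = {L:2d}: fraction of free energy = {interface_fraction(L):.4f}, "
          f"relative probability = {interface_suppression(L):.3e}")
```

The fractional contribution decays only as $1/L$, whereas the relative probability of the interfacial macrostates decays exponentially in $L^{d-1}$, so the suppression is already severe at modest system sizes.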

The presence of this large interfacial region of macrostates, whose probability goes to zero in the large system limit but whose free energy density approaches that of the pure phases, provides a large measure of explanation for certain non-equilibrium properties of statistical mechanical systems, in particular metastability. For example, we now see that to change the sign of $M$ at the phase transition, given that the spins must be flipped piecemeal, requires the creation of an interface between regions of +ve and -ve magnetisations; that is, it implies the necessity of passing through the unlikely interfacial region. Similar considerations affect all first-order phase transitions, and are a bugbear of computer simulations of them (see sections and ). Of course this explanation is not the whole story, because it takes no account of the dynamical mechanism by which the metastable state eventually decays, which is known to be via the nucleation of a droplet of some critical size ( chapter ), which then grows. We would not necessarily expect the order parameter $M$ of the whole system to be a suitable quantity for studying this, although some very recent work has in fact suggested that a surprisingly good description of the relaxation of metastable states may be obtained from consideration of $M$ alone.

We now return to the main thread of the argument. We shall now write an expression for the probability of a phase, which is the sum of $P^{\mathrm{can}}(M)$ over those values of $M$ characteristic of each phase, which we shall take as being $M > 0$ and $M < 0$ here. In fact, of course, because of the shape of $P^{\mathrm{can}}(M)$, only those $M$ values within $O(L^{d/2})$ of $\pm M^*$ are really characteristic of the phase, but equally very little error (and an error which is increasingly small as $L$ increases) is made by including all the states with +ve magnetisation in one phase and all with -ve magnetisation in the other. Therefore, labelling the phases with A and B,

$$P^{\mathrm{can}}_A(H) = \sum_{M \in A} P^{\mathrm{can}}(M, H) = Z_A(H)/Z(H),$$

where $Z_A$ is defined as

$$Z_A(H) = \sum_{M \in A} \exp[-(F(M) - \beta H M)]$$

(and similarly for B; we retain the $HM$ term for generality, though for the Ising model the transition occurs at $H_{\mathrm{coex}} = 0$ because of the symmetry of $F(M)$). The relative probabilities of the two phases are

$$P^{\mathrm{can}}_A / P^{\mathrm{can}}_B = Z_A / Z_B,$$

and we can define restricted Gibbs free energies on a particular phase alone,

$$G_A(H) = -\ln Z_A(H),$$

so that

$$P^{\mathrm{can}}_A / P^{\mathrm{can}}_B = \exp[-(G_A(H) - G_B(H))].$$

At phase coexistence $P^{\mathrm{can}}_A(H_{\mathrm{coex}}) = P^{\mathrm{can}}_B(H_{\mathrm{coex}})$, so

$$Z_A(H_{\mathrm{coex}}) = Z_B(H_{\mathrm{coex}})$$

and

$$G_A(H_{\mathrm{coex}}) = G_B(H_{\mathrm{coex}}),$$

demonstrating the fundamental result that the condition for a phase transition is that the specific Gibbs functions of the two phases should be equal. However, the shape of $P^{\mathrm{can}}(M)$ means that the quantities $G_A$ and $G_B$ can be related to the basic free energies $G(H)$ and $F(M)$:

$$Z_A(H_{\mathrm{coex}}) = Z_B(H_{\mathrm{coex}}) \approx Z(H_{\mathrm{coex}})/2,$$

so

$$G_A(H_{\mathrm{coex}}) = G_B(H_{\mathrm{coex}}) \approx G(H_{\mathrm{coex}}) + \ln 2,$$

that is,

$$g_A(H_{\mathrm{coex}}) = g_B(H_{\mathrm{coex}}) = g(H_{\mathrm{coex}}) + O(1/N),$$

so, to within a correction which vanishes in the large $N$ limit, $g_A$ and $g_B$ are also equal to the specific Gibbs function of the system considered as a whole.

The narrowness of the peaks of $P^{\mathrm{can}}(M)$ is also responsible for the increasingly abrupt change in $\langle M \rangle$ at the transition. Consider the case when the specific Gibbs functions of the two phases are not quite equal because of a slight change $\delta H$ in the applied field from its equilibrium value. Since only states very near to $\pm M^*$ contribute, we have (taking A to be the phase of positive magnetisation)

$$\frac{P^{\mathrm{can}}_A}{P^{\mathrm{can}}_B} \simeq \exp(2\beta\, \delta H\, M^*).$$

If $\beta > \beta_c$, $M^* \neq 0$ and $M^* = m^* N \sim N$. Therefore any macroscopic $\delta H$ will cause $P^{\mathrm{can}}_A / P^{\mathrm{can}}_B \to 0$ or $\infty$, depending on its sign; or, put differently, we can make $P^{\mathrm{can}}_A / P^{\mathrm{can}}_B = \epsilon$ by applying

$$\delta H = \frac{\ln \epsilon}{2 \beta m^* N}.$$

Therefore we can see the jump in the order parameter emerge from the exponential dependence of the probability of a phase on the size of the system (though of course, if $\delta H$ is very small there may be metastability problems). If $\beta < \beta_c$ then $m^* = 0$ and the response to $\delta H$ is smooth.
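The sharpening of the transition with system size can be illustrated with a few lines of arithmetic. The sketch below assumes the two-state estimate that the ratio of phase probabilities is $\exp(2\beta\,\delta H\, m^* N)$, with phase A taken to be the phase of positive magnetisation; the symbol names and the parameter values ($\beta = 0.5$, $m^* = 0.9$) are hypothetical choices made only for this example.

```python
import math


def phase_ratio(delta_h, beta, m_star, n):
    """Estimate P_A / P_B = exp(2 * beta * delta_h * M*), with M* = m_star * n."""
    return math.exp(2.0 * beta * delta_h * m_star * n)


def field_for_ratio(eps, beta, m_star, n):
    """Field offset delta_h that produces P_A / P_B = eps."""
    return math.log(eps) / (2.0 * beta * m_star * n)


beta, m_star = 0.5, 0.9   # illustrative values, not taken from the thesis
for n in (100, 1000, 10000):
    ratio = phase_ratio(0.01, beta, m_star, n)
    dh = field_for_ratio(100.0, beta, m_star, n)
    print(f"N = {n:5d}: P_A/P_B at delta_h = 0.01 is {ratio:.3e}; "
          f"delta_h for a 100:1 ratio is {dh:.2e}")
```

The field needed to favour one phase by any fixed factor shrinks as $1/N$, which is the sense in which the jump in the order parameter becomes discontinuous in the thermodynamic limit.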

The fact that $P^{\mathrm{can}}(M)$ is so sharp about its maxima also enables us to write the equilibrium condition in another way. If, under the action of field $H$, the macrostate of maximum probability in phase A is $M^*(H)$, with probability $P^{\mathrm{can}}(M^*)$, then $P^{\mathrm{can}}_A \sim \sqrt{N}\, P^{\mathrm{can}}(M^*)$. Thus, even if we constrain $M = M^*$, we can define

$$\begin{aligned}
g^*_A(H) &\equiv N^{-1} [F(M^*) - \beta H M^*] \\
&= -N^{-1} \ln \exp[-(F(M^*) - \beta H M^*)] \\
&= -N^{-1} \ln \Big[ (c\sqrt{N})^{-1} \sum_{M \in A} \exp[-(F(M) - \beta H M)] \Big] \\
&= g_A(H) + O\Big(\frac{\ln N}{N}\Big),
\end{aligned}$$

where $c$ in the 3rd line is a constant of order unity. Thus in the large $N$ limit

$$L^{-d} [F(M^*(H)) - \beta H M^*(H)] \to g_A(H).$$

So even a knowledge of $F(M^*(H))$ determines the Gibbs free energy of the entire phase, if we also know $M^*(H)$. A practical method that uses this to determine the phase coexistence field $H_{\mathrm{coex}}$, where this is unknown, is the double-tangent construction. This is described, in the context of an off-lattice system, in appendix B.
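The size of the correction incurred by keeping only the most probable macrostate can be checked exactly in a toy model. The sketch below uses non-interacting spins in a field, an assumption made purely because both sides are then known in closed form: $F(M) = -\ln \binom{N}{(N+M)/2}$ and $Z = (2\cosh \beta H)^N$. This model has a single phase, so the set A is all of $M$; the function names are hypothetical.

```python
import math


def log_binomial(n, k):
    """ln C(n, k) via log-gamma, safe for large n."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def exact_g(beta_h):
    """g = -(1/N) ln Z for free spins, where Z = (2 cosh(beta*H))**N."""
    return -math.log(2.0 * math.cosh(beta_h))


def max_term_g(n, beta_h):
    """min over M of [F(M) - beta*H*M] / N, with M = 2k - n for k up spins."""
    best = min(-log_binomial(n, k) - beta_h * (2 * k - n) for k in range(n + 1))
    return best / n


beta_h = 0.3   # illustrative value of beta*H
for n in (100, 1000, 10000):
    diff = max_term_g(n, beta_h) - exact_g(beta_h)
    print(f"N = {n:5d}: max-term g minus exact g = {diff:.3e}, "
          f"ln(N)/N = {math.log(n) / n:.3e}")
```

The difference is positive, because a single term cannot exceed the full sum, and it shrinks in step with $\ln N / N$, in line with the estimate in the text.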

As the temperature increases, the dominance of the energy over the entropy becomes weaker and $m^*$ becomes smaller. Finally, at $\beta_c$, the first-order phase transition disappears. The scaling of $F_L(M)$ at $\beta_c$ is somewhat unusual and is discussed in appendix A. It has the property that $P^{\mathrm{can}}(0) / P^{\mathrm{can}}(M^*) = \text{constant}$, independent of $L$, but the two-peak structure remains, leading to large (almost extensive in the system size) fluctuations in the order parameter $M$.


Off-Lattice Systems

Let us now extend the discussion of phase behaviour from the Ising model to the more familiar melting and boiling transitions. Once again we model the material under consideration as a system of particles with some position-dependent potential energy acting between them, though the potential will usually be appreciably more complicated than in the Ising case. A clear difference is that this time the position coordinates may take a continuous range of values: such systems are known as off-lattice systems, as opposed to lattice-based systems like the Ising model. The total number of microstates is therefore not denumerable. However, the same analysis carries over in slightly modified form: $P^{\mathrm{can}}(\sigma)$ and $P^{\mathrm{can}}(E)$ etc. become probability densities, and we write the partition function (here at constant volume) as a configurational integral:

$$Z(V) = \int_V \exp(-\beta E(\sigma))\, d\sigma,$$

where $E(\sigma)$ is the configurational (particle-particle) energy and the shorthand $\int d\sigma$ refers to the $Nd$-dimensional integral over the coordinates of all $N$ particles in a $d$-dimensional space. The canonical averages are also defined as integrals in the obvious way.

To examine the phase behaviour it proves best to analyse this model in an ensemble with $V$ allowed to vary, controlled by an external pressure field $p$, since this corresponds to real phase coexistence, where the pressure is the same in the two phases. Therefore the total energy of a configuration is $E_{\mathrm{TOT}} = E(\sigma) + pV$, where $pV$ is the work that the system has to do against the external pressure $p$ to reach volume $V$. It then follows that the p.d.f. of $\sigma$ and $V$ is

$$P^{\mathrm{can}}(\sigma, V) = \frac{\exp(-\beta p V) \exp(-\beta E(\sigma))}{Z_p},$$

with the partition function

$$Z_p = \int dV \exp(-\beta p V) \int_V d\sigma \exp(-\beta E(\sigma)) = \int \exp(-\beta p V)\, Z(V)\, dV,$$

where $Z(V)$ is the constant-$V$ partition function defined in equation . The logarithm of $Z_p$ is related to a Gibbs free energy,

$$G(p) = -\ln Z_p,$$

and, as before, we define a related intensive free energy density $g(p) = G(p)/N$ and a free energy functional

$$F(V) = -\ln Z(V),$$

so that

$$\exp(-G(p)) = \sum_V \exp[-(\beta p V + F(V))],$$

and the free energy of a phase is

$$\exp(-G_A(p)) = \sum_{V \in A} \exp[-(\beta p V + F(V))],$$

where A defines the set of volumes characteristic of a phase, and

$$P^{\mathrm{can}}(V) = \frac{\exp(-\beta p V) \exp(-F(V))}{Z_p}.$$

We remark immediately that, while $F(V)$ will often have a double-well structure, it will not in general possess the symmetry of the Ising model's $F(M)$, which will lead to phase transitions occurring at $p \neq 0$, whereas all the Ising model's occur at $H = 0$.

To expand on this, let us begin by considering a typical solid-liquid-vapour phase diagram of a continuous system, as shown in figure , and comment on it using the formalism we set up in the previous section, comparing it with the Ising model's phase diagram (figure ). By examination of the energy functions we identify the external field energy terms $-HM$ (Ising) and $pV$ (continuous), and pair off the corresponding intensive variables, $H$ (Ising) with $p$ (continuous), and the extensive variables, $M$ (Ising) with $V$ (continuous).

The analogy with the Ising model is clearest in the liquid-vapour part of the continuous system's phase diagram, so let us ignore the solid phase for now. The $p$-$\beta$ and $H$-$\beta$ diagrams both show a coexistence line which ends in a critical point: at high $\beta$ there are two phases (liquid and vapour, or the $M > 0$ and $M < 0$ phases), while at $\beta < \beta_c$ there is only one. The $V$-$\beta$ and $M$-$\beta$ lines both show a U-shaped coexistence region with its axis roughly along the temperature axis (the convention of drawing the temperature axis vertical for the continuous system and horizontal for the magnet obscures the similarity a little). Nevertheless there are differences in detail: the magnetic system has a clear symmetry about $H = 0$ in its phase diagrams that the continuous system lacks. As we have said, this is a result of the fact that there is a complete correspondence of microstates between the two phases of the magnet: for each microstate with positive magnetisation there exists another, its photographic negative, with -ve magnetisation. In the fluid there is no such symmetry between the liquid and vapour phases.

Figure: Schematic $p$-$\beta$ and $\beta$-$V$ phase diagrams, showing the solid (s), liquid (l), vapour (v) and gas (g) regions and the coexistence regions s+l, l+v, s+v and s+g. CP is the critical point, TrP the triple point and TrL the triple line.

The statistical mechanical description is also very similar. Once again the requirement that the response function $\partial \langle V \rangle / \partial p$ should be extensive for each phase, while still being related to the fluctuations of the order parameter, leads to the result that $P^{\mathrm{can}}_L(p, V)$ becomes sharply peaked about its mode or modes (very near coexistence), with a width such that $\delta V / \langle V \rangle \sim 1/\sqrt{N}$.

The free energy functional $F(V)$ for the continuous system is qualitatively similar in shape to $F(M)$ for the magnet. At high temperatures entropy dominates and there is only one fluid state, a gas, for which the energy is high because the particles are widely separated, but so is the entropy, because each particle can explore the whole volume of the system. At low temperatures the energy term becomes more important and $F_L(V)$ develops two areas which are locally convex: one centred on states with high volume (and thus high entropy but also high energy), which form the low-density vapour phase, and the other centred on states with low volume (and thus low entropy but also low energy), which form the high-density liquid phase. Nevertheless, the vapour-like states still have the lower $F$ and thus, at $p = 0$, enormously higher weight. However, the convexity of $F_L(V)$ means that by imposing a suitable finite $p_{\mathrm{coex}}$ it is possible to produce a $P^{\mathrm{can}}_L(V)$ that has two separated maxima such that

$$\sum_{V \in A} P^{\mathrm{can}}_L(p_{\mathrm{coex}}, V) = \sum_{V \in B} P^{\mathrm{can}}_L(p_{\mathrm{coex}}, V),$$

clearly implying

$$G_A(p_{\mathrm{coex}}) = G_B(p_{\mathrm{coex}}),$$

which is just the same as was derived for the Ising model. The narrowness of the modes means that analogues of equations and also apply. Equation can be used to estimate $p_{\mathrm{coex}}$ by the double-tangent construction even if only a part of $F(V)$ is available (see appendix B).
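The equal-weight condition $G_A(p_{\mathrm{coex}}) = G_B(p_{\mathrm{coex}})$ can be turned into a simple root-finding problem once $F(V)$ is known. The following is a minimal sketch, assuming a made-up asymmetric double-well $F(V)$ and a fixed dividing volume between the phases; it is not the double-tangent construction of appendix B, and every name and parameter value in it is hypothetical.

```python
import math

BETA = 1.0    # inverse temperature (assumed units)
N = 50        # number of particles in the toy model (hypothetical)
TILT = 0.3    # asymmetry of the toy F(V) (hypothetical)
# Volumes per particle on a grid symmetric about v = 2:
# v < 2 is treated as "liquid" (phase A), v > 2 as "vapour" (phase B).
GRID = [4.0 * i / 399 for i in range(400)]


def toy_F(v):
    """Hypothetical double well F(V) = N * [(v-1)^2 (v-3)^2 - TILT*v], with V = N*v."""
    return N * ((v - 1.0) ** 2 * (v - 3.0) ** 2 - TILT * v)


def log_weight(p, volumes):
    """ln sum_V exp[-(beta*p*V + F(V))] over the given volumes (log-sum-exp)."""
    terms = [-(BETA * p * N * v + toy_F(v)) for v in volumes]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def log_ratio(p):
    """ln(P_A / P_B) = G_B(p) - G_A(p) for the liquid (A) and vapour (B) phases."""
    liquid = [v for v in GRID if v < 2.0]
    vapour = [v for v in GRID if v > 2.0]
    return log_weight(p, liquid) - log_weight(p, vapour)


def coexistence_pressure(lo=0.0, hi=1.0, tol=1e-10):
    """Bisect for the pressure at which the two phases have equal weight."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_ratio(mid) < 0.0:
            lo = mid    # vapour still favoured: the pressure must be raised
        else:
            hi = mid
    return 0.5 * (lo + hi)


print("equal-weight coexistence pressure:", coexistence_pressure())
```

Bisection is adequate here because the log-ratio of the phase weights increases monotonically with $p$: raising the pressure always favours the phase of smaller volume. For this particular toy $F(V)$ the pressure term exactly cancels the tilt when $\beta p$ equals the tilt parameter, which gives an independent check on the result.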

We digress briefly to mention that there has been some controversy as to where the analogue of the transition should be regarded as occurring in a finite-size system: whether it should be when the two peaks of $P^{\mathrm{can}}_L(V)$ have equal weights or when they have equal heights. The controversy arises from the asymmetry between liquid and vapour, which causes the two peaks of $P^{\mathrm{can}}_L(V)$ to have different shapes; indeed, because of this asymmetry the phase transition in this type of system is known as an asymmetric first-order phase transition. Both criteria are compatible with the expression just given for the transition point in the infinite volume limit: the values of the field variables required to produce both equal heights and equal weights approach the same limits. However, the difference is important when trying to identify the transition point of a finite system in a computer simulation, and both methods have been used: the authors of favoured the equal-weight criterion, while those of used equal height. The problem is discussed in and . Recently it has been established that, for lattice-based systems, using equal weights will give estimators of the control parameters at coexistence that have smaller discrepancies from the infinite-volume limit. However, it is not clear that the analysis applies to off-lattice systems.

As before, the region of low probability between the peaks is dominated by states exhibiting phase coexistence. As for the Ising model, these states are unstable unless the order parameter is constrained. Equation and the appropriate analogue of equation also apply.

Let us now consider the solid-fluid phase transitions, which are a new feature of the continuous system's phase behaviour. In the solid phase the energy is near its minimum, since the highly ordered structure maximises the number of close neighbours that a particle can have, but the entropy is also low, because each particle is held in the cage formed by its neighbours and can move only in a small volume around its lattice site. As can be seen from the $p$-$T$ phase diagram, if we begin from low temperature and pressure then at first there is a solid-vapour coexistence curve, which has a junction with the liquid-vapour coexistence curve at the triple point (marked TrP in figure ), where solid, liquid and vapour can all coexist, i.e. all have the same specific Gibbs free energy. After this a solid-liquid coexistence curve continues upward. It is generally called the solid-gas coexistence curve once the pressure is higher than $p_c$, since after this, heating a dense fluid at constant $p$ will not produce another phase transition to a less dense fluid.

An obvious qualitative difference between the solid-fluid transitions and the liquid-vapour transition is that the solid-fluid transitions do not end in critical points; at least, no experimental evidence for such behaviour has ever been found. This seems to be because the solid and fluid have qualitatively different structures: in the solid there is clear long-range crystalline order and particles are localised on their lattice sites, while in the liquid and vapour there is no such order and particles may wander throughout the volume of the system. It is hard to envisage a mechanism which could allow one type of structure to merge continuously into the other, as would have to happen at a critical point. It is interesting to note that the liquid and vapour do have qualitatively identical structures in this sense, even though the differences in bulk properties such as density between the phases may be very large, far larger than the difference in the same property between the solid and the liquid.

Though the absence of a critical point means that the order parameter description of the transition does not carry over in all its details from the fluid case, the usefulness of the concept means that one is often defined. Volume (or density) of the system will once again serve, since the solid and fluid always have different densities. Analysed like this, the solid-fluid phase transition looks very much like the liquid-vapour one: the probability $P^{\mathrm{can}}_L(p, V)$ at a suitable pressure again has an asymmetric double-peak structure, the transition occurs when the specific Gibbs function $g$ is the same for both solid and fluid, and the region between the maxima is dominated by mixed phases. There are also other possible order parameters that are more closely linked with the obvious difference in structure of the two phases, related to the average number of nearest neighbours or to the structure factor of the system.


Calculation in Statistical Mechanical Problems

All exact science is dominated by the idea of approximation.

BERTRAND RUSSELL

The formalism of statistical mechanics that we have described provides, in principle, for the solution of the problem we stated in section : that of finding which phase is stable at particular values of the control parameters and, in particular, of finding where any phase transitions occur. We need a model for the potential, of course, but having this we have expressions for the partition function, canonical averages and the weights of phases in terms of this potential, and these tell us which phase is favoured at particular values of the control parameters. Knowing this, we can in principle find those values of the control parameters where both phases have equal weights, and so can construct the phase diagram. However, evaluation of these expressions, consisting as they do of sums over all the configurations of the system, is in practice extremely difficult and can only be carried out for a few particularly simple forms of $E(\sigma)$, whether we are dealing with a continuous or a discrete space of configurations. The subject of this thesis is, of course, the use of computational methods in the evaluation of the necessary expressions, and we shall give a basic overview of computer simulation in sections and as an introduction to chapter , where we review in depth the various ways that the phase coexistence problem has been tackled computationally. But first let us look at analytic methods.

Analytic Methods

There do exist analytic methods for evaluating the partition function and canonical averages of statistical mechanics: there are some exact results and some approximate methods of general applicability. However, as we shall see, most of these approximate methods fail in the situation that is of interest to us, viz. the calculation of the location of phase boundaries.

Non-interacting systems, i.e. systems in which $Z$ can be expressed as a product of single-particle partition functions, are generally soluble with ease, but they are not of much physical interest. A number of exactly soluble systems which do contain particle-particle interactions have been found. The 2-dimensional Ising model in zero field is one example; various embroidered extensions of this can also be solved (see ). Some properties of the Potts model, a generalisation of the Ising model, can also be found analytically. The other exactly soluble models, for example the Vertex model, tend not to provide much further insight into real systems; for more details of their behaviour see and . The amount that is known about these systems varies depending on the model: for example, $g$ is known at all temperatures for the Ising model but is only known for the Potts models at their phase transition temperatures. All the exactly soluble models are lattice models having only one or two spatial dimensions, and they are solved by Transfer Matrix techniques, which are not applicable to other systems. The difficulty with exact solution is that we must calculate how many configurations there are with a particular energy density or magnetisation, but because the number of configurations is so vast and the potential describing the interaction couples together the degrees of freedom of the particles, this calculation quickly becomes a problem in combinatorics which is soluble analytically in only a few special cases.

Mean Field Theory is an attempt to deal with an interacting system by effectively reducing it to a non-interacting one. We do this by considering a single particle and treating its interactions with all the other particles in the system by averaging them out so that they form a continuous background field. The background field depends on bulk properties that we do not know, but we can solve the problem by enforcing self-consistency: we calculate the properties of the single particle that are produced by a particular background field and then demand that these properties would, if extended to all the other particles, be such that they would produce the desired background field. In chapter and chapter the d Ising model with $H = 0$ is treated in this way. The successes and deficiencies of the technique are immediately apparent. In one dimension it is qualitatively wrong, while in 2d and 3d it predicts qualitatively the right behaviour but its answers are wrong in detail (the critical temperature is too high). The reason for this lies in the fact that the effects of fluctuations and cooperative phenomena are ignored: the results of the technique are therefore best at low or high temperatures and worst at phase transitions, especially continuous ones. Moreover, there is no way of extending the theory systematically to obtain successively better results, although there are heuristic criteria applicable in particular circumstances. It is interesting that the performance of Mean Field theory gets better as the spatial dimensionality $d$ of the problem increases: in four or more dimensions it is essentially exact for the Ising model. This occurs because the number of nearest neighbours of any one particle in particular increases with $d$, and so the size of fluctuations as a fraction of the mean field diminishes.

5. All physically interesting potentials have at least pairwise interactions, $E = \sum_{ij} E_{ij}$, and may also contain three-body, four-body etc. terms.
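The self-consistency argument can be written as a fixed-point problem. The sketch below uses the standard mean-field equation for the Ising model, $m = \tanh[\beta(qJm + H)]$, where $q$ is the number of nearest neighbours; this particular form, the choice $q = 4$ (a square lattice) and the function name are assumptions made for illustration and are not taken from the text.

```python
import math


def mean_field_magnetisation(beta, q=4, J=1.0, H=0.0, m0=0.5, tol=1e-12, max_iter=100000):
    """Iterate m -> tanh(beta * (q*J*m + H)) until the value is self-consistent."""
    m = m0
    for _ in range(max_iter):
        m_new = math.tanh(beta * (q * J * m + H))
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m


# Mean-field theory predicts ordering for beta > 1/(q*J) = 0.25 when q = 4, J = 1.
beta_c_exact_2d = math.log(1.0 + math.sqrt(2.0)) / 2.0   # exact square-lattice value
for beta in (0.2, 0.3, 0.5):
    m = mean_field_magnetisation(beta)
    print(f"beta = {beta}: mean-field m = {m:.4f} "
          f"(exact 2d beta_c = {beta_c_exact_2d:.4f})")
```

At $\beta = 0.3$ the iteration converges to a non-zero magnetisation, although the exactly solved square-lattice model is still disordered there; this is the sense in which the mean-field critical temperature is too high.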

Series expansions ( chapter ) result from perturbation techniques: the partition function is expanded as a series in some convenient parameter about a state where it is known exactly. For example, we can expand about $T = 0$ in $T$ (low temperature series), or about $\beta = 0$ in $\beta$ (high temperature series), or about $\rho = 0$ in $\rho$. These can equally be regarded as expansions about the fully ordered states or the entirely random states. Series expansions for simple energy functions, like those of the Ising or Potts models, have with great effort been continued to upwards of terms. With careful analysis, and using techniques which were developed using insight obtained from other methods, they can give useful results even around phase transitions, but convergence tends to be slow and results are not better than those of computer simulation even for the longest series. This is unsurprising, because the expansion is about the temperatures/densities where a particular phase is at its most stable, which is inevitably distant from the phase transition point, where it is becoming unstable. The expansion parameter thus cannot be small and many terms will be required. For complex energy functions the expansion can only generally be carried to a few terms, and results are correspondingly poorer still.

The renormalisation group (RG) ( volume and chapter for an introductory reference) is a method of evaluating the partition function in stages, summing over some of the degrees of freedom at each stage and then defining a renormalised Hamiltonian on the remaining ones that, in some sense, looks like the one that existed at the previous stage. This defines a mapping which, with appropriate approximations, leads to results for the free energy and canonical averages (see ). The process is in principle exact, but quickly becomes extremely complicated to carry out. It was developed to deal with the critical region, where the self-similar physical structure of the system when viewed on different length scales is echoed by the similar structures of the sequence of renormalised Hamiltonians. We note at this point that analytic RG calculation can be supplemented by a powerful numerical technique called the Monte Carlo Renormalisation Group.

Fluid systems can often usefully be treated by methods which aim at producing an equation of state, the equation relating $p$, $\rho$ and $\beta$. One approach is approximate solution of the Ornstein-Zernike equation, which links the potential function to the radial distribution function $g(r)$, defined in section . From $g(r)$ an equation of state may be derived. For a general account see ; for a particular method of approximate solution, the Percus-Yevick closure, see . Results are often good for simple fluids like the hard-sphere fluid, but the best schemes are not systematically improved but are the result of heuristic combination of results obtained by using different approximations. Moreover, $g(r)$ is very different in the liquid and vapour phases, and so an approximation scheme that describes one phase well is not good for the others. The consequence is that no single model predicts phase transitions: they must be treated by using a different model for each phase and estimating the free energy by integrating the equation of state, assuming we have a reference free energy for the liquid state from some other source. Another approximation scheme is the method of cluster expansions, which relates the potential function to functions, known as f-functions, that describe the interactions amongst successively larger groups of particles. The equation of state can then be expanded in terms of integrals involving f-functions. However, once again the method becomes very complicated when taken beyond the lowest orders.

Monte-Carlo Simulation

Given that the partition function and canonical averages cannot be calculated analytically for the vast majority of systems of interest, can they be calculated, or at least estimated, by computer? It is indeed the case that they can, and there are at least two broad approaches to the problem. In both cases a model of a system of particles interacting with the potential of interest is set up in the computer. Clearly the number of particles $N$ that is found in macroscopic samples of a material O is out of the reach of simulation, but in fact it is frequently found that the macroscopic properties are observed, at least qualitatively and often quantitatively as well, for N O and sometimes even for N O or less. Moreover, techniques for the extrapolation of results from finite samples to the thermodynamic limit are now well developed (see and appendix A), even where the finite size of the computer simulation has its greatest effect, that is, near continuous phase transitions.

There are two broad approaches to computer simulation in condensed matter physics: molecular dynamics (MD) and Monte-Carlo (MC) simulation. In molecular dynamics simulation, the force produced on each particle by the potential is calculated and the particles are moved in accordance with the Newtonian equations of motion. However, doing this immediately moves us away from the attractive simplicity of the absence of kinetics in the statistical mechanical picture of many-body systems. This picture is the foundation of the Monte-Carlo method, which will be our concern throughout the thesis. An introductory reference to molecular dynamics, which we shall not discuss any more, is ; more detail can be found in . There are also various methods that attempt to combine the best features of the two, for example hybrid Monte-Carlo and constant-temperature molecular dynamics (the Hoover-Nosé thermostat).

The MC method is basically a technique for evaluating multidimensional integrals, in this case the integrals or sums that define the partition function and canonical averages. There are a great many general references to the method (see for example ). A specific reference to the problem of simulating fluid systems is , and phase transitions are discussed in (mainly for lattice models) and (mainly for off-lattice systems).

For a system with a discrete configuration space, one might at first imagine that one could evaluate the partition function directly by simply summing $\exp(-\beta E)$ over all the configurations of the system (we neglect field terms for now; they do not affect the argument). However, because the number of possible configurations increases exponentially with increasing system size, it passes out of the range of even the fastest computer quite soon after it passes out of the range of calculation by hand. The same problem prevents the use of normal numerical integration routines to find $\int \exp(-\beta E(\sigma))\,d\sigma$ in a system with a continuous configuration space. The dimensionality of the problem is $Nd$, where $N$ is the number of particles and $d$ is the dimensionality of space. A numerical integration routine would require the evaluation of the integrand $\exp(-\beta E)$ at a grid of points finely spaced enough for it to be smooth. However, if the grid is such that there are $m_p$ points along each coordinate axis, then the integrand must be evaluated $m_p^{Nd}$ times, which once again grows exponentially with $N$. For example, in d with particles and a grid with just points along each axis, evaluations would be required: this is already out of the range of computability, even though the choice of points along each axis would be far too small if the interparticle potential were realistic and the system were fairly dense. A finer grid would be required because the integrand would be zero almost all the time (whenever there was at least one overlap between the repulsive hard cores of the particles, corresponding to the strong repulsive forces produced by inner-shell electrons in real atoms and molecules), and for those few points on the grid where there were no overlaps the integrand would vary too sharply for a numerical integration routine to give an accurate picture of its behaviour.
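The growth of the grid size with the number of particles can be made concrete with a few lines of arithmetic. The sketch below is only an illustration: the function names are ours, and the numbers used in the note that follows are hypothetical values, not those of the example in the text.

```python
import math


def grid_evaluations(m_p: int, n_particles: int, d: int) -> int:
    """Number of integrand evaluations on a grid with m_p points per coordinate axis."""
    # One coordinate axis per particle per spatial dimension: N * d axes in total.
    return m_p ** (n_particles * d)


def log10_grid_evaluations(m_p: int, n_particles: int, d: int) -> float:
    """Base-10 logarithm of the same count, for cases where the count is astronomically large."""
    return n_particles * d * math.log10(m_p)
```

With the hypothetical choice of 10 grid points per axis and 100 particles in three dimensions, `log10_grid_evaluations(10, 100, 3)` is 300, that is, of order $10^{300}$ evaluations of the integrand.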

Obviously, then, we cannot exhaustively enumerate the state space, so to proceed we must take a sample of configurations and use them to estimate the quantities of interest. This sample, while it may well, with modern computing power, contain millions of configurations, will nevertheless include only a tiny fraction of the configurations that exist.

The simplest way to produce the sample would be to generate the configurations at random, so that all microstates are generated with equal probability. This is known as Simple Sampling. For example, we might make a configuration of particles by generating the x, y and z coordinates of each particle with a uniform probability on $[0, L]$, where $L$ is the length of the side of the box containing the particles, or we might make a configuration of Ising spins by choosing each spin randomly in the $+1$ or $-1$ orientation. Suppose we generate $N_c$ configurations. Then

\[ Z \stackrel{eb}{=} \tilde{Z} \equiv \frac{\Omega_{TOT}}{N_c} \sum_{i=1}^{N_c} \exp(-\beta E(\sigma_i)) \]

\[ \langle O \rangle \stackrel{eb}{=} \tilde{O} \equiv \frac{\sum_{i=1}^{N_c} O(\sigma_i) \exp(-\beta E(\sigma_i))}{\sum_{i=1}^{N_c} \exp(-\beta E(\sigma_i))} \]

where $\stackrel{eb}{=}$ means that the right hand side is an estimator of the left. It is indeed true that $\tilde{Z} \to Z$ and $\tilde{O} \to \langle O \rangle$ as $N_c$ becomes large. However, they are not good estimators for most cases, because $N_c$ must be extremely large, $O(\Omega_{TOT})$, before it is at all likely that $\tilde{Z} \approx Z$ and $\tilde{O} \approx \langle O \rangle$. The problem is the same one that was described above when talking about numerical integration: the vast majority of the states in the sample are likely to have a very high energy, because they contain at least one overlap, and so a very small value of $\exp(-\beta E)$. As we discussed in section , the expressions for $Z$ and the canonical averages are dominated by these high energy configurations only at very high temperatures. At more normal temperatures the configurations that are important have lower energy, but, because there are so few of them compared to the high energy ones, they are generated by the simple sampling technique only at extremely lengthy intervals, and intervals whose length increases exponentially with $L^d$.
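As an illustration of simple sampling, the following minimal sketch estimates the partition function of a small two-dimensional Ising lattice and compares it with exact enumeration. The lattice size, the coupling $J = 1$, the zero field and all function names are assumptions made for this example only; the system is deliberately tiny so that the exact sum is feasible.

```python
import itertools
import math
import random


def ising_energy(spins, L):
    """E = -sum over nearest-neighbour bonds of s_i s_j on a periodic L x L lattice (J = 1, H = 0)."""
    e = 0
    for x in range(L):
        for y in range(L):
            s = spins[x * L + y]
            e -= s * spins[((x + 1) % L) * L + y]   # bond to the neighbour in x
            e -= s * spins[x * L + (y + 1) % L]     # bond to the neighbour in y
    return e


def exact_Z(L, beta):
    """Partition function by exhaustive enumeration of all 2**(L*L) microstates."""
    return sum(math.exp(-beta * ising_energy(c, L))
               for c in itertools.product((-1, 1), repeat=L * L))


def simple_sampling_Z(configs, L, beta):
    """Simple-sampling estimator: (Omega_TOT / N_c) * sum_i exp(-beta E_i)."""
    omega_tot = 2 ** (L * L)
    return omega_tot / len(configs) * sum(math.exp(-beta * ising_energy(c, L))
                                          for c in configs)


def random_configs(n, L, rng):
    """n configurations in which every spin is chosen as +1 or -1 with equal probability."""
    return [tuple(rng.choice((-1, 1)) for _ in range(L * L)) for _ in range(n)]
```

For example, `simple_sampling_Z(random_configs(2000, 3, random.Random(0)), 3, 0.3)` can be compared with `exact_Z(3, 0.3)`. For a lattice this small the estimate is reasonable; the difficulty described above appears as $L$ and $\beta$ grow, when the configurations that dominate $Z$ become a vanishing fraction of the sample.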

We see that what is required is a biased sampling scheme that generates much more frequently those configurations that dominate $Z$ and the canonical averages. This is known as importance sampling. Suppose we have an algorithm that generates microstates such that state $\sigma_i$ appears with probability $P_i \equiv P(\sigma_i)$. Then we call the vector $\mathbf{P}$, with components $P_i$, $i = 1 \ldots \Omega_{TOT}$, the sampled distribution.

[Footnote] We shall employ a vector notation $\mathbf{P}$ for $\{P_i\}$ here and, later, especially in chapter , for the set of macrostate probabilities.


An obvious choice of sampled distribution that satisfies our requirements for most statistical mechanical problems (with the major exception of the phase coexistence problem) is the Boltzmann distribution itself; we call a simulation that samples this distribution a Boltzmann sampling simulation. However, it is not at first obvious how to do this, because we do not know the partition function $Z$. The problem is the same for any other choice of sampled distribution: we will in general have

\[ P(\sigma) = Y(\sigma) / Z \]

where the function $Y(\sigma)$ ($= \exp(-\beta E(\sigma))$ for Boltzmann sampling) is easily specified and calculated for any configuration, but its normalisation $Z = \sum_{\{\sigma\}} Y(\sigma)$ is unknown and hard to calculate, because it involves a sum over all configurations. The known function $Y$, to which the unknown probability distribution is proportional, is called the measure of the probability distribution, an expression coming from the statistics literature.

A solution which, notwithstanding its age, is still the basis of most MC work done in condensed matter physics is the Metropolis algorithm. The central idea of this method is to generate a sequence of microstates by evolving a new trial microstate $\sigma_j$ from the existing one $\sigma_i$ by making some, usually small, random change to it. Now we arrange, in a way described below, to move to this new state with a probability $P(\sigma_i \to \sigma_j) \equiv \pi_{ij}$. If the transition is accepted, $\sigma_i$ is replaced by $\sigma_j$, which is then itself used as the source of a new trial microstate. Otherwise we try again, generating another trial microstate from $\sigma_i$.

$\pi_{ij}$ is called a Stochastic Matrix, meaning that it has the property that $\sum_j \pi_{ij} = 1$. Its size is $\Omega_{TOT} \times \Omega_{TOT}$. Note that $\pi_{ij}$ depends only on the present state $\sigma_i$ and the trial state $\sigma_j$, not on any other state that the chain has passed through in getting to $\sigma_i$. This property is called the Markov property, and the sequence of configurations produced is called a Markov chain. $\pi_{ij}$ can be thought of as describing the time evolution of the probability distribution of the microstates. Suppose we have $P_i(t)$ at time $t$ (if we know exactly that we are in state $\sigma_k$ at this time, then $P_i(t) = \delta_{ik}$). Then at $t + 1$ we have

\[ P_j(t+1) = \sum_i P_i(t)\, \pi_{ij} \]

Of particular interest is the left eigenvector of $\pi_{ij}$ with eigenvalue 1, which we write simply as $\mathbf{P}$. It is the vector of equilibrium microstate probabilities:

\[ P_j = \sum_i P_i\, \pi_{ij} \]

$\mathbf{P}$ represents what is called the stationary probability distribution of the chain: the probability distribution of microstates that is invariant under the action of $\pi$, and so remains unchanged once it is reached. An extremely important property of a Markov chain is that it can be shown that the sampled distribution converges to $\mathbf{P}$ as the number of steps $n \to \infty$ for any choice of initial state (i.e. any initial $P_i$), as long as $\pi$ is such that the chain is ergodic, meaning that any state can be reached from any other in finite time. In many situations the convergence is rapid. This means that we can use the Markov chain with transition matrix $\pi$ as a way of producing microstates with probability distribution $\mathbf{P}$. We must perform equilibration, discarding the early configurations, which are unlikely to be typical of equilibrium, while the sampled distribution converges to $\mathbf{P}$; after that, configurations will indeed be drawn with probability $\mathbf{P}$.

[Footnote 6] It is not possible, except by observation, to say when equilibration has occurred; in practice this is often a problem where equilibration is slow, as occurs near phase transitions.

Now we must consider how to choose $\pi$ to produce the desired probability distribution $\mathbf{P}$. There are many possible solutions, because there are $\Omega_{TOT} \times \Omega_{TOT}$ components in $\pi$ and only $\Omega_{TOT}$ constraints coming from the components of $\mathbf{P}$. We normally choose to observe detailed balance, which means taking

\[ P_i\, \pi_{ij} = P_j\, \pi_{ji} \]

with, for $i \neq j$,

\[ \pi_{ij} = \begin{cases} R_{ij} & P_i \le P_j \\ R_{ij}\, P_j / P_i & \text{otherwise} \end{cases} \]

where $R_{ij}$ is an arbitrary symmetric matrix (its significance is discussed below). That this choice satisfies the detailed balance condition can be easily verified by substituting for $\pi_{ji}$. The power of the method comes from the fact that the components of $\mathbf{P}$, the equilibrium probabilities, enter only as ratios, so that we have overcome the problem of the unknown partition function: it simply cancels out and we need only the measure of the distribution, $P_j / P_i = Y_j / Y_i$. For example, to sample from the Boltzmann distribution we choose

\[ \pi_{ij} = \begin{cases} R_{ij} & E_j \le E_i \\ R_{ij} \exp(-\beta (E_j - E_i)) & \text{otherwise} \end{cases} \]
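The properties claimed for the Metropolis transition probabilities can be checked numerically on a state space small enough to write the whole matrix down. The sketch below is only an illustration under stated assumptions: the three energy levels, the uniform symmetric proposal (every other state proposed with equal probability) and all function names are ours. It builds the matrix, measures the violation of detailed balance, and evolves an initial distribution towards the stationary one.

```python
import math


def metropolis_matrix(energies, beta):
    """Row-stochastic Metropolis matrix for Boltzmann sampling with a uniform symmetric proposal."""
    n = len(energies)
    r = 1.0 / (n - 1)                      # proposal probability for each other state
    pi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dE = energies[j] - energies[i]
                pi[i][j] = r if dE <= 0 else r * math.exp(-beta * dE)
        pi[i][i] = 1.0 - sum(pi[i])        # rejected moves leave the chain where it is
    return pi


def boltzmann(energies, beta):
    """Normalised Boltzmann probabilities (feasible only because the state space is tiny)."""
    w = [math.exp(-beta * e) for e in energies]
    z = sum(w)
    return [x / z for x in w]


def evolve(p, pi):
    """One step of the chain: P_j(t+1) = sum_i P_i(t) pi_ij."""
    n = len(p)
    return [sum(p[i] * pi[i][j] for i in range(n)) for j in range(n)]


def max_detailed_balance_violation(p, pi):
    """Largest |P_i pi_ij - P_j pi_ji| over all pairs of states."""
    n = len(p)
    return max(abs(p[i] * pi[i][j] - p[j] * pi[j][i])
               for i in range(n) for j in range(n))
```

Starting from `p = [1.0, 0.0, 0.0]` and applying `evolve` repeatedly with the matrix for energies `[0.0, 1.0, 2.5]` at $\beta = 1$ brings `p` to the output of `boltzmann` within rounding error, although the partition function is never used in constructing the matrix.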

The matrix $R_{ij}$ describes which other microstates are accessible from a given microstate, so it is determined by the particular choice of algorithm for generating the trial move. Another way of putting this is to say that $R$ determines the pseudodynamics of the simulation (as we have seen, the physical dynamics of the system are not a part of a MC simulation). $R$ is normally extremely sparse, because we choose the new configuration by making only a small modification to the present one: for example, in the single-spin-flip Metropolis algorithm, microstates $i$ and $j$ differ only in the movement of a single particle or the flipping of a single spin. Most microstates are therefore inaccessible in one move. We are forced to make this choice because, if we allow a trial transition to any other configuration, we shall once again be sampling the states of highest entropy, so it is almost certain that the energy of the trial configuration will be very much higher than that of the starting one. Therefore the acceptance ratio of Monte-Carlo moves, $r_a$, becomes exponentially small. We discuss the effect that this has below and in section .

We implement the algorithm in practice in two stages: first we select a trial move from $i$ to $j$ by a process for which $R_{ij}$ is symmetric, such as a flip of a random spin or a random small displacement of a randomly chosen particle; then we evaluate the ratio $Y_j / Y_i$. If this ratio is greater than one we accept the move; if it is less than one we accept it with probability $Y_j / Y_i$. This is normally done by generating a pseudorandom variate $X$ ( chapter ) on the interval $[0, 1)$ and accepting the move if $X < Y_j / Y_i$.
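The two-stage procedure just described can be sketched for single-spin-flip Boltzmann sampling of a two-dimensional Ising model. This is a minimal illustration, not the implementation used for the results in this thesis: the periodic $L \times L$ lattice with $L \ge 3$, the coupling $J = 1$, zero field, the all-up starting configuration and the function names are all assumptions made for the example.

```python
import math
import random


def local_field(spins, L, x, y):
    """Sum of the four nearest-neighbour spins of site (x, y) with periodic boundaries."""
    return (spins[(x + 1) % L][y] + spins[(x - 1) % L][y]
            + spins[x][(y + 1) % L] + spins[x][(y - 1) % L])


def total_energy(spins, L):
    """E = -sum over bonds of s_i s_j, counting each bond once."""
    return -sum(spins[x][y] * (spins[(x + 1) % L][y] + spins[x][(y + 1) % L])
                for x in range(L) for y in range(L))


def metropolis_run(L, beta, n_steps, seed=0):
    """Generate n_steps trial updates; return final spins, energy after each trial, and acceptances."""
    rng = random.Random(seed)
    spins = [[1] * L for _ in range(L)]
    energy = total_energy(spins, L)
    accepted = 0
    energies = []
    for _ in range(n_steps):
        # Stage 1: symmetric trial move, the flip of one randomly chosen spin.
        x, y = rng.randrange(L), rng.randrange(L)
        dE = 2 * spins[x][y] * local_field(spins, L, x, y)
        # Stage 2: accept with probability min(1, Y_j / Y_i) = min(1, exp(-beta dE)).
        if dE <= 0 or rng.random() < math.exp(-beta * dE):
            spins[x][y] = -spins[x][y]
            energy += dE
            accepted += 1
        # A rejected move means the old microstate is counted again.
        energies.append(energy)
    return spins, energies, accepted
```

The ratio `accepted / n_steps` is the acceptance ratio $r_a$; at $\beta = 0$ every trial move is accepted, and $r_a$ falls as $\beta$ increases.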

Let us first consider a Boltzmann sampling simulation. Canonical averages are obtained very simply in this case:

\[ \langle O \rangle \stackrel{eb}{=} \bar{O} = \frac{1}{N_c} \sum_{i=1}^{N_c} O_i \]

$O_i$ is the result of applying the operator $O$ to the $i$th microstate of the $N_c$ generated (by this we mean that we generate $N_c$ trial updates of the Markov chain; not all of them will be accepted, so sometimes the same microstate will reappear several times in succession). In practice it is more efficient, particularly for a long chain, to store a histogram $\{C_j\}$ of the number of times the chain visits each of the $j = 1 \ldots N_m$ macrostates of every operator $O$ of interest. Then

\[ \langle O \rangle \stackrel{eb}{=} \bar{O} = \frac{1}{N_c} \sum_{j=1}^{N_m} C_j O_j \]
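That the histogram form gives the same estimate as the direct average over the chain can be seen in a short sketch. The function names and the sample values in the note are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter


def direct_average(samples):
    """Mean of the operator over the N_c stored values O_i."""
    return sum(samples) / len(samples)


def histogram_average(samples):
    """The same mean from a histogram C_j of visits to each macrostate value O_j."""
    counts = Counter(samples)                      # C_j for each distinct value O_j
    return sum(c * value for value, c in counts.items()) / len(samples)
```

For the hypothetical sequence `[-4, -4, 0, 4, -4, 0]` both functions return the same value, but the histogram needs storage proportional to the number of macrostates $N_m$ rather than to the length of the chain $N_c$.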

Because the MC method uses random numbers to generate a sample of the configurations of the simulated system, the estimates of thermodynamic quantities that it produces are associated with a statistical error. We consider this error in detail in appendix C. Here we confine ourselves to quoting some of the important results; full derivations of them can be found in this appendix and .

We quantify the statistical error in our estimate using the Standard Error of the Mean,

\[ \sigma(\bar{O}) \equiv \sqrt{\langle \bar{O}^2 \rangle - \langle \bar{O} \rangle^2} \]

If all configurations were uncorrelated, this would be simply related to the variance of $O$, $\mathrm{var}(O) \equiv \langle O^2 \rangle - \langle O \rangle^2$, thus:

\[ \sigma_{uncorr}(\bar{O}) = \sqrt{\mathrm{var}(O) / N_c} \]

However, because the MC method generates a new configuration by making a small change to the existing one, adjacent configurations in the Markov chain are in practice always highly correlated. The consequence of this is that the standard error of the mean is larger than the uncorrelated result suggests; we have

\[ \sigma_{corr}(\bar{O}) \sim \sqrt{\tau_O\, \mathrm{var}(O) / N_c} \]

where $\tau_O$ is the correlation time of the observable $O$. This can be measured by expressing it in terms of correlation functions . We consider $\tau_O$ mainly in section , where we shall use it as an analytic prediction of the variance of estimators from various sampled distributions.

However, to measure the standard error of $\bar{O}$ it is easier to use a direct method. To do this, we block the configurations into $N_b$ blocks, labelled $m = 1 \ldots N_b$ (a modest number of blocks is enough), and estimate $\langle O \rangle$ on each block ( chapter ). Then we measure the mean and variance of the blocked means $\{\bar{O}_m\}$ and use the simple formula

\[ \sigma(\bar{O}) = \sqrt{\mathrm{var}(\bar{O}_m) / N_b} \]

since the blocks should be long enough for the block means to be uncorrelated (if they are not, then $N_c$ is not large enough for good results anyway). A variant of this is to define Jackknife estimators $\bar{O}^J_m$ on all the blocks of configurations except the $m$th, and then to find the mean and variance of these (see appendix D).
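The blocking estimate of the standard error can be sketched as follows. The function names are ours, and two choices are assumptions of this example rather than statements about the method as used in the thesis: any samples left over after dividing the chain into equal blocks are discarded, and the variance of the block means is computed with the unbiased $N_b - 1$ denominator.

```python
import math


def block_means(samples, n_blocks):
    """Means of n_blocks consecutive, equally sized blocks (left-over samples are ignored)."""
    size = len(samples) // n_blocks
    return [sum(samples[b * size:(b + 1) * size]) / size for b in range(n_blocks)]


def blocked_standard_error(samples, n_blocks):
    """Standard error of the overall mean, estimated from the scatter of the block means."""
    means = block_means(samples, n_blocks)
    grand_mean = sum(means) / n_blocks
    variance = sum((m - grand_mean) ** 2 for m in means) / (n_blocks - 1)
    return math.sqrt(variance / n_blocks)
```

The estimate is trustworthy only if each block is much longer than the correlation time, so that the block means can be treated as independent.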

We should note that, whatever algorithm we are working with, we must expect to have to update all the particles, or at least a constant fraction of them, to get uncorrelated configurations. This implies that the best we can do is have $\tau_O \sim L^d$.

It has been the norm in MC simulation to choose the sampled distribution to be the Boltzmann distribution, as described above. However, other sampled distributions can be used and may in fact be superior, particularly for problems involving phase transitions, for which, as we have mentioned, the Boltzmann distribution does not perform well: in terms of the ideas of correlation times that we have just introduced, we would say that $\tau_O$ near phase transitions can be very large indeed. The investigation of alternative non-Boltzmann sampled distributions that are better matched to this problem has been the key concern in this thesis. In section below we describe why Boltzmann sampling gets into trouble; chapters and contain results of investigations of various non-Boltzmann sampled distributions.

Monte-Carlo Simulation at Phase Transitions

Let us consider the ways that statistical mechanics suggests we could try to find the values of the control parameters ($\beta$ and $H$ or $p$) that produce a phase transition in a particular system. Section suggests two ways in which we could try to do this. We shall discuss each in turn and explain why conventional MC methods encounter difficulties in each case. It was shown in section that the phase behaviour of a system is reflected in the probability distribution of the order parameter, with phase coexistence occurring when the two phases have equal weight. So an obvious approach would be to simulate in an ensemble with a variable order parameter that embraces the two phases and measure its probability distribution directly, eventually finding the values of the control parameters where the two phases have equal weight. This method can be considered as measuring the free energy difference between the two phases: we saw in equation that $P^{can}_A / P^{can}_B = \exp[-\beta (G_A - G_B)]$, which implied that at coexistence $g_A - g_B \equiv \Delta g \sim O(L^{-d})$. Note that it is not necessary to know the absolute free energies $G_A$ and $G_B$ themselves.

The other method would be simply to measure the absolute free energies $G_A$ and $G_B$ of the two phases separately and then compare them. This is particularly attractive if there is a difficulty in crossing the phase boundary, as happens, for example, in the case of solid-fluid transitions: it requires a very long time for the solid to crystallise out of the fluid, and any crystal formed is likely to contain defects and grain boundaries. We shall show below how the absolute free energy of a phase can also be expressed as a canonical average. As a variation on this, we remark that equation shows that it is not necessary to know $F(V)$ (to use the off-lattice example) for all $V \in A$ to estimate $G_A(p)$, as long as $V(p)$ is known. As we have already commented, this is the basis of the double-tangent construction, which is described in appendix B. However, we will more often use the calculation of $G$ as a canonical average in this thesis.

Free Energy Differences

Let us examine the first method first, in the context of measurement of $P^{can}(M)$ by Boltzmann sampling for the 2d Ising model. Suppose we have $H = 0$ and $\beta > \beta_c$, so that we are actually at the phase coexistence and should find that the two sides of the distribution have equal weight, that is, that there is no free energy difference between the two phases. However, let us imagine that we do not know that $H = 0$ is the phase coexistence field, only that it is a reasonably close approximation (so that $P^{can}(M)$ does have two maxima), which we are trying to refine.

The diagrams in figure illustrate the problems faced by Boltzmann sampling Monte-Carlo. All show the underlying distribution $P^{can}_L(M)$ (it has roughly this shape for all $\beta > \beta_c$) sampled by the simulation, and diagrams A2 and B2 also show the function $M P^{can}_L(M)$, which gives the weight with which each part of the macrostate space contributes to the canonical average $\langle M \rangle$. We also show some possible data histograms of visited states produced after a short run (A figures) and after a long run (B figures). These histograms would give the estimate for the probability of each phase. The accuracy of our assessment of whether we are at phase coexistence, and of any canonical averages obtained, is clearly limited by the average time $\tau_{rw}$ required to travel between the peaks. Thus we see that after the short simulation we have not visited both sides of the distribution. Only after a long run have enough tunnelling events between the two sides of the distribution occurred to give us a good idea of their relative weight and to outline the shape of the whole of $P^{can}(M)$. The accuracy obtained is limited not by the total number of configurations generated, which could be millions, but by the number of tunnelling events. The presence of such bottlenecks in configuration space can cause the results of even a very long simulation to be very poor. In such a case it is said that the simulation suffers from ergodic problems. Note, however, that the shape of the distribution within each peak is obtained quite rapidly, so an estimate of $\langle |M| \rangle_A$ within phase A from the short run would be quite accurate, even though the estimate of $\langle M \rangle$, which depends on both phases, is very poor.

[Figure panels, each plotted against $M$: A1, $P^{can}(M)$ and histogram, short run; A2, estimation of $\langle M \rangle$ from short run, showing $P^{can}(M)$ and $M P^{can}(M)$; B1, $P^{can}(M)$ and histogram, long run; B2, estimation of $\langle M \rangle$ from long run, showing $P^{can}(M)$ and $M P^{can}(M)$.]

Figure : $P^{can}(M)$, $M P^{can}(M)$ and possible data histograms for an Ising model at $\beta > \beta_c$.

We understand, then, that adequate sampling will only be obtained with a run much longer than $\tau_{rw}$, but how long is this likely to be? The part of the sampled distribution that has the greatest effect on $\tau_{rw}$ is, of course, the region of low probability between the peaks, through which the simulation has to pass to get from one peak to the other. At criticality $\tau_{rw}$ grows only as a power of $L$, because the relative heights of the centre and peaks of the pdf do not change with $L$. However, at $\beta > \beta_c$ we are in the region of first-order phase transitions, and here, as we described in section , to pass through the region around $M = 0$ we must create an interface between the two phases, with a free energy cost of order $L^{d-1} f_s$. Thus with Boltzmann sampling we must wait on average for a time

\[ \tau_M \sim \exp(\beta f_s L^{d-1}) \]

before a fluctuation in the energy that is large enough to do this occurs. Unless $L$ is very small or $\beta$ is very close to $\beta_c$, this exponential slowing down is so severe that in any run of practicable length the simulation will remain effectively trapped in one phase and never tunnel to the other.

We took as a premise when starting this discussion that the field $H$ was close enough to zero that $P^{can}_L(M)$ has in reality roughly equal weights in the two phases; equation shows how sensitive $P^{can}_L(M)$ is to the applied field: for there to be roughly equal weights implies $H \sim O(L^{-d})$ for the Ising model. In fact, of course, for most systems, particularly off-lattice models, $H_{coex}$ is not determined by symmetry, and determining it is our major concern. But the analysis we have just given shows that, with Boltzmann sampling, the estimate of $P^{can}_L(M)$ obtained at the coexistence point is indistinguishable from that obtained at points in the single-phase region. We can do no more than put a wide bracket on $H_{coex}$: $H_{coex}$ is certainly less than a field (or pressure) $H_h$ that drives a simulation started in the high-$M$ phase into the low-$M$ phase, where it is then observed to stay, but it is certainly more than $H_l$, which allows a simulation started in the low-$M$ phase to pass into the high-$M$ phase, where it then stays. $H_l$ and $H_h$ must be at least far enough from $H_{coex}$ that $F_L(M) - H_l M$ and $F_L(M) - H_h M$ are convex everywhere.

We should note that the shape of $P^{can}_L(M)$ causes problems only because the structure of the matrix $R_{ij}$ that determines the pseudodynamics is such that the simulation has to pass through the region of low probability between the peaks in order to get from one peak to the other. In this respect the pseudodynamics of the Metropolis MC simulation resemble the dynamics of a real system, which must also evolve by small steps, and the consequences are similar: the ergodic problems of the simulation correspond to the tendency of real systems to exhibit metastability (section ). $R_{ij}$ is, of course, under our control, but naive attempts to improve the algorithm do not succeed: as we explained in section , generating a new configuration by making a large random change to the existing one produces an exponentially small $r_a$, easily small enough to negate the effect of a larger average change in magnetisation.

For lattice models only, the tunnelling problems can in fact be overcome: there is a class of algorithms, called cluster algorithms , that are able to generate new configurations with a large $\Delta M$ without necessarily incurring a large positive $\Delta E$. Hybrid MC is able to some extent to do the same thing. Whilst these methods are extremely effective, we have taken the simple Metropolis algorithm as a given in this thesis and concentrated instead on improving performance by the use of non-Boltzmann sampled distributions. With a cluster algorithm, Boltzmann sampling is an excellent strategy: the region between the peaks does not contribute much to the weight of a phase or to $\langle M \rangle$, and time spent there is time that cannot be devoted to sampling the peaks, where most of the weight lies. However, with the Metropolis algorithm a very substantial improvement is obtainable by choosing a different sampled distribution that puts more weight between the peaks and so reduces $\tau_{rw}$, even at the cost of a reduction in the sharpness of the definition of the peaks themselves (see section and much of chapter ). In fact, even after choosing a better sampled distribution, $\tau_{rw}$ remains large because of the width of $P^{can}_L(M)$: the two important regions of macrostate space (the two peaks) are separated by a distance $M$ that grows like $L^d$. In the case where the sampled distribution is roughly flat between the peaks and the same height as them, we still require on average the random walk time

\[ \tau_{rw} \sim \frac{M^2}{r_a\, (\Delta M)^2} \]

to travel between the peaks, where the average change of magnetisation per move is $\Delta M$. As we have seen, increasing $\Delta M$ fails to have the desired effect because of a dramatic fall in $r_a$ (it is unusual in MC simulation to settle for an acceptance ratio of less than ). Therefore $\tau_{rw} \sim L^{2d}$. How this relates to $\tau_O$ is discussed in section .

Measuring Absolute Free Energies

Faced with the difficulties described above, we may try to avoid the interface region entirely and measure the free energy of each phase separately. In this case we require absolute free energies, so that the two phases can be compared. These can in fact be derived from averages over the Boltzmann distribution of operators which are exponentials: for the Ising model with $H = 0$ we find

\[ \langle \exp(\beta E) \rangle = \frac{1}{Z} \sum_{\{\sigma\}} \exp(\beta E(\sigma)) \exp(-\beta E(\sigma)) = \frac{2^{L^d}}{Z} \]

so that

\[ Z = \frac{2^{L^d}}{\langle \exp(\beta E) \rangle} \]

and

\[ \beta G = -L^d \ln 2 + \ln \langle \exp(\beta E) \rangle \]

This is in fact $G$ of the entire system, not just of a single phase, because $H = 0$ is the coexistence line, but it nevertheless serves to illustrate the principle, and if we restricted the algorithm to generate only configurations with $M > 0$, say, the result would indeed be the free energy of that phase. However, attempting to implement this method with Boltzmann sampling is again very unsatisfactory. In this case the problem is not that the sampled distribution produces ergodic problems, but simply that the average to be measured and the sampled distribution put their weight in different macrostates. Consider figure , which shows schematically the situation in the case we have just described, for the sampling of energy macrostates in the 2d Ising model.
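The identity relating the free energy to the canonical average of $\exp(\beta E)$ can be verified by exact enumeration on a lattice small enough to list every microstate. The sketch below is an illustration only: the $3 \times 3$ periodic lattice, the coupling $J = 1$, the value of $\beta$ used in the note and the function names are assumptions of this example. It also exposes the mismatch discussed above, since the energy at which $P^{can}(E)$ is largest differs from the energy at which $P^{can}(E) \exp(\beta E)$, which is proportional to the density of states, is largest.

```python
import itertools
import math
from collections import Counter


def ising_energy(spins, L):
    """E = -sum over nearest-neighbour bonds of s_i s_j on a periodic L x L lattice (J = 1, H = 0)."""
    e = 0
    for x in range(L):
        for y in range(L):
            s = spins[x * L + y]
            e -= s * (spins[((x + 1) % L) * L + y] + spins[x * L + (y + 1) % L])
    return e


def density_of_states(L):
    """Number of microstates at each energy, by exhaustive enumeration."""
    return Counter(ising_energy(c, L) for c in itertools.product((-1, 1), repeat=L * L))


def canonical_distribution(g, beta):
    """Canonical probability of each energy macrostate, and the partition function Z."""
    weights = {E: n * math.exp(-beta * E) for E, n in g.items()}
    Z = sum(weights.values())
    return {E: w / Z for E, w in weights.items()}, Z


def free_energy_from_average(g, beta, L):
    """G from beta G = -L^d ln 2 + ln <exp(beta E)>, with the average taken exactly."""
    p, _ = canonical_distribution(g, beta)
    average = sum(pE * math.exp(beta * E) for E, pE in p.items())
    return (-(L * L) * math.log(2) + math.log(average)) / beta
```

With `g = density_of_states(3)` and $\beta = 0.6$, `free_energy_from_average(g, 0.6, 3)` agrees with $-\beta^{-1} \ln Z$ computed directly. In a simulation the average would of course have to be estimated from sampled configurations, and it is there that the difficulty arises.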

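[Figure panels, each plotted against $E$: left, $P^{can}(E)$ and histogram; upper right, estimation of $\langle O_1(E) \rangle$, showing $O_1(E)$, $P^{can}(E)$ and $P^{can}(E) O_1(E)$; lower right, estimation of $\langle O_2(E) \rangle$, showing $O_2(E)$, $P^{can}(E)$ and $P^{can}(E) O_2(E)$.]

Figure : A schematic diagram of a Boltzmann sampling distribution over energy, and a suitable ($O_1$) and an unsuitable ($O_2$) operator to estimate from it.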

The leftmost diagram shows a typical sampled distribution for a Boltzmann sampling simulation and a histogram that might be generated from it. This distribution is well suited to estimating $\langle O_1 \rangle$, where $O_1(E)$ and $O_1(E) P^{can}(E)$ are shown in the upper right illustration. However, $P^{can}(E)$ is unsuited to estimating $\langle O_2 \rangle$ (lower right illustration), because $O_2(E)$ increases so fast with $E$ that $P^{can}(E) O_2(E)$ has a lot of weight in one of the tails of $P^{can}(E)$. $\exp(\beta E)$ is, of course, an operator of this kind.

The error in the estimate of $P^{can}(E)$ is larger in the tails, because there are fewer counts, and the effect of this error on $\langle O_2 \rangle$ is magnified by multiplying by $O_2$. Finally, it normally happens, as shown in the diagram, that $P^{can}(E)$ becomes so small for states far into the tail that no counts are recorded there, and so no contribution is made to $\langle O_2 \rangle$, even though $P^{can}(E) O_2(E)$ is still large. This is liable to happen whenever $P^{can}(E) < 1/N_c$, which soon arises since $P^{can}(E) \sim \exp(-cL^d)$ in the tail. Thus MC sampling cannot be used to evaluate averages like $\langle O_2 \rangle$ if the largest values of $P^{can}(E) O_2(E)$ are produced by the multiplication of a very small value of $P^{can}(E)$ by a very large value of $O_2(E)$.
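The effect of the unsampled tail can be quantified in a toy model for which everything is known exactly. The sketch below assumes a hypothetical system of independent two-level units with unit level spacing, chosen only because its canonical distribution is a binomial; it is not a system studied in this thesis, and the function names are ours. Bins whose probability is below $1/N_c$ are treated as receiving no counts, which is an idealisation of a finite run.

```python
import math


def canonical_distribution(n_units, beta):
    """P(E = k) for n_units independent two-level units with level spacing 1."""
    weights = [math.comb(n_units, k) * math.exp(-beta * k) for k in range(n_units + 1)]
    z = sum(weights)
    return [w / z for w in weights]


def exp_average(p, beta, n_samples=None):
    """<exp(beta E)>; if n_samples is given, bins with P(E) < 1/n_samples are dropped."""
    total = 0.0
    for k, pk in enumerate(p):
        if n_samples is None or pk >= 1.0 / n_samples:
            total += pk * math.exp(beta * k)
    return total
```

For 40 units at $\beta = 2$, the full average equals $[2 / (1 + e^{-\beta})]^{40}$, but the average restricted to bins that a run of $10^6$ configurations would be expected to visit recovers well under half of it, because the product $P(E) \exp(\beta E)$ peaks at an energy that is essentially never sampled.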

This problem can also be ameliorated by using a non-Boltzmann sampled distribution, one that extends over the region where $O = \exp(\beta E)$ is large. We explain how an estimate of the canonical average $\langle O \rangle$ can be extracted from this in section . Indeed, the Ising problem that we have just described will be the testbed for an investigation of this single-phase method in chapter . In the Ising case we shall look at $O(E) = \exp(\beta E)$; for an off-lattice model, for example a fluid in the $NpT$ ensemble, we would need to evaluate $\langle \exp(\beta (p - p') V) \rangle$, where $p'$ is small.

Discussion

We have introduced the theory of Statistical Mechanics and a method of computer simulation, the Monte-Carlo method, that is naturally related to it. We have described the problem of finding the location of phase transitions and how it relates to the concept of free energy, and we have described two approaches by which the MC method could be used to tackle this problem. We have also explained why MC simulation in its most easily-applied Boltzmann Sampling form fails for both of these approaches. Our task in this thesis will be to investigate ways in which the difficulties may be overcome by the use of MC simulations which sample from distributions other than the Boltzmann distribution. We shall look at methods for generating and applying these distributions in chapter , and shall produce some new results for the behaviour of the pdf of its magnetisation. Then in chapter we shall apply the method to investigate a system of topical interest, the square-well solid. But first we shall review the extremely extensive literature on the problem of free energy measurement.

Chapter

Review

And what there is to conquer

By strength and submission has already been discovered

Once or twice or several times

FROM Four Quartets

T S ELIOT

As we saw in chapter , the problems of Monte-Carlo simulation of phase transitions, particularly first-order phase transitions, centre around the difficulty of measuring the appropriate free energy or free energy difference. If we keep the order parameter of the transition constant, then we are faced with measurement of a Helmholtz free energy, for which a Boltzmann sampling algorithm is not suitable because it does not sample the high-energy configurations. If we allow the order parameter to vary, then there is a large free energy barrier separating the two phases. Because the simulation's pseudodynamics constrains it to move in short steps through its configuration space, it takes an exponentially long time to cross this barrier.

A large amount of work has already been done on solving these problems. Progress up to is described in the review articles by Binder and Frenkel, while covers some developments, especially for dense liquids, up to. A more recent short review can be found in. Two recent methods that have been the basis of the work done in this thesis are the multicanonical ensemble (or for a review) and the expanded ensemble.

We shall now give a brief description of the important methods, going over again some of the ground covered in the reviews but also bringing in the newer methods. We avoid a detailed description of the multicanonical ensemble at this stage; such a description can be found in chapter. We shall divide the methods described into three categories, as follows:

1. Methods which find a free energy by expressing it in terms of some other quantity which is more readily evaluated in Boltzmann sampling Monte Carlo simulation. We shall call these integration-perturbation methods. They are:

   (a) Thermodynamic Integration
   (b) Multistage Sampling
   (c) Bennett's acceptance ratio method
   (d) Mon's Method
   (e) Widom's particle insertion method

   We shall also include in this section a description of a relevant related technique:

   (g) Histogram methods

2. Methods which sample from a distribution other than the canonical Boltzmann distribution with a constant number of particles. We shall call these non-Boltzmann methods. They are:

   (a) Umbrella Sampling
   (b) The Multicanonical Ensemble and its variants. Originally due to Berg and Neuhaus; we describe work by those authors and others (Lee, Rummukainen, Celik, Hansmann and others)
   (c) Expanded Ensemble, due to Lyubartsev et al. (called Simulated Tempering by Marinari and Parisi). We also describe related methods by Geyer and Neal
   (d) Valleau's Density-Scaling
   (e) The Dynamical Ensemble
   (f) Grand Canonical Monte Carlo
   (g) The Gibbs Ensemble

3. Others. Mostly these are coincidence counting methods, which try to measure the probability of an individual microstate. They are:

   (a) Coincidence Counting (Ma's Method)
   (b) Local States (Meirovich, Schlijper)
   (c) Rickman and Philpot's function-fitting method
   (d) The partitioning method of Bhanot et al.

We shall describe each method individually and discuss its advantages and disadvantages before discussing and comparing them with one another.

Integration-Perturbation Methods

Thermodynamic Integration

Thermodynamic integration (TI) may perhaps be considered the standard method; certainly it is the easiest way to perform free energy calculations, because a Boltzmann sampling program that may already exist can often be used with little or no modification. A review of some of the many applications can be found in.

It has been the norm to use constant-$NVT$ simulations in these calculations; all the examples given here assume that this is the case. Applied in this fashion, and assuming that the order parameter is constant in this ensemble, the method does not allow the direct determination of the whole probability density function (pdf) of the order parameter ($V$, say), but rather measures $F(V)$ for a particular $V$. However, there is no reason why it cannot be implemented using constant-$NpT$ simulations, in which case equations analogous to those below (equation etc.) lead to $G(p)$. In whatever form, the method relies on the fact that the derivatives of free energies may often be related to canonical averages that are easily measured by Boltzmann sampling; for example,

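\[ \left( \frac{\partial (\beta F)}{\partial \beta} \right)_V = \langle E \rangle_\beta , \]

which implies

\[ \int_{\beta_0}^{\beta_1} \langle E \rangle_\beta \, d\beta = \beta_1 F(\beta_1) - \beta_0 F(\beta_0) . \]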

To use this equation we measure $\langle E \rangle_\beta$ by simulation at constant $V$ for a series of values of $\beta$ connecting $\beta_0$ and $\beta_1$, closely spaced enough that the shape of $\langle E \rangle_\beta$ is well determined. Then the integration is performed numerically, typically by using a to point Gauss-Legendre quadrature. It is important that the path of integration should not pass through a first-order phase transition: if it does, then $\langle E \rangle_\beta$ will itself be poorly determined because of metastability problems, and it will vary so fast with $\beta$ that the determination of its shape will be difficult. As a result the accuracy of the numerical integration will be drastically reduced.

$\beta_0$ and $\beta_1$ are known, so $F(\beta_1)$ is determined if we know $F(\beta_0)$. This state at $\beta_0$ must therefore be chosen to be a reference state of known free energy. It usually corresponds to either a very high or a very low temperature. In the high-temperature limit, at $\beta_0 = 0$, we have $Z = \exp(-\beta F) = V^N$ for an $N$-particle fluid with a soft potential. If the system is a fluid with a hard core in its potential, then the system reduces to the hard-sphere fluid, a system for which the free energy is known with good accuracy from analytic calculations. At very low temperatures (high $\beta$), a solid with a smooth potential, to take a different example, approaches a harmonic solid. Examples of the use of equation in calculation can be found in, which use a low-$\beta$ reference state, and, which use a high-$\beta$ one.
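The procedure just described can be illustrated with a minimal Python sketch. It is not taken from the thesis: the toy system (independent two-level sites whose $\langle E \rangle_\beta$ is known in closed form, standing in for a simulation measurement), the function names and the choice of an 8-point rule are all assumptions made for illustration. The reference state is taken at $\beta_0 = 0$, where $Z = 2^N$ for this toy model.

```python
import numpy as np

def mean_energy(beta, n_sites=100, eps=1.0):
    """Exact <E> at inverse temperature beta for n_sites independent
    two-level systems with levels 0 and eps (a stand-in for an MC measurement)."""
    return n_sites * eps / (np.exp(beta * eps) + 1.0)

def ti_beta_f(beta1, n_points=8, n_sites=100, eps=1.0):
    """Estimate beta1*F(beta1) by integrating <E> from beta0 = 0 to beta1
    with an n_points Gauss-Legendre rule."""
    nodes, weights = np.polynomial.legendre.leggauss(n_points)
    # Map the quadrature nodes from [-1, 1] onto [0, beta1].
    betas = 0.5 * beta1 * (nodes + 1.0)
    integral = 0.5 * beta1 * np.sum(weights * mean_energy(betas, n_sites, eps))
    # Reference state (assumed): at beta0 = 0, Z = 2**n_sites, so beta0*F(beta0) = -n_sites*ln 2.
    reference = -n_sites * np.log(2.0)
    return reference + integral

def exact_beta_f(beta, n_sites=100, eps=1.0):
    """Closed-form beta*F for the toy model, used only as a check."""
    return -n_sites * np.log1p(np.exp(-beta * eps))

if __name__ == "__main__":
    print(ti_beta_f(2.0), exact_beta_f(2.0))
```

In a real application each call to `mean_energy` would be replaced by a separate equilibrated simulation at the corresponding $\beta$, so the number of quadrature points directly sets the number of simulations required.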

Another equation that is often used is

\[ \left( \frac{\partial F}{\partial V} \right)_T = -p , \]

which leads to

\[ -\int_{V_0}^{V_1} p \, dV = F(V_1) - F(V_0) , \]

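and we measure $p$ using (see)

\[ p = \frac{N k_B T}{V} + \langle \mathcal{P} \rangle , \]

where

\[ \mathcal{P} = -\frac{1}{3V} \sum_i \sum_{j > i} r_{ij} \frac{\partial E(r_{ij})}{\partial r_{ij}} . \]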

This time the usual reference state is one of high $V$, for which the virial expansion may be used to obtain $p(V)$ and thus $F(V)$. Near the reference state, equation may be badly behaved; better numerical stability is obtained by using

\[ N \int_{\rho_0}^{\rho_1} \frac{p}{\rho^2} \, d\rho = F(\rho_1) - F(\rho_0) , \]

where $\rho$ is the density.

TI based on equation has been used so often in the literature that we shall not attempt to give an exhaustive list; two recent examples are.

Both the above are examples of what might be called "natural" TI: the integration path is of the type that might be followed in an experiment on a real system. However, a Monte Carlo (MC) simulation is more flexible than a laboratory calorimetry experiment, since the reference system may differ from the system of interest in the form of its configurational energy as well as in its control parameters, and indeed need not correspond to any real system at all. It is in fact common to use "artificial" methods, where we take advantage of this greater freedom to change the details of the interaction between the particles. In this case we usually introduce a parameter $\lambda$ which controls the change of the interaction, so that by varying it we can smoothly transform the system under investigation into the reference system, for which the free energy is known exactly. If we write $E = E(\lambda)$, then from the definition of $F$,

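\[ \frac{\partial F}{\partial \lambda} = \frac{\int d\sigma \, \frac{\partial E}{\partial \lambda} \exp(-\beta E(\lambda))}{\int d\sigma \, \exp(-\beta E(\lambda))} = \left\langle \frac{\partial E}{\partial \lambda} \right\rangle_\lambda , \]

where $\langle \cdot \rangle_\lambda$ indicates that the canonical average is evaluated by a Boltzmann sampling simulation with configurational energy $E(\lambda)$. We now have an equation of the form of equation; the desired free energy is obtained by casting it into the form of equation or and integrating, as in the "natural" examples.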
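A minimal sketch of "artificial" TI along a coupling parameter is given below. It is an illustration under stated assumptions rather than the procedure of any of the cited works: the system is a single one-dimensional harmonic oscillator whose spring constant is switched linearly from `k0` to `k1`, the canonical samples are drawn directly from the known Gaussian distribution instead of by Metropolis sampling, and all names and parameter values are hypothetical.

```python
import numpy as np

def mean_dE_dlambda(lam, k0, k1, beta, n_samples, rng):
    """Canonical average of dE/dlambda for E(lambda) = 0.5*(k0 + lambda*(k1 - k0))*x**2.
    Samples are drawn exactly from the Gaussian Boltzmann distribution (an
    assumption that replaces a Metropolis simulation in this toy example)."""
    k = k0 + lam * (k1 - k0)
    x = rng.normal(0.0, 1.0 / np.sqrt(beta * k), size=n_samples)
    return np.mean(0.5 * (k1 - k0) * x**2)

def ti_lambda(k0=1.0, k1=4.0, beta=1.0, n_points=6, n_samples=100_000, seed=0):
    """Estimate F(lambda=1) - F(lambda=0) by Gauss-Legendre integration over lambda."""
    rng = np.random.default_rng(seed)
    nodes, weights = np.polynomial.legendre.leggauss(n_points)
    lams = 0.5 * (nodes + 1.0)  # map nodes from [-1, 1] onto [0, 1]
    averages = np.array(
        [mean_dE_dlambda(lam, k0, k1, beta, n_samples, rng) for lam in lams]
    )
    return 0.5 * np.sum(weights * averages)

if __name__ == "__main__":
    # For this toy model the exact answer is ln(k1/k0) / (2*beta).
    print(ti_lambda(), 0.5 * np.log(4.0))
```

The statistical scatter of the result comes entirely from the sampled averages at each quadrature point; the quadrature error for so smooth an integrand is negligible by comparison.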

A typical application is Frenkel and Ladd's method, designed for solid systems. We shall use a variant of this method in section. Here

\[ E(\lambda) = E + \lambda U , \]

where $U$ is the energy of an Einstein solid with its lattice sites at $r_i^{eq}$. The Einstein solid is a crystal composed of non-interacting point particles, each attached to its lattice site by a harmonic spring, so

\[ E(\lambda) = E + \lambda \sum_{i=1}^{N} (r_i - r_i^{eq})^2 . \]

In almost every application the extra interaction $U$ is added linearly, so that $\partial E / \partial \lambda$ is just $U$. The desired free energy $F(0)$ is found from

\[ \int_0^{\lambda_{max}} \langle U \rangle_\lambda \, d\lambda = F(\lambda_{max}) - F(0) . \]

It is apparent that $F(\lambda_{max})$ is equal to $F_{ein}$, which is exactly known, only in the unmeasurable limit $\lambda_{max} \to \infty$, but Frenkel and Ladd use a series expansion of $F(\lambda) - F_{ein}$ to correct their large-$\lambda$ results. They carry out simulations on hard spheres to investigate the relative stabilities of fcc and hcp hard spheres, finding that fcc is marginally the more stable structure. Their results agree with those of.

Another useful technique for measuring free energies of solids is the single-cell occupancy method of Hoover and Ree. Equation is used for the integration and (the essential component of the method) each particle is constrained to stay throughout within the Wigner-Seitz cell formed by its neighbours. This does not affect the solid phase, where diffusion of the particles is in any case prevented by their neighbours, but stops the first-order melting transition that would otherwise occur when the density falls sufficiently. The reference system is thus a solid which is vastly expanded, so that the interaction of the particles is extremely small and the partition function can be calculated exactly; it is similar to the ideal gas except that each particle has available to it only a fraction of the total volume. However, it should be mentioned that there is some evidence that a second-order or weak first-order phase transition does take place in spite of all efforts, and because of this extra computer time is needed in this region for equilibration and to capture the shape of $E(V)$.

Advantages and Disadvantages

As we have said, a major advantage of the method is that it may require very little modification of an existing Boltzmann sampling routine to use it, though an "artificial" TI method will usually need some alteration to insert the extra potential ($U$ above). The major disadvantage comes from the inability to handle phase transitions: when using TI in phase transition problems, we do not usually have even the option of measuring the free energy difference between the phases directly. They have to be treated separately, and each linked with its separate reference state. Whether or not this causes any difficulty depends on the length of the integration path required, and on whether or not it is easy to find an integration path that does not cross a phase transition. It may be difficult to find such a path: we have seen that for solids this requires the use of a trick like the single-cell occupancy method, and the same problem may well arise in fluid problems, where integration from the dense fluid (liquid) phase to a very dilute state would cross the liquid-vapour line. One way to solve this problem would be to integrate around the critical point, though this seems not to have been tried for a simple fluid. In the phase transition was prevented artificially by suppressing density fluctuations, but this clearly has the disadvantage that configurations with significant canonical weight may be suppressed.

The total time required by the method depends, then, on the number of simulation points required on the integration path, and so depends on the particular problem. If the path is long, or if $F(V)$ has some regions where its higher derivatives are large, then many points will be needed, and if some of the simulation points have ergodic problems (i.e. if equilibration at constant $V$ is a problem) then they will take a long time to obtain. The extreme example of this is a phase transition. Conversely, however, if $F(V)$ is well behaved then TI will be very efficient and is likely to outperform most other methods listed here, not least because of the way it scales with system size. Many methods require the number of simulation points (or the equivalent) to increase at least as $L^{d/2}$, because they require that adjacent simulation points have some likely configurations in common, and the size of the fluctuations in a canonical ensemble increases as $L^{d/2}$ while the size of the system, $L^d$, determines the separation of the reference state and the state of interest. With TI this is not so: the separation of simulation points depends only on the smoothness of the integrand and need not increase much with $L$ at all (see section for further comments). One example where a very smooth integrand has enabled a very large system to be investigated is discussed further in section below.

Estimating the error in measurements of free energy made by TI is not easy, and this must be counted a disadvantage of the method. An estimate can be obtained by the blocking procedure of section or by looking at the finite-size scaling behaviour of the estimate, which is done in, leading to a claim of an accuracy of. However, the total error thus obtained may be an underestimate of the true error, because it includes only the effect of random fluctuations, while the effect of, for example, rapid variation in $\langle E \rangle_\beta$ which is not well captured for the chosen spacing of simulation points is to put in a systematic error which is not detectable simply by repeating the simulation with different random numbers. Errors of this kind are found in our investigations in section. In some systems ergodic problems may also affect the estimates of $\langle E \rangle$. It is also important, though time-consuming, to equilibrate each simulation separately; it has been found that failure to do this may cause hysteresis. A common way to reduce the amount of equilibration that must be done is to use the final configuration of one simulation as the starting configuration of the next.

In the case where TI is done in an ensemble in which the system has a constant order parameter, so that the result is $F$ rather than $G$, it is easiest to find the coexistence curve by using the double-tangent construction (described in the context of a fluid simulation in appendix B; see also). This avoids the necessity of mapping out the whole of $F$. The process is easier once one coexistence point has been found, because the Clausius-Clapeyron equation

\[ \left( \frac{dp}{d\beta} \right)_{coex} = -\frac{\Delta H}{\beta \, \Delta V} \]

may be used to predict how $p$ must change for a given change in $\beta$ to keep on the coexistence curve. Integrating the Clausius-Clapeyron equation using a more complex predictor-corrector method is known as Gibbs-Duhem integration.

Multistage Sampling

In this method, also known as the overlapping distributions method, the idea is to measure the free energy difference between two canonical ensembles, defined on the same configuration space but with different values of the field variables, by using the overlap of the pdfs of some macrostate. The method seems to have been used first by McDonald and Singer. Later implementations include Valleau and Card, and the method has in fact been fairly widely used. To see how it works, consider distribution functions of the internal energy in two canonical ensembles at temperatures $\beta_0$ and $\beta_1$. We have

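\[ P^{can}_{\beta_0}(E) = \Omega(E) \exp(-\beta_0 E) / Z(\beta_0) \]

and

\[ P^{can}_{\beta_1}(E) = \Omega(E) \exp(-\beta_1 E) / Z(\beta_1) . \]

Clearly we can measure $P^{can}_{\beta_0}$ and $P^{can}_{\beta_1}$ from Boltzmann sampling MC simulation; let the estimators, derived from histograms of visited states, be $\tilde{P}^{can}_{\beta_0}$ and $\tilde{P}^{can}_{\beta_1}$. As we have said before, only the unknown value of $\Omega(E)$ prevents us from estimating $Z$. But we can eliminate it between the two equations:

\[ \frac{Z(\beta_1)}{Z(\beta_0)} = \frac{P^{can}_{\beta_0}(E)}{P^{can}_{\beta_1}(E)} \exp(-(\beta_1 - \beta_0) E) . \]

So we can now estimate $Z(\beta_1)/Z(\beta_0) = \exp(\beta_0 F(\beta_0) - \beta_1 F(\beta_1))$, as long as the state $E$ is such that we obtain both $\tilde{P}^{can}_{\beta_0}(E)$ and $\tilde{P}^{can}_{\beta_1}(E)$, that is to say, as long as the measured probability distributions $\tilde{P}^{can}_{\beta_0}$ and $\tilde{P}^{can}_{\beta_1}$ overlap. If they do overlap, it obviously makes sense to use all the energy states in the overlap region, which gives us

\[ \beta_1 F(\beta_1) - \beta_0 F(\beta_0) = -\ln \frac{\int_{ov} \tilde{P}^{can}_{\beta_0}(E) \exp(-(\beta_1 - \beta_0) E) \, dE}{\int_{ov} \tilde{P}^{can}_{\beta_1}(E) \, dE} . \]

If one of the states, at $\beta_0$ say, is a reference state of known free energy, equation now gives us $F(\beta_1)$.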
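The overlap estimate can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the implementation of any of the works cited: the system is a set of independent two-level sites, for which canonical energies can be drawn exactly from a binomial distribution (standing in for Boltzmann sampling MC), and the `min_count` threshold used to define the overlap region is a hypothetical choice.

```python
import numpy as np

def sample_energies(beta, n_sites, n_samples, rng):
    """Draw canonical energies of n_sites independent two-level systems
    (levels 0 and 1); each site is excited with probability 1/(exp(beta)+1)."""
    p_excited = 1.0 / (np.exp(beta) + 1.0)
    return rng.binomial(n_sites, p_excited, size=n_samples)

def overlap_estimate(e0, e1, beta0, beta1, n_levels, min_count=10):
    """Estimate beta1*F(beta1) - beta0*F(beta0) from energy samples e0 (taken
    at beta0) and e1 (taken at beta1), using only bins visited in both runs."""
    h0 = np.bincount(e0, minlength=n_levels)
    h1 = np.bincount(e1, minlength=n_levels)
    overlap = (h0 >= min_count) & (h1 >= min_count)
    if not overlap.any():
        raise ValueError("the two histograms do not overlap")
    p0 = h0 / h0.sum()
    p1 = h1 / h1.sum()
    energies = np.arange(n_levels)
    numerator = np.sum(p0[overlap] * np.exp(-(beta1 - beta0) * energies[overlap]))
    denominator = np.sum(p1[overlap])
    return -np.log(numerator / denominator)

def multistage_demo(n_sites=20, beta0=0.5, beta1=0.8, n_samples=200_000, seed=1):
    """Return (overlap estimate, exact value) for the toy model."""
    rng = np.random.default_rng(seed)
    e0 = sample_energies(beta0, n_sites, n_samples, rng)
    e1 = sample_energies(beta1, n_sites, n_samples, rng)
    estimate = overlap_estimate(e0, e1, beta0, beta1, n_sites + 1)
    exact = -n_sites * (np.log1p(np.exp(-beta1)) - np.log1p(np.exp(-beta0)))
    return estimate, exact

if __name__ == "__main__":
    print(multistage_demo())
```

If the two temperatures are moved further apart, or the number of sites is increased, the set of bins satisfying the overlap condition shrinks and eventually the function raises an error, which is the failure mode that bridging distributions are intended to cure.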

A similar equation results from considering the pdfs of the magnetisation (or whatever the order parameter is) at different values of the field $H$, and using the overlap region to eliminate $\exp(-\beta F(M))$ (compare the equations for $P^{can}_{\beta_0}(E)$ and $P^{can}_{\beta_1}(E)$ with equation).

However, this simple implementation runs into trouble for more than small free energy differences between small systems, because the two measured pdfs will fail to overlap. This problem becomes more acute as the system size increases: as we discussed in section, the fraction of the possible energy states of the system which have a canonical probability significantly different from zero goes like $N^{-1/2}$, making overlap with the reference state harder to achieve. In this problem is solved in a fairly obvious way by generating "bridging distributions": a series of simulations is performed using modified Boltzmann weighting, in which the exponent of the Boltzmann factor is scaled by one of a set of coefficients, i.e. effectively for a range of temperatures between $\beta_0$ and $\beta_1$. The set of coefficients is chosen so that adjacent distributions overlap, and the coldest bridging distribution overlaps with the Boltzmann simulation at the lower of the two temperatures, while the hottest overlaps with the simulation at the higher. Equation is applied repeatedly to eliminate all but $F(\beta_0)$ and $F(\beta_1)$, though if the process is applied between $\beta_0$ and any intermediate temperature it can be used to find $F$ there too. This implementation is what gives the method the name "multistage sampling". If the temperature is varied, as above, we commonly choose the reference state to be at infinite temperature ($\beta = 0$), when the free energy is often known exactly or to a good approximation, as described in section.

In Valleau and Card test the method on hard spheres with Coulomb forces and report results, with a quoted error, in good agreement with results obtained by other methods. They also point out that the exponential bridging distributions used are not the most efficient, being sharply peaked, though of course they require hardly any modification of a normal Metropolis MC program to produce.

It is interesting to note that if the two distributions overlap almost entirely then $\int_{ov} P^{can}_{\beta_1}(E) \, dE \approx 1$ and we need sample only the ensemble at $\beta_0$, evaluating $\langle \exp(-(\beta_1 - \beta_0) E) \rangle_{\beta_0}$. This case corresponds to the evaluation of $\langle O \rangle$ for the "suitable" operator $O$ in figure. However, unless $\beta_1 - \beta_0$ is quite small, $\exp(-(\beta_1 - \beta_0) E)$ will be more like the other operator from the same figure, and the estimator of $\langle O \rangle = \langle \exp(-(\beta_1 - \beta_0) E) \rangle_{\beta_0}$ obtained will accordingly be bad, for the reasons described while discussing this figure. The normal multistage method thus offers a way of overcoming the problem of incompatibility of sampled distribution and operator (provided that the pdfs at $\beta_0$ and $\beta_1$ overlap to some extent) by sampling both ensembles. We can see intuitively that the estimator of $\langle O \rangle$, which tends to be an underestimate, will be increased by dividing by $\int_{ov} \tilde{P}^{can}_{\beta_1}(E) \, dE$.

A variation on this temperature-changing version of multistage sampling, called Monte Carlo recursion, has been developed by Li and Scheraga. They express the free energy difference as a sum of terms, each the logarithm of a canonical average of an exponential of the energy at stage $i$, and use acceleration-of-convergence techniques to extrapolate it to. The method has been applied to the Lennard-Jones fluid with particles, and to two models of liquid water. In a comparison of the method is made with TI and multistage sampling, and it is found that the efficiency is about the same. In fairly good agreement with experiment is obtained. A simpler version of the same technique, with temperature doubling at each stage, has recently been rediscovered.

The multistage method as we have described it is, like TI, only suitable for sampling along a path on which there are no first-order phase transitions, because at the phase transition, which let us say occurs at $\beta_c$, there is a discontinuous jump in $\langle E \rangle$ which, if it is larger than the size of typical fluctuations, is likely to prevent overlap of the pdfs $P^{can}(E)$ on either side of the transition. Thus, like TI, it can only be used for finding the free energy of a single phase separately, not for measuring a free energy difference by sampling through the coexistence region. It is possible to overcome this problem by performing a series of simulations, all at $\beta_c$, each with an artificial constraint on $E$ to keep it in some range of values, narrow enough that all macrostates have appreciable Boltzmann probability, and overlapping with its neighbours. Then, by matching the probabilities in the overlap regions, it is possible to reconstruct the whole pdf of $E$ across the transition region (though with the caveat that equilibration of configurations containing interfaces may be extremely slow). Some authors, however, would describe this kind of implementation as umbrella sampling (see below). The same problem arises with the pdf of the order parameter, with $\beta$ replaced by the appropriate value of the external field, and it can be overcome to a certain extent in the same way.

Advantages and Disadvantages

The multistage sampling method has very similar advantages and disadvantages to TI, though it may be considered that, in its concentration on probabilities, it leads to free energies in a rather simpler and more transparent way. On the other hand, the method does demand overlap of the pdfs of adjacent simulations, which, as we have seen, thermodynamic integration does not necessarily require. Like TI, multistage sampling has the advantage that it requires little modification of existing Boltzmann sampling routines, but, also like TI, it cannot in its simplest form deal with first-order phase transitions on its path of integration. It is similarly vulnerable to poor estimation of the canonical pdf caused by problems of slow equilibration or free energy barriers in any of the substages of the simulation, particularly those at low temperature.

Whether the method gives us $F(V)$ or $G(p)$ depends on whether the order parameter is constant in the ensemble simulated or may vary. If it is constant, then a double-tangent construction (appendix B) will be required to find the coexistence field, just as for TI.

The Acceptance Ratio Method

Introduced by Bennett in, this method has also been used by Hong et al. and, in modified form, in (see section below). It is a similar method to multistage sampling, but extends it somewhat and addresses problems of optimising the process of measuring $F$. We should also remark that is also a good general reference to the problem of measuring free energy: the author discusses what were the major methods of doing so when it was published, as well as contributing several new ideas. He confines himself to methods where the system of interest is connected with a reference system of known free energy, and indeed asserts that the free energy can in general only be found by using a reference state. In fact this is not the case (the methods of section do not do so), but the vast majority of methods do use this technique.

Bennett then goes on to discuss in detail the problem of finding the free energy difference between two canonical ensembles (where one is a reference system, this will clearly give the absolute free energy of the other). He also treats in detail the statistical problem of analysing the available MC data to extract the best estimate of the free energy, although he has to treat this problem using the assumption that the data points are uncorrelated.

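Let the two systems be denoted by suffixes 0 and 1. Then for any function $W(\sigma)$ of the coordinate variables, we find

\[ \frac{Z_0}{Z_1} = \exp(-\beta (F_0 - F_1)) = \frac{Z_0 \int W \exp(-\beta (E_0 + E_1)) \, d\sigma}{Z_1 \int W \exp(-\beta (E_0 + E_1)) \, d\sigma} = \frac{\int W \exp(-\beta (E_0 + E_1)) \, d\sigma \, / \, Z_1}{\int W \exp(-\beta (E_0 + E_1)) \, d\sigma \, / \, Z_0} = \frac{\langle W \exp(-\beta E_0) \rangle_1}{\langle W \exp(-\beta E_1) \rangle_0} . \]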

This is clearly reminiscent of equation If we choose W exp E with E E then

it reduces to the equation for a single stage of multistage sampling with almost complete overlap

The acceptance ratio method itself is produced by the choice $W = \min\{\exp(\beta E_0), \exp(\beta E_1)\}$, in which case equation becomes

\[ \frac{Z_0}{Z_1} = \frac{\langle M_e(\beta (E_0 - E_1)) \rangle_1}{\langle M_e(\beta (E_1 - E_0)) \rangle_0} , \]

where $M_e$ is the Metropolis function $M_e(x) = \min\{1, \exp(-x)\}$.

So we see that we can estimate $Z_0 / Z_1 = \exp(-\beta (F_0 - F_1))$ by performing two simulations, and in each recording the average of the Metropolis function of the difference of the two energy functions. $\langle M_e(\beta (E_1 - E_0)) \rangle_0$ is the average of the probability for making a transition from ensemble 0 to ensemble 1, hence the name "acceptance ratio" method. However, no transitions are made in fact; we sample each ensemble separately. This should be compared with the expanded ensemble, described in section.
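As an illustration of the estimator, the following Python sketch applies it to a toy problem. The example is an assumption-laden illustration and not Bennett's own test case: the two "systems" are one-dimensional harmonic oscillators with spring constants `k0` and `k1`, their canonical distributions are sampled exactly as Gaussians rather than by Metropolis MC, no shift of the energy origin is applied, and the function names are hypothetical. The Fermi function is included as an alternative weight, since the same ratio formula holds with it in place of the Metropolis function.

```python
import numpy as np

def metropolis(x):
    """Metropolis function min{1, exp(-x)}."""
    return np.minimum(1.0, np.exp(-x))

def fermi(x):
    """Fermi function 1 / (1 + exp(x))."""
    return 1.0 / (1.0 + np.exp(x))

def acceptance_ratio_estimate(k0=1.0, k1=2.0, beta=1.0, n_samples=200_000,
                              seed=0, weight=metropolis):
    """Estimate Z_0/Z_1 for two harmonic oscillators E_i = 0.5*k_i*x**2 from
    separate samples of each ensemble; no transitions between them are made."""
    rng = np.random.default_rng(seed)
    # Exact canonical samples of each ensemble (stand-ins for two MC runs).
    x0 = rng.normal(0.0, 1.0 / np.sqrt(beta * k0), size=n_samples)
    x1 = rng.normal(0.0, 1.0 / np.sqrt(beta * k1), size=n_samples)

    def e0(x):
        return 0.5 * k0 * x**2

    def e1(x):
        return 0.5 * k1 * x**2

    # <weight(beta*(E_0 - E_1))> measured in ensemble 1.
    numerator = np.mean(weight(beta * (e0(x1) - e1(x1))))
    # <weight(beta*(E_1 - E_0))> measured in ensemble 0.
    denominator = np.mean(weight(beta * (e1(x0) - e0(x0))))
    return numerator / denominator

if __name__ == "__main__":
    # Exact result for this toy model: Z_0/Z_1 = sqrt(k1/k0).
    print(acceptance_ratio_estimate(), acceptance_ratio_estimate(weight=fermi), 2**0.5)
```

Because the two Gaussians overlap strongly here, both choices of weight give very similar answers; the differences between them only become important when the overlap is poor.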

Note also that the method is defined above for two systems with different potential energy functions but the same canonical variables $\{\sigma\}$. However, one system could actually have more than the other: the smaller system could be given dummy variables, whose contribution could be factored out of the configurational integral.

Bennett continues by making variational calculations (in which, however, the effect of correlations has to be ignored) to predict the conditions under which the method gives the results with the smallest variance. It proves necessary, if both $\langle W \exp(-\beta E_0) \rangle_1$ and $\langle W \exp(-\beta E_1) \rangle_0$ are to be measured accurately, that the two distributions of $E$ should overlap, and Bennett advocates shifting the origin of one of the potential functions to achieve this. The optimal amount of shift is just the free energy difference between the ensembles, which must be found iteratively. Note that it is not optimal to shift by the difference between the average energies of the two ensembles, as might first seem to be the case, because a correction for the relative likelihood of other configurations in each ensemble is necessary (cf. section).

Since transitions are not made between the 0 and 1 systems, it is also necessary to consider what fraction of the available time to spend in each. The result is that it is in fact almost always near-optimal to devote the same amount of time to each system. Finally, it can be shown that the choice for $W$ that minimises the variance of the estimate of the free energy difference is not $M_e(x)$ but the Fermi function $f_{fer}(x) = 1/(1 + \exp(x))$, though the difference is small.

If the distributions cannot be made to overlap, we must either generate bridging distributions, as in multistage sampling, or try extrapolating them if they are smooth enough. Though it is not stated explicitly, the criterion for effective extrapolation seems to be the same one that determines the separation of the points in thermodynamic integration. Bennett's extrapolation will work well if the shape of $P^{can}(E)$ is well described by its first few moments, which are related to $\langle E \rangle$ and $d\langle E \rangle / d\beta$, while TI over widely separated points also works if $F$'s higher derivatives are small, and these are related to the derivatives of $\langle E \rangle$ by equation. This seems to put overlapping-distribution methods back on an even footing with TI, though extrapolation is more complex to implement.

The paper also contains a discussion of other methods: numerical integration, the perturbation method (which is what has since become known as the single-histogram method), which is viewed as a limit of the acceptance ratio method where one ensemble is not sampled at all, and overlap methods like multistage sampling. In this discussion the multicanonical methods of Berg and Neuhaus (see section and chapter) are also anticipated, when the use of flat bridging distributions is suggested.

Advantages and Disadvantages

The acceptance ratio method, and the other extensions and optimisations of the overlapping distribution method that are considered here, seem to improve it to an extent where it is again competitive with TI. However, the analysis applies directly only to a two-stage process, and in a real application many stages will usually be necessary. To choose the spacing of the stages and optimise the method using Bennett's criteria is a complicated problem which would involve several trial runs, but fortunately it seems that the efficiency maximum is in practice quite wide and flat, and rough improvements (inserting an extra stage wherever the acceptance ratio cannot be measured properly, etc.) quickly give an answer very close to the optimum.

Another problem, which is not addressed in the paper, is that around a first-order phase transition ergodic problems may occur which cause the averages measured in each simulation separately to converge slowly. This problem is obscured in because it is assumed throughout the analytic derivations that the configurations that are sampled in each simulation are uncorrelated. No reference is made to the possibility of reducing the correlation time itself. It seems that in some cases the ergodic problems might be overcome by actually performing the transitions whose probabilities are measured; this is the basis of the expanded ensemble method. This is a point we shall return to in the discussion section, and indeed often in later parts of the thesis.

Mon's Finite-Size Method

Mon's method relies on MC sampling to find the difference in free energy density between a large and a small system. The free energy is then calculated for the small system by evaluating the partition function explicitly.

The method, described as implemented for the 2d Ising model with $H = 0$ (with an obvious generalisation to 3d), is this: consider the Ising model on a $2L \times 2L$ square lattice. Two different kinds of boundary conditions are considered: firstly, the normal periodic boundary conditions, for which the total energy function is $E_{2L}$, and secondly, boundary conditions which divide the $2L \times 2L$ lattice into four $L \times L$ lattices, each individually having periodic boundary conditions, for which the energy function of the four lattices considered as one composite system is $E_L$. It follows that

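\[ \frac{Z_{E_{2L}}}{Z_{E_L}} = \frac{\sum_{\{\sigma\}} \exp(-\beta E_{2L})}{\sum_{\{\sigma\}} \exp(-\beta E_L)} = \frac{\sum_{\{\sigma\}} \exp(-\beta (E_{2L} - E_L)) \exp(-\beta E_L)}{\sum_{\{\sigma\}} \exp(-\beta E_L)} = \langle \exp(-\beta (E_{2L} - E_L)) \rangle_{E_L} , \]

so that the difference in free energy density is

\[ g_{2L} - g_L = -\frac{\ln \langle \exp(-\beta (E_{2L} - E_L)) \rangle_{E_L}}{\beta (2L)^d} . \]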
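The identity relating the ratio of partition functions to a canonical average in the subdivided ensemble can be checked on a very small example. The sketch below is an illustration under stated assumptions: it uses a one-dimensional analogue (a periodic chain of $2L$ spins versus two periodic chains of $L$ spins) rather than the 2d lattice of the text, it evaluates the average by exact enumeration instead of MC sampling, and it compares the result with the standard transfer-matrix expression for the periodic Ising chain. All function names are hypothetical.

```python
import itertools
import numpy as np

def ring_energy(spins, coupling=1.0):
    """Energy -J * sum_i s_i s_{i+1} of periodic Ising chains; the last axis
    of `spins` runs along the chain."""
    return -coupling * np.sum(spins * np.roll(spins, -1, axis=-1), axis=-1)

def mon_ratio_by_enumeration(half_length=4, beta=0.4):
    """<exp(-beta*(E_2L - E_L))> in the E_L ensemble for a chain of
    2*half_length spins, evaluated by summing over every configuration."""
    n = 2 * half_length
    configs = np.array(list(itertools.product([-1, 1], repeat=n)))
    e_big = ring_energy(configs)  # one periodic chain of 2L spins
    # Two independent periodic chains of L spins each (the composite system).
    e_split = (ring_energy(configs[:, :half_length])
               + ring_energy(configs[:, half_length:]))
    weights = np.exp(-beta * e_split)
    return np.sum(weights * np.exp(-beta * (e_big - e_split))) / np.sum(weights)

def exact_ring_z(n, beta, coupling=1.0):
    """Transfer-matrix partition function of a periodic Ising chain of n spins."""
    k = beta * coupling
    return (2.0 * np.cosh(k)) ** n + (2.0 * np.sinh(k)) ** n

if __name__ == "__main__":
    ratio = mon_ratio_by_enumeration()
    print(ratio, exact_ring_z(8, 0.4) / exact_ring_z(4, 0.4) ** 2)
    # Difference in free energy density for this 1d analogue (d = 1).
    print(-np.log(ratio) / (0.4 * 8))
```

In the 1d analogue the two energy functions differ only in the four bonds at the joins, which is the counterpart of the interface-like terms that control the efficiency of the method in higher dimensions.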

The Gibbs free energy is found for the Ising model since $M$ varies in the $NVT$ ensemble. The procedure can be iterated until $L$ is about in 2d or in 3d, when $g_L$ can be found exactly. Then so can $g$ for the larger lattices. Rather than attempt to measure the average of the exponential directly, Mon's method employs techniques derived from the acceptance ratio method. For small systems both ensembles are simulated, with the transition probability between the two being measured (using the Fermi, not the Metropolis, function); then it is found that

\[ \frac{Z_{E_{2L}}}{Z_{E_L}} = \frac{\langle f_{fer}(\beta (E_{2L} - E_L)) \rangle_{E_L}}{\langle f_{fer}(\beta (E_L - E_{2L})) \rangle_{E_{2L}}} . \]

For large systems, especially in 3d, there is insufficient overlap between the ensembles even for this, and Mon uses multistage sampling, simulating in addition interpolating Hamiltonians $E(\alpha) = \alpha E_{2L} + (1 - \alpha) E_L$ for $\alpha$ in the range $0 < \alpha < 1$ and finding the free energy difference by

\[ \frac{Z_{E_{2L}}}{Z_{E_L}} = \frac{Z_{E_{2L}}}{Z_{E(\alpha_1)}} \, \frac{Z_{E(\alpha_1)}}{Z_{E(\alpha_2)}} \cdots \frac{Z_{E(\alpha_n)}}{Z_{E_L}} , \]

where each ratio on the RHS is found by measuring Fermi transition probabilities. Up to six stages were used in. In this paper $g_{2L} - g_L$ was measured for Ising models at $T_c$ for $L$ up to (2d), (3d simple cubic) and (3d body-centred cubic); $g$ was obtained to about accuracy with MC passes per site for each Hamiltonian simulated. An investigation is also made of the predictions of finite-size scaling theory that in $d$ dimensions the free energy density at $T_c$ and $H = 0$ consists of a background term $g_b$ plus a term $U L^{-d}$, with higher-order corrections (see appendix A). Since the method measures $g_{2L} - g_L$ directly, the contribution $g_b$ from the "background" analytic part of the free energy density is removed, and he was able to confirm the theory directly and measure the scaling amplitudes $U$ to a few percent.

More recently, in the same method has been applied to large Ising models on lattices up to, and $U$ measured to extremely high accuracy. In this implementation the interpolating ensembles $E(\alpha)$ are used, but with thermodynamic integration (in the version described by equation), rather than multistage sampling, used to find the free energy differences. This is the only case we have seen where the system size has been sufficiently large, and the integrand sufficiently smooth, that TI can be used with high accuracy in a situation when the energy histograms do not overlap: TI stages were used for the system; many more would have been necessary with multistage sampling without Bennett's extrapolation.

Advantages and Disadvantages

This is another method that is most easily implemented for lattice models, and though it could also be applied to off-lattice systems, this has not to our knowledge been attempted. Presumably we would either consider subdividing the system until we reached one that contained only a single particle, or we would stop when the system became small enough for another method to be easily used to measure the free energy. However, especially in a dense fluid, the energy function of the large system considered divided up would most frequently be infinite, corresponding to the case where at least some particles overlap the walls partitioning the large system into the subsystems. This would drastically reduce the efficiency of the method; some kind of intermediate ensembles, with the walls put in gradually, would be required.

The numerical effort involved depends on the dimensionality and nature of the problem. The idea of the method is attractive, and the accuracy is high because it concentrates on measuring a correction term (the difference in free energy densities between the two systems) rather than the free energy directly. While it is similar to other multistage sampling methods, we would expect that fewer intermediate stages would be necessary in this case because of the intelligent

choice of systems E and E the dierence b etween the two Hamiltonians will b e small

L L

over most congurations and will b e caused mainly by the presence of the extra interface

like terms pro duced by evaluating the E energy in the E ensemble The size of these

L L

interfaces increases only as $L^{d-1}$, so this is also the scaling of the quantity in the exponent to be averaged. In a similar multistage technique with ensembles differing in temperature, for example, the analogous quantity would scale like $L^d$. The result is that fewer stages are needed for a particular system size than in a normal multistage method. It is also probably the best method for measuring the correction-to-scaling amplitude U.

Widom's Particle-Insertion Method

This method was introduced in [ ]. Its goal is to measure the chemical potential $\mu = G/N$ for a system with only one kind of particle, by making trial insertions or removals of a particle.

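Consider the definition of $\mu$:

$$\mu = \left(\frac{\partial F}{\partial N}\right)_{V,T}$$

or, for a finite system,

$$\mu = F(N+1) - F(N) = -\beta^{-1} \ln\left[\frac{Z(N+1)}{\Lambda^3 (N+1)\, Z(N)}\right]$$

Because the number of particles changes, it is necessary to include the effects of the kinetic energy, which produces the $\Lambda^3$ ($\Lambda$ is called the thermal wavelength, $\Lambda = h\sqrt{\beta/(2\pi M)}$), and the indistinguishability of the particles, which produces the $N+1 = (N+1)!/N!$. However, we can remove the need to consider them explicitly by using $\mu_{id} = \beta^{-1}\ln[(N+1)\Lambda^3/V]$ for the ideal gas, giving

$$\mu_{ex} \equiv \mu - \mu_{id} = -\beta^{-1}\ln\left[\frac{Z(N+1)}{V Z(N)}\right]$$

$$= -\beta^{-1}\ln\left[\frac{\int \exp(-\beta E(\sigma^N)) \exp(-\beta \Delta E(\sigma^N, \sigma_{N+1}))\, d\sigma^N\, d\sigma_{N+1}}{V \int \exp(-\beta E(\sigma^N))\, d\sigma^N}\right]$$

$$= -\beta^{-1}\ln\left[\frac{\int \langle \exp(-\beta \Delta E) \rangle_N \, d\sigma_{N+1}}{V}\right]$$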

where we have cast the ratio of partition functions into the form of a canonical average, over an N-particle simulation, of the excess energy $\Delta E$ of the interaction between an (N+1)th particle and the other N. The procedure is thus to perform coordinate updates in a constant-NVT simulation of N particles as normal, but to alternate them with trial insertions of a ghost particle at randomly chosen locations, where we then evaluate $\Delta E$ to estimate $\langle \exp(-\beta \Delta E) \rangle$. The ghost particle is not in fact inserted. A virtual trial removal of a particle can also give us $\mu$ in a similar way (see [ ]).

The method can be thought of in the framework of the acceptance ratio method. We are sampling the transition probability $\exp(-\beta \Delta E)$ between two canonical ensembles differing in the number of particles they contain. The expectation of the transition probability gives the free energy difference. If the transitions are actually made then we have grand canonical Monte Carlo.
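The ghost-insertion procedure described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names (`hard_sphere`, `ghost_energy`, `widom_excess_mu`) are hypothetical, the box is assumed cubic and periodic, and the configurations are assumed to be supplied by a separate constant-NVT simulation that is not shown.

```python
import numpy as np

def hard_sphere(r2, sigma=1.0):
    """Pair energy for squared separations r2: infinite on overlap, else zero."""
    return np.where(r2 < sigma * sigma, np.inf, 0.0)

def ghost_energy(config, trial, box, pair_energy):
    """Energy dE of a ghost particle at `trial` interacting with `config`.

    Assumes a cubic periodic box of side `box` (minimum-image convention).
    """
    d = config - trial
    d -= box * np.round(d / box)
    r2 = np.sum(d * d, axis=1)
    return float(np.sum(pair_energy(r2)))

def widom_excess_mu(configs, box, beta, pair_energy, n_insert, rng):
    """Estimate mu_ex = -(1/beta) ln <exp(-beta dE)>_N by ghost insertions.

    The ghost particle is never actually added to any configuration.
    """
    boltzmann = []
    for config in configs:
        for _ in range(n_insert):
            trial = rng.uniform(0.0, box, size=3)  # random insertion point
            d_e = ghost_energy(config, trial, box, pair_energy)
            boltzmann.append(np.exp(-beta * d_e))
    return -np.log(np.mean(boltzmann)) / beta
```

For hard spheres the averaged Boltzmann factor is simply the fraction of insertions that produce no overlap, which makes the loss of efficiency at high density easy to see: almost every term in the average is zero.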

Advantages and Disadvantages

The method works well for fluids at low density, but at high density the Boltzmann factor becomes very small because almost any insertion move would result in the test particle overlapping some other, resulting in a very high energy. Once again, the necessity of finding the average of an exponential reduces the efficiency. Similarly, removing the particle would leave a high-energy cavity. For solids the method does not work at all, because we cannot insert or remove a particle without disrupting the lattice.

Various methods have been tried to improve the performance of the method at high densities. For example, in [ ] the inserted particle is moved around so that high-energy configurations are sampled better, while in [ ] we actively search for the cavities where insertions remain possible. This method is similar to the generalised form of umbrella sampling (see below), where certain configurations are generated preferentially.

Histogram Metho ds

Renewed interest in histogram methods was the result of papers by Ferrenberg and Swendsen. They present their methods as a way of optimising the analysis of data obtained by conventional MC simulations, though it is also relevant to the free energy problem. Because the Boltzmann distribution has the same form at any temperature, we can use the MC estimate $P^{can}_{\beta_0}(E)$ from a simulation done at temperature $\beta_0$ to estimate $\Omega(E)$, and so, by substituting in equation and normalising, get

$$P^{can}_{\beta}(E) = \frac{P^{can}_{\beta_0}(E) \exp(-\Delta\beta\, E)}{\sum_{E_i} P^{can}_{\beta_0}(E_i) \exp(-\Delta\beta\, E_i)}$$

where $\Delta\beta = \beta - \beta_0$. This is called histogram reweighting. Expectation values $\langle O \rangle_{\beta}$ follow as before, so we can calculate thermodynamic quantities away from the temperature of the simulation, but only slightly away from it, because the canonical distribution is very sharply peaked. As a result, once terms coming from the wings of the distribution become important, the accuracy falls, just as in multistage sampling or the evaluation of an exponential average in section
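A minimal sketch of single-histogram reweighting is given below, assuming the energy histogram has already been accumulated at inverse temperature `beta0`. The function names are hypothetical, and the logarithmic shift is an implementation choice made here purely to avoid overflow in the exponentials.

```python
import numpy as np

def reweight_histogram(energies, counts, beta0, beta1):
    """Estimate the canonical P(E) at beta1 from a histogram taken at beta0.

    P_beta1(E) is proportional to counts(E) * exp(-(beta1 - beta0) * E).
    """
    energies = np.asarray(energies, dtype=float)
    counts = np.asarray(counts, dtype=float)
    log_w = np.full(energies.shape, -np.inf)
    seen = counts > 0                      # empty bins carry no information
    log_w[seen] = np.log(counts[seen]) - (beta1 - beta0) * energies[seen]
    log_w -= log_w[seen].max()             # shift so the largest weight is 1
    p = np.exp(log_w)
    return p / p.sum()

def reweighted_mean(energies, counts, beta0, beta1, observable):
    """Expectation of an observable O(E) at beta1, using the reweighted P(E)."""
    p = reweight_histogram(energies, counts, beta0, beta1)
    return float(np.sum(p * np.asarray(observable, dtype=float)))
```

Bins that were never visited at `beta0` stay empty after reweighting, which is the practical form of the limitation noted above: the estimate degrades once the target distribution puts weight in the poorly sampled wings.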

In [ ] the method is used to locate precisely the turning points in quantities which are functions of temperature, like specific heat. Simulations are performed at a temperature near that of the specific heat maximum (the exact temperature at which this maximum lies is unknown), then reweighted to obtain a much better estimate of the location of the maximum and the value of the specific heat there. If the whole pdf at $\beta$ can be accurately constructed then the free energy difference between the two temperatures is also calculable, because to be able to find $P^{can}_{\beta}(E)$ accurately is also what is required to find $\langle \exp(-\Delta\beta\, E) \rangle$. This use of the histogram method is thus the same as the single-ensemble versions of multistage sampling (section ) that have already been mentioned. In a more recent paper, Rickman and Phillpot have suggested that an analysis of the distribution function in terms of its cumulants provides an approximation which can be extrapolated with more confidence into the wings of the distribution. They show, using data from a simulation at one temperature, that various thermodynamic quantities, including free energy differences between the two temperatures, can be calculated by this method more accurately over a wider range of temperatures than by simple reweighting; this clearly connects with Bennett's extrapolation techniques.

The method has been extended to be more useful in the context of free energy estimation in [ ], where Ferrenberg and Swendsen extended it to overlap data from several simulations at different temperatures, obtaining iteratively soluble equations giving the partition function and its error at any temperature. The peaks in the error show where further simulations need to be done. This procedure is very similar to the overlapping done in multistage sampling, but the analysis showing where simulations should be done is new. A detailed discussion of the errors of estimators of free energies and other thermodynamic expectation values obtained by both single- and multiple-histogram methods can be found in [ ].
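The iterative character of the multiple-histogram equations can be illustrated with the sketch below. It uses the commonly quoted simplified form of the self-consistent equations, in which every run is assumed to have the same statistical inefficiency and no error estimate is produced, so it should not be read as the full scheme of the papers cited above. The function name and the fixed iteration count are assumptions made for illustration.

```python
import numpy as np

def multiple_histogram(energies, hists, betas, n_iter=500):
    """Self-consistent multiple-histogram estimate (simplified form).

    hists[i, k] is the number of samples with energy energies[k] in the
    run at inverse temperature betas[i]. Returns (omega, f): the density
    of states up to a common factor, and f[i] = -ln Z_i with f[0] = 0.
    """
    energies = np.asarray(energies, dtype=float)
    hists = np.asarray(hists, dtype=float)
    betas = np.asarray(betas, dtype=float)
    n_samples = hists.sum(axis=1)            # samples per run
    total = hists.sum(axis=0)                # pooled histogram
    log_boltz = -betas[:, None] * energies[None, :]
    f = np.zeros(len(betas))
    for _ in range(n_iter):
        # Omega(E) = sum_i H_i(E) / sum_i n_i exp(f_i - beta_i E)
        denom = np.sum(n_samples[:, None] * np.exp(f[:, None] + log_boltz), axis=0)
        omega = total / denom
        # f_i = -ln sum_E Omega(E) exp(-beta_i E)
        f = -np.log(np.sum(omega[None, :] * np.exp(log_boltz), axis=1))
    # Fix the free overall constant so that f[0] = 0.
    return omega * np.exp(f[0]), f - f[0]
```

The difference `f[j] - f[i]` is then the estimate of the free energy difference (in units of the inverse temperature) between runs i and j; convergence is fast only when neighbouring histograms overlap well.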

Although histogram methods have become a standard technique as a result of the work of Ferrenberg and Swendsen, the idea of reweighting the histogram to give estimators of thermodynamic quantities at other temperatures was used by many previous authors; see this chapter,

ibid One early example is the work of Macdonald and Singer in the late s on the Lennard

Jones and uids and esp ecially Histogram metho ds are also essential in many

of the non-canonical methods of the next section. Indeed, the full power of the technique is perhaps best released by use of the multicanonical distribution. When histogram methods are applied to Boltzmann sampling simulations, the fundamental problem of the unsuitable shape of $P^{can}(E)$ is still not solved, only alleviated.

Non-Canonical Methods

Umbrella Sampling

This name is used in the literature in several ways, to describe any or all of a number of different methods. In its most general sense it is the name given to what we have called non-Boltzmann sampling. As we saw in chapter , the Metropolis algorithm can be used to sample from any distribution. It can be shown (see section ) that if we do sample from a non-Boltzmann distribution then canonical averages can be recovered by a suitable reweighting, provided that the chosen sampled distribution (some general $P$, not $P^{can}$) puts appreciable weight in those states that dominate the canonical pdf $P^{can}$ at the temperature/field of interest.

Now, as we have seen (section and elsewhere), the canonical sampled distribution is incompatible with the estimation of the average of certain operators, in particular those that lead to the measurement of absolute free energies or free energy differences, because the configurations that dominate the maximum of $O(E) P^{can}(E)$ (to take the example of energy macrostates) are generated with almost zero probability. If the sampled distribution is carefully chosen so that, as well as the states in $P^{can}(E)$, it also puts weight in the states that dominate the maximum of $O(E) P^{can}(E)$, then it can be used to overcome this incompatibility problem. This normally requires a sampled distribution that is wider over the E macrostates than the canonical distribution; hence the whimsical name umbrella sampling, coined on the grounds that the wider sampled distribution is like an umbrella extending over $P^{can}(E)$ and $O P^{can}(E)$.

Though it was not the first use of this technique, the seminal paper on this method seems to have been [ ], where the term non-Boltzmann sampling is also introduced. Some similar methods were employed earlier, for example [ ] and especially [ ], which used a kind of reweighting based on equation . These latter references also made the fundamental point that estimators produced by multiplying small histogram entries by large reweighting factors are no use in practice because of their high variance.
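The reweighting that recovers canonical averages from a non-Boltzmann sample can be sketched as below, assuming each sampled configuration comes with its energy and the logarithm of the (unnormalised) sampled weight. The function names are hypothetical. The effective-sample-size diagnostic is not taken from the works cited above; it is a standard importance-sampling measure added here to make the high-variance warning concrete.

```python
import numpy as np

def _weights(energy, log_sampled, beta):
    """Reweighting factors w = exp(-beta E) / P_sampled, scaled so max(w) = 1."""
    log_w = -beta * np.asarray(energy, dtype=float) - np.asarray(log_sampled, dtype=float)
    return np.exp(log_w - log_w.max())

def canonical_average(obs, energy, log_sampled, beta):
    """<O>_can estimated as sum(O w) / sum(w) over the sampled configurations."""
    w = _weights(energy, log_sampled, beta)
    return float(np.sum(w * np.asarray(obs, dtype=float)) / np.sum(w))

def effective_sample_size(energy, log_sampled, beta):
    """(sum w)^2 / sum(w^2): small values signal a few dominant weights."""
    w = _weights(energy, log_sampled, beta)
    return float(w.sum() ** 2 / np.sum(w * w))
```

When the effective sample size is far below the number of samples, the estimate rests on a handful of configurations multiplied by large factors, which is the situation the early references warned against.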

Possibly because of the influence of [ ], the name umbrella sampling is often used in a restricted sense, where free energy measurement is the goal. However, as we have said, the name is also applied by some authors (for example Frenkel in [ ]) to any kind of non-Boltzmann sampling. In [ ], chapter , the name umbrella sampling is applied to overlapping canonical pdfs with constraints, which we would call multistage sampling, but this usage seems to be rare. In this wider sense the range of possible applications is almost endless: the method can be used to generate any event that is rare in the canonical ensemble with a high enough frequency that its probability becomes measurable. For example, it has been used to investigate large fluctuations in an order parameter, and in [ ] rough measurements of the free energy barriers to nucleation are made. If we think of these fluctuations as taking us across the free energy barrier between two phases, then we see that the problem of free energy measurement by direct connection of the two phases can also be approached this way: as we speculated in section , we could use a sampled distribution which has a higher-than-canonical probability of being in the two-phase region. Another possibility, used in fluid simulations, is to use a non-Boltzmann distribution parameterised by the distance of closest approach of the molecules. However, like umbrella sampling itself, these techniques are usually given their own names: the multicanonical ensemble and density scaling. They are both described separately below.

A final point is that, although we have constrained the shape of an effective umbrella sampling distribution, we have not fixed it entirely. In fact it seems to be an unresolved question which sampled distribution is the best for measuring a particular operator, in the sense that estimators produced from it have the lowest variance. The issue seems to have been addressed

first by Fosdick, and more recently, in the context of the study of spin glasses, by Hesselbo and Stinchcombe, who recommend a distribution where $P(E) \propto \left[\int^{E} \Omega(E')\, dE'\right]^{-1}$. We shall return to this matter in section

Advantages and Disadvantages

Umbrella sampling is a powerful method: once a good sampled distribution has been obtained, we can reweight it to obtain not only free energies but a variety of canonical averages, for all values of the control parameters such that both $P^{can}$ and $O P^{can}$ put almost all their weight in the sampled region.

Another very significant advantage is that any ergodicity problems in the low-temperature states may be largely overcome by the increased volume of phase space available to the system. It may, for example, move from states typical of a low temperature, where movement through configuration space is typically slow, up to states typical of a high temperature, where configurations change rapidly. Then it may cool down again and return to a region of configuration space far away from the one where it started. The whole process may take much less time than would be required to pass between the two regions by way of the low-temperature states alone.

The most serious disadvantage of the method seems to be the difficulty in finding a suitable sampled distribution in the first place, the problem being (as we shall see in chapter ) that some knowledge of the free energy that we are trying to measure is required. It has apparently been rare in the literature to achieve a sampled distribution more than a few times wider than the canonical, and so the method has frequently been combined with multistage/TI techniques, and has mainly been used with small systems, where the fractional fluctuations are larger. Indeed, in [ ] it is stated that the umbrella sampling method cannot be applied to large systems because a suitable sampled distribution cannot be produced. However, various authors have given methods for evolving a suitable sampled distribution of any shape (see [ ] and references in section ; see also [ ], an early reference that seemed to go largely unnoticed), and we have also addressed the problem at length in chapter

Multicanonical Ensemble

This method is originally due to Berg and Neuhaus, whose work has prompted much recent interest and further study, some of which can be found reviewed in [ ] and [ ]. The method is described in detail in chapter . It is really a rediscovery and reapplication of the ideas of non-Boltzmann sampling that were already present in [ ]: a sampling distribution is generated so that the probability distribution of some chosen observable X (typically E or M) is roughly flat over some range of its macrostates (other workers have used sampled distributions with similar properties without calling them multicanonical, e.g. [ ]). We shall characterise this distribution by its difference from the Boltzmann distribution, by using weights $\eta(E)$ or $\eta(M)$ etc. as appropriate, so that the sampled distribution has measure $Y(\sigma) \propto \exp(-\beta E(\sigma)) \exp(\eta)$. The results can then be reweighted, as outlined in section , to recover the desired results for the canonical distribution.
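As a toy illustration of sampling with a preweighting and then reweighting back, the sketch below runs single-spin-flip Metropolis on a small periodic Ising chain with weights `eta[E]`. Everything here is an assumption made for the example: the model, the function names, and in particular the use of exactly known ideal weights, which in a real application are unknown and must be built up iteratively.

```python
import math
import random

def ring_energy(spins):
    """E = -sum_i s_i s_{i+1} on a periodic chain (coupling J = 1)."""
    n = len(spins)
    return -sum(spins[i] * spins[(i + 1) % n] for i in range(n))

def multicanonical_run(n_spins, beta, eta, n_steps, seed=0):
    """Metropolis sampling of the measure exp(-beta * E) * exp(eta[E]).

    `eta` maps every allowed energy level to its preweighting. Returns the
    histogram of energies, with one entry recorded per attempted flip.
    """
    rng = random.Random(seed)
    spins = [1] * n_spins
    energy = ring_energy(spins)
    hist = {e: 0 for e in eta}
    for _ in range(n_steps):
        i = rng.randrange(n_spins)
        # Flipping spin i changes only its two bonds.
        d_e = 2 * spins[i] * (spins[i - 1] + spins[(i + 1) % n_spins])
        log_ratio = -beta * d_e + eta[energy + d_e] - eta[energy]
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            spins[i] = -spins[i]
            energy += d_e
        hist[energy] += 1
    return hist

def reweight_to_canonical(hist, eta):
    """Canonical P(E), proportional to H(E) * exp(-eta[E]), normalised."""
    log_w = {e: math.log(h) - eta[e] for e, h in hist.items() if h > 0}
    shift = max(log_w.values())
    w = {e: math.exp(v - shift) for e, v in log_w.items()}
    total = sum(w.values())
    return {e: v / total for e, v in w.items()}
```

With the ideal choice `eta[E] = beta * E - ln(Omega(E))` the energy histogram comes out roughly flat, and dividing out `exp(eta)` recovers the canonical distribution. Note that the acceptance step needs the global energy before and after the move, which is why the preweighting couples all the spins together.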

In [ ] the method was originally presented as a way of measuring interfacial free energy with high accuracy. However, it has much wider applicability than this: it enables us to tackle the free energy problem either by the approach of simulating direct coexistence of the two phases or by measuring the absolute free energy of each phase separately.

In their first paper, Berg and Neuhaus investigated the first-order phase transition of the d state Potts model and measured the interfacial free energy $f_s$ with high accuracy. At the transition temperature (which is known exactly for the Potts model at $H = 0$, so there was no need to search for it) the canonical probability distribution of the internal energy is double-peaked, with one peak corresponding to a disordered phase and the other to the equivalent ordered phases. The states in between, which correspond to mixed states containing interfaces, are exponentially suppressed (see section ). Because the tunnelling times between the symmetry-broken phases in canonical Metropolis simulations are so long, a preweighting $\eta(E)$ is used to produce a sampled distribution which is roughly flat between the two peaks of $P^{can}(E)$; the results are then reweighted to recover the canonical distribution. The accurate measurement of the probability of the mixed states leads to an estimate of the interfacial tension. With the

probability of each energy action macrostate roughly constant it was found that the ergo dic

time the time to tunnel b etween the symmetrybroken phases scaled with system size as

r w

d d

L comparable with the ideal L exp ected for a simple random walk and an

r w r w

d

enormous improvement on expL f exp ected for a Boltzmann sampling simulation

r w s

Later applications and extensions of the method have included:

Berg and Neuhaus together with various coworkers have measured interfacial tensions

for several other systems by preweighting of the order parameter magnetisation they

call this the multimagnetical ensemble though we use multicanonical for all applications

In a pap er written with Hansmann they simulate the d Ising mo del as a strict test of

the metho d these results also app ear in along with similar measurements for the d

Ising mo del Like the energy distribution of the state Potts mo del the magnetisation

distribution of the critical Ising mo del has a doublep eaked shap e and preweighting is used

in a similar way to facilitate tunnelling and to measure the probability of the M states

A recent pap er written with Billoire p erforms similar measurements on the state

and state d Potts mo dels and presents data on the d state Potts mo del

and the d SU lattice gauge theory In all cases the interfacial tension is measured

with much greater accuracy than had previously b een achieved The later pap ers contain

further development of error analysis and comparison with nitesize scaling theories

Further work on lattice gauge theories has included the application of the method to the SU and SU theories and to QCD itself.

The multicanonical metho d has also b een used in simulations of spinglasses with energy

preweighting where particular advantage is taken of the algorithms ability to move

rapidly across the freeenergy barriers that severely slow canonical simulations Three

pap ers investigate the Isinglike EdwardsAnderson spin glass The rst by Berg and

Celik lo oks at the d mo del The innitevolume zerotemp erature ground state

entropy and energy are estimated from nitesize data They nd that the problems of

d

slowingdown with increasing system size are more severe than b efore L though

r w

this is still much b etter than canonical simulations exp onential slowing down and is

d

b etter even than simulated annealing for which L The others by Berg Celik

r w

and Hansmann lo ok at the spinglass in d and present more complete results

for energy density entropy and heat capacity evaluated at all temp eratures including

by reweighting. They also show the order parameter distribution. Considering that energy, not the order parameter, was preweighted, it may seem strange that this distribution could be accurately obtained; however, the locations and heights of the peaks can be found because the multicanonical algorithm makes high-energy configurations accessible, where the order parameter has a single-peaked distribution, and then the system can cool down again in any of the modes of the order parameter, as was described in the advantages and disadvantages part of section . Thus the system has the effective ability to tunnel through the free energy barriers separating the phases. The only significant difference is that with energy preweighting the order parameter distribution is not well determined in the regions of low canonical probability, and so no estimate of

d

interfacial tension is obtained Again the simulation slows down with L In

r w

the suitability of the algorithm is checked by applying it to the well-understood Van Hemmen spin glass. The use of the multicanonical algorithm to investigate spin glasses is also referred to in [ ], a general paper on the multicanonical method and its application to multiple-minimum problems.

Hansmann and Okamoto have applied the multicanonical method to the problem of protein folding. Like the spin glass, this is a multiple-minimum problem where ergodicity problems can prevent the location of the ground state. In [ ] they preweight the configurational energy of met-enkephalin, a simple protein consisting of amino acids and containing, in the simplified model used, continuously variable dihedral angles as parameters. They obtain the lowest-energy configuration, in agreement with results of simulated annealing, and, by reweighting the measured probabilities of energies, evaluate $\langle E \rangle$ and the heat capacity for a range of temperatures. This work is also referred to in [ ]. In [ ] they have found the lowest-energy states of three polypeptides of length residues, each containing only one type of amino acid.

Rummukainen has combined the multicanonical algorithm with microcanonical demon methods in a hybrid algorithm. By separating the demons, which form a heat bath, from the multicanonical part, he is able to apply fast cluster methods (and potentially parallelism) to them. This is advantageous because the changes of the preweighted variable in the multicanonical algorithm are inherently serial in nature: the preweighting, being a function of a variable (E or M) that depends on all the spins, couples together all the spins/particles of the system. To some extent this limitation is overcome by the use of the demons. The method is tested on the d state Potts model, preweighting energy,

and it is found that the ergo dic time is much reduced b elow that of the simple multicanon

d

ical algorithm L It is interesting to note that this is b etter even than ideal

e

random-walk behaviour, and demonstrates the effect that algorithmic improvement can have. Interfacial tension is also measured, the results seemingly revealing inadequacies of the normal finite-size scaling ansatz.

In work on multicanonical multigrid Monte Carlo, Janke and Sauer combine the multicanonical ensemble with the multigrid method to reduce critical slowing down. They investigate the $\phi^4$ theory in both one and two dimensions, with particular reference to the autocorrelation time $\tau_O$ of observables in the multicanonical simulation and its relation to the error bars of the estimators of canonical averages. As they point out, $\tau_O$ is not necessarily the same as $\tau_{rw}$. They find that the behaviour depends on the nature of the potential barrier they are trying to tunnel through. If the barrier height does not depend on the system size (as is the case with the 1d $\phi^4$ theory) then the scaling of $\tau_O$ with L is much the same for normal Metropolis and multicanonical Metropolis (though the multicanonical algorithm always has the lower $\tau_O$), and in both cases it is much reduced by using the multigrid method too. However, if, as will usually be the case in physically interesting dimensions, the barrier height increases with system size, the increase of $\tau_O$ with system size is much smaller for the multicanonical algorithms, and employing multigrid techniques too further reduces $\tau_O$ but keeps its scaling roughly constant, the opposite of what was found in 1d. They conclude that where the multigrid algorithm can be used, it should be used in combination with the multicanonical algorithm, since the two together can produce an order-of-magnitude improvement in performance over that obtainable with either alone. In a more recent paper, Janke and Kappler have combined the method with the Swendsen-Wang cluster algorithm to produce what they call a multibondic

cluster algorithm and applied it to rstorder phase transitions of the the qstate Potts

d

mo del nding that the multicanonical auto correlation time grows as the ideal L

We remark that Janke and Sauer's extension of error-propagation analysis to the multicanonical ensemble (and a similar approach adopted in [ ]), so that the error bars of the estimators of canonical averages can be obtained from the correlation times of variables in the multicanonical simulations, produces expressions that are very complicated. In practice we favour the use of blocking. We present our own investigations of $\tau_O$ in section

Lee introduces what he calls Entropic Sampling in [ ]; this is really no more than a multicanonical preweighting carried out on internal energy rather than magnetisation. However, he does give an algorithm for evolving the preweighting, and discretises the preweighting function at the level of the fundamental granularity of the problem, rather than making it piecewise linear within a series of bins (the choice of Berg and Neuhaus). He also avoids using their idea of an effective temperature, which we too feel is unnecessary. He presents some measurements on the state Potts model and a small d Ising model, comparing numerical values for the coefficients in the high-temperature expansion of the partition function with exact results. Very recently Lee, Novotny and Rikvold have applied the multicanonical method and Markov chain theory to study the relaxation of metastable phases in the d Ising model. The multicanonical ensemble is applied as usual to achieve a flat sampled distribution over M; then the observed matrix of transitions between macrostates is used to predict first passage times and thus to study relaxation properties, in particular the binodal and spinodal decomposition of metastable phases. A modified algorithm is used to speed up dynamics at high $|M|$. It turns out that the description of the relaxation in terms only of the global order parameter M is surprisingly accurate. In [ ] the macrostate transition matrix is not used in the generation of the multicanonical distribution, a technique that we have found to be useful (see section ).

In what is, to our knowledge, the only application to date to the problem of finding the phase diagram of an off-lattice system, Wilding in [ ] has used the multicanonical technique to map out the coexistence curve for the d Lennard-Jones fluid. The liquid and vapour phases are connected directly by tunnelling across the interfacial region, in one of the clearest examples of the method's use in a free energy/phase coexistence problem.

In [ ] the problem of generating the parameters that produce the multicanonical distribution is addressed. In [ ] finite-size scaling from a small simulation is used to produce the parameters of a larger one. In [ ] an overlapping distribution method, rather like multistage sampling, is used to give an initial estimate for the spin-glass problem, where finite-size scaling does not work. In [ ] all the methods are reviewed, and a slightly ad hoc method is given for combining the results of several previous refining stages to give the best estimate for the reweighting function.

Advantages and Disadvantages

This method has the same key advantages and disadvantages as umbrella sampling, of which it is a special case: producing a good sampled distribution requires an iterative procedure, but once it is established a single simulation suffices to measure the quantity of interest, which in our case will be a free energy or free energy difference. The larger set of accessible states reduces error due to ergodic problems, which enables efficient simulation of systems like spin glasses, but means that the random-walk time across the multicanonical region becomes long, since the multicanonical region is much wider than the peak of the sampled distribution of a Boltzmann sampling simulation. The effective autocorrelation time $\tau_O$ of observable O also becomes long, but it does not feed directly into the variance of estimators of canonical averages in the same way that was described in appendix C, because of the effect of the reweighting process (see [ ]). We shall investigate the nature of multicanonical errors more in chapter

The ability to treat the two phases simultaneously in a single simulation (for lattice models, and in some circumstances off-lattice models too) is also clearly an advantage, though we should point out that the direct linking of the two phases is not easily applicable to off-lattice systems where one phase is solid and the other fluid, because this would still produce severe ergodic problems due to the difficulty in growing crystals out of a fluid.

Another potential disadvantage, which would also affect most umbrella sampling simulations, is the impossibility of performing parallel updates on different degrees of freedom if those updates would change the preweighted variable. In canonical simulations with short-range forces, particles or spins that do not interact either before or after the move may be updated in parallel. With the same forces but multicanonical sampling, only one particle or spin may be updated at a time, because the change in preweighting, which we must know to calculate the acceptance probability, depends on the global value of some macrostate variable and so would depend on whether or not parallel updates going on elsewhere in the system were accepted. As we have seen, some methods like [ ] have already been developed to partially overcome this problem. We shall discuss it further in sections and

The Expanded Ensemble

This method, or methods similar to it, has also been discovered independently several times. The way it is presented here follows the approach of Lyubartsev et al. In some ways it is a similar method to the multicanonical ensemble (section ): the system is again encouraged to explore more of its possible energy states, and preweighting is again involved. We shall explain the connections more formally in section , though they will be obvious while reading this.

First let us consider the temp erature expanded ensemble which is the rst version introduced

in A system of charged hard spheres the RPM electrolyte is simulated which can also

make transitions b etween a number of dierent temp eratures f g so the total expanded

j

ensemble partition function Z is given by

X

Z exp Z

j j

j

where the Z are the normal canonical partition functions at each temp erature

j

X

exp E exp G Z

j j j j

f g

and the Gs are Gibbs free energies and the co ecients f g are chosen so that ab out the same

time is sp ent in each j state we call the j states sub ensembles Transitions b etween the

sub ensembles are implemented using the normal Metrop olis algorithm the same conguration

is retained and the move from ensemble i to j is accepted with probability M E

e ef f

where E E The nding of a suitable set f g is a nontrivial

ef f j i j i

problem which must b e approached iteratively just like the evaluation of the preweighting

co ecients of the multicanonical distribution

We exp ect to b e at temp erature with probability P given by

j j

P Z exp Z

j j j

CHAPTER REVIEW

so

Z exp

j j

P P exp G G

j j j j

Z exp

If we can arrange for the state labelled by zero to be a state of known free energy (a perfect gas, say, for [...], or an Einstein solid), then we can find the absolute free energies. In the given references the $P_j$ are estimated from the histogram of visited states, $P_j \approx C_j / \sum_k C_k$.

The coefficients $\{\eta\}$ are required to stop the system spending almost all its time in the state of lowest $G$. The ideal form for the $\{\eta\}$ is $\eta_j = \beta_j G_j$, under which the probability of all subensembles is constant. We require $\eta_j - \beta_j G_j \sim O(1)$ for all $j$ if their probabilities are to be accurately measured. In [ref], known approximate values for $G$ are used to bootstrap the estimates of $\{\eta\}$; the $G$s are in general unknown, and are indeed the very quantities we seek. This should be compared to section [...].
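The relation between visited-state counts and free energies can be made concrete with a short sketch. It assumes the ratio formula for $P_j / P_0$ given above, with subensemble 0 as the reference state of known free energy; the function name and argument layout are hypothetical.

```python
import math


def beta_g_from_counts(counts, etas, beta_g_ref):
    """Estimate beta_j * G_j for each subensemble from a histogram of visits.

    counts[j]  : number of visits to subensemble j (an estimate of P_j up to
                 a common normalisation)
    etas[j]    : preweighting coefficient eta_j used in the run
    beta_g_ref : known value of beta_0 * G_0 for the reference subensemble 0
    """
    if any(c <= 0 for c in counts):
        raise ValueError("every subensemble must be visited at least once")
    return [
        beta_g_ref + (etas[j] - etas[0]) - math.log(counts[j] / counts[0])
        for j in range(len(counts))
    ]
```

With the ideal coefficients the counts are equal and the estimate reduces to the $\eta_j$ themselves, shifted by the reference value.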

Although we have described the method as it was implemented for subensembles differing in temperature, it can easily be generalised to different values of other field variables, such as $H$, or to different forms of the interaction: in [ref] itself, use of the temperature-expanded ensemble is only sufficient to transform the RPM electrolyte into the hard-sphere fluid at [...], because the hard core in the potential remains. Therefore another expanded ensemble is used, where the Hamiltonian is changed to move from hard spheres to a perfect gas via increasingly penetrable spheres. All the kinds of transformation of the energy function that have been applied in thermodynamic integration are also applicable here.

In the investigation of the RPM electrolyte, Lyubartsev et al. quote an error of about [...] in the free energy and achieve good agreement with results of multistage sampling and theoretical calculations. This method is one of the few that can be applied as easily to off-lattice as to lattice-based systems. The same authors have extended the method to Molecular Dynamics simulation and applied it to the Lennard-Jones fluid and to a model of liquid water.

Marinari and Parisi independently discovered the expanded ensemble, which they call "simulated tempering", and applied it to the random-field Ising model, which at low temperature has a pdf of the order parameter with many modes separated by free energy barriers. They present the method as a development of the simulated annealing algorithm, in which the temperature is steadily reduced. Their concern is not with the absolute free energy of the system, or the free energy difference between hot and cold ensembles, but with the overcoming of ergodic problems in the lowest-temperature ensembles: the ability of the system to reach high temperatures, where it can travel easily across free energy barriers, means that when it cools down again it is likely to enter a different mode of the pdf than the one it started from. This results in an effective $\tau_{rw}$ for travelling between the modes of the low-temperature ensembles which is much less than in a single low-temperature simulation, even taking into account the extra effort in simulating the other subensembles. They measure $E$ and $M$ in the lowest-temperature ensemble and show how the rapid tunnelling is indeed facilitated. They compare simulated tempering with Metropolis and cluster-flipping algorithms: it far outperforms Metropolis, and seems better even than the cluster algorithm, though data on $\tau_{rw}$ comparable to that presented in the results of multicanonical simulations is not given. In [ref], however, the tunnelling time between phases of opposite magnetisation is investigated for the [...]d nearest-neighbour Ising model, and it is claimed that $\tau_{rw} \sim L^{[\ldots]}$, with five temperature-states used for all system sizes, but differently spaced.

Thus, while Lyubartsev's implementation corresponds to the measurement of an absolute free energy of a phase, Marinari and Parisi's shows how the method could also be used to connect two phases for the measurement of a free energy difference: we would use a variable order parameter $M$ within each subensemble, and find coexistence by finding control parameters such that each mode of $P^{can}(M)$ had equal weight in the particular subensemble that had the control parameters of interest. However, because the sampled distribution within each subensemble is unaltered, $P^{can}(M)$ would only be measurably different from zero for both modes if $H$ were very close to $H_{coex}$ (see equations [...] and [...]).

Other applications of the multicanonical ensemble/simulated tempering have included the following:

The folding of simple models of proteins has been investigated in [ref] and [ref]. In the second paper the global minima of polymer chains of [...] residues are investigated by two methods: one is a conventional temperature-expanded ensemble, while in the other the polymer sequence changes between ensembles. This latter approach gives some idea of the flexibility of the expanded ensemble. In this context we should also mention [ref], where MC parameters are optimised during a run to achieve the fastest decorrelation in a protein-folding problem. The convergence process resembles the finding of $\{\eta\}$ in an expanded ensemble simulation, though non-Boltzmann sampling is not used.

The [...]d Ising spin glass has been studied. The method allows the efficient generation of a good sample of equilibrated configurations at temperatures below the glass transition. It is found that the predictions of replica symmetry breaking theory, rather than droplet theory, are supported.

The method has been used for the measurement of the chemical potential (free energy) of a polymeric system. Used by Wilding et al., an expanded ensemble approach is here combined with the particle-insertion method. The simple particle insertion method fails because of the large size of the molecules, which results in a very small acceptance probability. Therefore a series of intermediate ensembles is introduced, in which the interaction of the test polymer molecule with the others is gradually switched on. The expanded ensemble technique is used to move between these ensembles. For long polymer chains the method is more effective than the commonly used configurational-bias Monte Carlo.

The method has been applied to the [...]D Ising spin glass, and very recently to the U(1) lattice gauge theory, by Kerler and co-workers. In the Ising case a temperature-expanded ensemble is used; for the gauge theory, a set of ensembles containing progressively more of a monopole term that transforms the transition from first to second order.

We remark that in [ref] and [ref] issues relating to the finding of the coefficients $\{\eta\}$ are also investigated. The method of [ref] uses a linear extrapolation technique, while in [ref] a method using information in the observed transitions between subensembles is used. This resembles a method we developed independently, which is described in chapter [...], where we investigate the problem of finding suitable $\{\eta\}$s. Our investigations are performed using the multicanonical ensemble but would also apply to the expanded ensemble.

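One issue that is important in expanded ensemble calculations is the spacing of the subensembles. Let us consider the temperature expansion case as an example. Unlike the multicanonical ensemble, there is no natural granularity of the temperatures, so they must be chosen with a separation wide enough that the random walk time between the ends of the chain is not too long, but not so wide that the acceptance ratio $r_a$ falls excessively low. It is possible to make an approximate calculation of what the spacing should be (see [ref]): we know $P(i \to j \mid \sigma) = M_e(\Delta E_{eff})$ and $\Delta E_{eff} = (\beta_j - \beta_i) E(\sigma) - (\eta_j - \eta_i)$. Consider only transitions between adjacent states, so $j = i + 1$, and suppose we have the ideal set $\{\eta\}$. Then we may expand $\eta$ as a Taylor series, which gives, writing $\delta\beta_i = \beta_{i+1} - \beta_i$,

\[ \eta_{i+1} - \eta_i = \delta\beta_i \left( \frac{\partial \eta}{\partial \beta} \right)_{\beta_i} + \frac{\delta\beta_i^2}{2} \left( \frac{\partial^2 \eta}{\partial \beta^2} \right)_{\beta_i} + O(\delta\beta_i^3) \]

\[ = \delta\beta_i \langle E \rangle_i - \frac{\delta\beta_i^2}{2} \frac{C_{H,i}}{k_B \beta_i^2} + O(\delta\beta_i^3) \]

where in the second line we have used $\eta_i = \beta_i G_i$ and expressed the derivatives of $\beta G$ in terms of canonical averages.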

We demand $\Delta E_{eff} \sim O(1)$ for a reasonable $r_a$. Thus $\Delta E_{eff} \sim \delta\beta_i^2\, C_{H,i} / (k_B \beta_i^2)$, so we can use this expression to estimate a suitable $\delta\beta_i$, and thus $\beta_{i+1}$, given $\beta_i$ and a measurement of $C_{H,i}$, the heat capacity in the $i$th ensemble. Then equation [...] enables us to estimate $\eta_{i+1}$.

It is instructive to observe that, since $C_{H,i} \sim L^d$, we require $\delta\beta_i / \beta_i \sim L^{-d/2}$. This is the expected size of the fractional fluctuations in the energy, and so we see that once again we require an overlap between adjacent subensembles. It is the same sort of scaling as is required for multistage sampling. However, use of the expanded ensemble, where configurations pass from one ensemble to the other, makes it clearer that what is really required is an overlap of the pdfs of the configurations of the adjacent subensembles. With a large difference in temperatures, the dominant configurations in one ensemble are not the dominant configurations in the other; they will scarcely have any weight at all. Attempting a transition in temperature without changing the configuration is liable to produce a non-equilibrium configuration in the new ensemble, which is then very unlikely to be accepted, just as, when making coordinate updates in the normal Metropolis algorithm, the configuration can be altered only slightly at a single step while maintaining a reasonable acceptance probability.
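The spacing estimate can be turned into a small helper. This is only a sketch under the assumptions of the argument above (adjacent subensembles, near-ideal $\eta$, truncation of the Taylor series at second order); the `target` parameter, which stands for the O(1) value demanded of $\Delta E_{eff}$, and the function name are hypothetical.

```python
import math


def next_subensemble(beta_i, eta_i, mean_energy, heat_capacity,
                     k_b=1.0, target=1.0):
    """Suggest (beta_{i+1}, eta_{i+1}) from measurements in subensemble i.

    The spacing is chosen so that delta_beta**2 * C / (k_B * beta_i**2)
    equals `target`; eta is then extrapolated with the second-order
    Taylor series in delta_beta.
    """
    if heat_capacity <= 0.0:
        raise ValueError("heat capacity must be positive")
    delta_beta = beta_i * math.sqrt(target * k_b / heat_capacity)
    eta_next = (eta_i
                + delta_beta * mean_energy
                - 0.5 * delta_beta ** 2 * heat_capacity / (k_b * beta_i ** 2))
    return beta_i + delta_beta, eta_next
```

Because the heat capacity is extensive, quadrupling the system volume halves the suggested spacing, which is the $L^{-d/2}$ scaling noted above.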

Methods Related to the Expanded Ensemble

Here we shall discuss two interesting methods which appear in the Statistics literature. They are similar enough to the Expanded Ensemble to be treated here, but also have important differences. They have not yet been applied to problems in physics, to our knowledge.

The first, due to Geyer and Thompson, is called Metropolis-Coupled Markov Chain Monte Carlo (MC$^3$). Transitions are made between subensembles defined just as for the expanded ensemble, but every one of the subensembles is active at a particular time, and the ensemble-changing moves consist of swaps, where the configuration $\sigma_i$ from ensemble $i$ (i.e. the ensemble with temperature $\beta_i$ or energy function $E_i$) is moved into ensemble $j$ and vice versa. The swap is accepted with probability

\[ r_a = \min\left( 1, \frac{P_j(\sigma_i)\, P_i(\sigma_j)}{P_i(\sigma_i)\, P_j(\sigma_j)} \right) \]

With this choice the stationary distribution $P_i$ of each ensemble is preserved. Note that we do not need coefficients $\eta$, because the method ensures that every ensemble is always active, so there is no problem with the simulation seeking out the ensemble where $G$ is the smallest and staying there. However, we are as usual constrained by the acceptance ratio $r_a$, which will become very small if the ensembles have a large free energy difference; indeed, $r_a$ has the form of the product of two expanded ensemble transition probabilities, and so will become unusably small rather faster. We should also note that we do not get absolute free energies directly, because the expanded partition function is now the product of the $Z$s for each subensemble, not the sum. However, MC$^3$ does offer a way of moving rapidly between the modes of a multimodal probability distribution (Geyer presents it primarily as a way of doing this), and so the application allowing tunnelling between coexisting phases would still be possible.
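For subensembles that differ only in temperature, with Boltzmann weights $P_i(\sigma) \propto \exp(-\beta_i E(\sigma))$, the normalisations cancel in the swap ratio and it reduces to $\exp[(\beta_i - \beta_j)(E(\sigma_i) - E(\sigma_j))]$. The sketch below assumes exactly that special case; the function name is hypothetical.

```python
import math


def swap_acceptance(beta_i, beta_j, energy_i, energy_j):
    """Acceptance probability for exchanging configurations between two
    Boltzmann-sampled subensembles at inverse temperatures beta_i, beta_j.

    energy_i and energy_j are the energies of the configurations currently
    held by subensembles i and j respectively.
    """
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    if log_ratio >= 0.0:
        return 1.0
    return math.exp(log_ratio)
```

A swap that hands the lower-energy configuration to the colder subensemble is always accepted; the reverse is exponentially suppressed in the product of the temperature and energy differences.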

In this respect the second method, Tempered Transitions, is similar: it too would give $\Delta g$ not directly, but by speeding up tunnelling between the modes of $P^{can}(M)$, where $M$ varies within a subensemble. However, here we seek to reduce the time taken for a random walk through the subensembles by forcing the system to follow a trajectory from the temperature of interest up to high temperatures and then back down, then performing a global accept/reject of the whole trajectory. However, it is found, at least in the trial problem that is investigated in [ref], that to achieve a good acceptance ratio we need $N^{[\ldots]}$ intermediate states where the expanded ensemble requires only $N$, so that there is no reduction in time. Nevertheless, for some systems Tempered Transitions may be a better way of moving between the modes of $P^{can}(M)$, the details depending in a rather complex way on the shape of $P^{can}(M)$ in the intermediate ensembles.

Advantages and Disadvantages

Like the multicanonical ensemble, the expanded ensemble enables a wider exploration of configuration space and gives access to free energies and free energy differences through a direct measurement of probability. It can be used either to measure the free energy of each phase separately, or to improve the exploration of the configuration space at low temperatures, by giving the simulation the ability to connect to high-temperature states where the pdf of the order parameter no longer has a double peak (or, in the case of spin glasses, a multiple peak). This enables the simulation to bypass the potential barriers due to interfaces between phases, though searching in the space of the other control parameters will be required to ensure that the high-temperature phase does not always connect to only one of the low-temperature phases.

We would remark on the clear similarities between this method and the acceptance ratio method (see section [...]). By recording $\langle M_e(\Delta E) \rangle$ rather than a histogram of visited states, it is likely that the variance of the estimator of subensemble probability is reduced: the acceptance ratio method records not only whether a transition would be accepted or rejected but also, in effect, how much it would be rejected by. However, because no transitions are made, we lose the ability to connect with high-temperature subensembles to speed up the decorrelation of the system.

Just as for multicanonical sampling, the expansion of the accessible volume of configuration space leads to a long random walk time for its exploration. However, the analysis is somewhat simpler in this case since, as we saw in equation [...], the quantities we are interested in here are expressed directly as the ratio of the measured probabilities of being at the two ends of the chain of subensembles. In section [...] we shall investigate how the length of the random walk affects the error of the method, addressing in particular an argument that accuracy can be improved by subdividing the chain of subensembles.

The question of parallel updating once again arises, but here is generally less of a restriction. Just as in section [...] it was found that, for the multicanonical ensemble, those updates that affect the preweighted variable must be carried out serially, so here the updates that change the subensemble must be carried out serially. However, in the expanded ensemble they could not be carried out any other way, since the energy function or temperature is naturally a global property. Within a particular subensemble we may perform whatever parallel updates of the particles' coordinates would be possible in a Boltzmann sampling simulation.

Valleau's Density-Scaling Monte Carlo

This is another application of a non-Boltzmann sampling technique, which enables the estimation of canonical averages of fluid systems accurately over a wide range of a variable, in this case density, by sampling in a single simulation all relevant parts of configuration space. Free energy differences between the average canonical densities within the sampled range are also obtained.

The method is described in [ref]. It is argued that a good way of parameterising the sampled distribution, at least for a hard sphere fluid, is to choose the measure $Y = Y(s_{nn})$, where $s_{nn}$ is the distance between the pair of particles closest to each other in the simulation, i.e. $s_{nn} = \min_{ij}(s_{ij})$ ($\mathbf{s}^N$ are the reduced position vectors, so $\mathbf{s}_i = \mathbf{r}_i / L$, where $\mathbf{r}^N$ are the real positions and $L$ is the length of the side of the box). In the canonical ensemble we do not expect $\langle r_{nn} \rangle$ to depend much on the density, because of the short-range repulsive forces, which are very strong and increase rapidly as interparticle distance decreases, and therefore $\langle s_{nn} \rangle \propto \rho^{1/3}$. Suppose we are interested in the difference in free energy between ensembles at $\rho_1$ and $\rho_2$. By selecting a suitable form for $Y$ we can make sure that $s_{nn}$ covers the range from $\langle s_{nn} \rangle_1$ to $\langle s_{nn} \rangle_2$, where $\langle s_{nn} \rangle_1$ is a typical value in the canonical ensemble at $\rho_1$, and similarly for $\langle s_{nn} \rangle_2$ at $\rho_2$. We therefore look for $P^{DS}(s_{nn}) \approx$ constant for the range of interest. If $\rho_1$ and $\rho_2$ are representative of different phases, then the method is offering a way to connect the phases directly, like multicanonical sampling with a variable order parameter. Also like multicanonical sampling, canonical averages can be recovered by reweighting.

As well as the hard sphere fluid, the method is applied to the primitive model Coulombic fluid, which has a spherical hard core plus Coulomb forces (there are equal numbers of $+$ and $-$ charged ions). A slightly different sampled distribution is used here: in order to sample those configurations which are still important when weighted by $\exp(-\beta E(\mathbf{s}^N))$, a distribution of the form $Y(\mathbf{s}^N) = w(s_{nn}) \times [\ldots]$ is used, where the second factor is a function chosen to ensure that an appropriate range of energies is sampled.

For the hard sphere fluid, results are presented gathered from only [...] DS simulations that cover the range [...]. Excess free energy, excess pressure and also the pair correlation function $g(r)$ are measured and found to be in very close agreement with analytic results. For the reduced primitive fluid, extensive results are given for both [...] and [...] electrolytes, covering excess free energy, internal energy, excess osmotic coefficient, mean ionic activity coefficient, and like and unlike pair correlation functions. The ranges of density are [...] for [...] and [...] for [...], both ranges covered by [...] DS simulations. The number of particles $N = [\ldots]$ throughout. Results match those of other canonical/grand canonical simulations, and seem better for high densities, where those can suffer ergodic problems.

In a second paper, the "Coulombic Phase Transition" (the low-temperature, low-density phase separation of a fluid of charged spheres) is studied.

Advantages and Disadvantages

Though it is not presented as such, this is a similar method to multicanonical sampling: $Y(\mathbf{s}^N) = w(s_{nn}) \times [\ldots]$ is used. It seems that a sampled distribution of the form

\[ Y^{DS}(\mathbf{s}^N) = \exp(-\beta E(\mathbf{s}^N))\, \exp(\eta(s_{nn})) \]

could well be appropriate, i.e. a multicanonical distribution weighted not in [...] but in $s_{nn}$. This shows the main innovation of the method to be the use of $s_{nn}$, rather than $\rho$ or $V$ itself, as the parameter that controls the non-Boltzmann weight of a configuration and thus extends the range of sampled densities. It is not clear what the relative merits of the two approaches are, though one feels that there is useful physical insight in Valleau's identification of hard-core overlaps as a crucial factor controlling the variation of the density.

Probably the most serious disadvantage of the method as presented is that Valleau gives no systematic procedure for evolving $Y^{DS}$, but uses physically motivated fitting functions containing $s_{nn}$, the hard core radius, $L$ and some fit parameters. From a series of short initial runs these can be set to give a suitable sampled distribution, at least for the fairly small systems studied here. Aside from this, advantages and disadvantages seem to be as for the general umbrella sampling/multicanonical methods.

The Dynamical Ensemble

This method was introduced by Gerling and Hüller in [ref]. The system of interest (an $L^{[\ldots]}$ Potts model in [...]) is regarded as being coupled not to an infinite heat bath, as is the case with the canonical ensemble, but to a finite bath: they consider a [...]d ideal gas of $N$ particles, where $N = L^{[\ldots]}$, whose density of states function is known. The combined system has constant energy $E_{TOT}$, so the probability that the Potts model has energy $E_p$ is

\[ P^{dyn}(E_p) \propto \Omega_p(E_p)\, \Omega_{bath}(E_{TOT} - E_p) \]

\[ \propto \Omega_p(E_p)\, (E_{TOT} - E_p)^{N[\ldots]} \]

Because $\Omega_{bath}$ is known, the bath need not be simulated directly, but its effect can be included by sampling from a non-Boltzmann distribution of energy: the Metropolis algorithm can be used with

\[ \frac{P^{dyn}(\sigma')}{P^{dyn}(\sigma)} = \left( \frac{E_{TOT} - E'_p}{E_{TOT} - E_p} \right)^{N[\ldots]} \]

Thus it falls into the non-Boltzmann sampling framework, though its physical motivation is clearer when it is described as above.
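A Metropolis acceptance step for this kind of finite-bath weight might look like the sketch below. The power of $(E_{TOT} - E_p)$ is passed in as a free parameter `exponent`, an assumption made for illustration, since its exact dependence on $N$ and on the dimensionality of the bath is fixed by the bath's density of states. The function name is likewise hypothetical.

```python
def dynamical_acceptance(e_old, e_new, e_total, exponent):
    """Metropolis acceptance for a move that changes the system energy from
    e_old to e_new when the weight of a microstate is proportional to
    (e_total - E)**exponent.

    Moves that would leave the bath with no energy (e_new >= e_total)
    are always rejected.
    """
    if e_old >= e_total:
        raise ValueError("current state must satisfy e_old < e_total")
    if e_new >= e_total:
        return 0.0
    ratio = ((e_total - e_new) / (e_total - e_old)) ** exponent
    return min(1.0, ratio)
```

Moves that lower the system energy hand energy to the bath and are always accepted, as in ordinary Metropolis sampling; moves that raise it are penalised by a power law rather than an exponential.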

As for the canonical ensemble, only a small volume of phase space is sampled at a time, and so it is necessary to do several simulations at different values of $E_{TOT}$, measuring $\langle E_p \rangle_{dyn}$. Then, using all of these, the entropy $s(E_p)$ is fitted by least squares to

\[ \langle E_p \rangle_{dyn} = \frac{\int E_p\, P^{dyn}(E_p)\, dE_p}{\int P^{dyn}(E_p)\, dE_p} \]

with

\[ P^{dyn}(E_p) \propto \exp(N s(E_p))\, (E_{TOT} - E_p)^{N[\ldots]} \]

After this, canonical averages and $Z$ may be found easily.

Gerling and Hüller make measurements on the [...]- and [...]-state Potts model near the transition point, finding that they can discern the transition from a continuous to a first-order transition much more easily than by conventional methods. In [ref] the method is applied to the [...]d Ising model, and the critical exponents [...] and [...] are measured with fair accuracy even from this small system.

Advantages and Disadvantages

It is not immediately apparent from the above why the dynamical ensemble is better than the canonical; however, its sampled distribution does confer important advantages. Firstly, and most significantly, while $P^{can}(E)$ is doubly peaked at a first-order phase transition, $P^{dyn}(E)$ remains singly peaked, so removing the difficulty of tunnelling. Secondly, finite-size effects are much smaller because the heat bath and the system of interest have similar sizes. Thirdly, we find canonical quantities $\langle E \rangle_{can}$ and $Z$ in this method by a kind of Laplace transform of the fitted $s(E)$. This is in fact extremely numerically stable, so that good accuracy is obtained even when there are appreciable errors in $s(E)$. The reverse Laplace transform, which would give $s(E)$ or $\Omega(E)$ from $\langle E \rangle_{can}$, is of course extremely unstable, which is another manifestation of the fact that free energy measurements require information about $\Omega(E)$ for a wider range of $E$s than can be obtained from a Boltzmann sampling simulation at one $\beta$. The only disadvantage to the method seems to be the necessity of doing a series of simulations; it is not limited to lattice-based systems.

The method does then seem to be very attractive, though it has only very recently appeared and so has not been widely applied. It is not clear that its major advantage (the removal of the free energy barrier between the phases) would necessarily remain if it were applied to other systems, such as off-lattice systems; if it did, the method would certainly be extremely useful.

Grand Canonical Monte-Carlo

In a sense this method is really a kind of Boltzmann sampling, but it is most convenient to treat it in this section, as the varying number of particles gives it special properties. It was introduced in [ref] and is applicable to off-lattice (fluid-like) systems. Like the constant-$NpT$ ensemble, it allows the density of the system to vary, but this is achieved by varying the number of particles in the simulation box, $N$, rather than the box's volume. This is an attractive method because the Gibbs free energy becomes one of the input parameters ($G = \mu N$ for particles of a single type), and so the need to measure it is removed. It is necessary to measure the pressure $p$, which may be done using equation [...].

The partition function simulated is

\[ Z_{\mu VT} = \sum_N \exp(\beta \mu N)\, Z_{NVT}, \qquad Z_{NVT} \propto \frac{1}{N!} \int_V \exp(-\beta E(\sigma^N))\, d\sigma^N \]

which we do by making the usual particle moves and also trying particle insertions/deletions (see [ref] or [ref], Chapter [...], for a derivation of the required expressions for acceptance of these moves). As for Widom's method, we may replace $\mu$ by [...] using the ideal gas equation of state, an approach that was first adopted in [ref].
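For orientation, the insertion and deletion acceptance probabilities can be sketched in their standard textbook form. This is an assumption made for illustration rather than the expressions of the references cited above; the thermal wavelength `wavelength` and the function names are hypothetical parameters of the example.

```python
import math


def insertion_acceptance(n, volume, beta, mu, delta_u, wavelength=1.0):
    """Acceptance probability for inserting one particle into a box holding
    n particles; delta_u is the energy change U(n+1) - U(n)."""
    a = volume / (wavelength ** 3 * (n + 1)) * math.exp(beta * (mu - delta_u))
    return min(1.0, a)


def deletion_acceptance(n, volume, beta, mu, delta_u, wavelength=1.0):
    """Acceptance probability for removing one of n particles; delta_u is
    the energy change U(n-1) - U(n)."""
    if n == 0:
        return 0.0  # nothing to remove
    a = wavelength ** 3 * n / volume * math.exp(-beta * (mu + delta_u))
    return min(1.0, a)
```

The two rules are mutually consistent: the ratio of the insertion probability from $N$ particles to the deletion probability from $N+1$ particles for the reverse move equals the ratio of the grand canonical weights of the two states.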

For the correct value of $\mu$ for the temperature of the simulation, the method should allow direct simulation of coexistence: the system should move back and forth between densities characteristic of the two coexisting phases. However, because the sampling is still from the Boltzmann distribution, albeit in the Grand Canonical Ensemble this time, the problem of the low probability of the interfacial states remains, and effectively prevents tunnelling between the two phases, just as we described for Boltzmann sampling MC with a variable order parameter in section [...]. In practice the method would be used more easily to treat each phase separately. The problem this time would be the equalisation of pressures between the two phases, rather than the equalisation of free energies. We would simulate one phase for a variety of densities by varying $\mu$ and measuring $p$, and then repeat the whole thing for the other phase for the same set of values of $\mu$, looking for the pair of points where the pressures equalise. Once a single coexistence point has been found, it would probably be easier to use Gibbs-Duhem integration, as we explained in section [...].

Advantages and Disadvantages

The most serious disadvantage is clearly that the problem of the interfacial region is not overcome: tunnelling between the two phases is still suppressed, so the two phases cannot be simulated together. However, given this, the method has the attraction that the need to measure free energy is replaced by the simpler problem (for potentials without singularities) of measuring the pressure.

The use of the method is limited only by the density of the system: for dense fluids, acceptance of particle insertions/removals becomes very low because of the likelihood of hard-core overlaps, and in the solid phase the method fails altogether, because the insertion or removal of a particle disrupts the crystal lattice. The $NpT$ ensemble is better for the simulation of solids. We should also note that finite-size effects are large (though well understood: see [ref], where the method was applied to a critical [...]d Lennard-Jones fluid), and the method seems to be unusually sensitive to the quality of the random numbers used.

There is a clear similarity between this method and Widom's, which measures the acceptance ratio for a single particle insertion without actually performing it. Both methods work well for the same kinds of system: fluids of low to medium density. It should be noted that care must be taken to avoid a situation where the acceptance ratio of particle insertions/deletions appears to be high, but in fact a particle is simply being removed, leaving a vacancy, then replaced in the same vacancy. This is best avoided by equilibration moves between the particle insertions/deletions.

The Gibbs Ensemble

This method was introduced only recently [ref], but has already gained a great deal of popularity (see [ref] and references therein). Its use is restricted to phase equilibria of fluid systems, but given this restriction it has the great advantage that both $p$ and $\mu$ are guaranteed to be the same in the two phases, so that a coexistence point on the phase diagram is found immediately, without the need to search laboriously for the values of the control parameters that produce equilibrium, as is the case with TI and most other methods.

In the Gibbs ensemble two phases are simulated simultaneously. They do not coexist in the same simulation volume but are still, in a sense, kept in thermodynamic contact. This avoids the necessity of creating an interface between them, with the concomitant ergodic problems of crossing it. The way this is achieved is as follows: two simulation boxes are used, of volumes $V_1$ and $V_2$, and volume-changing MC moves of both boxes are made as for constant-$NpT$ simulation, but the total volume $V = V_1 + V_2$ is constrained to be constant. This ensures equality of pressure in the two phases. Similarly, the number of particles in each volume is not constant, but $N = N_1 + N_2$ is constant, so we move particles from one simulation box to the other. This ensures equality of chemical potential. As well as this, we move particles around in each simulation box in normal MC fashion. The statistical mechanics of the system and the accept/reject procedure for the various kinds of moves are described in [ref] and [ref]. Equilibration normally causes one of the simulation boxes to come to contain the dense phase and the other the rare phase, and the pressures and chemical potentials of the two boxes equalise. Note that, although pressure and chemical potential are the same in the two phases, we do not immediately know what they are, because the net $p \Delta V$ and $\mu \Delta N$ terms for a volume-change or particle-swap move are zero for the two boxes together. They must be measured separately: the pressure by using equation [...], and $\mu$ by a version of Widom's method. See [ref] for details.

Advantages and Disadvantages

The method is attractive because it is very easy to investigate the coexistence curve once the initial, rather complex, coding has been done. The method becomes difficult to apply in two regions. One is at the critical point, where the dense phase and rare phase change identities regularly and, more seriously, one cannot apply the finite-size scaling theory necessary to correct the results of Monte-Carlo studies of critical phenomena. The other is at low temperatures, where one phase is very dense and the other is very rare. As well as requiring a large total volume, this makes the particle-swapping moves into and out of the dense phase difficult. We should mention that, as with Grand Canonical Ensemble simulations, it is easy to be misled into thinking that a good acceptance ratio of particle-swapping moves is being achieved when in fact the same particle is being moved back and forth between the same vacant sites. Nevertheless, the Gibbs ensemble is extremely attractive for most non-critical fluid-fluid phase equilibrium problems, for which it seems to have made previous methods obsolete.

Other Methods

Coincidence Counting

This method was developed by Ma and provides a way of estimating the density of states $\Omega(E)$ for all states with appreciable $P^{can}(E)$. To do this, we generate a set of $N_c$ configurations, of which $N(E)$ have energy $E$, and measure $n^x(E)$, the number of coincidences (the number of times we get the same microstate more than once). Now suppose we generate one more configuration. If the probability of another coincidence is $P_E(\mathrm{hit})$, then clearly

\[ \langle n^x(E) \rangle_{N(E)+1} = (n^x(E) + 1)\, P_E(\mathrm{hit}) + n^x(E)\, P_E(\mathrm{miss}) \]

If we may assume that the set of points $N(E)$ is randomly distributed in the set $\Omega(E)$ (which implies that we must not sample the trajectory at intervals shorter than the relaxation time), then $P_E(\mathrm{hit}) = N(E)/\Omega(E)$ and $P_E(\mathrm{miss}) = 1 - N(E)/\Omega(E)$. Substituting these, we find

\[ \langle n^x(E) \rangle_{N(E)+1} = n^x(E) + N(E)/\Omega(E) \]

and, averaging over values of $n^x(E)$ (the angle brackets now denote an average over all $N(E)$ steps, not just the previous one),

\[ \langle n^x(E) \rangle_{N(E)+1} = \langle n^x(E) \rangle_{N(E)} + N(E)/\Omega(E) \]

Taking the ansatz

\[ \langle n^x(E) \rangle_{N(E)} = \frac{N(E)\,(N(E) - 1)}{2\,\Omega(E)} \]

we proceed by induction:

\[ \langle n^x(E) \rangle_{N(E)+1} = \frac{N(E)\,(N(E) - 1)}{2\,\Omega(E)} + \frac{N(E)}{\Omega(E)} = \frac{N(E)\,(N(E) + 1)}{2\,\Omega(E)} \]

which has the correct form. It remains only to check the case $N(E) = 2$, for which we indeed find

\[ \langle n^x(E) \rangle_2 = 0 \times (1 - 1/\Omega(E)) + 1 \times (1/\Omega(E)) = \frac{1}{\Omega(E)} = \left. \frac{N(E)\,(N(E) - 1)}{2\,\Omega(E)} \right|_{N(E)=2} \]

completing the proof. To apply the method we use our measured $n^x(E)$ to approximate $\langle n^x(E) \rangle$, and then equation [...] gives $\Omega(E)$. In [ref] the estimates of $\Omega(E)$ are used to estimate the total entropy $S$: it is straightforward to show from $G = \langle E \rangle - TS$ (with no external field) and the definitions of $G$ and $\langle E \rangle$ in equations [...] and [...] et seq. that this is given by

\[ S = -k_B \sum_E P^{can}(E)\, \ln\left[ P^{can}(E) / \Omega(E) \right] \]

In [ref] the method was demonstrated for a small [...]D Ising model, and it has since been applied to a lattice model of an entangled polymer.
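The estimator implied by the ansatz, $\Omega(E) \approx N(E)(N(E) - 1) / 2 n^x(E)$, can be sketched as follows. The sketch assumes that coincidences are counted as coincident pairs of samples and that microstates are represented by hashable labels; all function names, and the uniform test sampler, are hypothetical.

```python
import math
import random
from collections import Counter


def count_coincidences(samples):
    """Number of coincident pairs among the sampled microstates."""
    return sum(c * (c - 1) // 2 for c in Counter(samples).values())


def estimate_omega(samples):
    """Estimate the number of microstates from N(N-1) / (2 n_x)."""
    n = len(samples)
    n_x = count_coincidences(samples)
    if n_x == 0:
        raise ValueError("no coincidences observed; more samples are needed")
    return n * (n - 1) / (2.0 * n_x)


def entropy_from_histogram(p_can, omega, k_b=1.0):
    """S = -k_B * sum_E P(E) ln[P(E) / Omega(E)], skipping empty bins."""
    return -k_b * sum(p * math.log(p / w)
                      for p, w in zip(p_can, omega) if p > 0.0)


def uniform_samples(omega, n, seed=0):
    """Draw n independent microstate labels uniformly from omega states."""
    rng = random.Random(seed)
    return [rng.randrange(omega) for _ in range(n)]
```

With 500 independent samples from 1000 equally likely states one expects about 125 coincident pairs, so the estimate should typically land within roughly ten per cent of the true value.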

Advantages and Disadvantages

To get a good estimate of $\Omega(E)$ we need $n^x(E) \sim O(1)$, implying $N(E) \sim \sqrt{\Omega(E)}$; however, $\Omega(E)$ grows so fast with increasing system size that this requirement soon becomes impossible to satisfy: $\Omega(E) \sim \exp(a L^d s(E)/k_B)$, so the required $N(E) \sim \exp(a L^d s(E)/2k_B)$. Nevertheless, note that this is a much less stringent criterion than $N(E) \sim \Omega(E)$, which would be necessary if we attempted to find $\Omega(E)$ by measuring directly the probability of a particular microstate. In this case $N^{direct}(E) \sim \exp(a L^d s(E)/k_B)$: exponential growth at twice the rate. Although it is ultimately limited by the exponential growth of $N(E)$, Ma's method may be good enough to enable us to measure the entropy of subsystems of the simulation that are large enough to be independent (i.e. their size exceeds the correlation length), so that we can get the total entropy by combining the subentropies in a simple way: $S = \sum_{AB} \Delta S_{AB} + \sum_A S_A$, where $\Delta S_{AB} = S_{AB} - S_A - S_B$. It is clearly most suitable for systems where the increase in entropy with system size is comparatively slow, i.e. the prefactor $a$ in the exponential is quite small. The entangled polymer studied in [ref] falls into this category because the entanglements reduce the accessible volume of phase space.

Lo cal States Metho ds

First used by Meirovich, these have been developed further by Schlijper et al. We measure a local entropy $S_{lo}(L)$, defined as for the total entropy but with the sum taken over small sublattices (clusters) of linear size $L$ embedded in a larger simulation. $L$ is normally kept small: increasing it improves accuracy, but increases the computer time and memory required exponentially. Since the clusters are small, the defining sum can be evaluated directly. The bulk entropy density $s$ is found by extrapolation of $L^{-d} S_{lo}(L)$ using the cluster variation method (CVM). The accuracy of the method depends on the rate of convergence of $L^{-d} S_{lo}(L)$ as $L$ increases; it is thus worst near $T_c$. Meirovich calculated $S$ for the fcc Ising antiferromagnet and compared with integration; local states seems appreciably better in high magnetic field, but shows no improvement in low field.

Schlijper et al. have improved the method by supplementing the CVM calculations with a Markovian calculation of $s$, based on the difference of the cluster entropies of two stepped $d$-dimensional clusters, one having one more site at the step than the other. This second calculation provides an upper bound on $s$, whereas the CVM gives a lower bound. Schlijper et al. therefore take the average of the two as their best estimate of $s$, and half the difference as a rigorous bound for the intrinsic error. They present results for Ising models in two different dimensionalities and for a Potts model, and claim an impressive accuracy.

Advantages and Disadvantages

Because of its reliance on the CVM for extrapolation, the method is applicable only to a restricted class of models: lattice models with translational invariance. This rules out large classes of interesting systems (e.g. fluids, spin glasses). The CVM is also not a transparent method and requires considerable mathematical effort to apply. Having said this, high accuracy is available where the method can be applied, and the existence of both upper and lower bounds on the entropy is attractive.

Rickman and Philpot's Methods

In this method the problem of the measurement of the free energy of a solid is tackled using the idea of measuring the probability of a single microstate, in this case the ground state, in which all the particles are fixed on their lattice sites. This probability is far too small to measure, so instead MC simulation is used to measure the probability that all the particles are inside spheres of a given radius about their lattice sites; the authors then extrapolate by fitting to the data a function $p_{fit}$ whose form involves the number of particles $N$ and the incomplete gamma function. They give results for the Lennard-Jones solid and claim high accuracy, although the agreement with the results of integration is less good than the claimed accuracy. Their method is attractive, but is rendered dubious by the very complex form of $p_{fit}$, although they do have physical grounds for choosing such a fitting function.

More recently Rickman, working with Srolovitz, has devised another method which also falls into this category. It works only for lattice models with discrete configuration spaces, and is described and tested for the Ising model. The idea is to estimate $\Omega(E)$ by defining some large subset $\Omega_s(E)$ of the configurations of energy $E$ which can be easily identified and enumerated by fairly simple combinatoric methods: for example, those configurations which consist of $n$ isolated spins in a sea of spins pointing in the opposite direction (all of which have the same energy, fixed by $n$ and the coupling $J$). We then run a normal Boltzmann sampling algorithm and test whether or not each configuration falls into the set $\Omega_s(E)$. The fraction $f(E)$ of the configurations of energy $E$ that do gives us an estimate of $\Omega(E)$ from $\Omega(E) = \Omega_s(E)/f(E)$. From a single $\Omega(E)$ the partition function for the temperature of simulation can be estimated; from a set of them, $Z$ for a range of temperatures can be found. The method gives very high accuracy for the admittedly unarduous task of finding $G(H = 0)$ for the Ising model. Several lattice sizes are studied; the method is limited by the tendency of $f(E)$ to become immeasurably small for any suitable set that is both enumerable and easily identified. Extension to other lattice models or to $F(M)$ is fairly obvious.

The Partitioning Method of Bhanot et al.

This method, like multistage sampling, concentrates on finding the density of states function $\Omega(E)$ and uses overlapping distributions to do so. We begin from a simple sampling MC algorithm (described earlier), which generates configurations with a pdf proportional to $\Omega(E)$. This time, however, we partition the allowed energy spectrum into narrow overlapping blocks. In one MC run we start the simulation in one of the blocks and constrain the energy to remain within it by rejecting all MC moves that would take us outside. The block is narrow in the sense that all energies within it appear with appreciable probability in the MC run, but it must also be wide enough that we can, in principle, reach any configuration in the block from any other. We then repeat the process for all the blocks. Within each block we have obtained the relative probabilities of the different energies and, because the blocks overlap, we can combine the results of adjacent blocks and normalise, eventually obtaining the absolute probability of any energy in an unconstrained simulation. Knowledge of $\Omega(E)$ at a single energy, or of $\Omega_{TOT}$, then gives us the density of states function, and $F$ follows, as do all other thermodynamic quantities. There is a cumulative error in overlapping large numbers of probability distributions, but nevertheless Carter and Bhanot obtain high accuracy, comparable with any other method, in their simulations. The method was applied to the Ising model, and it has had several other applications, for example to the phase transition in a finite-temperature SU($N$) lattice gauge theory.
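The combination of overlapping blocks described above can be sketched as follows. This is a minimal illustration, not the implementation of Bhanot et al.: the function names are hypothetical, the matching rule (averaging the log-ratio over the shared energies) is one simple choice among several, and the block histograms are idealised (exact relative counts for a periodic 1d Ising chain of 12 spins, each block multiplied by an arbitrary unknown scale) instead of coming from constrained MC runs.

```python
import math


def stitch_blocks(block_histograms):
    """Combine per-block energy histograms into ln Omega(E), up to a constant.

    Each histogram maps energy -> count, and consecutive blocks must share
    at least one energy level.
    """
    log_omega = {}
    for hist in block_histograms:
        log_h = {e: math.log(c) for e, c in hist.items() if c > 0}
        if not log_omega:
            log_omega.update(log_h)
            continue
        shared = [e for e in log_h if e in log_omega]
        if not shared:
            raise ValueError("adjacent blocks must overlap")
        # Shift the new block so that it agrees with the running estimate
        # on the shared energies (averaged in log space).
        shift = sum(log_omega[e] - log_h[e] for e in shared) / len(shared)
        for e, v in log_h.items():
            log_omega.setdefault(e, v + shift)
    return log_omega


def normalise(log_omega, log_omega_total):
    """Fix the additive constant using the known total number of states."""
    m = max(log_omega.values())
    log_sum = m + math.log(sum(math.exp(v - m) for v in log_omega.values()))
    return {e: v - log_sum + log_omega_total for e, v in log_omega.items()}


# Exact Omega(E) for the ring: k domain walls (k even) give E = -n + 2k.
n = 12
exact = {-n + 2 * k: 2 * math.comb(n, k) for k in range(0, n + 1, 2)}
levels = sorted(exact)
blocks = [
    {e: scale * exact[e] for e in levels[start:start + 3]}
    for start, scale in zip((0, 2, 4), (1.0, 0.37, 5.2))
]
log_omega = normalise(stitch_blocks(blocks), n * math.log(2))
print({e: round(math.exp(v)) for e, v in log_omega.items()})
```

With noisy histograms the shifts carry statistical errors that accumulate from block to block, which is the cumulative error mentioned in the text.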

We note that this method is somewhat difficult to classify: we have put it with the direct methods because $\Omega(E)$ is measured directly, but we have already said that it resembles multistage sampling, while the use of constraints and the absence of Boltzmann weighting would justify classifying it among the non-Boltzmann methods.

Advantages and Disadvantages

The method is applicable to off-lattice systems and is simple conceptually and algorithmically; because we do not need to calculate Boltzmann factors, there is a substantial increase in the speed of a single update compared with multistage sampling. However, the necessity of choosing narrow blocks means that we deliberately make worse the ergodic problems with metastable states that we know occur around phase transitions. This is not a problem with the Ising model, for which it can be shown that ergodicity is guaranteed as long as the blocks contain at least four energy macrostates, but it might be expected to be one with more physically realistic models.

Discussion

We can see that a great many MC methods of measuring free energy have been tried. The figure below is an attempt to group conceptually related methods, to simplify the task of seeing the connections between them.

This figure requires some explanation, because even those methods that we have classified differently frequently have points of resemblance to one another. In fact a single classification is impossible, because the measurement of free energies and the finding of phase boundaries is a process which involves several different issues, not a single one, and one method may resemble a second method in the way it tackles one issue and a third in the way it tackles another. Realising this enables us to analyse and classify the methods more easily, and also to envisage possible improved methods which might combine the strong points of different approaches.

Figure: A possible grouping of methods of measuring free energy and entropy. The three main groupings (Direct Measurement of Probability of State, Integration-Perturbation Methods, and Non-Boltzmann Methods) reflect the three groups into which methods have been classified in this chapter, while the dotted boxes show connections between methods in different groups; within a dotted box, the most closely related methods are arranged side by side. Methods shown: Umbrella Sampling, Overlapping Canonical Ensembles, Coincidence Counting, Multicanonical Ensemble, Local States, Thermodynamic Integration, Density-Scaling, Rickman & Philpot's Methods, Histogram Methods, Dynamical Ensemble, Partitioning, Multistage Sampling, Expanded Ensemble, Acceptance Ratio Method, Mon's Method, Particle Insertion, Grand Canonical Monte Carlo, Gibbs Ensemble.

First, it is convenient to deal with those methods (discussed mostly in the preceding sections) that stand out from the mainstream in that they attempt to tackle the problem of measuring a free energy in a direct way, using the fact that if we could measure the probability of the appearance of a single microstate (or of a group of degenerate microstates whose degeneracy we knew) we would know the partition function through $P^{can}(\sigma) = Z^{-1} \exp(-\beta E(\sigma))$. These include Ma's coincidence counting method, the local states methods, and Rickman and Philpot's method for estimating the probability of the ground state. Because of the exponentially large number of microstates, these methods usually rely on ancillary extrapolation procedures, and in any case do not work well for large systems. However, we should mention that, at least for lattice models, the multicanonical method can be applied in a way that puts it in this group: we can measure the probability of a single macrostate in the multicanonical ensemble directly, without the need to extrapolate, and then calculate from this the corresponding probability in the canonical ensemble. If, therefore, we choose a macrostate that consists of only one or two microstates (for example, the ground state of the Ising model), then $Z$ can be calculated. We describe this variant of the method in more detail later.

Now let us turn to the remainder of the methods, which are attempts to deal with one or both of the problems identified earlier. The most obvious way to categorise them is by way of the algorithm employed, by which we really mean the sampled distribution here. First there are those methods that employ simply the $NVT$ (or sometimes $NpT$) canonical ensemble in largely unmodified form: thermodynamic integration, multistage sampling and the acceptance ratio method. These are put in the central box of the figure. Because of the unsuitability of the Boltzmann distribution for direct free energy calculations, these methods rely on using several independent Boltzmann sampling simulations, which between them cover all the necessary states in configuration space. These methods are most naturally applied to each phase separately, connecting the state of interest to a state of known free energy. They do not handle the direct connection of the two phases well, because the Boltzmann distribution for states in the interfacial region is slow to equilibrate and extremely sensitive to minute variations in the control parameters that push the system from one phase to another. Under some circumstances ergodic problems may also affect the estimates of $\langle E \rangle$ in those simulations that are run at low temperatures, or of $pV$ in those simulations that are run at high density.

The differences between these methods themselves lie less in the nature of the Monte Carlo process (one could imagine them producing and using the same Markov chain of configurations) than in what they measure from the configurations generated and the estimators of the free energies that they extract from the measurements. In TI a canonical average is measured for each stage, and this is integrated to give a free energy. In multistage sampling a pdf is measured and overlapped between the stages (or extrapolated and overlapped, if Bennett's extension is being used), while in the acceptance ratio method the transition probability between the ensembles of each stage is measured. This classification by estimators is the second way that methods of free energy measurement can be classified, and is to some extent separate from the classification by algorithm, though the choice of algorithm usually has implications for a sensible choice of estimators: we would not necessarily choose the spacing (in temperature or whatever) of the series of canonical simulations in the same way for TI and multistage sampling. All the methods are fairly easy to implement (TI perhaps the easiest), and for all of them the behaviour of Monte Carlo errors is similar: the total error is obtained by combining estimated errors from within each ensemble, as described earlier. We can also include Widom's particle insertion method and Mon's method in this group. Widom's method is like the acceptance ratio method between two canonical ensembles differing by one in the number of particles present, while Mon's method is like multistage sampling with an unusual but intelligent choice of difference between the canonical ensembles, which means firstly that the exponential whose average must be taken grows only slowly with system size, and secondly that the finite-size corrections to the limiting behaviour of the free energy are obtained directly.

To continue with the classification by algorithm, the second group (put in the right-hand box of the figure) can broadly be called non-canonical methods. Unlike the methods of the previous group, they allow direct connection of the two phases. In Grand Canonical Monte Carlo ($\mu VT$ ensemble) it is unnecessary to measure $\mu$, since it is input, but the method does suffer from the presence of the interface, necessitating a search in $\mu$ to obtain equal values of $p$, which must be measured in both phases. The Gibbs ensemble elegantly avoids the problem of sampling through the interface region and, as we have said, is probably the method of choice where it is applicable. At least for lattice models, the shape of the pdf of the Dynamical ensemble may also offer a way round the problem of low sampling probability in the interface region. The two methods that will most concern us in this thesis also both fall into this group: the expanded ensemble and the multicanonical ensemble. By expanding the volume of configuration space accessible to the simulation, they both enable the determination of free energies in a single simulation. They may be applied either to connect two phases directly or to find the absolute free energy of each. The free energies are obtained from MC estimates of the probabilities of the macrostates or subensembles, reweighted to correct for the non-Boltzmann sampling. There is an obvious similarity between these two methods and we can in fact put them both into the same framework, though we shall wait until later to do this.

These methods are relatively good at overcoming ergodic problems because they do not consist of a series of Boltzmann simulations, each confined to a narrow range of macrostates: they have the ability to reach macrostates of high energy or low density, where decorrelation of configurations is rapid. Of course, ergodic problems are to some extent an inevitable feature of Monte Carlo simulation; one can never be sure that equilibration is complete and that there is not some hidden region of phase space that has not yet been found. Nevertheless, the situation here is obviously better than in the case of TI and the other methods of the first group. The total error of the process is also obtained more simply than for the multistage methods: the standard blocking methods that work for a single canonical ensemble can be used.

Though we have classified them above as using both different algorithms and different estimators, there is an obvious kinship between some of the multistage methods and some of the non-Boltzmann methods, as we have shown in the figure by connecting them with a dotted rectangle. In particular, the acceptance ratio method bears an obvious similarity to the expanded ensemble in the case where each of the subensembles of the latter corresponds to a separate stage of the acceptance ratio method. The difference is that, whereas we merely measure the transition probability of a trial move in the acceptance ratio method, we actually perform the transition in the expanded ensemble. The issue is slightly obscured because the estimator of the probability of the subensembles is different in the two cases as well: the average acceptance probability of the trial transitions in one case, and the histogram of visits to the subensemble in the other. In the same way, grand canonical Monte Carlo and the particle insertion method are related: the first method actually performs the transitions whose probability is recorded in the second.

The partial separation we have effected between the algorithm used and the estimators of free energy defined on the configurations produced enables one not only to understand more clearly the plethora of methods in the literature, but also to think of combinations of the two that have not previously been tried. For example, we could employ the expanded ensemble method, making transitions between the subensembles, but record the acceptance probabilities of the trial transitions and use their average as the estimator of the probability of a state. Indeed, it is to be expected from Bennett's analysis that the variance of this estimator would be slightly lower, because it contains information on how much each transition is accepted or rejected by; this information is discarded by using only the histogram. Thus by this method we would keep the advantages of using the expanded ensemble (reduction of ergodic problems) but combine them with some of the virtues of the multistage methods. Or we could envisage measuring $\langle E \rangle_{can}$ in a multicanonical simulation, then evaluating free energies by integrating as in TI. This combination will be tried later in the thesis.

We should point out that almost all the methods we have discussed are united in the need to face the problem of exploring, in some way, a large volume of configuration space, whether they tackle it by doing one simulation or a series of them. Whereas all configurations within a single phase are in a sense 'similar', this is not the case with the configurations in two different phases, or the configurations in one of the phases of interest and those characteristic of the high-temperature or low-density limit, which can serve as a reference system. Now, the Metropolis algorithm permits only a small change in configuration at each update step if the acceptance ratio is to remain reasonably high (in the expanded ensemble context, this corresponds to a small change in temperature or energy function). Therefore, to sample both phases, or to connect to the reference state, the configuration must be changed completely in kind by the accumulation of small perturbations. It is this that makes the free energy problem especially, and to some extent inescapably, demanding. Even though this transformation of the configuration is not explicitly carried out in methods like multistage sampling, the requirement of overlap of distributions means that an equivalent pathway must be opened up. This way of looking at the problem makes it clear why Mon's finite-size scaling method gives good results, at least for lattice models: the configurational difference between the systems with the two energy functions that it connects is comparatively small. It also explains why TI may, in the right circumstances, give very good results, since it can avoid the need to pass in small steps between very different configurations (although it has its own disadvantages too).

Issues to be Investigated

In the remainder of the thesis we shall concentrate on investigations of the multicanonical ensemble and, to a lesser extent, the related expanded ensemble. We do not see a clear difference between umbrella sampling and the multicanonical ensemble, which itself is similar to the expanded ensemble. The difference seems largely to have been in the class of problems to which they have been applied: condensed matter for umbrella sampling, lattice models and lattice gauge theory for the multicanonical ensemble. Thus the multicanonical ensemble is in many ways more a rediscovery of the principles of umbrella sampling than a new development in its own right. Nevertheless, this rediscovery has clearly provoked a new wave of interest, prompted perhaps by a realisation that umbrella sampling ideas are not as limited as had been thought by the difficulty of finding a suitable sampled distribution; this was, we believe, the most important reason that the original umbrella sampling was not more widely adopted. The advantages of the multicanonical approach are particularly clear in cases where there are very severe ergodic problems, like spin glasses, but even for more general problems of free energy measurement we believe that the method is made attractive by the fact that only a single simulation needs to be performed, and the free energy is obtained more transparently, by direct measurements of probability rather than through an integration process.

The most significant disadvantage remains the difficulty of finding a suitable sampled distribution. Since to generate the sampled distribution requires knowledge of the very free energies that we are trying to measure, it must be done by an iterative process. Though ad hoc methods that work reasonably well in practice have been introduced, further progress in this regard is required before the method can be used 'off-the-shelf'. We shall make extensive investigations of iterative methods to do this in the next chapter, applying Bayesian methods to the problem and introducing a new method based on the use of a histogram of accepted transitions to construct estimators of the probabilities of the states. We shall look at the application of the method both to the single-phase and to the coexistence problems.

The quantity that controls the multicanonical error might be expected to be the random walk time $\tau_{rw}$ over the wide range of accessible states. While this is to some extent true, we shall show that the relation of $\tau_{rw}$ to the error of the final free energy estimators is not what might be expected. We shall also compare the efficiency of the multicanonical and expanded ensembles with that of thermodynamic integration. While developing simulation methods we shall concentrate on applications to the Ising model, but in a later chapter we shall apply the techniques to the simulation of a solid-solid phase coexistence in a simple model of a colloid.

Chapter

Multicanonical and Related

Methods

Introduction

Earlier we gave a qualitative description of the multicanonical ensemble and reviewed its uses in the literature. To recap, the defining characteristic of multicanonical simulations is a sampled distribution which is more or less flat over at least a part of the space of macrostates of a chosen variable (called the preweighted variable) of the system, which will be either internal energy $E$ or magnetisation $M$ here.

Since its introduction, the multicanonical ensemble's uses have included the measurement of interfacial tensions in Ising models, Potts models and lattice gauge theories; application to the Edwards-Anderson spin glass, to measure internal energy, entropy and heat capacity; the study of the $\phi^4$ theory (combined with multigrid methods); and the study of protein tertiary structure. The method is reviewed by Kennedy and Berg, and some recent algorithmic developments are reviewed by Janke.

In this chapter we shall mainly, though not exclusively, be concerned with the application of the multicanonical ensemble to the measurement of absolute free energies. We shall first describe how the multicanonical ensemble may be used to do this, and then we shall investigate ways of producing the required importance-sampling distribution, which is unknown a priori. This will lead us to a new method where inferences are made from the observed transitions made by the simulation, a method which will turn out to have other important uses. We shall then present some new results for the behaviour of the critical pdf of the magnetisation of the Ising model at high $M$, along with results for densities of states and canonical averages for the Ising model, and a comparison with thermodynamic integration. Finally, we shall widen the discussion to include other non-Boltzmann sampling methods, such as those of Fosdick and of Hesselbo and Stinchcombe, and the expanded ensemble. We shall expand on the comments we have already made about the similarity of this latter method to the multicanonical ensemble. We shall also present some new theory on the variance of estimators obtained from the multicanonical/expanded ensemble. As well as laying to rest some 'folk theorems', this will enable us to investigate the question of which non-Boltzmann importance-sampling scheme is optimal for the measurement of a particular quantity. All the investigations made in this chapter will be made using the square-lattice nearest-neighbour Ising model with coupling constant $J$, as described earlier.

First, then, let us briefly describe how the multicanonical ensemble will be used in this chapter. We shall describe two ways of measuring absolute free energy by preweighting in energy, and we shall also describe magnetisation preweighting.

The Multicanonical Distribution over Energy Macrostates

As we have said before, absolute free energy can in principle be calculated from

$$Z = \frac{\Omega_{TOT}}{\langle \exp(\beta E) \rangle_{can}}.$$

$-\beta^{-1} \ln Z$ equals $F$ or $G$, depending on whether the ensemble we are simulating has a constant or variable order parameter. We shall be concerned here with the case where the order parameter (the magnetisation) is variable, so that $-\beta^{-1} \ln Z = G$ for Ising. Now, Boltzmann sampling cannot be used to evaluate the free energy with this equation, because it gives exponentially small weight to the high-energy configurations that dominate the expectation value in it (compare $\langle O \rangle$ in the figure given earlier).

We require, then, to give more weight to the high energy states. To describe how to do this we shall first give a general formulation of the use of non-Boltzmann sampling distributions, then specialise to the energy case. As we saw in an earlier chapter, the Metropolis algorithm can be used to sample from any distribution over the configuration space. Suppose we sample from a distribution with measure $Y(\sigma)$, which we can do by accepting trial transitions from $\sigma$ to $\sigma'$ with probability $\min(1, Y(\sigma')/Y(\sigma))$. Then the expectation value of an operator $O$ is

$$\langle O \rangle_Y = \frac{\sum_{\{\sigma\}} O(\sigma) Y(\sigma)}{\sum_{\{\sigma\}} Y(\sigma)}.$$

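Now, even though we have sampled from the distribution with measure $Y$, we can also write expressions for averages with respect to another distribution, with measure $W(\sigma)$. To see this, consider the ratio

$$\frac{\langle O W/Y \rangle_Y}{\langle W/Y \rangle_Y} = \frac{\sum_{\{\sigma\}} O (W/Y) Y \Big/ \sum_{\{\sigma\}} Y}{\sum_{\{\sigma\}} (W/Y) Y \Big/ \sum_{\{\sigma\}} Y} = \frac{\sum_{\{\sigma\}} O W}{\sum_{\{\sigma\}} W} = \langle O \rangle_W.$$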

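To find canonical averages $\langle O \rangle_{can}$ when the configurations are sampled from a distribution with measure $Y(\sigma)$, we substitute $W(\sigma) = \exp(-\beta E(\sigma))$, giving

$$\langle O \rangle_{can} = \frac{\langle O \exp(-\beta E)/Y \rangle_Y}{\langle \exp(-\beta E)/Y \rangle_Y}.$$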

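The notation is slightly simplified if we characterise the sampled distribution by its difference from the Boltzmann distribution, introducing a function $\eta(\sigma)$ which gives an extra weight proportional to $\exp(\eta(\sigma))$ to each microstate. The sampled distribution thus has measure

$$Y(\sigma) = \exp(-\beta E(\sigma)) \exp(\eta(\sigma)),$$

which gives

$$\langle O \rangle_{can} = \frac{\langle O \exp(-\eta) \rangle_Y}{\langle \exp(-\eta) \rangle_Y}.$$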

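An estimator of this from a finite sample of $N_c$ configurations is

$$\tilde{O} = \frac{\sum_{i=1}^{N_c} O(\sigma_i) \exp(-\eta(\sigma_i))}{\sum_{i=1}^{N_c} \exp(-\eta(\sigma_i))},$$

where the configurations $\sigma_i$ are assumed drawn from the distribution with measure $Y(\sigma) = \exp(-\beta E(\sigma)) \exp(\eta(\sigma))$.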
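The sampling rule and the reweighted estimator can be illustrated together on a toy system. The following is a minimal sketch under stated assumptions, not the simulation code of the thesis: the system is a periodic 1d Ising chain of 6 spins with $J = 1$, the function names are hypothetical, and the extra weight is the arbitrary illustrative choice $\eta(E) = \beta E/2$, which amounts to sampling at half the inverse temperature.

```python
import math
import random


def reweighted_average(o_values, eta_values):
    """Estimate <O>_can as sum_i O_i exp(-eta_i) / sum_i exp(-eta_i)."""
    shift = min(eta_values)  # a constant shift cancels in the ratio
    weights = [math.exp(-(eta - shift)) for eta in eta_values]
    return sum(o * w for o, w in zip(o_values, weights)) / sum(weights)


def sample_ring(n_spins, beta, eta, n_steps, rng):
    """Metropolis sampling from Y = exp(-beta*E + eta(E)) by single-spin flips.

    Returns the energy recorded after every attempted flip.
    """
    spins = [1] * n_spins
    energy = -float(n_spins)  # all spins aligned, J = 1
    energies = []
    for _ in range(n_steps):
        i = rng.randrange(n_spins)
        # Energy change on flipping spin i (spins[i - 1] wraps around at i = 0).
        d_e = 2.0 * spins[i] * (spins[i - 1] + spins[(i + 1) % n_spins])
        new_energy = energy + d_e
        log_ratio = -beta * d_e + eta(new_energy) - eta(energy)
        # Accept with probability min(1, Y(new)/Y(old)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            spins[i] = -spins[i]
            energy = new_energy
        energies.append(energy)
    return energies


rng = random.Random(1)
beta = 0.5
n_spins = 6


def eta_demo(e):
    """Illustrative extra weight: eta(E) = beta*E/2."""
    return 0.5 * beta * e


energies = sample_ring(n_spins, beta, eta_demo, 200_000, rng)
estimate = reweighted_average(energies, [eta_demo(e) for e in energies])
print(estimate)
```

For this chain the exact canonical energy follows from $Z = (2\cosh\beta)^N + (2\sinh\beta)^N$, which gives a convenient check on the reweighted estimate. For a less benign choice of $\eta$ the weights $\exp(-\eta)$ would be dominated by a handful of configurations and the estimate would become unreliable.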

Note that the right hand side of the estimator above provides an estimator of $\langle O \rangle_{can}$ for any $\eta$, though only a few choices are usable in practice: for most $\eta$ the sampling probability is such that the configurations that dominate the sums of either $O \exp(-\eta)$ or $\exp(-\eta)$, or both, will be generated with infinitesimal probability. The possibility of applying this kind of reweighting has been appreciated since the early days of computer simulation (see Fosdick's paper, an early attempt at finding an optimal sampled distribution). It includes as special cases Boltzmann sampling (put $Y = \exp(-\beta E)$, when the denominator becomes a minimum variance estimator, a constant for all configurations) and simple sampling ($Y$ constant). However, it has been little used for a long time, mainly because Boltzmann sampling is very successful for most choices of $O$ (internal energy, magnetisation) and is very simple conceptually.

Though we shall come back to more general sampled distributions later, we shall until then concern ourselves only with multicanonical distributions. All the equations for the estimators $\tilde{O}$ that we produce are true for any values of $\{\eta\}$ in the limit of very long sampling time but, because of the failure to sample the important configurations frequently, most would give very poor estimators in the run-times accessible in practice. We first consider the case of a multicanonical distribution with energy the preweighted variable, so only the value of the energy macrostate is relevant to determine $\eta$: $\eta(\sigma) = \eta(E(\sigma))$. As we said when introducing this method, multicanonical sampling means that the sampled distribution of energies ($P^{xc}(E)$ for energy preweighting, $P^{xc}(M)$ for magnetisation, etc.) extends right up to very high energies and is roughly flat, as shown schematically in the accompanying figure. Such a distribution is produced by a set of coefficients $\eta(E) = \eta^{xc}(E)$ (we shall use 'xc' to signify 'multicanonical' in mathematical expressions).

The reweighting estimator, rewritten as a sum over energy macrostates and written specifically for the multicanonical ensemble, is

$$\tilde{O} = \frac{\sum_E \tilde{P}^{xc}(E)\, O(E) \exp(-\eta^{xc}(E))}{\sum_E \tilde{P}^{xc}(E) \exp(-\eta^{xc}(E))},$$

where $\tilde{P}^{xc}(E)$ are estimators of $P^{xc}(E)$; the most obvious way to produce them is simply to use $\tilde{P}^{xc}(E) = C^{xc}(E) / \sum_E C^{xc}(E)$, where $C^{xc}(E)$ is the histogram of energy macrostates visited in the multicanonical run. However, there are other ways of estimating $P^{xc}(E)$.

Figure: A schematic diagram of a typical histogram sampled from a multicanonical distribution, and the estimates of $P^{can}(E)$ and $P^{can}(E) \exp(\beta E)$ that may be recovered from it. (Horizontal axis: $E$.)

Now $P^{xc}(E) \exp(-\eta^{xc}(E)) \propto P^{can}(E)$, so the sum in the denominator is dominated by energies around $\langle E \rangle_\beta$, as indicated in the diagram. Conversely, for $O = \exp(\beta E)$,

$$P^{xc}(E)\, O(E) \exp(-\eta^{xc}(E)) \propto P^{can}(E) \exp(\beta E) \propto \Omega(E),$$

so the sum in the numerator is dominated by the maximum of $\Omega(E)$, which occurs at high energy. However, since the multicanonical distribution extends over both regions of energy space, both sums are estimated to good accuracy. Indeed, for the multicanonical distribution shown in the figure, the estimators of $\langle O \rangle$ will be accurate for all operators $O$ which depend only on $E$ and its conjugate field $\beta$. We get not only free energies but also internal energies, heat capacities, etc., and not only at temperature $\beta$ but also at all other temperatures $\beta'$: to evaluate these we return to the general reweighting ratio and replace $\exp(-\beta E)$ by $\exp(-\beta' E)$ and, if appropriate, $O_\beta$ by $O_{\beta'}$. This leads eventually to the following equation, analogous to the one above:

$$\tilde{O}_{\beta'} = \frac{\sum_E \tilde{P}^{xc}(E)\, O_{\beta'}(E) \exp(-\eta^{xc}(E)) \exp(-(\beta' - \beta) E)}{\sum_E \tilde{P}^{xc}(E) \exp(-\eta^{xc}(E)) \exp(-(\beta' - \beta) E)}.$$
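The reweighting of a multicanonical histogram to another temperature, and the recovery of $\ln Z$ from it, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the thesis: the function names are hypothetical, the system is a periodic 1d Ising chain of 12 spins whose $\Omega(E)$ is known in closed form, and the histogram is idealised as perfectly flat (what an infinitely long run with the ideal $\eta^{xc}(E) = \beta E - \ln \Omega(E)$ would give). Sums are done in log space to avoid overflow.

```python
import math


def log_sum_exp(values):
    """Numerically stable ln(sum(exp(v))) for a list of values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def reweight(hist, eta, beta, beta_new, observable):
    """<O> at beta_new from a histogram sampled with weight exp(-beta*E + eta(E))."""
    log_w = {
        e: math.log(c) - eta[e] - (beta_new - beta) * e
        for e, c in hist.items()
        if c > 0
    }
    log_norm = log_sum_exp(list(log_w.values()))
    return sum(observable(e) * math.exp(lw - log_norm) for e, lw in log_w.items())


def log_partition(hist, eta, beta, log_omega_total):
    """ln Z(beta) = ln Omega_TOT - ln <exp(beta*E)>_can from the same histogram."""
    visited = [(e, c) for e, c in hist.items() if c > 0]
    log_num = log_sum_exp([math.log(c) - eta[e] + beta * e for e, c in visited])
    log_den = log_sum_exp([math.log(c) - eta[e] for e, c in visited])
    return log_omega_total - (log_num - log_den)


# Toy system: k domain walls (k even) on a ring of n spins give E = -n + 2k.
n = 12
omega = {-n + 2 * k: 2 * math.comb(n, k) for k in range(0, n + 1, 2)}
beta, beta_new = 0.4, 0.7
eta_xc = {e: beta * e - math.log(g) for e, g in omega.items()}  # flattens P^xc(E)
hist = {e: 1000 for e in omega}  # idealised flat histogram

mean_e = reweight(hist, eta_xc, beta, beta_new, lambda e: e)
ln_z = log_partition(hist, eta_xc, beta, n * math.log(2))
print(mean_e, ln_z)
```

In a real run the histogram fluctuates and $\eta^{xc}$ is only approximately ideal, but the same two functions apply unchanged, since the reweighting removes whatever $\eta$ was actually used.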

The denominator is now dominated by the maximum of $P^{can}_{\beta'}(E)$, while for $O_{\beta'} = \exp(\beta' E)$ (to estimate $G(\beta')$) the sum in the numerator is still dominated by the maximum of $\Omega(E)$. Thus, depending on $O$ and $\beta'$, terms from various parts of $E$ space will dominate the sums in the reweighting equation, but since the multicanonical distribution is flat we are sure to have sampled the relevant part of $E$ space. Even if the sampled distribution is only multicanonical over a part of $E$ space, these assertions will still be true as long as $O$ and $\beta'$ are such that all the appreciably large terms in the sums come from the multicanonical part. We should also note that the estimators $\tilde{O}$ above are ratio estimators, that is to say they are the ratios of sums, and as such are slightly biased. A way of removing this bias is to use double-jackknife bias-corrected estimators, which are described in appendix D.
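To make the idea of bias correction for a ratio of sums concrete, here is a sketch of an ordinary delete-one-block jackknife. It is deliberately simpler than, and should not be read as, the double-jackknife procedure referred to above; the function name and the choice of ten blocks are assumptions made for illustration.

```python
import math


def jackknife_ratio(numerators, denominators, n_blocks=10):
    """Bias-corrected estimate of sum(numerators)/sum(denominators).

    Returns (corrected_ratio, jackknife_standard_error). Any samples beyond
    a whole number of blocks are discarded.
    """
    n = len(numerators)
    if n != len(denominators) or n < n_blocks:
        raise ValueError("need equal-length inputs with at least n_blocks samples")
    size = n // n_blocks
    used = size * n_blocks
    total_num = sum(numerators[:used])
    total_den = sum(denominators[:used])
    full = total_num / total_den
    # Ratio recomputed with each block left out in turn.
    leave_out = []
    for b in range(n_blocks):
        lo, hi = b * size, (b + 1) * size
        leave_out.append(
            (total_num - sum(numerators[lo:hi]))
            / (total_den - sum(denominators[lo:hi]))
        )
    mean_leave_out = sum(leave_out) / n_blocks
    corrected = n_blocks * full - (n_blocks - 1) * mean_leave_out
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_leave_out) ** 2 for x in leave_out
    )
    return corrected, math.sqrt(variance)


# Usage with the reweighting estimator: numerators are O_i * exp(-eta_i),
# denominators are exp(-eta_i), in the order the configurations were generated.
den = [float(i % 7 + 1) for i in range(100)]
num = [3.0 * d for d in den]
ratio, error = jackknife_ratio(num, den)
print(ratio, error)
```

Blocks should be long compared with the correlation time of the simulation, so that the leave-one-out values are approximately independent.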

One other operator that we shall consider explicitly is the one that allows the measurement of free energy differences between $\beta$ and $\beta'$. This is $O_G \equiv \exp[(\beta - \beta')E]$; it follows straightforwardly from the definition of the Boltzmann distribution that

$$\langle O_G \rangle_{can} = Z(\beta')/Z(\beta) = \exp[\beta G(\beta) - \beta' G(\beta')]$$

Substitution of this operator into equation and consideration of the numerator and denominator reveals that an accurate estimator will be obtained provided that the peaks of $P^{can}_{\beta}$ and $P^{can}_{\beta'}$ are both in the multicanonical region. This operator is of interest because there are many systems, such as fluids, for which we cannot use equation to calculate absolute free energies because $\Omega(E)$ increases without limit. However, we can still use $O_G$ to estimate free energy differences, and if $\beta'$ is such that $G(\beta')$ is known exactly or is calculable to high accuracy by some approximation scheme (e.g. the virial expansion), then the absolute free energy at $\beta$ can be obtained this way. Indeed, the formula can be seen as the infinite temperature limit of $\langle O_G \rangle$, using our knowledge that $\lim_{\beta' \to 0} \beta' G(\beta') = -\ln \Omega_{TOT}$, provided $\Omega_{TOT}$ is finite.
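For completeness, the identity quoted above follows in one line. The sketch below assumes the usual definitions $Z(\beta) = \sum_E \Omega(E) e^{-\beta E} = e^{-\beta G(\beta)}$; writing the second inverse temperature as $\beta'$ and the operator as $O_G = \exp[(\beta - \beta')E]$ is a notational assumption.

```latex
\begin{align*}
\langle O_G \rangle_{can}
  &= \frac{1}{Z(\beta)} \sum_E \Omega(E)\, e^{-\beta E}\, e^{(\beta - \beta')E}
   = \frac{1}{Z(\beta)} \sum_E \Omega(E)\, e^{-\beta' E}
   = \frac{Z(\beta')}{Z(\beta)}, \\
\lim_{\beta' \to 0} Z(\beta') &= \sum_E \Omega(E) = \Omega_{TOT}
  \quad\Longrightarrow\quad
  \beta G(\beta) = -\ln \Omega_{TOT} + \ln \langle e^{\beta E} \rangle_{can} .
\end{align*}
```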

As will be observed, for operators that lead to free energies or free energy differences there are two widely separated regions of $E$ (macrostate) space that make important contributions, and the rest of the multicanonical distribution does not contribute directly but is important only in so far as it permits tunnelling between them, necessary to find the relative weights of numerator and denominator in equations and. Since in the multicanonical distribution every macrostate is equally probable, one would expect that at each Monte Carlo step the probability of moving to a higher macrostate would be about the same as the probability of moving to a lower one. This suggests that the simulation should perform a random walk through

macrostate space and therefore that E V the total number of macrostates N V the

e m

can

volume of the system and the separation of the p eaks of P E and E scales in this way

to o This is very similar to the way that the multicanonical metho d reduces the tunnelling

time b etween the two phases in a simulation of a rst order phase transition see section

e

dep ends only on a p ower of the system size instead of increasing exp onentially with it

e

see equation Indeed when the multicanonical ensemble was introduced this asp ect

of the algorithms p erformance was particularly emphasised V in However it is

e

not obvious that the flat multicanonical distribution is necessarily optimal, and we shall return to the question of exactly how much weight should be put in the region between the peaks in section

We should note that, while many of the applications of the multicanonical ensemble have been to first-order phase transitions, the measurement of absolute free energies by using knowledge of $\Omega_{TOT}$ is referred to only in where it is used in a calculation of $S(E) = [E - F(E)]/T$. The overall normalisation is also used in calculations of the degeneracy of the ground state of spin glasses. The difference in approach in the phase transition problem arises because most authors use the multicanonical method in the form where the free energy difference between two phases is measured by tunnelling through the interfacial region. Absolute free energies are not required to do this, provided that a way can be found to directly connect the two phases. Indeed, at coexistence the free energy difference is zero, and so all that is required is to reconstruct $P^{can}$ and show that the sums over the two phases are equal. This method is not appropriate for energy preweighting of the d Ising model because $P^{can}(E)$ never develops a double-peak structure; however, it is appropriate for magnetisation preweighting.

Finally, let us return to the question of what $\eta$s we may regard as multicanonical. First, we repeat that the required set $\eta^{xc}$ is unknown a priori: to produce a perfectly flat sampled distribution we would need $\eta^{xc}(E) = \beta F(E) = -\ln P^{can}(E)$, where direct measurement of $P^{can}(E)$ gives us an estimate that is at first indistinguishable from zero for most $E$ states. Thus to produce the multicanonical distribution implies that we need at least a partial knowledge of the very quantities we wish to measure. In practice, then, we shall never be able to use a perfectly multicanonical distribution, but only an approximately flat distribution. In fact, all the advantages of the ideal multicanonical distribution remain as long as the distribution is approximately flat, so that we obtain good sampling in all states: while it is the case that sampling from a distribution that is only roughly flat will lead to larger expected errors than sampling from a completely flat one, the difference in the expected error bar is only a few percent between a completely flat distribution and one where we impose only that $P^{xc}(E) = O(P^{xc}(E'))$ for all $E, E'$ in the range of interest. We shall therefore regard such sampled

distributions to b e multicanonical to o Where necessary we shall use the notation E for

the ideal multicanonical distribution in which every macrostate has exactly equal probability

xc

to distinguish it from E which implies only one of many p ossible sampled distributions

xc

that are close enough to multicanonical to b e used as such in practice The condition on P

xc

demands that E should dier from E by terms of order unity ie a constant abso

d

lute error But we know that at least away from criticality F f L so therefore

the fractional accuracy with which $\eta$ must be known to produce a multicanonical distribution increases with increasing system size. Moreover, the set $\eta$ is not fixed absolutely even by the requirement that it produce a particular sampled distribution: if $\eta$ gives a multicanonical distribution (or indeed any other sampled distribution) for a particular $\beta$, then so does $\eta'$, where $\eta'(E) = \eta(E) + k$ for all $E$, with $k$ constant. We shall adopt the convention that $k$ is to be chosen such that $\min_E \eta(E) = 0$. Indeed, there is even less restriction than this: it will be noted that the parts of $E$ space that dominate the estimator of $\langle \exp(\beta E) \rangle$ in equation are independent of the temperature of the multicanonical simulation. This shows that $\beta$ can be chosen more or less at will: if we have a multicanonical distribution produced by coefficients $\eta(E)$ for one temperature $\beta$, then the same sampled distribution would be produced by $\eta'(E) = \eta(E) + (\beta' - \beta)E$ at temperature $\beta'$. It would perhaps be simplest in practice to choose $\beta = 0$, though this is not what has generally been done in this thesis. We shall consider various iterative procedures for generating a suitable $\eta^{xc}(E)$ in section
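The two freedoms in the choice of the preweighting coefficients can be checked directly. As a notational sketch, assume that the sampled distribution has the form $P^{xc}(E) \propto \Omega(E)\exp[-\beta E + \eta(E)]$, with $k$ a constant and $\beta'$ a second inverse temperature:

```latex
\begin{align*}
\eta'(E) = \eta(E) + k:\quad
  & \Omega(E)\, e^{-\beta E + \eta(E) + k}
    = e^{k}\, \Omega(E)\, e^{-\beta E + \eta(E)}, \\
\eta'(E) = \eta(E) + (\beta' - \beta)E:\quad
  & \Omega(E)\, e^{-\beta' E + \eta(E) + (\beta' - \beta)E}
    = \Omega(E)\, e^{-\beta E + \eta(E)} .
\end{align*}
```

In the first case the constant factor $e^{k}$ is removed by normalisation; in the second the weights agree term by term, so the same sampled distribution results at the new temperature.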

An Alternative: The Ground State Method

Aside from the use of equation, there is another way that a multicanonical simulation can give access to absolute free energy: by enabling us to measure the canonical probability of a macrostate that contains a known number of microstates, such as the ground state $E_0$, which is twofold degenerate in the case of the d Ising model. We first calculate

$$P^{can}(E_0) = \frac{P^{xc}(E_0)\, \exp[-\eta^{xc}(E_0)]}{\sum_E P^{xc}(E)\, \exp[-\eta^{xc}(E)]}$$

and then use $P^{can}(E_0)$ alone to determine the free energy:

$$\beta G = -\ln Z = \beta E_0 + \ln P^{can}(E_0) - \ln 2$$

Thus we need to know $P^{can}(E)$ both at $E_0$ and also for those macrostates near its maximum, which will dominate the normalisation (the denominator of equation). This implies that we require the multicanonical distribution to overlap completely with the region around $\langle E \rangle_\beta$ and also to extend down to the ground state. This is in contrast to the previous method, where the multicanonical distribution had to extend upwards from the region around $\langle E \rangle_\beta$ to overlap with the maximum of $\exp(\beta E) P^{can}(E) \propto \Omega(E)$. As before, free energies at other temperatures $\beta'$ may be estimated, provided that the multicanonical distribution extends to cover the peak of $P^{can}_{\beta'}(E)$.

Whether this technique or the one of measuring $\langle \exp(\beta E) \rangle$ is better depends on the algorithm and the temperatures of interest: to a first approximation, if $\langle E \rangle_\beta$ is near to $E_0$ (as will be the case for large $\beta$, i.e. low temperature) then the ground state method will be better; if it is near to the maximum of $P^{can}(E) \exp(\beta E)$ (as for high temperature) then the $\langle \exp(\beta E) \rangle$ method will be better. However, the situation is complicated by the variation of acceptance ratio over the wide range of $E$ that we are covering: for instance, the Metropolis algorithm slows down very dramatically near the ground state of the Ising model (the acceptance ratio decreases like $L^{-d}$), owing to the difficulty it has in finding the spins that must be flipped to steer the system into the single microstate of the ground state, out of the higher energy states with their exponentially large number of microstates. However, other algorithms, like the n-fold way and generalisations of it, can alleviate this problem. Once again, we shall comment further on this matter in section

In the literature, measurement of the ground state probability seems only to have been used to find the unknown ground state degeneracy of spin glasses, not to give the overall normalisation (and thus the absolute free energy) in a case where the ground state degeneracy is known.
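The ground state method can be sketched in a few lines. This is an illustrative sketch only, assuming the convention $P^{xc}(E) \propto \Omega(E)\exp[-\beta E + \eta(E)]$; the function name `beta_g_ground_state` and the toy system (n independent two-level units, with a non-degenerate ground state at $E_0 = 0$ and $\beta G = -n \ln(1 + e^{-\beta})$ exactly) are hypothetical and serve only as a check.

```python
import math

def beta_g_ground_state(energies, p_xc, eta, beta, e0, degeneracy):
    """Return beta*G from the canonical probability of the ground state.

    Assumes p_xc(E) is proportional to Omega(E) * exp(-beta*E + eta(E)),
    so that P_can(E) is proportional to p_xc(E) * exp(-eta(E)).
    The ground state e0 must have been visited (p_xc > 0 there).
    """
    log_w = {e: math.log(p) - h
             for e, p, h in zip(energies, p_xc, eta) if p > 0.0}
    # Log of the normalising sum, computed with a shift for stability.
    shift = max(log_w.values())
    log_norm = shift + math.log(sum(math.exp(lw - shift) for lw in log_w.values()))
    log_p_can_e0 = log_w[e0] - log_norm
    # beta*G = -ln Z = beta*E0 + ln P_can(E0) - ln(degeneracy)
    return beta * e0 + log_p_can_e0 - math.log(degeneracy)

# Toy check: n two-level units, Omega(E) = C(n, E), ground state E0 = 0.
n, beta = 20, 1.0
energies = list(range(n + 1))
eta = [beta * e - math.log(math.comb(n, e)) for e in energies]  # flat P_xc
p_xc = [1.0 / (n + 1)] * (n + 1)

beta_g = beta_g_ground_state(energies, p_xc, eta, beta, e0=0, degeneracy=1)
exact_beta_g = -n * math.log(1.0 + math.exp(-beta))
```

For the Ising case discussed above, `degeneracy` would be 2.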


The Multicanonical Distribution over Magnetisation Macrostates

In the canonical ensemble (we show an external field for generality, though in the Ising case we shall be concerned only with $H = 0$):

$$P^{can}(M) = \sum_{\{\sigma\}} \delta(M - M(\sigma))\, \exp[-\beta E(\sigma)]\, \exp(\beta H M)/Z = \exp(\beta H M)\, \exp[-\beta F(M)]/Z$$

defining the free energy functional $F(M)$.

We discussed the form of $P^{can}(M)$ in section. For $H = 0$ it has two peaks for $\beta > \beta_c$ and one at $M = 0$ for $\beta < \beta_c$. Except exactly at $\beta_c$, these are Gaussian and change shape as $L \to \infty$ in such a way that their width, when expressed in terms of $m = M/L^d$, becomes vanishingly small. For $\beta > \beta_c$ the states around $M = 0$ correspond to mixed-phase configurations.

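By introducing $\eta^{xc}(M)$, so that

$$P^{xc}(\sigma) \propto \exp[-\beta E(\sigma) + \eta^{xc}(M(\sigma))]$$

and

$$P^{xc}(M) \propto \exp(\beta H M)\, \exp[-\beta F(M) + \eta^{xc}(M)]$$

we may produce a multicanonical distribution, flat over some range of $M$ values. From measured multicanonical probabilities we can then recover the canonical distribution at a value of the applied external field $H'$ different from the value $H$ prevailing during the simulation, by using

$$P^{can}(M, H') = \frac{P^{xc}(M)\, \exp[\beta (H' - H) M]\, \exp[-\eta^{xc}(M)]}{\sum_M P^{xc}(M)\, \exp[\beta (H' - H) M]\, \exp[-\eta^{xc}(M)]}$$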

where $P^{xc}(M)$ may be estimated from $C^{xc}(M)$, the histogram of visited magnetisation states in the multicanonical distribution, or by some other means. This equation may be used to tackle the free energy problem in the same ways as it was in the energy case. If the range of $M$ that is multicanonical embraces those values typical of the two coexisting phases, then we may simulate coexistence directly: a good finite-size estimator of the infinite-volume first-order phase transition occurs where the two peaks of $P^{can}(M, H')$ have equal weight, i.e.

$$\sum_{M \in A} P^{can}(M, H') = \sum_{M \in B} P^{can}(M, H')$$

where $A$ and $B$ are the two phases. It is essential to determine $P^{can}(M)$ indirectly via the multicanonical ensemble for two reasons: first, to allow tunnelling between the peaks, which is necessary to find their relative weight whatever the external field may be; and second, to allow reweighting to different values of $H'$ until we find the one that satisfies the equal-weights criterion. This method has been used to determine the phase coexistence curve of the Lennard-Jones fluid in and we shall use it in section of chapter. It is not needed for the Ising model, where the location of the coexistence line is determined by symmetry. It also allows accurate measurement of $P^{can}(M)$ in the interface region, which enables us to measure interfacial tension (see the discussion of mixed states in section and).
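The equal-weights criterion lends itself to a simple numerical search. The sketch below is illustrative only: it assumes $P^{xc}(M) \propto \exp(\beta H M)\exp[-\beta F(M) + \eta(M)]$, takes the two phases to be $M < 0$ and $M > 0$, and uses plain bisection on the reweighted field. The function names, the bracketing interval and the symmetric double-well $\beta F(M)$ of the toy check are all hypothetical; for that symmetric toy the coexistence field is zero by construction.

```python
import math

def phase_weights(ms, p_xc, eta, beta, h_sim, h_new):
    """Canonical weights of the M < 0 and M > 0 peaks at field h_new,
    reweighted from multicanonical data gathered at field h_sim.
    Requires p_xc > 0 for every macrostate listed."""
    log_w = [math.log(p) + beta * (h_new - h_sim) * m - h
             for m, p, h in zip(ms, p_xc, eta)]
    shift = max(log_w)  # guard against overflow
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    w_minus = sum(x for m, x in zip(ms, w) if m < 0) / total
    w_plus = sum(x for m, x in zip(ms, w) if m > 0) / total
    return w_minus, w_plus

def coexistence_field(ms, p_xc, eta, beta, h_sim, h_lo, h_hi, tol=1e-12):
    """Bisect on the field until the two peaks carry equal weight.
    Assumes the root lies in [h_lo, h_hi]; the imbalance grows with field."""
    def imbalance(h):
        w_minus, w_plus = phase_weights(ms, p_xc, eta, beta, h_sim, h)
        return w_plus - w_minus
    while h_hi - h_lo > tol:
        h_mid = 0.5 * (h_lo + h_hi)
        if imbalance(h_mid) > 0.0:
            h_hi = h_mid
        else:
            h_lo = h_mid
    return 0.5 * (h_lo + h_hi)

# Toy check: symmetric double well, simulated at a small non-zero field.
n, beta, h_sim = 20, 1.0, 0.05
ms = list(range(-n, n + 1, 2))
beta_f = [8.0 * ((m / n) ** 2 - 1.0) ** 2 for m in ms]
eta = [bf - beta * h_sim * m for bf, m in zip(beta_f, ms)]  # flat P_xc
p_xc = [1.0 / len(ms)] * len(ms)

h_coex = coexistence_field(ms, p_xc, eta, beta, h_sim, -1.0, 1.0)
```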

Equation can also be used to measure the absolute free energy of a single phase without crossing the interface, if this is difficult for some reason (for example because, for an off-lattice system, it would necessitate growing a crystal out of a fluid). In the Ising case the absolute free energy $G$ is most easily found by producing a multicanonical distribution that extends all the way to the fully saturated states at $M = \pm L^d$. This has been done, for the first time to our knowledge, in section. It enables us to obtain another very accurate estimate of $G$ by the method described in section. We also use this distribution to study the scaling of $P^{can}(M)$ at large $M$. The sampled distribution used in section in fact extends over all magnetisation states, so that $G$ for the entire system is obtained; however, the free energy of just one phase would be obtained if $M$ were restricted to embrace just the values characteristic of the phase. This approach would be less useful for off-lattice systems, for example, where there is not a single saturated state at very large values of the order parameter (volume), while at very small values there is the close-packed crystal, which has infinitesimal canonical probability. For them, a method analogous to the use of equation in section could be used. We shall keep within the Ising context to describe this method. By substituting into equation it is easy to show that

$$\langle \exp[\beta M (H' - H)] \rangle_{can} = \exp[\beta G(H) - \beta G(H')]$$


If the range of $M$ is restricted but there is appreciable canonical probability outside it, then $G(H)$ should be replaced by $G_A(H)$, the free energy of the phase. Just as in section, to estimate this accurately we require the sampled distribution to overlap both $P^{can}(M, H)$ and $P^{can}(M, H')$, and so must use the multicanonical estimator. If the state at $H'$ is such that its free energy is known exactly or to a good approximation (e.g. using the virial expansion in the case of a dilute gas), then the absolute free energy at $H$ follows.

Techniques for Obtaining and Using the Multicanonical Ensemble

To travel hopefully is a better thing than to arrive

FROM El Dorado

ROBERT LOUIS STEVENSON

In this section we shall be concerned with techniques relating to the multicanonical ensemble: we shall discuss various iterative processes for the generation of the coefficients $\boldsymbol{\eta}^{xc}$ (we use the vector notation for succinctness and because what we are about to say applies to both $\eta(E)$ and $\eta(M)$), and we shall also describe a method that may make implementation of the method efficient on parallel computers. We have devoted particular attention to the development of quick, efficient and reliable methods for finding a usable $\boldsymbol{\eta}^{xc}$, because the absence of such methods seems to have been the principal obstacle to the wider application of previous non-Boltzmann sampling methods, such as umbrella sampling (section). We present and discuss the results we have obtained from simulations using the multicanonical ensemble in section

The usual approach to multicanonical simulation in the literature has been to divide the application of the method into two parts: the finding of an approximately multicanonical distribution, which is done as fast as possible, followed by a lengthy production run in which a much longer Markov chain is generated without changing the sampled distribution. Only the results of the production run are then used in equation or its equivalents to estimate the quantities of interest, with error bars coming from jackknife blocking. We too shall divide the tasks up like this (the results of section come from simulations implemented in this way), though at the end of this introductory section we shall discuss further why, and if, this division is really necessary.

First, then, let us discuss the generation of $\boldsymbol{\eta}^{xc}$. In a real problem the sampled distribution never will be perfectly flat on macrostate space, because this would imply exact knowledge of the probabilities $P^{can} \propto \exp(-\beta F)$, which, as this expression shows, are dependent on the very free energies that we are trying to measure. To produce the multicanonical distribution and to measure $P^{can}$ or $F$ are therefore the same problem, and to solve it requires an iterative procedure. We begin with some initial guesses $\boldsymbol{\eta}$, which may well be $\boldsymbol{\eta} = 0$ (corresponding to a Boltzmann sampling algorithm), and generate a short Markov chain. Generally this will sample only a small fraction of the macrostates we are interested in. In the sampled region we can make inferences about the underlying sampled distribution from the data. We then use these inferences to generate a new sampled distribution, which will be approximately multicanonical in the sampled region, while outside we increase the sampling probability so that the next data histogram will be a little wider. Then we repeat the process, hopefully getting closer and closer to the multicanonical distribution.

Let us formalise this a little. We wish to find $\boldsymbol{\eta}^{xc}$ approximating to the ideal set of coefficients for the $N_m$ macrostates of a system. We shall denote the iteration number by a superscript $n$ and the $i$th macrostate by a subscript $i$ ($i = 1, \ldots, N_m$), e.g. $C^n_i$ for the $n$th histogram of visited states (very few expressions contain anything raised to a power, so this is seldom ambiguous). The macrostates could be of energy or magnetisation. We are going to make inferences about $\mathbf{P}^n$, the unknown true macrostate probabilities in the $n$th sampled distribution generated by $\boldsymbol{\eta}^n$, on the basis of data gathered by a Monte Carlo (MC) procedure constructed to sample from $\mathbf{P}^n$. The data do not determine $\mathbf{P}^n$ exactly, because of the effect of noise and because, at least at first, many states are not sampled at all. The best way to try to treat this problem, which implicitly handles the problem of distinguishing the signal (due to $\mathbf{P}^n$) from the noise, is to use Bayesian probability theory, where probabilities describe our state of knowledge about quantities, so that constant but unknown parameters like $\mathbf{P}^n$ may be assigned probabilities. In this case what we obtain from the data is $P(\mathbf{P}^n)$, a probability density function (pdf) of $\mathbf{P}^n$. In the frequentist interpretation of probability theory, where probabilities have meaning only in so far as they express the expected frequency in a long series of trials, an expression like $P(\mathbf{P}^n)$ is not admissible. However, it is now fairly well established that the Bayesian and frequentist formulations make almost exactly the same predictions where both are applicable, while there are many situations (and, as we shall see, this is one of them) where the Bayesian interpretation is the more powerful.

To return to the main thrust of the argument: $P(\mathbf{P}^n)$ is determined by the data according to Bayes' theorem, which is, after the $n$th set of data has been gathered,

$$P(\mathbf{P}^n \mid H, D^n \ldots D^1) = \frac{P(\mathbf{P}^n \mid H, D^{n-1} \ldots D^1)\, P(D^n \mid \mathbf{P}^n, H, D^{n-1} \ldots D^1)}{\int_R P(\mathbf{P}^n \mid H, D^{n-1} \ldots D^1)\, P(D^n \mid \mathbf{P}^n, H, D^{n-1} \ldots D^1)\, d^{N_m}\mathbf{P}^n}$$

Here:

$H$ represents the knowledge, as expressed by equation and its magnetic analogue, of how $\boldsymbol{\eta}^n$ is related to $\mathbf{P}^n$.

$D^n$ represents the data, which consist of either the visited states or recorded transitions of the Markov chain, and the set $\boldsymbol{\eta}^n$ that produced them.

$P(\mathbf{P}^n \mid H, D^{n-1} \ldots D^1)$ is the prior probability distribution of $\mathbf{P}^n$ before the data $D^n$ have been considered.

$P(D^n \mid \mathbf{P}^n, H, D^{n-1} \ldots D^1)$ is the likelihood function: the probability that the observed data are produced, given that a particular set of values of the parameters $\mathbf{P}^n$ holds. To calculate the likelihood we must generally assume a model.

$P(\mathbf{P}^n \mid H, D^n \ldots D^1)$ is the posterior probability distribution of the parameters $\mathbf{P}^n$, including the effect of the data $D^n$.

From the posterior pdf we generate estimators $\tilde{\mathbf{P}}^n$ of the true $\mathbf{P}^n$. The mean, though it may not in practice be calculable, is one obvious possible estimator; others are the mode and median. The width of the pdf gives us a measure of the uncertainty in the estimator. We expect this width to be of the order of $\sqrt{C_i}$, because the stochastic nature of MC sampling produces fluctuations in the histogram $C^n_i$ of size $O(\sqrt{C_i})$, which cannot be distinguished from fluctuations of the same size due to the true structure of the sampled distribution.

We then use the estimator, however defined, to generate the next sampled distribution. The obvious way to do this (though, as we shall see, there may be better alternatives) is to put

$$\eta^{n+1}_i = \eta^n_i - \ln \tilde{P}^n_i + k$$

where $k$ is an arbitrary constant, which we choose so that the minimum of $\boldsymbol{\eta}^{n+1}$ is zero. This corresponds to sampling from a distribution $\mathbf{P}^{n+1}$ obtained by setting

$$P^{n+1}_i \propto P^n_i / \tilde{P}^n_i$$

$\mathbf{P}^{n+1}$ would indeed be exactly multicanonical if $\tilde{\mathbf{P}}^n = \mathbf{P}^n$. In practice we never reach this situation because of the random errors in the measurements, but we can expect $\mathbf{P}^{n+1}$ to be closer to the multicanonical distribution than was $\mathbf{P}^n$; it is shown in that this algorithm converges almost surely.
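A single step of this iteration can be sketched as follows. The sketch assumes the update $\eta^{n+1}_i = \eta^n_i - \ln \tilde{P}^n_i + k$ with $k$ fixed by $\min_i \eta^{n+1}_i = 0$, and a sampled distribution $P_i \propto P^{can}_i \exp(\eta_i)$. The function names and the five-state canonical distribution are hypothetical, and the estimator is taken to be exact (noise-free) purely to show that one step then gives a flat distribution.

```python
import math

def update_eta(eta, p_est):
    """One iteration eta^{n+1}_i = eta^n_i - ln(p_est_i) + k, with k chosen
    so that min(eta^{n+1}) = 0. Requires every p_est_i > 0."""
    raw = [h - math.log(p) for h, p in zip(eta, p_est)]
    k = -min(raw)
    return [r + k for r in raw]

def sampled_distribution(p_can, eta):
    """Normalised P_i proportional to p_can_i * exp(eta_i)."""
    w = [p * math.exp(h) for p, h in zip(p_can, eta)]
    total = sum(w)
    return [x / total for x in w]

# Hypothetical canonical macrostate probabilities spanning three decades.
p_can = [0.70, 0.25, 0.04, 0.009, 0.001]
eta0 = [0.0] * len(p_can)                 # eta = 0: Boltzmann sampling
p0 = sampled_distribution(p_can, eta0)
eta1 = update_eta(eta0, p0)               # exact estimator, for illustration
p1 = sampled_distribution(p_can, eta1)    # flat: every entry equals 1/5
```

With a noisy estimator obtained from a finite histogram, `p1` would be only approximately flat, which is why the step is iterated.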

The reader may be wondering why we have written Bayes' theorem for the sampled distribution $\mathbf{P}^n$ when $\mathbf{P}^{can}$ is our real interest. We do this because it is $\mathbf{P}^n$ that determines the data $D^n$, so that the likelihood is naturally expressed as a function of $\mathbf{P}^n$, as we shall see in section. It is, of course, possible to write an expression relating $P(\mathbf{P}^{can})$ to $P(\mathbf{P}^n)$: just as in one dimension we would write

$$P(x) = P(y(x))\, |dy/dx|$$

so here we can write

$$P(\mathbf{P}^{can} \mid H, D^n \ldots D^1) = P(\mathbf{P}^n \mid H, D^n \ldots D^1)\; \mathrm{mod}\left| \frac{\partial P^n_i}{\partial P^{can}_j} \right|$$

where

$$P^n_i(\mathbf{P}^{can}) = \frac{P^{can}_i \exp(\eta^n_i)}{\sum_{k=1}^{N_m} P^{can}_k \exp(\eta^n_k)}$$

However, the fact that all the $P^{can}_k$ enter the expression for each $P^n_i$ (most of them in the denominator) makes the transformation algebraically very complex, even if we make simplifying approximations like using a uniform prior and a simple model of the likelihood function (for example, one neglecting correlations). Indeed, given that $N_m$, the dimensionality of the problem, usually is at least O, the RHS of equation cannot be handled either analytically or numerically, and so the transformation cannot be performed exactly (though see section below). Certainly it is impossible to integrate the function to find expectation values of $\mathbf{P}^{can}$ etc. Instead we are forced to make our inferences about the sampled distribution $\mathbf{P}^n$, obtaining estimators $\tilde{\mathbf{P}}^n$, and then to use

$$\tilde{P}^{can}_i = \frac{\tilde{P}^n_i \exp(-\eta^n_i)}{\sum_{k=1}^{N_m} \tilde{P}^n_k \exp(-\eta^n_k)}$$

which is the $N_m$-dimensional analogue of using $x(\langle y \rangle)$ as an estimator of $\langle x \rangle$.

Moreover, the same difficulty in making a transformation from one set of $P$s to another affects us even if we confine ourselves to inferences about $\mathbf{P}^n$. On the first iteration ($n = 1$) we may have no information about the sampled distribution (which is $\mathbf{P}^{can}$ in this case). Therefore it is appropriate to choose a uniform prior, $P(\mathbf{P}^{can} \mid H) = $ constant, in which case the posterior will depend only on the likelihood. On every subsequent iteration, however, we have prior information about $\mathbf{P}^n$, which comes from the posterior of the previous sampled distribution $\mathbf{P}^{n-1}$. But to get between $P(\mathbf{P}^{n-1})$ and $P(\mathbf{P}^n)$ involves just the same transformation as equation: in terms of the variables $\mathbf{P}^{n-1}$ and $\mathbf{P}^n$ it is

$$P(\mathbf{P}^n \mid H, D^{n-1} \ldots D^1) = P(\mathbf{P}^{n-1} \mid H, D^{n-1} \ldots D^1)\; \mathrm{mod}\left| \frac{\partial P^{n-1}_i}{\partial P^n_j} \right|$$

where

$$P^{n-1}_i(\mathbf{P}^n) = \frac{P^n_i\, \tilde{P}^{n-1}_i}{\sum_{k=1}^{N_m} P^n_k\, \tilde{P}^{n-1}_k}$$

Once again the algebraic complexity of this expression prevents its being calculated, and we seem to find ourselves forced to use a uniform prior on $\mathbf{P}^n$ at each stage. We thereby discard much of the information from the previous iterations, which influence the current $P(\mathbf{P}^n)$ only by the choice of $\mathbf{P}^n$ itself, through $\boldsymbol{\eta}^n$. With a uniform prior at each stage, Bayes' theorem as given in equation reduces to

$$P(\mathbf{P}^n \mid H, D^n) \propto P(D^n \mid H, \mathbf{P}^n)$$

Note that no approximation is involved in rewriting the likelihood function just as $P(D^n \mid H, \mathbf{P}^n)$: $D^{n-1} \ldots D^1$ are irrelevant to $D^n$ given $\mathbf{P}^n$.

What disadvantages result from having to dispense with the informative prior? At least initially ($n$ small) the posterior changes rapidly with $n$, and the new likelihood will be much narrower than the prior over most of the macrostate space, as we start to sample regions we previously had to guess about. In this case it makes little difference to approximate the prior as uniform and base the inference only on the likelihood function. However, as we converge towards the multicanonical distribution, the sampled distribution changes little between iterations. Thus, if we keep $N_c$ a constant, the prior is as narrow as or narrower than the likelihood, so we throw away a lot of information by disregarding it, and convergence stops once the difference between the sampled distribution and the true multicanonical one has come down to the order of the random fluctuations in the likelihood which are inevitably incurred in the simulation process. Indeed, if $N_c$ is too small, a large fluctuation may throw us a long way away from $\mathbf{P}^{xc}$. Recently an ad hoc refinement of this method has been proposed which uses the histograms of all previous iterations, each contributing with a weight inversely dependent on the size of local fluctuations. The easiest way to skirt this problem, though, is by increasing the length $N^n_c$ of the Markov chain generated at each iteration of the convergence process: at first in order to compensate for the increasing number of sampled macrostates, and later to keep smooth convergence and to minimise the effect of the waste of previous information. The eventual move to the production run can be seen as the final limit of this. On the other hand, it is clearly true that increasing $N_c$ a lot when we are still some way from a multicanonical distribution wastes computer time that we would like to devote to the production run. There is thus a scheduling problem of deciding on a suitable initial $N_c$, deciding when and by how much to increase it later, and finally deciding on when to move to the production run. This is usually found to need some initial experimenting, and even when a scheme is found that does seem to converge smoothly, a certain amount of human monitoring is required, though quite a lot of progress has been made on automating the procedure.

By using the Bayesian framework we have started to set up above, we have made some significant progress in incorporating prior information to stabilise the algorithm and in putting the ad hoc finding methods that are employed on a firmer footing. This will be described in sections and in the context of inferences made using the observed visited macrostates of the chain as data. However, with this visited-states method it may be the case that the finding stage inescapably gets increasingly lengthy, for example for large system sizes, so that it consumes a large part of the total available computer time. This problem can only really be alleviated by the choice of a more efficient finding algorithm, and we have also contributed here through the development of a method, to our knowledge new, that converges very rapidly to something close to the multicanonical distribution by making inferences based on the observed transitions between macrostates made by the system (section).

Before embarking on a description of these techniques, we shall return briefly to the question of whether the division between finding and using the $\eta$s is really necessary. This somewhat inelegant strategy is in fact forced upon us by a failure of $\tilde{\mathbf{P}}^n$ to be a good estimator of $\mathbf{P}^n$ outside the sampled region, and by the algebraic difficulties we have in handling expressions like equations or. If these expressions were tractable and could be integrated, then the whole simulation would become a single continuous process of narrowing $P(\mathbf{P}^{can})$. Then we would finally need some way of transforming the pdf of $\mathbf{P}^{can}$ into a pdf of the required estimators $\langle O \rangle = \sum_i O_i P^{can}_i$, from which we could find a mean and standard error (the standard error is particularly problematic, as it requires treatment of the correlations between the estimates of the components of $\mathbf{P}^{can}$). The methods we describe in section go some way towards a solution of the first problem, but we do not really tackle the second, though other recent work has addressed closely related matters. Significant further work is required before the division between the finding stage and the production run can be removed.

Methods Using Visited States

In this section we will consider inferences made from what is perhaps the obvious choice of data: the visited macrostates of the Markov chain. We shall thus call this the visited states (or VS) method. Suppose that $N_c$ configurations are sampled from the Markov chain, with sampling occurring at wide intervals so that successive configurations are effectively uncorrelated. The likelihood function for the data (the histogram of visited states $C^n_i$) will then be multinomial (multivariate binomial) with $N_m - 1$ independent variables $P^n_i$. The $N_m$th is determined by the normalisation. If we keep the assumption of a uniform prior, then by Bayes' theorem in the form of equation

$$P(\mathbf{P}^n \mid H, D^n \ldots D^1) = \frac{\prod_{i=1}^{N_m - 1} (P^n_i)^{C^n_i}\, \bigl(1 - \sum_{k=1}^{N_m - 1} P^n_k\bigr)^{C^n_{N_m}}}{\int_R d^{N_m - 1}\mathbf{P}^n\, \prod_{i=1}^{N_m - 1} (P^n_i)^{C^n_i}\, \bigl(1 - \sum_{k=1}^{N_m - 1} P^n_k\bigr)^{C^n_{N_m}}}$$

subject to $\sum_{i=1}^{N_m} C^n_i = N_c$, and where the domain of integration $R$ is such that $\sum_{i=1}^{N_m - 1} P^n_i \le 1$. The combinatoric factors that would be present in the multinomial likelihood have disappeared in the normalisation.

Several possible estimators of $\mathbf{P}^n$ can be used. The simplest is the maximum likelihood estimator (MLE), the set of values of $\mathbf{P}^n$ most likely to have produced the given data. It is defined by the equations

$$\left. \frac{\partial P(D^n \mid \mathbf{P}^n)}{\partial P^n_i} \right|_{\tilde{\mathbf{P}}^n_{MLE}} = 0$$

These are easily solved for the multinomial distribution using a Lagrange multiplier for the normalisation; the result is the intuitively obvious one, $\tilde{\mathbf{P}}^n_{MLE} = \mathbf{C}^n / N_c$, leading to

$$\boldsymbol{\eta}^{n+1} = \boldsymbol{\eta}^n - \ln \mathbf{C}^n + k$$

This updating scheme is used in; the frequentist interpretation of probability used in these references leads naturally to maximum likelihood estimators: the unknown parameters can only be considered as having one true value, which is most naturally taken to be the one which is most likely given the data. Clearly the scheme does not work where $C^n_i = 0$, which happens extremely frequently: in the early iterations there are many macrostates that the Boltzmann sampling algorithm does not visit. For these states it would imply $\eta^{n+1}_i = \infty$. We could try to fix this by defining a norm $\| \boldsymbol{\eta}^{n+1} - \boldsymbol{\eta}^n \|$ and setting a bound on the magnitude that it may have (suggested in), or by the method adopted by Lee, of leaving $\eta_i$ unaltered (except for the arbitrary constant $k$) if $C_i = 0$, i.e. behaving as if $C_i = 1$ in this case.

However with the multinomial approximation for the likelihoo d and using Bayesian infer

ence it is p ossible to do b etter than this we can evaluate the exp ectation values

n

n

P P

AV

R

Q P

n

n

N N

n

C

m m

N n C n n

m

N

i

m

d P P P P

i j

k

i k

R

n

P

R

P Q

n

AV j

n

N N

C

n

m m

C

n n

N

N

m

m i

d P P P

i

k k i

R

C N N

j c m

n

n n n

which leads to lnC k we note that for C this gives the same

i i i

i

up dating scheme as was introduced arbitrarily by Lee For other states it gives slightly dierent

n

estimators but the dierence is negligible if C is large The eect of this up dating whether

i

using AV or appropriately xed MLE estimators is to decrease the probability in the sampled

region by a factor of C and thus since the probabilities must add to one to increase it by a

i

n

uniform factor in the nonsampled region However the true P almost certainly decreases as

least for a while as we move away from the sampled region though we do not know purely from

n

C whether it decrease monotonically or has other lo cal maxima elsewhere The extent of this

decrease is generally many orders of magnitude so in the nonsampled region the true value of

n n

n n

P is much lower than the estimate P except at the very edge of C since P eectively

AV j AV j

assigns one count to nonsampled states The result of this is that on the next iteration the

CHAPTER MULTICANONICAL AND RELATED METHODS

sampled region widens slightly, becoming roughly multicanonical (flat) over that part that was sampled before and extending a little further into the wings. The $(n+1)$th histogram tends to have large fluctuations in the states that were at the edge of the $n$th, because the poor statistics at the edges of $C^n$ tend to produce an $\eta^{n+1}$ which is inaccurate here. However, these get smoothed away on subsequent iterations. The convergence is fairly smooth as long as we increase $N_c$ to compensate for the increasing number of sampled macrostates.
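The updating rule discussed above can be sketched in a few lines. This is only an illustrative sketch, assuming the reconstructed form $\eta_i^{n+1} = \eta_i^n - \ln(C_i^n + 1) + k$ with $k$ chosen so that the smallest weight is zero; the function name `update_weights` and the toy histogram are hypothetical and are not taken from the thesis.

```python
import numpy as np

def update_weights(eta, counts):
    """One visited-states update: eta_new = eta - ln(C + 1) + k.

    The constant k is fixed (an assumption of this sketch) so that
    min(eta_new) = 0. Unvisited states (C = 0) change only through k.
    """
    eta = np.asarray(eta, dtype=float)
    counts = np.asarray(counts, dtype=float)
    eta_new = eta - np.log(counts + 1.0)
    return eta_new - eta_new.min()

# Toy histogram: two sampled macrostates followed by three unsampled ones.
eta = np.zeros(5)
counts = np.array([99, 9, 0, 0, 0])
new_eta = update_weights(eta, counts)
```

In this toy case all unsampled states receive the same weight, which is the uniform rise in the nonsampled region described in the text.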

We should contrast this behaviour with that expected if the true value of $P^n$ in the wings is larger than the estimate. We have found that this is likely to happen if, despite having very little evidence about $P^n$ far away from the sampled region, we attempt to get faster convergence by fitting some function to the part of $\eta^{n+1}$ that comes from the sampled region and extrapolating it. In that case $P^{n+1}$ may put a great deal, possibly almost all, of its weight in the wings, and $C^{n+1}$ may become separated from $C^n$. Convergence then becomes irregular and awkward, with the latest $\eta$ frequently needing to be discarded and a return made to earlier ones (though linear extrapolations seem to be relatively safe in this regard). We shall discuss this further at the start of the section on methods using transitions.

Incorporating Prior Information

Let us return to the idea of incorporating the prior pdf. As we discussed above, using a uniform prior for $P^n$ at each stage does not accurately reflect the confidence we have in $P^n$ as a result of the earlier iterations; indeed, we do not have any real measure of the error in $P^n$ or $\eta^n$. We could evaluate the variance of $P^n$ using an expression similar to equation , but since this implicitly neglects correlations, which affect the variance much more than they do the mean, it is unlikely to be accurate.

We have tried to build in at least a measure of the ideal iterative scheme, in which $P(P^{can})$ is narrowed continually. To do this we have avoided the difficulties of changing variables (see equation et seq.) by using a different, data-driven but still Bayesian, strategy to estimate the pdf of the transformed variable of interest.

Rather than looking at $P(P^n)$ or even $P(P^{can})$, let us consider $P(\xi)$ (we shall justify this choice below), where $\xi$, introduced in section , is that set of preweighting coefficients that would give an exactly multicanonical distribution:

$$P_i \propto \exp(-\beta F_i + \xi_i) = \mathrm{constant}$$

i.e.

$$\xi = \beta F + \mathrm{constant} = -\ln P^{can} + \mathrm{constant}$$

We do not use the notation $\eta^{xc}$, since this also embraces all the nearly multicanonical distributions.

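Bayes' theorem now becomes

$$P(\xi \mid H, D^1 \ldots D^n) = \frac{P(\xi \mid H, D^1 \ldots D^{n-1}) \, P(D^n \mid \xi, H, D^1 \ldots D^{n-1})}{\int P(\xi \mid H, D^1 \ldots D^{n-1}) \, P(D^n \mid \xi, H, D^1 \ldots D^{n-1}) \, d^{N_m}\xi}$$

We now choose to model the pdf of $\xi$ as a multivariate Normal distribution with the covariance terms set to zero (see footnote 1), i.e. after the collection of data $D^n$,

$$P(\xi \mid H, D^1 \ldots D^n) \propto \prod_i \exp\left(-\frac{(\xi_i - \eta_i^{n+1})^2}{2(\sigma_i^{n+1})^2}\right)$$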

It is our desire to use a Normal model that has led us, on phenomenological grounds, to consider $\xi$ rather than $P^{can}$: the latter will have an asymmetric pdf, because of the constraint $0 \le P^{can}_i \le 1 \; \forall i$, which will not be well approximated by a Gaussian. The same applies to $P^n$. We expect that $P(\xi)$ will be better behaved. We have $n+1$ superscripts on the parameters $\eta^{n+1}$ and $\sigma^{n+1}$ because $\eta^{n+1}$, being the mean of $P(\xi \mid H, D^1 \ldots D^n)$, is the set of weight factors that we shall use to generate the $(n+1)$th sampled distribution.

This parameterisation takes care of the prior and posterior in equation ; however, the likelihood function still remains. We cannot transform a multinomial likelihood, expressed naturally in terms of $P^n$, without encountering the difficulties of equation . To proceed, then, we estimate the pdf of $\xi$ by jackknife blocking the data histogram at each iteration, finding the expectation value of $P^n$ for each block, then transforming these expectation values to give a series of estimators of the variable of interest $\xi$, whose distribution outlines the shape of $P(\xi)$.

1 This is an ad hoc choice, made to render the problem computationally more tractable; it will cause us some difficulties below.

To see in detail how this works, we first note that

$$P(D^n \mid \xi, H, D^1 \ldots D^{n-1}) = P(D^n \mid \xi, H)$$

because, as we stated when discussing equation , $D^1 \ldots D^{n-1}$ are irrelevant to $D^n$ given $\xi$. Thus we can apply Bayes' theorem once more:

$$P(\xi \mid H, D^n) = \frac{P(D^n \mid \xi, H) \, P(\xi \mid H)}{P(D^n \mid H)}$$

Evidently we may take $P(\xi \mid H)$ to be uniform, which gives

$$P(D^n \mid \xi, H) \propto P(\xi \mid H, D^n)$$

We model $P(\xi \mid H, D^n)$ by another Normal distribution, parameterised by $\hat\eta^n$ and $\hat\sigma^n$. We estimate the parameters simply by jackknife blocking the recorded histogram into a small number of pieces, $m = 1 \ldots N_J$, generating a $\hat\eta^{nm}$ from each block and measuring their mean and variance. Thus we avoid the change-of-variable problem.

Putting all this into equation we arrive at

$$\exp\left(-\frac{(\xi_i - \eta_i^{n+1})^2}{2(\sigma_i^{n+1})^2}\right) \propto \exp\left(-\frac{(\xi_i - \eta_i^{n})^2}{2(\sigma_i^{n})^2}\right) \exp\left(-\frac{(\xi_i - \hat\eta_i^{n})^2}{2(\hat\sigma_i^{n})^2}\right)$$

which implies

$$\frac{1}{(\sigma_i^{n+1})^2} = \frac{1}{(\sigma_i^{n})^2} + \frac{1}{(\hat\sigma_i^{n})^2}$$

and

$$\frac{\eta_i^{n+1}}{(\sigma_i^{n+1})^2} = \frac{\eta_i^{n}}{(\sigma_i^{n})^2} + \frac{\hat\eta_i^{n}}{(\hat\sigma_i^{n})^2}$$

Thus $\eta^{n+1}$ is an average of its previous value $\eta^n$ and the simple estimate of its new value $\hat\eta^n$ obtained from equation , with the two combined according to their estimated variance. The variance itself is always reduced, as we would expect since we are adding new information. This is a more systematic way of trying to build in the results of previous iterations than one based simply on the magnitude of $C_i^n$, as is given in , and it is likely to be more accurate because the effect of correlations in the sampling process is implicitly included. Note moreover that the basic idea (that of creating a Monte Carlo sample from the unknown pdf) is not bound to any particular algorithm for sampling $P^n$ or to any model for the likelihood, so we can apply it to other methods too.
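The combination of the previous weights with the new estimate is the standard product of two independent Normal densities, state by state. The sketch below illustrates that rule only; the names `combine_normal`, `eta_prior`, `sigma_prior`, `eta_hat` and `sigma_hat` are hypothetical labels for the prior and data-derived means and standard deviations, and the smoothing and edge-handling caveats described in the text are deliberately left out.

```python
import numpy as np

def combine_normal(eta_prior, sigma_prior, eta_hat, sigma_hat):
    """Combine two independent Normal estimates of each weight.

    Precisions (inverse variances) add, and the new mean is the
    precision-weighted average of the two means.
    """
    prec_prior = 1.0 / np.asarray(sigma_prior, dtype=float) ** 2
    prec_new = 1.0 / np.asarray(sigma_hat, dtype=float) ** 2
    prec_post = prec_prior + prec_new
    eta_post = (np.asarray(eta_prior) * prec_prior
                + np.asarray(eta_hat) * prec_new) / prec_post
    sigma_post = 1.0 / np.sqrt(prec_post)
    return eta_post, sigma_post

# Equal variances: the result is the plain average and the error shrinks.
eta_a, sigma_a = combine_normal(np.array([0.0]), np.array([1.0]),
                                np.array([2.0]), np.array([1.0]))

# A very broad prior: the result follows the new estimate almost exactly.
eta_b, sigma_b = combine_normal(np.array([0.0]), np.array([1e6]),
                                np.array([2.0]), np.array([0.5]))
```

The second call mirrors the bootstrap step in which the initial variance is taken to be large, so that the first update depends essentially only on the first data set.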

Having said this, we have also found that there are several caveats attached to its use in practice, as we shall now describe. First, in order to bootstrap the technique, we assume $\sigma^1$ is large so that $\eta^2$ depends only on $D^1$. We also have to be careful with our policy of having the normalisation $\min \eta = 0$. This normalisation should not be applied to each $\hat\eta^{nm}$ individually, since if $\min \hat\eta^{nm}$ falls at the same macrostate each time we would otherwise estimate a variance of zero for this state. However, it is best to set $\min \hat\eta^n$ to zero before combining it with $\eta^n$, so that over those regions that both sampled the two $\eta$s will be approximately equal. This reduces the effect of fluctuations in $\hat\sigma^n$ (see below).

Moreover, the estimate of $\hat\sigma^n$ is itself subject to fairly large random errors, which need careful treatment or they will spoil the estimator $\eta^{n+1}$. Suppose that over some range of states $\eta^n$ and $\hat\eta^n$ are separated fairly widely. Then, as we see from equation , random fluctuations in $\hat\sigma^n$ (and indeed in $\sigma^n$) will pull the estimator $\eta^{n+1}$ back and forth between $\eta^n$ and $\hat\eta^n$. We can thus end up with an $\eta^{n+1}$ that is far less smooth than either $\eta^n$ or $\hat\eta^n$. This is presumably a consequence of neglecting the covariance terms in the Normal model of the pdf of $\xi$; including them would serve to force some smoothness on the function as a whole. However, to do this would make the updating process much more complicated and time-consuming, because equations and would become matrix equations involving matrix inversion.

Therefore, in practice we have adopted an ad hoc solution to the problem. First, we smooth $\hat\sigma^n$ by locally averaging it. This is found to alleviate the problem in the region where both the $(n-1)$th and $n$th iterations produced counts. However, we must also treat those regions where neither iteration produced counts, and where the $(n-1)$th iteration did not but the $n$th did. At first we tried simply assuming some arbitrary large variance in unsampled states (given that we adopt the technique of averaging the $\hat\eta^{nm}$ without setting the minimum to zero, we produce an estimated $\hat\sigma^n_i$ of zero in the unsampled states, which clearly needs some fixing). This was found to work well in the newly sampled region, where the $(n-1)$th iteration does not produce counts but the $n$th does, producing an $\eta^{n+1}$ which depends almost entirely on $\hat\eta^n$, as we would desire. However, in the region which is still unsampled, this scheme leads to $\eta^{n+1} \approx (\eta^n + \hat\eta^n)/2$, since $\sigma^n \approx \hat\sigma^n$ there. We thus found that $\eta^{n+1}_i$ tended to increase up to the edge of $C^n$ and then fall back to a lower value again. This severely slowed down the desired spreading of the sampled region. The problem was solved by recording the location of the edge of $C^{n-1}$ at each stage and putting $\eta^{n+1} = \hat\eta^n$ in both the newly sampled and unsampled regions, i.e. anywhere past the edge of $C^{n-1}$.

Although using the average over $m$ of $\hat\eta^{nm}$ as $\hat\eta^n$ gives perfectly adequate convergence, it is found that, with the same total time devoted to each iteration, it spreads into the nonsampled region a little more slowly than does the naive scheme of simply using a uniform prior at each stage and updating with equations and alone. This is because

min C is a larger fraction of the total counts in one jackknifed histogram than it is of

i

nm

the total counts in the single histogram of the nave metho d C contains N N

J AV

n

N N The maximum change in that may o ccur is counts while C contains N

J AV

AV

therefore a little larger in the nave case To achieve the same spreading rate we could sp end

n

n

N N more time on the metho d with prior but in fact we choose to use

J J

which is the estimator defined on all the pooled data from the $n$th iteration and thus identical to the naive estimator. The effect of the use of the prior is thus only on the way that $\hat\eta^n$ is combined with previous estimators of the weighting function.

To summarise, then, the method as we have applied it is:

1. Start with $\eta^1 = 0$ and estimate $\sigma^1$ to be some large constant.

2. Record $m = 1 \ldots N_J$ histograms $C_i^{nm}$.

3. For each one set $\hat\eta_i^{nm} = \eta_i^n - \ln(C_i^{nm} + 1)$.

4. Calculate $\hat\sigma^n$ from the $\hat\eta^{nm}$.

5. Calculate $\hat\eta^n$, defined on all the data from the $n$th iteration.

6. Calculate $\eta^{n+1}$ and $\sigma^{n+1}$ using equations and , with the caveats mentioned above.

7. Use $\hat\eta^n$ alone only in the regions where only iteration $n$ samples or where no iteration has yet sampled.

8. If the distribution does not extend over all the macrostates of interest then return to step 2; otherwise stop.
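The block-statistics part of the scheme summarised above (recording several sub-histograms, forming an estimate from each, and measuring their spread) might be sketched as follows. This is a hedged illustration rather than the thesis's implementation: the function name `block_estimates` is hypothetical, the error is taken (as an assumption) to be the standard error of the block estimates, and the local smoothing of the error estimate is omitted.

```python
import numpy as np

def block_estimates(eta, block_counts):
    """Estimate eta_hat and its error from N_J sub-histograms.

    block_counts has shape (N_J, N_m): one row per block, one column
    per macrostate. Block estimates are NOT normalised individually,
    following the caveat in the text.
    """
    eta = np.asarray(eta, dtype=float)
    block_counts = np.asarray(block_counts, dtype=float)
    n_blocks = block_counts.shape[0]

    # One estimate of the weights per block.
    eta_blocks = eta - np.log(block_counts + 1.0)
    # Assumption: use the standard error of the block estimates.
    sigma_hat = eta_blocks.std(axis=0, ddof=1) / np.sqrt(n_blocks)

    # Estimator defined on all the pooled data, normalised to min = 0.
    pooled = block_counts.sum(axis=0)
    eta_hat = eta - np.log(pooled + 1.0)
    eta_hat -= eta_hat.min()

    sampled = pooled > 0  # macrostates visited in this iteration
    return eta_hat, sigma_hat, sampled

eta = np.zeros(4)
blocks = np.array([[10, 4, 0, 0],
                   [12, 2, 0, 0],
                   [8, 3, 0, 0]])
eta_hat, sigma_hat, sampled = block_estimates(eta, blocks)
```

Note that the estimated error comes out as exactly zero in the unsampled states, which is the behaviour the text says needs fixing; in this sketch the `sampled` mask is returned so that a caller can treat those states separately.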

To illustrate this iterative scheme we shall examine the energy preweighting of the L

and L Ising mo del starting from a canonical simulation at We wish to extend


the sampled distribution up to the states characteristic of to use the metho d of section

to nd G This is only for purp oses of demonstration since with inverse temp erature

the ground state already has signicant weight for these small systems indeed is the

most probable macrostate for L so G could b e found directly

To b egin with let us consider the L system in the case where N histograms are

J

d

gathered at each iteration and we choose N such that N where N N L the

c AV AV c

number of counts exp ected p er subhistogram p er energy state in the multicanonical distribu

tion sampling once p er lattice sweep In gure we show the rst three histograms in fact

the average over the six pro duced at each stage and the weighting functions that they

pro duce

In the gures we plot the estimates of pro duced at the end of every iteration lab elled

with their iteration number We show only the b ottom half of the energy range up to the state

can

2

E E At higher energies we use E E which corresp onds to the mean of P

L

can

E but E which ensures that the multicanonical distribution matches the shap e of P

do es not extend to very high energy states which have signicant canonical weight only for

antiferromagnetic spinspin interactions

As we see the rst histogram extends up to E is quite well determined imme

diately over most of this range there is very little dierence b etween and subsequent s

It b ecomes rather less accurate around E b ecause very few visits to these states were

recorded so the fractional uctuations are larger and at higher energies is changed from its

original value only by the constant we add to keep The second histogram is then

min

roughly at as we exp ect for a multicanonical sampled distribution up to ab out E then

it falls to zero at E There are some uctuations around E with some states

having appreciably more than their multicanonical probability due to the uctuations in the

tail of the previous histogram having passed into the weighting function As we might exp ect

therefore extends up to E b efore going at it is now extremely close to the true

for E where the second histogram obtained go o d statistics in all states and

fairly close it up to E though once again with larger uctuations where few counts were

recorded The third histogram C is then more or less at up to E the p oint where C

cut o and then extends up to E but again the last few states have p o or statistics so

is not welldetermined for them The pro cess can clearly b e rep eated extending to higher and

n

higher energies until has stopp ed changing apart from small random uctuations which

Figure: Initial convergence. The first three iterations of the weighting function $\eta^n$ using the visited-states method for the Ising model. The upper panel shows the histograms $C$ (histograms 1 to 3) produced at each stage, plotted against $E$, and the lower panel the resulting $\eta$s ($\eta^2$, $\eta^3$, $\eta^4$), also against $E$.

is the p oint where we would move to a longer run with a xed in the ndingpro duction

scheme

We show the later progress of the iterative scheme iterations and in gure

It is apparent that at least on the scale of the whole gure convergence has o ccurred by

xc

the fth iteration we have pro duced a usable Examination of the inset shows that the

uctuations in are small after this though it is not clear that convergence continues

Now let us examine how using the Normal mo del of with an informative prior compares

n

with simply using a uniform prior at each iteration up dating using equation alone In

gure we show the results of such a nave run We used N at each stage so that AV

Figure: Main figure: convergence of the weighting function $\eta^n$ (plotted against $E$, curves labelled by iteration number) using the VS method for the Ising model, with several blocks per iteration. Inset: detail of the region $-50 \le E \le -45$.

the same total time would b e sp ent on each iteration as was the case b efore

It is apparent from the main gures that there is nothing to choose b etween the two metho ds

as regards their sp eed of convergencethe sp eed with which they move into the unsampled

region this is as we would exp ect b ecause the tweaks we have given the informativeprior

metho d have rendered it almost identical to the nave metho d over for these states The

dierence only b egins to b ecome apparent when examining the insets If a uniform prior is

used uctuations in in the sampled region are larger and p ersist rather than dying away

as they do in gure This is shown more clearly in gure where we have plotted the

n

dierence b etween and for the E macrostate Thus our metho d of incorp orating

f inal

prior information yields improvements in the the smo othness of convergence and go es part of

the way to removing the necessity of switching to a pro duction run since the error in

b ecomes continually smaller even though we continue up dating it implying a convergence of our

knowledge of F We should note that the normal mo del is only accurate when we are already

close to the multicanonical limitit dramatically underestimates in the early iterations

nm

b ecause the uctuations in are reduced in size by the up dating using lnC and do

i

in the nonsampled region though this is unimp ortant for not reect the real uncertainty in

Figure: Main figure: convergence of the weighting function $\eta^n$ (plotted against $E$, curves labelled by iteration number) using the VS method for the Ising model, with one block per iteration and a uniform prior used in updating. Inset: detail of the region $-50 \le E \le -45$.

up dating since we do not make use of in this region

A similar pattern emerges in the L case The sampled distribution widens gradually and

fairly smo othly extending to higher and higher energies until the multicanonical distribution

is reached Once again we present results for iterative schemes using b oth a normal mo del for

n

gure and a uniform prior gure In the latter case uctuations in p ersist

but if the prior is incorp orated they die away once we are close to the multicanonical limit

just as in the L case This is shown by the insets in gures and and more

clearly by the approach of E to its nal value plotted in gure However the

most serious disadvantage of the visitedstates metho d its slow convergence for all save very

small systems is now b ecoming apparent The most that any can change by in any one

i

n

n

iteration is max lnmax C thus the greatest p ossible change is

max i i i

i

i

lnN and more typically ln N This is not a large change considering

max c max AV

d

that generally scales like L and the metho d b ecomes tedious even for simple systems like

the Ising mo del when the range of to b e covered is The use of an informative prior

n

do es not help b ecause the problem is the extension of into the unsampled region where

we do not have any prior information to incorp orate We see that with L we require

Figure: $\eta^n$ for a single $E$ macrostate against iteration number $n$, using the VS method for the Ising model, with $N_{AV} = 300$ and one block per iteration (uniform prior, line) and with $N_{AV} = 50$ and 6 blocks per iteration (Normal prior, triangles with error bars). Inset: detail of $\eta^n - \eta^{final}$ for the later iterations.

iterations to get as close to as we were after iterations of the L system while we

abandoned an application of the metho d to an L system which was not near convergence

even after running for several days on a workstation

We shall now make some remarks on the scaling of the metho d with system size and on how

we should choose N The analysis ab ove shows that in iterations the maximum change

AV

in we exp ect to pro duce is ln N while if we use the same time to do just one iteration

AV

then we shall pro duce a of only ln N Thus if this were the only consideration

max AV

the fastest convergence would b e pro duced with N However this neglects the eect of

AV

random errors which we must b e able to distinguish with sucient accuracy from uctuations

in the histogram due to the true structure of the underlying sampled distribution Even for

p

the uncorrelated case the random errors will generally b e of size N We thus need as a

AV

which explains our choice of bare minimum an N which is large enough that N

AV

AV

N In fact we nd that convergence is just ab out maintained for the very small L

AV

system and using prior information with N but N is imp ossibly small For

AV AV

larger systems the lower limit on N is enforced by the requirement that the simulation must

AV

make several random walks over all the macrostates accessible to it even though in a single spin

Figure: Main figure: convergence of the weighting function $\eta^n$ (plotted against $E$ over the range $-512$ to $0$, curves labelled by iteration number) using the VS method for the Ising model, with several blocks per iteration. Inset: detail of the region $-160 \le E \le -120$.

up date it is only able to move from E to E or E The number of accessible macrostates is of

d

the order of L at least when we have moved some way towards the multicanonical distribution

d d

so a simple random walk argument implies N L and therefore N L Moreover since

c AV

d d

L we shall require O L d ln L iterations Neglecting logarithmic corrections the total

d

time for this metho d to converge thus scales like L L for the d Ising mo del

To summarise, then: this simple method of producing the sampled distribution provides slow but sure convergence, which is suitable for small systems where the weighting function varies over only a few tens of orders of magnitude. Our Bayesian analysis of the problem has clarified the procedure of updating the sampled distribution and enables us to combine information from different iterations in the sampled region, but does not serve to speed up the slow convergence of $\eta^n$ to $\xi$.

It is interesting to note particularly in relation to the slowness of convergence that our

equation was rst derived by Laplace in It has apparently b ecome disreputable for

which reason p erhaps it is not to b e found in precisely b ecause of the high

probability it assigns to events that are known to b e p ossible but not observed which leads

in our application to the slow spreading into the nonsampled region In certain applications

Figure: Main figure: convergence of the weighting function $\eta^n$ (plotted against $E$ over the range $-512$ to $0$, curves labelled by iteration number) using the VS method for the Ising model, with one block per iteration and a uniform prior used in updating. Inset: detail of the region $-160 \le E \le -120$.

this can lead to counterintuitive predictions We b ecame aware of this after completing the

work of this chapter through in which a dierent though still Bayesian formulation of

the problem starting with a uniform prior on all strings of congurations of length N rather

c

than on the unknown state probabilities is used to arrive at a result which resolves some of the

counterintuitive cases and p erforms demonstrably b etter in a test on datacompression This

result is identical to equation in the case where all macrostates are visited but generally

n

gives a much smaller probability to unvisited states in our case we would assign P E

O N where C E This would lead to a maximum change in of lnN lnN

i c

c c

so convergence if it remained uniform would require ab out half as many iterations However

n

it also app ears that this choice might well lead to an overestimate of P in the states just

n

past the the edges of the histogram C with consequent slowing of convergence In any case

the p o or scaling with L of the time for convergence would remain the same Another metho d

is still required for all but small systems

Figure: $\eta^n$ for a single $E$ macrostate against iteration number $n$, using the VS method for the Ising model, with $N_{AV} = 300$ and one block per iteration (uniform prior, dashed line) and with $N_{AV} = 50$ and 6 blocks per iteration (Normal prior, triangles with error bars). Inset: detail of $\eta^n - \eta^{final}$ for the later iterations.

Methods Using Transitions

It is apparent that the major problem with the above method of evolving the multicanonical distribution is that large areas away from the central peak of the Boltzmann distribution are initially not sampled at all, and we cannot make reliable inferences about them: for example, we have seen that assuming a multinomial form for the likelihood and evaluating $P^n_{AV}$ leads to an assignment of a constant probability in the nonsampled region, when it is clear physically that $P^n$ will decrease for at least some distance away from the sampled peak. (There may be other peaks lying some distance away.) Convergence is thus rather slow.

We can try to increase the speed of convergence by fitting a function to $P^n$ or to $\eta^{n+1}$ from the sampled region and extrapolating it into the wings of the distribution. This way we make some use of our knowledge about the likely shape of $P^n$ that is unused by the VS method of section . However, as we discussed there, the distance of extrapolation cannot be made very large because of the danger of heavily underestimating $P^n$ in the unsampled region. If this happens, the next sampled distribution may then put almost all its weight in this region, and another extrapolation, if made with a uniform prior and extending into the originally sampled region, will then result in the loss of all the information that we have built up there. Convergence therefore becomes irregular and awkward, with the latest $\eta$ frequently needing to be discarded and a return made to earlier ones.

can

Linear extrap olation is fairly safe b ecause P will usually have a negative second derivative

n

and so will b e smaller than P thus It still needs to b e combined with

extr ap

constraints on the distance of extrap olation though to avoid problems with subsidiary maxima

can

in other regions of macrostate space where P s second derivative is not negative There is

still some danger of overestimating b ecause the gradient must b e estimated from the p oints

near the edge of the sampled region where statistics will inevitably b e p o or The prescription

suggested in is to choose some cuto state near but not to o near the b ottom of the

n

sampled histogram and extrap olate from there so that we are fairly sure that P will not b e

to o large The choice of the cuto can either b e made by hand or automatically from the

size of the histogram combined with a strategy for retreating from a bad extrap olation

Linear extrap olation with various ad hoc constraints has b een successfully applied by several

authors and found to improve appreciably the sp eed with which metho ds based

on visited states extend the region that they sample

Despite the good performance of some extrapolation methods, we shall here describe the results of a different approach to the problem. It would be very appealing if, instead of simply trying to make better inferences about states we have not sampled, we could sample all parts of the macrostate space immediately. We have developed a method to do this by using inference based on the recorded transitions of the Markov chain rather than the visited states. We shall call this the transition probability (TP) method.

To our knowledge this method is new (it is not to be confused with Bennett's acceptance ratio method), though in very recent work on an expanded-ensemble-like simulation by Kerler et al. it is also recognised that the observed transitions offer useful information about the sampled distribution.

Initially, the system is prepared in a macrostate with a low canonical probability, such as the ground state. Unless we have already reached a multicanonical distribution, this state has extremely low probability. When we make MC updates, the system therefore moves away from this state until it is moving randomly among its equilibrium macrostates for the present sampled distribution. The process resembles the equilibration of a normal simulation. At each MC step we record in a histogram $C^n_{ij}$ the transition performed between energy macrostate $i$ before the step and the macrostate $j$ after it. Rejected trial moves, and accepted moves that do not change the macrostate, are recorded alike in $C^n_{ii}$. We then repeat the process, if necessary, for a release from an unlikely state at the other end of the macrostate space. Then the entire procedure is repeated until the array of recorded transitions is reasonably full. We used an ad hoc criterion based on a parameter $N_{TP}$ to decide this: $C^n$ is deemed full when the relevant counts in each row reach $N_{TP}$ for all $i$.

Now suppose the transition probabilities between macrostates in the sampled distribution are $\pi^n_{ij}$, i.e. $\pi^n_{ij} \equiv P^n(i \to j \mid i)$. We can use $C^n_{ij}$ to give an estimator of this. The maximum likelihood estimator is $\pi^n_{MLE,ij} = C^n_{ij} / \sum_j C^n_{ij}$, which, as before, needs fixing if $C^n_{ij} = 0$ for macrostates between which transitions are allowed. We preferred to use $\pi^n_{AV,ij}$, for which, by a calculation similar to that in equation , based on a uniform prior on $\pi^n_{ij}$ for allowed transitions, it can be shown that

$$\pi^n_{AV,ij} = \frac{C^n_{ij} + 1}{\sum_j \left(C^n_{ij} + 1\right)}$$

This expression requires no fixing for the case $C^n_{ij} = 0$, though, like , it is obtained by assuming a uniform prior for $\pi^n_{ij}$ and so does not contain information from earlier iterations.

Before we proceed to obtain an estimator of $P^n$ itself, we shall discuss the circumstances under which it is legitimate to consider the matrix of transitions between macrostates as describing a Markov process, and how it can be related to the matrix of transitions between microstates, which is the true determiner of the microscopic dynamics of the system and thus, ultimately, of the dynamics of the transitions between macrostates.
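The row-wise estimator of the macrostate transition probabilities described above can be illustrated with a short sketch. It assumes the reconstructed form in which one count is added to every allowed transition before each row is normalised; the function name `transition_estimate`, the boolean `allowed` mask and the toy count matrix are all hypothetical.

```python
import numpy as np

def transition_estimate(counts, allowed):
    """Expectation-value estimator of macrostate transition probabilities.

    counts[i, j] is the number of recorded i -> j transitions and
    allowed[i, j] marks transitions the dynamics permits. Each allowed
    entry receives one extra count (uniform prior); forbidden entries
    stay at zero. Every row must contain at least one allowed entry.
    """
    counts = np.asarray(counts, dtype=float)
    allowed = np.asarray(allowed, dtype=bool)
    numerators = np.where(allowed, counts + 1.0, 0.0)
    return numerators / numerators.sum(axis=1, keepdims=True)

# Three macrostates; macrostate 2 has not yet been visited at all.
counts = np.array([[5, 3, 0],
                   [2, 6, 0],
                   [0, 0, 0]])
allowed = np.array([[True, True, False],
                    [True, True, True],
                    [False, True, True]])
pi_av = transition_estimate(counts, allowed)
```

Unlike the maximum likelihood form, this estimate is well defined even for the unvisited row, where it simply spreads probability evenly over the allowed transitions.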

Transitions Between Macrostates as a Markov Process

Let us label the microstates with $r, s$ and the macrostates with $i, j$, so we write $P^n_r$ and $P^n_i$ to mean the equilibrium state probabilities under the prevailing sampled distribution. Thus

$$P^n_r = Z^{-1} \exp(-\beta E(r) + \eta(r))$$

and

$$P^n_i = Z^{-1} \exp(-\beta F_i + \eta_i)$$

The particular set of microstates in macrostate $i$ we shall write as $r \in i$. We assume that the macrostates partition the microstates exhaustively and uniquely.

Then the transition matrix for the macrostates is, at time $t$,

$$\pi^n_{ij}(t) \equiv P^n(i \to j \mid i, t) = \sum_{r \in i} \sum_{s \in j} P^n(r \to s \mid r) \, P^n(r \mid i, t) = \sum_{r \in i} \sum_{s \in j} \pi^n_{rs} \, P^n(r \mid i, t)$$

where $\pi^n_{rs}$ is the transition matrix for the microstates, which is not time-dependent.

We would like to define a simple, non-time-dependent Markov process $\pi^n_{ij}$. In general this will only happen if $\sum_{s \in j} \pi^n_{rs} = \mathrm{constant} \; \forall r \in i$. In that case it does not matter what $P^n(r \mid i, t)$ is, and we have what is called a mergeable process ( , chapter ): the microstructure of each macrostate is completely irrelevant to the behaviour of the macrostates. In our case this condition will not be satisfied. However, let us suppose that the underlying process is in a sort of local equilibrium, in the sense that $P^n(r \mid i)$ is constant with time at its equilibrium value, given by the Boltzmann distribution (since $\eta$ does not affect the relative equilibrium probabilities within each macrostate). Then we can regard $\pi^n_{ij}$ as defining a Markov process, and we get simply

$$P^n(r \mid i, t) = P^n(r \mid i) = P^n_r / P^n_i$$

and

$$\pi^n_{ij} = \sum_{r \in i} \sum_{s \in j} P^n(r \mid i) \, \pi^n_{rs}$$

2 Note that this is different from the notation used in section , where $i$ and $j$ were used for microstates.

We shall discuss the extent to which equation is satisfied later. Assuming it for now, and applying the detailed balance condition that $\pi^n_{rs}$ is known to satisfy to equation , we then obtain

$$\pi^n_{ij} = \sum_{r \in i} \sum_{s \in j} P^n(r \mid i) \, \frac{P^n_s}{P^n_r} \, \pi^n_{sr} = \frac{1}{P^n_i} \sum_{r \in i} \sum_{s \in j} P^n_s \, \pi^n_{sr} = \frac{P^n_j}{P^n_i} \, \pi^n_{ji}$$

where in the last line we have used

$$\frac{1}{P^n_j} \sum_{r \in i} \sum_{s \in j} P^n_s \, \pi^n_{sr} = \sum_{r \in i} \sum_{s \in j} P^n(s \mid j) \, \pi^n_{sr} = \pi^n_{ji}$$

So if $\pi^n_{rs}$ obeys detailed balance then so does $\pi^n_{ij}$. But this then necessarily means that the equation $\sum_i P^n_i \, \pi^n_{ij} = P^n_j$ is satisfied, so that the equilibrium distribution $P^n$ is the left eigenvector of the transition matrix $\pi^n_{ij}$.

Thus we have proven the following. Firstly, that if the probability distribution of microstates within each macrostate is constant with time, transitions between macrostates can be regarded themselves as defining a Markov process, determined by the transitions between the microstates, which we shall call the underlying Markov process. Secondly, that if the above holds and the probability distribution of microstates within each macrostate is the same as it would be in the equilibrium distribution $P^n_r$, then the Markov process of the transitions between macrostates has $P^n_i$ as its stationary distribution. We have checked these results explicitly for small matrices, binning some of the states and confirming that the equilibrium state probabilities change in the way expected.
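A check of this kind on a small matrix can be sketched as follows. The example is illustrative only and is not the check performed in the thesis: it builds a hypothetical six-state Metropolis matrix from arbitrary weights `w`, bins the states in pairs, and forms the macrostate matrix using the equilibrium distribution within each bin. The helper names `metropolis_matrix` and `lump` are assumptions of this sketch.

```python
import numpy as np

def metropolis_matrix(w):
    """Microstate Metropolis matrix for unnormalised weights w.

    Any other state is proposed uniformly and accepted with
    probability min(1, w[s] / w[r]); the remainder stays put.
    """
    n = len(w)
    T = np.zeros((n, n))
    for r in range(n):
        for s in range(n):
            if r != s:
                T[r, s] = min(1.0, w[s] / w[r]) / (n - 1)
        T[r, r] = 1.0 - T[r].sum()
    return T

def lump(T, p_micro, labels):
    """Macrostate matrix, weighting each microstate r in bin i by P(r | i)."""
    n_macro = labels.max() + 1
    p_macro = np.bincount(labels, weights=p_micro, minlength=n_macro)
    M = np.zeros((n_macro, n_macro))
    for r, i in enumerate(labels):
        for s, j in enumerate(labels):
            M[i, j] += p_micro[r] / p_macro[i] * T[r, s]
    return M, p_macro

w = np.array([1.0, 0.5, 0.2, 0.1, 0.05, 0.02])
labels = np.array([0, 0, 1, 1, 2, 2])   # bin the six states in pairs
p_micro = w / w.sum()

T = metropolis_matrix(w)
M, p_macro = lump(T, p_micro, labels)
flux = p_macro[:, None] * M             # P_i * pi_ij
```

With these definitions the macrostate flux matrix is symmetric (detailed balance) and the binned equilibrium probabilities are a left eigenvector of the lumped matrix, in line with the two statements proven above.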

We may define the macrostates as any exhaustive and unique partitioning of the microstates. As we shall show, this result has many useful consequences: we may use the theory of Markov processes to relate the matrix $\pi^n_{ij}$ to, for example, the mean and variance of the number of visits to each state in a run of a particular length. Important results relating to the expected error of sampling from the Markov chain then follow (see section ). This analysis cannot be applied directly to the transition matrix for the microstates because it is too large, but the effective transition matrix for the macrostates will usually be a manageable size: for the energy macrostates of the Ising model with single-spin-flip Metropolis, its natural form is pentadiagonal, of size of order $L^d \times L^d$, and it can be reduced further in size by binning the macrostates, which also makes it tridiagonal. In contrast, the microstate matrix has size $2^{L^d} \times 2^{L^d}$.

What of the validity of the approximation made in equation, that $P^n(r|i,t)$ may be replaced by its equilibrium (Boltzmann) value? To the extent that the macroscopic variables (the macrostate labels) are the slowest to evolve, one may expect this approximation to be reasonable, and to improve as $P^{xc}$ is approached, where the simulation moves in an increasingly diffusive, less directional fashion, which gives time for relaxation of $P^n(r|i)$. We have found good evidence (see section) that the approximation is indeed essentially exact in the multicanonical limit. Moreover, we emphasise that whereas equation was obtained by using a model that is in fact known to be wrong (multinomial distribution of counts, each bin independent of the others), equation assumes only that each transition is independent of the preceding ones. This is indeed true (it is just the definition of a Markov process), and we have just shown that the real simulation in equilibrium is described by a Markov process with transition matrix $\pi^n_{ij}$.

It might be argued that the approximation will be less good in the early iterations, where the system will, for at least part of the time, be moving rapidly from a release state with a very low $P^{can}$ to more probable states. However, we have found in practice, as we shall describe below, that even the first iteration often gives a surprisingly good estimate of $P^n$.

Now, to return to the task of estimating $P^n$: it is clear that, having used equation to find $(\pi^n_{ij})_{AV}$, we may use $P^n_E$, the eigenvector of its transpose, as an estimator of $P^n$.

Although the transition matrix $\pi^n$ is $N_m \times N_m$, it may be (indeed, should be) chosen sparse or even tridiagonal, by binning the macrostates and/or choosing the matrix $R$ (introduced in section) to prohibit transitions between widely separated energies, which are in any case very unlikely to be accepted. If this is done it takes only $O(N_m)$ operations to find the eigenvector. Indeed, if $\pi^n$ is tridiagonal, it is trivial to find $P^n_E$ using equation: we neglect normalisation initially and take $P^n_{E,1} = 1$. Then we use $P^n_{E,i+1}\, \pi^n_{i+1,i} = P^n_{E,i}\, \pi^n_{i,i+1}$ to generate all the others successively, and finally impose $\sum_i P^n_{E,i} = 1$. In the first one or two iterations, when $P^n$ still differs substantially from $P^{xc}$, it may be necessary to work with the logarithms of $P^n$ to prevent arithmetic overflow, and it is necessary to generate $P^n$ in the direction in which it is increasing to prevent the build-up of rounding errors. Thus we should start from the release states (which were chosen because of their low $P^{can}$) and iterate to the equilibrium states.
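The tridiagonal recursion can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code: the function names are hypothetical, the states are assumed to be ordered so that index 0 is the release (least probable) state, and the test chain is a nearest-neighbour Metropolis walk on a toy target spanning several hundred decades, for which direct evaluation of the probabilities would overflow.

```python
import numpy as np

def log_stationary_tridiagonal(pi):
    """ln P_i for a tridiagonal chain, from the detailed balance relation
    P_{i+1} pi[i+1, i] = P_i pi[i, i+1], iterated upwards from state 0.
    Everything is kept in logarithms; normalisation is imposed last."""
    n = pi.shape[0]
    log_p = np.zeros(n)                      # unnormalised: ln P_0 = 0
    for i in range(n - 1):
        log_p[i + 1] = log_p[i] + np.log(pi[i, i + 1]) - np.log(pi[i + 1, i])
    shift = log_p.max()                      # log-sum-exp normalisation
    log_p -= shift + np.log(np.exp(log_p - shift).sum())
    return log_p

def metropolis_chain(log_target):
    """Nearest-neighbour Metropolis matrix for a given ln(target)."""
    n = len(log_target)
    pi = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                pi[i, j] = 0.5 * min(1.0, np.exp(log_target[j] - log_target[i]))
        pi[i, i] = 1.0 - pi[i].sum()
    return pi

# Toy target: each state is e^10 times more probable than the one before it.
log_target = 10.0 * np.arange(100)
log_p = log_stationary_tridiagonal(metropolis_chain(log_target))
print(np.allclose(np.diff(log_p), 10.0))
```

Only ratios of adjacent matrix elements enter, so the cost is linear in the number of macrostates.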

As in section, it is possible to incorporate prior information from previous iterations, combining the latest estimate with the previous one using the variance as a weight. However, we have found that this slows down the very rapid initial convergence that is the main advantage of this method, and is only of advantage near to the multicanonical limit, where, as we shall see, the visited-states method is probably preferable. Therefore we update the preweighting coefficients using the simple expression $\eta^{n+1} = \eta^n - \ln P^n_E + k$ of equation.

The procedure is thus:

1. start with initial weights $\eta^1$;

2. record histogram $C^n_{ij}$:

   (a) release simulation from unlikely macrostate;

   (b) perform several thousand spin updates of each;

   (c) go to (b) until simulation has moved to equilibrium, or until it is moving only through macrostates that have all been visited enough times;

   (d) go to (a), choosing a different release state if necessary, until all macrostates have been visited enough times;

3. estimate transition matrix using equation;

4. estimate eigenvector $P^n_E$;

5. set $\eta^{n+1} = \eta^n - \ln P^n_E + k$;

6. if procedure has not converged go to 2, otherwise stop.
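The loop above can be illustrated on a toy problem. The sketch below is not the Ising implementation: it replaces the spin system by a hypothetical nearest-neighbour walk directly on twenty macrostates with a known toy `log_p_can`, assumes zero initial weights, uses a fixed number of steps per release in place of the stopping criteria (c) and (d), and uses a +1 pseudocount in the transition-matrix estimate as a stand-in for the estimator of the equation referred to in the text.

```python
import numpy as np

def run_releases(log_w, rng, n_release=200, n_steps=400, start=0):
    """Step 2: transition histogram C[i, j] for walkers released from an
    unlikely macrostate and sampled with log-weights log_w."""
    n = len(log_w)
    C = np.zeros((n, n))
    for _ in range(n_release):
        i = start
        for _ in range(n_steps):
            j = i + (1 if rng.random() < 0.5 else -1)
            accept = False
            if 0 <= j < n:
                d = log_w[j] - log_w[i]
                accept = d >= 0 or rng.random() < np.exp(d)
            if accept:
                C[i, j] += 1
                i = j
            else:
                C[i, i] += 1              # a rejected move counts as i -> i
    return C

def estimate_log_p(C):
    """Steps 3 and 4: row-normalised estimate of the tridiagonal matrix
    (with an assumed +1 pseudocount) and its eigenvector, in logarithms."""
    n = C.shape[0]
    band = np.abs(np.subtract.outer(np.arange(n), np.arange(n))) <= 1
    pi_hat = np.where(band, C + 1.0, 0.0)
    pi_hat /= pi_hat.sum(axis=1, keepdims=True)
    log_p = np.zeros(n)
    for i in range(n - 1):
        log_p[i + 1] = log_p[i] + np.log(pi_hat[i, i + 1]) - np.log(pi_hat[i + 1, i])
    shift = log_p.max()
    return log_p - shift - np.log(np.exp(log_p - shift).sum())

rng = np.random.default_rng(0)
log_p_can = 1.5 * np.arange(20)     # toy canonical distribution (unnormalised)
eta = np.zeros(20)                  # step 1: assumed unweighted start
for iteration in range(3):
    C = run_releases(log_p_can + eta, rng)
    eta = eta - estimate_log_p(C)   # step 5
    eta -= eta.min()                # the constant k fixes min(eta) = 0
print(np.ptp(log_p_can + eta))      # residual spread of the sampled ln P
```

In this toy setting the sampled distribution `log_p_can + eta` becomes much flatter than the canonical one, which spans 28.5 units of $\ln P$; the residual scatter reflects the finite number of releases.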

We have tested this method on the 2d Ising model, with preweighting both of energy and magnetisation; we describe the results for energy first. We used two release states: the ground state at $E = -2L^2$ (spins all up or all down with equal probability), and the infinite-temperature states around $E = 0$, where we simply generated a starting state with each spin randomly up or down. Simulations launched from these states covered complementary parts of macrostate space, those coming from $E = -2L^2$ approaching finishing states (around $E$ for small $n$) from below, and those from $E = 0$ approaching them from above. The iterative scheme outlined above was therefore modified to run with the step (b) alternated for the two simulations, so that we could use the crossing-over of the simulations as a criterion of their having moved to equilibrium.

To keep the matrix tridiagonal we blocked the $E$ macrostates so that the width of each block was $\Delta E$, each blocked macrostate (except the lowest) thus containing two of the underlying macrostates. The parameter $N_{TP}$ was set to for $L$ and for $L$, which meant that each iteration took a rather shorter time than a single iteration of the visited-states method (since we are now counting spin flips, not lattice sweeps). We should note that the VS method performs more than twice as many spin updates per second as the TP method does, because in the TP method the data histograms must be updated at every spin flip, the spins for updating must be chosen at random, and we must check for finishing and reinitialise the lattice more frequently. In the visited-states (VS) method the fundamental update step is a complete lattice sweep rather than a single spin flip, and so there is less bookkeeping and two calls to the random number generator per flip are saved. For this reason the TP method would not in normal circumstances (cf. section) be a candidate for use in the production stage of an Ising simulation, even if it were not for its other disadvantages (see below).

The results for the convergence of $\eta^n$ are shown in figures and. It is apparent that the shape of $\eta$ is outlined, albeit with quite a lot of noise, right from the first iteration using this method. The superiority, at least in the early iterations, that this method has over the VS method is demonstrated more clearly in figures and, where the difference between $\eta^n(E)$ and $\eta_{final}(E)$ is plotted for both methods. $\eta_{final}$ is established using a long visited-states run for $L = 16$, and finite-size scaling (see section) followed by a long visited-states run for $L = 32$. Moreover, the advantage of using the transition probabilities clearly increases with increasing system size: for $L = 32$ it produces a usable weighting function after about fifteen iterations (about hour), while extrapolation of the VS results suggests that it would take at least ten times as long (probably appreciably more, since in the few early iterations that were performed far fewer than $L^d$ macrostates were sampled).

Figure: Convergence of the weighting function $\eta^n(E)$ using the TP method for Ising $L = 16$ ($\eta$ plotted against $E$ from $-512$ to $0$; curves labelled by iteration number).

We can see why the TP method converges so much faster by considering the maximum change in $\eta^n$ that we can expect to produce in one iteration. Suppose we make $N_R$ releases from one of the unlikely starting states in the course of one iteration. Then the maximum difference in the estimated probability of two adjacent states $i$ and $i+1$ would arise if every one of the simulations followed a trajectory that took it from $i$ to $i+1$ and then on to $i+2$ etc., never returning to $i$. In this case we would have $C_{i,i+1} = N_R$, $C_{i+1,i} = 0$, and we would estimate

\[ \frac{P^n_{i+1}}{P^n_i} = \frac{\pi^n_{i,i+1}}{\pi^n_{i+1,i}} \sim N_R , \]

which would mean that $\Delta(\eta^n_i - \eta^n_{i+1}) \sim \ln N_R$. But a change in the difference of $\eta$ of this magnitude can now be produced between every pair of states in the chain, so that the total $\Delta\eta_{max}$ available is $\sim N_m \ln N_R \sim L^d \ln N_R$. Thus, even for a fairly small $N_R$ (say), the method is able, at least in theory, to converge on the first iteration, no matter what the system size. (The time taken per iteration should also increase like $N_m \sim L^d$.)

Figure: Convergence of the weighting function $\eta^n(E)$ using the TP method for Ising $L = 32$ ($\eta$ plotted against $E$ from $-2048$ to $0$; curves labelled by iteration number).
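As a rough numerical illustration of this bound, the snippet below evaluates $N_m \ln N_R$. The values 257 and 10 are purely illustrative assumptions (they are not the run parameters of the thesis); the point is only that a modest number of releases already gives a bound of several hundred, the same order as the full range of $\eta$ that has to be built up.

```python
import math

def max_weight_change(n_macrostates, n_releases):
    """Upper bound N_m * ln(N_R) on the total change in eta that a
    single TP iteration can produce across the chain of macrostates."""
    return n_macrostates * math.log(n_releases)

# Illustrative numbers only: 257 macrostates and 10 releases.
bound = max_weight_change(257, 10)
print(round(bound, 1))
```

By contrast, a visited-states histogram can only resolve probability ratios of the order of the total number of counts, however many macrostates there are.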

In practice the method does not converge on the first iteration, and there is clearly a small residual bias remaining even after many iterations: the weight necessary to reach $E$ is overestimated. We shall discuss these two problems in turn.

The first problem is largely due to the blocking of the macrostates, which compromises the assumption underlying equation that a local equilibrium is maintained within each blocked macrostate. For this to be true, all degrees of freedom within the macrostate must relax on a faster time scale than that characterising the transitions between the macrostates, which is clearly not the case, since the blocked macrostates now contain different values of energy. In fact it is not hard to show that fewer transitions occur in the direction through macrostate space in which $P^n$ is increasing, and more occur in the opposite direction, than would if local equilibrium were established within each blocked macrostate. This result follows from considering the fact that transitions that cross the boundary between blocked macrostates are more likely to come from underlying macrostates near the boundary. The upshot is that $P^n_E$ continually underestimates the change in $\eta^n$ required to reach the multicanonical limit. This explains a large part of the behaviour of the method, though in the early iterations it should be noted that there is a particularly large underestimate of the weight required in states around $E \sim -L^d$. We attribute this to the complex behaviour of the system near the critical point, where the typical canonical configurations have more or less these energies. In the early iterations, when the system moves ballistically down to these states from $E = 0$, there is not enough time for relaxation of the large clusters typical of criticality. In later iterations, when the motion is more diffusive, there is time for this relaxation, equation is better satisfied, and the anomaly in $\eta^n$ disappears.

Figure: $L$. Convergence of the weight $\eta^n$ as a function of iteration number $n$ for both the TP method and the VS method. The ordinate shows the difference between $\eta^n$ and $\eta_{final}$, where $\eta_{final}$ is the limiting behaviour of the VS method ($\eta_{final} - \eta^n$ from 0 to 100, against iteration $n$ from 0 to 30).

We do not have such a full understanding of the bias in the limiting behaviour of $P^n_E$ (though the analysis of section may well be applicable here too); it may simply be a result of the fact that

\[ \langle P^n_E \rangle \neq P^n , \]

in which case better performance would be obtained by making each iteration longer, which would reduce the bias. For the very long runs in section, for example, the bias is not a problem. As a test of this we show in figure the difference $\eta_{final} - \eta^n$ for an $L$ simulation, where the fullness criterion $N_{TP}$ is doubled at each step, so starting at $N_{TP} = 1200$ and increasing to $N_{TP} = 153600$ by the 8th iteration. We begin not from scratch but with an $\eta$ from the run shown in figure, for which $\eta$ has not yet exceeded its limiting VS value. We combine the increase in $N_{TP}$ with jackknife bias-correction (see and appendix D), which, however, assumes that the bias goes as $1/N_{TP}$.

It is apparent that the bias is indeed much reduced by increasing $N_{TP}$, but it is clearly much more persistent than $1/N_{TP}$. However, using very large $N_{TP}$, while it may remove the bias, also clearly removes the TP method's main advantage, that of rapid convergence. Unless, as in sections or, there is some particular reason for sticking with measurements of transitions, we would recommend switching to visited-states when $\eta^n$ is changing between iterations by less than could be obtained by visited-states in the same time. If the final $\eta^n$ does not then immediately produce a viable multicanonical distribution (the bias is scarcely more than an order of magnitude), only one or two iterations of the VS method will be required to reach it.

Figure: $L$. Convergence of the weight $\eta^n$ as a function of iteration number $n$ for both the TP method and the VS method ($\eta_{final} - \eta^n$ from 0 to 400, against iteration $n$ from 0 to 40).

Figure: $L$. Convergence of the weight $\eta^n$ as a function of iteration number $n$ for the TP method with increasing $N_{TP}$ and bias correction ($\eta_{final} - \eta^n$ from $-4$ to $4$, against iteration $n$ from 0 to 10; curves for $N_{TP} = 1200 \ldots 153600$ and for the limiting behaviour with $N_{TP} = 600$).

If the distribution $P^{can}$ has more than one peak then the method needs slight modification. For example, consider magnetisation preweighting for an Ising model for $\beta \geq \beta_c$. If we are to sample all of the macrostate space by releasing the system from a state of low probability and letting it move to one of high probability, then we need a release point at $M = 0$ as well as at $M = \pm|M|_{MAX}$; otherwise states around $M = 0$ will rarely if ever be visited. We could initially impose the constraint $M = 0$ and generate equilibrium configurations with this constraint before allowing $M$ to change, but here we use random $M = 0$ configurations. In this case we have adopted a two-stage process. The first stage serves to outline the structure of the macrostate space, and bootstraps the second stage, which completes the weight assignment. In the first stage we perform a sequence of simulations launched successively from one of the three initial states; then, once this has converged, we refine the weights using transition probability data gathered using only the two ordered microstates as starting states. We do this extra refinement because it is not generally to be expected that the limiting set of weights produced with three launch states will be multicanonical, because the conditions presupposed in equation are not fulfilled. A typical $M = 0$ microstate at $\beta = \beta_c$ comprises large clusters of spins of the same orientation, which take a long time to evolve. As a result, the information the algorithm gleans about the $M = 0$ macrostate is biased by the systematic launch from a microstate which, being random, has an energy significantly higher than those typical of $M = 0$. For this reason the algorithm would be expected to overestimate the weight to be attached to the $M = 0$ macrostate in the first stage. But this then ensures that, in the second stage, simulations released from the $M = \pm L^d$ states will be able to reach the $M = 0$ states, which they would not with the canonical weighting. In general, the method will require $N_p$ stages, where $N_p$ is the number of maxima in $P^{can}$ (the number of phases, if each maximum has the same weight). This scheme is shown in operation in figure for $L = 32$ at $\beta = \beta_c$; we show only half of the macrostate space to reduce cluttering.

Figure: $\eta(M)$ for Ising with $L = 32$ at $\beta = \beta_c$, inferred from one iteration of the VS method, one or five iterations of the TP method, and naive $L^d$ finite-size scaling of $\eta^{xc}(L = 16)$. The solid line shows the limit established from long VS runs performed to gather data for section. ($\eta$ from 0 to 50 against $M$ from 0 to 1024; curves: iteration 1 (VS), iteration 1 (TP), iteration 5 (TP), naive FSS of $L = 16$ $\eta(M)$, limiting $L = 32$ $\eta(M)$.)

It is apparent that, notwithstanding our concern that the result with three release points would be biased, the estimate of $\eta$ produced is found to converge on the first iteration. The faster convergence compared with the application to energy presumably results from $P^{can}(M)$ being wide and flat in the central region, so that the relaxation of $M$ is naturally slow, even for the first iteration. The relaxation time of the energy is much faster, so within only a short time equation is approximately satisfied. The effect of the random $M = 0$ launch point is therefore small. It is also significant that the matrix $\pi^n_{ij}$ is naturally tridiagonal, so that it is unnecessary to block it. The estimate of $\eta$ is in fact good enough to be used immediately in a multicanonical production run, in contrast to the result of a visited-states run of the same length, or to naive $L^d$ finite-size scaling of $\eta^{xc}(L = 16)$, which are also shown in figure.

We also show the effect of proceeding to the second stage of refinement, using only the microstates at $M = \pm L^d$ as release points: this is marked as iteration 5 in figure, which was the 3rd iteration conducted with only two release points. In this case, because the first stage has performed so well, only a marginal further improvement is obtained, and, as with energy, there seems to be a small residual bias.

To summarise, then, we have found that the TP method provides very much faster initial convergence to the multicanonical distribution than the visited-states (VS) method of section: we have demonstrated its efficacy for fairly large systems ($L = 16$ and $L = 32$ energy, $L = 32$ magnetisation) where variations in canonical probability of more than one hundred decades must be covered. If the transition matrix has a suitable structure, convergence can be achieved on the first iteration; however, if this is not the case, the final convergence may be poorer than that of the VS method and there may be a residual bias. In practice, at least for the Ising model, it is probably better to switch to the VS method for final refining when $\eta$ is changing only a little between iterations.

Finite-Size Scaling

It will be noticed that the shape of the final $\eta^{xc}(E)$ generated in the previous sections is very similar for different system sizes, merely being scaled by the system size $L^d$. This is a manifestation of the extensivity of the canonical averages and free energy away from criticality (section). As a consequence, $\eta^{xc}$ for a small system, once generated (and the methods we have examined above make it quite easy to do this), can be used to predict $\eta^{xc}$ for a large system: we fit a function such as a spline or Chebyshev polynomial to the small-system $\eta^{xc}$, then scale and interpolate it. The predicted $\eta$ can be refined if necessary to a multicanonical form, again by using one of the two previous methods. The refinement thus corresponds to measuring correction-to-scaling terms. In the Bayesian framework, the use of finite-size scaling (FSS) corresponds simply to beginning with prior information about the sampled distribution which comes from smaller systems, reflected in a prior $P(\cdot\,|H)$ which has its mean at the finite-size scaling estimate and a width chosen to reflect (or to underestimate) the expected magnitude of the unknown correction terms.
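The scale-and-interpolate step might look like the sketch below. It is an illustration under stated assumptions: the function name and arguments are hypothetical, plain linear interpolation (`numpy.interp`) is swapped in for the spline or Chebyshev fit mentioned above, and the small-system weights are a made-up extensive function rather than measured data.

```python
import numpy as np

def scale_weights(E_small, eta_small, L_small, L_large, d, E_large):
    """Predict eta for the large system by scaling both the energy axis
    and eta itself by (L_large / L_small)**d, then interpolating linearly
    onto the energies of the large system."""
    factor = (L_large / L_small) ** d
    return np.interp(E_large, factor * E_small, factor * eta_small)

# Hypothetical extensive weights eta(E) = L^d g(E / L^d), with g(e) = e^2.
L_small, L_large, d = 16, 32, 2
E_small = np.arange(-2 * L_small**d, 1, 8, dtype=float)
E_large = np.arange(-2 * L_large**d, 1, 8, dtype=float)
eta_small = L_small**d * (E_small / L_small**d) ** 2

eta_pred = scale_weights(E_small, eta_small, L_small, L_large, d, E_large)
eta_exact = L_large**d * (E_large / L_large**d) ** 2
print(np.abs(eta_pred - eta_exact).max())   # interpolation error only
```

For genuinely extensive weights the only error is that of the interpolation; for real data the residual would also contain the correction-to-scaling terms that the subsequent refinement iterations measure.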

This has been found to work extremely well away from criticality: for the Ising model at scaling the $\eta^{xc}(E)$ gives an estimate accurate to within which then requires only a few iterations of the visited-states method to converge to a fully multicanonical form. This convergence is shown in figure. The subfigures that compose this figure are to be read in the order that text would be, from left to right and top to bottom. At the top left is $\eta^1$, the initial estimate of $\eta^{xc}(L = 32)$, produced by finite-size scaling of $\eta^{xc}(L = 16)$. Next to this on the right is the histogram $C^1$ produced in a MC run sampling with $\eta^1$. We then used this histogram to produce $\eta^2$. The difference between $\eta^1$ and $\eta^2$ is very small on the scale of $\eta$, so we show the difference between them, $\Delta\eta^1$, rather than $\eta^2$ itself (2nd row, left). Sampling with $\eta^2$ then produced the second histogram $C^2$ (2nd row, right), and the remaining two lines of the figure show the equivalent data for iterations 3 and 4.

It is apparent that spreading of the sampled distribution occurs just as in section: fractionally large fluctuations occur around the edge of $C$ where it goes to zero, which translate into a rather jagged $C$, but the irregularities are then smoothed away in subsequent iterations. As in the previous investigation of the VS method, we do not extend sampling also to those states below $P^{can}$. The histogram broadens rather faster than in section because $\eta$ has approximately the right shape even in the region that is not sampled in $C$. Thus a small uniform increase of the probability of all the macrostates in this region renders many more of them accessible than would be the case if $\eta$ were constant there.

It will be noted that in this set of runs we adopted a slightly different way of dealing with $\eta^n$ above $E = 0$, simply cutting it off at a constant value rather than letting it increase with $E$. The result is that the states above $E = 0$ are scarcely sampled, so when using these results to calculate free energies, as we do in sections and, we use the symmetry of the Ising density of states about $E = 0$ to reconstruct $P^{can}(E)$ for $E > 0$.

However, there are situations where the simple FSS described above cannot be applied. One is $P^{can}(M)$ at $\beta_c$, where a simple FSS scaling does not correctly predict the shape of $\eta$ (see figure). As we shall see in section below, in some ranges of $M$ values $\eta(M)$ at $\beta_c$ scales like $L^d$, but in others it obeys different laws. With a knowledge of the correct critical exponents, a critical $P^{can}(M)$ at $\beta_c$ could be scaled correctly; however, for the important case of the simulation of spin-glasses (see) there exists no FSS theory to predict the scaling of $\eta$. In such cases the TP method (though we have not yet tried to apply this method to spin glasses) or linear extrapolation would be preferable.

Figure: Refinement of a FSS estimate of $\eta$ using VS. Top left: estimate $\eta^1$ for $L = 32$ produced by finite-size scaling of $\eta^{xc}(L = 16)$. Below on left: $\Delta\eta^n$ for iterations 1 to 3 (top to bottom). On right, from top to bottom: histograms $C^n$ for iterations 1 to 4. (All panels are plotted against $E$ from $-2048$ to $0$.)

Using Transitions for Final Estimators: Parallelism and Equilibration

We now present some new results which show how the multicanonical method could be implemented efficiently on a parallel computer, and how in some circumstances we can do away with the necessity of performing full equilibration of the simulation, in the sense in which it is usually meant.

We may reasonably speculate that any algorithm which is to be widely used in MC simulation in the future will have to be amenable to running on parallel computers. Unfortunately, the multicanonical method cannot be parallelised in the same way as Boltzmann sampling often can be, by geometrical decomposition, in which each processor of the parallel computer looks after a subvolume of the whole simulation. Geometrical decomposition works for Boltzmann sampling when the forces between particles are short-ranged, so that calculation of $\Delta E$ (to be used in equation) for each trial particle move is a local operation. Particles which are sufficiently widely separated that they cannot interact before or after any possible moves may then be updated in parallel. However, with multicanonical sampling this is no longer possible: the transition probability depends on $\Delta\eta$, where $\eta$ is a function of the total energy or magnetisation of the system. Therefore, if we generate several trial moves in different regions of the simulation, the transition probability for each will depend on the final macrostate, which we do not know a priori, since it will depend on how many of the other moves are accepted.

However, some kinds of parallelisation are still possible. First, it is permissible to generate several trial moves with geometric decomposition, and then to perform just one (at random) of those that would be accepted. This would produce a speeding-up in a situation where the acceptance ratio was very low, but not otherwise. It is also permissible to update in parallel with geometric decomposition if the moves that we generate are chosen to keep the value of the preweighted variable, and thus $\eta$, constant (so this would correspond to Kawasaki dynamics on the Ising model with $\eta = \eta(M)$). This kind of parallelisation was used (in combination with primitive parallelism, see below) for the simulations of chapter. Moves that change $\eta$ must be performed serially, and so the total speed-up with parallelisation is unlikely to be very large, since the preweighted variable, which varies over a wide range in a multicanonical simulation and thus has the slowest fluctuations, still has to be explored in serial. This kind of parallelisation is better suited to the expanded ensemble (see section for a discussion of the kinship between this and the multicanonical ensemble), where the number of weighted states is generally quite small. Finally, we may simply perform primitive parallelism, where $N_r$ copies or replicas of the simulation are run in parallel, one on each processor. The results of all the replicas are then averaged to give estimators with a standard error lower by a factor of $\sqrt{N_r}$. It is this kind of parallelisation that we shall now discuss in more detail, showing how the multicanonical ensemble is particularly suited to it.

(Very recently, and in a different context, a 'fuzzy' MC method has been introduced which we speculate may enable errors introduced by parallel multicanonical updating to be corrected. This is one area of possible future investigation.)

Imagine performing a simulation with primitive parallelisation where both $N_r$ and $L^d$, the system size, are quite large. As we discussed in section, a MC simulation usually needs an equilibration period before unbiased results are produced, i.e. configuration $\sigma$ is generated with probability $P^n(\sigma)$, where $P^n$ is the eigenvector of the microstate transition matrix $\pi^n_{rs}$ that we have chosen with the goal of sampling $P^n$. In the case we are discussing, as $N_r$ and $L^d$ increase, the equilibration time becomes a larger and larger fraction of the total simulation time: once equilibrium is reached a short run suffices because $N_r$ is large, but getting to equilibrium takes a long time because $L^d$ is large. This problem afflicted us quite severely in the simulations of section.

The multicanonical ensemble might seem to offer a way of solving this problem, because macrostates that require no equilibration (the ground state) or equilibrate very fast (infinite temperature) have a high probability under it. Can we therefore simply start every simulation in one of these states and begin collecting data immediately, with little or no equilibration? If conventional estimators are used the answer is no; but we shall show that, by using the eigenvector estimators we have just introduced, we may indeed do just that.

The problem with using conventional visited-states estimators $P^n_i \propto C^n_i$ is that states near to the starting state (and the finishing state too, in fact) receive too much weight. Suppose we have a perfect multicanonical distribution in magnetisation, $P(M) = const$. Let us start each of our simulations in one magnetisation ground state and let it evolve until it has done just a few random walks over the whole range of $M$ (this will take quite enough time, since we are assuming that $L^d$ is large). The expected distribution of $C(M)$ is then as shown qualitatively in figure: even though the underlying $P(M)$ is flat, there is a concentration of probability near the starting state, and this only disappears in the $C(M) \to \infty$ limit. What we have is basically a diffusion problem: the probability of being in state $M$ at time $t$, $P(M,t|M_0)$, starts as a delta function at $M = M_0$ ($t = 0$) and slowly spreads out over the allowed states, finally becoming uniform only as time goes to infinity. The expected $C(M)$ is proportional to the sum over time of $P(M,t|M_0)$, and the acceptance ratio in each state plays the role of a diffusion coefficient.
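The qualitative behaviour can be reproduced with a short calculation. The sketch below is an assumption-laden toy, not the calculation behind the thesis figure: it uses a discrete nearest-neighbour walk with equal hopping probabilities (moves off the ends are rejected), a hypothetical 65 states and 20000 steps, and accumulates the expected number of visits as the time-sum of the evolving distribution, starting from a delta function at one end.

```python
import numpy as np

def expected_visits(n_states, n_steps, start):
    """Expected fraction of visits to each state: the time-sum of
    P(M, t | M_0) for a walk hopping left or right with probability 1/2,
    normalised by the number of steps."""
    T = np.zeros((n_states, n_states))
    for i in range(n_states):
        for j in (i - 1, i + 1):
            if 0 <= j < n_states:
                T[i, j] = 0.5
        T[i, i] = 1.0 - T[i].sum()      # moves off the ends are rejected
    p = np.zeros(n_states)
    p[start] = 1.0                      # delta function at the start state
    visits = np.zeros(n_states)
    for _ in range(n_steps):
        p = p @ T
        visits += p
    return visits / n_steps

hist = expected_visits(n_states=65, n_steps=20000, start=64)
print(hist[-1] * 65, hist[0] * 65)      # relative to the flat value 1/65
```

Although the stationary distribution of this walk is exactly uniform, the expected histogram is larger than 1/65 at the starting state and smaller at the far end, and the excess decays only inversely with the run length.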

We should note that this problem is always present in MC simulation, but its effect is usually very small: the number of sampled macrostates in a Boltzmann sampling simulation is small and the Markov chain is long, so the slight bias toward the starting state, which dies away as $1/N_c$, is swamped by the random errors, of order $1/\sqrt{N_c}$. It occurs even in a simulation which is well equilibrated, by which we mean that the memory of any transient initial state (which may have had a very low probability under the prevailing sampled distribution) has died away, and, if the simulation were a 'black box' we could not look into, we would ascribe to every microstate and macrostate its equilibrium probability. But of course in fact the simulation is in exactly one state, with probability one, when we begin sampling. To ignore this fact has little effect in a Boltzmann sampling simulation, but it would lead to serious inaccuracy here, where the number of sampled macrostates is large and the Markov chain is comparatively short.

Figure: Expected histogram of visited states for a diffusive system with states with a separation of. It is assumed that we begin in the state at and do steps, moving only to the two adjacent states with equal probability. This figure was generated by solving the diffusion equation for a system with a constant diffusion coefficient, and bears only a qualitative relationship to the Ising problem. ($C/N_c$ from 0 to 0.007 against $M$ from $-128$ to $128$.)

We can largely overcome this problem if we record the transition histogram $C_{ij}$ and use equation to give eigenvector estimators of the equilibrium state probabilities: as we have already seen when we were using the method to find the multicanonical distribution, it largely removes the bias of the starting and finishing states. The actual number of visits to each state should not affect our estimators of $P^n$, except in so far as statistics will be poorer for the states that are more rarely visited. Nevertheless, there are two possible sources of bias in the result, one coming from equation and the other from the fact that the use of eigenvector estimators does depend on $P(r|i)$ having its Boltzmann value (see section). The bias coming from equation should get smaller as the number of counts gets bigger. For this reason we pool the results of all the runs to define the eigenvector estimator, and use jackknife blocking to give the error. As regards the other source of error, it is not in fact the case that $P(r|i)$ does necessarily have its Boltzmann value, even given that it did at $t = 0$; but we argue on physical grounds that in the multicanonical ensemble, where movement through the macrostates is always diffusive rather than directional, the macroscopic variables will evolve much more slowly than the microscopic, so giving time for a local equilibrium within each macrostate to re-establish itself (which must be the Boltzmann equilibrium, since the microstate transition rules are chosen to produce this). This assertion is confirmed implicitly by the results for $P^{xc}(M)$, and more directly by measurements we have made of spin-spin correlation functions.

We show some results for the $L = 16$ and $L = 32$ 2d Ising model in the accompanying figures. We use magnetisation preweighting at $\beta_c$, with a preweighting function that is symmetric about $M = 0$, so we know that the underlying distribution $P^{xc}(M)$ should be symmetric, even if it is not quite flat. We perform 500 runs, each starting in the same ground state ($|M| = L^d$). These results were in fact generated using a serial HP computer, but the important point is that they could have been generated in parallel, since each run started from the same state and was independent of all the others. For one system size, each run was stopped when the simulation returned to the starting state after having, in the intervening period, visited the opposite ground state; for the other, we simply performed a constant number of updates (flip attempts) for each run, so the finishing states were distributed over all the macrostates. Then the transition histograms from all the runs were pooled for the estimation of $P^{xc}(M)$.

It is apparent that using the visited-states histogram $C(M)$ (solid line) to give the estimator of $P^{xc}(M)$ would produce a systematic error, but that this bias is removed, to within the random error, by using estimators from the transition matrix. The slightly biased jackknife estimator from the pooled transitions is called evecJ (open triangles), and the double-jackknife bias-corrected estimator is evecJJ (filled triangles). The error bars are shown only on evecJJ but are about the same size on evecJ. On the $L = 16$ graph we also show an estimator calculated just from the first 100 runs (circles); there is an obvious bias present here, although it is in fact still no larger than the random error, which is of course also larger this time. For one system size there is very little difference between evecJ and evecJJ, but for the other it does appear that there is a systematic error in evecJ which is eliminated by the bias correction. In a real implementation $P^{can}(M)$ would of course be recovered from $P^{xc}(M)$ for the determination of free energies and canonical averages, though we have not done this here.
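As an illustration of jackknife blocking, the sketch below computes a leave-one-block-out error estimate and the standard first-order jackknife bias correction for a generic estimator. The function names are hypothetical, and the bias correction shown is the textbook one; it is offered as an analogue of the bias-corrected estimator discussed above, not as its actual definition.

```python
import math


def jackknife(blocks, estimator):
    """Return (full estimate, bias-corrected estimate, jackknife error).

    blocks    : list of per-block data items
    estimator : function mapping a list of blocks to a number
    """
    n = len(blocks)
    full = estimator(blocks)
    # Estimates with one block left out at a time.
    leave_one_out = [estimator(blocks[:i] + blocks[i + 1:]) for i in range(n)]
    loo_mean = sum(leave_one_out) / n
    # Jackknife standard error of the estimator.
    error = math.sqrt((n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out))
    # First-order jackknife bias correction.
    corrected = n * full - (n - 1) * loo_mean
    return full, corrected, error


def squared_mean(values):
    """A deliberately biased estimator: the square of the sample mean."""
    m = sum(values) / len(values)
    return m * m
```

For `squared_mean` the correction removes exactly the term $s^2/n$ (with $s^2$ the unbiased sample variance), which is why it serves as a convenient check.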

Figure: Normal and eigenvector estimates for $P^{xc}(M)$ at $\beta_c$, $L = 16$, plotted against $M$ (from $-256$ to $256$). Legend: P(M) from runs 1-500, using C; P(M) from runs 1-100, using evecJJ; P(M) from runs 1-500, using evecJ; P(M) from runs 1-500, using evecJJ; the flat value 1/257.

We chose to study $P^{xc}(M)$ at $\beta_c$ to show that the method of doing many short runs can cope with the large, slowly-evolving clusters of criticality without introducing a bias. We do not expect that it should introduce one: if we allow two random walks over the whole range of $M$, repeated enough times in parallel, then we expect to generate all the possible cluster structures. There can be no structure whose generation requires a complex re-entrant trajectory, with several visits to each end of the chain of states, because the process is Markovian. Starting once from the non-degenerate end state is therefore equivalent, in terms of the probabilities of what happens afterwards, to having been there any number of times before.

Figure: Normal and eigenvector estimates for $P^{xc}(M)$ at $\beta_c$, $L = 32$, plotted against $M$ (from $-1024$ to $1024$). Legend: P(M) from runs 1-500, using C; P(M) from runs 1-500, using evecJ; P(M) from runs 1-500, using evecJJ; the flat value 1/1025.

We have also investigated local spin-spin correlation functions to corroborate the claim that the distribution of microstates within each macrostate, $P(\sigma|i)$, has its Boltzmann value to good accuracy. Let the spins be represented by $s$ and let $\mathbf{r}$ be a vector from one lattice site to another. Then $\gamma(\mathbf{r}, M)$, the correlation function with magnetisation $M$, is defined by

P

ssr s

sr M given that r M

0

r

0

s s

d

ssr M L

M

d

M L

where $\mathbf{r}_0$ runs over the lattice sites. To estimate the spin-spin average at separation $\mathbf{r}$ and magnetisation $M$ we calculate

$$L^{-d} \sum_{\mathbf{r}_0} s(\mathbf{r}_0)\, s(\mathbf{r}_0 + \mathbf{r})$$

and then our estimator is the average of this quantity over those configurations which have magnetisation $M$. We consider $\mathbf{r}$ only along the rows and columns of the Ising lattice, and average over equivalent directions, so that $\mathbf{r}$ becomes a scalar $r$ in units of the lattice spacing.
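The lattice sum above, and its average over configurations of a given magnetisation, might be coded along the following lines. This is only a sketch: the function names are hypothetical, the lattice is taken to be a square array of $\pm 1$ spins, and periodic boundary conditions are assumed.

```python
from collections import defaultdict


def magnetisation(spins):
    """Total magnetisation M of a square lattice of +1/-1 spins."""
    return sum(sum(row) for row in spins)


def spin_spin_average(spins, r):
    """L^-d times the sum over sites of s(r0) s(r0 + r).

    The separation r is taken along the rows and along the columns, and
    the two directions are averaged. Periodic boundaries are assumed.
    """
    size = len(spins)
    total = 0
    for i in range(size):
        for j in range(size):
            s = spins[i][j]
            total += s * spins[i][(j + r) % size]  # neighbour along the row
            total += s * spins[(i + r) % size][j]  # neighbour along the column
    return total / (2.0 * size * size)


def correlation_by_magnetisation(configs, r):
    """Average the lattice sum over configurations sharing the same M."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for spins in configs:
        m = magnetisation(spins)
        sums[m] += spin_spin_average(spins, r)
        counts[m] += 1
    return {m: sums[m] / counts[m] for m in sums}
```

A fully aligned lattice gives 1 at every separation, while a checkerboard configuration gives $-1$ at $r = 1$ and $+1$ at $r = 2$, which provides a quick sanity check.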

We have measured $\gamma(r, M)$ for the $L = 32$ 2d Ising model under two conditions: first, for a series of runs like those that produced the eigenvector estimator; and second, for a single long run containing the same number of configurations in total, starting in a random configuration (therefore probably near $M = 0$) and allowed to equilibrate, before gathering data, for twice as long as required for a random walk from one ground state ($|M| = L^d$) to the other and back. The first set of conditions thus gives $\gamma(r, M)$ as produced by the eigenvector estimator, and the second gives it under multicanonical equilibrium conditions. The results are shown in the accompanying figure: it is apparent that the two correlation functions are identical to within experimental error. In particular, it should be noted that there is no evidence of asymmetry between $+M$ and $-M$ in the eigenvector runs, even though the visited-states histogram $C$ is decidedly asymmetric.

As regards the shape of the correlation functions themselves, we observe that, as we would expect, $\gamma(r, M)$ decreases with increasing $r$ and is close to zero for large $|M|$. If $r$ is small it then increases to a maximum at $M = 0$, but for $r \approx L/2$ it is negative around $M = 0$, a consequence of the tendency of the system to exist in large clusters of opposite magnetisations.

To summarise, then: we have shown that we can remove the bias of the starting state in MC simulation by using the transition matrix to measure macrostate probabilities, combined with the multicanonical ensemble's ability to reach non-degenerate macrostates (so that we do not have to worry about the probability distribution of microstates within the starting macrostate). This opens up the possibility of doing simulation without full equilibration of the preweighted variable, and thus of massively parallel implementations of the multicanonical ensemble in which each processor does only a short run on a large system.

We do not, however, recommend using this method for simulation on serial computers because, as we said earlier, the TP method performs fewer spin updates per second than does the normal VS method. If there is no problem with bias, as is the case with a serial implementation where a single long Markov chain can be produced, then any extra speed is clearly advantageous. It is also obviously better not to have to do bias correction if it can be avoided.

Figure: Estimators of the correlation function $\gamma(r, M)$ for the $L = 32$ Ising model, plotted against $M$ for $r = 1, 2, 3, 5, 7, 10, 15$. Error bars are not shown, but their size is comparable with the scatter of the symbols. Upper diagram: results from a series of short multicanonical runs, each starting in the same ground state ($|M| = L^d$). Lower diagram: results from a single long multicanonical run after equilibration.

If the system is so large that only a few random walks over the preweighted variable can be produced with the serial machine, then no accurate estimate of the macrostate probabilities can be made by either the VS or the TP method.


Results

In this section we shall present estimates of free energy and canonical averages of the 2d Ising model, which will allow a comparison between the multicanonical ensemble (with $E$ as the preweighted variable), thermodynamic integration and exact results. Then we shall present some new results, where we use multicanonical sampling to check analytic predictions about the form of $P^{can}(M)$ for the 2d Ising model at $\beta_c$.

Free Energy and Canonical Averages of the 2d Ising Model

Measurement of free energy is, of course, our central concern in this thesis, so we shall begin with measurements of $g = G/L^d$ made using the multicanonical ensemble for the $L = 16$ and $L = 32$ 2d Ising model. The multicanonical distribution for $L = 16$ was obtained by the TP method, and that for $L = 32$ by finite-size scaling followed by refining with the VS method. Once a suitable multicanonical distribution was found, VS estimators were used for all production runs. The production runs for each system size were generated in ten blocks, with jackknife blocking used to estimate errors for all results. The early iterations, which contribute only to finding the multicanonical distribution, required additional lattice sweeps.

Free energy results are shown in the accompanying figures, along with the exact finite-size results (solid line). The error bars are much smaller than the data points (triangles). The multicanonical distributions that we used for these measurements were designed for the measurement of $g$ over a chosen range of $\beta$, as described earlier, so we did not make any particular effort to extend the multicanonical sampling to energies lower than the peak of $P^{can}(E)$ at the upper end of this range, or higher than its peak at the lower end. We would therefore expect to be able to determine $g$ accurately within this range, and this is indeed found to be the case. For $L = 16$, $P^{can}(E)$ at the upper end of the range gives appreciable weight to the ground state, so we can in fact calculate $g$ for any higher value of $\beta$ too, and the figure shows this extended range. For $L = 32$ this was not possible, but we were able to find $g$ up to the largest $\beta$ shown in the figure. The standard-deviation error bars were in all cases very small compared with $g$ itself over the range of $\beta$ investigated; the scale of the inset in the $L = 32$ figure gives some idea of the accuracy obtained. The later figures, in the section where the multicanonical results are compared with thermodynamic integration, show the difference between the MC estimates of $g$ and the exact value.

Figure: Free energy $g$ of the $L = 16$ Ising model, plotted against $\beta$: exact result and multicanonical estimates.

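By reweighting to undo the multicanonical preweighting, and normalising to recover the canonical distribution, we can also measure $\langle E \rangle$ and the specific heat capacity

$$C_H = \beta^2 \left( \langle E^2 \rangle - \langle E \rangle^2 \right).$$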
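The reweighting step can be sketched as follows. We assume, purely for illustration, that the energy histogram `counts` was collected from a sampled distribution proportional to $P^{can}(E)\exp[w(E)]$, where `log_weights` holds the known $w(E)$; the names, the sign convention and the use of units with $k_B = 1$ are assumptions of the sketch rather than the notation of the text.

```python
import math


def canonical_averages(energies, counts, log_weights, beta):
    """Recover <E> and C_H = beta^2 (<E^2> - <E>^2) by reweighting.

    counts[k] is the number of samples with energy energies[k], drawn
    from a distribution proportional to P_can(E) * exp(log_weights[k]).
    """
    kept = [(e, c, w) for e, c, w in zip(energies, counts, log_weights) if c > 0]
    # Work with logarithms so that very small probabilities do not underflow.
    logs = [math.log(c) - w for _, c, w in kept]
    shift = max(logs)
    probs = [math.exp(value - shift) for value in logs]
    norm = sum(probs)
    probs = [p / norm for p in probs]

    mean_e = sum(p * e for p, (e, _, _) in zip(probs, kept))
    mean_e2 = sum(p * e * e for p, (e, _, _) in zip(probs, kept))
    heat_capacity = beta ** 2 * (mean_e2 - mean_e ** 2)
    return mean_e, heat_capacity
```

If the weights are chosen so that the sampled histogram is flat, the recovered canonical probabilities are simply proportional to $\exp[-w(E)]$, which is the situation the multicanonical ensemble aims for.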

Again, we can also calculate these exactly. The exact results for $e \equiv \langle E \rangle / L^d$ and for $c_H \equiv C_H / L^d$ are shown in the accompanying figures. The anomalous behaviour in the critical region, associated with the continuous phase transition in the $L \to \infty$ limit, is clearly visible: the gradient of the internal energy becomes steeper as $L$ increases, this manifesting itself also in the increasing height of the peak in the specific heat capacity. We also show the corresponding results from a single multicanonical ensemble simulation of the $L = 32$ system. It is apparent that agreement with the exact results (solid lines) is very good. In the main figures, which show the full range of $\beta$, the errors are much smaller than the symbols; to see deviations from the exact results we must magnify smaller regions. The largest fractional error in $e$ occurs very near the critical point, where $d\langle E \rangle / d\beta$ is large; away from criticality the typical error is smaller. For the specific heat, the largest fractional error again occurs at the critical point. We do not show results for $L = 16$, which are very similar.

Figure: Free energy $g$ of the $L = 32$ Ising model, plotted against $\beta$: exact result and multicanonical estimates. Inset: detail of the region $0.555 \le \beta \le 0.575$, with the vertical scale greatly expanded.

A Comparison Between the Multicanonical Ensemble and Thermodynamic Integration

We have also performed simulations using thermodynamic integration to determine $g$. As we described earlier, the derivatives of free energies can be related to more accessible canonical averages; in this case we have used

$$\frac{\partial (\beta g)}{\partial \beta} = e .$$

We therefore made measurements of $e$ using Boltzmann sampling simulations for a set of evenly spaced values of $\beta$, investigated in sequence. An interpolating spline was fitted to the data points and integrated numerically with respect to $\beta$; $g$ was then found by using $\lim_{\beta \to 0} \beta g = -\ln 2$. The lengths of the runs at each temperature were chosen to enable a more or less direct comparison with the multicanonical ensemble. Each simulation was started using the finishing configuration from the previous one, to reduce the equilibration required, which was confined to a limited number of lattice sweeps.

Figure: Exact internal energy per spin of the $L = 16$ and $L = 32$ Ising model, plotted against $\beta$.
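The integration step can be illustrated with a short sketch. To keep it self-contained we replace the interpolating spline by the trapezoidal rule, and we test it on the infinite 1d Ising chain, for which $e = -\tanh\beta$ and $\beta g = -\ln(2\cosh\beta)$ are known in closed form; both choices are assumptions made for the example and are not the procedure or the system used in the text.

```python
import math


def integrate_beta_g(betas, e_values):
    """Integrate d(beta g)/d(beta) = e, starting from beta g = -ln 2 at beta = 0.

    betas must start at 0 and increase; e_values[k] is the measured
    internal energy per spin at betas[k]. The trapezoidal rule is used
    here as a simple stand-in for integrating an interpolating spline.
    """
    if betas[0] != 0.0:
        raise ValueError("the first integration point must be beta = 0")
    beta_g = [-math.log(2.0)]
    for k in range(1, len(betas)):
        step = betas[k] - betas[k - 1]
        beta_g.append(beta_g[-1] + 0.5 * step * (e_values[k] + e_values[k - 1]))
    return beta_g
```

With 101 evenly spaced points between $\beta = 0$ and $\beta = 1$ the result for the 1d chain agrees with the closed form to better than $10^{-4}$. A sharply varying $e(\beta)$, as near a phase transition, would need many more points, which is the difficulty discussed below.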

The results are shown in the two difference figures (for $L = 16$ and $L = 32$). We show both the results from thermodynamic integration (circles) and those from the multicanonical ensemble (triangles). So that the accuracy obtainable is clearly visible, we have plotted the difference between $g$ from simulation and the exact $g$. For $L = 16$ at all temperatures, and $L = 32$ at small $\beta$, the two methods yield comparable accuracy (although the multicanonical results are a little better), and the error bars include the line $g^{MC} - g^{exact} = 0$, as they should. However, for $L = 32$ a large deviation from the exact result appears in the thermodynamic integration points near the critical point. This is not a random error (the error bars, which represent the measured spread of the estimator from blocking, are approximately the same size as those on the multicanonical data) but is instead a systematic error, caused by the presence of a phase transition on the path of integration. This is a problem that often severely reduces the accuracy of simulations that use thermodynamic integration. Here, the infinite-volume Ising model has a continuous phase transition at $\beta_c = \frac{1}{2}\ln(1 + \sqrt{2}) \approx 0.4407$, and for the $L = 32$ system $e$ changes so rapidly with $\beta$ around this point that the data points are inadequate to determine its shape, producing the systematic error in $g$. The shape of the deviation (first positive, then negative) suggests that the corners of a sharp sigmoid curve are being smoothed away. To reduce this error we would have to space the integration points differently, clustering them around the phase transition point.

By contrast, no such special care is required in the application of the multicanonical ensemble. It is untroubled by the Ising phase transition, and indeed can even be used to sample through a first-order phase transition. The error bars get larger because of the large critical fluctuations, but they still contain the line $g^{MC} - g^{exact} = 0$. The multicanonical error bars, unlike those on the thermodynamic integration points, thus still provide a trustworthy confidence limit on the accuracy of the results.

Figure: Exact specific heat capacity per spin, $c_H$, of the $L = 16$ and $L = 32$ Ising model, plotted against $\beta$.

We speculated earlier that consideration of the algorithm used, and of the estimators of free energy defined on the configurations produced, could be partially separated. To investigate this we have tried a multicanonical-integration hybrid, where $g$ is estimated by first finding the internal energy $e$ from the results of the multicanonical ensemble, then integrating it with respect to $\beta$. The variation of $e$ through the critical region is tracked well this time, so we would not expect systematic errors. In fact we find that exactly the same estimators are produced this way as by direct estimation, right down to the seventh significant figure. There is therefore no advantage to the procedure.

It has been suggested that thermodynamic integration along a path that avoids a phase transition outperforms the multicanonical ensemble (or related methods like overlapping distributions) for large $L$, because it does not require that all macrostates be sampled. We see no evidence of this, but we have not examined sufficiently large system sizes. For the $L = 32$ thermodynamic integration, the system was still sufficiently small that $P^{can}(E)$ at each simulation point overlapped significantly with its neighbours. This suggests that we would need to be dealing with extremely large systems before this became an issue, and it is precisely in large systems that the behaviour around phase transitions is too singular for thermodynamic integration to cope with.

Figure: The specific internal energy $e$ for the $L = 32$ Ising model, plotted against $\beta$. Solid line: exact results. Points: MC results. Inset: detail of the region $0.43 \le \beta \le 0.45$.

To summarise, then: we have found that multicanonical sampling performs at least as well as thermodynamic integration in a single-phase region, and is much better able to deal with phase transitions. The question of the superiority of thermodynamic integration in very large systems has not been resolved, but it is clear that it could only apply to large systems away from phase transitions, where the internal energy is very smoothly varying.

Figure: The specific heat capacity $c_H$ for the $L = 32$ Ising model, plotted against $\beta$. Solid line: exact results. Points: MC results. Inset: detail of the region $0.420 \le \beta \le 0.450$.

$P^{can}(M)$ at $\beta_c$

As we have already discussed in earlier sections, the pdf of the magnetisation of the Ising model has an unusual form at the critical point, related to the critical scaling of the free energy $G$ with $L$ (see appendix A). It is well known that at $\beta = \beta_c$, with no external field and for large $L$,

$$P^{can}_L(m)\, dm \simeq p^*(x)\, dx$$

d

where m M L x mm and

d

L m M

L

where $p^*(x)$, which is in general non-Gaussian, is unique to a particular universality class (a universality class is a collection of possibly highly disparate systems, united by spatial dimensionality and certain gross features of their interactions, which are found to have very similar critical behaviour), and $\delta$ is the equation-of-state exponent ($\delta = 15$ for the 2d Ising universality class). $p^*(x)$ for this universality class is shown in the figure of the critical probability density function.

Figure: Difference $g^{MC} - g^{exact}$ between various MC estimates of the free energy (thermodynamic integration and multicanonical) and the exact free energy, plotted against $\beta$, $L = 16$.

From rigorous results and scaling arguments we may make the following ansatz for $p^*(x)$ at large $x$:

$$p^*(x) \simeq \tilde{p}^*(x) \equiv p_\infty\, x^{(\delta - 1)/2} \exp\left(-a_\infty x^{\delta + 1}\right),$$

i.e. for the 2d Ising universality class

$$p^*(x) \simeq p_\infty\, x^{7} \exp\left(-a_\infty x^{16}\right).$$

The form of this function is shown in the figure of the ansatz; it is clearly in at least qualitative agreement with the measured $p^*(x)$ at large $x$.

The form of the prefactor, $x^{(\delta - 1)/2}$, is in accord with a recent theory that relates $p^*(x)$ to the stable distributions of probability theory. However, this theory also suggests the existence of further, non-universal contributions to $p^*(x)$, which would fall off as a power at large $x$ and so be asymptotically dominant. To see whether these non-universal corrections exist, we need to compare the ansatz quantitatively with an accurate measurement of $p^*(x)$, i.e. of $P^{can}(M)$ at $\beta_c$, in the large-$x$ regime. This measurement cannot be performed by Boltzmann sampling: whatever the exact form of $p^*(x)$ may be, it clearly falls off very fast for large $x$, and it will be impossible to measure it accurately far into the tail. The ground state, by comparison, lies at $x_{max} \sim L^{d/(1+\delta)}$, which tends, albeit very slowly, to infinity in the thermodynamic limit, and for $L = 64$ is at $x \approx 1.61$.

Figure: Difference $g^{MC} - g^{exact}$ between various MC estimates of the free energy (thermodynamic integration and multicanonical) and the exact free energy, plotted against $\beta$, $L = 32$.

We therefore used a multicanonical simulation, with the preweighting function arranged so that the sampled distribution of $M$ was flat over the entire range $M = -L^d$ to $M = +L^d$. The usual reweighting then enables us to recover $P^{can}(M)$, and thus $p^*(x)$, accurately measured over the whole range of $M$. The inset in the figure of $p^*(x)$ shows the canonical probabilities of the ground state and first excited state of the system, measured this way. To be certain we were in a regime where corrections-to-scaling were small, we investigated quite large system sizes, $L = 32$ and $L = 64$; the latter is about as large as is feasible using single-spin-flip Metropolis, whose acceptance ratio falls off like $L^{-d}$ near the ground state, for reasons we explained earlier. Because the exact scaling form of $P^{can}(M)$ at $\beta_c$ was unknown (or rather was part of what we wanted to investigate), we used the TP method to generate the multicanonical distribution initially, moving to the visited-states (VS) method for final refinement and for the production stage.

Figure: The critical probability density function $p^*(x)$ of the scaled order parameter $x$ for the 2d Ising universality class, determined by MC simulation. $p^*(x)$ is symmetrical about $x = 0$. The inset shows the canonical probability of the ground state and first excited states of the 2d Ising model, which lie in the extreme tail of $p^*(x)$ (near $x \approx 1.613$, where $p^*(x)$ is of order $10^{-85}$) and have been measured by multicanonical simulation as described in the text.

For $L = 32$ we did a number of iterations of the TP method on an HP workstation, the early ones taking minutes each and the last taking hours. Then we allowed the simulation to proceed using the VS method, with a gradually (automatically) increased $N_{AV}$. For $L = 64$ we again did several iterations of the TP method, each taking some hours, then switched to the VS method. The details of the implementation of the VS method are in the table. The TP program used was an early version, which converged much more slowly than the one that produced the results described earlier; it had not in fact quite converged when we moved to VS estimators.

Figure: The ansatz $\tilde{p}^*(x) = p_\infty\, x^{7} \exp(-a_\infty x^{16})$, plotted against $x$.

Table: Details of the visited-states sampling for $L = 32$ and $L = 64$ with magnetisation preweighting. Columns: $L$; $N_{AV}$; iterations; lattice sweeps per iteration; time per iteration (h).

During both the final refinement and production stages we allowed continued updating of the preweighting function, using the method described earlier to incorporate prior information. Finally we reanalysed the results, combining the estimates according to the prescription given earlier, but using only the last seven and ten iterations for the two system sizes respectively, for which all sampled distributions had been approximately multicanonical. Thus we avoided the bias that would have resulted from using the early estimators, which were not multicanonical, while still avoiding, at run-time, the division of the process into "finding" and "production" phases (or at least, we were able to decide a posteriori where the production phase was to begin). This produces a best estimate of the preweighting function; the corresponding best estimate of $P^{can}(M)$ is recovered from it by reweighting. The final preweighting function corresponds to many decades of variation in $P^{can}(M)$ for both system sizes. This $P^{can}(M)$ was used to produce the graph of $p^*(x)$. From the value of the preweighting function at $M = \pm L^d$ we can estimate $g(\beta_c)$, as described earlier. We find, using the difference between the values at $M = +L^d$ and $M = -L^d$ to estimate the error, the remarkably accurate estimates

g L cf g L from

c c

g L cf g L from

c c

These demonstrate the accuracy with which the ground state probabilities (as shown in the inset of the figure of $p^*(x)$) have been measured. Now let us consider the predictions of the scaling ansatz for $p^*(x)$. The figure of $q^*(x)$ shows $q^*(x) \equiv -\ln\left[x^{-7} p^*(x)\right]$, estimated from the multicanonical results, plotted against $x^{16}$. According to the ansatz, we would expect this to be linear with gradient $a_\infty$. In fact there is a linear regime for medium-sized $x$, with a deviation at both ends (though the deviation near the origin cannot be seen on this graph because of the scale). At small $x$ we find that $p^*(x)$ is larger than expected from the ansatz, while at large $x$ it is smaller. The low-$x$ deviation comes from the approach to zero-magnetisation states, to which the ansatz ascribes zero probability, while the high-$x$ one is a non-universal finite-size effect, caused by the underlying microscopic structure of the system becoming apparent as we approach the ground state. The relative weights of the ground state and first excited state for the Ising model, for example, must be in the ratio $1 : L^d \exp(-8\beta_c)$, in contradiction to the scaling ansatz, while some different expression will apply to the pdf of, for example, the density of the Lennard-Jones fluid. However, in line with $x_{max}$ tending to infinity with $L$, the breakdown of the scaling ansatz occurs at larger $x$ for the larger system.
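The linearity test described above can be sketched in a few lines. The code forms $q^*(x) = -\ln[x^{-7} p(x)]$ and fits a straight line against $x^{16}$ by ordinary least squares; the function names are hypothetical, and in the check the values of the ansatz amplitudes are arbitrary illustrative numbers, not the fitted values of the text.

```python
import math


def q_star(x, p, prefactor_power=7):
    """q*(x) = -ln[x^(-prefactor_power) * p(x)] for a single point."""
    return -math.log(p / x ** prefactor_power)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

If $p(x)$ follows the ansatz exactly, the fitted slope is the exponential coefficient and the intercept is minus the logarithm of the amplitude. Applying the same fit within a sliding window of $x$ values would show where measured data depart from the linear form.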

The figure of $q^*(x)$ also shows that we can discount the suggestion that the asymptotic behaviour of $p^*(x)$ is a power-law decay. If it were, then we would expect that, at large enough $x$, $q^*(x) \sim \ln x$, which would deviate downwards (i.e. with a negative second derivative) from the prediction. Instead, the observed deviation from the ansatz has the opposite sense.

In defining $q^*(x)$ we included an $x^{-7}$ to cancel the expected $x^{7}$ in the ansatz for $p^*(x)$. However, we would also like to demonstrate that the polynomial prefactor in $p^*(x)$ is indeed $x^{7}$. This is not easy to do, because the effect of the polynomial is of course dominated by the very strong exponential decay. Nevertheless, we have tried fitting functions of the form

$$p\, x^{\psi} \exp\left(-a x^{16}\right)$$

to $p^*(x)$ (determined by MC) over a series of windows of $x$-values, choosing the values of $p$, $\psi$ and $a$ to give the best fit. The results for the best-fit $\psi$, plotted against the central $x$ of the window, are shown in the figure of the window-fitting results. It is apparent that there is reasonable agreement between the measured $\psi$ and the expected value $\psi = 7$ for a range of intermediate $x$-values (i.e. before the exponential decay becomes too strong), and that the width of this range is larger for the larger system ($L = 64$, triangles on the figure). It is another demonstration of the accuracy obtainable with multicanonical simulations that we have been able to pick out a polynomial prefactor from data with an overall exponential behaviour.

Footnote 6: At least for the Ising model, for which, it must be admitted, non-universal correction terms are often found to have zero amplitude because of the high symmetry.

Figure: The function $q^*(x) \equiv -\ln\left[x^{-7} p^*(x)\right]$ plotted against $x^{16}$ for $L = 32$ and $L = 64$. We expect this relationship to be linear with gradient $a_\infty$ in the region where the ansatz applies.

Figure: Results of window-fitting the exponent $\psi$ (plotted as $\psi/7$ against the central $x$ of the window). Squares: $L = 32$; triangles: $L = 64$. The lines are guides to the eye only.

Finally in this section we shall use $p^*(x)$ at large $x$ to determine the scaling amplitude $U_0$, where $U_0$ is the constant term in the expansion, in powers of $L$, of the free energy $G$ at $\beta_c$ (see appendix A). To carry out this calculation we introduce the function

$$F^*(y) \equiv \ln \int dx\, p^*(x) \exp(y x),$$

where $y$ is a scaled version of the external field $H$. The integrand in $F^*(y)$ is thus a scaled, unnormalised version of $P^{can}(M)$ at $\beta_c$ in the field $H$. Now, if we use the scaling ansatz $\tilde{p}^*(x)$ for $p^*(x)$, it can be shown using steepest-descents arguments that, for large enough values of $y$,

$$F^*(y) \simeq b_\infty\, y^{1 + 1/\delta} + U_0 ,$$

where we have made the identification

p

ln U

a

in terms of quantities in equation

We are now in a position to determine $U_0$: we determine $F^*(y)$ through numerical integration of its defining equation, using multicanonical results for $p^*(x)$, then plot $F^*(y)$ against $y^{1+1/\delta}$, obtaining an estimate for $U_0$ by extrapolating the linear form back to $y = 0$. In the figure of $F^*(y)$ we show the graph of $F^*$, the main figure showing just the region near the origin and the inset showing the whole range of $y$, up to the value for which the fully saturated $|M| = L^d$ state was the most probable. The figure of the effective $U_0$ shows the ordinate intercept obtained by fitting the data lying within a window of $y$ values, plotted against the central value of $y$ in the window.
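The numerical integration for $F^*(y)$ can be sketched as below. The integral is evaluated by the trapezoidal rule with the largest term factored out, so that the very small tail values of $p^*(x)$ do not underflow. As a check we use a unit Gaussian in place of the critical $p^*(x)$, for which $F^*(y) = y^2/2$ exactly; this stand-in, and the function name, are assumptions of the example.

```python
import math


def f_star(xs, ps, y):
    """ln of the integral of p(x) exp(y x) dx on the grid xs (trapezoidal rule).

    xs : increasing grid of x values
    ps : strictly positive values of p(x) on that grid
    """
    # Logarithm of the integrand at each grid point.
    logs = [math.log(p) + y * x for x, p in zip(xs, ps)]
    shift = max(logs)
    scaled = [math.exp(value - shift) for value in logs]
    integral = sum(
        0.5 * (scaled[k] + scaled[k + 1]) * (xs[k + 1] - xs[k])
        for k in range(len(xs) - 1)
    )
    return shift + math.log(integral)
```

Evaluating `f_star` on a series of `y` values and fitting a straight line against $y^{1+1/\delta}$ within a window would then give an effective intercept of the kind plotted in the text.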

Figure: The function $F^*(y)$, determined from multicanonical measurements, plotted against $y^{1+1/\delta}$. The main figure shows the behaviour near the origin, while the inset shows the behaviour over the whole range of $y$ studied.

It is apparent that convergence to the large-$y$ form occurs rapidly, within the interval in which the field $H$ reaches the size required to drive the system out of the critical region, to a region where the steepest-descents arguments are valid. Within the linear region the estimates of $U_0$ are in good agreement with the exact result, although much the best estimates of $U_0$ come from fairly small $y$ values (see the inset of the figure of the effective $U_0$). There are three reasons for this behaviour. One is that the process of extrapolating back to $y = 0$, for a window of $y$ values that is far from the origin, magnifies the error in the intercept. The second is that the width of $p^*(x)\exp(yx)$ gets smaller as the effective field $y$ increases (reflecting the behaviour of the susceptibility); thus at large $y$, $F^*(y)$ is dominated by only a few points of $p^*(x)$, and its random errors are therefore larger. Third, and probably most importantly, at large $y$ the states that dominate $F^*(y)$ come from the extreme tail of $p^*(x)$, where we have just shown (see the figure of $q^*(x)$) that the finite-size $p^*(x)$ deviates appreciably from the ansatz $\tilde{p}^*(x)$. Thus we would expect a small systematic change in the effective $U_0$.

Figure: The effective value of $U_0$, given by the ordinate intercept of the linear fit to $F^*(y)$ within a window of $y$ values, plotted against the central value of $y$ in the window (horizontal axis: $y^{1+1/\delta}$). Inset: detail of the region of small $y$.

We should compare these results with those of previous work, where $U_0$ was estimated in the same way but using results from Boltzmann sampling MC simulations. Because $p^*(x)$ was not well determined for large $x$, the fitting there was limited to small $y$, though, as we have seen, this is in fact all that is required for a good estimate of $U_0$.

A brief discussion of the physical significance of $U_0$ can be found in appendix A.


Beyond Multicanonical Sampling

It must be admitted that she has some beautiful notes in her voice. What a pity

it is that they do not mean anything, or do any practical good.

FROM The Nightingale and the Rose

OSCAR WILDE

We shall now expand the scope of our discussion of the multicanonical ensemble, putting it in a more general framework of other importance-sampling distributions, including the expanded ensemble. While discussing the expanded ensemble we shall present new results on the scaling of its expected MC error. Then we shall use similar analysis to predict, for the first time, the expected error of an estimator for various non-Boltzmann distributions, including the multicanonical, and we shall identify a near-optimal sampled distribution for a particular quantity $O$ and algorithm (sampled distributions of any desired shape may be produced by a simple generalisation of the methods described earlier). We shall then check our predictions by explicit MC measurement of the variance of $O$.

The Multicanonical and Expanded Ensembles

We shall now make more explicit the connection between the multicanonical ensemble and the expanded ensemble that we first mentioned when discussing them in an earlier chapter. The apparent differences between the two, in the way they have been formulated up to now, come from the choice of ensemble. We can put the two into the same framework by considering what the multicanonical ensemble would be like if applied to ensembles other than the $NVT$ ensemble, or what the expanded ensemble would be like if it were not made from $NVT$ ensembles. In the same way, it was necessary in an earlier chapter to consider the Ising ferromagnet in the $NVT$ ensemble and the fluid system in the $NpT$ ensemble in order for the similarity between them to become apparent.

Consider, for example, a multicanonical system with the preweighting coefficients depending on the magnetisation $M$. The multicanonical partition function is

X

exp E exp M Z

f g


We can cast this into the form of the expanded ensemble (cf. equation ) by writing

\[ Z = \sum_{M} Z(M) \exp[\eta(M)] , \]

where

\[ Z(M) = \sum_{\{\sigma\}} \delta_{M, M(\sigma)} \exp[-\beta E(\sigma)] \equiv \exp[-\beta F(M)] . \]

So from this perspective the multicanonical ensemble appears as an expanded ensemble composed of fixed-$M$ ensembles.
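The regrouping of the configuration sum into a sum over fixed-$M$ partition functions can be checked by brute-force enumeration on a very small system. The sketch below is only an illustration: the open 8-spin Ising chain, the value of $\beta$ and the weight function `example_eta` are arbitrary assumptions, not taken from the text, and the sign convention assumes a sampled weight $\exp[-\beta E + \eta(M)]$.

```python
import itertools
import math
from collections import defaultdict


def chain_energy(spins, coupling=1.0):
    """Energy of an open 1D Ising chain, E = -J * sum_i s_i * s_{i+1}."""
    return -coupling * sum(a * b for a, b in zip(spins, spins[1:]))


def partition_functions(n_spins, beta, eta):
    """Return (Z_direct, Z_expanded, Z_of_M) for a weight function eta(M).

    Z_direct sums exp(-beta*E) * exp(eta(M)) over all configurations.
    Z_expanded first accumulates the fixed-M partition functions Z(M)
    and then sums Z(M) * exp(eta(M)) over M.
    """
    z_direct = 0.0
    z_of_m = defaultdict(float)
    for spins in itertools.product((-1, 1), repeat=n_spins):
        boltzmann = math.exp(-beta * chain_energy(spins))
        magnetisation = sum(spins)
        z_direct += boltzmann * math.exp(eta(magnetisation))
        z_of_m[magnetisation] += boltzmann
    z_expanded = sum(z_m * math.exp(eta(m)) for m, z_m in z_of_m.items())
    return z_direct, z_expanded, dict(z_of_m)


def example_eta(magnetisation):
    """Hypothetical preweighting, chosen only for illustration."""
    return -0.05 * magnetisation * magnetisation


z_direct, z_expanded, z_of_m = partition_functions(8, 0.4, example_eta)
print(z_direct, z_expanded)
```

The two totals agree to rounding error because the second is only a reordering of the first, with each $Z(M)$ playing the role of one subensemble.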

This example is perhaps a little awkward, since we are not accustomed to thinking in terms of the fixed-$M$ ensemble. However, in just the same way, an expanded ensemble where each subensemble is microcanonical is exactly equivalent to the multicanonical ensemble with $E$ the preweighted variable. Another example is a non-Boltzmann sampling $NpT$ simulation with an $\eta(V)$ designed to increase exploration of the volume macrostates, which can naturally be regarded both as an expanded ensemble generated by putting together canonical ensembles differing in $V$, and as a multicanonical $NpT$ ensemble with $\eta$ providing a non-canonical weighting to the different volumes. We shall use just such a simulation in chapter . A grand canonical ensemble (canonical ensembles weighted by $\mu N$) with an extra $\eta(N)$ can be considered in the same way. The expanded ensemble is thus the more general concept: a multicanonical ensemble can always be described in terms of an expanded ensemble.

Sometimes, then, it is not even clear what a particular sampling scheme should be called, though 'multicanonical' seems to be established for $\eta(E)$ and $\eta(M)$, 'expanded ensemble' for the system with subensembles having different energy functions, and 'expanded ensemble' or 'simulated tempering' for the system with subensembles having different temperatures. Because there are no Gibbsian ensembles analogous to the systems with variable temperatures or energy functions, they can only really be thought of as expanded ensembles; the issue is the status of the multicanonical ensemble. We suggest a classification on the basis of the behaviour of the order parameter rather than the nature of the subensembles: thus expanded ensembles would always be related to a $G(x)$, where $x$ is an intensive field variable ($\beta$, for example), while multicanonical ensembles would be related to $F(X)$, where $X$ is an extensive mechanical variable such as $E$ or $M$. With this nomenclature, the ensemble with fixed $N$, $p$ and $T$, and $\eta(V)$ to make $P(V)$ flat over some range of $V$, would be called multicanonical. This classification has the advantage of generally reflecting the way that the quantities of interest are extracted from the simulation. In multicanonical simulations the required free energies are found by reweighting and summing over all values of the preweighted variable (e.g. equation ), while with the expanded ensemble they come from the probabilities of the end states of the chain of subensembles only (e.g. equation ).

Because of the similarity between them, all the methods described in this chapter for finding $\eta$ for the multicanonical ensemble are also applicable to the expanded ensemble, provided that the notions of microstates and macrostates are redefined slightly: the microstates become the joint set $\{\{\sigma\},\{\beta\}\}$ of coordinates and temperatures (taking the temperature-expanded ensemble as an example), and the macrostates become the canonical subensembles of which the expanded ensemble is composed. In particular, the transitions between the subensembles still form a Markov process and the TP method of section can be applied, but with the proviso that some care must be taken in the choice of starting state. Since we cannot choose a starting state that contains only one microstate (every state of the expanded ensemble contains all the microstates of its constituent canonical ensembles), we must make sure that the simulation has had time to equilibrate within the starting state before we permit it to move to the other subensembles.

We should also note that parallelism of the geometric decomposition kind is more naturally applied to an expanded ensemble than to a multicanonical ensemble. Any parallel coordinate updating that could be applied to the canonical subensemble if it were a simulation in its own right can clearly also be applied in the expanded ensemble simulation, and it would never occur to us to change the value of the field property that varies between the subensembles (temperature or whatever) in one part of the simulation volume but not another. In the multicanonical ensemble, conversely, the need to keep the preweighted variable constant during any parallel updating may well affect our choice of the kind of particle moves or spin flips that are done. Nevertheless, the similarity between the two methods remains in so far as the preweighted variable, whether it be the energy/order parameter or the temperature/Hamiltonian, must be explored serially.

The Random Walk Problem

The description of the dynamics of the expanded ensemble as a Markov process enables us to derive an important new result about the expected error of expanded ensemble simulations. In these simulations the quantity we wish to measure is simply the ratio of the probabilities of two of the subensembles in the chain, usually the two at the ends; this makes the analysis here easier than for the multicanonical ensemble. For good accuracy the system must visit both ends of the chain several times, and so the accuracy is ultimately limited by the time $\tau_{rw}$ to do a one-dimensional random walk over $N_m$, the number of subensembles in the chain (assumed not to be very small). This time is $\tau_{rw} \sim N_m^2$. We now outline an argument suggested by this fact, but which we think is fallacious, that expanded-ensemble-type calculations can have their accuracy improved if the chain of subensembles is divided up into pieces and the sampling performed in each piece separately. We then go on to give what we think is the correct argument.

First, then, the fallacious argument.

Suppose that the underlying probability distribution is such that each subensemble has equal probability, but that we are pretending that we do not know this and are trying to measure $P_i$. This situation will in fact be realised to a very good approximation in the production stage of real applications, and the small deviations from a constant $P_i$, while being exactly what we want to measure, will have no effect on the random walk arguments that we are about to give.

Suppose we generate $N_c$ counts in total in our histogram of the occupancies of the $N_m$ subensembles. If there were no correlations, then the number of configurations going into the $i$th subensemble would be $C_i$, which would have a binomial distribution with mean $\langle C_i \rangle = N_c/N_m$ and variance $N_c (1/N_m)(1 - 1/N_m) \approx N_c/N_m$. Thus we would estimate $P_i$ by $P_i \overset{eb}{=} C_i/N_c$ (with expectation $\langle C_i \rangle / N_c = 1/N_m$), with expected error

\[ \frac{\delta C_i}{\langle C_i \rangle} \approx \sqrt{\frac{N_m}{N_c}} . \]

In practice, adjacent configurations are highly correlated. Now, as we have seen, even though the Markov chain we are simulating is really a microscopic one between the microstates of the system, we can treat the process that describes transitions between subensembles as an effective Markov chain in its own right and, given that equation is satisfied, it obeys its own detailed balance condition that we can use to estimate the stationary probability distribution. Because the underlying Markov process is highly correlated, so is the effective process: each ensemble will usually only make transitions to its neighbours; indeed, we shall usually restrict attempted moves to the neighbours to save wasting effort in attempting transitions that are almost certain to be rejected. It is now tempting to apply equation , equating the effective correlation time $\tau$ that multiplies the variance of the estimators obtained with the random walk time $\tau_{rw} \sim N_m^2$. This would give for the correlated case

\[ \frac{\delta C_i}{\langle C_i \rangle} \sim \sqrt{\frac{N_m^3}{N_c}} . \]

This would also be the scaling of the fractional error in the ratio $r \equiv C_1/C_{N_m}$, the estimate of $P_1/P_{N_m}$, assuming that the ends are far enough apart for the errors to be independent:

\[ \frac{\delta r}{r} \approx \sqrt{\frac{\mathrm{var}(C_1)}{\langle C_1 \rangle^2} + \frac{\mathrm{var}(C_{N_m})}{\langle C_{N_m} \rangle^2}} . \]

This result implies that we could achieve a better accuracy by dividing up the $N_m$-state expanded ensemble into $b$ groups, with one state or a few states overlapping between adjacent groups. Neglecting the overlaps, the number of states in each group would be $m = N_m/b$, and we would devote time $N_c/b$ to each. The estimator of $r$ would be

\[ r \overset{eb}{=} \prod_{j=1}^{b} \frac{C_1^{(j)}}{C_m^{(j)}} \]

and then equations and would give

\[ \frac{\delta r}{r} \sim \sqrt{b} \, \sqrt{\frac{(N_m/b)^3}{N_c/b}} = \sqrt{\frac{N_m^3}{b N_c}} \]

implying that the error decreases as $b \to N_m$, i.e. as the expanded ensemble is divided up into smaller and smaller pieces. We expect that in practice errors would eventually begin to increase again, because if the groups were very small (large $b$) a large fraction of transitions would have to be rejected as taking us outside the group, and possibly because correlation times within some subensembles may become longer if they cannot connect with others at higher temperatures. It is not clear, however, what the best value of $b$ to choose would be: we would have to investigate each $b$ separately and measure the error directly. The core of this argument, that the correlation time $\tau$ is the same as the random walk time $\tau_{rw}$, can be found in one form or another in the literature.

Now let us do a rather more careful analysis of the effective Markov chain, using results from chapter . Define state occupancies $C_i(N_c|j)$, the number of times that state $i$ is recorded in $N_c$ steps of the chain given that it starts in state $j$. It can be shown that in the large-$N_c$ limit ($N_c \gg N_m^2$) the effect of the starting state $j$ disappears:

\[ \langle C_i(N_c|j) \rangle \to \langle C_i(N_c) \rangle = N_c P_i = N_c/N_m \]

as before, and

g

v ar C N jj N P P t

i c c i i

ii

where the $t^g_{ii}$ are components of the 'transient sum matrix' $T^g$, which is given by

g

T I

g

It appears that we need to do a full matrix inversion to find $T^g$, which will cost of order $N_m^3$ operations, but in fact we can do it in far fewer using a trick (see chapter ), which depends on the fact that $\Pi$ is sparse and that its limiting form has all its rows equal. This makes the calculation of $\mathrm{var}(C)$ economical even for large $N_m$, requiring less than one minute on a workstation.

Let us test these predictions using a simple Markov model where there are no microstates within each state of the chain. We make the simple choice

for i j and i N j N

m m

for i j and i N j N

m m

ij

for i N and j i i i

m

otherwise

This has eigenvector $P_i = 1/N_m$, as required. In figure we show the behaviour of $N_c^{-1}\mathrm{var}(C_i)$ and $\sqrt{N_c \, \mathrm{var}(C_i)}/\langle C_i \rangle$ as a function of $N_m$ for $i = 1$ and $i = N_m/2$. It is clear that, except at very small $N_m$, $N_c^{-1}\mathrm{var}(C)$ is roughly constant rather than increasing like $N_m$, and $\sqrt{\mathrm{var}(C)}/\langle C \rangle$ increases linearly with $N_m$ rather than going like $N_m^{3/2}$. $\delta r / r$ will behave similarly, except that it will show an even smaller deviation from linearity, because $C_1$ and $C_{N_m}$ are strongly anticorrelated at small $N_m$ whereas they are roughly independent for large $N_m$.

[Figure: two log-log panels plotted against $N_m$, each with curves for state 1 and state $N_m/2$. Left panel: $\mathrm{var}(C)/N_c$. Right panel: $(N_c \, \mathrm{var}(C))^{1/2}/\langle C \rangle$.]

Figure: Behaviour of variance (left) and fractional error (right) of the number of visits $C$ to states 1 and $N_m/2$ in a 1D random walk over $N_m$ states with reflecting boundaries.

If we now consider subdividing the range $N_m$ into $b$ pieces, we find

\[ \frac{\delta r}{r} \sim \sqrt{b} \, \frac{N_m/b}{\sqrt{N_c/b}} = \frac{N_m}{\sqrt{N_c}} \]

so this time the variance of the estimator $r$ is not decreased by subdividing $N_m$, and therefore in an expanded ensemble calculation we may as well always do a single random walk over the whole range, to get whatever benefits of improved ergodicity are available. We shall present results in chapter that demonstrate that this behaviour is indeed what is found in a real expanded ensemble calculation.
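The scaling of $\mathrm{var}(C_i)/N_c$ with $N_m$ can be explored numerically for a small reflecting random walk. The following is a minimal sketch under stated assumptions: the move probability of 1/3 per neighbour is an illustrative choice (the values used in the text are not reproduced here), and the variance is obtained from the standard fundamental-matrix result for ergodic Markov chains by brute-force inversion, not by the faster trick referred to above. The names `reflecting_walk`, `stationary_distribution` and `visit_variance_rate` are hypothetical.

```python
import numpy as np


def reflecting_walk(n_states, p_move=1.0 / 3.0):
    """Transition matrix for a 1D walk over n_states with reflecting ends.

    ASSUMPTION: each allowed neighbour move has probability p_move and the
    remainder is the probability of staying put. The matrix is symmetric,
    so the stationary distribution is uniform, P_i = 1/n_states.
    """
    pi = np.zeros((n_states, n_states))
    for i in range(n_states):
        for j in (i - 1, i + 1):
            if 0 <= j < n_states:
                pi[i, j] = p_move
        pi[i, i] = 1.0 - pi[i].sum()
    return pi


def stationary_distribution(pi):
    """Solve p = p @ pi together with sum(p) = 1 by least squares."""
    n = pi.shape[0]
    lhs = np.vstack([pi.T - np.eye(n), np.ones((1, n))])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    return np.linalg.lstsq(lhs, rhs, rcond=None)[0]


def visit_variance_rate(pi):
    """Large-N_c limit of var(C_i) / N_c for every state i.

    Uses the fundamental matrix Z = (I - pi + 1 p^T)^(-1) and the limiting
    variance p_i * (2 * z_ii - 1 - p_i) for the number of visits.
    """
    n = pi.shape[0]
    p = stationary_distribution(pi)
    z = np.linalg.inv(np.eye(n) - pi + np.outer(np.ones(n), p))
    return p * (2.0 * np.diag(z) - 1.0 - p)


rates_by_size = {n: visit_variance_rate(reflecting_walk(n)) for n in (50, 200)}
for n, rates in rates_by_size.items():
    print(n, "end state:", rates[0], "central state:", rates[n // 2])
```

With these assumed move probabilities the printed rates change very little between the two chain lengths, which is the behaviour described above; a different choice of move probabilities would change the prefactors but should not change the scaling.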

Where, then, does the mistake lie in the fallacious argument we gave first? Equation is certainly applicable, since it is based on very general principles of time-series analysis. The error comes in fact from assuming that the effective correlation time $\tau_O$ in equation is the same as, or of the same order as, $\tau_{rw}$. The operators we need to consider are simply

\[ O_i(\sigma) = \begin{cases} 1 & \text{if } \sigma \in i \\ 0 & \text{otherwise} \end{cases} \]

so that $N_c \overline{O_i} = C_i$.

It is shown in appendix C (equations C and C) that

\[ \tau_{O_i} = \sum_{t} \rho_i(t) \]

where $t$ labels time in units of the basic update and the correlation functions $\rho_i(t)$ are

\[ \rho_i(t) = \frac{\langle O_i(0) O_i(t) \rangle - \langle O_i \rangle^2}{\langle O_i^2 \rangle - \langle O_i \rangle^2} \]

so, using equation , it follows that here the $\rho_i(t)$ are

\[ \rho_i(t) = \frac{N_m P(i,t|i,0) - 1}{N_m - 1} \]

where $P(i,t|i,0)$ is the probability that the simulation is found in state (subensemble) $i$ at time $t$, given that it was there at time 0. Summing the $\rho$s can be done analytically for the case of a random walk with periodic boundary conditions; the result is $\tau_{O_i} \sim N_m$. This is clearly also the result for the central state $i = N_m/2$ for a random walk with reflecting boundaries, which is what we have in an expanded ensemble simulation. For the other states of the random walk with reflecting boundaries, $\rho_i(t)$ can be summed numerically, and this reveals the same dependency on $N_m$, though $\tau_{O_i}$ is larger. The key result, then, is that $\tau_{O_i} \sim N_m$; when substituted into equation this gives the same form for the scaling of the error in $r$ as comes from our analysis of $\mathrm{var}(C)$ in equation . There is thus no contradiction between these two expressions, and equating the expressions for the variance we find

g

t

ii

O

i

P

i

thus relating $T^g$ to the more familiar $\tau_{O_i}$, though this equation is true only for the simple delta-function form for $O$ that is given in equation .
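The numerical summation of $\rho_i(t)$ for the reflecting walk can be sketched as follows. This is an illustration under assumptions, not the calculation behind the results quoted above: the move probability of 1/3 per neighbour, the cutoff of `n_relax * n_states**2` steps and the function names are all hypothetical choices, and the sum is taken from $t = 0$.

```python
import numpy as np


def reflecting_walk(n_states, p_move=1.0 / 3.0):
    """Symmetric nearest-neighbour walk with reflecting ends (assumed model)."""
    pi = np.zeros((n_states, n_states))
    for i in range(n_states):
        for j in (i - 1, i + 1):
            if 0 <= j < n_states:
                pi[i, j] = p_move
        pi[i, i] = 1.0 - pi[i].sum()
    return pi


def summed_correlation(n_states, state, p_move=1.0 / 3.0, n_relax=10):
    """Sum rho_i(t) = (N_m * P(i,t|i,0) - 1) / (N_m - 1) over t >= 0.

    The probability vector is propagated step by step from a start in
    `state`; the sum is cut off after n_relax * n_states**2 steps, many
    times the slowest relaxation time of this walk.
    """
    pi = reflecting_walk(n_states, p_move)
    prob = np.zeros(n_states)
    prob[state] = 1.0
    total = 0.0
    for _ in range(n_relax * n_states ** 2):
        total += (n_states * prob[state] - 1.0) / (n_states - 1.0)
        prob = prob @ pi
    return total


centre_20 = summed_correlation(20, 10)
centre_40 = summed_correlation(40, 20)
end_20 = summed_correlation(20, 0)
print(centre_20, centre_40, end_20)
```

Doubling the number of states roughly doubles the summed correlation function in this sketch, rather than quadrupling it as a naive identification with $\tau_{rw}$ would suggest, and the end state gives a larger sum than the central one.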

It is certainly rather surprising that $\tau_{O_i}$ should increase more slowly than $\tau_{rw}$, but we may rationalise this by noting that the initial decay of the correlation functions $\rho_i(t)$ depends on the average time to diffuse away from the starting state, which is essentially a local property and so is scarcely affected at all by $N_m$. It is only the long-time tail of the decay, where the system is returning to its starting state after wandering far away, that depends strongly on $N_m$. The interplay of these two effects gives rise to the observed behaviour. We should also emphasise that the result $\tau_{O_i} \sim N_m$ does depend on the condition that the run-time of the simulation should be $\gg \tau_{rw} \sim N_m^2$.

Optimal Sampling

We now return to more general questions of MC importance sampling and ask what sampled distribution $P(\sigma)$ is optimal, given an algorithm, for the measurement of $\langle O \rangle_{can}$, the canonical average of $O$, an operator on the configurations $\sigma$. An estimator of $\langle O \rangle_{can}$ can be found using equation for any sampled distribution $P(\sigma)$, though in general a choice that is not tuned in some way to $O$ and the Boltzmann distribution will not give a useful estimator in a time that is not exponentially long. Clearly it is desirable that the standard error of the estimator obtained should be as small as possible, so that computer time can be used as efficiently as possible: this is what we mean by 'optimal'. In what follows we shall concentrate on operators on $E$-macrostates, in particular the 'free energy operator' $O = \exp(\beta E)$, and we shall parameterise $P(\sigma)$ as usual by a set of weights $\eta$. We shall use $\tilde{O}_\eta$ to mean the ratio estimator of $\langle O \rangle_{can}$ coming from the sampled distribution defined by $\eta$.

can

Two concepts introduced by Hesselb o and Stinchcombe are useful here for explaining

the requirements of optimal sampling They serve to make more concrete ideas that we have

already discussed or alluded to These concepts are ergodicity which is measured by and

O

pertinence measured by N I which is the average number of indep endent samples that are

s

required to obtain the information that is sought so here I O The total time required

for the problem is thus N O and this should b e minimised as a function of the

O s

weights Of the two the p ertinence is in fact the easier quantity to handle and the problem

of nding the sampled distribution with the b est lowest p ertinence was solved analytically by

Fosdick by minimising O O the result which is unique given O and the

system is

CHAPTER MULTICANONICAL AND RELATED METHODS

O

f s

exp E P

O

can

or for O O E

O E

f s

exp E P E E

O

can

which in terms of corresp onds to

O E

f s

E ln

O

can

This distribution seems never to have b een used in practice its implementation is complicated

by the app earance of O itself in the expression and its ergo dicity turns out to b e very

can

p o or for reasons we shall describ e b elow

The ergodicity is rather harder to deal with. It depends not only on $O$ and the system but also on the algorithm, so general solutions like equation cannot be given. Moreover, it is usually not analytically tractable, and while $\tau_O$ can be measured by simulation for a particular $\eta$, this does not, at least if standard techniques are used, tell us about $\tau_O$ for any other sampled distribution. Thus, while we could envisage finding the minimum-variance $P(\sigma)$ by this method, treating $\tau_O$ or $\mathrm{var}(\tilde{O}_\eta)$ as a function of the $\eta$s to be minimised with respect to them, this would be extremely time-consuming, because there are $N_m$ more or less independent variables in $\eta$ and each function evaluation requires an entire MC simulation. Such a procedure is likely to waste more computer time than we could hope to recoup from more efficient sampling.

We shall now go on to discuss issues related to optimal sampling in greater depth, using the notions of pertinence and ergodicity where appropriate. We shall show how an expression for the sampled distribution that is very similar to equation follows from consideration of the structure of the ratio estimator $\tilde{O}_\eta$, and discuss the ergodicity of $P^{fs}$ and other sampled distributions. Then, in section , we shall give new theory showing how measurement of the macrostate transition matrix for one sampled distribution (the multicanonical is best for this purpose) enables us to estimate it for any other. From this we can make an approximate calculation of the error in the ratio estimator for any other sampled distribution, implicitly including the effect of correlations, without needing to calculate $\tau_O$ itself.

Pertinence

As we first said in section , the numerator and denominator of equation are dominated by energies around the peaks of $P^{can}(E) O(E)$ and $P^{can}(E)$ respectively, as shown for $O = \exp(\beta E)$ in figure . The states that lie between these peaks contribute hardly anything to either the numerator or denominator of the equation; the weight they are given by the multicanonical distribution serves only to enable the system to tunnel between the two. This is irrelevant to pertinence, which is related only to the information provided by independent samples, so it is clear that the pertinence can be improved by downweighting these states. Indeed, within the peaks themselves the contribution of a particular macrostate to the integral is proportional to its value of $P^{can}(E) O(E)$ or $P^{can}(E)$, so the maximally-pertinent $P(E)$ should have a shape that follows the shapes of these peaks. All that then remains is to determine the relative weights to be assigned to sampling the two peaks, and it follows from simple error propagation that the fractional error of the ratio estimator is minimised when the fractional errors of its numerator and denominator are equal. Thus, without any detailed calculation, we arrive at the ansatz

\[ P(E) \propto \frac{P^{can}(E) O(E)}{\int_E P^{can}(E) O(E) \, dE} + \frac{P^{can}(E)}{\int_E P^{can}(E) \, dE} \]

\[ \propto P^{can}(E) O(E) \, \frac{\int_E P^{can}(E) \, dE}{\int_E P^{can}(E) O(E) \, dE} + P^{can}(E) \]

\[ = P^{can}(E) \left[ \frac{O(E)}{\langle O \rangle_{can}} + 1 \right] \]

i.e.

\[ \eta(E) = \ln \left[ \frac{O(E)}{\langle O \rangle_{can}} + 1 \right] . \]

These equations differ from the analogous equation only in a single sign. For $O = \exp(\beta E)$, $P^{fs}(E)$ and $P(E)$ are almost identical; $P(E)$ is shown for the Ising model in figure . Only in $\ln P(E)$ would any difference be apparent: $\ln P^{fs}(E) \to -\infty$ at $O(E) = \langle O \rangle$, while (as can be seen from the inset) $P(E)$ is in fact finite there, though very small. Thus the high pertinence of $P^{fs}(E)$ is justified intuitively.

[Figure: $P(E)$ plotted against $E$ on a linear vertical scale, with an inset showing the same curve on a logarithmic vertical scale.]

Figure: Main figure: the sampled distribution $P(E)$ for the Ising model. Inset: the same but with logarithmic vertical scale.

For other operators $P^{fs}(E)$ and $P(E)$ are not so similar. If $O$ is not a particularly rapidly increasing function of $E$, then the peaks of $P^{can}(E) O(E)$ and $P^{can}(E)$ will overlap. $P(E)$, which is a weighted sum of the two peaks, then becomes a single-peaked function of $E$, while $P^{fs}(E)$, which always gives zero weight to the state where $O(E) = \langle O \rangle$, retains its double-peaked structure. For $O = E$ this means that the configurations at $E = \langle E \rangle$ are not sampled: an extreme contrast to Boltzmann sampling, where these configurations are among the likeliest to occur.
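To make the comparison between the best-pertinence distribution and the equal-weight ansatz concrete, both can be evaluated for a toy model with a known density of states. The sketch below is an illustration only: the model of `n_units` independent two-level units with $\Omega(E) = \binom{n}{E}$, the value of $\beta$ and all function names are assumptions introduced here, and $O = \exp(\beta E)$ is taken as the operator.

```python
import math

import numpy as np


def toy_distributions(n_units=40, beta=1.0):
    """Canonical, best-pertinence and ansatz distributions for a toy model.

    ASSUMPTION: energies E = 0..n_units with density of states
    Omega(E) = C(n_units, E); the operator is O(E) = exp(beta * E).
    """
    energies = np.arange(n_units + 1)
    omega = np.array([math.comb(n_units, int(e)) for e in energies], dtype=float)
    p_can = omega * np.exp(-beta * energies)
    p_can /= p_can.sum()

    observable = np.exp(beta * energies)
    mean_o = float(np.sum(p_can * observable))   # <O>_can
    ratio = observable / mean_o

    # Best-pertinence form: P_can * |O/<O> - 1|, normalised.
    p_fs = p_can * np.abs(ratio - 1.0)
    p_fs /= p_fs.sum()

    # Ansatz: equal total weight on the peaks of P_can * O and P_can.
    p_ansatz = 0.5 * p_can * (ratio + 1.0)
    eta_ansatz = np.log(ratio + 1.0)
    return energies, p_can, ratio, p_fs, p_ansatz, eta_ansatz, mean_o


energies, p_can, ratio, p_fs, p_ansatz, eta_ansatz, mean_o = toy_distributions()
print("weight on numerator peak:", np.sum(0.5 * p_can * ratio))
print("smallest ansatz probability:", p_ansatz.min())
```

In this sketch the ansatz is automatically normalised, because each of its two terms integrates to one half, and it stays strictly positive everywhere, whereas the best-pertinence form falls towards zero where $O(E)$ crosses $\langle O \rangle_{can}$.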

Let us now comment on the multicanonical ensemble's pertinence. There is in fact no canonical average for which it has the best pertinence; rather, it is the best if one is interested in knowing $\Omega(E)$ for all $E$ with constant fractional accuracy. Nevertheless, while its pertinence is not optimal, because of the weight given to states between the peaks (and possibly above and below them), it has what has been called good worst-case pertinence (and also reasonable ergodicity, a point we shall return to). By this we mean that whatever operator $O$ we choose, the multicanonical ensemble will have sampled that part of macrostate space, and so at least a tolerably accurate estimator of $\langle O \rangle$ will be obtained. The observables that it estimates worst are those that depend only on a narrow region of macrostate space, for example $\langle E \rangle_{can}$; for these, the effort that the multicanonical simulation puts into sampling all the other macrostates is wasted. But even in the worst case, the multicanonical distribution could never need more than $N_m$ times more independent samples than the optimal distribution (this in the case that the spectrum of the observable was so narrow that it depended only on one macrostate). This is in contrast to Boltzmann sampling, and indeed to the optimised sampled distributions $P^{fs}(E)$ and $P(E)$. If we wish to find the expectation of a different operator, they will in general put an exponentially small fraction of their weight in the region of macrostate space which dominates the ensemble average, and so would require $O(\exp(L^d))$ times more samples. Multicanonical sampling can never have this problem, because it puts an equal amount of weight in every region of macrostate space.

It has been argued that the sampled distribution with the best worst-case pertinence is the $1/k$ ensemble, in which each configuration is sampled with probability inversely proportional to $k(E) = \int^{E} \Omega(E') \, dE'$ (see also section ). This gives rather more weight to low-energy states than does the multicanonical ensemble. However, the scaling of the pertinence (a factor $\ln \Omega_{TOT} \sim N_m$ worse than an ideal estimator) is the same as that of the multicanonical distribution, so any improvement is presumably only in the size of the prefactor.

Ergodicity

We shall now discuss, in qualitative terms, the ergodicities of these various sampled distributions when implemented with an algorithm (like single-spin-flip Metropolis) that can make transitions only over very short distances in macrostate space. For the case $O = \exp(\beta E)$ it is apparent that the distributions $P^{fs}$ and $P$ in fact have very poor ergodicity: the sampling probability is exponentially small between the two peaks for $P$, and zero for $P^{fs}$, which makes tunnelling between them extremely slow. Thus an MC simulation of normal length is in fact likely to spend all its time in the region of one of the peaks of $P^{fs}$ and not to sample the other at all; $\tilde{O}$ from equation then has effectively infinite error, and nothing has been gained over Boltzmann sampling, where the similarly enormous error is due to lack of pertinence. Thus it becomes apparent that the demands of pertinence and ergodicity may well be mutually contradictory: if, to improve the pertinence of the sampled distribution, the states between the peaks are downweighted, the ergodicity will suffer, and the net effect may be to degrade overall performance. For operators such as $O = E$, $P^{fs}$ would still suffer from severe ergodic problems while $P$ would become satisfactory. However, it can scarcely be claimed that non-Boltzmann sampling is necessary to estimate $\langle E \rangle$.

The multicanonical ensemble, being flat over all accessible macrostates, has no such self-inflicted ergodicity problems (though with the Metropolis algorithm the step size is not large and the acceptance ratio becomes low near the ground state; see section ). Thus the tunnelling time between the regions that are important in the ratio estimator is $\tau_{rw} \sim N_m^2$. The $1/k$ ensemble has similarly robust ergodicity, with $\tau_{rw}$ scaling in the same way. It has been shown that the acceptance ratio of the $1/k$ ensemble may be better than that of the multicanonical ensemble, so giving slightly better ergodicity.

It is impossible to make this discussion more quantitative while, as we have said, analytical calculation of $\tau_O$ is impossible and measuring it by MC from visited states for more than a few sampled distributions would be prohibitively expensive. Thus, while we might imagine that the true optimal sampled distribution for $\exp(\beta E)$ would be, say, similar to the multicanonical but giving less weight to the states between the peaks (though more than that assigned by $P$ or $P^{fs}$), we cannot say exactly what the trade-off between ergodicity and pertinence should be.

Use of the Transition Matrix: Prediction of the Optimal Distribution

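We shall now show how, at least for the $O = O(E)$ problem, the difficulties caused by our inability to calculate $\tau_O$ may be skirted by using macrostate transition information. Suppose we carry out a simulation with some weighting $\eta$ and estimate the macrostate transition matrix $\Pi_{ij}(\eta)$ using equation . Now, for any microstates $r$, $s$ in $i$ and $j$ respectively, we have

\[ \pi_{rs}(\eta) = R_{rs} \min\bigl(1, \exp[-\beta(E_s - E_r) + \eta_s - \eta_r]\bigr) \]

where the matrix $R$ (defined in section ) describes which transitions are allowed. Similarly, for some other set of weights $\eta'$,

\[ \pi_{rs}(\eta') = R_{rs} \min\bigl(1, \exp[-\beta(E_s - E_r) + \eta'_s - \eta'_r]\bigr) \]

so, returning to equation , we find that for $j \neq i$, $\Pi_{ij}(\eta')$ becomes

\[ \Pi_{ij}(\eta') = \sum_{s \in j} \sum_{r \in i} P(r|i) \, \pi_{rs}(\eta') \]

\[ = \sum_{s \in j} \sum_{r \in i} P(r|i) \, \pi_{rs}(\eta) \, \frac{\min\bigl(1, \exp[-\beta(E_s - E_r) + \eta'_s - \eta'_r]\bigr)}{\min\bigl(1, \exp[-\beta(E_s - E_r) + \eta_s - \eta_r]\bigr)} \]

\[ = \sum_{s \in j} \sum_{r \in i} P(r|i) \, \pi_{rs}(\eta) \, \frac{\min\bigl(1, \exp[-\beta(E_j - E_i) + \eta'_j - \eta'_i]\bigr)}{\min\bigl(1, \exp[-\beta(E_j - E_i) + \eta_j - \eta_i]\bigr)} \]

\[ = \Pi_{ij}(\eta) \, \frac{\min\bigl(1, \exp[-\beta(E_j - E_i) + \eta'_j - \eta'_i]\bigr)}{\min\bigl(1, \exp[-\beta(E_j - E_i) + \eta_j - \eta_i]\bigr)} . \]

$\Pi_{ii}(\eta')$ follows from

\[ \Pi_{ii}(\eta') = 1 - \sum_{j \neq i} \Pi_{ij}(\eta') . \]

Equations and show that we can calculate the macrostate transition matrix for any desired weighting if we have measured it for a single weighting. So that $\Pi_{ij}$ is accurately determined for all macrostates, it is obviously a good choice to use the multicanonical weighting, $\eta = \eta^{xc}$.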

Note that in the derivation of equation , in order to be able to take the ratio of the Metropolis functions outside the sums over the microstates, it is necessary that $E_r$ and $\eta_r$ should be the same for all $r \in i$, and similarly for $s \in j$; thus the derivation we have just given is valid only for energy macrostates. From equation , the detailed balance condition on macrostate transitions, it follows that for any system where the TP method is valid we can write

\[ \frac{\Pi_{ij}(\eta')}{\Pi_{ji}(\eta')} = \frac{\Pi_{ij}(\eta)}{\Pi_{ji}(\eta)} \, \frac{\exp(\eta'_j - \eta'_i)}{\exp(\eta_j - \eta_i)} \]

but it is not clear how the normalisation of the individual terms $\Pi_{ij}(\eta')$ follows from this.
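The reweighting of a measured transition matrix from one set of weights to another is simple to express in code. The sketch below assumes the sign convention of a sampled weight $\exp[-\beta E + \eta(E)]$ with Metropolis acceptance, and tests the rule on a hypothetical toy chain of energy macrostates in which moves to the two neighbouring macrostates are proposed with probability 1/2 each; the function names and all numerical values are illustrative assumptions.

```python
import numpy as np


def metropolis_factor(beta, energies, eta):
    """Matrix of min(1, exp(-beta*(E_j - E_i) + eta_j - eta_i))."""
    d_energy = energies[None, :] - energies[:, None]
    d_eta = eta[None, :] - eta[:, None]
    return np.minimum(1.0, np.exp(-beta * d_energy + d_eta))


def reweight_transition_matrix(pi_old, beta, energies, eta_old, eta_new):
    """Predict the macrostate transition matrix for eta_new from eta_old.

    Off-diagonal elements are scaled by the ratio of Metropolis factors;
    the diagonal is then fixed by requiring each row to sum to one.
    """
    ratio = (metropolis_factor(beta, energies, eta_new)
             / metropolis_factor(beta, energies, eta_old))
    pi_new = pi_old * ratio
    np.fill_diagonal(pi_new, 0.0)
    np.fill_diagonal(pi_new, 1.0 - pi_new.sum(axis=1))
    return pi_new


def exact_chain(beta, energies, eta, p_propose=0.5):
    """Toy macrostate chain: propose i -> i +/- 1, accept with Metropolis."""
    n = len(energies)
    accept = metropolis_factor(beta, energies, eta)
    pi = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                pi[i, j] = p_propose * accept[i, j]
        pi[i, i] = 1.0 - pi[i].sum()
    return pi


beta = 0.7
energies = np.linspace(-4.0, 4.0, 9)
eta_old = 0.3 * energies            # arbitrary starting weights
eta_new = -0.1 * energies ** 2      # arbitrary target weights

pi_measured = exact_chain(beta, energies, eta_old)
pi_predicted = reweight_transition_matrix(pi_measured, beta, energies,
                                          eta_old, eta_new)
pi_direct = exact_chain(beta, energies, eta_new)
print(np.max(np.abs(pi_predicted - pi_direct)))
```

In this toy case the prediction matches the directly constructed matrix to rounding error. In a real calculation `pi_measured` would be an MC estimate, so the statistical errors of the measured elements would carry over to the prediction.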

Now we wish to move from this to an estimate of $\mathrm{var}(\tilde{O}_{\eta'})$, the variance of the ratio estimator derived from sampling with $\eta'$. Rather than trying to calculate $\rho_O(t)$ and $\tau_O$, we shall use an approximate method, casting the problem into the same form as an expanded ensemble calculation and bringing to bear the same Markov-chain machinery, just as in section . This will give a useful qualitative estimate of the error of a particular sampled distribution.

We argue that the dominant source of error in $\tilde{O}$ is the tunnelling back and forth between the states that dominate the peaks of the numerator and denominator of the ratio estimator, not the sampling within each peak. That is to say that the problem is the estimation of the relative weights of the peaks of numerator and denominator, not the shape of each peak individually. So the fractional error in the estimate of the numerator is

\[ \frac{\delta \left[ \sum_E C(E) O(E) \exp(-\eta(E)) \right]}{\sum_E C(E) O(E) \exp(-\eta(E))} \approx \sqrt{\frac{\tau_{in}}{N_p}} \, \frac{\delta C(E_1)}{\langle C(E_1) \rangle} \]

where $E_1$ is some state in the peak of $O(E) P^{can}(E)$ (the mode, say), $\tau_{in}$ is some sort of correlation time for sampling within the peak, and $N_p$ is the width of the peak. We argue that $\tau_{in}$ will be similar for all sensible sampled distributions that we may choose, and so will not affect the comparison of different $\eta$s. Therefore we shall need only to calculate $\sqrt{\mathrm{var}(C(E_1))}/\langle C(E_1) \rangle$. The same arguments can clearly be applied to the denominator, with the mode at $E_2$, say, and lead to

\[ \frac{\delta \tilde{O}}{\tilde{O}} \propto \sqrt{\frac{\mathrm{var}(C(E_1))}{\langle C(E_1) \rangle^2} + \frac{\mathrm{var}(C(E_2))}{\langle C(E_2) \rangle^2}} . \]

This is now of the same form as equation for the estimation of $\delta r / r$, and we can once again use equations and to calculate the right-hand side from the transition matrix.

Thus our procedure, for the specific case of the 2d Ising model, is the following:

1. Measure the macrostate transition matrix $\Pi(\eta^{xc})$ (pentadiagonal) by MC.

2. Calculate $\Pi(\eta')$ (pentadiagonal) using equation .

3. Block $\Pi(\eta')$ into a tridiagonal form. By blocking at this stage we can correctly take into account the variation of the probability of the underlying $E$-macrostates within each blocked macrostate, and so avoid the problems discussed in section .

4. Calculate $\langle C \rangle$ and $\mathrm{var}(C)$ using equations and , and thus estimate the error of $\tilde{O}$.

5. Return to step 2 and repeat until the optimal $\eta'$ is found.

Though this process is vastly faster than performing a full MC simulation for each $\eta'$, it is still not so fast as to make a full multidimensional minimisation over $\eta'$ feasible; there would in any case be little point in this because of the approximations that have been made. Instead, we should choose to examine only certain likely forms of the sampled distribution. To test the predictions of the above theory we have performed simulations using various sampled distributions for the Ising model. Clearly it would be desirable to investigate larger system sizes and different temperatures, but pressure of time has prevented this. For each sampled distribution investigated we initially predict the error of the estimator of $g$ using equation , then perform the simulation and measure it using jackknife blocking of the histograms. We also measure the random walk time $\tau_{rw}$ between the peaks of the distribution and compare it with a prediction made using the mean first passage time $\tau_{ij}$, the average time taken to reach state $j$ for the first time starting at state $i$. Like $\mathrm{var}(C_i)$, $\tau_{ij}$ can be calculated from $\Pi$.
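One standard way to obtain mean first passage times from a transition matrix is to delete the target state and solve a linear system for the remaining states. The sketch below illustrates that textbook approach; it is not necessarily the calculation used for the results reported here, and the function name and the small test chains are assumptions.

```python
import numpy as np


def mean_first_passage_times(pi, target):
    """Mean number of steps to first reach `target` from every state.

    With Q the transition matrix restricted to the non-target states,
    the mean first passage times m satisfy (I - Q) m = 1. The entry for
    the target itself is reported as zero.
    """
    n = pi.shape[0]
    others = [k for k in range(n) if k != target]
    q = pi[np.ix_(others, others)]
    times = np.linalg.solve(np.eye(n - 1) - q, np.ones(n - 1))
    result = np.zeros(n)
    result[others] = times
    return result


# Two-state check: leaving state 0 with probability 0.1 per step.
two_state = np.array([[0.9, 0.1],
                      [0.2, 0.8]])
tau_two = mean_first_passage_times(two_state, target=1)

# Three-state reflecting walk with move probability 1/3 per neighbour.
third = 1.0 / 3.0
three_state = np.array([[2 * third, third, 0.0],
                        [third, third, third],
                        [0.0, third, 2 * third]])
tau_three = mean_first_passage_times(three_state, target=2)
print(tau_two[0], tau_three[0])
```

For a tridiagonal (blocked) transition matrix the same linear system is itself tridiagonal, so a banded solver could replace the dense one if the number of macrostates were large.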

As candidates for the optimal sampled distribution we have chosen to examine distributions interpolating between the best-pertinence $P^{fs}$ distribution and one that is very similar to the multicanonical. These sampled distributions follow the shapes of $P^{can}$ and $O P^{can}$ in the peaks, which are arranged to have equal weight (where the canonical distribution is that at the chosen $\beta$, and $O = \exp(\beta E)$), but also add in a constant weight between them. This weight is parameterised by $w$: for $w \le 1$, $P(w,E)$ between the peaks is linear, passing through the points whose $y$-coordinates are $w$ times the maximum heights of the peaks and whose $x$-coordinates are the $E$-values at the maxima. As $w$ increases we expect the pertinence to get worse, because less time is spent on the peaks, but the ergodicity to improve as $\tau_{rw}$ decreases. The distribution with $w = 1$ is very close to multicanonical, differing from it only because the insistence that both peaks have the same weight means they must have slightly different heights, producing a sampled distribution which is not flat but has a slight slope. We also investigate sampled distributions that put the majority of their weight in the region between the peaks, since we anticipate this will further reduce $\tau_{rw}$. These are also parameterised by $w$, with $w > 1$, but here $P(w,E)$ is chosen such that $\ln P(w,E)$ between the peaks is parabolic, passing through the maxima of $\ln P^{can}$ and $\ln O P^{can}$ and rising half way between them to $w$ times their average height. To make this clear we show the sampled distributions that have been used in figure . To generate them requires only a few seconds.

First, before comparing the predicted and actual error, we show in figure the MC results for $g(w) - g_{exact}$ for the various sampled distributions. We do this to demonstrate that the MC results are consistent with the exact answer, taking errors into account. $g(w)$ is the measured estimator and $g_{exact}$ is its exact value for the Ising model. A total of $N_c$ lattice sweeps were performed for each sampled distribution, in ten blocks. The error bars come from jackknife blocking.
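The jackknife blocking used for the error bars can be sketched as follows. This is a generic illustration under stated assumptions, not the code used for the results above: `blocks` is a list of per-block data (in the thesis these would be block histograms), and `estimator` is any function that maps a list of blocks to a single number; both names are hypothetical.

```python
import math


def jackknife(blocks, estimator):
    """Return (estimate, jackknife error) for estimator applied to blocks.

    The estimator is re-evaluated with each block left out in turn, and the
    spread of these leave-one-out values gives the error estimate.
    """
    n = len(blocks)
    full = estimator(blocks)
    leave_one_out = [estimator(blocks[:k] + blocks[k + 1:]) for k in range(n)]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return full, math.sqrt(variance)


def block_mean(blocks):
    """Simplest possible estimator: the mean of the block values."""
    return sum(blocks) / len(blocks)


ESTIMATE, ERROR = jackknife([1.0, 2.0, 3.0, 4.0, 5.0], block_mean)
```

For a plain mean the jackknife error reduces to the usual standard error; its value lies in nonlinear estimators, such as ratios of reweighted histogram sums, where simple error propagation is awkward.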

Now we proceed to the results that are our real interest: the variance of the estimators of g. We show in figure the size of the error bars on g, both predicted and measured, as a function of w. Because equation is only a proportionality, depending on unknown parameters, we show not the absolute error but the error for each w divided by the error (predicted or measured, as appropriate) from the first point, which is at w = 0.01. We call this quantity $\delta g / \delta g_1$. We also show (figure) the predicted (from $\tau_{ij}$) and measured values of $\tau_{rw}$ between the modes of $P^{can}$ and $O P^{can}$ for the various sampled distributions.

7 This also corrects for the fact that the operator O is directly related to Z and thus to exp g, not g itself.

Figure: The sampled distributions $P(w, E)$, plotted against E, used to test the predictions of the error of g, labelled by the value of the parameter w (w = 0.01, 0.03, 0.4, 1, 4, 32) that determines how much weight is put in the region between the two peaks $P^{can}$ and $O P^{can}$.

Agreement between prediction and experiment is generally very good: the estimate of $\tau_{rw}$ is approximately right throughout the range of w, while the estimate of the error bar $\delta g$ is very good for w < 1, though the w = 1 near-multicanonical distribution has a rather smaller error bar than predicted, and the error for large w is also rather lower than expected. We attribute

some of this discrepancy the low value of g w the fact that g w g w

to error in our estimate of $\delta g_1$ itself, which we have not quantified; but for large w we may be overestimating $\delta g$ because more weight than is taken into account by equation goes into the peaks of $P(w)$. This occurs because we assumed in deriving that equation that the shape of $P(w)$ matches that of $P^{can}$ and $O P^{can}$ in the peaks; but in fact, because the extra weight that is put into the region between the peaks is matched on to them at their maxima, for half the macrostates in the peaks $P(w)$ is substantially larger than it would be if it followed $P^{can}$ and $O P^{can}$. We correctly predicted the distribution that gave the smallest $\delta g$, which was the near-multicanonical one.

The most striking result of the investigation is probably the surprisingly small effect that variation in the sampled distribution has on the estimator of g. It is clear that a wide range of near-multicanonical distributions can be used without any significant effect on the size of the error bar. It is also apparent that, for the single-spin-flip Metropolis algorithm, no significant improvement on the multicanonical distribution seems possible. One's intuition, which perhaps leads one to favour distributions with high pertinence, can be misled because of the need for ergodicity too: the error of the highly pertinent w = 0.01 estimator is appreciably larger than the others, because its long random walk time means that a poorer estimate of the relative weights of numerator and denominator is obtained. The larger-w estimators are better because their ergodicity remains good while at least some appreciable weight is put in the regions that dominate the ratio estimators.

Figure: The difference between the measured estimator g and its exact value $g_{exact}$ for the Ising model, plotted against w (logarithmic axis) for six different sampled distributions parameterised by w. $N_c$ lattice sweeps were performed for each sampled distribution. The error bars come from jackknife blocking.

It seems, then, that the error of a particular sampled distribution can be predicted with reasonable accuracy. Even though no large reduction in error is possible here, we could use the method to decide between the methods of the earlier sections on connecting to infinite-temperature states and connecting to the ground state. Given the transition matrix of a multicanonical distribution extending from the ground state right up to the infinite-temperature states, we should be able to calculate the expected error of g for any β by the two methods. For the method using the probability of the ground state, the 1/k distribution, which was designed for a very similar problem, should be considered as well as the multicanonical distribution.

Figure: Predicted (line) and measured (points) error bars on g as a function of the parameter w, shown as a fraction of their size at w = 0.01 ($\delta g / \delta g_1$). Values are for the Ising model; the MC data were gathered with jackknife blocking.

The reason that scarcely any improvement on multicanonical sampling is possible using the single-spin-flip Metropolis algorithm is that the algorithm is limited by the time taken to move between the two regions that dominate the ratio estimator. Any large improvement would require the use of a different algorithm (cluster flipping, demon, etc.). It seems likely that, as the correlation time decreases, the optimal sampled distribution will become less like the multicanonical and more like Fosdick's, because in the limit of uncorrelated configurations, when pertinence alone need be considered, Fosdick's prescription does give the sampled distribution of lowest variance. Thus we anticipate that for such an algorithm the use of the methods described here could well lead to the prediction of a sampled distribution with a substantially lower variance than the multicanonical, even though it did not for the simple Metropolis algorithm.

Discussion

We have made a thorough investigation of issues related to the production and use of the multicanonical distribution in the free energy measurement problem and elsewhere, and we have also investigated its relationship to the expanded ensemble method and to other non-Boltzmann importance sampling methods. Though the 2d single-spin-flip Ising model has been used throughout, most of our results, particularly in so far as they reflect new methods, are not limited to this system and would be expected to be of wider applicability; we shall use them to study a phase transition in an off-lattice system in the next chapter.

Figure: Predicted (line) and measured (points) values for the random walk time $\tau_{rw}$ between the modes of $P^{can}$ and $O P^{can}$ as a function of the parameter w. Values are for the Ising model; the MC data were gathered with jackknife blocking.

In our investigation of the generation of the multicanonical distribution, we have studied distributions preweighted both in the internal energy and in the magnetisation, the latter case corresponding to the problem of measuring the p.d.f. of the order parameter over a range embracing both the phases, the former corresponding more to the application of the method to find the absolute free energy of a single phase. We looked at several ways of producing a suitable sampled distribution as rapidly as possible. First we examined perhaps the most obvious method, based on the visited states (VS) of the MC algorithm. While the absolute performance of this method was inferior to that of the other methods, because of its slowness in spreading out from the region sampled by the initial Boltzmann sampling algorithm, it did serve to show the usefulness of Bayesian methods in this problem: Bayes' theorem is the natural way of distinguishing fluctuations in the observed data that are really due to the underlying structure of the sampled distribution from fluctuations that are simply the product of the stochastic nature of MC sampling. However, as we saw, the full prior-posterior formulation of the problem leads to the appearance of integrals of some complexity. We used a simple Normal model to treat these approximately, though this extra effort did not help the rate of convergence to the multicanonical distribution. In general, in fact, we should always bear in mind that it is better, as well as easier, in regions where data is sparse to use further MC sampling from an updated distribution to confirm and refine the results of a simple approximate estimation of the underlying sampled distribution, rather than to devote the computer time to a lengthy Bayesian analysis of the sparse data.

Secondly, we introduced a new method, the Transition Probability (TP) method, to try to overcome the slowness of convergence of the VS method by sampling all the macrostates immediately. We found this method to be of use both for energy and magnetisation preweighting, though it performed better for magnetisation: convergence was faster (in fact almost immediate) and there seemed to be little or no residual bias, in contrast to the energy case. Indeed, it was possible to use the TP method not just to generate the multicanonical distribution but to obtain accurate final estimators of the canonical p.d.f. This promises to alleviate to some extent the problems caused by the fact that the multicanonical algorithm cannot be fully parallelised, by permitting the length of a multicanonical run to be shorter than it would have to be if VS estimators were used, while still giving an unbiased estimator. This makes it easier to use massive parallelism. To anticipate future results for a moment, we shall use the TP estimator and massive parallelism in this way in the next chapter, and we shall also do further work which will shed more light on the conditions required for the approximation equation to be satisfied.

In section we have confirmed that the Gibbs free energy g and the canonical averages for the specific internal energy e and specific heat capacity $c_H$ are correctly produced for all values of the inverse temperature β from a single simulation. We have compared the multicanonical ensemble with thermodynamic integration for the problem of measuring absolute free energy, showing that the size of random errors for the two is about the same, but that the multicanonical method copes far better with a singularity (the continuous phase transition at the critical point) on the path of integration. Though it does not relate directly to the problem of free energy estimation, we have also used the ability of the multicanonical method to make accessible states of very low canonical probability to produce new results on the scaling form of $P(M)$ at the critical point.

In the last section we have addressed rather more general questions about importance sampling. First, we have put the multicanonical and expanded ensembles into the same framework, and have shown how the theory we developed for use with the TP estimators of the multicanonical ensemble is also useful in the context of the expanded ensemble, where it enables us to show that simulation within subgroups of the subensembles of which the expanded ensemble consists does not confer any advantage. Second, because we may sometimes be interested only in a single canonical average at a single temperature, we have addressed the question of optimal sampling. We have shown that from TP measurements on a multicanonical distribution we may calculate approximately the expected variance of the estimator from any sampled distribution. We have in fact been able to make only preliminary investigations, looking only at the exponential observable O for the Ising model and investigating various candidate distributions explicitly to confirm that our predictions of their variance are approximately correct. For this observable and the single-spin-flip Metropolis algorithm, the multicanonical distribution turns out to be very near the best that can be used. There are, of course, observables for which it is far from optimal, internal energy at any particular temperature being one. But there are none for which it is bad in the way that Boltzmann sampling is: it can require at most $O(L^d)$ more sampling time than an optimal distribution, whereas Boltzmann sampling can require a time that is longer by a factor that is exponential in the system size.

Chapter

A Study of an Isostructural

Phase Transition

Introduction

Two partially intertwined threads run through this chapter. The first is the continuing study of the techniques relating to the generation and use of the multicanonical/expanded ensemble, and comparison of its efficiency with that of thermodynamic integration (TI). We shall investigate these matters in section by applying TI and the expanded ensemble to the square-well solid, just as in section we applied TI and the multicanonical ensemble to the Ising model. We shall also, in section, continue the exploration of the use of the transition probability estimators introduced in the last chapter (section).

The second thread is the examination of the square-well solid as a system of physical interest in its own right. We shall in particular confirm and elaborate upon recent results that suggest that this system displays an isostructural solid-solid phase transition.

Thus at some times the focus will be on the way that the results obtained by various simulation techniques compare with one another, whereas at others it will simply be on the results themselves and their physical meaning.


The Square-Well Solid and Related Systems

We shall now introduce the system that we shall investigate in this chapter and describe the aspects of its physical behaviour that we shall study. In order to motivate the choice of the square-well solid and to place our work in a wider context, we shall also briefly describe related theoretical and experimental work, in particular on colloidal systems.

Consider a system of N particles moving in a continuous three-dimensional space and interacting with one another via a simple spherically symmetric pair potential $E(r_{ij})$, where $r_{ij}$ is the separation of the centres of two particles i and j. The potential we shall be using consists of a hard-core repulsion, defining the diameter σ of the particles, and a short-range attractive force:

$$E(r_{ij}) = \begin{cases} \infty & \text{if } r_{ij} < \sigma \\ -E_0 & \text{if } \sigma \le r_{ij} \le \sigma(1+\lambda) \\ 0 & \text{if } r_{ij} > \sigma(1+\lambda) \end{cases}$$

as shown schematically in figure.

Figure: Schematic diagram of the pair potential of the square-well solid as a function of r, showing the hard core of diameter σ (where the potential rises to infinity), the well depth $E_0$ and the well width λσ. The width of the well is exaggerated compared with its actual width in the simulations of this chapter.

The width of the attractive well is a fraction λ of the hard-core repulsion distance σ; though λ may be varied at will, we shall in fact always use a single fixed value. We choose to measure inverse temperature β in units of $1/E_0$ ($E_0$ is the depth of the potential well), so that $E(r_{ij}) = -1$ for $r_{ij}$ in the well. We shall deal exclusively in this chapter with densities where the system is solid, and we shall examine only face-centred cubic (fcc) crystals containing N particles, where $N = 4m^3$ for integer m. The total energy of the configuration is the sum over all pairs of particles of $E(r_{ij})$:

$$E(\mathbf{r}^N) = \sum_{i=1}^{N} \sum_{j>i} E(r_{ij})$$

1 $V(r_{ij})$ is more commonly used to represent this, but we reserve V for volume and keep E for internal energy, as in previous chapters.
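The pair potential and the pair sum above can be written down directly. The sketch below is illustrative only: it ignores the periodic boundary conditions and parallel decomposition of the real simulation, the function and parameter names (`square_well`, `total_energy`, `lam`, `e0`) are hypothetical, and the well width used in the example is an arbitrary value rather than the one used in the thesis.

```python
import math
from itertools import combinations


def square_well(r, sigma, lam, e0):
    """Square-well pair energy: hard core below sigma, depth e0, width lam*sigma."""
    if r < sigma:
        return math.inf          # hard-core overlap
    if r <= sigma * (1.0 + lam):
        return -e0               # inside the attractive well
    return 0.0                   # beyond the well


def total_energy(positions, sigma=1.0, lam=0.1, e0=1.0):
    """Sum the pair energy over all distinct pairs (no periodic images)."""
    return sum(square_well(math.dist(a, b), sigma, lam, e0)
               for a, b in combinations(positions, 2))


# Three particles on a line: neighbours sit in each other's wells,
# the outer pair is out of range. lam = 0.1 is an arbitrary example value.
CHAIN = [(0.0, 0.0, 0.0), (1.05, 0.0, 0.0), (2.1, 0.0, 0.0)]
CHAIN_ENERGY = total_energy(CHAIN, sigma=1.0, lam=0.1, e0=1.0)
```

With energies measured in units of $E_0$ (`e0 = 1`), the total energy is simply minus the number of pairs within the well, provided there are no overlaps.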

We shall quantify the density mainly by the volume per particle, $v = V/N$, which we shall call the specific volume. We measure lengths in units of σ, so that for close-packed spheres $v = 1/\sqrt{2}$.

We shall b e concerned with sp ecic volumes in the range v For comparison

the fcc crystal of hard spheres is stable up to v where it melts into a uid of density

v
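The close-packed value of the specific volume follows from the geometry of the fcc lattice. As a short worked step (the cube edge a is a symbol introduced here for the derivation only): the conventional cubic cell of edge a contains four particles, and nearest neighbours are separated by $a/\sqrt{2}$. At close packing the nearest neighbours touch, so

```latex
\frac{a}{\sqrt{2}} = \sigma
\quad\Longrightarrow\quad
v_{\mathrm{cp}} = \frac{a^{3}}{4} = \frac{(\sqrt{2}\,\sigma)^{3}}{4}
               = \frac{\sigma^{3}}{\sqrt{2}} \approx 0.7071\,\sigma^{3}.
```

In units where σ = 1 this is the value $1/\sqrt{2}$ quoted above.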

The form of the potential defined in equation does not bear more than a very rough resemblance in shape to the pair potentials usually employed in modelling the interactions between atoms, such as the familiar Lennard-Jones potential $E_{LJ}(r_{ij}) = 4E_0\left[(r_0/r_{ij})^{12} - (r_0/r_{ij})^{6}\right]$, which is softer (everywhere differentiable) and has a much wider attractive well relative to the repulsive part, with a long-range attractive tail. However, while extremely idealised, it does bear a closer resemblance to the effective potential that may be induced between the particles in colloidal systems.

Colloids, which occur frequently in nature, especially in biological systems, consist of particles of one material suspended in a medium of another. The diameter of the particles lies in the nanometre-to-micrometre range, and both colloid particles and medium may be either solid, liquid or gas, though the overwhelming majority of scientific studies have been on colloids consisting of solid particles in a liquid medium. A detailed review of the properties of colloidal suspensions can be found elsewhere; here we are mainly concerned with the equilibrium phase behaviour of monodisperse colloids. By suitable stabilisation we can obtain colloids where the particles behave like hard spheres; then, by adding a polymer to the solution, we can induce an attractive force between the colloid particles. This force is strictly a many-body entropic effect: when the colloid particles are close together, the number of accessible polymer configurations is greater than when they are separated. However, the many-body part of this effective interaction comes from excluded polymer configurations that would intersect three or more colloid particles, so if $R_g$, the radius of gyration of the polymer coils, is small (as will always be the case here), then the many-body part is also small and the force can be well described quantitatively by an effective pair potential called a depletion potential. The depletion potential can be thought of as an osmotic effect: if two colloid particles approach closer than a distance set by $R_g$, the polymer between them is squeezed out, and so they experience a net osmotic pressure from the rest of the polymer that serves to drive them together. The range of this attractive force is thus controlled by $R_g$ and its overall depth by the concentration of polymer. Its strength as a function of $r_{ij}$ can be most easily handled by treating the polymer coils as if they behaved like hard spheres of radius $R_g$ in their interaction with the colloid particles, so that the depth of the potential is simply proportional to the volume of the depletion region (the region from which the polymer is excluded). The resulting effective pair potential is called the Asakura-Oosawa potential. Even though the shape of the depletion potential is still appreciably different from a square well, there is now a much greater degree of similarity: there is a hard core and an attractive well of finite width, whose width and depth may be varied freely; in particular, the well width may be chosen to be much less than the hard-core radius. We mention in passing that in colloid science the usual choice of parameter to quantify the density is the volume fraction, the fraction of the volume of the system occupied by the hard cores ($\pi/(6v)$ in our units).
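To make the depletion picture concrete, here is a minimal sketch of the standard Asakura-Oosawa form, in which the attraction is the polymer osmotic pressure times the overlap volume of the two depletion spheres of radius $\sigma/2 + R_g$. It is an illustration under that textbook assumption, not code from this work; `osmotic_pressure`, `r_g` and the function names are hypothetical.

```python
import math


def overlap_volume(r, r_d):
    """Lens-shaped overlap volume of two spheres of radius r_d at separation r."""
    if r >= 2.0 * r_d:
        return 0.0
    return math.pi / 12.0 * (4.0 * r_d + r) * (2.0 * r_d - r) ** 2


def asakura_oosawa(r, sigma, r_g, osmotic_pressure):
    """Depletion pair potential for colloid diameter sigma and polymer radius r_g."""
    if r < sigma:
        return math.inf                      # hard cores overlap
    r_d = 0.5 * sigma + r_g                  # radius of the depletion sphere
    return -osmotic_pressure * overlap_volume(r, r_d)


# The attraction vanishes once the depletion spheres no longer overlap,
# i.e. for r >= sigma + 2 * r_g, so small r_g gives a narrow well.
EXAMPLE_CONTACT_ENERGY = asakura_oosawa(1.0, 1.0, 0.05, 1.0)
```

The well here is deepest at contact and decays smoothly to zero at $r = \sigma + 2R_g$, which is the sense in which it resembles, but is not identical to, a square well.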

We were motivated to look at the solid phase by recent theoretical and computational results that suggest that systems with very short-ranged attractive potentials may exhibit an isostructural solid-solid phase transition between a dense and an expanded phase, with a phase diagram like that in figure. The qualitative structures of the dense and expanded phases are shown in figure. The inner circles represent the hard cores of the potential, of diameter σ; the outer circles represent the attractive wells, of width λσ. Examination of figure then indicates that the interaction of a pair of particles is $-E_0$ if the outer circle of one particle touches or cuts the inner circle of the other. Thus in the dense phase (top) each particle has a low energy, but the crystal is tightly packed so the entropy is also low, while in the expanded phase (below) each particle is in the potential well of only two or three of its neighbours on average, but they are still close enough to form a cage to hold it on its lattice site. The energy is therefore much nearer to zero, though the free volume in the crystal is also much larger, producing a compensating increase in entropy.

2 In drawing this figure we are to some extent anticipating results we will not find until section; see in particular section.

Figure: A schematic illustration of the phase diagram (β against v) of a system with a very short-ranged attractive potential. Between the triple line (TrL) and the critical point (CP), two solid phases (labelled S1 and S2) may coexist. The horizontal lines are tie-lines: if the system is prepared with a density in the tie-lined regions, it exists as a mixture of two phases with densities given by the points at the ends of the lines. F labels a fluid phase. The dotted box contains the region that is studied in this chapter.

We have not developed a method to measure the free energy of a fluid, or to connect it reversibly to a solid. Therefore we shall investigate only the part of the phase diagram shown inside the dotted region in figure. We have no way of knowing where the triple line is, so it should be borne in mind that part, or possibly even all, of any coexistence curve we obtain may be metastable: it may be energetically favourable for the expanded solid to decompose into the dense solid and the fluid.

As well as the presence of two solid phases, another unfamiliar feature of figure is the presence of only one fluid phase, where normally we would expect two (a liquid and a gas, becoming indistinguishable at a liquid-gas critical point). It is interesting to digress for a moment to put this into the context of a more general theory. According to this theory, as the range of the attractive part of the potential is altered, we produce in sequence the three types of phase diagram shown in figure. This figure is based on results obtained analytically for the square-well system.

The figure on the left represents the familiar solid-liquid-gas phase diagram that is ubiquitous in nature; for example, it describes the phase behaviour of all simple atomic and molecular systems. However, here we see that the condition for its existence is that λ exceed a limiting value. If λ is smaller than this, as in the central figure, then the liquid-gas critical point disappears and the phase diagram contains only one solid and one fluid phase. Finally, if λ is extremely small, as in the figure on the right, we obtain the phase diagram with one fluid and two solid phases. There is thus a pleasing symmetry between long-range and very short-range potentials. We shall describe the physical reasons for the existence of the solid-solid phase transition for small λ in section.

Figure: Schematic 2d diagram of the dense (top) and expanded solid phases. Each pair of circles represents a particle: the inner circle represents the hard core of the potential, of diameter σ; the outer circle represents the attractive well, of width λσ.

However, we must point out that, aside from analytic calculations, the solid-solid phase coexistence has been observed by only one group of workers, in computer simulations of the square-well solid; it has not yet been seen experimentally. Experiments looking for it have been carried out on colloids, with inconclusive results caused by practical difficulties in working with such dense systems and with polymers with a very small $R_g$. It is interesting to note that isostructural solid-solid phase transitions, each with a coexistence line ending in a critical point, have been observed experimentally in the heavy metals cerium and caesium. However, these phase transitions are not produced by very short-ranged interatomic potentials: the effective potentials between atoms of these metals are of the usual long-ranged type that would be expected to produce solid-liquid-gas phase diagrams. They are thought to be caused by quantum effects, possibly involving localisation-delocalisation of f electrons, though there is no agreement on the detailed mechanism.

3 The calculations were based on the use of a variational principle to obtain an estimate of (or, strictly, an upper bound on) the free energy difference between the square-well system and an appropriate reference system.

Figure: Schematic phase diagrams (β against v) of the square-well and related systems for long-ranged (left), intermediate (centre) and very short-ranged (right) attractive wells, based on published calculations. The limiting values of λ for the disappearance of the liquid phase and for the appearance of the expanded solid are taken from the literature. S: solid; F: fluid; L: liquid; G: gas.

Even the existence in real systems of phase diagrams of the second type (one solid, one fluid phase) has been demonstrated only quite recently. In colloidal systems it is found experimentally that the liquid disappears when $R_g$ falls below a limiting value, one confirmed by analytic calculations on the Asakura-Oosawa model. Computer simulations have been performed on C60, with a model potential produced by smearing and summing Lennard-Jones-type interactions, and on hard spheres with an additional Yukawa attraction. The results imply that the phase diagram of C60 is of this type (almost uniquely for a fairly simple molecular material) and that the Yukawa system also has a two-phase phase diagram if the exponential decay is strong enough. The phase diagram of C60 has not yet been fully established experimentally.

The fact that potentials of different detailed shapes all produce phase diagrams of this kind lends credence to the idea that the detailed shape of the potential is not important in determining what kind of phase diagram occurs, only gross features like the relative ranges of the attractive and repulsive parts. This fact, combined with a desire to be able to check our results explicitly against previous work, influenced us in our decision to use the square-well potential instead of a more physically realistic model like the Asakura-Oosawa potential. We also originally intended to study a range of different values of λ and all parts of the phase diagram, when it would have been advantageous to have a potential whose range is unambiguously defined. However, pressure of time prevented more than a part of this ambitious project from being completed.

Details of the Simulation

We have used three different approaches to the measurement of free energy in this chapter, each of which requires its own computer program, different in detail from the others. However, there is a large core part of the program that they all have in common, which deals with the basic problem of simulating the motion of the particles of the N-particle square-well solid. This core program, and its implementation on the Connection Machine CM, is described in appendix E; it is shown there that the most efficient way to map the problem onto the machine is to use a mixture of geometrical decomposition and primitive parallelism, running $N_r$ independent copies, or replicas, of each simulation in parallel. Each replica is quite small. The individual details of the modifications required to implement each method of free-energy measurement will be described in the section devoted to that method.

Comparison of Thermodynamic Integration and the Expanded Ensemble: Use of an Einstein Solid Reference System

In this section we shall mainly be concerned with the relative efficiency of these two methods; we shall obtain comparatively little information about the physics of the square-well system, locating a single pair of coexistence points only. The method of free-energy measurement that is used is the smooth transformation of the square-well solid, by way of a series of interpolating systems, into a system for which the free energy can be calculated exactly, in this case the harmonic Einstein solid at the same temperature and density. This technique has been used before and is discussed in section. We shall measure the free energy both by using TI and by using the states along the thermodynamic integration path to make an expanded ensemble (a case such as this, where several Hamiltonians are defined on the same phase space, is more naturally described as an expanded ensemble than a multicanonical ensemble). We note that this technique of transforming the energy function was used, combined with TI, in an earlier determination of the phase diagrams of the square-well system; however, the reference system used there was the corresponding hard-sphere system, whose free energy was taken as being already known absolutely from previous simulations and/or theoretical equations of state for each phase. The equation of state used for the solid phase was due to Hall.

It is therefore necessary to modify the core simulation routine described in appendix E. The potential function used is no longer simply

$$E_{SW} = \sum_{i=1}^{N} \sum_{j>i} E(r_{ij}),$$

where $E(r_{ij})$ is as defined in equation; in making particle moves we now use the energy

$$E(\alpha) = \alpha E_{SW} + (1-\alpha) E_{ES},$$

where $E_{ES}$ is the potential energy of the harmonic Einstein solid,

$$E_{ES} = k_s \sum_{i=1}^{N} (\mathbf{r}_i - \mathbf{r}_i^0)^2,$$

where the set $\{\mathbf{r}_i^0\}$ are the lattice sites of the Einstein solid (which are of course arranged here to correspond to the fcc lattice sites) and $k_s$ is a spring constant. Each particle thus feels an additional harmonic attraction to its lattice site. By varying α in the range $0 \le \alpha \le 1$ we thus interpolate between the pure Einstein solid and the pure square-well solid.
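The interpolated energy can be sketched in a few lines. This is an illustration under assumptions rather than the thesis code: the interpolation parameter is called `alpha` here, the Einstein energy is taken as $k_s \sum_i (\mathbf{r}_i - \mathbf{r}_i^0)^2$ with no factor of one half, and all function names are hypothetical. The one subtle point is the pure Einstein end, where a hard-core overlap (infinite square-well energy) must contribute nothing.

```python
import math


def einstein_energy(positions, sites, k_s):
    """Harmonic energy tying each particle to its own lattice site."""
    return k_s * sum(sum((x - x0) ** 2 for x, x0 in zip(r, r0))
                     for r, r0 in zip(positions, sites))


def interpolated_energy(alpha, e_sw, e_es):
    """Energy of the interpolating system; alpha = 0 is the pure Einstein solid.

    For any alpha > 0 an overlap (e_sw = inf) still gives an infinite energy,
    so the hard core survives undiminished; it vanishes only at alpha = 0,
    which is handled separately to avoid evaluating 0 * inf.
    """
    if alpha == 0.0:
        return e_es
    return alpha * e_sw + (1.0 - alpha) * e_es


E_ES_EXAMPLE = einstein_energy([(0.1, 0.0, 0.0), (1.0, 1.0, 1.0)],
                               [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)], 2.0)
```

A Metropolis particle move in the interpolating system would then be accepted with probability $\min\{1, \exp[-\beta\,\Delta E(\alpha)]\}$, using the change in this combined energy.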

In the expanded ensemble implementation we allow transitions between different values of α; in thermodynamic integration, which we shall now describe, we simply perform a series of independent simulations at different values of α.

Thermodynamic Integration

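The principle underlying this method is to use the equation

$$\frac{\partial F(\alpha)}{\partial \alpha} = -\frac{1}{\beta}\frac{\partial}{\partial \alpha} \ln \int d\mathbf{r}^N \exp[-\beta E(\alpha)] = \frac{\int d\mathbf{r}^N \, (\partial E/\partial\alpha) \exp[-\beta E(\alpha)]}{\int d\mathbf{r}^N \exp[-\beta E(\alpha)]} = \langle E_{SW} - E_{ES} \rangle_\alpha,$$

where $\langle \cdot \rangle_\alpha$ denotes a canonical average measured in the ensemble defined by $E(\alpha)$. Thus it follows that

$$F_{SW} = F_{ES} + \int_0^1 \langle E_{SW} - E_{ES} \rangle_\alpha \, d\alpha,$$

and at intermediate points along the path we define

$$F(\alpha) = F_{ES} + \int_0^{\alpha} \langle E_{SW} - E_{ES} \rangle_{\alpha'} \, d\alpha'.$$

$F_{ES}$ can be calculated exactly: $F_{ES} = -\frac{3N}{2\beta} \ln \frac{\pi}{\beta k_s}$.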
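Once the canonical averages have been measured at a set of points along the path, the integral above is a one-dimensional quadrature. The sketch below uses the trapezoidal rule as a simple stand-in for fitting a function to the data and integrating it; the names `alphas`, `mean_de` and the helper functions are hypothetical, and the Einstein free energy assumes the convention $E_{ES} = k_s \sum_i (\mathbf{r}_i - \mathbf{r}_i^0)^2$ with configurational integrals only.

```python
import math


def einstein_free_energy(n, beta, k_s):
    """Configurational free energy of n independent 3d harmonic wells."""
    return -1.5 * n / beta * math.log(math.pi / (beta * k_s))


def ti_free_energy(alphas, mean_de, f_es):
    """F_ES plus the trapezoidal integral of <E_SW - E_ES> over alphas.

    alphas must be increasing; mean_de[k] is the measured canonical average
    of E_SW - E_ES in the ensemble at alphas[k].
    """
    total = f_es
    for k in range(1, len(alphas)):
        total += 0.5 * (mean_de[k] + mean_de[k - 1]) * (alphas[k] - alphas[k - 1])
    return total


# Made-up averages on a three-point grid, purely to show the call.
F_EXAMPLE = ti_free_energy([0.0, 0.5, 1.0], [1.0, 2.0, 3.0], 10.0)
```

In practice the grid would be denser wherever the integrand varies rapidly, and the statistical error of each average would be propagated into the final free energy.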

We shall estimate $F_{SW}$ by measuring $\langle E_{SW} - E_{ES} \rangle_\alpha$ for a series of values of α at a fixed temperature and volume, fitting a function to the data points and evaluating the integral numerically. We choose the Einstein spring constant $k_s$ to be sufficiently large that the particles are prevented from moving far off their lattice sites even in the pure Einstein solid ensemble, so that there is not an excessive variation in the typical configurations of the system as α varies. By doing this we hope to keep the total free energy difference between the two ends of the path fairly small. We took the criterion that the particles should stay close to the lattice sites as implying the following: let $P(|\mathbf{r} - \mathbf{r}^0|)\,dr$ be the probability that a particle whose lattice site is at $\mathbf{r}^0$ is found in a shell of radius $|\mathbf{r} - \mathbf{r}^0|$ and thickness dr about the lattice site. Then, with Einstein solid interactions only, it follows that

$$P(|\mathbf{r} - \mathbf{r}^0|)\,dr = 4\pi \left(\frac{\beta k_s}{\pi}\right)^{3/2} |\mathbf{r} - \mathbf{r}^0|^2 \exp\left(-\beta k_s |\mathbf{r} - \mathbf{r}^0|^2\right) dr.$$

So we demand that

$$\left[P(|\mathbf{r} - \mathbf{r}^0| < R)\right]^N \ge \frac{1}{2},$$


where R is the radius of the sphere within which the centre of a particular particle can move

without approaching within of neighbouring lattice sites at the prevailing density This

criterion is thus designed to ensure that, in the simulation as a whole, all the particles are further than $\sigma$ from their neighbours, and so contain no hard-core overlaps, at least half the time.

However, difficulties arise in the $\alpha = 0$ and $\alpha = 1$ states that prevent us from implementing this strategy exactly as described above. First, in the pure Einstein solid ensemble ($\alpha = 0$), however large we make $k_s$, there is a finite probability of generating a configuration where at least one pair of particles are closer together than $\sigma$. Thus $\langle E_{SW} \rangle_\alpha$ is infinite exactly at $\alpha = 0$, though it is finite and well-behaved everywhere else. This is a consequence of the hard core in the potential, which remains in $E(\alpha)$ for all finite $\alpha$ without diminishing in size at all, only disappearing precisely at $\alpha = 0$. In earlier work this problem is handled by expanding $F(\alpha)$ analytically around $\alpha = 0$, but, as we shall explain below, it can also be treated easily with the expanded ensemble. Since we have an expanded ensemble simulation available

we use it rather than TI to connect with the next state we have used and

The second difficulty, arising as $\alpha \to 1$, must be dealt with rather more carefully.

The Centre of Mass Problem

This problem is a consequence of the fact that $E_{SW}$ is invariant under translations of the centre of mass of the whole system, whereas $E_{ES}$ is not. As a result, as $\alpha \to 1$ the centre of mass has an increasingly large volume accessible to it and $\langle E_{ES} \rangle_\alpha$ increases. At $\alpha = 1$ the probability density of the position of the centre of mass becomes uniform over the simulation box, which has side length $L$, corresponding to a mean square displacement of the order of $L^2$ and, for the large values of $k_s$ that we are using, an extremely large value of $\langle E_{ES} \rangle_\alpha$. This would make the evaluation of the thermodynamic integral very difficult: not much easier, in fact, than if the integrand had an integrable singularity. Many simulation points would be needed for $\alpha$ close to one, where $\langle E_{ES} \rangle_\alpha$ is large and rapidly varying, and these simulations would have to be extremely long, since we would have to sample for long enough to allow the centre of mass to wander through the whole simulation volume.

If the particle moves are made serially, it is fairly easy to solve this problem by enforcing the constraint that the centre of mass of the particles should remain at all times at the position of

4

The p ossibility of doing this was also mentioned in


the centre of mass of the lattice sites, preventing the Einstein energy from becoming excessively large. This is done by accompanying every trial displacement of a particular particle, $\Delta\mathbf{r}$ say, by a displacement of all the particles by $-\Delta\mathbf{r}/N$. The square-well energy is invariant under this translation; the Einstein energy is not, but does not have to be recalculated by looping over all the particles: it can easily be shown that this rigid translation of all the particles simply produces a reduction proportional to $k_s |\Delta\mathbf{r}|^2 / N$ in the total Einstein energy. The Metropolis test can therefore still be carried out on the basis of local interactions only. Small, analytically known corrections have to be made to the expressions for the free energies of both the Einstein solid and the square-well solid, reflecting the extra constraints on the system.
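The claim that the rigid translation changes the Einstein energy by a simple, locally computable amount can be checked numerically. The sketch below is an illustration only, not the simulation code: it assumes the convention $E_{ES} = k_s \sum_i |\mathbf{u}_i|^2$ for displacements $\mathbf{u}_i$ from the lattice sites (with a factor of 1/2 in the spring energy the correction would be halved), and all function names are hypothetical.

```python
import random


def einstein_energy(displacements, k_s):
    """Einstein energy under the ASSUMED convention E_ES = k_s * sum_i |u_i|^2."""
    return k_s * sum(x * x + y * y + z * z for x, y, z in displacements)


def constrained_move(displacements, i, dr):
    """Displace particle i by dr, then shift every particle rigidly by -dr/N."""
    n = len(displacements)
    moved = []
    for j, u in enumerate(displacements):
        extra = dr if j == i else (0.0, 0.0, 0.0)
        moved.append(tuple(u[a] + extra[a] - dr[a] / n for a in range(3)))
    return moved


def demo(n=32, k_s=50.0, seed=1):
    """Compare the full recalculation with the local energy plus correction."""
    rng = random.Random(seed)
    u = [tuple(rng.gauss(0.0, 0.1) for _ in range(3)) for _ in range(n)]
    # Impose the centre-of-mass constraint: the displacements sum to zero.
    mean = [sum(p[a] for p in u) / n for a in range(3)]
    u = [tuple(p[a] - mean[a] for a in range(3)) for p in u]

    i, dr = 5, (0.03, -0.02, 0.01)
    # Energy after the single-particle move alone (a purely local update).
    local = [tuple(p[a] + (dr[a] if j == i else 0.0) for a in range(3))
             for j, p in enumerate(u)]
    e_local = einstein_energy(local, k_s)
    # Energy after the move plus the rigid shift, recomputed over all particles.
    e_full = einstein_energy(constrained_move(u, i, dr), k_s)
    # Predicted reduction under the assumed convention.
    reduction = k_s * sum(c * c for c in dr) / n
    return e_local, e_full, reduction
```

Under these assumptions `e_local - e_full` agrees with `reduction` to rounding error, so the Metropolis test needs only the local energy change and this one extra term.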

Unfortunately this technique, which is the one used in previous work, cannot be applied to simulations where particle moves are made in parallel: it is not clear how much the centre of mass will move by until all the trial moves are accepted or rejected, but the acceptance probability depends on the energy, which itself depends on the motion of the centre of mass. In fact, all the simulations performed in this subsection were carried out using primitive parallelism only, i.e. with serial updating of particle coordinates within each simulation. Thus the centre of mass could have been kept fixed, and with the benefit of hindsight this would have made both the simulations and the analysis easier to perform. Nevertheless, we used in practice the following method, which is applicable to a system with parallel updating.

For simulations carried out at $\alpha = 1$ only, we keep the centre of mass fixed, so that $\langle E_{ES} \rangle$ remains quite small. It is permissible to do this here, even with parallel updating, because $E(\alpha = 1) = E_{SW}$, so the Einstein energy does not appear in the Metropolis test controlling the acceptance of particle displacements. For other values of $\alpha$ we allow centre of mass motion and accept the growth in $\langle E_{ES} \rangle_\alpha$. However, rather than attempting to use the thermodynamic integration equation directly, we choose a function $q(\alpha)$, described below, which shows the same near-divergence at $\alpha = 1$ as the $\langle E_{ES} \rangle_\alpha$ term but which is analytically tractable. Then we split up the integral into two pieces:

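$$\int_0^1 \langle E_{SW} - E_{ES} \rangle_\alpha \, d\alpha = \int_0^1 \left[ \langle E_{SW} - E_{ES} \rangle_\alpha - q(\alpha) \right] d\alpha + \int_0^1 q(\alpha) \, d\alpha$$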

5

We do this by calculating $\Delta\mathbf{r}_{CM}$ by summing $\Delta\mathbf{r}$ over all accepted transitions at the end of every lattice sweep, then adding $\Delta\mathbf{r}_{CM}/N$ to all the $\mathbf{r}^0_i$. We move the centre of mass of the Einstein lattice to follow that of the simulation, rather than the other way round, because otherwise it is found that rounding errors in adding $\Delta\mathbf{r}/N$ to the particle positions can produce spurious hard-core overlaps.

The second integral on the RHS can be done analytically or numerically, while the first is now much better behaved because the two near-divergences cancel. In particular, we may extrapolate the integrand with confidence from the largest value of $\alpha$ that was tractable to $\alpha = 1$. Our procedure is therefore to fit a function (we used an interpolating cubic spline) to $\langle E_{SW} - E_{ES} \rangle_\alpha - q(\alpha)$, extrapolate to $\alpha = 1$ and integrate numerically.
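The benefit of subtracting a tractable near-divergent function before integrating can be seen in a toy calculation. The sketch below is not the thesis procedure: it replaces the cubic spline by the trapezoid rule, and both the "measured" integrand and the reference function `q` are hypothetical stand-ins chosen so that the exact answer is known.

```python
import math

EPS = 1e-3  # hypothetical cut-off controlling how sharp the near-divergence is


def trapezoid(xs, ys):
    """Composite trapezoid rule on a (possibly non-uniform) grid."""
    return sum(0.5 * (ys[k] + ys[k + 1]) * (xs[k + 1] - xs[k])
               for k in range(len(xs) - 1))


def q(alpha):
    """Hypothetical reference function, nearly divergent as alpha -> 1."""
    return -1.0 / (1.0 - alpha + EPS)


def q_integral():
    """Exact integral of q over [0, 1]."""
    return -math.log((1.0 + EPS) / EPS)


def integrand(alpha):
    """Toy stand-in for the measured average: smooth part plus q."""
    return -10.0 * alpha ** 2 + q(alpha)


def integrate_direct(grid):
    """Integrate the raw, nearly divergent data."""
    return trapezoid(grid, [integrand(a) for a in grid])


def integrate_split(grid):
    """Integrate the well-behaved difference, then add the known integral of q."""
    smooth_part = trapezoid(grid, [integrand(a) - q(a) for a in grid])
    return smooth_part + q_integral()


GRID = [k / 10 for k in range(11)]       # eleven coarse simulation points
EXACT = -10.0 / 3.0 + q_integral()       # exact value for the toy integrand
```

On this coarse grid `integrate_split(GRID)` lies close to `EXACT`, while `integrate_direct(GRID)` is badly wrong because a single interval near $\alpha = 1$ dominates the sum.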

The function q that we use is just

N

q E

(N )

ES

E

ES

where by the notation on the right hand side we mean the average energy of a single Einstein particle with spring constant $N k_s$, evaluated in an ensemble with effective coupling $(1 - \alpha) N k_s$. Thus $q$ describes to a good approximation the energy due to the motion of the centre of mass of the Einstein lattice, which moves almost exactly like a single particle with spring constant $N k_s$: the hard cores in the square-well potential keep the lattice rigid and so keep all the springs very nearly parallel.

Written out more fully we have

R

L

N k x expN k x dx

s s

L

q

R

L

expN k x dx

s

L

p

c L expcL

p

p

erf c L

where in the last line we have written c N k There is no analytic closed form

s

for $\int q(\alpha)\,d\alpha$, but it can easily be integrated numerically to whatever precision is required.
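The ingredient of $q$ is the mean square displacement of a single harmonic particle confined to a box. As a hedged illustration, the sketch below evaluates $\langle x^2 \rangle$ in one dimension for a weight $\exp(-c x^2)$ on $[-a, a]$, both from the error-function closed form and by Simpson quadrature. Here `c` stands for whatever effective coupling multiplies $x^2$ in the Boltzmann factor and `a` for the half-width of the accessible interval; how they map onto $\alpha$, $N$, $k_s$ and $L$ follows the definitions in the text and is not assumed here. The function names are hypothetical.

```python
import math


def mean_x2_closed(c, a):
    """<x^2> for the weight exp(-c x^2) on [-a, a], via the error function."""
    root = math.sqrt(c)
    boundary = a * math.exp(-c * a * a) / (math.sqrt(math.pi) * root * math.erf(root * a))
    return 1.0 / (2.0 * c) - boundary


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def mean_x2_quadrature(c, a):
    """The same average evaluated as a ratio of two numerical integrals."""
    num = simpson(lambda x: x * x * math.exp(-c * x * x), -a, a)
    den = simpson(lambda x: math.exp(-c * x * x), -a, a)
    return num / den
```

The two limits behave as the text describes: for strong coupling the result tends to the free harmonic value $1/(2c)$, while for vanishing coupling it saturates at the uniform-distribution value $a^2/3$, which is the origin of the large but finite value at $\alpha = 1$.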

In general the single-particle average that defines $q$ is not exactly equal to the corresponding centre-of-mass Einstein energy of the $N$-particle system, though the MC data show that there is approximate equality. However, to be confident that we can extrapolate the integrand to $\alpha = 1$, we require that the approximate equality should continue to hold even for $\alpha$ very close to one, where we do not have MC data to confirm it. In fact we expect the approximation to improve as $\alpha \to 1$, and to become exact in the limit, for the following reasons:

First, the probability density of the centre of mass becomes uniform over the simulation box, and so the hard cores have no effect on the expectation value of the Einstein energy of the $N$-particle system.

Second, as can be shown by explicit calculation, the expectation of the energy of the single particle with spring constant $N k_s$ is equal in this limit to the energy of $N$ independent particles each with spring constant $k_s$.

As a consequence, $q$ captures all of the Einstein solid contribution:

N

N k L E E E

(N =1) (=1)

s ES ES

ES

E E

ES ES

Having shown how the difficulties at $\alpha = 0$ and $\alpha = 1$ may be overcome, we shall now go on to present the results.

Results

We examine the p oints v and v which are chosen on

the basis of results given in as b eing on or very close to the solidsolid co existence curve

We therefore entirely avoid for now the dicult problem of lo cating the co existence curve in

the rst place We remind the reader that our units are such that E and F are in units of E

is in units of E and v is in units of Thus the pressure p is in units of E and the

spring constant k is in units of E

s

We chose to examine the small system N ie unit cells and to use only primitive

parallelism with N so the simulation parameters dened in app endix E are NPR

r

NEDGE and LPV

The values of at which simulations were conducted were chosen simply by eye rst a few

simulations were p erformed at and and then other simulation p oints

were chosen on the basis of an assessment of where E E was changing

SW ES

most rapidly The largest value of used with the centre of mass free was we also

p erformed a simulation at but with the centre of mass constrained For each simulation

p oint estimates of E E were output every lattice sweeps and

SW ES

examination of the behaviour of these block averages was used to determine when convergence had occurred; the best estimate of $\langle E_{SW} - E_{ES} \rangle_\alpha$ was then obtained by averaging over subsequent blocks. When changing $\alpha$, the final configuration from a run at a nearby value of $\alpha$ was used as the starting configuration to reduce the equilibration time, though this still

b ecame lengthy as Typically lattice sweeps were p erformed for each of which

ab out were used for estimation for for v and

sweeps were generated and used The simulations achieved a sp eed of ab out

sweepshour

Altogether for v lattice sweeps representing ab out hours pro cessing


time were p erformed at values of and of them were used in evaluating

E E For v lattice sweeps were p erformed at values

SW ES

of and of them were used For b oth densities these gures include expanded

ensemble simulations of length sweeps that were used to connect to the state We

consider in retrospect that more accurate results would probably have been obtained if more values of $\alpha$ had been considered and less time spent on each one, even though this would have increased the fraction of time spent on equilibration.

Let us rst consider v The MC results are shown in gure The data p oints show

the raw MC results for N E E with their error bars the solid line is

SW ES

the spline t to N E E q We should note that q

SW ES

on the same scale as the gure so N E E would reach ab out at

SW ES

This makes it strikingly clear that to perform the thermodynamic integration accurately without the analytic correction would be a near impossibility.

[Figure: $N^{-1}\langle E_{SW} - E_{ES} \rangle_\alpha$ and $N^{-1}[\langle E_{SW} - E_{ES} \rangle_\alpha - q(\alpha)]$ plotted against $\alpha$, main panel with inset.]

Figure v Diamonds N E E measured by MC solid

SW ES

line spline t to N E E q The main gure shows the whole

SW ES

range of while the inset shows the region around We emphasise that the p oint exactly

at is generated with the centre of mass constrained to b e stationary while it is free for

all the other values of see the discussion of this problem in section

We nd

Z

E E q d N

SW ES


Z

N q d

so

Z

N E E d

SW ES

And from the single expanded ensemble simulation used to link and

F F N The Einstein solids spring constant generated according

ES

to the criterion expressed in equation was k which leads to f Thus

s ES

Z

f f N E E d

SW ES SW ES

Z

f F F N N E E d

ES ES SW ES

The corresp onding results for v are shown in gure

[Figure: $N^{-1}\langle E_{SW} - E_{ES} \rangle_\alpha$ and $N^{-1}[\langle E_{SW} - E_{ES} \rangle_\alpha - q(\alpha)]$ plotted against $\alpha$, main panel with inset.]

Figure v Diamonds N E E measured by MC solid

SW ES

line spline t to N E E q The main gure shows the whole

SW ES

range of while the inset shows the region around


We nd

Z

N E E q d

SW ES

Z

q d N

so

Z

E E d N

SW ES

And F F N from the expanded ensemble simulation This time we

ES

have k it must b e stronger in this denser solid to maintain the criterion ie to

s

prevent hard core overlaps leading to f Thus

ES

Z

f f N E E d

SW ES SW ES

Z

E E d f F F N N

SW ES ES ES

The errors in b oth cases were estimated from the size of the error bars on the MC data

p oints The error is dominated by p oints near so only the eect of these p oints on

the estimate of $f$ was considered. We have thus obtained estimates of the Helmholtz free energy $f$ for the two densities under consideration; however, this is not sufficient to establish that these are indeed the densities of the coexisting phases at this temperature, since the condition for coexistence is of course that the Gibbs free energies of the two phases should be equal, and $g$ differs from $f$ by a $pV$ term that is still unknown because the pressure $p$ is unknown. For many systems $p$ can be estimated from a constant-$V$ simulation as a canonical average, but the expression contains the average interparticle force, which for the square-well system is a pair of delta-functions and so inaccessible. To see what the magnitude of the $pV$ term is, therefore, it would be necessary to do more simulations for different specific volumes around those already investigated, to establish the shape of $f(v)$ in these two regions. Then the coexistence pressure and the densities of the coexisting phases could be established with the

6

and N corrections


double-tangent construction. We do not attempt this here, but to demonstrate that good accuracy has been achieved in the integration estimates $f_{TI}$ of the absolute Helmholtz free energy, we show in the figure below $f_{TI}$ at the two specific volumes studied, together with $f^{xc}(v)$ calculated from the multicanonical results of a later section. This method does not yield absolute free energies, so the vertical scale has been fixed by constraining the two estimates to be equal at one of the two volumes. The desired consistency check is thus obtained from the good agreement at the other.

[Figure: $f^{xc}(v)$ and $f_{TI}$ plotted as $f(v)$ against $v$, main panel with inset.]

Figure: The absolute Helmholtz free energy $f$ as a function of specific volume $v$. The dashed curve is obtained from the multicanonical $NpT$ ensemble, the absolute additive constant being established using the absolute free energy calculated in this section for one of the two volumes. These absolute free energies $f_{TI}$ are marked by circles. The solid line is the double-tangent to the dashed multicanonical curve.

We also show the doubletangent to the multicanonical f v the p oints of tangency estimate

the sp ecic volumes and its gradient estimates the co existence pressure The results are p

coex

v and v These are appreciably dierent from

dense expanded

v and v which were chosen it will b e recalled b ecause they were the estimates

for v given in However we shall nd in section that these discrepancies in

coex

p and v can b e largely attributed to nitesize eects in this small N system in

coex

the system size was N


Expanded Ensemble with Einstein Solid Reference System

It is also possible to use an expanded ensemble approach to measure the free energy difference between the square-well and Einstein solids.

The intermediate systems are defined by a potential energy function that is once again a linear combination of the square-well and Einstein potential energies, just as it was in the TI approach, so that it is possible to make a direct comparison between the two methods. We shall also briefly investigate two other important issues in the use of the expanded ensemble: we shall look at the effect of the number of intermediate states used, and we shall explicitly confirm the earlier result that the subdivision of the intermediate states into groups to be simulated separately does not affect the overall accuracy.

The simulation itself must be adapted to produce the expanded ensemble. First define

$$Z_i = \sum_{\text{configurations}} \exp[-\beta E(\alpha_i)]$$

for some suitable set $\{\alpha_i\}$, $i = 0 \ldots N_m$. Now, as we know from the earlier discussion, by allowing transitions between the sub-ensembles according to rules defined below, we may construct an expanded ensemble with partition function

$$Z = \sum_{i=0}^{N_m} \exp(\eta_i) \, Z_i$$

which, given that $\alpha_0 = 0$ so that $F_0 = F_{ES}$, leads to

$$\beta F_i = \beta F_{ES} + (\eta_i - \eta_0) - \ln(P_i / P_0).$$

In particular, using $F_{SW} = F_{N_m}$,

$$\beta F_{SW} = \beta F_{ES} + (\eta_{N_m} - \eta_0) - \ln(P_{N_m} / P_0).$$

We use the MC simulation to measure the last term, $\ln(P_{N_m}/P_0)$, arranging for the set $\{\eta_i\}$ to be such that $P_i \approx P_j$ for all $i, j$. The methods of the earlier chapter could have been adapted to find a suitable $\{\eta_i\}$, but since it is easy to obtain estimates of $F(\alpha)$ from the TI results above, we chose to use these right from the start rather than to make an independent estimate of $\{\eta_i\}$ from scratch.

7

In the next section we do adapt them to find $\eta^{xc}(V)$ for the multicanonical $NpT$ ensemble.
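The relation between sub-ensemble probabilities and free energies translates directly into a visited-states estimate. The following minimal sketch assumes the sign convention in which each sub-ensemble is weighted by $\exp(\eta_i)$, so that $\beta(F_i - F_0) = (\eta_i - \eta_0) - \ln(P_i/P_0)$, with $P_i$ estimated by the visit count $C_i$; the function name and the example numbers are hypothetical.

```python
import math


def free_energy_differences(counts, eta, beta=1.0):
    """Visited-states estimate of F_i - F_0 for every sub-ensemble i.

    counts[i] is the number of visits to sub-ensemble i and eta[i] its weight.
    ASSUMED convention: sub-ensemble i enters the expanded ensemble as exp(eta[i]) * Z_i.
    """
    if len(counts) != len(eta):
        raise ValueError("counts and eta must have the same length")
    if any(c <= 0 for c in counts):
        raise ValueError("every sub-ensemble must be visited at least once")
    return [((eta[i] - eta[0]) - math.log(counts[i] / counts[0])) / beta
            for i in range(len(counts))]
```

If the weights are well chosen the counts are roughly equal, the logarithm is small, and the estimate is dominated by $\eta_i - \eta_0$; for example `free_energy_differences([100, 200], [0.0, 1.0])` gives $1 - \ln 2$ for the second state.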

We use the Metropolis algorithm both to make particle moves within each ensemble and to make transitions between the different ensembles. To calculate the Metropolis function for ensemble-changing moves is computationally extremely cheap, since we do not move the particles while changing ensemble and so need only multiply $E_{SW}$ and $E_{ES}$ by the appropriate values of $\alpha$ and $(1 - \alpha)$ before and after. However, there is some cost in communication on the Connection Machine, since clearly the prevailing value of $\alpha$ must change in all parts of a simulation at once, so all the sub-volumes of each simulation must be considered together. There is thus some reshaping and summing of arrays to be done before the move, and broadcasting of the results afterwards, which requires the use of general communication routines. This

restriction corresp onds to the rule discussed in section that any attempted up dates in

a single simulation of any co ordinates or parameters must b e made serially if they change

However, we do of course make the trial ensemble-changing moves in parallel for all the $N_r$ independent simulations, the prevailing value of $\alpha$ in each simulation being independent of its value in the others.

Before presenting the results we shall comment again on the end states $\alpha = 0$ and $\alpha = 1$. First, we should note that with the expanded ensemble we have no problem reaching the pure Einstein solid at $\alpha = 0$. Transitions into $\alpha = 0$ are not in any way special, and while it is the case that some attempted transitions out of $\alpha = 0$ will find that the trial energy is infinite, because they come from an Einstein solid configuration where there would be an overlap of the hard cores of the square-well potential, this simply results in a transition probability of $\exp(-\infty) = 0$. Thus this situation is handled transparently by the algorithm, and we need only ensure that the spring constant $k_s$ is large enough that a reasonable number of configurations with no overlaps are generated. This is in contrast to TI, where the presence of any states with hard-core overlaps in the $\alpha = 0$ ensemble prevents the evaluation of $\langle E_{SW} - E_{ES} \rangle_{\alpha = 0}$. As we said before, we used a simple two-state expanded ensemble to handle this in the previous subsection. In that simple case adequate sampling of both states was obtained without any preweighting, that is to say with all $\eta_i = 0$.
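The way an ensemble-changing move copes with hard-core overlaps at $\alpha = 0$ can be made concrete. The sketch below is an illustration under stated assumptions, not the Connection Machine implementation: it takes $E(\alpha) = \alpha E_{SW} + (1 - \alpha) E_{ES}$, weights sub-ensembles by $\exp(\eta)$, and represents an overlapping configuration by `e_sw = math.inf`. The function names are hypothetical.

```python
import math


def mixed_energy(alpha, e_sw, e_es):
    """E(alpha) = alpha*E_SW + (1 - alpha)*E_ES, treating alpha = 0 exactly."""
    if alpha == 0.0:
        # Pure Einstein solid: the hard core is absent, so an infinite E_SW
        # must not contribute (and 0 * inf would otherwise give nan).
        return e_es
    return alpha * e_sw + (1.0 - alpha) * e_es


def acceptance(alpha_old, alpha_new, eta_old, eta_new, e_sw, e_es, beta=1.0):
    """Metropolis acceptance probability for changing alpha at fixed particle positions."""
    e_new = mixed_energy(alpha_new, e_sw, e_es)
    if math.isinf(e_new):
        return 0.0  # trial state contains a hard-core overlap: exp(-inf) = 0
    e_old = mixed_energy(alpha_old, e_sw, e_es)
    log_ratio = -beta * (e_new - e_old) + (eta_new - eta_old)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)
```

A move out of $\alpha = 0$ from an overlapping configuration is rejected with certainty, while a move into $\alpha = 0$ is treated like any other, which is the behaviour described above.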

Once again, however, the wandering of the centre of mass of the simulation as $\alpha \to 1$ presents difficulties. Though we believe it would be possible to find some way to fix this problem within the expanded ensemble formalism, we have simply avoided it in practice by stopping the simulations short of $\alpha = 1$ rather than extending them right up to it. Thus the free energies that we obtain are of interest only in so far as they compare with corresponding measurements made with TI, and in so far as they enable us to further investigate some other questions relating to the expanded ensemble/multicanonical ensemble method itself.

Comparison with Thermodynamic Integration

Using the estimates obtained from TI we constructed a weighting function for the system

xc

with sp ecic volume v The size of the system and all the simulation parameters are

the same as for TI We used N sub ensembles results for other choices of N are given

m m

b elow and chose the parameters so that constant

i i

A total of lattice sweeps of the expanded ensemble were p erformed with all simulations

starting in the sub ensemble sweeps were used for estimation of the free energy

dierences when from examination of the visited states it was clear that equilibration had

occurred. We then estimate $f(\alpha_i)$ from the expression for $F_i$ given earlier, with $P_i$ determined using simple visited-states estimators. The error bars come from jackknife blocking.
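Jackknife blocking, mentioned above as the source of the error bars, can be sketched in a few lines. This is a generic delete-one-block jackknife, offered as an illustration rather than as the exact procedure used for the data; the estimator is passed in as a function, and all names are hypothetical.

```python
import math


def jackknife(blocks, estimator):
    """Delete-one-block jackknife: return (estimate, standard error).

    blocks is a list of per-block data (numbers, histograms, ...), and
    estimator maps a list of blocks to a single number.
    """
    n = len(blocks)
    if n < 2:
        raise ValueError("need at least two blocks")
    # Re-evaluate the estimator with each block left out in turn.
    leave_one_out = [estimator(blocks[:k] + blocks[k + 1:]) for k in range(n)]
    centre = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((t - centre) ** 2 for t in leave_one_out)
    return estimator(blocks), math.sqrt(variance)


def mean(values):
    """Simple estimator used in the usage example."""
    return sum(values) / len(values)
```

For a plain average the jackknife error reduces to the usual standard error of the mean; its value lies in nonlinear estimators such as $\ln(P_i/P_0)$, where the estimator would combine the visit counts of the retained blocks before taking the logarithm.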

To compare the results of TI and the expanded ensemble, we calculate $f(\alpha)$, the free energy per particle at various points along the path, obtained from $F(\alpha)$ as defined above. Graphs of the two estimates of $f(\alpha)$ are indistinguishable, so we instead plot the difference between the two (see the figure below). The sizes of the random errors are comparable, with those on the TI points slightly smaller, as one would expect given that rather more time was devoted to it than to the expanded ensemble.

It is apparent that, as in the earlier Ising study, the thermodynamic integration and expanded ensemble estimates differ by substantially more than the random error in certain parts of the integration range. In the Ising case the availability of the exact results meant that we could attribute the discrepancy entirely to systematic errors in the TI points, due to a phase transition on the path of integration. That is not the case here: lacking exact results, we can only speculate which result is the more accurate. However, it seems more likely that the fault once again lies with thermodynamic integration, and is again caused by choosing the simulation points too far apart

in a region where the integrand is changing rapidly here that is the region near We

have already commented that the simulation p oints p erhaps should have b een chosen closer

8

The lengthy equilibration time already a problem here b ecomes so long in section that sp ecial techniques

must b e introduced to overcome it

[Figure: free energy difference $f_{EE} - f_{TI}$ plotted against $\alpha$, with thermodynamic integration as the reference.]

Figure: The difference between estimates of $f(\alpha)$, the free energy per particle along the path transforming the Einstein solid into the square-well solid. $f_{TI}$ from thermodynamic integration is taken as the reference, so all points (circles) lie on the horizontal axis; only the error bars are of interest. The other points (triangles) show $f_{EE} - f_{TI}$, where $f_{EE}$ is the estimate of the free energy from the expanded ensemble.

together. Nevertheless, no strong conclusion on the relative merits of the two methods can be drawn from these data.

As in the previous section, though we have determined $F_{SW}$ with high accuracy, we cannot say anything about phase coexistence because we do not know the shape of $f(v)$ or the pressure. To map out $f(v)$ the procedure could be repeated for other specific volumes, or the results of the $NpT$ simulation could be used as they were before.

Other Issues

Let us return to questions related to the most efficient use of the expanded ensemble method. First, how many values of $\alpha$ is it best to choose? With the Ising model there is a natural granularity to the problem: that of the discrete macrostates. Here that is not the case, so we have investigated the effect that changing $N_m$ has on the accuracy of the results. Second, we have used the simulation to confirm the result, derived analytically for a simple Markov process earlier, that the accuracy of the estimate of $P_{N_m}/P_0$ is not improved by dividing up the $N_m$ states into overlapping subgroups.

$N_m$   $f_{EE}$   $r_a$   $\tau_{rw}$

Table: estimates of the free energy difference $f_{EE}$, the average acceptance ratio $r_a$ of $\alpha$-changing moves, and the random walk time $\tau_{rw}$ (in lattice sweeps) for the square-well solid with various $N_m$.

First then let us consider the eect on the accuracy of the estimate of f f

EE

f of dividing up the range of into and parts The values of to b e

used are generated from the TI results arranged so that F F is constant for all states i

i i

though equation for the prediction of spacing of states given in section suggests that

this may not have b een the b est p olicy The probabilities of the sub ensembles are estimated

with the visitedstates VS metho d

The results are shown in table The data for each value of N were pro duced from

m

blo cks of sweeps each with two such blo cks discarded for equilibration and all the

simulations started in the state One sweep here comprises one attempted up date of all

the particles and one attempted change of the prevailing $\alpha$ for each replica. The value of $\tau_{rw}$ is a rough estimate only, calculated from the fraction of the $N_r$ simulations that in fact did make a random walk during the course of one block.

The results do demonstrate clearly that reducing N to a very small value here impairs

m

overall accuracy by reducing the acceptance ratio r They also suggest that there is quite a

a

wide range of N that gives an acceptable accuracy b oth N and N would b e

m m m

usable The random walk time is ab out the same for these two the larger acceptance ratio of

the N simulation comp ensates for the greater distance to b e covered

m

We should note that the results obtained here for $f_{EE}$, when compared with the results of the longer runs that we are about to present, suggest a systematic underestimate of the magnitude of $f_{EE}$. This is probably attributable to insufficient equilibration time, given that

all the replicas are launched from a single state the state This explanation is reinforced

by the particularly p o or agreement of the N results where equilibration over was the

m

slowest b ecause of the low acceptance ratio

Now let us concentrate on the case where a total of states are used in the range

and investigate the eect of sub dividing the range into b subgroups with each

replica simulation restricted to one subgroup and with one state in common b etween adjacent

$b$   $N_m^j$   $f_{EE}$

Table: The behaviour of the expanded ensemble estimator $f_{EE}$ upon subdivision of the range of the expanded ensemble. $N_m^j$ represents the number of sub-ensembles in the $j$th subgroup, $j = 1 \ldots b$, and so shows how the sub-ensembles are allocated to the subgroups. The estimates of the error are scaled by the square root of the total number of sweeps performed.

subgroups Once again the simulations are p erformed in blo cks here blo cks of

sweeps each, and calculations are performed on each block so that the standard error of the final estimators can be determined. The measured standard errors have been corrected for the variation in the total number of lattice sweeps performed. Probabilities are measured with VS estimators, where the probability $P_i$ of a state is estimated from the number of visits to it, $C_i$, so there is an increasing need to discard early iterations as $b$ decreases, as the time for equilibration

over the sub ensembles lengthens for the b case it is necessary to discard the rst four of

thirteen blocks. Use of transition probability estimators, with a uniform initial distribution of replicas over the sub-ensembles, might have reduced this problem, though equilibration within the high-$\alpha$ ensembles would still have been difficult. The results are shown in the accompanying table.

The most significant results are the measured sizes of the error bars on the estimates of $f_{EE}$, which seem unaffected by the process of dividing the range. Certainly our results allow us to confidently exclude the fallacious argument discussed earlier, which would suggest that $f_{EE}$ should be estimated with three times the accuracy of a single expanded-ensemble run when the range is divided into nine parts. The data thus provide reasonably strong evidence in support of the argument of that section that $P_{N_m}/P_0$ is independent of $b$.

N

m

Direct Method: Multicanonical Ensemble with Variable V

We now wish to estimate the phase transition pressure and the volumes of the coexisting phases for the square-well solid using a direct method that eliminates the need for a reference system. We shall do this by applying the multicanonical ensemble that was introduced in an earlier chapter. We shall study finite-sized systems by generating and using a flat sampled distribution extending over a wide range of macrostates, which in this case are macrostates of $V$ and must embrace the volumes of the coexisting phases. This distribution can then be reweighted to obtain estimates of $F(v)$ and the canonical probability density $P^{can}_N(p, v)$ for a range of values of $p$. From this we can find the coexistence pressure $p_{coex}$ and canonical averages.

Away from coexistence, $P^{can}_N(p, v)$ consists of a single Gaussian peak, but near $p_{coex}$ it develops a double-peaked structure, as shown schematically in the figure below. Each peak corresponds to one of the coexisting bulk phases. The natural finite-size estimator of $p_{coex}$ is that for which the two peaks have equal weight, that is to say $P^{can}_N(\text{phase A}) = P^{can}_N(\text{phase B})$, where $P^{can}_N(\text{phase } j) = \int_{v \in \text{phase } j} P^{can}_N(p_{coex}, v)\,dv$. We shall discuss below the way that this and other finite-size estimators of $p_{coex}$ approach the infinite-volume limit. We note that, except near the critical point or for very small systems, there is a region of very low canonical probability between the two peaks, which will be enhanced in the multicanonical sampling.

[Figure: schematic plot of $P(v)$ against $v$.]

Figure: Schematic diagram of a typical canonical probability density $P^{can}(v)$ for the square-well solid at phase coexistence.

We shall present and comment upon the results that are produced below; however, since there are some important differences between the way that the multicanonical ensemble is generated and used here and how it was used in the earlier chapter, we shall first devote some time to explaining and justifying the procedure adopted in this chapter. We shall first describe how we have implemented the variable-volume multicanonical ensemble; then we shall explain why it is that, for this system and this computer, the time $\tau_{rw}$ required for even a single random walk over all the volume macrostates of the square-well system becomes prohibitively long. Then we shall describe in detail the procedure for estimating the sampled distribution in spite of this, showing in particular that the transition probability (TP) estimators introduced earlier here outperform visited states (VS) estimators in all stages of the simulation process. Thus there are two main parts to this section: in the first we describe the technique we use to produce the results; in the second we concentrate on the physical behaviour of the square-well system.

The Multicanonical NpT Ensemble and its Implementation

The appropriate Gibbsian ensemble for describing this system is defined by the partition function

$$Z_N(p) = \int dV \int_V d^N r \, \exp[-\beta p V - \beta E(\mathbf{r}^N)]$$

with associated probability densities

$$P^{can}_N(\mathbf{r}^N, V) = Z_N(p)^{-1} \exp[-\beta p V - \beta E(\mathbf{r}^N)]$$

and

$$P^{can}_N(V) = Z_N(p)^{-1} \int_V d^N r \, \exp[-\beta p V - \beta E(\mathbf{r}^N)]$$

It is quite easy to construct a Monte Carlo scheme for sampling from this distribution (constant-$NpT$ MC); however, as described earlier, the barrier of low probability between the phases in this case means that a very lengthy simulation would be needed to estimate the relative probabilities of the two phases, even if they were of roughly the same order of magnitude. Otherwise the best that can be done is probably to put a wide bracket on $p_{coex}$: $p_{coex}$ is certainly less than a pressure $p_h$ that drives a simulation started in the rare phase into the dense phase, where it is then observed to stay, but it is certainly more than $p_l$ that allows a simulation started in the dense phase to pass into the rare phase, where it then stays.

Instead we chose a multicanonical approach, with the order parameter $V$ preweighted by $\eta^{xc}(V)$ such that the sampled distribution is approximately flat over some range of $V$, from $V_1$ to $V_2$ say. The multicanonical pdfs are thus

$$P^{xc}_N(\mathbf{r}^N, V) = (Z^{xc}_N)^{-1} \exp[\eta^{xc}(V) - \beta E(\mathbf{r}^N)]$$

and

$$P^{xc}_N(V) = (Z^{xc}_N)^{-1} \int_V d^N r \, \exp[\eta^{xc}(V) - \beta E(\mathbf{r}^N)]$$

and we recover the canonical probability using

$$P^{can}_N(p, V) \propto \exp[-\beta p V - \eta^{xc}(V)] \, P^{xc}_N(V)$$

which we normalise using

$$\int_{V_1}^{V_2} P^{can}_N(p, V)\,dV \approx \int_0^{\infty} P^{can}_N(p, V)\,dV = 1.$$

This gives a good estimate of $P^{can}_N$ for any $p$, provided that $p$, $V_1$ and $V_2$ are such that the normalisation condition above is true, i.e. provided that the canonical pdf has effectively all its weight in the multicanonical sampling range. Thus it is not necessary to know $p_{coex}$ a priori, only to have a rough idea of the volumes of the coexisting phases so that $V_1$ and $V_2$ may be chosen to bracket them.

Computational Details

We start from the core CM program described in appendix E. We mention at this point the choice of the maximum displacement $\Delta x$ of particle moves in the configurational updates. The same $\Delta x$ was used for all the volumes, and it was varied only a little between simulations done at different temperatures: its value was $\Delta x$ , about half the width of the potential well. With this choice the acceptance probability varied between about in the states with lowest $V$ to about in those with highest $V$. However, to implement the multicanonical NpT ensemble we need to make volume-changing moves as well as coordinate updates. The volume changes are realised by making uniform contractions or dilations of the box that leave the relative positions of the particles unchanged. To avoid the necessity of updating the particles' position coordinates whenever a volume change is accepted, we work with scaled coordinates

$s = r/L$, where $L = V^{1/3}$. As a consequence the potential energy function becomes a function of $V$: $E(V, s^N) = \sum_{i<j} E(V, s_{ij})$, where

$$E(V, s_{ij}) = \begin{cases} \infty & \text{if } s_{ij} < \sigma/L \\ -E_0 & \text{if } \sigma/L \le s_{ij} < \lambda\sigma/L \\ 0 & \text{if } s_{ij} \ge \lambda\sigma/L \end{cases}$$

i.e. there is a volume-dependent effective hard-core diameter $\sigma/L$.

9

Even if we do not define this interval correctly, the method is robust: we either find a single-peaked $P^{can}(p_{coex}, V)$ straddling our estimate of the location of the phase boundary, or a $P^{can}(p_{coex}, V)$ that increases up to $V_1$ or $V_{N_m}$ and then cuts off. In either case the necessity of widening the interval is clear, and at least in the second case it is clear how this should be done.
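The volume dependence of the energy in scaled coordinates can be illustrated with a minimal NumPy sketch. It is not the CM program of the thesis: the function name, the parameter names `sigma`, `well_range` and `depth`, their default values, and the use of cubic periodic boundaries with the minimum-image convention are all assumptions made for illustration.

```python
import numpy as np

def square_well_energy(s, volume, sigma=1.0, well_range=1.5, depth=1.0):
    """Energy of a square-well configuration given in scaled coordinates.

    s          : (N, d) array of scaled coordinates in [0, 1)
    volume     : box volume V, so the box length is L = V**(1/d)
    sigma      : hard-core diameter (assumed name and value)
    well_range : outer radius of the attractive well (assumed name and value)
    depth      : well depth (assumed name and value)
    Returns np.inf if any pair overlaps, else -depth * (pairs inside the well).
    """
    s = np.asarray(s, dtype=float)
    n, d = s.shape
    box = volume ** (1.0 / d)
    diff = s[:, None, :] - s[None, :, :]
    diff -= np.round(diff)                      # minimum image in scaled units
    s_ij = np.sqrt((diff ** 2).sum(axis=-1))[np.triu_indices(n, k=1)]
    # The effective hard-core diameter in scaled units is sigma / L,
    # so it grows as the volume shrinks.
    if np.any(s_ij < sigma / box):
        return np.inf
    return -depth * np.count_nonzero(s_ij < well_range / box)
```

The same scaled configuration therefore changes energy, or becomes forbidden, purely because the volume changes, which is why every pair interaction has to be recomputed after a volume move.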

Particle moves are made (in parallel within the same simulation where possible) using the usual Metropolis rule, which is here

$$P(s^N \to s'^N) = \min\left(1, \exp\left(-\beta\left[E(V, s'^N) - E(V, s^N)\right]\right)\right)$$

The volume changes are made by discretising the range of $V$ into a discrete set $\{V_i\}$, $i = 1 \ldots N_m$. The Metropolis rule is used in the modified form

$$P(V_i \to V_j) = \min\left(1, \exp\left(N \ln(V_j/V_i) + \eta^{xc}(V_j) - \eta^{xc}(V_i) - \beta\left[E(V_j, s^N) - E(V_i, s^N)\right]\right)\right)$$

where the scaled coordinates are left unchanged and $j$ is restricted to be $i+1$ or $i-1$, chosen with equal probability; trial moves that would take us outside the chosen range of $V$ are immediately rejected. The $N \ln(V_j/V_i)$ term reflects the Jacobian of the transformation from $r$ to $s$ coordinates in the partition function $Z^{xc}_N$. Even though the configuration does not change, we must still recalculate all the particle-particle interactions, since the effective hard-core diameter does alter, and then sum $E(V, s^N)$ over all the sub-volumes of each simulation. This requires general communication on the CM and so is quite an expensive process. Our procedure is to perform sweeps consisting usually of one attempted update of the positions of all the particles and one attempted volume change. One iteration then consists of $N_s$ sweeps, after which we update $\eta$. We shall discuss the effect of the relative frequency of coordinate updates and volume changes in section below.
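A minimal sketch of the volume-changing move, assuming the acceptance rule has the form min(1, exp(N ln(V_j/V_i) + eta_j - eta_i - beta [E_j - E_i])) with j = i ± 1. The function names, the argument order and the callable `energy_at` (which returns the energy of the current scaled configuration at a given macrostate index) are hypothetical and are not taken from the thesis program.

```python
import math
import random

def log_volume_acceptance(n_particles, v_old, v_new, eta_old, eta_new,
                          beta, e_old, e_new):
    """Logarithm of the argument of min(1, ...) for a trial volume change."""
    if math.isinf(e_new):
        return -math.inf        # a hard-core overlap always rejects the move
    return (n_particles * math.log(v_new / v_old)   # Jacobian of r -> s
            + (eta_new - eta_old)                   # multicanonical weight
            - beta * (e_new - e_old))               # Boltzmann factor

def attempt_volume_move(i, volumes, eta, beta, n_particles, energy_at,
                        rng=random):
    """Try a move from macrostate i to i-1 or i+1; return the new index."""
    j = i + rng.choice((-1, 1))
    if j < 0 or j >= len(volumes):
        return i                # moves outside the range are rejected at once
    log_a = log_volume_acceptance(n_particles, volumes[i], volumes[j],
                                  eta[i], eta[j], beta,
                                  energy_at(i), energy_at(j))
    if rng.random() < math.exp(min(0.0, log_a)):
        return j
    return i
```

Working with the logarithm of the acceptance ratio keeps the N ln(V_j/V_i) term from overflowing for large N.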

10

We do not believe that it is essential to discretise $V$: it could be left continuous and the transitions grouped into a histogram. However, the TP method would then encounter the same difficulties that we found in section , coming from imperfect equilibration within each $V$ state; in particular, underestimation of the eigenvector of the sampled distribution would result (though, as we shall see, this problem may arise anyway).

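The ensemble produced by this update procedure has partition function

$$Z^{xc}_N = \sum_{i=1}^{N_m} V_i^N \exp\left(\eta^{xc}(V_i)\right) \int d s^N \exp\left(-\beta E(V_i, s^N)\right)$$

and we measure

$$P^{xc}_N(V_i) = (Z^{xc}_N)^{-1}\, V_i^N \exp\left(\eta^{xc}(V_i)\right) \int d s^N \exp\left(-\beta E(V_i, s^N)\right)$$

and then reconstruct the canonical ensemble by

$$\int_{V_q}^{V_r} P^{can}_N(p, V)\, dV \propto {\sum_i}' \Delta V_i \exp\left(-\beta p V_i - \eta^{xc}(V_i)\right) P^{xc}_N(V_i)$$

where $\Delta V_i = V_{i+1} - V_i$ and $\sum'$ signifies that the sum is restricted to the range $V_q \le V_i < V_r$. As before, normalisation follows from

$$1 = \int_{V_1}^{V_{N_m}} P^{can}_N(p, V)\, dV \propto \sum_{i=1}^{N_m} \Delta V_i \exp\left(-\beta p V_i - \eta^{xc}(V_i)\right) P^{xc}_N(V_i)$$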
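The reconstruction of the canonical distribution from multicanonical measurements can be sketched as follows, assuming the reweighting has the form P_can(p, V_i) ∝ exp(-beta p V_i - eta(V_i)) P_xc(V_i). The function name and arguments are hypothetical, and the normalisation uses a trapezoid rule over the volume grid in place of the spacing-weighted sum of the text, which is a simplification of this sketch.

```python
import numpy as np

def canonical_density(volumes, p_xc, eta, beta, pressure):
    """Reweight a measured multicanonical distribution to pressure `pressure`.

    volumes : increasing array of the discrete volumes V_i
    p_xc    : measured multicanonical probabilities P_xc(V_i), all positive
    eta     : preweighting function eta(V_i) used in the simulation
    Returns the canonical probability density on the grid, normalised so
    that its trapezoid-rule integral over [V_1, V_Nm] is one.
    """
    volumes = np.asarray(volumes, dtype=float)
    log_w = (np.log(np.asarray(p_xc, dtype=float))
             - beta * pressure * volumes
             - np.asarray(eta, dtype=float))
    log_w -= log_w.max()                 # guard against overflow in exp
    w = np.exp(log_w)
    norm = 0.5 * np.sum((w[1:] + w[:-1]) * np.diff(volumes))
    return w / norm
```

Because only `pressure` changes between calls, a single multicanonical run can be reweighted to a whole range of pressures when searching for coexistence, as long as the resulting density has negligible weight at the ends of the grid.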

The Pathological Nature of the Square-Well System

We have not yet said how the set $\{V_i\}$ is to be chosen: as in an expanded ensemble calculation, there is no natural granularity to the macrostate space, so we should choose the number and spacing of the states to give the best performance. As we outlined in section , there is a trade-off between $r_a$, the acceptance ratio of volume-changing moves, and the number of accepted steps required to cross the macrostate space, and near-optimality is obtained for a fairly wide range of possible choices as long as the acceptance ratio is kept fairly high. A suitable way of choosing the states of an expanded ensemble simulation is described in section .

However, the square-well system presents unusual difficulties in this regard: because of the hard core in the potential, any trial volume change that produces even a single hard-core overlap will be rejected, which means that only small volume changes have a reasonable chance of being accepted. Thus we find that $N_m$ must increase rapidly with $N$, and must be made quite large even for a small system to avoid a very low acceptance rate. Moreover, the problem is more acute for small specific volume, where the particles are more closely crowded together, so the spacing of the states should vary with $V$ to achieve a roughly constant acceptance ratio. In practice, for $N$ we experimented to find spacings that gave an acceptance ratio of about . We used only two values of $\Delta V$, the smaller at low volumes and the larger (twice as big) at high. For $N$ and $N$ we used a rough ansatz for the dependence of $r_a$ on $V$ and $\Delta V$ to generate a suitable set $\{V_i\}$, with $N_m$ determined by the desired starting and finishing volumes. This method turned out to be quite effective in keeping $r_a$ constant: we find that for the $N$ system at it varies between and , while for the $N$ system at the same temperature it varies between and (it is higher at the low-volume end, so we are slightly overcompensating for the increased difficulty of accepting volume transitions here).

For comparison r varies b etween and for the N system with two dierent r s

a

More important is the scaling of $N_m$ with $N$. The procedure just described, which as we have said is found to keep $r_a$ roughly constant, produces $N_m$ for and $N$ , so that it seems that $N_m \sim N$, which can also be predicted by approximate scaling arguments. In a normal expanded ensemble calculation, by contrast, we would expect that $N_m$ would not need to be so large even for the smallest $N$, and that to keep $r_a$ constant would require only $N_m \sim N^{1/2}$, since the scaling of the microstate space to be covered ($\sim N$) would be partially cancelled by the $N^{1/2}$ scaling of the size of typical fluctuations (see section and ). It is this necessity of using large numbers of volume macrostates, combined with the rather slow speed per replica simulation of the Connection Machine, that produces the very long random walk time $\tau_{rw}$. As we shall show in section , this forces us to modify the way that the multicanonical ensemble is used, extending the use of the TP estimators to all stages of the simulation process.

We shall now go on to explain how an approximately multicanonical distribution is generated (section ) and then (section ) how it is used to produce the desired final estimators of canonical averages and quantities related to the phase transition. The procedure is once again an iterative one, with the iterations numbered with $n$ as in chapter .

Finding the Preweighting Function

Here we shall deal with the finding stage of the multicanonical distribution, where $\eta^n$ converges uniformly towards its ideal value $\eta^{xc}$. In section we have already established the utility of using TP estimators in this stage of the process, so we shall not justify their use further. However, we shall find that the way that coordinate updates and volume changes are separated in the square-well system enables us to gain useful further insight into why the convergence process takes place as it does.

Let us show how the finding process works by considering it in action for an $N$ system. We start with all the particles on their lattice sites and for each simulation choose the volume macrostate index $i$ with uniform probability in $1 \ldots N_m$. Thus we start with the $N_r$ replica simulations distributed fairly uniformly through macrostate space, rather than launching them all from only a few macrostates as in section . We then perform about equilibration sweeps through the lattice, updating the particles' positions but not yet attempting the volume-changing moves. The purpose of this is to allow the Markov chains to reach equilibrium at constant volume, so that $P(s^N|v)$ (analogous to $P(\sigma|i)$ from section ) has its canonical form. Because the random walk time is so long for this system, it is much quicker and easier to establish equilibrium this way than it would be by allowing the replica simulations to spread out over all these volume states from only a few release macrostates. Then we allow volume changes as well and gather histograms of volume transitions over short iterations with $N_s$ . This time is much shorter than $\tau_{rw}$, so each replica will explore only its local macrostates; however, we can find estimators for the whole macrostate space by pooling all the transitions of all $N_r$ replicas into a histogram $C_{ij}$ at the end of each iteration ($C^n_{ij}$ at the end of the $n$th iteration). Then we use $C^n_{ij}$ to estimate the TP matrix $\pi^n_{ij}$ using equation , and update $\eta^n$ using the simple scheme

$$\eta^{n+1} = \eta^n - \ln P^n + \text{constant}$$

where the estimator $P^n$ is the eigenvector of the estimated transition matrix $\pi^n_{ij}$. This is the same scheme that was used in section . In so far as $P^n$ is a good estimator, we would expect $\eta^{n+1}$ to be multicanonical.
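One iteration of this update can be sketched in NumPy as below. The estimate of the transition matrix here is a plain row normalisation of the pooled counts, which is an assumption of the sketch and may differ from the estimator the thesis refers to; the function name and the convention of shifting eta so that its minimum is zero are also choices made for illustration.

```python
import numpy as np

def update_eta(eta, counts):
    """One TP update: eta_new = eta - ln(P) + constant.

    eta    : current preweighting function, shape (Nm,)
    counts : pooled transition histogram, counts[i, j] = number of sweeps in
             which a replica went from macrostate i to j (including i -> i)
    Returns (eta_new, P), where P is the stationary distribution of the
    estimated transition matrix.
    """
    counts = np.asarray(counts, dtype=float)
    row_totals = counts.sum(axis=1, keepdims=True)
    if np.any(row_totals == 0):
        raise ValueError("every macrostate must have been visited")
    pi = counts / row_totals                  # row-normalised estimate
    # The stationary distribution is the left eigenvector of pi with
    # eigenvalue 1, i.e. an ordinary eigenvector of the transpose.
    eigvals, eigvecs = np.linalg.eig(pi.T)
    k = np.argmax(eigvals.real)
    p = np.abs(eigvecs[:, k].real)
    p /= p.sum()
    eta_new = np.asarray(eta, dtype=float) - np.log(p)
    return eta_new - eta_new.min(), p
```

Since only ratios of transition counts between neighbouring macrostates enter, replicas that each explore a small neighbourhood can still determine the estimate over the whole range once their counts are pooled.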

In fact, as we shall see, $\eta^n$ tends to converge to a limit over the course of several iterations, just as in section . As always, the accuracy of the TP estimators is reliant on equation being satisfied, which means here that we must be able to achieve equilibration at constant volume, even for those volumes that lie between the two equilibrium phases, in the initial constant-volume equilibration phase, and then we must be able to re-establish it by coordinate updates after each accepted volume change. We shall present strong numerical evidence below to show that this approximation works well while $\eta^n$ is converging to $\eta^{xc}$, and shall show that it is essentially exact for the multicanonical distribution. However, it does mean that our method would not be applicable to a system like a spin-glass, where equilibration is very difficult except in a particular set of macrostates. To simulate a spin-glass that required the same computational effort per update as the square-well system would probably require a substantially more powerful computer than the CM.

We now comment briefly on the choice of $N_s$. It is invariably the case that the initial $P^{can}(V)$ and the early (small-$n$) iterates $P^n(V)$ have almost no weight in the low-volume states, so that, notwithstanding the slow evolution of $V$, these tend to become unoccupied as the simulation proceeds. This implies that $\eta^n$ should be updated rapidly, on the basis of relatively few lattice sweeps per simulation, so that the simulations do not have time to move far and it is not necessary to reinitialise and re-equilibrate them to access the low-volume states once again. There is clearly a trade-off to be made between this and the requirement that enough transitions should be recorded for $C_{ij}$ to estimate $\pi_{ij}$ accurately, an effective 'fullness criterion' (cf. $N_{TP}$ in section ). We shall return to this matter, and to the effect of the amount of configurational updating performed between $V$ transitions, a little later.

In figure we show the results of the finding process for the $N$ system with $N_m$ , $N_r$ , $N_s$ (the results are shown as a function of $v = V/N$; this will be our policy from now on, as it allows results from different system sizes to be more easily compared). It is apparent that $\eta$ (shown in the large upper figure and its inset) converges to $\eta^{xc}$, covering decades of probability, within or iterations, which require only about minutes to perform on this small system. The lower figures show the VS histograms; they are not used for updating $\eta$ but are shown to give an indication of the distribution of the positions of the simulations at each stage of the iterative procedure. An initial tendency to equilibrate by moving to high-volume states, which have high equilibrium probability for $\eta$ , is quickly reversed, though the lowest-volume states do briefly become unoccupied in the nd iteration. We emphasise that the histograms do not reflect in any real sense the underlying sampled distribution: the simulations do not move far enough (an average of only th of the width of the macrostate space during each iteration) for the effect of the starting state to disappear. Thus here, where the starting states are uniformly spread out, the histograms give the impression of a sampled distribution that is more uniform than is really the case; wherever in the macrostate space they were initially clustered, they would seem to indicate that that region was the most probable, irrespective of its real equilibrium probability.

11

While it is true that simulations that had left the lowest states would eventually occupy them again once a multicanonical distribution over that part of macrostate space had been established, the time to reoccupy these states under a random walk is much longer than the time to leave them under a directed walk.

To show the inadequacy of VS estimators (see section ) at this stage of the process, we show in figure the $\eta$ generated by the TP method above, together with two VS estimators of it. One, Visited States (i), shows the effect of trying to use the VS histogram $\{C_i\}$ from the same iteration, where the replicas were originally spread uniformly. The other, Visited States (ii), derives from a different simulation (but of the same length) where the simulations are started more or less with their equilibrium canonical distribution over the volumes, so that they can reach it in the same equilibration time as was given to the TP simulation. This estimator is thus the best that one could expect to do by visited states without extrapolation. It is apparent that the TP estimator gives by far the best estimator of the true shape of $\eta^{xc}$.

Most of the larger systems ($N$ and $N$ ) are treated in practice by first using simple $L^d$ FSS (see section ) of $\eta^{xc}$ from a smaller simulation at the same temperature to give an initial estimate of $\eta$, which is then refined with the transition method as before. The FSS estimator is normally found to be very close to $\eta^{xc}$ (with the discrepancy getting larger as decreases), so only one or two iterations are required before only random fluctuations in $\eta$ are observed, signalling the end of the finding stage, and it is possible to move to the production stage.

However, to demonstrate that FSS is not essential, we also show the process of finding $\eta^{xc}$ from a zero start for $N$ with $N_m$ , $N_r$ and $N_s$ in figure .

12

The equilibration time from a start where the simulations were all in the state of highest specific volume would be more than ten times as long.

[Figure: upper panel, $\eta$ against $v$ (0.70 to 0.85) for $\eta^2$ to $\eta^8$ and $\eta^{13}$, with an inset covering $v$ from 0.715 to 0.735; four lower panels, histograms $C$ against $v$ (0.70 to 0.85).]

Figure : Top figure: the convergence of the preweighting $\eta$ for with $N$ . We show $\eta^2$ to $\eta^8$ with $N_s$ . We also show the preweighting function $\eta^{13}$, produced after two more iterations of sweeps and three of . In the inset a detail of the low-volume end is shown. The figures below show the histograms of VS $\{C^n_i\}$, pooled for all simulations, for iterations (top left), (top right), (bottom left) and (bottom right). It is apparent that the initial tendency to move out of the low-volume states, which have low equilibrium probability for $\eta$ , is reversed by iteration , and the distribution of simulations through the macrostate space is once again approximately uniform for the longer iterations like iteration .

[Figure: $\eta^2$ against $v$ (0.70 to 0.85), three curves: Transition Probability, Visited States (i), Visited States (ii).]

Figure : Estimators of $\eta^2$ evaluated using the TP estimator and two VS estimators. $N_s$ for all. The first VS estimator is produced by using the histogram $C$ from the same iteration that gave the TP estimator; the second derives from a simulation where the replicas are initialised with their equilibrium canonical distribution.

Once again the process of approaching to within a few percent of the multicanonical distribution is quite rapid: to generate the estimators in figure requires only hours of processing time, which is about of the time spent on the production stage.

We have not made a detailed study of the effect of the length of each iteration on the speed and stability of the algorithm during the finding stage. However, the following approximate argument may be used (and was used in the generation of the data in figure ) to predict a value of $N_s$ that is found empirically to be more than adequate for stability, while still maintaining a distribution of the simulations that is at all times fairly uniform over $\{V_i\}$. Let the average number of visits to each macrostate per iteration, summed over all the replicas, be $N_v$. Approximate the transition matrix by $\pi_{i,i+1} = a$ and $\pi_{i,i-1} = c$ for all $i$. Let $R = P_{N_m}/P_1$. Then

$$R = (a/c)^{N_m}$$

so, assuming the transitions (and the resulting estimates of $\pi_{ij}$) are all independent, and using simple error-propagation, we find

$$\frac{\delta R}{R} = \sqrt{N_m}\, \frac{\delta(a/c)}{(a/c)}$$

$\delta(a/c)/(a/c)$ is controlled by $N_v$ and largely the smaller of $a$ and $c$; taking it to be $c$, we have

$$\frac{\delta(a/c)}{(a/c)} \sim \frac{1}{\sqrt{c N_v}}$$

Now $N_v = N_s N_r / N_m$, and for stability we may demand $\delta R / R \sim O(1)$, which leads to

$$N_s \sim \frac{N_m^2}{c N_r}$$

[Figure: $\eta$ against $v$ (0.70 to 0.85) for $\eta^2$ to $\eta^5$ and $\eta^{xc}$, with an inset covering $v$ from 0.710 to 0.740.]

Figure : The convergence of the preweighting $\eta$ for with $N$ , $N_s$ and $N_r$ . We show $\eta^2$ to $\eta^5$. We also show the eventual final preweighting function $\eta^{xc}$, produced after iterations of sweeps but starting from a FSS estimator. In the inset a detail of the low-volume end is shown. Visited states histograms are not shown but are similar to those in figure .

In fact the results for various test runs imply that the algorithm is robust down to a value of $N_s$ rather smaller even than this: the value of $N_s$ used for $N$ above was arrived at by using equation but with $c$ , and the algorithm still converged to $\eta^{xc}$, though with much more noise, with $N_s$ and $N_r$ . Certainly the sweeps allowed for $N$ was far more than required. Equation has certain intuitively appealing properties: one would expect that for a single serial computation the length of time required to estimate $R$ to a minimal accuracy would be approximately equal to $\tau_{rw}$, and so increase as $N_m^2$; the result then shows that on a parallel computer this time is reduced by $N_r$, the number of replicas run in parallel. We should note that there is no contradiction between this result and the results of section , which apply to a slightly different system. There we were considering a single random walker, and the necessary assumption was made initially that the total run-time $N_s$ was much greater than $\tau_{rw} \sim N_m^2$.
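The stability estimate can be turned into a small calculator. This sketch reads the argument above as $\delta R/R \approx \sqrt{N_m/(c N_v)}$ with $N_v = N_s N_r/N_m$, so that demanding $\delta R/R \sim 1$ gives $N_s \sim N_m^2/(c N_r)$; the function names are hypothetical and the numbers in the usage lines are purely illustrative, not values from the thesis.

```python
import math

def relative_error_in_R(n_m, n_r, n_s, c):
    """Estimated delta R / R after one iteration of n_s sweeps.

    n_m : number of volume macrostates
    n_r : number of replicas run in parallel
    n_s : sweeps per iteration
    c   : the smaller of the two neighbour transition probabilities
    """
    n_v = n_s * n_r / n_m            # average visits per macrostate
    return math.sqrt(n_m / (c * n_v))

def sweeps_for_stability(n_m, n_r, c):
    """Smallest whole number of sweeps for which delta R / R is about one."""
    return math.ceil(n_m ** 2 / (c * n_r))

# Illustrative numbers only.
n_s = sweeps_for_stability(n_m=100, n_r=50, c=0.5)        # 400 sweeps
err = relative_error_in_R(n_m=100, n_r=50, n_s=n_s, c=0.5)  # 1.0
```

The quadratic dependence on the number of macrostates mirrors the random walk time, while the division by the number of replicas expresses the gain from pooling transitions across a parallel machine.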

The most striking difference between figures and is that the $N$ simulation actually converges faster than the $N$ , despite having the smaller $N_s$. As we shall confirm below, this is in fact the result of better equilibration between volume-changing moves; the rate of convergence does not depend on $N_s$ to any measurable extent. As in section , the TP method gives an accurate estimator of $P^n(v)$ to the extent that equation is true, i.e. to the extent that $P^n(s^N|v)$ has and maintains its canonical value. Since the volume changes preserve the configuration, and $P^n(s^N|v)$ varies with $v$, this must be dependent on the amount of configurational updating done per sweep. In the $N$ simulation all the particles' coordinates were updated between attempted volume-changing moves, but because of the way that the $N$ simulation was mapped onto the CM this was in fact not the case there. In the limit of perfect equilibration $\pi^n_{ij}/\pi^n_{ji}$ should estimate $P^n_j/P^n_i$ exactly, whatever the sampled distribution, so immediate convergence would be observed; that is to say, $\eta$ would already be multicanonical, apart from the effect of random fluctuations. The underestimate of the difference between the present sampled distribution and the multicanonical distribution is the result of incomplete equilibration, just as it was for Ising energy in section . There are two issues arising from this that must be checked. First, and more importantly for the final accuracy of the results, it is necessary to establish whether the same $\eta^{xc}$ is the fixed point of the iterative process (apart from random fluctuations) whether or not equilibration is good. To check this we have run simulations, each starting with the eventual limiting $\eta^{xc}$ obtained for $N$ with imperfect equilibration between volume changes. All the simulations perform the same number of volume updates but differ in the number $N_{eq}$ of coordinate updates of all the particles that they attempt between them. Several blocks of data are generated for each $N_{eq}$, and without updating $\eta^{xc}$, so that error bars can be obtained, to see if there is any evidence that $N_{eq}$ alters $\eta$ systematically away from $\eta^{xc}$. It is found that there is no discernible $N_{eq}$-dependent change in $\eta^{xc}$, even when several volume changes are made for every coordinate update. Thus we expect that, whatever $N_{eq}$, the algorithm will still eventually converge to the correct multicanonical limit, if it converges at all. We shall comment again on the special status of the multicanonical distribution a little later.

13

It also implies that the accuracy in $R$ obtained is independent of whether or not the range of macrostates is subdivided.

Having reassured ourselves that the multicanonical limit is correctly found, we now show in figure the effect of varying $N_{eq}$ on the early stages of convergence of $\eta^n$ for the same $N$ system (again with the same number of volume updates), to confirm that this is indeed the cause of the different rates of convergence in figures and . The iterative process is started from $\eta$ and iterations performed, with $\eta$ updated after each iteration this time; all the simulations had $N_s$ but $N_{eq}$ varies between and .

This time increasing the amount of equilibration performed during each iteration does have a clear effect: it increases the speed of convergence to the multicanonical limit. This occurs because the extent to which equation is satisfied controls the extent to which equation provides a good estimate of $P^n(v)$. That equation is not satisfied immediately is reflected in the fact that convergence to $\eta^{xc}$ is not immediate.

However, this does not explain why it is that the eigenvector of $\pi^n$ continually underestimates the change required to reach $\eta^{xc}$. To understand this it is necessary to consider what is occurring physically in the simulations. Initially they clearly tend to drift to higher volume states. Moves to higher volumes can be made freely, since the configuration is preserved, so there can be no hard-core overlaps and there is only the energy cost to consider, while the reverse moves to lower volume are strongly suppressed by the likelihood of a hard-core overlap. However, as $N_{eq}$ is reduced the moves to higher volume are largely unaffected, while the moves to lower volume become more likely. This occurs because they are likely to be simply reversals of a move that came from the lower volume state on the previous sweep: to the extent that equilibration is imperfect, the configuration is preserved from that sweep, and so is less likely to contain a hard-core overlap than one that truly reflects $P^n(s^N|v)$. Thus the estimated ratio $\pi^n_{ij}/\pi^n_{ji}$ is nearer to unity than the true $\pi^n_{ij}/\pi^n_{ji}$, the magnitude of the eigenvector is underestimated, and so is the change in $\eta^n$.

[Figure: six panels, $N_{eq}$ = 0, 1/2, 1, 2, 4 and 8, each showing $\eta$ against $v$ (0.70 to 0.85) for $\eta^2$ to $\eta^5$ and $\eta^{xc}$.]

Figure : Convergence of $\eta^n$ for $n$ , $N$ , $N_s$ , $N_r$ . Also shown is a suitable $\eta^{xc}$ (which in fact is from figure , $N_{eq}$ ).

Though we have not studied in detail the behaviour of the size of the systematic underestimate as a function of the distance of $\eta^n$ from the multicanonical limit, the results of figure and figures and seem to indicate empirically that the fractional extent of the underestimate remains about the same at each iteration, or perhaps decreases slightly as the more diffusive movement of the simulations through macrostate space gives greater time for equilibration within a set of macrostates. However, this constant fractional error in the eigenvector corresponds to a decreasing absolute error and thus a geometric convergence towards $\eta^{xc}$, whatever $N_{eq}$ may be; we have already shown that simulations conducted in what we believe to be the multicanonical limit, but with various $N_{eq}$, do not drift away from it, so implying that the same $\eta^{xc}$ is the limit of the generation process for all $N_{eq}$. Once we have arrived at a situation where further iterations produce only fluctuations in $\eta^n$ around $\eta^{xc}$, we move to the production stage.

Although it applies rather more to the next section (section ) than to this one, another particularly important result that becomes apparent from this investigation of the effect of $N_{eq}$ is why it is important to generate and use the multicanonical distribution in particular. It might at first seem that, because TP estimators can generate an estimate of a sampled distribution that varies over many orders of magnitude, multicanonical sampling is not necessary: any sampled distribution would appear to be adequate, even the original canonical distribution. It seems that we require only that adjacent macrostates are similar enough in equilibrium probability for transitions in both directions between them to occur. The investigation of the effect of $N_{eq}$ shows why this is not adequate: the fact that $\eta^{n+1}$ generated at the end of any stage $n$ is not generally equal to $\eta^{xc}$ shows that the estimator of $P^n$ is not generally equal to the real underlying $P^n$, the difference being due to incomplete equilibration at constant volume. Thus any estimates of $P^{can}$ or canonical averages made on the basis of the estimated $P^n$ would be heavily biased unless $N_{eq}$ were extremely large. It is only in the multicanonical limit, where the estimated and the true $P^n$ are both approximately constant, that the estimate becomes more or less independent of $N_{eq}$, and the arrival in this limit is signalled by the convergence of $\eta^n$. Thus we must reach this limit to be able to accurately reconstruct $P^{can}$.

Why should it be that $P^{xc}$ alone does not depend on $N_{eq}$? To understand this it is necessary to return to the equation (cf. equation )

$$P_s(t+1) = \sum_r P_r(t)\, \pi^n_{rs}$$

that describes the evolution of the pdf of the microstates $r$ and $s$, which here are the joint set of coordinates and volumes $\{s^N, v\}$. As we know from section , this converges for large $t$ to the equilibrium distribution $P^n_s$, which satisfies $P^n_s = \sum_r P^n_r \pi^n_{rs}$. We have in fact two update matrices $\pi$, one (equation ) describing coordinate updates and the other (equation ) $v$ updates. They both preserve $P^n_s$ once it is established. Now to be sampling from $P^n_s$ implies both that local equilibrium $P^n(s|i) = P^n(s^N|v)$ should be established and that the distribution over the macrostates $P^n_i$ should have its equilibrium value. In early iterations the first condition is satisfied at $t = 0$, because before starting we relax all the replicas with coordinate updates, but the second is not, because the replicas are distributed uniformly over macrostate space but the equilibrium $P^n_i$ is far from uniform. Thus when we update with a $v$ transition, $P_s(t)$ changes without yet reaching $P^n_s$, and $P(s|i, t)$ loses its equilibrium value. It must be relaxed again with coordinate updates if equation is to be satisfied. However, if we have $P^n_s = P^{xc}_s$ then $P_i(t = 0) = P^{xc}_i$, because we chose the distribution of the replicas to be uniform. The result, given that $P(s|i, t = 0) = P^{xc}(s|i)$, is that the underlying Markov process is in equilibrium right from the start, $P_r(t = 0) = P^{xc}_r$, and so stays in equilibrium at all later times, even under the action of $v$ updates alone: $N_{eq}$ is irrelevant. The special status of multicanonical sampling, then, is due to the fact that the equilibrium $P^n_i$ accurately reflects the distribution of the replicas.

It should be noted that the investigation of $N_{eq}$ that has just been carried out is only possible because the configurational updates and the volume changes are completely separate here, and the one can be performed without affecting the other. In the Ising case, though similar effects are clearly present (see section ), the configurational updates (spin flips) are also the means by which the macrostate is changed, so an investigation which relies on disentangling the two would be much harder.

Finally, we remark that it is not clear from the results of figure what the best (computationally cheapest) strategy is in practice in the finding stage. Increasing $N_{eq}$ decreases the number of iterations required, but of course the total computer time per iteration increases linearly with $N_{eq}$, from mins/iteration for $N_{eq}$ to mins/iteration for $N_{eq}$ . The best strategy is probably intermediate between these two, though we do not expect any improvement to be very great and we have not investigated it in detail.

14

If we chose some other, non-uniform distribution of replicas then of course the special sampled distribution would be the one that reflected that distribution.

The Production Stage

As we have seen, imperfect equilibration at constant volume leads to an initial stage of simulation during which $\eta^n$ is not close to $\eta^{xc}$ but does converge to it ($\eta^{n+1} - \eta^n$ is uniformly positive). The corollary of this is that the arrival of $\eta^n$ close to $\eta^{xc}$ is signalled by the move to a situation where $\eta^n$ undergoes only random fluctuations between iterations. We have also seen that it is necessary to reach this regime before the estimated $P^n$ is a trustworthy estimator of the true $P^n$. At this point we move to the production stage.

In the applications of the multicanonical method in chapter it was found best to use VS estimators at this stage, whatever method had been used to establish $\eta^{xc}$; with TP estimators there was a reduction in speed and, more importantly, a systematic bias, at least for energy preweighting. Therefore, to motivate and justify our use of TP estimators for the square-well system, we shall show analytically and confirm by direct simulation that the VS method is unusable because of the extremely long random walk time. We shall also show that any bias is very small here on the scale of the random error.

The VS estimator only provides a good estimator of the probability if the run-time of each replica simulation is much greater than the ergodic time (at least equal to the random walk time in this case), so that the replica simulations have forgotten all information about their starting states, which would otherwise heavily bias the result (see the discussion in section ). Let us see what this would entail for the simulation with N , for which it is found that $N_m$ and the average acceptance ratio of volume transitions is about . Simple random walk arguments imply that about volume updates, i.e. sweeps, would be required for a single simulation to traverse the whole range of macrostates. By contrast, the highest speed we can achieve on the CM is sweeps per simulation per hour for N , $N_r$ , implying the need for over hours even for one random walk per simulation. In fact, about sweeps are needed for a simulation to perform a directed walk from one end of the macrostate space to the other in the case when these states differ in probability by more than orders of magnitude. Thus we see that, using visited-states estimators, it is impossible to take advantage of the full power of the parallel computer to run many replicas of the simulation, since the initial distribution of the simulations over the macrostate space

15 The extra bookkeeping of the transition probability method is negligible for the square-well system.

16

We should note that this is far from the greatest value of N N N that can b e obtained with N

c s r r

6

N sweepshour while with N sweepssimulationhour can b e attained so N

c r c hour


will persist throughout the simulation. Of course, given that we keep the uniform distribution of replicas over the macrostates that was used in the finding stage, then once we are close to the ideal multicanonical distribution the VS histograms do appear to reflect the sampled distribution, as we remarked in the previous section. However, this is only a result of the fact that the distribution of simulations through the volume macrostates is itself chosen to be nearly uniform. Any further information that we might appear to gain about the sampled distribution from using such VS estimators would be illusory.
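The random-walk argument above can be turned into a quick order-of-magnitude estimate. The following is a minimal sketch, assuming a purely diffusive walk over the macrostates (so that the number of attempted volume updates scales as the square of the number of macrostates divided by the acceptance ratio). The function names, the `updates_per_sweep` parameter and any numbers passed in are illustrative assumptions, not values from this chapter.

```python
def random_walk_sweeps(n_macrostates: int, acceptance: float,
                       updates_per_sweep: float = 1.0) -> float:
    """Rough number of sweeps for an unbiased walk to cross all macrostates.

    Assumes a simple diffusive walk: about n_macrostates**2 accepted steps,
    hence n_macrostates**2 / acceptance attempted volume updates.
    """
    if n_macrostates < 1:
        raise ValueError("need at least one macrostate")
    if not 0.0 < acceptance <= 1.0:
        raise ValueError("acceptance ratio must lie in (0, 1]")
    attempted_updates = n_macrostates ** 2 / acceptance
    return attempted_updates / updates_per_sweep


def hours_needed(sweeps: float, sweeps_per_hour: float) -> float:
    """Wall-clock hours for one replica to perform the given number of sweeps."""
    if sweeps_per_hour <= 0.0:
        raise ValueError("speed must be positive")
    return sweeps / sweeps_per_hour
```

Comparing `hours_needed(random_walk_sweeps(...), speed)` with the affordable run-time per replica gives a quick check on whether visited-states estimators could have equilibrated over the whole macrostate range.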

The inadequacy of VS estimators means that we have no choice but to stay with the TP estimators used in the finding stage. As we have seen, TP estimators implicitly include and correct for the effect of the starting state, so the lack of equilibration over all the volume macrostates is not a problem. In chapter it was found that TP estimators were sometimes biased; however, after the analysis of section , showing that here the underlying Markov process should always be in equilibrium, we would not expect any problem with this. Nevertheless, to be doubly certain, we shall present substantial evidence below to show that there is no bias. As further corroborating evidence, we recorded VS estimators throughout the long simulations that were used to generate the results in sections and , and none of them ever showed any indication of systematic drift of the replicas through macrostate space, showing that any bias is certainly smaller than can be detected by VS in a run of practical length.

We shall now describe how the TP estimators are used in practice to generate the results. Rather than keeping the multicanonical weights constant in the production stage, as was done in generating the results in section , they are updated after each iteration using equation , just as in the finding stage. The only difference is that $N_s$ is normally longer, of the order of a few thousand sweeps per iteration rather than a few hundred. This simple scheme is used since we found in chapter that the more complex updating scheme described by equations and seems to yield only a marginal improvement. We can now recover an estimator of the canonical probability for each iteration using equation .
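To illustrate why a transition-probability estimator is insensitive to where the replicas were started, here is a minimal sketch. It assumes, purely for illustration, that macrostate transitions occur only between neighbouring macrostates, so that the sampled distribution follows from detailed balance on the row-normalised transition histogram. The function name and the data layout are hypothetical; the estimator actually used in this work is defined by the equations cited in the text.

```python
def stationary_from_counts(counts):
    """Estimate the sampled distribution from a transition histogram.

    counts[i][j] is the number of observed transitions from macrostate i
    to macrostate j (with 'stay' events on the diagonal). Only nearest-
    neighbour transitions are assumed to occur.
    """
    n = len(counts)
    probs = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a macrostate was never visited")
        # Row normalisation removes the dependence on how often each
        # macrostate was visited, i.e. on the starting distribution.
        probs.append([c / total for c in row])

    weights = [1.0]
    for i in range(n - 1):
        up, down = probs[i][i + 1], probs[i + 1][i]
        if up == 0.0 or down == 0.0:
            raise ValueError("neighbouring macrostates are not connected")
        # Detailed balance: P[i] * up = P[i+1] * down
        weights.append(weights[-1] * up / down)

    norm = sum(weights)
    return [w / norm for w in weights]
```

Multiplying every entry of one row of `counts` by a constant (as would happen if more replicas were released from that macrostate) leaves the result unchanged, which is the property exploited in the production stage.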

The procedure in finding the coexistence point and generating canonical averages is then to find a set $\{p^n\}$ by identifying, for each iteration, $p^n$ as that pressure which gives a double-peaked $P^{n,can}(v)$ with equal weight in the two peaks. Next, the members of the set are averaged to give a mean $p_{coex}$ and an error bar. Finally, $\{P^{n,can}(p_{coex}, v)\}$ are re-evaluated using the same best estimate $p_{coex}$ for every iteration. Properties like the densities and compressibilities of the coexisting phases then follow by calculating $\langle v \rangle$ and $\langle v^2 \rangle - \langle v \rangle^2$ for each phase on each member of $\{P^{n,can}(p_{coex}, v)\}$ and averaging, while averaging the members of the set $\{P^{n,can}(p_{coex}, v)\}$ itself gives a best estimate of the pdf of $v$ at coexistence.

All the above can be done, if desired, for pressures away from $p_{coex}$, provided that equation is satisfied. In all the simulations performed here all the iterations used were the same length, though it is not essential that this should be the case.

Type of Estimator                        $p_{coex}$
simple average TP
single jackknife TP
double jackknife bias-corrected TP

Table : Various estimators of coexistence pressure for N , $N_s$ , $N_r$ , with constant multicanonical weights. All estimators come from transition probabilities.
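The equal-weights search described above can be sketched as follows. This is a minimal illustration, assuming the canonical pdf of $v$ is stored as log-probabilities on a grid of specific volumes and that a change of pressure reweights it by the usual isothermal-isobaric factor $\exp(-\beta N \Delta p\, v)$. All function and parameter names (`reweight`, `v_split`, the bracketing pressures) are hypothetical, and the fixed dividing volume `v_split` stands in for whatever division between the phases is adopted.

```python
import math


def reweight(log_prob, volumes, beta, n_particles, dp):
    """Return the normalised pdf of v after shifting the pressure by dp."""
    logs = [lp - beta * n_particles * dp * v for lp, v in zip(log_prob, volumes)]
    top = max(logs)  # subtract the maximum to avoid overflow
    w = [math.exp(x - top) for x in logs]
    total = sum(w)
    return [x / total for x in w]


def dense_weight(prob, volumes, v_split):
    """Total probability of the dense phase (volumes below v_split)."""
    return sum(p for p, v in zip(prob, volumes) if v < v_split)


def equal_weights_pressure(log_prob, volumes, beta, n_particles,
                           p0, v_split, p_lo, p_hi, tol=1e-10):
    """Bisect for the pressure at which both phases carry weight 1/2.

    log_prob is the log of the pdf measured at pressure p0; raising the
    pressure favours the dense phase, so the excess weight is monotonic.
    """
    def excess(p):
        prob = reweight(log_prob, volumes, beta, n_particles, p - p0)
        return dense_weight(prob, volumes, v_split) - 0.5

    if excess(p_lo) > 0.0 or excess(p_hi) < 0.0:
        raise ValueError("p_lo and p_hi do not bracket the coexistence pressure")
    while p_hi - p_lo > tol:
        mid = 0.5 * (p_lo + p_hi)
        if excess(mid) < 0.0:
            p_lo = mid
        else:
            p_hi = mid
    return 0.5 * (p_lo + p_hi)
```

Applying `equal_weights_pressure` to the pdf from each iteration would yield the set of per-iteration pressures whose mean and scatter give the best estimate and its error bar.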

The one disadvantage of this continual updating of the multicanonical weights is that we lose the ability to use jackknife bias-corrected estimators. To see whether bias-correction would yield significantly different estimators, we have performed a test run with constant weights, to allow jackknife bias-corrected TP estimators to be calculated. The parameters of the simulation were N , $N_s$ , $N_{eq}$ , $N_r$ , and eight blocks of data were gathered. The results for the coexistence pressure, evaluated using the equal-weights criterion, are shown in table . To evaluate $p_{coex}$, the canonical probability distribution is reconstructed using equation with three different kinds of TP estimator for $P^{xc}(v)$. The first (simple average) is defined on each transition histogram $C^n_{ij}$ separately; to produce the second (simple jackknife), all the transition histograms except the $n$th are pooled to make the $n$th estimator. The third is a double-jackknife bias-corrected estimator (see appendix D). It is clear that, within the error bars, the three estimators are identical. The procedure of continually updating the weights is thus justified, and so is the analysis of section , showing that equilibrium is always exactly maintained, and so that there should be no bias given that the actual distribution of the replica simulations reflects their equilibrium distribution. It thus seems likely that the bias in chapter had this source, because no effort was made to keep the visited-states histograms flat: simulations were purposely released from only a few starting macrostates.
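For readers unfamiliar with the jackknife, the following minimal sketch shows the ordinary delete-one version with first-order bias correction. It is not the double-jackknife estimator of appendix D; the function name and the representation of a block as a single number are assumptions made for illustration (in the application above each block would be a transition histogram and the estimator would return a coexistence pressure).

```python
def jackknife(blocks, estimator):
    """Delete-one jackknife: bias-corrected estimate and standard error.

    blocks    -- list of per-block data items
    estimator -- function mapping a list of blocks to a number
    """
    n = len(blocks)
    if n < 2:
        raise ValueError("need at least two blocks")
    full = estimator(blocks)
    # Re-evaluate the estimator with each block left out in turn.
    leave_one_out = [estimator(blocks[:i] + blocks[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    # First-order bias correction removes the O(1/n) bias term.
    bias_corrected = n * full - (n - 1) * mean_loo
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return bias_corrected, variance ** 0.5
```

For a linear estimator such as the mean, the correction leaves the estimate unchanged and the error reduces to the usual standard error; for a nonlinear estimator the leading bias is removed.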

Having justified the use of TP estimators in all stages of the simulation process, including the obtaining of canonical averages, we shall now comment on the accuracy of the estimators obtained. The absolute magnitude of the uncertainty in the multicanonical weights is $O(1)$ for the larger systems, i.e.

17 13

Once again  from the N TP simulation was used r


$O(1)$, and thus, because $P^{can}$ depends exponentially on the weights, the fractional error in the reconstructed canonical probability is also $O(1)$, that is $\delta P^{can}/P^{can} \sim O(1)$, at least for some states.

This can be confirmed by inspection of the results of section , in particular figure . This uncertainty is large compared with what was achieved in chapter : we are contenting ourselves with a lower level of accuracy in our knowledge of the sampled distribution, and thus of the canonical distribution and canonical averages. However, the error bars alone do not quite tell the whole story. Very little of the error is attributable to local fluctuations: the shapes of the peaks of $P^{can}$ are individually well outlined, so averages calculated over the peaks separately (i.e. calculated for a single phase), such as the average specific volume $\langle v \rangle$ and the isothermal compressibility $\kappa_T$ for a particular input pressure, do not have such large errors as one might expect from the $O(1)$ error in $P^{can}$. Indeed, comparison of figures and shows that the error bars are smaller away from coexistence, when only one phase need be considered. By far the larger part of the error in $P^{can}(p_{coex})$ comes from uncertainty in the relative weights of the two phases: the interphase distance in $V$-space is $V_{expanded} - V_{dense} = N (v_{expanded} - v_{dense})$, while the width of the pdf of a single phase is $\sim \sqrt{N}$, as we know from statistical mechanical arguments (see section ) and shall confirm in section . Thus, for all but very small systems or systems very near to criticality, the interphase distance is larger and so has the greater effect on error, because the local uncertainties in $P^{can}(p_{coex})$ accumulate over the large distance in volume that separates the phases. In addition, the error in the intensive parameter $p_{coex}$ (or, equivalently, in the difference in free energy density) is smaller than might be expected. This occurs because $p_{coex}$ affects the relative weights of the phases via a term $p\,(V_{expanded} - V_{dense})$; thus, because $V$ is extensive, an $O(1)$ error in the relative weights corresponds only to an $O(1/N)$ error in $p_{coex}$. We anticipate that for still larger systems accurate results would be obtained even if the error in the sampled pdf were larger than $O(1)$, that is to say, even before the establishing of what we have defined as a multicanonical distribution.

That the error bars on $p_{coex}$ are indeed small, even for N , is shown in figure below. In fact, as we shall see, notwithstanding the $O(1)$ error, our results are at least as good as the results already published for this system.


The Multicanonical Method Compared with Literature Thermodynamic Integration

It is apparent that by using the transition method (possibly combined with FSS) we can measure the entire $F(v)$ of the square-well solid from knowledge only of a volume interval that we expect to contain the coexisting phases. We should contrast this with the thermodynamic integration procedure used in , where a reference system must be used for each $F(v)$ point. When enough of $F(v)$ is mapped out, $p_{coex}$ and $v$ are located with the double-tangent construction. The reference system used in is the hard-sphere solid, for which a good equation of state is known; this is therefore not a computationally expensive procedure, since the potentials of the two systems are similar and so only a short integration path is required (though of course for other systems such a convenient reference system might not exist, and there is some evidence, even here, that there are problems with it near the critical point; see section ). In any case, compared with the multicanonical method, extra complexity is introduced by the use of the reference systems, while we still need an a priori guess of the volumes where we believe the phase coexistence lies. Moreover, the double-tangent construction is equivalent to finding the pressure $p_{coex}$ that produces equal heights of the peaks of $P^{can}(p, v)$; the equal-weights and equal-heights criteria both have the same large-system limit, but for small systems the equal-weights criterion is the more natural. We shall discuss in section which gives the better estimate of the infinite-volume transition point.

Canonical Averages

We now present some results for various canonical averages, evaluated using this version of the multicanonical method. The averages are calculated both at the finite-size coexistence point and for a substantial range of pressure around coexistence. We choose as an example the N system at , though similar calculations can be made for all the other system sizes and temperatures investigated. We used $N_r$ and $N_m$ . First, in figures and , we show how $P^{can}(v)$ can be reconstructed for various pressures from the multicanonical results. Figure and its inset show how $P^{can}(p_{coex}, v)$ can be accurately measured over a range of, here, more than decades. Once again we emphasise that, even if we had a serial computer as fast as the Connection Machine, and so were not so hampered by the long random walk time, Boltzmann sampling would fail on a problem like this, where two modes are separated by a region of very low probability.

[Figure: $P(v)$ against $v$ for $p = 70$, $p = 30.2$ and $p = 25$; inset: $P(v)$ against $v$ in the region between the peaks]

Figure : Main figure: $P^{can}(v)$ at high ($p = 70$), low ($p = 25$) and intermediate ($p = 30.2$) pressures for and N . $p = 30.2$ is the finite-size estimate of the coexistence pressure, where the two phases have equal weights (note that the probability density is much smaller in the rare phase on the right because of its higher compressibility). $p = 25$ is the lowest pressure which can be reliably investigated, because at lower pressures $P^{can}$ has appreciable weight outside the investigated range of volume. The inset shows $P^{can}$, with typical error bars, in the region between the peaks. $P^{can}$ was smoothed using a moving average over a window of volume states.

Then, in figure , we show $\langle v \rangle$, the average volume per particle, evaluated from these distributions as a function of pressure. $\langle v \rangle$ is calculated by averaging only over the phase that is favoured at the pressure under consideration (this should be compared with figure below). The discontinuity in $\langle v \rangle$ at $p$ , showing the presence of a first-order phase transition, is clearly visible. The estimates of $\langle v \rangle$ at $p_{coex}$ are also shown as data points on this figure. Figure shows the isothermal compressibility $\kappa_T = -(1/\langle v \rangle)\,\partial \langle v \rangle / \partial p = \beta N (\langle v^2 \rangle - \langle v \rangle^2)/\langle v \rangle$ as a function of $p$, with the averages once again calculated only over the favoured phase. Once again the effect of the phase transition is clearly apparent; the very different values of $\kappa_T$ reflect the different structures of the two phases (see section ). We comment upon the finite-size scaling of $\langle v \rangle$ in section .
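The single-phase averages described above can be sketched in a few lines. This is a minimal illustration, assuming the reconstructed canonical pdf is available as a list of probabilities on a grid of specific volumes and that a phase is selected by a volume window. The function name and the window arguments are hypothetical; the compressibility uses the fluctuation formula $\kappa_T = \beta N (\langle v^2 \rangle - \langle v \rangle^2)/\langle v \rangle$.

```python
def phase_averages(prob, volumes, beta, n_particles, v_min, v_max):
    """Mean specific volume and isothermal compressibility of one phase.

    Only grid points with v_min <= v < v_max contribute, so the averages
    are taken over a single peak of the pdf.
    """
    selected = [(p, v) for p, v in zip(prob, volumes) if v_min <= v < v_max]
    weight = sum(p for p, _ in selected)
    if weight <= 0.0:
        raise ValueError("no probability inside the chosen volume window")
    mean_v = sum(p * v for p, v in selected) / weight
    mean_v2 = sum(p * v * v for p, v in selected) / weight
    # Fluctuation formula for the compressibility in terms of v = V/N
    kappa_t = beta * n_particles * (mean_v2 - mean_v ** 2) / mean_v
    return mean_v, kappa_t
```

Calling the function once with a window around each peak gives the two branches plotted against pressure.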

We should also confirm that the system is indeed solid for all densities studied. To do this we show in figure the pair correlation function $g(r) \propto P(r)/r^2$ for N , where $P(r)\,dr$ is the probability that a particle has a neighbour in a shell of radius $r$ and thickness $dr$ centred on its present position. The full line shows averages gathered using those simulations that were in the dense phase (in fact the average was gathered for a range of densities rather wider than the peak of $P^{can}(p_{coex}, v)$ and was formed without weighting by $P^{can}(p_{coex}, v)$, so we would expect it to be slightly different in detail from the true $g(r)$ at constant $NpT$, but qualitatively the same); the dashed line shows the corresponding results for the expanded phase. It is clear that the particles in the expanded phase have substantially more freedom of movement; nevertheless, the fact that $g$ drops to zero between the peaks shows that the particles remain localised on their lattice sites, that is, that both phases are solid. The ratio of the location of the $n$th and $m$th peaks for $n$, $m$ is $\sqrt{n/m}$, as expected for a fcc lattice.
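The $\sqrt{n/m}$ rule for the peak positions can be checked directly on an ideal lattice. The sketch below is an illustration only: it enumerates fcc lattice vectors in units of half the cubic cell edge (integer triples with an even coordinate sum) and returns the neighbour-shell radii relative to the nearest-neighbour distance. The function name and the default of 13 shells are choices made here; the rule holds for the first 13 shells of the ideal lattice, after which a shell is missing.

```python
import math


def fcc_shell_ratios(n_shells=13, reach=5):
    """Radii of the first fcc neighbour shells, in units of the first shell.

    Lattice vectors are (i, j, k) * a/2 with i + j + k even; 'reach' bounds
    the coordinates searched and must be large enough for n_shells.
    """
    squared = set()
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            for k in range(-reach, reach + 1):
                d2 = i * i + j * j + k * k
                if d2 > 0 and (i + j + k) % 2 == 0:
                    squared.add(d2)
    shells = sorted(squared)[:n_shells]
    return [math.sqrt(d2 / shells[0]) for d2 in shells]
```

The $n$th returned ratio equals $\sqrt{n}$, so the ratio of the $n$th to the $m$th shell radius is $\sqrt{n/m}$, matching the peak positions expected in $g(r)$ for a solid that has kept its fcc structure.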

In figure we show a detail of $g$ at around $r$ . This figure clearly shows that $g$ has discontinuities at $r$ and $r$ , matching the discontinuities in the potential. The discontinuity at $r$ is simply a consequence of the hard core in the potential, which prevents any particles approaching more closely than $\sigma$. The presence of the other can be rationalised by considering $g$ for two isolated particles. Since $g(r) \propto P(r)/r^2$, we would expect that as

g g exp E

In the solid, $g$ is of course modulated by the presence of all the other particles, but there is no reason to expect their contribution to produce a discontinuity at $r$ , and any other behaviour will leave equation unaffected. The presence of this discontinuity is thus explained, and from the figure we can estimate its magnitude to be

g g dense phase

expanded phase

in good agreement with exp
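The two-particle argument can be written out explicitly. The sketch below assumes the standard low-density result $g(r) = \exp[-\beta u(r)]$ and introduces two symbols that are assumptions of this illustration rather than the notation of the text: $\epsilon > 0$ for the depth of the square well and $r_w$ for its outer edge.

```latex
\begin{align*}
u(r) &=
  \begin{cases}
    \infty,    & r < \sigma \\
    -\epsilon, & \sigma \le r < r_w \\
    0,         & r \ge r_w
  \end{cases} \\
g(r) &= e^{-\beta u(r)}
  \quad\Rightarrow\quad
  \frac{g(r_w^-)}{g(r_w^+)} = \frac{e^{\beta\epsilon}}{1} = e^{\beta\epsilon}
\end{align*}
```

On this argument the size of the jump in $g$ at the edge of the well depends only on $\beta\epsilon$, which is why the same ratio is expected in both the dense and the expanded phase.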

[Figure, upper panel: $P(v)$ against $v$ in the dense-phase region for $p = 70$, 50, 35, 30.5, 30.25, 30, 29.75 and 29.5. Lower panel: $P(v)$ against $v$ for $p = 29$, 28, 27, 26 and 25]

Figure : A more detailed view of the peaks of $P^{can}(p, v)$ for the same and N system featured in figure , but at a range of different pressures. $P^{can}$ was smoothed using a moving average over a window of volume states, and some typical error bars are shown. The upper diagram shows the peak corresponding to the dense phase, while the lower diagram shows the peak corresponding to the expanded phase. Note that in this diagram it is just discernible for $p$ that there is some weight at volumes corresponding to the dense phase.

[Figure: $\langle v \rangle$ against $p$]

Figure : $\langle v \rangle$ of the favoured phase as a function of pressure for and N . Typical error bars are also shown. The estimates of $\langle v \rangle$ for each phase separately at coexistence are also shown (triangles).

[Figure: $\kappa_T$ (logarithmic scale) against $p$]

Figure : Isothermal compressibility $\kappa_T$ of the favoured phase as a function of pressure for and N . Because of the large variation in $\kappa_T$ between the two phases, a logarithmic vertical scale is used. The compressibility is evaluated using $\kappa_T = \beta N (\langle v^2 \rangle - \langle v \rangle^2)/\langle v \rangle$. Typical error bars are also shown.

[Figure: $g$ against $r/\sigma$ for the dense phase and the expanded phase]

Figure : $g(r)$ for N . Full line: average over the dense phase; dashed line: average over the expanded (rare) phase.

[Figure: $g$ against $r/\sigma$ for the dense phase and the rare phase, detail near the hard core and the well]

Figure : $g(r)$ for N : detail of the region $r$ . Full line: average over the dense phase; dashed line: average over the expanded (rare) phase.


Finite-Size Scaling and the Interfacial Region

We shall now look at the finite-size behaviour of $P^{can}(p_{coex}, v)$ and the various estimators obtained from it. We shall use both $N$ and $L$, the side length of the simulation volume, to measure the system size. It is instructive to compare $P^{can}(v)$ for various system sizes to show explicitly the decreasing size of the fractional fluctuations with increasing $N$ expected from elementary statistical mechanics (section ). This is done for $N = 32$, 108 and 256 in figure . As well as the expected narrowing of the peak, it is also apparent that the finite-size estimate of $\langle v \rangle$ at the finite-size coexistence point depends quite strongly on $N$, and an extrapolation must be used in making an estimate of $\langle v \rangle$ in the $N \to \infty$ limit.

[Figure: $P(v)$ against $v$ for $N = 32$, $N = 108$ and $N = 256$; inset: the expanded-phase peak]

Figure : $P^{can}(v)$ for $N = 32$, 108 and 256. The peaks narrow as $N$ increases, and there is also some variation of $\langle v \rangle$ with $N$. The inset shows the expanded phase in greater detail.

The inset shows the expanded phase in more detail. The narrowing of $P^{can}(v)$ and its movement to lower volumes are clearly visible. In fact, in addition to the expected qualitative behaviour of $P^{can}(v)$, there is also quantitative agreement with the prediction $\delta v/\langle v \rangle \sim 1/\sqrt{N}$: for example, the compressibility of the expanded phase, evaluated for the three system sizes at a pressure just below the equal-weights transition point, is $\kappa_T$ for N , $\kappa_T$ for N and $\kappa_T$ for N . The approximate equality of the estimates of $\kappa_T$ shows that $\mathrm{var}(v)/\langle v \rangle \sim 1/N$, i.e. $\delta v/\langle v \rangle \sim 1/\sqrt{N}$.

Figures and show more clearly the finite-size scaling of estimators, in particular those related to the transition point, again for . Figure shows $\langle v \rangle_N$ as a function of $p$, where $\langle v \rangle$ is now averaged over all volume states (cf. figure ). The narrowing of the pressure region over which the transition takes place is clearly visible, as is its movement to higher pressure. The diagram also shows thermodynamic integration results for an $N = 108$ system, taken from . There is clearly good agreement between the two sets of results.

[Figure: $\langle v \rangle$ against $p$ for $N = 32$, $N = 108$ and $N = 256$, with thermodynamic integration results for $N = 108$]

Figure : $\langle v \rangle_N$, evaluated by averaging over all volume states, as a function of $p$ for $N = 32$, 108 and 256. Also shown are thermodynamic integration results for $N = 108$ from .

Figure shows the finite-size scaling of the estimate of the transition pressure $p_{coex}$, evaluated both by equal-heights and equal-weights criteria. Once again the results seem consistent with the estimate given in . Both the estimators seem to experience a $1/N$ finite-size correction with respect to the infinite-volume limit, though the correction is rather smaller in the equal-weights case than in the equal-heights case. For this reason, and because it is the more natural finite-size estimator, we have generally preferred to use the equal-weights criterion. The least-squares fits to both sets of data (dashed lines) are equally good, and both have the same intercepts within error. The ordinate intercepts, which are the best estimates of the infinite-volume transition point, are both at $p_{coex}$ .

Our findings are interesting in the light of the theoretical prediction, made for lattice models with periodic boundary conditions in , that the equal-heights estimator of the transition field (temperature or magnetic field) should indeed have $1/N$ corrections, but the equal-weights estimator should have only exponentially small corrections. It might be thought that these arguments should apply to off-lattice systems too, but our results suggest that neither estimator has exponentially small corrections. This may be due to the fact that the ensemble under consideration here, with its variable volume, is different from the constant-volume ensemble of the lattice model.

[Figure: $p_{coex}$ against $1/N$ for the equal-heights and equal-weights criteria, with thermodynamic integration results]

Figure : $p_{coex}$ as a function of $1/N$, evaluated using equal-weights and equal-heights criteria for . Also shown are thermodynamic integration results for N from and least-squares fits to the data (dashed lines), both of which have their ordinate intercept at $p$ .

Another interesting application of finite-size scaling theory is to the canonical probability $P^{can}(p_{coex}, v)$ in the region between its two peaks. Let us define

$r_p \equiv \ln \left[ \frac{P^{can}(v_{coex})}{P^{can}(v_l)} \right]$

where $P^{can}(v_{coex})$ is the probability density at one of the peaks of $P^{can}(v)$ and $P^{can}(v_l)$ is the pdf at its lowest point between them ($v_{coex}$ and $v_l$ vary slightly with system size). We show the measured behaviour of $\ln r_p$ against $\ln L$ for in figure ; the graph

[Figure: $\ln(r_p)$ against $\ln(L)$]

Figure : $\ln r_p$ against $\ln L$ for and system size $L$, $N$ . $r_p = \ln[P^{can}(v)/P^{can}(v_l)]$, by simulation.

has a gradient of . Now the statistical mechanics of phase coexistence (see section ) predicts that the dominant configurations around $v_l$ should consist of regions typical of each of the stable phases, separated by the phase boundaries. Accordingly, the free energy in this region has the form

$F(v) = L^d f_b(v) + a(v) L^{d-1} f_s$

where $f_b(v)$ is the free energy of the bulk, $a$ depends on the geometry and $f_s$ is a surface free energy. Thus, since $P^{can}(v) \propto \exp[-\beta F(v)]$, we would expect $r_p \sim L^{d-1} = L^2$. The discrepancy between this prediction and the result of the simulation is, we believe, a consequence of the particular simulation method we have chosen: periodic boundary conditions are used, and changes in volume are made by a uniform scaling of the particles' positions, which leaves the shape (cubic) of the simulation volume unchanged. If one tries to generate a mixed-phase configuration in such a cubic box, one finds that it cannot be easily made to fit: the size of the box is determined by the largest length in the region of expanded phase, and it cannot contract around the regions of dense phase. Some planes of particles in the dense phase are therefore separated, effectively breaking up the uniform structure of the phase. In a simulation of a fluid, particles would move between the phases to fill up the gaps, but that does not occur here, since the particles are held on their lattice sites; and even if they could move, there would be commensurability problems with the simulation box, whose side length is only a few times the lattice spacing. Thus the simulation fails to open up what is presumably, for a real system, the dominant configuration-space pathway between the two phases, because of the suppression of configurations containing interfaces. Instead, the most probable configurations in the interphase region have a uniform structure that is commensurate with the shape of the simulation box.

This implies

$F(v) = L^d f(v)$

where $f(v) \neq f_b(v)$ is the bulk free energy at intermediate volume. This implies $r_p \sim L^d = L^3$, in much better agreement with what is found. Inspection of the distribution of free volume in simulations around $v_l$ confirms that the simulations here seem to consist of a single phase only.
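The two scaling predictions can be set side by side. The following is a sketch, assuming (as the text does) that $P^{can}(v) \propto \exp[-\beta F(v)]$ and, as an additional assumption made only for this illustration, that in the mixed-phase case the bulk contributions at $v_l$ and at the peak cancel at coexistence, leaving the surface term.

```latex
\begin{align*}
r_p &= \ln\frac{P^{can}(v_{coex})}{P^{can}(v_l)}
     = \beta\,\bigl[F(v_l) - F(v_{coex})\bigr], \\
\text{mixed phase:}\quad
r_p &\simeq \beta\, a\, f_s\, L^{d-1}
     \;\Rightarrow\; \ln r_p = (d-1)\,\ln L + \text{const}, \\
\text{single phase:}\quad
r_p &= \beta\,\bigl[f(v_l) - f_b(v_{coex})\bigr]\, L^{d}
     \;\Rightarrow\; \ln r_p = d\,\ln L + \text{const}.
\end{align*}
```

For $d = 3$ the expected gradients of $\ln r_p$ against $\ln L$ are therefore 2 when interfaces are present and 3 when the intermediate states are homogeneous, which is the distinction drawn on here.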

This means, then, that the simulation cannot be used in its present form for the evaluation of interfacial free energies. We should note, however, that all our previous results about the coexistence pressure or the volumes of the stable phases remain valid: these phases are of course homogeneous in structure and so present no difficulty to the simulation, while to determine their relative weight (to calculate the coexistence pressure) simply demands the presence of some reversible configuration-space path connecting them. Whether this path is the one followed in the real system is immaterial, given that the states along the path have negligible weight themselves, both in the canonical ensemble and in the ensemble which is accessible to simulation.

Mapping the Coexistence Curve

Using a series of multicanonical simulations of different finite sizes and having different temperatures, we have mapped out the solid-solid coexistence curve of the square-well system with between and the critical point, which appears to lie at . Simulations were carried out for N , N and (for only) N , and for each simulation the canonical pdf of $v$ was reconstructed and the equal-weights criterion used to identify $p_{coex}$.

While in sections and it was always easy to distinguish the regions of macrostate space associated with the two phases, because the temperature was low, this becomes progressively more difficult as the critical region is approached. The canonical probability of the region between the two modes increases, and finally the modes merge together. This is shown happening for N in figure : the pdf has become unimodal by the time . For N the same process occurs at lower temperature.

[Figure: $P(v)$ against $v$ for $\beta = 1.0$, 10/11, 10/12, 10/13, 10/14, 10/15, 10/16 and 10/17; inset: the dense-phase peak]

Figure : $P^{can}(v)$ for N and a range of $\beta$ values at coexistence, $N_s$ . The inset shows the dense phase on an expanded scale.

In implementing the equal-weights criterion we have used an arbitrary division of the range of $v$ at or near the point where $P^{can}$ is minimal, even though a fitting of two overlapping Gaussians might perhaps be better. This is not expected to have a great effect on the assignment of $p_{coex}$, which, as we have already seen, is little affected even by the choice of equal-heights or equal-weights criterion. However, it does mean that the estimates of $\langle v \rangle$ tend to move away from the modes of the canonical $P^{can}(v)$. Therefore, once the temperature exceeds a certain value (chosen by inspection), we move to using the location of the modes as finite-size estimators. The best estimates of the infinite-volume limits of $p_{coex}$ and the specific volumes $v$ are then calculated by extrapolating the finite-size data against $1/N$.
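The extrapolation step can be sketched as an unweighted least-squares straight-line fit in $1/N$, whose intercept is the infinite-volume estimate. This is a minimal illustration with a hypothetical function name; it ignores the error bars on the individual points, which a careful fit would use as weights.

```python
def extrapolate_to_infinite_size(sizes, values):
    """Fit values = intercept + slope / N and return (intercept, slope).

    sizes  -- particle numbers N of the finite systems
    values -- finite-size estimates (e.g. coexistence pressures)
    """
    if len(sizes) != len(values) or len(sizes) < 2:
        raise ValueError("need at least two (N, value) pairs")
    xs = [1.0 / n for n in sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(values) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all system sizes are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    # The intercept is the fitted value at 1/N = 0, i.e. infinite volume.
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

With only two or three system sizes the fit has little redundancy, so the quality of the extrapolation rests mainly on the assumed $1/N$ form of the correction.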

We saw in section that this procedure works well for low temperatures; it is applied here even to near-critical temperatures because to treat the critical region properly (to obtain estimates of $\beta_c$ and critical exponents) it is necessary to make highly accurate measurements of the joint pdf of the order parameter and energy , which we have not had time to do. The phase diagrams are thus not expected to be particularly accurate in the critical region.

N    $N_m$    $N_r$    $N_s$ (sweeps/iteration)    speed (sweeps/hour)    NEDGE    $N_{eq}$

Table : Parameters for the various production simulations used in the evaluation of the phase diagram. $P^{can}$ and derived quantities were generated by averaging over about iterations.

The reader might be concerned that the suppression of configurations containing interfaces, as described in section , might affect the estimates of $p_{coex}$ and $\langle v \rangle$ at higher temperatures, in the region where, for a finite-size system, the interphase configurations do have significant weight. However, these configurations only acquire significant weight once the correlation length, or the width of typical interfaces, has exceeded the size of the system, so the dominant configurations are once more homogeneous in structure. In particular, we do not foresee incommensurability problems affecting simulations in the critical region, and we believe that an accurate treatment using the methods of would be a fairly straightforward extension of the work performed here.

The multicanonical distributions were produced first for the N systems, sometimes starting from , as described in section , but sometimes using pre-existing multicanonical weights from a simulation at a different temperature as a first estimate; the weights for the larger systems were produced using finite-size scaling followed by refinement. The parameters used in the various simulations are shown in table .

The results are shown below: figure is the $p$-$\beta$ phase coexistence curve, while figure is the $v$-$\beta$ phase diagram. We also show results from for comparison (dashed lines). It is apparent that there is quite good agreement between the two estimates of the $p$-$\beta$ coexistence curve. Any discrepancies are at most of the order of , and most lie within the error bars on the multicanonical points (some do not, but since we have no error bars on the thermodynamic integration data this need cause no concern). However, it should be noted that the discrepancies that do exist still correspond to O( ) differences in the relative probabilities of the two phases for a N system.

The agreement in the location of the phase boundary in the $v$-$\beta$ solid-solid phase diagram is also fairly good, though there is a small but clear systematic disagreement in the specific volume of the expanded solid at low $\beta$: the integration method consistently ascribes to it a higher $v$ than the multicanonical. There are several possible explanations for this, none of which

[Figure: $p$ against $\beta$; thermodynamic integration curve and extrapolated multicanonical points]

Figure : The $p$-$\beta$ solid-solid coexistence curve for the square-well solid with . The data points are produced by extrapolating N and N (and for N ) data against $1/N$; the equal-weights criterion is used to find the finite-size estimators of $p_{coex}$. The dashed line shows thermodynamic integration results for an N system, taken from .

is completely satisfactory. It may be due in part to the difference between the equal-heights and equal-weights criteria: equal-heights (implicit in the double-tangent construction of TI) would tend to produce the larger estimate of $v$. However, this cannot be the whole story, because equal-heights would also produce a lower estimate of $p_{coex}$, while figures and show that, if anything, the opposite seems to be true. Another possible cause is the finite-size scaling movement of the peak of $P^{can}(p_{coex}, v)$ to lower $v$, visible in figure , but again this does not seem a large enough effect to account for all the discrepancy: the raw multicanonical results for N still lie much closer to the extrapolated data points than to the thermodynamic integration curve. The third, and possibly most likely, explanation is that the hard-sphere solid used as the reference system in is unsatisfactory near to the solid-solid critical point. One might expect this to happen because the hard-sphere solid does not itself have a critical point, and so typical configurations of the two systems are far less similar than they are at higher $\beta$, making the integration path longer and more awkward. The effect is in fact like that of having a phase transition on the integration path itself. In any case, whatever the cause of the low-$\beta$


[Figure: $\beta$ (vertical axis) against $v$ (horizontal axis); legend: Thermodynamic Integration, Extrapolated Multicanonical.]

Figure: The $\beta$-$v$ solid-solid phase diagram for the square-well solid with . The data points are produced by extrapolating N and N and for N data against N. The dashed line shows thermodynamic integration results for an N system taken from

difference in the estimates of $v$, one of its consequences is that the multicanonical data suggest that the critical temperature is probably lower (i.e. $\beta_c$ is higher) than stated in : for the integration method $\beta_c^{TI}$ is at while figure implies instead that $\beta_c^{xc}$ , though in the absence of a proper critical FSS analysis we would not assert even this result with very much confidence.
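The difference between the equal-weights and equal-heights criteria discussed above can be illustrated with a small reweighting sketch. It assumes that a histogram estimate of $\ln P(v)$ is available at one pressure and uses the standard constant-pressure reweighting relation $P(v; p + \Delta p) \propto P(v; p)\, e^{-\beta N \Delta p\, v}$. The function names, the dividing volume `v_cut` and the bisection bracket are hypothetical choices made for illustration; they are not taken from the simulations reported here.

```python
import math


def reweight(v, log_p, beta, n, dp):
    """Normalised P(v) at pressure p + dp, given ln P(v) measured at pressure p."""
    logs = [lp - beta * n * dp * vi for vi, lp in zip(v, log_p)]
    top = max(logs)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(x - top) for x in logs]
    total = sum(weights)
    return [w / total for w in weights]


def weight_imbalance(v, log_p, beta, n, dp, v_cut):
    """Integrated weight of the dense phase (v < v_cut) minus that of the rest."""
    prob = reweight(v, log_p, beta, n, dp)
    dense = sum(p for vi, p in zip(v, prob) if vi < v_cut)
    return 2.0 * dense - 1.0


def height_imbalance(v, log_p, beta, n, dp, v_cut):
    """Peak height of the dense phase minus peak height of the expanded phase."""
    prob = reweight(v, log_p, beta, n, dp)
    dense = max(p for vi, p in zip(v, prob) if vi < v_cut)
    expanded = max(p for vi, p in zip(v, prob) if vi >= v_cut)
    return dense - expanded


def solve_dp(imbalance, lo, hi, tol=1e-10):
    """Bisect for the pressure shift where imbalance(dp) changes sign.

    Assumes (hypothetically) that the sign changes exactly once on [lo, hi].
    """
    lo_positive = imbalance(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (imbalance(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When the expanded-phase peak is broader than the dense-phase peak, the equal-weights pressure found this way lies above the equal-heights pressure, which is the direction of the effect described in the text.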

The Physical Basis of the Phase Transition

So far in our investigations we have concentrated only on describing the solid-solid transition that is observed, without attempting to give a physical explanation of why it should occur at all. A full understanding would require the study of different potential ranges, and of the fluid phase too, so we shall give only a brief semi-qualitative description, concentrating mainly on the N system at and

A first-order phase transition is always associated with a finite-size pdf of the order parameter ($P^{can}_N(v)$ here) that has a double-peaked structure, the peaks being associated with the coexisting phases in the thermodynamic limit. Since the logarithm is a monotonically increasing


function, this means that

$$\ln P^{can}_N(p_{coex}, v) = -\beta N \left[ f(v) + p_{coex} v \right] + \mathrm{constant}$$

has the same double-peaked shape. Now this shape means that the derivative $\partial^2 \ln P / \partial v^2$ must pass from -ve to +ve and back to -ve. But $\partial^2 (\beta N p v) / \partial v^2 = 0$, so it is the other term on the RHS that must produce the curvature: $\partial^2 f(v) / \partial v^2$ must go from +ve to -ve to +ve. We now express $f(v)$ itself as the sum of an energy and an entropy term:

$$f(v) = e(v) - s(v)/\beta$$

where $e(v) = \langle E(v) \rangle / N$ is an average internal energy density (and so is the means by which the interparticle potential exerts its influence) and $s(v)$ is an entropy density (we remind the reader that in the present units $k_B = 1$). The multicanonical simulations give us $f(v)$, and $e(v)$ was also measured at the same time (internal energies being easily accessible in any sensible MC sampling scheme); $s(v)$, which we expect to be mainly geometric in origin, is thus easily calculated. We show $s(v)$ and $e(v)$, together with $f(v) + p_{coex} v = -\ln P^{can}_N / (\beta N)$ (up to an additive constant), in figure . The functions $s(v)$ and $f(v) + p_{coex} v$ are arbitrarily shifted vertically so that they equal zero at the lowest $v$ for which measurements were made. We show $f(v) + p_{coex} v$ rather than $f(v)$ itself so that the behaviour of the curvature is made more apparent by the removal of the overall trend. The +ve to -ve to +ve pattern of $\partial^2 f(v) / \partial v^2$ is visible, though the magnitude of the 2nd derivative is clearly small except at low $v$.
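The decomposition just described can be sketched numerically. The fragment below is only an illustration under stated assumptions: it takes $\ln P(v)$ and $e(v)$ tabulated on a uniform grid of $v$, assumes the relations $\ln P = -\beta N [f + p_{coex} v] + \mathrm{constant}$ and $f = e - s/\beta$ with $k_B = 1$, and uses a simple central finite difference for the curvature. All function names are hypothetical.

```python
def free_energy_density(log_p, v, beta, n, p_coex):
    """f(v), up to an additive constant, from ln P = -beta*N*(f + p_coex*v) + const."""
    return [-lp / (beta * n) - p_coex * vi for lp, vi in zip(log_p, v)]


def entropy_density(f, e, beta):
    """s(v) = beta * (e(v) - f(v)), which follows from f = e - s/beta."""
    return [beta * (ei - fi) for fi, ei in zip(f, e)]


def second_derivative(y, h):
    """Central finite-difference estimate of y'' at the interior points of a uniform grid."""
    return [(y[i + 1] - 2.0 * y[i] + y[i - 1]) / (h * h) for i in range(1, len(y) - 1)]


def sign_pattern(values):
    """Collapse a sequence of numbers into its runs of sign, e.g. [1, -1, 1]."""
    pattern = []
    for x in values:
        s = 1 if x > 0 else -1
        if not pattern or pattern[-1] != s:
            pattern.append(s)
    return pattern
```

Applied to a double-peaked $\ln P(v)$, `sign_pattern(second_derivative(f, h))` should return `[1, -1, 1]`, the +ve to -ve to +ve pattern referred to above; the linear term $p_{coex} v$ does not affect it.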

Before proceeding, we shall comment on the behaviour of the functions outside the range of $v$ that is shown in the figure (i.e. the range that was investigated by simulation). We must have $e(v) \to$ as $v \to v_{CP} = 1/\sqrt{2}$, since then all particles interact with all twelve of their nearest neighbours, and $e(v)$ and all its $v$-derivatives $\to 0$ as $v \to \infty$ and the particles become very widely separated. However, $s(v) \to -\infty$ as $v \to v_{CP}$ because the volume of phase space accessible to the particles $\to 0$. Therefore $f(v) \to \infty$ in the same limit. As $v \to \infty$ we expect $s(v)$ and $\partial s(v) / \partial v$ to remain finite. We should note that for some finite $v$, rather larger than shown in the figure, the particles would no longer be held on their lattice sites by their neighbours even at zero temperature; that is, the solid would no longer be mechanically stable. As a rule of thumb this occurs for fcc crystals when they have expanded to of their


[Figure: $s(v)$, $e(v)$ and $f(v) + p_{coex} v$ (vertical axis) against $v$ (horizontal axis); curves for $s$, $e$ and $f + p_{coex} v$ at $\beta = 1.0$ and $\beta = 10/16$.]

Figure: The internal energy density $e(v)$, entropy density $s(v)$ and free energy functional $f(v) + p_{coex} v$ for N at $\beta = 1.0$ (full lines with symbols) and $\beta = 10/16$ (dashed lines). An arbitrary constant is added to the entropy and free energy so that they are zero at the lowest volumes investigated. The $v$ axis begins at $v_{CP} = 1/\sqrt{2}$.

close-packed volume, which corresponds to $v$ . At this point we would expect $s(v)$ to increase very rapidly or even discontinuously as the particles become delocalised.

Let us first consider the $\beta = 1.0$ data only. Comparison with figure confirms that the dense solid at $v$ has low entropy but is stabilised by low internal energy, while the expanded solid at $v$ has higher internal energy but also higher entropy, in line with figure . As we move from low $v$ to high, the internal energy increases, first slowly ($e$ remains very close to for $v$ ), then more quickly as the interparticle separation increases to a point where the particles move in large numbers out of their neighbours' potential wells, producing the strongly positive curvature around $v$ . The rate of loss of energy then slows, giving a strongly -ve curvature around $v$ , and after this it increasingly levels off and flattens out as further increases in $v$ make less difference to the number of neighbours with which most particles interact. The rapid loss of internal energy, even though


the density remains high, will be crucial in understanding the phase behaviour. The point of inflection seems to be quite close to the point where the internal energy would drop to zero with a uniform dilation of a perfect fcc crystal, which is at $v$ . Apart from the expected divergence at $v = v_{CP}$, the shape of $s(v)$ is quite similar, though its high-volume gradient is nonzero and its low-volume curvature (both +ve and -ve) is smaller in magnitude. We attribute $s(v)$ largely to free volume in the crystal, so initially, as we move away from close packing, it increases rapidly, then more slowly and with a -ve curvature. The region of +ve curvature around $v$ cannot be due simply to free volume and must relate to the unknown behaviour of $P^{N}(s|v)$.

The difference of the curvatures of these two functions gives the curvature of $f(v)$, which is the determiner of the phase behaviour. At low $v$ the stronger curvature of $e(v)$ is dominant: first +ve, producing the dense phase, then -ve, producing the minimum of the canonical probability. At high $v$, $e(v)$ flattens out and the curvature of $s(v)$ becomes the greater, producing a weak net +ve curvature of $f(v)$, which stabilises the expanded phase. The smallness of the 2nd derivative of $f(v)$ here is responsible for the large compressibility of the expanded solid: the restoring force resisting compression and dilation is essentially an entropic effect.

At $\beta = 10/16$, $s(v)$ and $e(v)$ have changed only a little from $\beta = 1.0$: the largely geometrical factors that produce their shape are only slightly altered by the different relative probabilities of the various configurations within each $v$ macrostate. The main difference in the phase diagram is produced not by these slight differences but by the fact that the inverse temperature affects only the entropy term, making its effect greater, and not the energy term. Thus the low-volume dominance of $\partial^2 e(v) / \partial v^2$ is weaker, while the high-$v$ dominance of $\partial^2 s(v) / \partial v^2$ is stronger and occurs for smaller $v$. Hence we expect the two phases to move together, the dense phase to become more compressible and the expanded phase less so, and the depth of the probability minimum between them to fall, just as is observed in the simulations (see figure ). Indeed, though we have not carried out the calculations, we believe that the whole $\beta$-$v$ phase diagram could be recovered qualitatively from the results of alone. At itself the two phases have effectively fused together, as shown by the single minimum of $f(v)$.

Conversely, the opposite would be true for coexistence at . The dominance of the curvature of $e(v)$ would be greater and the expanded phase would move to higher volume, eventually reaching those volumes ($v$ , as we said before) where mechanical instability sets in. However, it is very likely that even before this the expanded solid would become thermodynamically


unstable in comparison with the fluid. The fluid is always favoured entropically compared with the expanded solid, and the difference in $e$ cannot be very great since it is already near zero for the expanded solid, so for small $\beta$ the fluid is suppressed, we believe, largely by the $p_{coex} v$ term: $v_f$ for the fluid is much larger than $v_{xs}$ for the expanded solid, and $p_{coex}$ is high. However, as $\beta$ increases, $p_{coex}$ falls and $v_{xs}$ increases, so $p_{coex}(v_f - v_{xs})$, which measures the net effect of this term, becomes smaller. Eventually, we presume, the fluid will become the favoured phase and we will have moved above the triple line on the phase diagram in figure .

We can also speculate on the effect that increasing has: $e(v)$ will maintain a similar shape while being scaled in the $v$ direction, so its regions of high +ve and -ve curvature that produce the dense phase and the interphase region, as well as the region of weak -ve curvature that allows the -ve curvature of $s(v)$ to produce the expanded phase, will all move to higher volumes. Once is large enough the expanded phase becomes unstable at all temperatures, either because the region of +ve curvature of $f(v)$ moves out of the range of solid mechanical stability, or because $p_{coex}(v_f - v_{xs})$ becomes small enough that fluid always has the lower $g$, and a phase diagram containing only two phases results, as in the central diagram of figure . This is consistent with the results of

Finally, we note that the shape of $e(v)$ should not be strongly affected by variations in the shape of the interparticle potential, as long as the short range is preserved, while $s(v)$, being primarily geometric, should also be similar. The above arguments thus remain valid for more realistic shapes of $E(r_{ij})$, supporting the assertion in the introduction that the square-well system should have a similar phase diagram to any other real or simulated system with a sufficiently short-ranged potential.

Discussion

We shall now comment in turn on the two main subdivisions of this chapter: the investigations made using the Einstein solid reference system, and those made using multicanonical sampling to directly connect coexisting phases.

As regards the comparison of thermodynamic integration and the expanded ensemble using the Einstein solid (section ), it is clear that in efficiency of sampling (as quantified by the size of error bars for a particular computational effort) the expanded method is competitive


with TI. Moreover, there is perhaps some evidence (see figure ) that the expanded ensemble's superior ability to deal with near-singular points gives it an advantage over TI, both at , where it handles the problem of the rare hard-core overlaps implicitly, and near , where the integrand is rapidly changing. However, because we used the TI results to bootstrap the expanded ensemble, and because we did not develop a method to enable the expanded ensemble to reach (though we believe such a development is possible), the comparison of the two methods is not strictly fair, and we certainly cannot claim to have proven the expanded ensemble's superiority. We would add that the awkwardness experienced in transforming the square-well solid into the Einstein solid (described as part of in section for TI, though it also affects the expanded ensemble) is an argument in favour of avoiding the use of reference systems, or of using only those that strongly resemble the system of interest.

We have also investigated matters pertaining to the use of the expanded ensemble alone. In section we have shown that the subdivision and overlapping of a chain of interpolating ensembles does not influence the accuracy with which the relative probabilities of the ends are measured (see table ), confirming the result derived in section . We have also presented some results on choosing the spacing of states in expanded ensemble calculations, though we have done only a little work on this important matter. Our results (see table ) do show the expected trade-off in random walk time between the number of states and the acceptance ratio, leading to a fairly wide efficiency maximum, though we are not able to make the quantitative prediction of the acceptance ratio that might lead to an optimisation strategy. We might speculate, though it is not a matter we have investigated, that the notion of the maintaining of equilibrium within macrostates/subensembles (see section ) may be relevant here: the larger the separation of subensembles, the less representative of equilibrium within each one will be the configurations that move into them from adjacent subensembles. Thus we would expect that the amount of equilibration needed between attempted subensemble transitions would increase, which would tend to favour the use of a close spacing of subensembles. We suggest that inadequate equilibration may be the reason that spurious results have been reported in some expanded-ensemble-like techniques such as Grand Canonical MC, where the process of inserting or removing a particle is naturally discrete and in a dense system may produce what is effectively a wide spacing of subensembles.

Now let us discuss the investigations made using the multicanonical $NpT$ ensemble (section ). We have studied the square-well solid with and have obtained detailed information about $f(v)$ for various $\beta$ over a range of volumes including the coexisting phases, thus enabling the construction of the phase diagram (figures and ). The results are largely consistent with those obtained by TI in the literature , and we have grounds to think that where there are discrepancies our results are the better ones. The full measurement of $f(v)$ and $e(v)$ also provides some physical insight into how the short range of the potential leads to the solid-solid phase transition (see section ). The transition is possible because the energy function $e(v)$ has the following features, all within a narrow range of $v$ not much above close-packing: first it has positive curvature, producing the dense solid; then negative curvature, producing a minimum of canonical probability; and finally it is nearly flat, the system having lost most of its internal energy, so that the curvature of the entropy $s(v)$ is able to stabilise the expanded solid. Because all this occurs for small specific volumes, thanks to the short range of the potential, every particle remains trapped in the cage of its nearest neighbours and both phases are solid. However, a full understanding of the physics of the transition would require treatment of the fluid too.

In order to apply the multicanonical ensemble to this problem we have also had to extend it, because the very long random walk time $\tau_{rw}$ in this problem prevents a straightforward implementation. We have solved this problem by increasing the use of transition probability estimators to enable efficient parallelisation using many replica simulations. We foresee this improvement being widely applicable, since it largely overcomes the problems caused by the inherent serialism in the multicanonical method itself. In section we show that efficient convergence to the multicanonical distribution is produced by TP estimators (possibly in conjunction with FSS), and by continuing with the use of TP estimators in the production stage (a procedure extensively justified in section ) we have arrived at an iterative scheme that achieves to a very large extent the ideal of a combination of the finding and production stages.
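To make the idea of a transition probability estimator concrete, the following is a minimal sketch rather than the scheme actually used in the simulations. It assumes that moves connect only neighbouring macrostates and that the sampling obeys detailed balance, so that $P_{i+1}/P_i = \pi_{i,i+1}/\pi_{i+1,i}$, where $\pi_{ij}$ is estimated from pooled transition counts. The name `tp_log_distribution` and the layout of `counts` are hypothetical.

```python
import math


def tp_log_distribution(counts):
    """Estimate ln P_i of the sampled distribution from pooled transition counts.

    counts[i][j] is the number of observed moves from macrostate i to macrostate j,
    with counts[i][i] recording steps that left the macrostate unchanged.
    Assumes nearest-neighbour transitions and detailed balance.
    """
    row_totals = [sum(row) for row in counts]
    log_p = [0.0]
    for i in range(len(counts) - 1):
        up = counts[i][i + 1] / row_totals[i]            # estimate of pi_{i,i+1}
        down = counts[i + 1][i] / row_totals[i + 1]      # estimate of pi_{i+1,i}
        log_p.append(log_p[-1] + math.log(up / down))
    # Normalise so that the estimated probabilities sum to one.
    top = max(log_p)
    log_norm = top + math.log(sum(math.exp(x - top) for x in log_p))
    return [x - log_norm for x in log_p]
```

Because each row of `counts` is normalised separately, the estimate does not depend on how often each macrostate was visited. This is why counts pooled from many short replica runs with arbitrary starting points can be combined in this way.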

Thanks to the separation in this simulation of updates that alter the preweighted variable but preserve the configuration and those that do the opposite, we have also, as described in section , gained new insight into the importance of the preservation of equilibrium at constant $v$. Failure to equilibrate completely is the reason why convergence to the multicanonical distribution is not immediate, and, because the estimator of the sampled distribution $P^n$ is


least affected by incomplete equilibration when $P^n$ is multicanonical, it is also the reason why the multicanonical distribution should still be used in the production stage. The simulations of chapter did not provide this insight because the same update procedure both made moves between different macrostates of the preweighted variable and equilibrated the microstates within them.

For the square-well solid we have been able to tackle systems where $\tau_{rw}$ is much longer than the time devoted to each replica simulation, though, because each replica now traverses only a part of the macrostate space, this is achieved at the cost of losing some of the improved ergodicity that can usually be claimed for the multicanonical ensemble. To simulate systems where ergodic problems are more severe, like spin glasses, this procedure would not be adequate: connection to those regions of macrostate space where decorrelation is fast would then be essential. Indeed, the best procedure might well be to launch all replicas from these states instead of spreading them uniformly. The procedure would then resemble that of section , and, as was the case there, the time devoted to each replica would have to be $O(\tau_{rw})$. This is still much less than would be required using VS estimators, because the TP estimators would take account of the starting distribution of the replica simulations. To be sure that there was no bias would require that the VS histogram stayed flat notwithstanding the biased launch points (see section ). It might in practice be found, as in section , that any trend to the histogram had little effect. Alternatively, we might try modifying the method: we speculate that making the VS histogram flat by including only some of the transitions of certain replicas in $C_{ij}$ may have the desired effect.

It will be noted that, in the form in which it is applied here, the multicanonical ensemble bears some resemblance to the multistage sampling approach (section ), where each simulation would be constrained to walk (possibly multicanonically) within a narrow section of the full range of macrostates, overlapping with its neighbours. From a VS histogram the pdf of $V$ within each section would then be estimated, and by imposing continuity between the sections it could be reconstructed for the whole range of macrostates. In the multicanonical approach there are no constraints on the movement of the replica simulations, but they do behave in a similar way in practice because $\tau_{rw}$ is so long. Nevertheless, the multicanonical approach retains several advantages. To use multistage sampling we must decide a priori how to divide up the range of macrostates: how wide each section should be and how much it should overlap

Footnote 18: strictly, when $P^n$ reflects the imposed distribution of the replica simulations.


with its neighbours. We also must decide how to match the results from the various histograms: using just the overlapping states, or using a function fitted to the whole histogram. The use of the single pooled histogram with TP estimators handles all of this transparently. Moreover, to allow full equilibration (in the VS sense) over the range of macrostates in each section of the multistage sampling simulation would require that each section should contain substantially fewer macrostates even than are explored by one of the multicanonical replicas in the course of its run. Thus, even though the multicanonical method does not have very good ergodic properties here, the multistage sampling approach would be even worse in this regard. It is also true that the multistage sampling approach would have the lower acceptance ratio of volume-changing moves, because attempted transitions that would take a replica out of its allowed section of the macrostate space must be rejected. Finally, we do not think that the time spent in generating the multicanonical distribution (which is in any case typically only about a quarter of the total time spent) could be saved by using multistage sampling. It is true that the sampled distribution can be canonical within each section, but this makes the overlapping process more difficult because fewer counts are recorded at one end of the range of states, so that fluctuations have a greater effect. It is also possible, though we do not know for certain, that the same problems that were found in using TP to estimate $P^n$ in the early iterations of the multicanonical method (discussed at the end of section ) would recur. It is of course possible to use a sampled distribution in the multistage sampling method that is multicanonical within each section, but this then requires the same sort of generation process as does a distribution that is multicanonical over all the macrostates.
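For comparison, the matching step that multistage sampling requires can be sketched as follows. This is an illustrative fragment only: it assumes that each section supplies $\ln P$ on its own set of macrostate indices up to an unknown additive constant, and it imposes continuity using just the overlapping states (the first of the two options mentioned above) by averaging the offset over the overlap. The function name and data layout are hypothetical.

```python
def stitch_sections(sections):
    """Join per-section estimates of ln P into one curve, up to a global constant.

    sections is an ordered list of dicts mapping macrostate index -> ln P within
    that section; each section must share at least one macrostate with the part
    of the curve already assembled.
    """
    combined = dict(sections[0])
    for section in sections[1:]:
        overlap = [k for k in section if k in combined]
        if not overlap:
            raise ValueError("section does not overlap with its predecessors")
        # Additive shift that makes the new section agree, on average, in the overlap.
        shift = sum(combined[k] - section[k] for k in overlap) / len(overlap)
        for k, value in section.items():
            if k not in combined:
                combined[k] = value + shift
    return combined
```

With few counts in the overlapping states the estimated `shift` becomes noisy, which is the difficulty noted in the text when the sampled distribution is canonical within each section.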

Let us finally make some comments on the efficiency of this multicanonical procedure as compared to the thermodynamic integration method used in . (It is not possible to make an absolutely fair comparison with TI because to integrate along a path of variable $V$ (see equation ) would require the measurement of $p(V)$, which as we explained in section is inaccessible for this system.) Though we do not of course know how much time was required for the simulations of , the fact that several different ranges of potential were investigated, and the fluid phase was included too, suggests that the thermodynamic integration done there is appreciably faster. This is surely a consequence of the use of a reference system which is similar to the square-well solid, whereas the multicanonical method is an ab initio method. The accuracy of these TI results is thus dependent on the accuracy of the equation of state used for hard spheres, but any slight errors induced by this are probably


outweighed here by being able to use a very short integration path (except perhaps near the critical point; see section ). In a case where there was no suitable reference system, we do not believe TI would be in any way superior. Even here, applied to a system to which it is not particularly well suited because of the hard core in its potential (see section ), the multicanonical ensemble has some advantages: it seems to be more accurate near the critical point, it could certainly be extended to yield accurate information about the critical region, and it gives estimates of canonical averages away from the phase transition (see section ). We also consider it to be a more transparent way of obtaining information about $f(v)$, because it uses the interpretation of $f(v)$ as the logarithm of a probability to relate it directly to the probabilities that are measured in MC simulation.

Chapter

Conclusion

Look at the end of work, contrast

The petty done, the undone vast

ROBERT BROWNING

The main problem we have addressed in this thesis is perhaps the generic problem of statistical mechanics: the identification of which phase is stable at particular given values of the thermodynamic control parameters, leading to the construction of the phase diagram. As we described in chapter , there are two fundamental ways that MC simulation can be used to approach this problem. Firstly, each phase can be tackled separately, which requires a measurement of its absolute free energy. This is most usually done by connecting it to a reference system in some way. Secondly, the free energy difference between the two phases (closely related to their relative probabilities) can be measured directly by allowing them to interconvert. This removes the necessity of measuring the absolute free energy, but some way must be found to overcome the free energy barriers that separate the two phases.

In our own investigations we have concentrated on the ways that non-Boltzmann sampling, particularly the multicanonical ensemble, can be used to tackle the phase coexistence problem. This type of extended sampling can be used in either of the ways described in the first paragraph: it can be used to connect to a reference state at infinite or zero temperature, to measure absolute free energy (as in much of chapter ), or it can be used to overcome the free energy barrier between two phases, to measure their relative probability directly (as in section ). An idea that turned out to have particularly wide applicability was to use the information

CHAPTER CONCLUSION

contained in transition probabilities between macrostates. The transition probability method was originally developed as a way of rapidly producing a distribution approximating to the multicanonical (section ). While very effective for doing this, it also provides a way of removing the bias due to the starting state in a MC run which is not long in comparison with $\tau_{rw}$ (see sections and particularly ). This greatly facilitates the implementation of the method on parallel computers and may be more generally useful. Consideration of transition probabilities also prompted us to do the analytic work on Markov chains (section ), which led to the result, confirmed by simulation in section , that the accuracy of a multicanonical/expanded ensemble calculation is not improved by subdividing the macrostates, and which also enabled us to predict the expected variance of estimators from various sampled distributions. While the resulting prediction of an optimal sampled distribution (section ) is probably mainly of curiosity value, variance calculations were useful in checking the validity of TP estimators in section . Much of the work done was purely on the development of the method, but physically interesting results were obtained in section on the high-$M$ scaling of the 2d Ising order parameter, and particularly in section , where aspects of the solid-solid phase transition in the square-well solid have been investigated with greater accuracy than before. This work thus provides a contribution to the growing body of evidence describing variation in the nature of phase diagrams as qualitative features of the interparticle potential vary.

Is it likely that the multicanonical/expanded ensemble will ever replace thermodynamic integration as the standard method of free energy measurement? There seems to be no reason why not in principle. Though there may be some cases where TI's efficiency is higher (one is mentioned in section ), whenever we have compared the two over the same path (as in sections or ) we have found the multicanonical approach to be at least comparable in accuracy (something we would expect in the light of the results of section on subdivision of the range), while it can deal better with phase transitions or ergodic problems. It also has the advantage of relating the desired free energies more directly to the probabilities that are measured in MC simulation. Its most serious drawback is still the difficulty of constructing the required sampled distribution, but we can now begin to see that this disadvantage can be largely overcome using FSS extrapolation or the TP method. The TP method also helps in the efficient parallelisation of the method, a matter which is bound to become increasingly


important in the future. Therefore we expect increasing use of multicanonical/expanded ensemble methods, particularly for systems like spin glasses or dense fluids, where thermodynamic integration experiences difficulties because of ergodicity problems or phase transitions on the integration path, or for the measurement of surface tension or the probability of large fluctuations, where thermodynamic integration is not appropriate at all. For many other systems (e.g. fluids at fairly low densities), however, it seems unlikely that thermodynamic integration will be entirely displaced, because the multicanonical ensemble does not offer any great advantage to compensate for the greater effort needed to code it: thermodynamic integration can be done with only a little modification of Boltzmann sampling simulations. We should also mention here that, while we have not investigated them ourselves, several other new methods discussed in the review section (chapter ) were either extremely efficient or seemed to hold promise, particularly the Gibbs ensemble (see section ), the dynamical ensemble (section ), and the MC and tempered transitions methods discussed in section .

Whatever sampling scheme is used, it seems likely that the free energy problem will always remain hard, simply because of the distance in configuration space that must be sampled over. When sampling within a single phase all likely configurations are broadly similar, but in the free energy problem, whether approached by connecting with an infinite or zero temperature reference state or by tunneling between two coexisting phases, the simulation must move between configurations which are qualitatively very different in structure. Because the Metropolis algorithm works by accumulating small changes to a starting configuration to produce others, it inevitably takes a long time to build up large differences. Thus, as we found in section , once we have removed the free energy barriers that produce a tunneling time that is exponential in the system size, further changes to the sampled distribution produce no more than a marginal improvement in the efficiency of sampling: to make further large improvements will require the use of algorithms that can take larger steps through configuration space than the Metropolis algorithm. Though we have not done any work on them in this thesis, some such algorithms are already being developed . We think it is likely that the hybrid MC technique could also be fruitfully combined with multicanonical sampling.

Finally, let us speculate on some interesting problems that might in the future be tackled with the multicanonical/expanded ensemble method. Multiple minima problems (for example, the simulation of spin glasses and protein folding) are examples of applications which demand the good ergodicity properties of multicanonical-like methods, and some work has already been


done on these . We also consider that the development of the technique's ability to sample across phase boundaries would be interesting and productive. This has already been done for isostructural solid-solid (chapter ) and fluid-fluid transitions ; we consider it possible that a way could also be found to sample reversibly across a solid-fluid phase boundary, perhaps using the order parameter introduced in that measures crystallinity as the preweighted variable. This would be, to our knowledge, the first simulation of reversible melting and freezing. As well as enabling the simulation to be done more elegantly, without needing separate reference systems for the solid and liquid, the pseudo-dynamics of such a simulation could give insight into the process of nucleation in real systems, which is itself a subject of considerable interest.

Appendix A

Exact Finite-Size Scaling Results for the Ising Model

Let the Gibbs free energy density in the infinite-volume limit be $g_b(\beta)$. We shall examine the behaviour of the finite-size estimator $g(\beta, L) \equiv G(\beta, L)/L^d$. For the critical temperature only, we shall also discuss $F(\beta_c, M, L)$. Now, the effect that finite size has on a system's properties depends on its boundary conditions and on where in its phase diagram it is located: in a single-phase region, at the coexistence of two or more phases, or near the critical point. For the 2d Ising model, the critical point occurs where the coupling $\beta = \beta_c = \frac{1}{2}\ln(1 + \sqrt{2})$. We shall consider first the case where the system has periodic boundary conditions (PBC).

High Temperature (Small $\beta$)

For many systems with short-ranged forces, in the single-phase region with PBC, $g(\beta, L)$ has only exponentially small corrections, so

$$g(\beta, L) = g_b(\beta) + O\left(e^{-L/L_0}\right) \qquad \text{(A.1)}$$

The source of the correction is the interaction of a particular spin (or particle) with its image in the periodic boundary: in the absence of long-range forces, this interaction has to be mediated through $O(L)$ spin-spin interactions, and away from criticality correlations decay exponentially. The constant $L_0$ will therefore be a function of the inverse temperature $\beta$, having similar behaviour to the correlation length.


Low Temperature (Large $\beta$)

In a single-phase region there are once again only exponential corrections. However, for the 2d Ising model the line $H = 0$ for $\beta > \beta_c$ is the line of coexistence of two phases of opposite magnetisation. Therefore we must now consider the case of phase coexistence. At the coexistence point of $q$ phases, with PBC and away from criticality, the free energy has been shown to have the following form for a large class of lattice models:

$$G(\beta, L) = -\beta^{-1} \ln\left[\sum_{m=1}^{q} \exp\left(-\beta L^d g_m(\beta)\right)\right] + O\left(L^d e^{-L/L_0}\right) \qquad \text{(A.2)}$$

where $g_m$ is a metastable free energy of phase $m$: $g_m = g_b$ if $m$ is stable, otherwise $g_m > g_b$. For the 2d Ising model there is an exact symmetry between the two phases, so both terms in the sum in equation (A.2) are the same. This leads to

$$g(\beta, L) = g_b(\beta) - \beta^{-1} L^{-2} \ln 2 + O\left(e^{-L/L_0}\right)$$
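As a numerical illustration of the coexistence result, the following minimal sketch evaluates the finite-size estimator from the $q$-phase formula, assuming it has the form $G = -\beta^{-1}\ln\sum_m \exp(-\beta L^d g_m)$ and dropping the exponentially small remainder. The function name and the numerical values of $\beta$ and $g_b$ are hypothetical choices made only for this example.

```python
import math

def coexistence_free_energy_density(beta, L, d, g_phases):
    """Finite-size estimator g(L) = G(L) / L^d from the q-phase formula.

    Assumes G = -(1/beta) * ln(sum_m exp(-beta * L^d * g_m)); the
    exponentially small remainder is ignored.
    """
    volume = L ** d
    g_min = min(g_phases)
    # Shift by the smallest g_m so that the exponentials cannot overflow.
    s = sum(math.exp(-beta * volume * (g - g_min)) for g in g_phases)
    G = volume * g_min - math.log(s) / beta
    return G / volume

# Hypothetical values: two symmetric phases, as for the 2d Ising model at H = 0.
beta, d, g_b = 0.5, 2, -2.1
for L in (4, 8, 16):
    shift = coexistence_free_energy_density(beta, L, d, [g_b, g_b]) - g_b
    print(L, shift, -math.log(2.0) / (beta * L ** d))
```

With two equal terms the shift from $g_b$ is $-\beta^{-1}L^{-d}\ln 2$; if one phase is only metastable, its term is suppressed exponentially in $L^d$ and the shift rapidly becomes negligible.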

The Critical Region ($\beta \approx \beta_c$)

Before discussing the scaling of $g(\beta_c, L)$ we shall discuss the scaling of the Helmholtz free energy $F(\beta_c, M, L)$, and thus of $P^{can}_L(\beta_c, M)$, at this temperature. We have seen that at high temperatures $F_L(M)$ has a single minimum at $M = 0$, and at low temperatures two minima at equal and opposite non-zero magnetisations. In both cases, as $L$ increases the fractional width of the peaks of $P^{can}_L(M)$ decreases, so that at high temperatures $P^{can}_L(M)$ is concentrated progressively in the centre of the distribution, and at low temperatures in its wings. The crossover between these two types of scaling occurs at the critical point $\beta = \beta_c$, where $P^{can}_L(\beta_c, M)$ has the property that the relative weights of centre and wings remain constant, in the sense that the ratio of $P^{can}_L$ at $M = 0$ to its value at the modes is a constant (for the 2d Ising model), independent of $L$.

The scaling of $P^{can}_L(\beta_c, M)$ is quite different from its scaling away from $\beta_c$: it collapses onto a universal function $p^\star(x)$ of a rescaled magnetisation $x$, in which $M$ is divided by a power of $L$ that is the same for all the systems in a particular universality class. For the universality class that includes the 2d Ising model, $p^\star(x)$ has a double-peak structure, and the modes of $P^{can}_L(\beta_c, M)$ lie at magnetisations that grow as $L^{15/8}$, which is to say that the modes of the distribution of the magnetisation density $m = M/L^d$ lie at values that shrink as $L^{-1/8}$. This, together with the constant ratio of the probability at the centre to that at the modes (which is easily large enough for easy tunneling between the two peaks), produces the large (almost extensive in the system size) fluctuations in the order parameter $M$ that are typical of the critical point.

Now we return to consideration of the Gibbs free energy. Near the critical point with PBC, it is well established that the free energy $g(L)$ can be decomposed into a singular part $g_s$, which is zero outside the critical region, and a non-singular background part $g_b$:

$$g(t, H, L) = g_s(t, H, L) + g_b(t, L)$$

where $t \equiv (T - T_c)/T_c$. It follows from renormalisation or other arguments that $g_s$ is given by

$$g_s(t, H, L) = L^{-d}\, Y\left(a t L^{1/\nu},\; b H L^{\Delta/\nu}\right) \qquad \text{(A.3)}$$

where $\nu$ is the critical exponent describing the divergence of the correlation length ($\nu = 1$ for the 2d Ising model), $\Delta$ is the gap exponent, and $Y$ is a universal function that goes quickly to zero as its arguments increase. It is also found that the dependence of $g_b$ on $L$ is much weaker than that of $g_s$ and can be neglected, from which it follows that at the critical point itself ($t = 0$, $H = 0$) we have

$$g(L) = g_b + L^{-d}\, U$$

where $U \equiv Y(0, 0)$ is universal. The next term in this series is non-universal, but might be accessible by simulation methods that concentrate on finite-size scaling; the result is in general

d

g L g L U L a

b

Where nonuniversal is the critical exp onent b elonging to the rst largest irrelevant

scaling eld u and a ud Y du is the corresp onding amplitude also nonuniversal In

general need not b e integral but in the d Ising case it is analysis in Ferdinand and Fisher

eqn shows that

g L g L U O L ln L A

b

So The amplitude U is also known exactly for the d Ising universality class it is given

by U ln

It is instructive to compare this with the amplitude of the $L^{-d}$ correction for $\beta > \beta_c$, which was simply $\ln 2$, with the 2 coming from the number of coexisting phases (see equation (A.2) and the discussion thereof). However, the critical amplitude is the logarithm of a number slightly smaller than 2, so in some sense the critical point is behaving like the coexistence point of slightly less than two phases. This is in a way physically reasonable, since at criticality large fluctuations continually take the system back and forth between configurations which are themselves similar to those in the truly two-phase region.

Other Boundary Conditions

Finally, let us consider the case where we have fixed boundaries. Away from criticality and in the single-phase region, where there were previously only exponential corrections, we now find

$$g(L) = g_b + L^{-1} g^{s} + L^{-2} g^{e} + \ldots + L^{-d} g^{c} + O\left(e^{-L/L_0}\right) \qquad \text{(A.5)}$$

where $g^{s}$ is the surface free energy, due to the presence of $(d-1)$-dimensional interfaces in the system, $g^{e}$ is due to $(d-2)$-dimensional edges, and so on.

At the critical point, this type of scaling and the critical finite-size scaling described above for periodic boundaries are superimposed: the non-singular part of the free energy, $g_b$, has an expansion similar to the surface and edge expansion for fixed boundaries, and the singular part $g_s$ has one similar to the scaling form found with periodic boundaries, though some modification to this scaling form may be necessary.

Appendix B

The Double-Tangent Construction

It was shown earlier that, as a consequence of the scaling of the probability distribution of the order parameter, which becomes increasingly sharply peaked about its maximum or maxima as system size increases, we can use the minimum of $F_L(\beta, M) - HM$ or $F_L(\beta, V) + pV$ to give an estimator of $g$ with an error that disappears as $L^{-d}$.

This suggests the following method for finding the coexistence value of the field in cases where the simulation is most easily performed with a constant order parameter. We shall thus describe the method for the off-lattice system, where this is more usually the case and where $p_{coex}$ is not determined by symmetry. Suppose we have some method that can measure the absolute Helmholtz free energy $F_L(\beta, V)$. This should be used to find $F_L(V)$ for various $V$ around the typical volume of one phase; then the entire process should be repeated for some values of $V$ characteristic of the other phase. From these measurements it is then possible to construct $F_L(V) + pV$ for various values of $p$ to find $p_{coex}$. This is most easily done by the double-tangent construction.

We begin from $\ln P^{can}_L(p, V) = -\beta\left[F_L(V) + pV\right] + \text{constant}$. As shown earlier, to good accuracy phase coexistence is found when $\ln P^{can}_L(p, V)$ has two maxima of equal heights, at $V_A$ in phase A, say, and $V_B$ in phase B. This means that

$$F_L(V_A) + p_{coex} V_A = F_L(V_B) + p_{coex} V_B$$

i.e.

$$F_L(V_A) - F_L(V_B) = -p_{coex}\,(V_A - V_B) \qquad \text{(B.1)}$$

which has the form $y_1 - y_2 = m(x_1 - x_2)$. $V_A$ and $V_B$ themselves are found by solving

$$\frac{\partial \ln P^{can}_L}{\partial V} = 0$$

i.e. at coexistence

$$\left.\frac{\partial F_L}{\partial V}\right|_{V_A} = \left.\frac{\partial F_L}{\partial V}\right|_{V_B} = -p_{coex} \qquad \text{(B.2)}$$

Consideration of equations (B.1) and (B.2) together shows that $p_{coex}$ is given by the negative of the gradient of the common tangent to the two branches of $F_L(V)$. The points of tangency give $V_A$ and $V_B$. The construction is shown schematically in figure B.1.

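[Figure B.1: plot of $F(\beta, V)$ against $V$, showing the two branches $F_A$ and $F_B$ and their common tangent, which touches them at $V_A$ and $V_B$.]

Figure B.1: A schematic illustration of the double-tangent construction for the off-lattice $(p, V)$ case, to find $p_{coex} = -(\text{gradient of tangent})$. The solid lines represent the parts of $F_L(V)$ that will typically be measured; the dotted line shows the part that is typically not measured.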

We note that this method has the advantage that it is not necessary to map out the whole of $F_L(V)$: it suffices to have those parts around the eventual minima in $F_L(V) + pV$, and though of course we will not be sure a priori of their exact location, we are likely to have quite a good idea. However, the method has the disadvantage that the absolute free energies of the two phases are required, since the relative positions of the two branches of $F_L(V)$ must be known before the double-tangent construction can be applied; and it uses the equal-heights criterion for phase coexistence, when it is now thought that the equal-weights criterion usually has smaller finite-size error.
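The construction can be sketched numerically as follows. This is only an illustration under stated assumptions: the two branches of the free energy are taken to be convex functions supplied by the caller (here hypothetical quadratics, not measured data), the minimum of $F(V) + pV$ on each branch is found by ternary search, and the coexistence pressure is the value of $p$ at which the two minima are equal (the equal-heights criterion). All function names are invented for this example.

```python
def branch_minimum(F, p, v_lo, v_hi, iters=200):
    """Minimise F(V) + p*V on [v_lo, v_hi] by ternary search (F assumed convex)."""
    lo, hi = v_lo, v_hi
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if F(m1) + p * m1 < F(m2) + p * m2:
            hi = m2
        else:
            lo = m1
    v = 0.5 * (lo + hi)
    return F(v) + p * v, v

def coexistence_pressure(F_A, range_A, F_B, range_B, p_lo, p_hi, tol=1e-10):
    """Bisect on p until the minima of F + pV on the two branches are equal."""
    def gap(p):
        return branch_minimum(F_A, p, *range_A)[0] - branch_minimum(F_B, p, *range_B)[0]
    g_lo = gap(p_lo)
    if g_lo * gap(p_hi) > 0.0:
        raise ValueError("p_lo and p_hi do not bracket the coexistence pressure")
    while p_hi - p_lo > tol:
        mid = 0.5 * (p_lo + p_hi)
        g_mid = gap(mid)
        if g_lo * g_mid <= 0.0:
            p_hi = mid
        else:
            p_lo, g_lo = mid, g_mid
    return 0.5 * (p_lo + p_hi)

# Hypothetical model branches (dense phase A, dilute phase B).
F_A = lambda V: 2.0 * (V - 1.0) ** 2
F_B = lambda V: 0.5 * (V - 3.0) ** 2 - 1.625
range_A, range_B = (0.2, 1.5), (1.5, 4.0)

p_coex = coexistence_pressure(F_A, range_A, F_B, range_B, 0.5, 1.5)
V_A = branch_minimum(F_A, p_coex, *range_A)[1]
V_B = branch_minimum(F_B, p_coex, *range_B)[1]
print(p_coex, V_A, V_B)
```

For these model branches the common tangent can be found by hand: it has gradient $-1$ and touches the branches at $V_A = 0.75$ and $V_B = 2$, which gives a check on the numerical result. With real data the branches would be interpolated from the measured free energies, and their relative vertical offset must be known.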

Appendix C

Statistical Errors and Correlation Times

Because the Monte Carlo method uses random numbers to generate a sample of the configurations of the simulated system, the estimates of thermodynamic quantities that it produces are associated with a statistical error. We shall now describe the behaviour of this error, taking the example of a Boltzmann-sampling simulation. Though the concepts introduced are general, some of the results are specific to this case, and their analogues for multicanonical sampling are extremely complex.

Suppose we generate a set $\{\sigma_i\}$, $i = 1 \ldots N_c$, of configurations from a Markov chain whose stationary probability distribution is $P(\sigma)$. We can define some operator $O$ (internal energy, magnetisation, ...) on each one, giving the set $\{O_i\}$. The sample mean $\bar{O}$ is an estimator of the expectation value $\langle O \rangle = \sum_{\{\sigma\}} O(\sigma) P(\sigma)$, which for a Boltzmann sampling algorithm is the canonical average of $O$.

Define the quantity

$$\delta\bar{O} \equiv \bar{O} - \langle O \rangle$$

$\sqrt{\langle (\delta\bar{O})^2 \rangle}$ is a measure of the statistical error in the estimate $\bar{O}$ of $\langle O \rangle$; it is called the Standard Error of the Mean (SEM).

Averaging over all possible data sets of the $N_c$ observations, we find

$$\langle (\delta\bar{O})^2 \rangle = \langle \bar{O}^2 \rangle - \langle O \rangle^2$$

$$= \frac{1}{N_c^2} \sum_{j=1}^{N_c} \sum_{i=1}^{N_c} \left( \langle O_i O_j \rangle - \langle O \rangle^2 \right)$$

$$= \frac{\langle O^2 \rangle - \langle O \rangle^2}{N_c} + \frac{2}{N_c^2} \sum_{j=1}^{N_c} \sum_{i>j} \left( \langle O_i O_j \rangle - \langle O \rangle^2 \right)$$

$$= \frac{\langle O^2 \rangle - \langle O \rangle^2}{N_c} \left[ 1 + 2 \sum_{i=1}^{N_c - 1} \left( 1 - \frac{i}{N_c} \right) \frac{\langle O_0 O_i \rangle - \langle O \rangle^2}{\langle O^2 \rangle - \langle O \rangle^2} \right] \qquad \text{(C.1)}$$

where we have used $\langle \bar{O} \rangle = \langle O \rangle$ and assumed that we have no preference for a particular starting state, so $\langle O_i O_j \rangle$ depends only on $|i - j|$. We shall now simplify the notation by writing

$$\mathrm{var}(O) \equiv \langle O^2 \rangle - \langle O \rangle^2$$

and

$$\rho_i \equiv \frac{\langle O_0 O_i \rangle - \langle O \rangle^2}{\langle O^2 \rangle - \langle O \rangle^2} \qquad \text{(C.2)}$$

To estimate $\mathrm{var}(O)$ in practice we use the sample variance

$$s^2 \equiv \frac{1}{N_c} \sum_{i=1}^{N_c} (O_i - \bar{O})^2$$

By expanding $\bar{O}$ as in the derivation of equation (C.1), it can be shown that $\langle s^2 \rangle \approx (N_c - 1)\,\mathrm{var}(O)/N_c$.

If adjacent configurations are uncorrelated, all the $\rho_i$ are zero and equation (C.1) reduces to

$$\langle (\delta\bar{O})^2 \rangle = \mathrm{var}(O)/N_c \qquad \text{(C.3)}$$

(see any statistics textbook for another derivation of this). However, because the Monte Carlo method generates a new configuration by making a small change to the existing one, adjacent configurations in the Markov chain are in practice always highly correlated; that is to say, the $\rho_i$ remain appreciable until $i$ is quite big. How fast the correlations decay depends on how well the system can explore its configuration space, which depends in turn both on the algorithm (which determines the matrix $R$ of allowed transitions) and on the sampled distribution (which may make certain transitions very unlikely even if $R_{ij} \neq 0$). In order to quantify the effect of correlations we define the correlation time $\tau_O$ of the observable $O$ by

$$\tau_O \equiv \sum_{i=1}^{\infty} \rho_i \qquad \text{(C.4)}$$

which will be different for different observables in a single simulation. $\tau_O$ is measured in units of the time for a single MC update. Now let us assume that $N_c$ is big enough that $N_c \gg \tau_O$, and that $(1 - i/N_c)$ can be put equal to one for all terms where $\rho_i$ is not negligibly small. (If this is not true then it implies that the total sampling time is only of the order of $\tau_O$, and the results will in that case be irredeemable by any amount of variance analysis. We are thinking here of configurations generated by a single Markov process; this statement is not necessarily true if many independent simulations are run together in parallel, as is done elsewhere in this thesis.) Putting all this into equation (C.1) gives

$$\langle (\delta\bar{O})^2 \rangle \approx \frac{1 + 2\tau_O}{N_c}\,\mathrm{var}(O) \qquad \text{(C.5)}$$

It is normally found that the $\rho_i$ decay exponentially, so

$$\rho_i \approx \exp(-i/\tau_O)$$

In that case a slightly better approximation than equation (C.5) is

$$\langle (\delta\bar{O})^2 \rangle \approx \frac{\mathrm{var}(O)}{N_c}\,\frac{1 + \lambda}{1 - \lambda}$$

where $\lambda = \exp(-1/\tau_O)$. Other improved approximations, of an accuracy not normally required, are discussed in the literature.

In any case, it is clear that if the correlation time $\tau_O$ is large it will dominate the error, and in fact it may be a waste of effort to record $O$ for every configuration of the Markov chain. If we sample at regular intervals of $k$ updates, then

$$\langle (\delta\bar{O})^2 \rangle_k \approx \frac{k}{N_c}\,\mathrm{var}(O)\,\frac{1 + \lambda^k}{1 - \lambda^k} \qquad \text{(C.6)}$$

which stays within a few percent of its minimum value until $k \approx \tau_O$ and then increases linearly with $k$, with gradient $\mathrm{var}(O)/N_c$. This tells us that there is no advantage in collecting samples at intervals more frequent than $\tau_O$. However, in practice doing so does little harm, since recording $O$ and doing the analysis usually require negligible time compared with the generation of the configurations themselves.
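The dependence of the error on the sampling interval can be illustrated with a short calculation. The sketch below assumes purely exponential correlations, $\rho_i = \exp(-i/\tau_O)$, for which the sum over correlations can be done in closed form; the function names and the value of $\tau_O$ are hypothetical.

```python
import math

def squared_sem(var_o, n_c, tau, k=1):
    """Squared standard error of the mean for a chain of n_c updates sampled
    every k updates, assuming rho_i = exp(-i / tau)."""
    lam_k = math.exp(-k / tau)
    return (k / n_c) * var_o * (1.0 + lam_k) / (1.0 - lam_k)

def relative_cost(tau, k):
    """Squared error at interval k divided by its small-k limit 2*tau*var/n_c."""
    return squared_sem(1.0, 1.0, tau, k) / (2.0 * tau)

tau = 50.0  # hypothetical correlation time, in MC updates
for k in (1, 10, 50, 200, 1000):
    print(k, round(relative_cost(tau, k), 4))
```

The ratio stays close to one while $k$ is small compared with $\tau_O$, is still within about ten percent at $k = \tau_O$, and grows as $k/(2\tau_O)$ once $k \gg \tau_O$, when successive samples are effectively independent and simply too few.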

There are two valid approaches to the estimation of the standard error of $\bar{O}$ in practice. One is to measure the correlation functions $\rho_i$ and then to estimate $\tau_O$ by summing them. It is found to be essential to cut off the summation at some point (for example, when a negative term in the series is encountered for the first time), since the estimates of $\rho_i$ at large $i$ are very noisy and may seriously distort the answer. We can then use the expressions for the error in terms of $\tau_O$ directly. Alternatively, we can simply try to measure the standard error of $\bar{O}$ directly. To do this, we block the configurations into $m = 1 \ldots N_b$ blocks (a modest number is enough) and estimate $\bar{O}$ on each block. Then we measure the mean and variance of the blocked means $\{\bar{O}_m\}$ and use the simple formula that the squared standard error of $\bar{O}$ is

$$\mathrm{var}(\bar{O}_m)/N_b$$

since the blocks should be long enough for the block means to be uncorrelated (if they are not, then $N_c$ is not large enough for good results anyway). A variant of this is to define estimators $O^J_m$ on all the blocks of configurations except the $m$th, and then to find the mean and variance of these (see appendix D). It is the blocking approach that we have generally used to measure the standard error in this thesis; however, we shall still consider $\tau_O$ on occasion, where we shall use the fact that it can be expressed in terms of correlation functions to make an analytic prediction of the variance of estimators from various sampled distributions.
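Both approaches can be sketched in a few lines. The code below is an illustration, not the analysis code used for the thesis: `correlation_time` sums the estimated correlation functions and stops at the first negative term, and `blocked_sem` applies the blocking formula using the unbiased variance of the block means (a choice made here; the text does not specify the normalisation). Both function names are hypothetical.

```python
import math

def correlation_time(series, max_lag=None):
    """Sum the estimated rho_i from i = 1, stopping at the first negative term."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    if var == 0.0:
        return 0.0
    if max_lag is None:
        max_lag = n // 2
    tau = 0.0
    for lag in range(1, max_lag + 1):
        cov = sum(dev[t] * dev[t + lag] for t in range(n - lag)) / (n - lag)
        rho = cov / var
        if rho < 0.0:
            break  # noisy tail: cut the summation off here
        tau += rho
    return tau

def blocked_sem(series, n_blocks):
    """Standard error of the mean from the scatter of n_blocks block means."""
    block_len = len(series) // n_blocks
    if n_blocks < 2 or block_len < 1:
        raise ValueError("need at least two non-empty blocks")
    means = [sum(series[m * block_len:(m + 1) * block_len]) / block_len
             for m in range(n_blocks)]
    grand = sum(means) / n_blocks
    var_means = sum((x - grand) ** 2 for x in means) / (n_blocks - 1)
    return math.sqrt(var_means / n_blocks)

# Small deterministic examples, chosen so the results can be checked by hand.
print(correlation_time([1, 1, 1, 1, -1, -1, -1, -1]))
print(blocked_sem([1, 1, 3, 3, 5, 5, 7, 7], 4))
```

Any trailing configurations that do not fill a complete block are ignored by `blocked_sem`; for the blocking estimate to be meaningful the block length must be long compared with the correlation time.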

We should note that, whatever algorithm we are working with, we must expect to have to update all the particles, or at least a constant fraction of them, to get uncorrelated configurations. This implies that the best we can do is have $\tau_O \sim L^d$ (see footnote 1).

If we are interested in calculating observables like $\langle E \rangle$ within a single phase, normally $\tau_O$ has a behaviour not dissimilar to the ideal, and accurate answers can be obtained without too much effort by simple algorithms like single-spin-flip Metropolis sampling of the Boltzmann distribution. However, if we are trying to measure the free energies associated with phase transitions, then $\tau_O$ can be very large indeed.

Footnote 1: Strictly, it is the amount of computing power that goes like $L^d$. If we have a parallel computer then we may apparently do better, since for small $L$ some processors may be unoccupied and we can bring them in as we increase $L$, thus apparently keeping $\tau_O$ constant as $L$ increases. Once all the processors are occupied, the given scaling law applies.

Appendix D

Jackknife Estimators

The estimators that we produce in multicanonical simulations, like $\bar{O}$, are ratio estimators (that is to say, they are ratios of sums) and are in fact slightly biased: $\langle \bar{O} \rangle_{xc} \neq \langle O \rangle_{can}$. It can be shown that

$$\langle \bar{O} \rangle_{xc} - \langle O \rangle_{can} = - \frac{\mathrm{cov}_{xc}\left( \bar{O},\; \sum_E C(E) \exp(-\beta E)/P^{xc}(E) \right)}{\left\langle \sum_E C(E) \exp(-\beta E)/P^{xc}(E) \right\rangle_{xc}}$$

This bias will not be zero unless $\bar{O}$ and $\sum_E C(E) \exp(-\beta E)/P^{xc}(E)$ are uncorrelated; typically it is of order $1/N_c$. The same is true of other biased estimators; for instance, the estimator of free energy we shall use below is the logarithm of a ratio estimator.

It should be noted, however, that we expect the standard deviation of $\bar{O}$ to go like $1/\sqrt{N_c}$, and we can usually safely regard the bias as negligible in comparison with this. However, to be sure that the bias is negligible, we have in this chapter generally used double-jackknife bias-corrected estimators for our estimates of canonical averages and their error bars.

A jackknife estimator is defined in the same way as a normal estimator, but on a subset of the data. We divide up the Markov chain into $b$ sets of $N_b$ configurations, so that we have $b$ histograms $C_j(E)$, $j = 1 \ldots b$, with $b N_b = N_c$. Then the $j$th jackknife estimator $O^J_j$ is defined like $\bar{O}$, but on the pooled data from all the $b$ histograms except the $j$th. We can define the mean of these,

$$O^J_{AV} = \frac{1}{b} \sum_{j=1}^{b} O^J_j$$

while the standard error of the mean (the error bar) is given by

$$s_J^2 = \frac{b - 1}{b} \sum_{j=1}^{b} \left( O^J_j - O^J_{AV} \right)^2$$

Simple substitution shows that this reduces to the normal expression for the standard error of the mean in the unbiased case. These single-jackknife estimators provide an estimate of variance that is somewhat more robust (less affected by a small sample size) than the usual blocking. They can also be used to produce an estimator which has a reduced bias. We assume that the bias of $\bar{O}$ is $c_1/(b N_b) + c_2/(b N_b)^2 + \ldots$; then the bias of all the $O^J_j$ is $c_1/((b-1) N_b) + c_2/((b-1) N_b)^2 + \ldots$ As can be seen by substitution, the estimator

$$O^{JC} \equiv b\,\bar{O} - (b - 1)\,O^J_{AV}$$

is then unbiased to order $1/N_c$. However, we no longer have an estimate of the standard error of this new estimator. To obtain both, we can extend the approach, defining double-jackknife estimators: $O^{JJ}_{jk}$ is defined on the data with both the $j$th and $k$th blocks of configurations omitted ($j \neq k$). Then

$$O^{JJC}_j \equiv (b - 1)\,O^J_j - \frac{b - 2}{b - 1} \sum_{k \neq j} O^{JJ}_{jk}$$

are a set of double-jackknife bias-corrected estimators, and we can calculate their mean and variance as for the $O^J_j$ above. A fuller explanation of the use of jackknife estimators, and an account of their use in multicanonical simulations, can be found in the literature.
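The single-jackknife procedure can be sketched for a generic ratio estimator as follows. This is a minimal illustration under stated assumptions: each block contributes one numerator sum and one denominator sum (standing in for the reweighted histogram sums), the function name is hypothetical, and the double-jackknife extension is omitted for brevity.

```python
def jackknife_ratio(numerators, denominators):
    """Single-jackknife analysis of the ratio estimator sum(num) / sum(den).

    Returns (full-sample estimate, bias-corrected estimate, jackknife error bar).
    Each list holds one entry per block of configurations.
    """
    b = len(numerators)
    if b < 2 or b != len(denominators):
        raise ValueError("need at least two blocks, with matching list lengths")
    total_num = sum(numerators)
    total_den = sum(denominators)
    full = total_num / total_den
    # Estimator on the pooled data with the j-th block left out.
    leave_one_out = [(total_num - n) / (total_den - d)
                     for n, d in zip(numerators, denominators)]
    average = sum(leave_one_out) / b
    variance = (b - 1) / b * sum((o - average) ** 2 for o in leave_one_out)
    corrected = b * full - (b - 1) * average
    return full, corrected, variance ** 0.5

# Hypothetical block sums, small enough to check by hand.
print(jackknife_ratio([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0]))
print(jackknife_ratio([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 2.0]))
```

In the first example the denominators are all equal, so the estimator is just a mean of block values: the bias correction vanishes and the error bar coincides with the usual standard error of the mean. In the second, the estimator is a genuine ratio and the bias-corrected value differs slightly from the full-sample one.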

Appendix E

Details of the Square-Well Solid Simulation

We wish to carry out a simulation of a 3d fcc square-well solid in a constant volume. The primitive unit cell of the fcc lattice consists of four particles arranged in a tetrahedron at $(0, 0, 0)$, $(\frac{1}{2}, \frac{1}{2}, 0)$, $(\frac{1}{2}, 0, \frac{1}{2})$ and $(0, \frac{1}{2}, \frac{1}{2})$, where the vectors are in units where the side of the cubic unit cell has unit length. For convenience we wish to simulate in a cubic volume, and to apply periodic boundary conditions to remove the effects of surfaces, edges and corners (see appendix A). Suitably-sized assemblies of particles then consist of $n^3$ unit cells arranged in a cube, with $n = 1, 2, 3, \ldots$; the first few such systems thus contain $4, 32, 108, 256, \ldots$ particles. To make particle moves we shall use the usual Monte Carlo procedure of generating trial random displacements of randomly-chosen particles and accepting or rejecting them using the Metropolis method.
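The geometry can be checked with a short sketch that builds the lattice and counts nearest neighbours under periodic boundary conditions. It assumes the standard four-site basis of the cubic fcc cell with unit cell side 1; the function names are hypothetical and this is not the thesis code.

```python
import itertools
import math

# Four-site basis of the cubic fcc unit cell (cell side = 1).
FCC_BASIS = ((0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5))

def fcc_positions(n):
    """Lattice sites of n x n x n cubic unit cells (4 * n**3 particles)."""
    return [(i + bx, j + by, k + bz)
            for i, j, k in itertools.product(range(n), repeat=3)
            for bx, by, bz in FCC_BASIS]

def minimum_image_distance(a, b, box):
    """Distance between a and b in a periodic cubic box of side `box`."""
    d2 = 0.0
    for x, y in zip(a, b):
        dx = x - y
        dx -= box * round(dx / box)
        d2 += dx * dx
    return math.sqrt(d2)

def nearest_neighbour_counts(n):
    """Number of sites at the nearest-neighbour distance 1/sqrt(2) from each site."""
    pos = fcc_positions(n)
    box = float(n)
    r_nn = math.sqrt(0.5)
    return [sum(1 for j, b in enumerate(pos)
                if j != i and abs(minimum_image_distance(a, b, box) - r_nn) < 1e-9)
            for i, a in enumerate(pos)]

print([len(fcc_positions(n)) for n in (1, 2, 3, 4)])
print(set(nearest_neighbour_counts(2)))
```

For $n \geq 2$ every site has twelve distinct nearest neighbours; for $n = 1$ the periodic images of the same few particles coincide, so the smallest system is a special case.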

As regards the choice of system size, there is clearly a trade-off between ease of simulation and the accuracy in principle achievable in a simulation with unlimited run-time. Because phase transitions are rounded off and shifted in a finite volume, it is ideally desirable to simulate large systems to get closer to the infinite-volume limit; however, larger systems clearly demand more computer time, and the time required for equilibration and to sample all accessible parts of the phase space quickly becomes excessive. This is particularly true of simulations like these, where we are interested in free energies: we have seen that thermodynamic integration requires increasingly many simulation points around a phase transition because of rapid variation of the integrand, while in a multicanonical/expanded ensemble simulation the range of macrostates that must be covered to take in both phases is itself extensive. It turns out, in fact, that the multicanonical ensemble with a hard-core potential is particularly demanding. Therefore we have in practice chosen to work with quite small systems, and in at least one case to use finite-size scaling to extrapolate the results.

This choice then presents some problems in relation to our chosen computer, the Connection Machine (CM). The CM consists of several thousand processors, grouped into processing nodes. Arrays are spread across this machine and corresponding elements operated on in parallel. Thus the obvious way of mapping the square-well solid onto the CM would be geometric decomposition: break up the simulation volume and assign a region to each processor. Non-interacting particles may be updated in parallel (see footnote 1), which in this case, where the interparticle potential is extremely short-ranged, would require information only from within each processor and, in some cases, from nearest-neighbour processors. However, if the array is too small then some processors are assigned no data and are deactivated; clearly this is an inefficient use of the machine. The minimum size of array that uses all the processors depends on the geometry, and in our case turns out to have several thousand elements. Therefore, if we simulated a single system with geometric decomposition, it would have to contain at least that many particles, which is much too large to be dealt with easily. We might then think instead of using just primitive parallelism, where we would simulate several thousand independent replicas of a single smaller simulation, with each simulation being completely local to a processing node and all updates within each simulation being serial. This strategy, which has the additional advantage of eliminating interprocessor communication in particle moves, is in fact the way that the simulations of the smaller systems have been carried out. However, the machine does not have enough memory to treat larger systems this way. To deal with them we need a mixture of primitive parallelism and geometric decomposition. The way we have implemented this in practice is shown in figure E.1. The large cube shows the way that the parallel dimensions of the array that holds the particles' coordinates are laid out: it can be thought of as showing the layout of parallel virtual processors in a 3d grid. The small shaded cubes show the array elements that belong to a single simulation: the particles within them exist in a single physically continuous volume, even though they are separated in the coordinate array.

Footnote 1: We emphasise that we are here considering only updating the positions of the particles, and in no simulation performed here does this result in a change of the preweighted variable. As we saw earlier, MC updates in which the preweighted variable may change introduce an effective coupling between the particles and prevent parallel updating.

Figure E.1: The layout of a single simulation volume within a parallel array holding the particles' positions. Left: each simulation divided into eight subvolumes. Right: each simulation divided into a larger number of subvolumes, so that fewer simulations run in parallel.

In the diagram on the left, each simulation is divided into $2 \times 2 \times 2$ subvolumes and distributed over eight virtual processors. With eight unit cells (32 particles) per subvolume, we would then have 256 particles per simulation in total. Similarly, in the diagram on the right each simulation is divided into more subvolumes and fewer of them are run in parallel. With 32 particles per subvolume this would imply a number of particles that is excessive for our purposes, but it nevertheless illustrates how the system may be scaled. The numbers of unit cells per subvolume and subvolumes per simulation (which may of course be equal to one) and the total size of the coordinate array are controlled by parameters at compile time, and to change them does not require rewriting of code. The relevant parameters are:

NPR: the edge length of the array that holds the coordinates (both diagrams in figure E.1 have the same NPR).

NEDGE: the number of subvolumes along each edge of each simulation (the diagrams in figure E.1 have NEDGE = 2 on the left and a larger value on the right).

LPV: the number of unit cells along the edge of a subvolume.

Related quantities are:

N: the number of particles per simulation, given by N = 4 (NEDGE × LPV)^3.

NSIM ($N_s$): the number of independent replica simulations run in parallel, given by NSIM = (NPR/NEDGE)^3.
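The relations between these parameters can be written out as a small sketch. It assumes four particles per fcc unit cell and a cubic (3d) layout of both the subvolumes and the coordinate array, so that the exponents are 3; the function names and the value NPR = 8 used in the example are hypothetical.

```python
def particles_per_simulation(nedge, lpv):
    """N = 4 * (NEDGE * LPV)**3: four particles per fcc unit cell, with
    NEDGE * LPV unit cells along each edge of the simulation volume."""
    return 4 * (nedge * lpv) ** 3

def replicas_in_parallel(npr, nedge):
    """NSIM = (NPR / NEDGE)**3, assuming NEDGE divides NPR exactly."""
    if npr % nedge != 0:
        raise ValueError("NPR must be a multiple of NEDGE")
    return (npr // nedge) ** 3

# NEDGE = 2, LPV = 2 reproduces the 256-particle example in the text;
# NPR = 8 is a made-up array size used only for illustration.
print(particles_per_simulation(2, 2), replicas_in_parallel(8, 2))
print(particles_per_simulation(1, 2), replicas_in_parallel(8, 1))
```

Setting NEDGE = 1 corresponds to pure primitive parallelism, with every array element holding a complete, independent simulation.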

The reason that the subvolumes of each simulation are split up across the array is that, by doing this, each subvolume can access the coordinates of particles in neighbouring subvolumes in a single periodic shift operation, where all data moves the same distance. For example, single shifts of NPR/NEDGE are used to access the coordinates of particles in subvolumes that share faces; repeated shifts at right angles are required to get at neighbours that share edges or corners. Particles in subvolumes that are not nearest neighbours are too far apart to interact before or after a particle move, so we do not need to check them. If the subvolumes were grouped together, slower messages would have to be sent using the general communications router, because the periodic boundary conditions of each simulation would not match those of the array as a whole.

Having described how each simulation is (or can be) split up into subvolumes, it remains to describe how the particles are treated within each subvolume. All particles within a subvolume are local to a processing node and are indexed using extra dimensions (declared serial) in the coordinate array. The normal way to treat this problem is to keep a neighbour list, which for each particle records all the other particles that it may interact with. However, to reference and update the neighbour lists, which are in general different for each particle, requires indirect addressing (indices on indices), and this generates slow communication code on the CM even when the particles are within the same processing node. For this reason we did not use neighbour lists.

In fact, because we are dealing only with solids, each particle always stays near its lattice site and so would have had the same unchanging neighbour list of its twelve nearest neighbours. Given this, indirect addressing could have been avoided; however, we opted at the design stage for a method that could be applied to fluids with little modification, since it was then our intention to investigate them as well. This led us to the following method: each simulation subvolume is further subdivided into eight octants, and they are cycled through in a fixed order, with a particle in each being picked at random for a trial displacement. The displacement is chosen at random within a small sphere of radius $\Delta x$; $\Delta x$ itself is chosen to control the acceptance ratio of particle moves. Provided that the subvolume is big enough, this ensures that particles in corresponding octants of different subvolumes of the same simulation cannot interact before or after they are moved, and so can be updated in parallel. We require, for the general case where the particles need not be on their lattice sites, that the side length $L$ of the subvolume should satisfy

L x

For the square-well solid, each particle move requires the calculation of a minimum of 24 interactions (12 with the nearest neighbours before the move and 12 with them after it). However, we do not in general know which the nearest neighbours are (because there are no neighbour lists) and so must test all possible candidates. This can cause substantial inefficiency: for example, with NEDGE = 1 (primitive parallelism only) we must test all the other $N - 1$ particles, that is to say we must calculate $2(N - 1)$ interactions for each particle move. With larger values of NEDGE and LPV the number of interactions to be tested rises further, and some of them require interprocessor communication.

The best performance for the solid is in fact obtained by using LPV = 1, with NPR increased to keep NSIM constant. Each virtual processor then contains only four particles (so half the octants can be skipped over), and the process of checking within the processor volume and within neighbouring octants finds just the twelve nearest neighbours, as required, and no others. Choosing LPV = 1 at solid densities violates the general condition on $L$ given above, but since the forces are short-ranged and the lattice prevents large particle movements, it is still the case that interacting particles are never updated simultaneously. Because we no longer waste time calculating interactions that are always zero, this procedure is slightly faster (i.e. does a slightly greater total number of particle updates per second) than pure primitive parallelism for the smallest systems, even though interprocessor communication is now involved. For larger systems it is about five times faster than primitive parallelism alone.

Bibliography

C Truesdell The Tragicomical History of Thermodynamics SpringerVerlag

Berlin

A B Pippard Elements of Classical Thermodynamics Cambridge University Press Cam

bridge

H B Callen Thermodynamics and an Introduction to Thermostatics John Wiley Sons

New York

K Huang Statistical Mechanics John Wiley Sons New York

R P Feynman Statistical MechanicsA Set of Lectures W A Benjamin inc Reading

MA

D Chandler An Introduction to Modern Statistical Mechanics Oxford University Press

Oxford

J R Waldram The Theory of Thermodynamics Cambridge University Press Cambridge

SK Ma Statistical Mechanics World Scientic Singap ore

Phase Transitions and Critical Phenomena ed C Domb M S Green Academic Press

London

A E Ferdinand M E Fisher Phys Rev B Kaufman Phys Rev

L Onsager Phys Rev

BIBLIOGRAPHY

B M McCoy T T Wu The TwoDimensional Ising Model Harvard University Press

Cambridge MA

R Baierlein Atoms and Information Theory W H Freeman Co

E T Jaynes in Maximum Entropy and Bayesian Methods ed P F Fougere Kluwer

Academic Publishers Dordrecht

J L Leb owitz E H Lieb Phys Rev Lett

M Plischke B Bergersen Equilibrium Statistical Physics Prentice Hall New Jersey

Finite Size Scaling and Numerical Simulation of Statistical Systems ed V Privman World

Scientic Publishing Singap ore

D P Woo dru The SolidLiquid Interface Cambridge University Press London

Jo oyoung Lee M A Novotny P A Rikvold Phys Rev E

K Binder D P Landau Phys Rev B

Murty S S Challa D P Landau K Binder Phys Rev B

J Lee J M Kosterlitz Phys Rev Lett

C Borgs R Kotecky Phys Rev Lett

J S van Duijneveldt D Frenkel J Chem Phys

E Buenoir and S Wallon J Phys A

P Martin Potts Models and Related Problems in Statistical Mechanics World Scientic

Publishing Co Singap ore

R J Baxter Exactly Solved Models in Statistical Mechanics Academic Press London

J A Barker D Henderson Rev Mod Phys

A J Gutmann I G Enting J Phys A L

M E Fisher Rev Mod Phys

BIBLIOGRAPHY

J Amit Field Theory the Renormalisation Group and Critical Phenomena McGrawHill

New York

G S Pawley R H Swendsen D J Wallace K G Wilson Phys Rev B

J P Hansen I R McDonald Theory of Simple Liquids nd edition Academic Press

London

J K Percus G J Yevick Phys Rev

J E Mayer M G Mayer Statistical Mechanics McGrawHill New York

M Parrinello A Rahman Phys Rev Lett

J J Erpenbeck W W Wood in Statistical Mechanics Vol b ed B J Berne Plenum Press New York

J Kushick B J Berne in Statistical Mechanics Vol b ed B J Berne Plenum Press New York

S Duane A D Kennedy B J Pendleton D Roweth Phys Lett B

B Mehlig D W Heermann B M Forrest Phys Rev B

S Nose J Phys Cond Mat SA

The Monte Carlo Method in Condensed Matter Physics ed K Binder Springer-Verlag Berlin

K Binder D W Heermann Monte Carlo Simulation in Statistical Physics: An Introduction Springer-Verlag Berlin

O G Mouritsen Computer Studies of Phase Transitions and Critical Phenomena Springer-Verlag Berlin

M P Allen D J Tildesley Computer Simulation of Liquids Clarendon Press Oxford

K Binder J Comp Phys

D Frenkel Free Energy Computation and First-Order Phase Transitions in Molecular Dynamics Simulation of Statistical-Mechanical Systems ed G Ciccotti W G Hoover North-Holland Amsterdam

D Frenkel Monte Carlo Simulations in Computer Modelling of Fluids Polymers and Solids ed C R A Catlow C S Parker M P Allen Kluwer Academic Publishers Dordrecht

W H Press B P Flannery S A Teukolsky W T Vetterling Numerical Recipes Cambridge University Press Cambridge

C J Geyer in Computer Science and Statistics Proceedings of the rd Symposium Interface

C J Geyer E A Thompson J R Statist Soc B

N Metropolis A W Rosenbluth M N Rosenbluth A H Teller E Teller J Chem Phys

H Müller-Krumbhaar K Binder J Stat Phys

K S Shing K E Gubbins Mol Phys

R H Swendsen J-S Wang Phys Rev Lett

U Wolff Phys Rev Lett

R M Neal Probabilistic Inference Using Markov Chain Monte Carlo Methods Technical Report CRG-TR Dept of Computer Science University of Toronto

B A Berg T Neuhaus Phys Lett B; Phys Rev Lett

A D Kennedy review article Nucl Phys B S

A P Lyubartsev A A Martsinovski S V Shevkunov P N Vorontsov-Velyaminov J Chem Phys

M Abramowitz I A Stegun Handbook of Mathematical Functions Dover New York NY

N F Carnahan K E Starling J Chem Phys

B L Holian G K Straub R E Swanson D C Wallace Phys Rev B

G N Patey J P Valleau Chem Phys Lett

G N Patey G M Torrie J P Valleau J Chem Phys

W G Hoover M Ross D Henderson J A Barker B C Brown J Chem Phys

W G Hoover S G Gray K W Johnson J Chem Phys

R E Swanson G K Straub B L Holian D C Wallace Phys Rev B

P Bolhuis D Frenkel Phys Rev Lett

M Hagen E J Meijer G C A M Mooij D Frenkel H N W Lekkerkerker Nature

D Frenkel A J C Ladd J Chem Phys

W G Hoover F H Ree J Chem Phys

W G Hoover F H Ree J Chem Phys

J P Hansen L Verlet Phys Rev

K K Mon Phys Rev B

T P Straatsma H J C Berendsen J P M Postma J Chem Phys

D A Kofke Mol Phys

McDonald Singer J Chem Phys

McDonald Singer J Chem Phys

J P Valleau D N Card J Chem Phys

Z Li H A Scheraga J Phys Chem

Z Li H A Scheraga Chem Phys Lett

B S Whatson KW Chao J Chem Phys

C H Bennett J Comp Phys

S D Hong B J Yoon M S Jhon Chem Phys Lett

K K Mon Phys Rev Lett

A M Ferrenberg R H Swendsen Phys Rev Lett Erratum ibid

B Widom J Chem Phys

K S Shing K E Gubbins Mol Phys; K S Shing K E Gubbins Mol Phys

S K Kumar J Chem Phys

A M Ferrenberg R H Swendsen Phys Rev Lett

A M Ferrenberg in Computer Simulation Studies in Condensed Matter Physics III ed D P Landau K K Mon H-B Schüttler Springer-Verlag Berlin Heidelberg

J M Rickman S R Phillpot Phys Rev Lett

E P Munger M A Novotny Phys Rev B

G M Torrie J P Valleau Chem Phys Lett

L D Fosdick Methods Comput Phys

B Hesselbo R B Stinchcombe Phys Rev Lett

M Mezei J Comp Phys

B A Berg Int J Mod Phys C

W Janke in Computer Simulations in Condensed Matter Physics VII ed D P Landau K K Mon H-B Schüttler Springer-Verlag Berlin

B A Berg U H E Hansmann T Neuhaus Phys Rev B brief reports

B A Berg U H E Hansmann T Neuhaus Z Phys B

A Billoire T Neuhaus B A Berg Nucl Phys B

W Janke B A Berg M Katoot Nucl Phys B

W Beirl B A Berg B Krishnan H Markum J Reidler Nucl Phys B S

B A Berg B Krishnan Phys Lett B

B Grossman M L Laursen T Trappenberg U J Wiese Phys Lett B

B Grossman M L Laursen Nucl Phys B

B A Berg T Celik Phys Rev Lett; Int J Mod Phys C

B A Berg T Celik U H E Hansmann Europhys Lett

B A Berg U H E Hansmann T Celik Nucl Phys B S

T Celik U H E Hansmann M Katoot J Stat Phys

B A Berg Nature

U H E Hansmann Y Okamoto J Comp Chem

B A Berg U H E Hansmann Y Okamoto J Phys Chem

U H E Hansmann Y Okamoto preprint ETHIPS NWU March

K Rummukainen Nucl Phys B

W Janke T Sauer Phys Rev E; Nucl Phys B S

W Janke T Sauer J Stat Phys

W Janke S Kappler Nucl Phys B S; Phys Rev Lett

A M Ferrenberg D P Landau R H Swendsen Phys Rev E

Jo oyoung Lee Phys Rev Lett Erratum ibid

N B Wilding Phys Rev E

B A Berg preprint available as paper from archive at http://xxx.lanl.gov/hep-lat

A P Lyubartsev A Laaksonen P N Vorontsov-Velyaminov Mol Phys

E Marinari G Parisi Europhysics Lett

L A Fernandez E Marinari J J Ruiz-Lorenzo unpublished

G Iori E Marinari G Parisi Europhysics Lett; Int J Mod Phys C

A Irback F Potthast preprint LU TP

D Bouzida S K Kumar R H Swendsen Phys Rev A

E Marinari G Parisi J Ruiz-Lorenzo F Ritort preprint available as paper from archive at http://babbage.sissa.it/cond-mat submitted to Phys Rev Lett

N B Wilding M Müller K Binder J Chem Phys

G C A M Mooij D Frenkel B Smit J Phys Cond Mat L

W Kerler P Rehberg Phys Rev E

W Kerler C Rebbi A Weber Nucl Phys B S; same authors preprint BUHEP available as paper from archive at http://xxx.lanl.gov/hep-lat

J P Valleau J Comp Phys

J P Valleau J Chem Phys

R W Gerling A Hüller Z Phys B

M Promberger A Hüller Z Phys B

G E Norman V S Filinov High Temp Res USSR

D J Adams Mol Phys

N B Wilding A D Bruce J Phys Condens Matter

A Z Panagiotopoulos Mol Phys

A Z Panagiotopoulos N Quirke M Stapleton D J Tildesley Mol Phys

B Smit P De Smedt D Frenkel Mol Phys

B Smit in Computer Simulation in Chemical Physics ed M P Allen D J Tildesley Kluwer Academic Publishers Dordrecht

A Z Panagiotopoulos Mol Simulation

SK Ma J Stat Phys

H M Huang SK Ma Y M Shih Solid State Communs

H Meirovich J Phys A

A G Schlijper A R D Van Bergen B Smit Phys Rev A

Kikuchi Phys Rev reprinted in Phase Transitions and Critical Phenomena Vol ed C Domb M S Green Academic Press London

K Binder Z Phys B

J M Rickman S R Phillpot J Chem Phys

J M Rickman D J Srolovitz J Chem Phys

G Bhanot S Black P Carter R Salvador Phys Lett B; G Bhanot R Salvador S Black P Carter R Toral Phys Rev Lett

K M Bitar Nucl Phys B

A B Bortz M H Kalos J L Lebowitz J Comp Phys

M A Novotny Phys Rev Lett Erratum ibid

G L Bretthorst in Maximum Entropy and Bayesian Methods ed P F Fougere Kluwer Academic Publishers Dordrecht

T J Loredo in Maximum Entropy and Bayesian Methods ed P F Fougere Kluwer Academic Publishers Dordrecht

T Bayes reprinted in Biometrika

H Jeffreys The Theory of Probability Clarendon Press Oxford later editions

E S Ristad preprint CS-TR available as paper from archive at http://xxx.lanl.gov/cmp-lg

J J Martin Bayesian Decision Problems and Markov Chains Wiley New York

R A Howard Dynamic Probabilistic Systems Vol Markov Models Wiley New York

M Krajci J Hafner Phys Rev Lett

D Nicolaides A D Bruce J Phys A

A D Bruce submitted to J Phys E

R Hilfer Z Phys B

A Aharony M E Fisher Phys Rev B

E W Montroll in Proc Symp Applied Maths

P Bolhuis M Hagen D Frenkel Phys Rev E

C F Tejero A Daanoun H N W Lekkerkerker M Baus Phys Rev Lett

A Daanoun C F Tejero M Baus Phys Rev E

C F Tejero A Daanoun H N W Lekkerkerker M Baus Phys Rev E

P N Pusey in Les Houches Session LI: Liquides, Cristallisation et Transition Vitreuse (Liquids, Freezing and Glass Transition) ed J P Hansen D Levesque J Zinn-Justin Elsevier Science Publishers B V

S Asakura F Oosawa J Polymer Sci

D A Young Phase Diagrams of the Elements University of California Press

P R Sperry J Coll Interface Sci

H N W Lekkerkerker W C K Poon P N Pusey A Stroobants P B Warren Europhysics Lett

M Hagen D Frenkel J Chem Phys

R Hall J Chem Phys

C Borgs R Kotecky J Stat Phys

C Borgs W Janke Phys Rev Lett

A R Ubbelohde The Molten State of Matter Wiley New York

M N Barber in Phase Transitions and Critical Phenomena Vol p ed C Domb J L Lebowitz Academic Press New York and references therein

V Privman J Rudnick J Stat Phys

V Privman M E Fisher Phys Rev B

H Cramer The Elements of Probability Theory Wiley New York

M Kikuchi N Ito Y Okabe in Computer Simulations in Condensed Matter Physics VII ed D P Landau K K Mon H-B Schüttler Springer-Verlag Berlin

H L Grey W R Schucany The Generalized Jackknife Statistic New York M Dekker

B A Berg Comp Phys Communs