Maximum Entropy Distribution (Also Known As the Maxwell-Boltzmann Distribution)


Maximum Entropy Framework
Sridhar Mahadevan ([email protected]), University of Massachusetts, CMPSCI 689

Topics

Maximum entropy is the dual of maximum likelihood. Maximum entropy (ME) involves finding the "maximally general" distribution that satisfies a set of prespecified constraints. The notion of "maximal generality" is made precise using the concept of entropy, and the distribution that satisfies the ME constraints can be shown to be of an exponential form.

The concept of maximum entropy comes from statistical physics, and this whole area of machine learning borrows heavily from physics. Undirected graphical models are called random field models, and they also derive much of their formalism from statistical physics.

Statistical Physics

Molecules in a gas: temperature ~ kinetic energy. What is the distribution of velocities?

The concept of entropy was investigated originally in statistical physics and thermodynamics. The temperature of a gas is proportional to the average kinetic energy of its molecules, and the distribution of velocities at a given temperature is a maximum entropy distribution (also known as the Maxwell-Boltzmann distribution). Boltzmann machines are a type of neural network that gets its inspiration from statistical physics, and there is a whole subfield of energy-based models in machine learning built on ideas borrowed from statistical physics.

Maximum Entropy Example

Consider estimating a joint distribution P(u, v) over two binary variables u and v, given some constraints: the two entries with (say) v = 0 must sum to 0.6, and all four entries must sum to 1.0.

  P(u,v)   v = 0   v = 1
  u = 0      ?       ?
  u = 1      ?       ?
  sum       0.6    (total 1.0)

The maximum entropy framework suggests picking values that make the least commitments while remaining consistent with the constraints that are given. For this example, this suggests a unique assignment of values (0.3 in each cell of the constrained column and 0.2 in each of the others), but in general, determining the set of possible distributions is not so straightforward.

Maximum Entropy Distributions

Consider maximizing the entropy h(P) over all distributions P satisfying the following constraints:

  P(x) \ge 0 \quad (\text{where } x \in S, \text{ the support of } P)
  \sum_{x \in S} P(x) = 1
  \sum_{x \in S} P(x) r_i(x) = \alpha_i \quad \text{for } 1 \le i \le m

Writing out the Lagrangian, we get

  \Lambda(P) = -\sum_{x \in S} P(x) \ln P(x) + \lambda_0 \Big( \sum_{x \in S} P(x) - 1 \Big) + \sum_{i=1}^{m} \lambda_i \Big( \sum_{x \in S} P(x) r_i(x) - \alpha_i \Big)

Taking the gradient of this Lagrangian with respect to P(x), we get

  \frac{\partial}{\partial P(x)} \Lambda(P) = -\ln P(x) - 1 + \lambda_0 + \sum_{i=1}^{m} \lambda_i r_i(x)

Setting this to zero gives

  P(x) = e^{\lambda_0 - 1 + \sum_{i=1}^{m} \lambda_i r_i(x)}, \quad x \in S

where \lambda_0, \lambda_1, \ldots, \lambda_m are chosen so that the constraints are satisfied.

Differential Entropy

If X is a continuous variable, the entropy H(X) is referred to as the differential entropy. A similar result can be proved for continuous variables, with the constraints

  P(x) \ge 0 \quad (\text{where } P(x) = 0 \text{ if } x \text{ is not in the support } S \text{ of } P)
  \int_S P(x)\, dx = 1
  \int_S P(x) r_i(x)\, dx = \alpha_i \quad \text{for } 1 \le i \le m

The Lagrangian in the continuous case is

  \Lambda(P) = -\int_S P(x) \ln P(x)\, dx + \lambda_0 \Big( \int_S P(x)\, dx - 1 \Big) + \sum_{i=1}^{m} \lambda_i \Big( \int_S P(x) r_i(x)\, dx - \alpha_i \Big)
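Before turning to the worked examples, here is a minimal numerical sketch of the discrete derivation above (not from the lecture): it maximizes entropy subject to two made-up moment constraints on a small support and then checks that the solution has the exponential form e^{\lambda_0 - 1 + \sum_i \lambda_i r_i(x)}. The support, the constraint functions r_i, and the targets \alpha_i are illustrative assumptions.

```python
# Sketch only: solve a small discrete maximum entropy problem numerically and
# verify the exponential form of the solution. Support, r_i, and alpha_i are
# made up for illustration; assumes numpy and scipy are installed.
import numpy as np
from scipy.optimize import minimize

xs = np.arange(6)                     # support S = {0, 1, ..., 5}
r = np.stack([xs, xs ** 2])           # constraint functions r_1(x) = x, r_2(x) = x^2
alpha = np.array([2.0, 6.0])          # target moments alpha_1 = E[X], alpha_2 = E[X^2]

def neg_entropy(p):
    p = np.clip(p, 1e-12, None)
    return np.sum(p * np.log(p))      # minimizing -h(P) = sum_x p(x) ln p(x)

constraints = [{"type": "eq", "fun": lambda p: p.sum() - 1.0}] + [
    {"type": "eq", "fun": (lambda p, k=k: r[k] @ p - alpha[k])} for k in range(2)
]
p0 = np.full(len(xs), 1.0 / len(xs))
res = minimize(neg_entropy, p0, bounds=[(0, 1)] * len(xs), constraints=constraints)
p = res.x

# Exponential-form check: ln P(x) should be affine in r_1(x), r_2(x), i.e.
# ln P(x) = (lambda_0 - 1) + lambda_1 * x + lambda_2 * x^2 for some lambdas.
coeffs, *_ = np.linalg.lstsq(
    np.column_stack([np.ones_like(xs), xs, xs ** 2]), np.log(p), rcond=None)
print("ME distribution:", np.round(p, 4))
print("fitted (lambda_0 - 1, lambda_1, lambda_2):", np.round(coeffs, 4))
```

The least-squares fit at the end should reproduce ln P(x) essentially exactly, which is the numerical counterpart of the statement that the maximizer lies in an exponential family determined by the constraint functions.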
Example 1: Uniform Distribution

Find the maximum entropy distribution that satisfies the single constraint S = [a, b]. Since there are no other constraints, all \lambda_i = 0 for i > 0, so the form of the distribution must be P(x) = e^{\lambda_0} (absorbing the constant -1 into \lambda_0). Solving the normalization integral, we find

  \int_a^b e^{\lambda_0}\, dx = 1 \;\Rightarrow\; e^{\lambda_0} (b - a) = 1

which immediately gives us the uniform distribution, because

  P(x) = e^{\lambda_0} = \frac{1}{b - a}

Example 2: Gaussian Distribution

Find the maximum entropy distribution that satisfies the following constraints: S = (-\infty, \infty), EX = 0 (that is, \int_{-\infty}^{\infty} x P(x)\, dx = 0), and EX^2 = \sigma^2. The form of the distribution in this case is P(x) = e^{\lambda_0 + \lambda_1 x + \lambda_2 x^2}. We have to find the constants \lambda_i by solving

  \int_{-\infty}^{\infty} e^{\lambda_0 + \lambda_1 x + \lambda_2 x^2}\, dx = 1
  \int_{-\infty}^{\infty} x\, e^{\lambda_0 + \lambda_1 x + \lambda_2 x^2}\, dx = 0
  \int_{-\infty}^{\infty} x^2\, e^{\lambda_0 + \lambda_1 x + \lambda_2 x^2}\, dx = \sigma^2

Show that by choosing \lambda_1 = 0, \lambda_2 = -\frac{1}{2\sigma^2}, and \lambda_0 = -\ln \sqrt{2\pi\sigma^2}, we get the Gaussian distribution.

Example 3: Mystery Distribution

Find the maximum entropy distribution that satisfies the following constraints: S = [0, \infty) and EX = \mu. The solution to this mystery distribution gives us a well-known PDF (it is highly recommended that you solve this by finding the \lambda_i constants!).

Cross Entropy and Log Likelihood

Let us define the empirical distribution \tilde{P}(x) by placing a point mass at each data point:

  \tilde{P}(x) = \frac{1}{N} \sum_{n=1}^{N} I(x, x_n)

The cross-entropy of the empirical distribution with the model P(x|\theta) gives

  \sum_x \tilde{P}(x) \log P(x|\theta) = \sum_x \frac{1}{N} \sum_{n=1}^{N} I(x, x_n) \log P(x|\theta)
    = \frac{1}{N} \sum_{n=1}^{N} \sum_x I(x, x_n) \log P(x|\theta)
    = \frac{1}{N} \sum_{n=1}^{N} \log P(x_n|\theta) = \frac{1}{N}\, l(\theta|X)

KL Divergence and Maximum Likelihood

Let us compute the KL divergence between the empirical distribution \tilde{P}(x) and the model P(x|\theta):

  KL(\tilde{P} \,\|\, P(x|\theta)) = \sum_x \tilde{P}(x) \log \frac{\tilde{P}(x)}{P(x|\theta)}
    = \sum_x \tilde{P}(x) \log \tilde{P}(x) - \sum_x \tilde{P}(x) \log P(x|\theta)
    = \sum_x \tilde{P}(x) \log \tilde{P}(x) - \frac{1}{N}\, l(\theta|X)

The first term is independent of the model parameters. This means that optimizing the parameters \theta to minimize the KL divergence between the model and the empirical distribution is the same as maximizing the (log) likelihood.
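The identity above is easy to verify numerically. The following sketch (not from the lecture) uses a made-up categorical data set and two candidate parameter vectors: for each \theta, the KL divergence plus the per-sample log-likelihood equals the same \theta-independent constant \sum_x \tilde{P}(x) \log \tilde{P}(x), so minimizing one is maximizing the other.

```python
# Sketch only: numeric check that KL(P~ || P_theta) and (1/N) l(theta|X) differ
# by a theta-independent constant. Data and candidate thetas are made up.
import numpy as np

data = np.array([0, 1, 1, 2, 0, 1, 1, 0, 2, 1])      # N = 10 draws from {0, 1, 2}
N, K = len(data), 3
p_emp = np.bincount(data, minlength=K) / N            # empirical distribution P~(x)

def kl_and_loglik(theta):
    log_lik = np.sum(np.log(theta[data]))              # l(theta | X)
    kl = np.sum(p_emp * (np.log(p_emp) - np.log(theta)))
    return kl, log_lik

for theta in (np.array([0.3, 0.4, 0.3]), p_emp):       # an arbitrary guess vs. the MLE (= P~)
    kl, ll = kl_and_loglik(theta)
    # kl + ll/N should be the same value for both thetas: sum_x P~(x) log P~(x).
    print(f"KL = {kl:.4f}   (1/N) log-lik = {ll / N:.4f}   sum = {kl + ll / N:.4f}")
```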
Conditional Maximum Entropy

In classification we are interested in modeling the conditional distribution P(y|x), so let us now view classification in the maximum entropy framework. To model P(y|x), assume we are given a set of binary features f_i(x, y), each of which represents the co-occurrence of the ("contextual") input x and the (label) output y. For example, in a natural language application, x may be the set of words surrounding a given word, and y may represent the part of speech (or a category like "company name"). We generalize the idea from logistic regression and construct an exponential model P(y|x, \lambda) using a weighted linear combination of the features f_i(x, y), where the weights are scalars \lambda_i. We want to set up constraints based on the "error" between the empirical joint distribution \tilde{P}(x, y) and the product model distribution \tilde{P}(x) P(y|x, \lambda).

Constraints

We now define a set of constraints on the simplex of probability distributions. Each constraint is based on comparing the prediction of the model with the empirical distribution derived from the training set (x_1, y_1), \ldots, (x_N, y_N). Define the empirical joint distribution \tilde{P}(u, v) as simply the frequency of co-occurrence of (u, v) in the training set:

  \tilde{P}(u, v) = \frac{1}{N} \sum_{x,y} I((u, v), (x, y))

For each binary-valued feature f(x, y) we can impose an additional constraint, namely that the expected value of f under the model should agree with its expectation under the empirical joint distribution:

  E_P(f) \equiv \sum_{x,y} \tilde{P}(x) P(y|x, \lambda) f(x, y) = E_{\tilde{P}}(f) \equiv \sum_{x,y} \tilde{P}(x, y) f(x, y)

Maximum Conditional Entropy

Given n feature functions f_i(x, y), we would like our estimated model to be in accordance with the sample distribution:

  C = \{ P \in \mathcal{P} \mid E_P(f_i(x, y)) = E_{\tilde{P}}(f_i(x, y)) \text{ for } i \in \{1, \ldots, n\} \}

But this set of constraints still defines a simplex of possible distributions, and we need to constrain it further. The maximum conditional entropy principle states that we should pick the model that maximizes the conditional entropy

  H(P) = -\sum_{x \in X,\, y \in Y} \tilde{P}(x) P(y|x, \lambda) \log P(y|x, \lambda)

Putting both of these together, we get

  P^* = \arg\max_{P \in C} H(P)

Constrained Optimization

For each feature function f_i(x, y), we introduce a Lagrange multiplier parameter \lambda_i and form the overall Lagrangian

  \Lambda(P, \lambda) \equiv H(P) + \sum_i \lambda_i \big( E_P(f_i) - E_{\tilde{P}}(f_i) \big)

If we keep \lambda fixed, we can find the unconstrained maximum of the Lagrangian \Lambda(P, \lambda) over all P \in \mathcal{P}. Let P_\lambda denote the argument where the maximum is reached and \Psi(\lambda) the maximizing value:

  P_\lambda = \arg\max_{P \in \mathcal{P}} \Lambda(P, \lambda)
  \Psi(\lambda) = \Lambda(P_\lambda, \lambda)

Maximum Entropy Distribution

Key theorem: by solving the Lagrangian above, one can show that the maximum entropy distribution is given by the following exponential form:

  P(y|x, \lambda) = \frac{1}{Z_\lambda(x)} \exp\Big( \sum_i \lambda_i f_i(x, y) \Big)

where Z_\lambda(x) = \sum_y \exp\big( \sum_i \lambda_i f_i(x, y) \big). The maximizing value is

  \Psi(\lambda) \equiv \Lambda(P_\lambda, \lambda) = -\sum_x \tilde{P}(x) \log Z_\lambda(x) + \sum_i \lambda_i E_{\tilde{P}}(f_i)

Karush-Kuhn-Tucker Theorem

The original constrained optimization problem, called the primal, is

  P^* = \arg\max_{P \in C} H(P)

The dual (unconstrained) optimization problem is to find

  \lambda^* = \arg\max_\lambda \Psi(\lambda)

The KKT theorem shows that under certain conditions, the solution to the dual problem gives us the solution to the primal problem; more precisely, P^* = P_{\lambda^*}. We will see the KKT theorem show up again in support vector machines.

Maximum Likelihood Revisited

The log-likelihood of the empirical distribution \tilde{P}, as predicted by the discriminant function P(Y|X, \lambda), is

  L_{\tilde{P}}(P) = \log \prod_{x,y} P(y|x, \lambda)^{\tilde{P}(x,y)} = \sum_{x,y} \tilde{P}(x, y) \log P(y|x, \lambda)

We can show that the log-likelihood above is exactly the dual function \Psi(\lambda) given above. In other words, the maximum entropy model P^* \in C is exactly the model in the parametric family P(Y|X, \lambda) that maximizes the likelihood of the training instances \tilde{P}.
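To make the conditional maximum entropy model concrete, here is a minimal sketch (not from the lecture) that builds P(y|x, \lambda) in the exponential form above and fits \lambda by gradient ascent on the dual, using the fact that the gradient of \Psi(\lambda) (equivalently, of the log-likelihood) is E_{\tilde{P}}(f_i) - E_P(f_i). The toy training pairs and the two binary feature functions are made-up illustrations, not part of the lecture.

```python
# Sketch only: tiny conditional maximum entropy model
#   P(y|x, lambda) = exp(sum_i lambda_i f_i(x, y)) / Z_lambda(x)
# fit by gradient ascent on Psi(lambda); gradient is E_{P~}(f) - E_P(f).
import numpy as np

X_vals, Y_vals = [0, 1], [0, 1]
data = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0)]     # made-up training pairs (x, y)
N = len(data)

# Binary feature functions f_i(x, y); these particular choices are illustrative.
feats = [lambda x, y: float(x == 0 and y == 0),
         lambda x, y: float(x == 1 and y == 1)]

def f_vec(x, y):
    return np.array([f(x, y) for f in feats])

def cond_prob(x, lam):
    scores = np.array([lam @ f_vec(x, y) for y in Y_vals])
    scores -= scores.max()                                   # numerical stability
    p = np.exp(scores)
    return p / p.sum()                                       # P(y|x, lambda) over Y_vals

# Empirical quantities: P~(x) and E_{P~}(f).
p_x = {x: sum(1 for (xi, _) in data if xi == x) / N for x in X_vals}
emp_f = sum(f_vec(x, y) for (x, y) in data) / N

lam = np.zeros(len(feats))
for _ in range(500):                                          # gradient ascent on the dual
    model_f = sum(p_x[x] * cond_prob(x, lam)[y] * f_vec(x, y)
                  for x in X_vals for y in Y_vals)
    lam += 0.5 * (emp_f - model_f)                            # E_{P~}(f) - E_P(f)

for x in X_vals:
    print(f"P(y|x={x}) =", np.round(cond_prob(x, lam), 3))
print("E_P~(f) =", emp_f, "  E_P(f) =", np.round(model_f, 3))
```

At convergence the two expectation vectors printed at the end agree, which is exactly the constraint set C; the fitted model is simultaneously the maximum conditional entropy solution and the maximum likelihood member of the exponential family, as the KKT/duality argument above states.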