
The concept of entropy was introduced into the theory of thermodynamics by Rudolf Clausius in the 1850s. Clausius defined the entropy S as a state function of a thermodynamic system, such that the infinitesimal change in the entropy of the system is given by

$$ dS = \frac{\delta Q}{T} \tag{1} $$

where δQ is the infinitesimal amount of heat that the system either absorbs (δQ > 0) or rejects (δQ < 0), while T is the system's temperature. Making this definition exact would require us also to provide rigorous definitions of "thermodynamic system", "state function", "heat", and "temperature". If you haven't studied the subject, I suggest consulting the first couple of chapters of H. B. Callen's Thermodynamics and an Introduction to Thermostatistics, 2nd ed. (Wiley, 1985), or of any other good introductory textbook in thermodynamics.

The reason why the definition of entropy based on Eq. (1) is useful is that it allowed Clausius to formulate what we now call the "2nd law of thermodynamics": the world's entropy tends towards a maximum. In a reversible process entropy is conserved. In an irreversible process the total entropy increases. A process that would cause the total entropy to decrease (for instance, lukewarm water spontaneously separating itself into a layer of cold water and a layer of hot water) is forbidden. Clausius couldn't prove that this law is always valid. Rather, he postulated it and then proceeded to show (building on the pioneering and until then neglected work of Sadi Carnot, in whose honor Clausius apparently chose the letter S for the entropy) that a great variety of physical phenomena could thereby be explained.

One of the major achievements of theoretical physics in the late 19th century was to show that the 2nd law of thermodynamics could be explained statistically. The point is that when we see a thermodynamic system (for instance, a gas in a closed container), we can't in practice determine the microscopic details of what the molecules and atoms within it are doing. All we can do is measure a few macroscopic variables (such as volume, pressure, and temperature). These macroscopic variables define what we call the system's macrostate. Each macrostate could correspond to many different possible physical configurations of the molecules and atoms in the system. Each of those possible configurations is called a microstate. Ludwig Boltzmann showed that Clausius's definition of entropy in terms of heat and temperature could be replaced by a more fundamental definition in terms of the number of microstates compatible with a given macrostate:

$$ S = k_B \ln \Omega \tag{2} $$

In Eq. (2) – first written in this simple form not by Boltzmann himself but by Max Planck – S is the entropy of a thermodynamic system that is in some given macrostate, Ω is the number of different microstates that are compatible with that macrostate (i.e., in how many ways you could rearrange the molecules and atoms without changing the macrostate), and k_B is a universal constant (the "Boltzmann constant"). A modern theoretical physicist would prefer to work in units in which k_B = 1 (making S dimensionless).

There are at least three key things to note about Eq. (2). The first is that the entropy quantifies our ignorance of what the physical system is really doing. We know the macrostate (determined by macroscopic state variables like volume, pressure, and temperature, which we can easily measure). If the macrostate completely determined the actual physical state of the system, then only a single microstate would be allowed, so that Ω = 1 and therefore, by Eq. (2), S = 0. Thus, zero entropy corresponds to complete knowledge of the physical state. More entropy corresponds to more possible microstates, and therefore to greater ignorance.

The second very important thing about Eq. (2) is that it tells us the 2nd law of thermodynamics is a statistical result. If we assume that all of the possible microstates are equally likely, the probability of a macrostate is proportional to Ω. The natural logarithm is a monotonically increasing function, so a more probable macrostate has greater S. Therefore, when S increases the system is going from a less probable to a more probable state. When systems are composed of a great many particles (which is true of a glass of water, and even truer of the observable Universe) it is overwhelmingly more likely for entropy to increase than to decrease. The 2nd law becomes, for all intents and purposes, a certainty.

The third very important thing about Eq. (2) is that it has the right form to reproduce a fundamental property of the entropy as originally introduced by Clausius: it is an extensive variable. This means that the entropy of two systems considered jointly is equal to the sum of the entropies of each system. Note that the number of microstates of the two systems considered together is the product of the number of microstates of each separately; or, if you prefer, that the probability of the combined system being in some macrostate is the product of the probabilities of the two separate macrostates. Thus, by a basic property of logarithms:

$$ S_{1+2} = k_B \ln(\Omega_1 \Omega_2) = k_B \ln \Omega_1 + k_B \ln \Omega_2 = S_1 + S_2 \tag{3} $$
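
To make the second and third points concrete, here is a minimal numerical sketch (my own illustration, not from the original text; the two-box setup, the value N = 1000, and all names are assumptions chosen for the example). It counts microstates for a toy "gas" of N labelled particles that can each sit in the left or right half of a box, computes the Boltzmann entropy of Eq. (2) with k_B = 1, and checks both the overwhelming dominance of the maximum-entropy macrostate and the additivity of Eq. (3).

```python
# Toy illustration of Eq. (2) and Eq. (3), with k_B = 1 (dimensionless entropy).
# Hypothetical setup: N labelled particles in a box; the "macrostate" is just how
# many of them sit in the left half. The number of microstates compatible with
# macrostate n is the binomial coefficient C(N, n).

from math import comb, log

N = 1000  # number of particles (any large-ish N shows the same behaviour)

def omega(n: int) -> int:
    """Number of microstates with exactly n of the N particles in the left half."""
    return comb(N, n)

def entropy(n: int) -> float:
    """Boltzmann entropy S = ln(Omega) for macrostate n, in units of k_B."""
    return log(omega(n))

# (i) The maximum-entropy macrostate is the 50/50 split...
S_half = entropy(N // 2)
S_lopsided = entropy(N // 4)   # e.g. only a quarter of the particles on the left
print(f"S(N/2) = {S_half:.1f},  S(N/4) = {S_lopsided:.1f}")

# ...and if every microstate is equally likely, macrostates near N/2 are
# overwhelmingly more probable: even a modest imbalance is astronomically unlikely.
p_lopsided = omega(N // 4) / 2**N
print(f"P(exactly N/4 on the left) ~ {p_lopsided:.1e}")

# (ii) Extensivity (Eq. 3): for two independent systems the microstate counts
# multiply, so the entropies add (up to floating-point rounding).
n1, n2 = 400, 300
S_joint = log(omega(n1) * omega(n2))
print(f"joint: {S_joint:.6f}  vs  sum: {entropy(n1) + entropy(n2):.6f}")
```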

Willard Gibbs, the great American mathematical physicist, introduced the concept of a "statistical ensemble" in his book Elementary Principles in Statistical Mechanics, published in 1902. Gibbs's treatment is more general than Boltzmann's, because it allows the various microstates of an ensemble to have different probabilities. For such a generalized ensemble, Gibbs showed that the entropy could be expressed as

$$ S = -k_B \sum_i p_i \ln p_i \tag{4} $$

where the sum is over all of the possible microstates, each with probability p_i. Note that if we are certain of the microstate (say, we have two microstates with p_0 = 1 and p_1 = 0) then S = 0. The more evenly distributed the probabilities are, the greater the entropy will be (the short sketch at the end of this section illustrates this numerically).

The term "entropy" is used in other contexts, notably information theory. But there's an important difference in how the concept is used in the two disciplines: in thermodynamics the entropy measures variability that we don't care about: how much you could rearrange the microscopic components without changing the macrostate that we actually deal with and control in the lab. In information theory, on the other hand, entropy measures variability that we do care about, because it's the vehicle for conveying information to someone else: it quantifies one's ignorance before decoding the actual message contained in a given source.

There's a famous and amusing story about this. Information theory was developed by Claude Shannon in the 1940s. In quantifying the maximum amount of information encodable in some source, Shannon arrived at a formula of the same form as Eq. (4). Shannon asked the eminent mathematician and theoretical physicist John von Neumann what to call that quantity. According to one version of the story, von Neumann replied that Shannon should call it "entropy", for two reasons: "In the first place, a mathematical development very much like yours already exists in Boltzmann's statistical mechanics, and in the second place, no one understands entropy very well, so in any discussion you will be in a position of advantage."
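
As a concrete companion to Eq. (4), here is a small sketch (again my own illustration, not part of the original discussion; the example distributions are arbitrary). With k_B = 1 it shows that the Gibbs entropy vanishes when one microstate has probability 1, grows as the distribution becomes more even, reduces to Boltzmann's ln Ω for a uniform distribution, and becomes Shannon's bit-counting entropy when the logarithm is taken in base 2.

```python
# Minimal sketch of Eq. (4) with k_B = 1: the Gibbs/Shannon entropy of a
# probability distribution over microstates.

from math import log

def gibbs_entropy(probs) -> float:
    """S = -sum_i p_i ln p_i, skipping zero-probability microstates (0 ln 0 -> 0)."""
    return -sum(p * log(p) for p in probs if p > 0.0)

# Complete certainty about the microstate: S = 0.
print(gibbs_entropy([1.0, 0.0]))                  # 0.0

# A lopsided distribution has some entropy...
print(gibbs_entropy([0.9, 0.05, 0.03, 0.02]))     # ~0.43

# ...but the uniform distribution over Omega microstates maximizes it and
# reproduces Boltzmann's formula S = ln(Omega), as in Eq. (2).
omega = 4
print(gibbs_entropy([1 / omega] * omega), log(omega))   # both ~1.386

# Shannon's information-theoretic entropy is the same formula with log base 2,
# which measures the result in bits instead of nats.
def shannon_entropy_bits(probs) -> float:
    return -sum(p * log(p, 2) for p in probs if p > 0.0)

print(shannon_entropy_bits([0.5, 0.5]))           # 1.0 bit: one fair coin flip
```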