<<

lieb.qxp 4/9/98 11:08 AM Page 571

A Guide to and the Second Law of Elliott H. Lieb and Jakob Yngvason

his article is intended for readers who, low- and is not immutable— like us, were told that the second law of as are the others). In brief, these are: thermodynamics is one of the major The Zeroth Law, which expresses the transitiv- achievements of the nineteenth cen- ity of thermal equilibrium and which is often said tury—that it is a logical, perfect, and un- to imply the existence of temperature as a pa- breakableT law—but who were unsatisfied with the rametrization of equilibrium states. We use it below “derivations” of the entropy principle as found in but formulate it without mentioning temperature. textbooks and in popular writings. In fact, temperature makes no appearance here A glance at the books will inform the reader that until almost the very end. the law has “various formulations” (which is a bit The First Law, which is conservation of . odd, as if to say the Ten Commandments have It is a concept from mechanics and provides the various formulations), but they all lead to the ex- connection between mechanics (and things like istence of an entropy function whose reason for falling weights) and thermodynamics. We discuss existence is to tell us which processes can occur this later on when we introduce simple systems; and which cannot. We shall abuse language (or re- the crucial usage of this law is that it allows en- formulate it) by referring to the existence of en- ergy to be used as one of the parameters describ- tropy as the second law. This, at least, is unam- ing the states of a simple system. biguous. The entropy we are talking about is that The Second Law. Three popular formulations of defined by thermodynamics (and not some analytic this law are: quantity, usually involving expressions such as p ln p, that appears in , prob- Clausius: No process is possible, the − ability theory, and statistical mechanical models). sole result of which is that is trans- There are three (plus ferred from a body to a hotter one. one more, due to Nernst, which is mainly used in (and Planck): No process is pos- Elliott H. Lieb is professor of and physics at sible, the sole result of which is that a Princeton University. His e-mail address is lieb@math. body is cooled and is done. princeton.edu. Work partially supported by U.S. Na- tional Foundation grant PHY95-13072A01. Carathéodory: In any neighborhood of Jakob Yngvason is professor of at Vi- any state there are states that cannot enna University. His e-mail address is yngvason@ be reached from it by an adiabatic thor.thp.univie.ac.at. Work partially supported by process. the Adalsteinn Kristjansson Foundation, University of Ice- land. All three formulations are supposed to lead to ©1997 by the authors. Reproduction of this article, by any the entropy principle (defined below). These steps means, is permitted for noncommercial purposes. can be found in many books and will not be trod-

MAY 1998 NOTICES OF THE AMS 571 lieb.qxp 4/9/98 11:08 AM Page 572

den again here. Let us note in passing, however, lated previous work on the foundations of classi- that the first two use concepts such as hot, cold, cal thermodynamics are given in [7]. The literature heat, cool that are intuitive but have to be made on this subject is extensive, and it is not possible precise before the statements are truly meaning- to give even a brief account of it here, except for ful. No one has seen “heat”, for example. The last mentioning that the previous work closest to ours (which uses the term “”, to be de- is that of [6] and [2] (see also [4], [5], and [9]). fined below) presupposes some kind of parame- These other approaches are also based on an in- trization of states by points in Rn, and the usual vestigation of the relation , but the overlap with ≺ derivation of entropy from it assumes some sort our work is only partial. In fact, a major part of our of differentiability; such assumptions are beside work is the derivation of a certain property (the the point as far as understanding the meaning of “comparison hypothesis” below), which is taken as entropy goes. an axiom in the other approaches. It was a re- Why, one might ask, should a mathematician be markable and largely unsung achievement of Giles interested in this matter, which historically had [6] to realize the full of this property. something to do with attempts to understand and Let us begin the story with some basic con- improve the efficiency of steam engines? The an- cepts. swer, as we perceive it, is that the law is really an 1. : Physically this con- interesting mathematical theorem about an or- sists of certain specified amounts of certain dering on a set, with profound physical implica- kinds of matter, e.g., a gram of hydrogen in a tions. The axioms that constitute this ordering are container with a piston, or a gram of hydro- somewhat peculiar from the mathematical point gen and a gram of oxygen in two separate con- of view and might not arise in the ordinary rumi- tainers, or a gram of hydrogen and two grams nations of abstract thought. They are special but of hydrogen in separate containers. The sys- important, and they are driven by considerations tem can be in various states which, physically, about the world, which is what makes them so in- are equilibrium states. The space of states of teresting. Maybe an ingenious reader will find an the system is usually denoted by a symbol application of this same logical structure to another such as and states in by X,Y,Z, etc. field of science. Physical motivation aside, a state-space, math- The basic input in our analysis is a certain kind ematically, isΓ just a set to beginΓ with; later on we of ordering on a set, and denoted by will be interested in embedding state-spaces in some convex subset of some Rn+1; i.e., we will in- ≺ troduce coordinates. As we said earlier, however, (pronounced “precedes”). It is transitive and re- the entropy principle is quite independent of co- flexive, as in A1, A2 below, but X Y and Y X ordinatization, Carathéodory’s principle notwith- ≺ ≺ do not imply X = Y, so it is a “preorder”. The big standing. question is whether can be encoded in an ordi- 2. Composition and scaling of states: The notion ≺ nary, real-valued function, denoted by S, on the set, of Cartesian product, 1 2, corresponds sim- × such that if X and Y are related by , then ply to the two (or more) systems being side by ≺ S(X) S(Y ) if and only if X Y. The function S side on the laboratoryΓ table;Γ mathematically it ≤ ≺ is also required to be additive and extensive in a is just another system (called a compound sys- sense that will soon be made precise. tem), and we regard the state-space 1 2 as × A helpful analogy is the question: When can a being the same as 2 1. Points in 1 2 are × × vector-field, V(x), on R3 be encoded in an ordinary denoted by pairs (X,Y), as usual. ΓThe Γsub- function, f (x), whose gradient is V? The well- systems comprisingΓ a compoundΓ systemΓ Γ are known answer is that a necessary and sufficient physically independent systems, but they are condition is that curl V =0. Once V is observed to allowed to interact with each other for a pe- have this property, one thing becomes evident and riod of and thereby to alter each other’s important: it is necessary to the integral state. of V only along some curves—not all curves—in The concept of scaling is crucial. It is this con- order to deduce the integral along all curves. The cept that makes our thermodynamics inap- encoding then has enormous predictive power propriate for microscopic objects like atoms about the nature of future measurements of V. In or cosmic objects like . For each state- the same way, knowledge of the function S has space and number λ>0 there is another enormous predictive power in the hands of state-space, denoted by (λ), with points de- chemists, engineers, and others concerned with the noted byΓ λX. This space is called a scaled copy ways of the physical world. of . Of course we identifyΓ (1) = and Our concern will be the existence and proper- 1X = X. We also require ( (λ))(µ) = (λµ) and ties of S, starting from certain natural axioms µ(λXΓ )=(µλ)X. The physical interpretationΓ Γ of about the relation . We present our results with- (λ) when is the space ofΓ one gramΓ of hy- ≺ out proofs, but full details and a discussion of re- drogen is simply the state-space of λ grams Γ Γ

572 NOTICES OF THE AMS 45, NUMBER 5 lieb.qxp 4/9/98 11:10 AM Page 573

of hydrogen. The state λX is the state of λ vice consisting of some auxiliary system grams of hydrogen with the same “intensive” and a weight in such a way that the properties as X, e.g., , while “exten- auxiliary system returns to its initial sive” properties like energy, volume, etc., are state at the end of the process, whereas scaled by a factor λ (by definition). the weight may have risen or fallen. For any given we can form Cartesian The role of the “weight” in this definition is product state-spaces of the type (λ1) × merely to provide a particularly simple source (or (λ2) (λN ). TheseΓ will be called multiple- ×···× scaled copies of . Γ sink) of . Note that an adiabatic Γ The notationΓ (λ) should be regarded as merely process, physically, does not have to be gentle, or a mnemonic at thisΓ point, but later on, with the em- “static” or anything of the kind. It can be arbi- bedding of intoΓ Rn+1, it will literally be trarily violent! (See Figure 1.) λ = λX : X in the usual sense. An example might be useful here. Take a pound { ∈ } 3. :Γ Now we come to the or- of hydrogen in a container with a piston. The states Γ dering. We sayΓ X Y (with X and Y possibly are describable by two numbers, energy and vol- ≺ in different state-spaces) if there is an adiabatic ume, the latter being determined by the position process that transforms X into Y. of the piston. Starting from some state X, we can What does this mean? Mathematically, we are take our hand off the piston and let the volume just given a list of pairs X Y. There is nothing increase explosively to a larger one. After things ≺ more to be said, except that later on we will assume have calmed down, call the new equilibrium state Y. Then X Y. Question: Is Y X true? Answer: that this list has certain properties that will lead ≺ ≺ to interesting theorems about this list and will No. To get from Y to X we would have to use some lead, in turn, to the existence of an entropy func- machinery and a weight, with the machinery re- tion, S, characterizing the list. turning to its initial state, and there is no way this The physical interpretation is quite another can be done. Using a weight, we can indeed re- matter. In textbooks a process is usually called adi- compress the to its original volume, but we will abatic if it takes place in “thermal isolation”, which find that the energy is then larger than its origi- in turn means that “no heat is exchanged with the nal value. surroundings”. Such statements appear neither On the other hand, we could let the piston ex- sufficiently general nor precise to us, and we pre- pand very, very slowly by letting it raise a carefully fer the following version (which is in the spirit of calibrated weight. No other machinery is involved. Planck’s formulation of the second law [8]). It has In this case, we can reverse the process (to within the great virtue (as discovered by Planck) that it an arbitrarily good accuracy) by adding a tiny bit avoids having to distinguish between work and to the weight, which will then slowly push the pis- heat—or even having to define the concept of heat. ton back. Thus, we could have (in principle, at We emphasize, however, that the theorems do not least) both X Y and Y X, and we would call ≺ ≺ require agreement with our physical definition of such a process a reversible adiabatic process. adiabatic process; other definitions are conceivably Let us write possible. X Y if X Y A state Y is adiabatically accessible from ≺≺ ≺ a state X, in symbols X Y, if it is pos- ≺ but not sible to change the state from X to Y by Y X (written Y X). means of an interaction with some de- ≺ 6≺

Figure 1. A violent adiabatic process connecting equilibrium states X and Y.

MAY 1998 NOTICES OF THE AMS 573 lieb.qxp 4/9/98 11:10 AM Page 574

In this case we say that we can go from X to Y by The last line is especially noteworthy. It says that an irreversible adiabatic process. If X Y and entropy must increase in an irreversible adiabatic ≺ Y X (i.e., X and Y are connected by a reversible process. ≺ adiabatic process), we say that X and Y are adia- The additivity of entropy in compound systems batically equivalent and write is often just taken for granted, but it is one of the A startling conclusions of thermodynamics. First of X Y. ∼ all, the content of additivity, (2), is considerably A Equivalence classes under are called adiabats. more far-reaching than one might think from the ∼ simplicity of the notation. Consider four states, 4. Comparability: Given two states X and Y in two X,X ,Y,Y , and suppose that X Y and X Y . 0 0 ≺ 0 ≺ 0 (same or different) state-spaces, we say that One of our axioms, A3, will be that then they are comparable if X Y or Y X (or (X,X ) (Y,Y ), and (2) contains nothing new or ≺ ≺ 0 ≺ 0 both). This turns out to be a crucial notion. Two exciting. On the other hand, the compound system states are not always comparable; a necessary can well have an adiabatic process in which condition is that they have the same material (X,X ) (Y,Y ) but X Y. In this case, (2) conveys 0 ≺ 0 6≺ composition in terms of the chemical elements. much information. Indeed, by monotonicity there Example: Since is H2O and the atomic will be many cases of this kind, because the in- weights of hydrogen and oxygen are 1 and 16 equality S(X)+S(X ) S(Y )+S(Y ) certainly does 0 ≤ 0 respectively, the states in the compound sys- not imply that S(X) S(Y ). The fact that the in- ≤ tem of 2 grams of hydrogen and 16 grams of equality S(X)+S(X ) S(Y )+S(Y ) tells us exactly 0 ≤ 0 oxygen are comparable with states in a system which adiabatic processes are allowed in the com- consisting of 18 grams of water (but not with pound system (among comparable states), inde- 11 grams of water or 18 grams of oxygen). pendent of any detailed knowledge of the manner Actually, the classification of states into various in which the two systems interact, is astonishing state-spaces is done mainly for conceptual conve- and is at the heart of thermodynamics. The second nience. The second law deals only with states, and reason that (2) is startling is this: From (1) alone, the only thing we really have to know about any restricted to one system, the function S can be re- two of them is whether or not they are compara- placed by 29S and still do its job, i.e., satisfy (1). ble. Given the relation for all possible states of However, (2) says that it is possible to calibrate the ≺ all possible systems, we can ask whether this re- of all systems (i.e., simultaneously adjust lation can be encoded in an entropy function ac- all the undetermined multiplicative constants) so that the entropy S1,2 for a compound 1 2 is cording to the following: × Entropy principle: There is a real-valued func- S1,2(X,Y)=S1(X)+S2(Y ), even though systems 1 and 2 are totally unrelated! Γ Γ tion on all states of all systems (including com- We are now ready to ask some basic questions. pound systems) called entropy, denoted by S, such Q1: Which properties of the relation ensure that ≺ existence and (essential) uniqueness of S? a) Monotonicity: When X and Y are compara- Q2: Can these properties be derived from sim- ble states, then ple physical premises? (1) X Y if and only if S(X) S(Y ) . Q3: Which convexity and smoothness properties ≺ ≤ of S follow from the premises? b) Additivity and extensivity: If X and Y are Q4: Can temperature (and hence an ordering of states of some (possibly different) systems and states by “hotness” and “coldness”) be defined if (X,Y) denotes the corresponding state in the from S, and what are its properties? compound system, then the entropy is additive The answer to question Q1 can be given in the for these states; i.e., form of six axioms that are reasonable, simple, “ob- (2) S(X,Y)=S(X)+S(Y ) . vious”, and unexceptionable. An additional, crucial assumption is also needed, but we call it a hy- S is also extensive; i.e., for each λ>0 and pothesis instead of an axiom because we show each state X and its scaled copy λX (λ) (de- later how it can be derived from some other axioms, ∈ fined in 2, above) thereby answering question Q2. A Γ A1. Reflexivity. X X. (3) S(λX)=λS(X) . ∼ A2. Transitivity. If X Y and Y Z, then ≺ ≺ A formulation logically equivalent to (a), not X Z. ≺ using the word “comparable”, is the following pair A3. Consistency. If X X and Y Y , then ≺ 0 ≺ 0 of statements: (X,Y) (X ,Y ). ≺ 0 0 A4. Scaling Invariance. If λ>0 and X Y, A ≺ X Y = S(X)=S(Y ) and then λX λY. (4) ∼ ⇒ ≺ A X Y = S(X)

574 NOTICES OF THE AMS VOLUME 45, NUMBER 5 lieb.qxp 4/9/98 11:10 AM Page 575

((1 λ)X,λX) for all 0 <λ<1. Note that the state- such points do not exist, then S is the constant − spaces are not the same on both sides. If X , function.) Then define for X ∈ then the state-space on the right side is ∈ (1 λ) (λ) − . Γ (9) S(X) := sup λ : ((1 λ)X0Γ,λX1) X . × { − ≺ } A6. Stability. If (X,εZ0) (Y,εZ1) for some Z0, ≺ ΓZ1, and Γa sequence of ε’s tending to zero, then Remarks: As in axiom A5, two state-spaces are A X Y. This axiom is a substitute for continuity, involved in (9). By axiom A5, X ((1 λ)X,λX), ≺ ∼ − which we cannot assume because there is no topol- and hence, by CH in the space (1 λ) (λ), X is − × ogy yet. It says that “a grain of dust cannot influ- comparable to ((1 λ)X0,λX1). In (9) we allow − ence the set of adiabatic processes”. λ 0 and λ 1 by using the ΓconventionΓ that ≤ ≥ An important lemma is that (A1)–(A6) imply the (X, Y ) Z means that X (Y,Z) and − ≺ ≺ cancellation law, which is used in many proofs. It (X,0Y )=X. For (9) we need to know only that CH says that for any three states X,Y,Z holds in twofold scaled products of with itself. CH will then automatically be true for all products. (5) (X,Z) (Y,Z)= X Y. ≺ ⇒ ≺ Γ The next concept plays a key role in our treat- In (9) the reference points X0,X1 are fixed and the ment. supremum is over λ. One can ask how S changes CH. Definition: We say that the Comparison Hy- if we change the two points X0,X1. The answer is that the change is affine; i.e., S(X) aS(X)+B, pothesis (CH) holds for a state-space if all pairs → of states in are comparable. with a>0. Note that A3, A4, and A5 automaticallyΓ extend Theorem 1 extends to products of multiple- comparabilityΓ from a space to certain other cases; scaled copies of different systems, i.e., to general e.g., X ((1 λ)Y,λZ) for all 0 λ 1 if X Y compound systems. This extension is an immedi- ≺ − ≤ ≤ ≺ and X Z. On the other hand,Γ comparability on ate consequence of the following theorem, which ≺ alone does not allow us to conclude that X is com- is proved by applying Theorem 1 to the product parable to ((1 λ)Y,λZ) if X Y but Z X. For of the system under consideration with some stan- − ≺ ≺ this,Γ one needs CH on the product space dard reference system. (1 λ) (λ), which is not implied by CH on . − × Theorem 2 (Consistent entropy scales). Assume The significance of A1–A6 and CH is borne out that CH holds for all compound systems. For each byΓ the followingΓ theorem: Γ system let S be some definite entropy function Theorem 1 (Equivalence of entropy and A1–A6, on in the sense of Theorem 1. Then there are con- given CH). The following are equivalent for a state- stants aΓ and BΓ ( ) such that the function S, defined space : for Γall states of all systems by Γ Γ i) The relation between states in (possibly dif- Γ ≺ (10) S(X)=a S (X)+B( ) ferent) multiple-scaled copies of , e.g., (λ1) (λ2) (λN ), is characterized by an en- Γ Γ × ×···× Γ tropy function, S, on in the sense that Γ Γ Γ Γ (6) (λ1X1,λ2X2,...Γ) (λ0 X0 ,λ0 X0 ,...) ≺ 1 1 2 2 is equivalent to the condition that

(7) λiS(Xi) λ0 S(X0 ) ≤ j j Xi Xj whenever

(8) λi = λj0 . Xi Xj ii) The relation satisfies conditions (A1)–(A6), and ≺ (CH) holds for every multiple-scaled copy of . This entropy function on is unique up to affine equivalence; i.e., S(X) aS(X)+B, with a>Γ0. → Γ That (i) = (ii) is obvious. The proof of (ii) = ⇒ ⇒ (i) is carried out by an explicit construction of the entropy function on , reminiscent of an old def- inition of heat by Laplace and Lavoisier in terms of the amount of ice thatΓ a body can melt. Figure 2. The entropy of X is determined by the largest amount Basic Construction of S (Figure 2): Pick two ref- of X1 that can be transformed adiabatically into X, with the erence points X0 and X1 in with X0 X1 . (If ≺≺ help of X0. Γ MAY 1998 NOTICES OF THE AMS 575 lieb.qxp 4/9/98 11:11 AM Page 576

by assumption for all state-spaces. We, in contrast, would like to derive CH from something that we consider more basic. Two ingredients will be needed: the analysis of certain special but com- monplace systems called “simple systems” and some assumptions about thermal contact (the “ze- roth law”) that will act as a kind of glue holding the parts of a compound system in harmony with each other. The simple systems are the building blocks of thermodynamics; all systems we con- sider are compounds of them.

Simple Systems A Simple System is one whose state-space can be identified with some open convex subset of some Rn+1 with a distinguished coordinate de- noted by U, called the energy, and additional co- ordinates V Rn, called work coordinates. The ∈ energy coordinate is the way in which thermody- namics makes contact with mechanics, where the concept of energy arises and is precisely defined. The fact that the amount of energy in a state is in- dependent of the manner in which the state was arrived at is, in reality, the first law of thermody- Figure 3. The coordinates U and V of a simple system. The namics. A typical (and often the only) work coor- state-space (bounded by dashed line) and the forward sector dinate is the volume of a fluid or gas (controlled AX (shaded) of a state X are convex, by axiom A7. The by a piston); other examples are co- boundary of AX (full line) is an adiabat. ordinates of a solid or magnetization of a para- magnetic substance. for X , satisfies additivity (2), extensivity (3), and Our goal is to show, with the addition of a few ∈ monotonicity (1) in the sense that whenever X and more axioms, that CH holds for simple systems and Y are in Γthe same state-space, then their scaled products. In the process we will in- troduce more structure, which will capture the in- (11) X Y if and only if S(X) S(Y ). ≺ ≤ tuitive notions of thermodynamics; thermal equi- Theorem 2 is what we need, except for the ques- librium is one. tion of mixing and chemical reactions, which is First, there is an axiom about convexity: treated at the end and which can be put aside at A7. Convex combination. If X and Y are states a first reading. In other words, as long as we do of a simple system and t [0, 1], then not consider adiabatic processes in which systems ∈ are converted into each other (e.g., a compound sys- (tX,(1 t)Y ) tX +(1 t)Y, tem consisting of a vessel of hydrogen and a ves- − ≺ − sel of oxygen is converted into a vessel of water), in the sense of ordinary convex addition of points n+1 the entropy principle has been verified. If that is in R . A straightforward consequence of this so, what remains to be done? the reader may jus- axiom (and A5) is that the forward sectors (Fig- tifiably ask. The answer is twofold: First, Theorem ure 3) 2 requires that CH hold for all systems, and we are (12) AX := Y : X Y not content to take this as an axiom. Second, im- { ∈ ≺ } of states X in a simple system are convex sets. portant notions of thermodynamics such as “ther- Γ mal equilibrium” (which will eventually lead to a Another consequence is a connection between precise definition of temperature) have not ap- the existence of irreversibleΓ processes and peared so far. We shall see that these two points Carathéodory’s principle [3, 1] mentioned above. (i.e., thermal equilibrium and CH) are not unrelated. As for CH, other authors—[6], [2], [4], and [9]— Lemma 1. Assume (A1)–(A7) for Rn+1 and con- ⊂ essentially postulate that it holds for all systems sider the following statements: by making it axiomatic that comparable states fall Γ a) Existence of irreversible processes: For every into equivalence classes. (This means that the con- X there is a Y with X Y. ditions X Z and Y Z always imply that X and ∈ ∈ ≺≺ ≺ ≺ Y are comparable; likewise, they must be compa- b) Carathéodory’s principle: In every neighborhood Γ Γ rable if Z X and Z Y). By identifying a state- of every X there is a Z with X Z. ≺ ≺ ∈ ∈ 6≺ space with an equivalence class, the comparison Then (a) = (b) always. If the forward sectors in ⇒ hypothesis then holds in these other approaches have interiorΓ points, then (b)Γ = (a). ⇒ Γ 576 NOTICES OF THE AMS VOLUME 45, NUMBER 5 lieb.qxp 4/9/98 11:11 AM Page 577

We need three more axioms for simple systems, convention we choose the direction of the energy which will take us into an analytic detour. The axis so that the energy always increases in adiabatic first of these establishes (a) above. processes at fixed work coordinates. When tem- perature is defined later, this will imply that tem- A8. Irreversibility. For each X there is a ∈ perature is always positive. point Y such that X Y. (This axiom is im- ∈ ≺≺ Theorem 3 implies that Y is on the boundary plied by A14, below, but is stated hereΓ separately of AX if and only if X is on the boundary of AY. A because importantΓ conclusions can be drawn from Thus the adiabats, i.e., the equivalence classes, ∼ it alone.) consist of these boundaries. A9. Lipschitz tangent planes. For each X the Before leaving the subject of simple systems let ∈ forward sector AX = Y : X Y has a unique us remark on the connection with Carathéodory’s { ∈ ≺ } support plane at X (i.e., AX has a tangent planeΓ at development. The point of contact is the fact that X). The tangent plane is assumedΓ to be a locally X ∂AX. We assume that AX is convex and use ∈ Lipschitz continuous function of X, in the sense ex- transitivity and Lipschitz continuity to arrive even- plained below. A10. Connectedness of the boundary. The boundary ∂AX (relative to the open set ) of every forward sector AX is connected. (This is tech- ⊂ nical and conceivably can be replaced by Γsomething else.) Γ Axiom A8 plus Lemma 1 asserts that every X lies on the boundary ∂AX of its forward sector. Al- though axiom A9 asserts that the convex set AX has a true tangent at X only, it is an easy conse- quence of axiom A2 that AX has a true tangent everywhere on its boundary. To say that this tan- gent plane is locally Lipschitz continuous means that if X =(U0,V0), then this plane is given by

0 n 0 (13) U U + Pi(X)(Vi V )=0 − 1 − i X with locally Lipschitz continuous functions Pi. The function Pi is called the generalized pressure con- jugate to the work coordinate Vi. (When Vi is the volume, Pi is the ordinary pressure.) Lipschitz continuity and connectedness are well known to guarantee that the coupled differential equations ∂U (14) (V)= Pj (U(V),V) for j =1,...,n ∂Vj − not only have a (since we know that the surface ∂AX exists) but this solution must be unique. Thus, if Y ∂AX, then X ∂AY. In short, ∈ ∈ the surfaces ∂AX foliate the state-space . What is less obvious but very important because it in- stantly gives us the comparison hypothesisΓ for is the following. Γ Theorem 3 (Forward sectors are nested). If AX and AY are two forward sectors in the state-space of a simple system, then exactly one of the fol- lowing holds. Γ A a) AX = AY; i.e., X Y. ∼ b) AX Interior (AY ); i.e., Y X. ⊂ ≺≺ c) AY Interior (AX ); i.e., X Y. ⊂ ≺≺ It can also be shown from our axioms that the Figure 4. The forward sectors of a simple system are orientation of forward sectors with respect to the nested. The bottom figure shows what could, in principle, energy axis is the same for all simple systems. By go wrong but does not.

MAY 1998 NOTICES OF THE AMS 577 lieb.qxp 4/9/98 11:11 AM Page 578

tually at Theorem 3. Carathéodory uses Frobe- manently connected) then behaves like a simple nius’s theorem plus assumptions about differen- system (with one energy coordinate) but with sev- tiability to conclude the existence locally of a sur- eral work coordinates (the union of the two work face containing X. Important global information, coordinates). Thus, if we start initially with such as Theorem 3, is then not easy to obtain with- X1 =(U1,V1) for system 1 and X2 =(U2,V2) for out further assumptions, as discussed, e.g., in [1]. system 2 and if we end up with X =(U,V1,V2) for the new system, we can say that (X1,X2) X. This ≺ Thermal Contact holds for every choice of U1 and U2 whose sum Thermal contact and the zeroth law entail the is U. Moreover, after thermal equilibrium is very special assumptions about that we men- reached, the two systems can be disconnected, if ≺ tioned earlier. It will enable us to establish CH for we wish, to once more form a compound system, products of several systems and thereby show, whose component parts we say are in thermal via Theorem 2, that entropy exists and is additive. equilibrium. That this is transitive is the zeroth law. Although we have established CH for a simple sys- Thus, we cannot only make compound systems tem, , we have not yet established CH even for a consisting of independent subsystems (which can product of two copies of . This is needed in the interact, but separate again), we can also make a definitionΓ of S given in (9). The S in (9) is deter- new simple system out of two simple systems. To mined up to an affine shift,Γ and we want to be able do this an energy coordinate has to disappear, to calibrate the entropies (i.e., adjust the multi- and thermal contact does this for us. All of this is plicative and additive constants) of all systems so formalized in the following three axioms. that they work together to form a global S satis- A11. Thermal contact. For any two simple sys- fying the entropy principle. We need five more ax- tems with state-spaces 1 and 2 there is another ioms. They might look a bit abstract, so a few simple system, called the thermal join of 1 and 2, words of introduction might be helpful. with state-space Γ Γ In order to relate systems to each other in the Γ Γ hope of establishing CH for compounds and 12 = (U,V1,V2):U = U1 + U2 (15) { thereby an additive entropy function, some way with (U1,V1) 1, (U2,V2) 2 . must be found to put them into contact with each ∆ ∈ ∈ } other. Heuristically we imagine two simple sys- Moreover, Γ Γ tems (the same or different) side by side and fix 1 2 ((U1,V1), (U2,V2)) the work coordinates (e.g., the volume) of each. Con- (16) × 3 (U1 + U2,V1,V2) 12. nect them with a “copper thread”, and wait for equi- Γ Γ ≺ ∈ librium to be established. The total energy U will A12. Thermal splitting. For ∆any point not change, but the individual U1 and U2 (U,V1,V2) 12 there is at least one pair of states, will adjust to values that depend on U and the work ∈ (U1,V1) 1 , (U2,V2)) 2 , with U = U1 + U2 , coordinates. This new system (with the thread per- ∈ ∈ such that ∆

Γ A Γ (17) (U,V1,V2) ((U1,V1), (U2,V2)). ∼ A If (U,V1,V2) ((U1,V1), (U2,V2)), we say that the ∼ states X =(U1,V1) and Y =(U2,V2)) are in thermal equilibrium and write T X Y. ∼ T A13. Zeroth law of thermodynamics. If X Y T T ∼ and if Y Z, then X Z. ∼ ∼ A11 and A12 together say that for each choice of the individual work coordinates there is a way to divide up the energy U between the two systems in a stable manner. A12 is the stability statement, for it says that joining is reversible; i.e., once the equilibrium has been established, one can cut the copper thread and retrieve the two systems back again, but with a special partition of the energies. This reversibility allows us to think of the ther- mal join, which is a simple system in its own right, as a special subset of the product system 1 2, × which we call the thermal diagonal. In particular, T Figure 5. Transversality, A14, requires that each X have points A12 allows us to prove easily that X λXΓfor Γall ∼ on each side of its adiabat that are in thermal equilibrium. X and all λ>0.

578 NOTICES OF THE AMS VOLUME 45, NUMBER 5 lieb.qxp 4/9/98 11:11 AM Page 579

A13 is the famous zeroth law, which says that we have established S within the context of one the thermal equilibrium is transitive and hence an system and copies of the system, i.e., condition (ii) equivalence relation. Often this law is taken to of Theorem 1. As long as we stay within such a mean that the equivalence classes can be labeled group of systems there is no way to determine the by an “empirical” temperature, but we do not want unknown multiplicative or additive entropy con- to mention temperature at all at this point. It will stants. The next task is to show that the multi- appear later. plicative constants can be adjusted to give a uni- Two more axioms are needed. versal entropy valid for copies of different systems, A14 requires that for every adiabat (i.e., an A i.e., to establish the hypothesis of Theorem 2. This equivalence class w.r.t. ) there exists at least one ∼ T is based on the following. isotherm (i.e., an equivalence class w.r.t. ) con- ∼ taining points on both sides of the adiabat. Note Lemma 2 (Existence of calibrators). If 1 and 2 that, for each given X, only two points in the en- are simple systems, then there exist states tire state-space are required to have the stated X0,X1 1 and Y0,Y1 2 such that Γ Γ property. This assumption essentially prevents a ∈ ∈ state-space fromΓ breaking up into two pieces that Γ X0 X1 andΓ Y0 Y1 ≺≺ ≺≺ do not communicate with each other. Without it, counterexamples to CH for compound systems and A can be constructed. A14 implies A8, but we listed (X0,Y1) (X1,Y0). ∼ A8 separately in order not to confuse the discus- sion of simple systems with thermal equilibrium. The significance of Lemma 2 is that it allows us A15 is technical and perhaps can be eliminated. to fix the multiplicative constants by the condition

Its physical motivation is that a sufficiently large (18) S1(X0)+S2(Y1)=S1(X1)+S2(Y0). copy of a system can act as a heat bath for other systems. When temperature is introduced later, The proof of Lemma 2 is complicated and re- A15 will have the meaning that all systems have ally uses all the axioms A1 to A14. With its aid we the same temperature range. This postulate is arrive at our chief goal, which is CH for compound needed if we want to be able to bring every sys- systems. tem into thermal equilibrium with every other sys- Theorem 4 (Entropy principle in products of tem. simple systems). The comparison hypothesis CH A14. Transversality. If is the state-space of is valid in arbitrary scaled products of simple sys- a simple system and if X , then there exist T ∈ tems. Hence, by Theorem 2, the relation among states X0 X1 with X0 ΓX X1. ≺ ∼ ≺≺ ≺≺ states in such state-spaces is characterized by an A15. Universal temperatureΓ range. If and 1 2 entropy function S. The entropy function is unique, are state-spaces of simple systems, then, for every up to an overall multiplicative constant and one ad- X 1 and every V belonging to the projectionΓ Γof ∈ ditive constant for each simple system under con- 2 onto the space of its work coordinates, there is T sideration. a Y Γ 2 with work coordinates V such that X Y. ∈ ∼ Γ The reader should note that the concept “ther- At last we are ready to define temperature. Con- mal contact”Γ has appeared, but not temperature cavity of S (implied by A7), Lipschitz continuity of or hot and cold or anything resembling the Clau- the pressure, and the transversality condition, to- sius or Kelvin-Planck formulations of the second gether with some real analysis, play key roles in law. Nevertheless, we come to the main achieve- the following, which answers questions Q3 and Q4 ment of our approach: With these axioms we can posed at the beginning. establish CH for products of simple systems (each of which satisfies CH, as we already know). First, Theorem 5 (Entropy defines temperature). The the thermal join establishes CH for the (scaled) entropy S is a concave and continuously differen- product of a simple system with itself. The basic tiable function on the state-space of a simple sys- idea here is that the points in the product that lie tem. If the function T is defined by on the thermal diagonal are comparable, since points in a simple system are comparable. In par- 1 ∂S ticular, with X,X0,X1 as in A14, the states (19) := , T ∂U V ((1 λ)X0,λX1) and ((1 λ)X,λX) can be re- − − T garded as states of the same simple system and then T>0 and T characterizes the relation in T ∼ are therefore comparable. This is the key point the sense that X Y if and only if T (X)=T (Y ). ∼ needed for the construction of S, according to (9). Moreover, if two systems are brought into thermal The importance of transversality is thus brought contact with fixed work coordinates, then, since the into focus. total entropy cannot decrease, the energy flows With some more work we can establish CH for from the system with the higher T to the system with multiple-scaled copies of a simple system. Thus, the lower T.

MAY 1998 NOTICES OF THE AMS 579 lieb.qxp 4/9/98 11:11 AM Page 580

The temperature need not be a strictly monot- pound system is the same at the beginning and at one function of U; indeed, it is not so in a “multi- the end of the process. region”. It follows that T is not always ca- The task is to find constants B( ), one for each pable of specifying a state, and this fact can cause state-space , in such a way that the entropy de- some pain in traditional discussions of the second fined by Γ law if it is recognized, which usually it is not. Γ (21) S(X):=S (X)+B( ) for X ∈ Mixing and Chemical Reactions satisfies Γ Γ Γ The core results of our analysis have now been (22) S(X) S(Y ) presented, and readers satisfied with the entropy ≤ principle in the form of Theorem 4 may wish to whenever stop at this point. Nevertheless, a nagging doubt X Y with X ,Y 0. will occur to some, because there are important adi- ≺ ∈ ∈ abatic processes in which systems are not con- Moreover, we require that theΓ newly Γdefined en- served, and these processes are not yet covered in tropy satisfy scaling and additivity under compo- the theory. A critical study of the usual textbook sition. Since the initial entropies S (X) already sat- treatments should convince the reader that this isfy them, these requirements become conditions subject is not easy, but in view of the manifold ap- on the additive constants B( ): Γ plications of thermodynamics to and bi- (λ1) (λ2) (23) B( )=λ1B( 1)+λ2B( 2) ology it is important to tell the whole story and not 1 × 2 Γ ignore such processes. for all state-spaces 1 , 2 under consideration and One can formulate the problem as the deter- Γ Γ Γ Γ λ1,λ2 > 0. Some reflection shows us that consis- mination of the additive constants B( ) of Theo- tency in the definitionΓ ofΓ the entropy constants B( ) rem 2. As long as we consider only adiabatic requires us to consider all possible chains of adi- processes that preserve the amount ofΓ each sim- abatic processes leading from one space to an-Γ ple system (i.e., such that Eqs. (6) and (8) hold), other via intermediate steps. Moreover, the addi- these constants are indeterminate. This is no longer tivity requirement leads us to allow the use of a the case, however, if we consider mixing processes “catalyst” in these processes, i.e., an auxiliary sys- and chemical reactions (which are not really dif- tem that is recovered at the end, although a state ferent, as far as thermodynamics is concerned). It change within this system might take place. With then becomes a nontrivial question whether the ad- this in mind we define quantities F( , 0) that in- ditive constants can be chosen in such a way that corporate the entropy differences in all such chains the entropy principle holds. Oddly, this determi- leading from to 0. These are built upΓ Γ from sim- nation turns out to be far more complex math- pler quantities D( , 0), which measure the entropy ematically and physically than the determination differences inΓ one-stepΓ processes, and E( , 0), of the multiplicative constants (Theorem 2). In tra- where the catalystΓ isΓ absent. The precise definitions ditional treatments one usually resorts to gedanken are as follows. First, Γ Γ experiments involving strange, nonexistent ob- jects called “semipermeable membranes” and “van D( , 0) := inf S (Y ) S (X):X , (24) { 0 − ∈ t’Hofft boxes”. We present here a general and rig- Γ YΓ 0,X Y . orous approach which avoids all this. Γ Γ ∈ ≺ Γ } If there is no adiabatic process leading from to What we already know is that every system has Γ , we put D( , )= . Next, for any given and a well-defined entropy function—e.g., for each 0 0 ∞ there is S —and we know from Theorem 2 that the 0, we consider all finite chains of state-spacesΓ Γ = 1, 2,...,ΓNΓ= such that D( i, i+1) < Γ for multiplicative constants a can be determined inΓ 0 ∞ allΓ i, and we define such a wayΓ that the sum of the entropies increases Γ Γ Γ Γ Γ Γ Γ Γ in any adiabatic process in any compound space (25) E( , 0) := inf D( 1, 2)+ + D( N 1, N ) , { ··· − } 1 2 ... . Thus, if Xi i and Yi i, then × × ∈ ∈ where the infimum is taken over all such chains Γ Γ Γ Γ Γ Γ linking with 0. Finally we define Γ Γ (X1,X2,...) (Y1,Y2Γ,...) if andΓ only if ≺ (20) (26) F( , 0) := inf E( 0, 0 0) , Si(Xi) Sj (Yj ), Γ Γ ≤ { × × } Xi Xj where the infimum is taken over all state-spaces Γ Γ Γ Γ Γ Γ . (These are the catalysts.) where we have denoted S by S for short. The ad- 0 i i The importance of the F’s for the determination ditive entropy constants do not matter here, since ofΓ the additive constants is made clear in the fol- each function S appears Γon both sides of this in- i lowing theorem: equality. It is important to note that this applies even to processes that, in intermediate steps, take Theorem 6 (Constant entropy differences). If one system into another, provided the total com- and 0 are two state-spaces, then for any two states Γ Γ 580 NOTICES OF THE AMS VOLUME 45, NUMBER 5 lieb.qxp 4/9/98 11:11 AM Page 581

X and Y 0 might hold for some state-spaces, even if both ∈ ∈ sides are finite. X Y if and only if In nature only states containing the same Γ Γ≺ (27) amount of the chemical elements can be trans- S (X)+F( , 0) S 0 (Y ). ≤ formed into each other. Hence F( , 0)=+ for An essential ingredient for the proof of this theo- ∞ Γ Γ Γ Γ many pairs of state-spaces, in particular, for those rem is Eq. (20). that contain different amounts of someΓ Γ chemical According to Theorem 6 the determination of element. The constants B( ) are, therefore, never the entropy constants B( ) amounts to satisfying unique: For each equivalence class of state-spaces the inequalities (with respect to the relationΓ of connectedness) one Γ can define a constant that is arbitrary except for (28) F( 0, ) B( ) B( 0) F( , 0) − ≤ − ≤ the proviso that the constants should be additive together with the linearity condition (23). It is clear and extensive under composition and scaling of Γ Γ Γ Γ Γ Γ that (28) can only be satisfied with finite constants systems. In our world there are 92 chemical ele- B( ) and B( ) if F( , ) > . To exclude the ments (or, strictly speaking, a somewhat larger 0 0 −∞ pathological case F( , )= , we introduce our number, N, since one should count different iso- 0 −∞ lastΓ axiom, A16,Γ whoseΓ Γstatement requires the fol- topes as different elements), and this leaves us with lowing definition. Γ Γ at least 92 free constants that specify the entropy of one gram of each of the chemical elements in Definition. A state-space is said to be connected some specific state. to another state-space if there are states X 0 ∈ The other possible source of nonuniqueness, a and Y , and state-spacesΓ 1,..., N with states ∈ 0 nontrivial gap (30) for systems with the same com- Xi,Yi i, i =1,...,N, Γand a state-space 0 withΓ position in terms of the chemical elements, is, as ∈ states X0Γ,Y0 0, such that Γ Γ far as we know, not realized in nature. (Note that ∈ Γ Γ this assertion can be tested experimentally with- (X,X0) Y1,Xi Yi+1,i=1,...,N 1, ≺ Γ ≺ − out invoking semipermeable membranes.) Hence, XN (Y,Y0). once the entropy constants for the chemical ele- ≺ ments have been fixed and a temperature unit has A16. Absence of sinks: If is connected to 0, been chosen (to fix the multiplicative constants), then 0 is connected to . the universal entropy is completely fixed. This axiom excludes F( , Γ)= because, onΓ 0 −∞ We are indebted to many people for helpful dis- generalΓ grounds, one alwaysΓ has cussions, including Fred Almgren, Thor Bak, Γ Γ Bernard Baumgartner, Pierluigi Contucci, Roy Jack- (29) F( 0, ) F( , 0). − ≤ son, Anthony Knapp, Martin Kruskal, Mary Beth Hence F( , )= (which means, in particular, Ruskai, and Jan Philip Solovej. 0 −∞Γ Γ Γ Γ that is connected to ) would imply F( , )= , 0 0 ∞ References i.e., that thereΓ Γ is no way back from 0 to . This is [1] J. B. BOYLING, An axiomatic approach to classical ther- excludedΓ by axiom 16.Γ Γ Γ modynamics, Proc. Roy. Soc. London A329 (1972), The quantities F( , 0) have simpleΓ subadditiv-Γ 35–70. ity properties that allow us to use the Hahn-Banach [2] H. A. BUCHDAHL, The concepts of classical thermody- theorem to satisfy theΓ Γ inequalities (28), with con- namics, Cambridge Univ. Press, Cambridge, 1966. stants B( ) that depend linearly on , in the sense [3] C. CARATHÉODORY, Untersuchung über die Grundlagen of Eq. (23). Hence we arrive at der Thermodynamik, Math. Ann. 67 (1909), 355–386. Γ Γ [4] J. L. B. COOPER, The foundations of thermodynamics, Theorem 7 (Universal entropy). The additive en- J. Math. Anal. Appl. 17 (1967), 172–193. tropy constants of all systems can be calibrated in [5] J. J. DUISTERMAAT, Energy and entropy as real mor- such a way that the entropy is additive and exten- phisms for addition and order, Synthese 18 (1968), sive and X Y implies S(X) S(Y ), even when X 327–393. ≺ ≤ [6] R. GILES, Mathematical foundations of thermody- and Y do not belong to the same state-space. namics, Pergamon, Oxford, 1964. Our final remark concerns the remaining non- [7] E. H. LIEB and J. YNGVASON, The physics and math- uniqueness of the constants B( ). This indetermi- ematics of the second law of thermodynamics, nacy can be traced back to the nonuniqueness of preprint, 1997; Phys. Rep. (to appear); Austin Math. Phys. arch. 97–457; Los Alamos arch. cond- a linear functional lying betweenΓ F( , ) and − 0 mat/9708200. F( , 0) and has two possible sources: one is that [8] M. PLANCK, Über die Begrundung des zweiten Haupt- some pairs of state-spaces and 0 may notΓ beΓ con- satzes der Thermodynamik, Sitzungsber. Preuss. nected;Γ Γ i.e., F( , 0) may be infinite (in which case Akad. Wiss. Phys. Math. Kl. (1926), 453–463. F( 0, ) is also infinite by axiomΓ A16).Γ The other is [9] F. S. ROBERTS and R. D. LUCE, Axiomatic thermody- that there mightΓ Γ be a true gap; i.e., namics and extensive measurement, Synthese 18 Γ Γ (1968), 311–326. (30) F( 0, )