Physics in Proportion

Mark A. Peterson

© 2005 M.A. Peterson

Contents

1 What is physics?
1.1 Proportionality
1.2 Two Kinds of Physics
1.3 Learning Physics
1.4 A Capsule History of Physics

2 Mathematical Tools
2.1 Proportion
2.2 Units
2.3 Data: Straight Line Plots
2.4 Uncertainty in Data
2.5 Dimension
2.6 Position, Time, and (constant) Velocity
2.7 The speed of light, and SI units
2.8 Dimension and Scaling
2.9 Power Laws, and the Logarithm
2.10 Numbers in Geometry are Ratios
2.11 The Trigonometric Functions
2.12 Angular Measures
2.13 Trigonometric functions of special angles
2.14 Small angle approximations

3 Geometrical Optics
3.1 Angular Size
3.2 The Eye
3.3 Binocular Vision and Parallax
3.4 Wide Open Pupils
3.5 The Lens of the Eye
3.6 Refraction
3.7 Focal Length
3.8 Interpreting Relationships
3.9 The focal length of the eye
3.10 Virtual Images
3.11 Thin Lenses
3.12 Object and Image
3.13 Optical Systems
3.13.1 The Magnifying Glass
3.13.2 The Microscope
3.13.3 Two lenses together
3.13.4 The Astronomical Telescope
3.13.5 Galilean Telescope
3.14 Mirrors
3.15 Spherical Aberrations
3.16 Reflection and Refraction
3.17 Fermat’s Principle
3.18 Wavefronts: A Dual Theory of Light

4 Time and Oscillation
4.1 Angular Clocks
4.1.1 The Solar Clock
4.1.2 The Sidereal Clock
4.1.3 Solar vs. Sidereal
4.1.4 Aside on Kepler’s Laws
4.2 Atomic Clocks
4.3 GPS: Global Positioning System
4.4 Longitude
4.5 The Moons of Jupiter
4.6 Period, Frequency and Amplitude
4.7 Velocity in Orbit, Projected
4.8 Pendulums
4.8.1 The Period of a Pendulum
4.9 The Binomial Approximation for Perturbations
4.10 Pendulums and the Rotation of the Earth
4.11 Simple Harmonic Oscillators
4.12 Exponential Decay
4.13 Dating by Radioactive Decay

5 Mass, Weight, and Equilibrium
5.1 Archimedes
5.2 Torque and Force
5.3 Spring Forces: Hooke’s Law
5.4 Weight and Mass
5.5 Springs in Parallel and Series
5.6 Newton’s Third Law
5.7 Young’s Modulus
5.8 The Force Between Atoms

6 Mechanical Energy and Motion
6.1 Gravitational Potential Energy
6.2 Spring Potential Energy
6.3 The Potential Energy of a Pendulum
6.4 Falling, and Kinetic Energy
6.5 Velocity v in falling
6.6 Universal Gravitation
6.7 Energy of an Oscillator
6.8 Oscillators Losing Energy
6.9 A Chemical Bond

7 Vector Quantities
7.1 Projectile Motion
7.2 Vector Addition
7.3 Velocity and Speed
7.4 Galilean Relativity
7.5 Falling and Relativity
7.6 Falling and Impulse
7.7 More on Projectile Motion
7.8 Impulse and Conservation of Momentum
7.9 Impulse and Circular Motion

8 Density and Fluids
8.1 Mass Density
8.2 Archimedes’ Principle
8.3 Galileo’s Balance
8.4 Galileo’s Proof of Archimedes’ Principle
8.5 Buoyancy and Pressure
8.6 More on Hydrostatic Pressure
8.7 Atmospheric Pressure
8.8 The Barometer
8.9 Bernoulli’s Principle
8.10 Applications of Bernoulli’s Principle
8.10.1 Force of the wind
8.10.2 Flow Past an Airfoil
8.11 Flow in Pipes
8.11.1 Venturi Flow Meter
8.11.2 Poiseuille Flow
8.11.3 Current Density
8.12 Shear Stress and Viscosity
8.13 Stokes Flow
8.14 Poiseuille Flow Revisited
8.15 The Reynolds Number
8.16 Resistance in Series and Parallel
8.17 The Human Circulatory System
8.18 A Fractal Model of Circulation

9 Temperature, Heat, and Internal Energy
9.1 Temperature
9.2 Thermometers
9.3 The Gas Thermometer
9.4 Avogadro’s Hypothesis
9.5 Heat Capacity
9.6 Molar Heat Capacities
9.7 Statistical Model for Molar Heat Capacity
9.8 Phase Transitions
9.9 Entropy

10 Thermodynamics
10.1 Work
10.2 P ∆V Work
10.3 Various Processes
10.3.1 Adiabatic Process: ∆Q = 0
10.3.2 Isothermal Processes, ∆T = 0
10.3.3 A Constant Pressure Process
10.3.4 Reversible and Irreversible Processes
10.4 Heat Engines
10.4.1 The Carnot Cycle
10.4.2 Refrigerators, Heat Pumps
10.5 Life at Fixed Temperature
10.6 Life at Fixed Temperature and Pressure

11 Statistical Physics
11.1 Ideal Solutions as Ideal Gases
11.2 Statistical Mechanics
11.3 Randomness
11.4 Brownian Motion

12 Waves in One Dimension
12.1 Standing Waves on a String
12.2 Standing Sound Waves in a Pipe
12.3 Wave Speed
12.4 The Speed of Sound in Air
12.5 Sinusoidal Travelling Waves
12.5.1 Doppler Effect with Moving Receiver
12.5.2 Doppler Effect with Moving Source
12.6 Superposition, and the Beat Frequency
12.7 Reflection of Waves
12.8 Reflection and Standing Waves
12.9 Energy Current on a String
12.10 Energy Current Density
12.11 Energy Current Density in Sound
12.12 Energy Current Density in Light
12.13 The Inverse Square Law for Intensity
12.14 Time Averaging: Mean Square Intensity

13 Waves in Two and Three Dimensions
13.1 The Huyghens Construction
13.2 Young’s Experiment
13.3 Single Slit Diffraction
13.4 Waves in Refraction
13.5 Interference Colors
13.6 X-Ray Crystallography
13.7 The Electromagnetic Spectrum
13.8 Young’s Experiment

14 Electric Charge and Potential
14.1 Static Electricity
14.2 Capacitance
14.2.1 Electrostatic Energy U_C
14.2.2 Electrostatic Potential Difference
14.2.3 Capacitors in Series and Parallel
14.3 Units and Values
14.4 Parallel Plate Capacitor
14.4.1 The Parallel Plate Formula for Capacitance
14.4.2 The Parallel Plate Formula for Electric Field
14.4.3 The Parallel Plate Formula for Potential
14.4.4 Motion Between Parallel Capacitor Plates
14.4.5 Oscilloscope and CRT
14.5 Aside on Coulomb’s Law
14.6 Franklin’s Bells

15 Electric Current
15.1 Potential V and Current I
15.2 Ohm’s Law
15.3 Microscopic Form of Ohm’s Law
15.4 Dissipation of Energy in Resistors
15.5 Resistors in Series and Parallel
15.6 Discharging a Capacitor

16 Bioelectricity, Electrochemistry
16.1 Excitable Membranes
16.2 Nerve Axons: Hodgkin-Huxley Theory
16.3 Galvani’s Frogs and Volta’s Piles
16.4 The Daniell Cell
16.5 Cathode and Anode
16.6 Half Cells
16.7 Using Batteries
16.7.1 Batteries in Series
16.7.2 Fuel Cells and Electrolysis
16.7.3 Sir Humphry Davy
16.7.4 The Telegraph

17 Magnetism
17.1 Magnetic Field Lines
17.2 The Magnetic Force on a Moving Charge
17.3 Mass Spectrometer
17.4 Spiralling Along the Field Lines
17.5 The Magnetic Force on a Current I
17.6 Generation of Electric Current
17.7 Faraday’s Law and Relativity
17.8 Generating Alternating Current
17.9 The Magnetic Field due to a Current
17.10 Faraday’s Law, Lenz’s Law, and Self-Induction
17.11 Mutual Inductance and Transformers
17.12 Two Magnetisms?

18 Electromagnetic Waves and Resonance
18.1 Plane Waves and Polarization
18.2 Polarization in Nature
18.3 Scattering and the Index of Refraction
18.4 Light Transmission in Gases
18.5 Why is the Sky Blue?
18.6 Producing EM Waves
18.7 Hertzian Waves
18.8 Resonant Absorption and Emission
18.9 The Blackbody Spectrum

19 Quantum Mechanics
19.1 Quanta
19.2 Einstein and the Heat Capacity of Solids
19.3 Photons
19.4 The Hydrogen Atom
19.5 De Broglie Waves
19.6 The Heisenberg Uncertainty Principle

20 Nuclear Processes
20.1 E = Mc^2
20.2 Atomic Mass
20.3 Beta Decay
20.4 Alpha Decay
20.5 Radiation and the Body
20.5.1 General Considerations
20.5.2 Gamma Rays and X Rays
20.5.3 Beta Radiation
20.5.4 Alpha Radiation
20.6 Neutrons and Fission
20.7 Nuclear Reactors and Artificial Isotopes
20.8 Artificial Isotopes

Foreword

Everyone knows that physics is a rich subject, and most people are quick to express their interest in it. Physics is the foundation of the natural sciences, but it is also central to intellectual history. It has inspired new methods and approaches in the social sciences and in finance. It has created new mathematics. Through interpretations of what it has to say about our place in the natural world, it touches even the humanities and religion.

Physicists themselves, however, take a much more restricted view, at least in introductory textbooks. After a few generalities, they get right to work on the motion of hypothetical point particles, and this may go on for most of a semester, or even a year. A puzzled student learns that what is most interesting in physics is still some distance away, and will come with time. Those who go on in physics ultimately see the sense in this, but the great majority of students are left with an introduction that never actually goes anywhere. Worse, they may have come to see physics as a collection of mathematical formulae to learn for professional school entrance exams, and otherwise to forget. This is a sad transformation.

Perhaps the first course with calculus should get right to work on Newton’s description of the world, although it wouldn’t hurt to explain a little more clearly the logic of this approach. The first course without calculus, though, does not have a clear rationale. Without calculus one cannot even state Newton’s second law. Yet a first physics course that is not founded on Newton’s mechanics is almost unthinkable. The non-calculus course has a problem.

A look at history suggests another way to organize physics, one that might be more meaningful to students, and might even be a truer introduction to physics. One sees, in history, the importance of proportional reasoning from the physics of the Greeks down to the present. This is a more elementary level of mathematics than the calculus course assumes, but it does not misrepresent the field. Quite the contrary, the idea of proportion in Nature is in some ways the essence of physics.

Putting the emphasis on proportion, and related ideas like dimensional analysis, leads to an ordering of topics that is quite different from the standard text, at least initially. It essentially amounts to replacing the Newtonian picture of point particles by continuum objects of geometry. The theory of geometrical optics comes first, an intriguing blend of mathematical theory and familiar experience, pervaded by the notion of proportionality. We take a cue from history in choosing topics and ways of thinking that people understood early, with simple mathematical arguments. Densities of various kinds play a central role. Fluid phenomena are more prominent in this approach than they would be in a standard text. The historically recent formalism of vectors is de-emphasized and replaced, where necessary, by geometrical arguments about horizontal and vertical projections, a way of thinking that was never problematic in the past.

History also furnishes, frequently, a “story line” for topics in physics. Without being strictly chronological, we still find that attention to the story often organizes the subject in an interesting and engaging way. This method frequently changes the order of topics. In electrostatics, for example, it is natural to start with the idea of capacitance rather than Coulomb’s Law. The reader familiar with the standard text will see many examples of such changes.

We have unashamedly tried to make this text a suitable preparation for the physics portion of the MCAT exam. It seems to us that the MCAT, without requiring calculus, is in fact a rather interesting test of physics, and frequently probes real comprehension, and not just ability to calculate or to memorize. Attention to history, to real phenomena, and to continuum descriptions, it seems to us, is the most useful introduction to physics for this exam, and beyond that, for integration into a liberal arts education.

Chapter 1

What is physics?

Most people, including most physicists, have trouble saying what physics is. Dictionaries are not very satisfactory either. They tend to define physics using the word “energy”, for example, a word that has acquired its precise meaning (the one intended here) only gradually, within physics. This says, in effect, that you will know what physics is after you have studied it. But really, it helps to know what a subject is before you study it.

A look at history gives a surprisingly simple answer: physics is the mathematical theory of Nature. In the context of the present this is hard to see, because now every natural science is mathematical, not just physics. When you think about it, though, you realize that the other natural sciences are defined by their subject matter. Biology studies the world of living things. Chemistry studies the world of substances. Geology studies the world itself. But if these sciences have become mathematical, it is because they have imported ideas and methods from physics. The influence of physics has transformed all the natural sciences, and that is a very good reason for studying physics: all sciences now are part physics.

Physics takes all Nature as its subject matter, and in particular Nature’s mathematical structure. This is its defining characteristic. Thus part of what physics studies is not Nature at all, but mathematical models of Nature. Perhaps that is why it is hard to say what physics is.

It may be obvious now that Nature is mathematical, but until recently it was universally held that the world is not mathematical, that it is too complex and chaotic for that. Galileo’s opponents, for example, ridiculed the idea that mathematics had anything to tell us about what the world is really like. Gradually, pioneers like Galileo began to discern simplicities in the complexity, things so simple that we can describe them with mathematics after all.

Physics is the mathematical theory of Nature, but it is not mathematics. It is about real phenomena, in a way that mathematics itself is not. Here too history can guide us. Where do mathematical theories of Nature come from? What phenomena do they describe, and how does the description work? We will pursue these questions by keeping real phenomena at the center of attention, and as mathematics enters the picture, we will take time to interpret it and dig out its meaning. This skill is probably the most useful thing you can gain by studying physics.

1.1 Proportionality

Physics is the mathematical theory of Nature. Very often the mathematics in question will be quite simple, just a statement of proportionality. It is an oversimplification, but a useful one, to say that physics is the study of proportionalities in Nature, both obvious ones and mysterious ones. Proportionality is the thread that will run through this book: it is the mathematics of our mathematical theory.

Proportionality is a relationship. Here is an example using the notions of mass and weight. The words are common English words, and may even seem to mean the same thing, but in physics one makes a careful distinction. What is true is that mass m and weight W are proportional:

$$W \propto m \tag{1.1}$$

or, introducing the constant of proportionality g, which is called the acceleration due to gravity,

$$W = mg \tag{1.2}$$

We want to be able to look at an expression like that and “read” its meaning. One thing it says is that W and m are similar somehow. If m should double, then W would also double: a more massive thing weighs more, by the same factor. Note that we don’t need the numerical value of the constant g to see this. That value is irrelevant to the relationship we are getting at. That is why g is actually omitted in one way of saying it, Eq (1.1).

Are mass and weight really just the same thing, going under different names, with a sort of “conversion factor” g to switch from one name to the other? In that case it would seem redundant to have two names for what is really one concept. It must be that weight and mass really are conceptually different, in some way that we haven’t been told. Then the proportionality between two different things becomes slightly surprising. It says we can measure mass by measuring weight.

In a kind of twist, it will turn out that the “constant” g actually is not strictly constant, but depends on where you are, so it is a constant only if you stay at one definite place. Thus another way to read Eq (1.2) is that weight W is proportional to g at fixed m – now m becomes the constant of proportionality, and we can measure g by measuring W. All these meanings are contained in the simple statement above.
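The readings of Eq (1.2) described above can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `weight` and the sample values of g are assumptions chosen only for illustration.

```python
def weight(m, g):
    """Weight W = m * g, as in Eq (1.2). With m in kg and g in m/s^2, W is in newtons."""
    return m * g

# Illustrative values of g (assumed): one Earth-like, one a sixth of that.
for g in (9.8, 9.8 / 6):
    ratio = weight(2.0, g) / weight(1.0, g)
    # Doubling m doubles W; the ratio does not depend on the value of g.
    print(f"g = {g:.2f}: W(2 kg) / W(1 kg) = {ratio:.1f}")

# At fixed m, W is proportional to g: now m plays the role of the constant.
m = 6.0
w_ratio = weight(m, 9.8 / 6) / weight(m, 9.8)
print(f"W(g/6) / W(g) = {w_ratio:.3f}")
```

The numerical value of g cancels out of the first ratio, which is the sense in which g can be omitted in Eq (1.1).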

Statements like Eq (1.2) are sometimes called “formulas,” with the expectation that we will put numbers in. We could put numbers in, of course, but that is not really what the statement is about, nor what it is for. It is really getting at the relationship between weight and mass, which is both simple and subtle. Standardized tests of physics, like the MCAT, to take an important example, will test your ability to read statements like Eq (1.2). Your ability to put numbers into it is of no interest to anyone. Even if putting in numbers does turn out to be part of the question, it will not be the important part. Students who do not know about reading relationships will waste their time memorizing “formulas,” hoping that when the time comes they will choose the right one and put in the right numbers. This is a losing strategy. Forget numbers! Concentrate on the relationships. That is what you will be tested on. Learn to read the meaning. That is what this book aims to help you do.

It is true that one should work towards a sense for magnitudes, a kind of physical common sense. For this purpose it is important to think about numbers after all, and to see how they are related in formulas, but these will be typical values, and can be very approximate. This may seem a mysterious remark, but we will be doing many rough estimates using approximate values to get a feeling for magnitudes as we go.

Let us look at a typical MCAT question alluding to Eq (1.2):

Gravity on the Moon is only 1/6 that on Earth. A 6 kg mass on Earth is taken to the Moon. What is its mass on the Moon?

(A) 1 kg (B) 9.8 kg (C) 6 kg (D) 1.63 kg

The answer is (C). Answers (B) and (D) might appeal to someone who knows the typical numerical value $g = 9.8\ \mathrm{m\,s^{-2}}$ on Earth, or who might compute g/6. These values are irrelevant, because of their units if nothing else. Answer (A) will appeal to someone who thinks we must divide by 6 because gravity is weaker on the Moon. The correct answer, that the mass m in Eq (1.2) is an invariant quantity, goes to the real meaning of Eq (1.2).

Weight is something we are intuitively familiar with, but closely coupled to it is the much simpler, invariant concept of mass. How did people figure out that there exists a simple quantity mass, related to weight by the proportionality in Eq (1.2)? And what is mass? Eq (1.2) expresses a mathematical relationship, but its real meaning is physical, not mathematical. Our simple example question has led us up to the edge of a real mystery. Mass, as simple and fundamental as it is, is not well understood. The masses of the elementary particles, for example, are a continuing puzzle.

For our purposes we can just notice that this question about mass clearly alludes to Eq (1.2), but not as a formula. You don’t answer the question by putting numbers into Eq (1.2). You answer it by knowing what Eq (1.2) is getting at: the surprising concept of mass.

Looking at Eq (1.2) in this way, one realizes that there is actually a story concealed in what looks like a trivial formula. The story would start with the sense perception of weight, something everyone has a feeling for, and would end with the abstract notion of mass. Understanding this story is in many ways the important thing. It would be enough to answer the MCAT question, for instance.

Although we will not always take an historical approach, it will be a help to our study of physics to know something about how it developed. We therefore include a capsule history at the end of this chapter.

It would be very pleasant to learn physics through stories. The history of physics is in fact full of stories. For each one you have to learn the physics to understand the story, so this could even be a kind of method for learning. If it is a story of a discovery, one cannot help imagining how the discovery actually appeared to the discoverers. In the end, the mathematical representation of the idea will be abstract, but attention to the story means it can never be merely abstract. We will use this method occasionally, and even broaden the definition of physics a little bit, to include its stories, and its connections to history, the arts, and technology.
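Returning to the Moon question in this section, its arithmetic can be laid out in a few lines. This is only a sketch: the variable names are illustrative assumptions, and the values used are the approximate ones quoted in the question and its discussion (g = 9.8 m/s² on Earth, and 1/6 of that on the Moon).

```python
G_EARTH = 9.8          # m/s^2, typical value on Earth
G_MOON = G_EARTH / 6   # m/s^2, "only 1/6 that on Earth"

mass = 6.0             # kg; invariant, the same on Earth and on the Moon

weight_earth = mass * G_EARTH   # newtons
weight_moon = mass * G_MOON     # newtons

print(f"mass on Earth: {mass} kg, mass on the Moon: {mass} kg")
print(f"weight on Earth: {weight_earth:.1f} N")
print(f"weight on the Moon: {weight_moon:.1f} N")
print(f"weight ratio Moon/Earth: {weight_moon / weight_earth:.3f}")
```

Only the weight changes, by the factor 1/6; the mass stays at 6 kg, which is answer (C). The number 1.63 in answer (D) is the Moon’s value of g in m/s², not a mass.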

1.2 Two Kinds of Physics

Another slightly confusing thing about physics as a discipline is that there are two almost contradictory ways that it uses mathematics. In one way of looking at it, physics is endlessly mathematical: it has actually inspired new kinds of mathematics, and continues to do so even today. This kind of physics is associated with higher and higher mathematics, beginning with the invention of calculus in the 17th century and continuing on to developments that are not easy to describe in ordinary language, but are occasional grist for popularizers talking about such things as curved space or uncertainty principles. From this point of view, you can’t have too much mathematics when you are doing physics. The more the better. Experiments have frequently confirmed the results of theories that could never have been conceived without higher mathematics. This in itself is a very intriguing fact, and argues that Nature is deeply and richly mathematical, beyond anything we could have expected.

The way this very mathematical theory is actually used in practice, however, has a very different flavor. Far from solving complex equations exactly, most physicists, and others who use physics, work with rough estimates and quick approximations. They have in mind mental pictures or models that they apply to this or that situation. The models are essentially statements of proportionalities, with pictures that go along with them to make them easier to think about and visualize. The models are grounded in the exact theories of physics, but their usefulness comes not from their exactness but from their ability to approximate situations that are too complicated to model exactly – which is to say, most situations. Physicists pride themselves on being able to do useful computations “on the back of an envelope.” Clearly whatever they do on the back of an envelope can’t be very much! But if it cuts to the essential idea of what is going on, identifying an appropriate model, and drawing some simple conclusion from it, then it is physics of the highest order.

The two kinds of physics reflect two kinds of “simplicity,” since physics always aims to bring out what is simple. Most people would say that the mathematical theories and the elaborate experiments of physics are far from simple, but in some ways this misses the essential point. A mathematical theory, being unambiguous and completely defined, is in some sense very simple. And an experiment that cools a sample to near absolute zero (to get rid of thermal fluctuations), pumps down a vacuum around it (to get rid of extraneous, perturbing matter), mounts everything on shock absorbers (to eliminate unwanted vibrations), etc., looks, and is, elaborate, but at its heart it creates a region of almost unnatural simplicity, in order to discern the simple law that governs what is left. The complicated mathematics, the complicated apparatus, are all meant to isolate something that is simple enough to be comprehensible.

On the other hand, what may seem simple in a superficial sense, like what we see when we walk down the street, is, from the point of view of physics, fantastically, hopelessly complicated. The only way physics can say anything about it is to take a “back of the envelope” approach – ignore the true complexity, and just model the main things, roughly. This remark applies to any situation that is not a precision experiment in physics. To apply physics broadly one must be willing to approximate, literally throwing away the precision that physics claims to have.

Professional physicists learn the first kind of physics first – the highly mathematical kind. That is the tradition. They view the second kind of physics, the “back of the envelope” kind, as a very sophisticated understanding that they acquire only late in their training. We will be aiming, however, at the second kind of physics. We will not spend much class time developing a mathematical formalism. Rather we will often be thinking about real situations and trying to understand them in terms of models that are mathematical, but not overly so. This approach covers a lot of real physics, and could be interpreted as aiming at higher sophistication in place of higher mathematics. It means we can feel free to consider complicated, even impossibly complicated, problems – not with the intent of solving them, but just with the intent of understanding them quantitatively, roughly.

In practice, physics provides models of what is proportional to what. Knowing what the models are, and what the proportionalities are, is real physics for most practical purposes. The designers of the MCAT exam very sensibly look for this kind of understanding, a broad and useful knowledge, not mathematically deep. That is the subject of this book.

1.3 Learning Physics

There is a whole branch of physics (PER: Physics Education Research) devoted to how people learn physics. One intriguing conclusion of PER is that people have innate ideas about how the world works that they have to give up in order to learn physics. That just makes the point once again that the “simple” models of modern physics and the “simplicity” of everyday things are not the same.

It is sometimes said that students begin as Aristotelians. The reference is to Aristotle’s physics, a 2000-year aberration, a dominant idea from Roman times down to the time of Galileo. Aristotle’s physics claims to be common sense about the world, proved by elementary logic and observation. History alone, even without PER, would tell us that Aristotle’s physics must have had a powerful appeal to have lasted so long. It is noteworthy that it is scrupulously non-mathematical, and also that it ultimately produced nothing of value. Despite its apparent basis in observation and logic, nothing in Aristotelian physics survived. It was simply and thoroughly wrong. And yet it is apparently the intuitive idea of the world that we all unconsciously begin with! This puts us on notice that physics, despite its emphasis on simplicity, is psychologically not so simple.

To learn physics, we must sometimes unlearn what seems right, and accept a subtly different idea in its place. This may take some conscious effort to do. The conscious effort takes the form of recalling and recognizing mathematical models, and using them.

The problem of how an object moves, for example, is modeled by Newton’s mechanics. You must translate the situation, as given, into the terms of the model. In this case, you think of the object as made of little point masses, with initial positions and velocities. Then you compute what the model says about how those positions and velocities change in time. This may involve geometry, algebra, or arithmetic. It might all be done in a picture or diagram. There may be quick shortcuts – knowing about these would be part of knowing the mathematical model. If it is very clear what to do, but merely tedious, you could get a computer to help. In the end you translate the computation back into a statement about the actual object. In summary, you substitute the model for any intuitive ideas you may have, and you see what the model says. Lastly, you think about the result, and try to see how it makes sense. Ultimately you would like your intuition to agree with the model, but that takes time to learn.
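The steps just described (translate the situation into the model, compute, translate back) can be sketched in code. The example is hypothetical and not taken from the text: a single point mass dropped from rest, with constant downward acceleration g, stepped forward in small time intervals. The function name `drop_time`, the step size, and the numerical values are all assumptions made for illustration.

```python
def drop_time(height, g=9.8, dt=1e-4):
    """Step a point mass forward in time until it has fallen `height` metres.

    Model (assumed): one point mass, initial velocity zero, constant acceleration g.
    Returns the elapsed time in seconds, approximate because dt is finite.
    """
    fallen = 0.0    # distance fallen so far, in m
    velocity = 0.0  # downward velocity, in m/s
    t = 0.0         # elapsed time, in s
    while fallen < height:
        velocity += g * dt        # constant g: velocity grows in proportion to time
        fallen += velocity * dt   # position changes according to the velocity
        t += dt
    return t

# Translate the situation into the model, compute, and translate back.
t_fall = drop_time(height=5.0)
print(f"A mass dropped from 5 m lands after about {t_fall:.2f} s")
```

This is the “merely tedious” kind of computation that a computer can take over. The last step, deciding whether the answer makes sense, is still left to the reader.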

Whether we can really change our Aristotelian intuition is an interesting question. When physicists talk about their “physical intuition,” it may be, in the end, that they have simply learned to consult the mathematical model more quickly and easily, through practice, which is not quite what we usually mean by “intuition.” For some models, like quantum mechanics, it may fairly be said that nobody claims to have made them intuitive.

On the other hand, physics is not a system of thought that forces you to think in a certain way, and denies you any creativity. Historically it is clear that what we now call physics provided, as people said, a new way to philosophize. They meant using mathematics to understand the world, and called it Natural Philosophy. It is true that learning these methods requires discipline, but in the end you have more to think with, and more to think about: new tools. No one is telling you what to think. But you will want to use these tools effectively.

The usual first course in physics starts with the ideas of Newton. PER first became interesting to physics teachers as it documented just how un- successful the first course often is. A great deal of effort has gone into im- proving the success rate of the introductory course in the sense of raising test scores on PER tests. Another possible interpretation, though, is that perhaps Newton is not the best place to begin. Starting with Newton is especially problematic if one does not use calculus, since that was Newton’s key innovation. Newton’s mechanics is the beginning of that very fascinating interaction between physics and higher mathematics that we have just said we would not emphasize. Acquaintance with the history of science suggests, rather, that the foundational ideas of physics are simpler and go back much further. Newton famously said, “If I have seen farther, it is because I have stood on the shoulders of giants.” For us, starting the study of physics, it 1.4. A CAPSULE HISTORY OF PHYSICS 11 might be worthwhile to ask who those giants were, and what Newton meant by standing on their shoulders. Perhaps we could stand on their shoulders too! This book takes that suggestion seriously. All physical concepts have their roots in simple, concrete phenomena. We have alluded to the familiar notion of weight, for example. There may even be an intuitive, commonsensical way to understand the phenomenon, as in Aristotle’s principle that all earthy matter seeks to go toward the center of the Earth. Why is this not enough? What prompted anyone to take the next step? In this example, Archimedes’ Law of the Lever, showing that there is something profoundly mathematical about weight, played a role that is too important to ignore. History can tell us, if we pay attention to it, which ideas are really important, and worth our time. History also suggests that, in physics, patient contemplation of simple things is rewarded. 
While other sciences seem to progress by accretion, adding more facts and growing larger and more complex, physics, to an amazing degree, returns again and again to its oldest problems and principles, with new eyes and deeper appreciation. In terms of learning physics, it means we can afford to learn simple things, not hurriedly, but thoroughly. You might think that we have already said more than enough about the simple relation W = mg, for example. Many people in 1910 would have agreed with you, just as Einstein was reinterpreting this relation in an astonishing way, soon to become his General Relativity Theory, a completely new and unexpected theory of gravity.

We close this introduction with a capsule history of physics. It is the story in outline, to be reconsidered in detail later.

1.4 A Capsule History of Physics

Physics begins with geometry, and geometry begins with Euclid.[1] Strangely enough, we do not know very much about how this happened. The Hellenistic Greek civilization that produced geometry and physics was centered in Alexandria, Egypt, and – amazingly! – no history of Alexandria survives. We know a lot about Athens in its Golden Age, although that was much earlier. We know Socrates, for example, better than we know the celebrities in today’s newspapers, even though he died in 399 B.C. But of Euclid, who probably wrote around 300 B.C. in Alexandria, we know truly nothing, except for his surviving works, chiefly the Elements. This work was, and remains, the first resource of theoretical physics.

[1] I am much indebted to Lucio Russo’s The Forgotten Revolution for the point of view expressed here about classical civilization.

Euclidean geometry is still a subject everyone studies: it is the theory of points, lines, and circles, starting from a few basic postulates about them. The truth of the theorems of geometry is not in doubt, because all of these theorems are proved, that is, justified by arguments that go back to the postulates. This in itself is not physics, because the points, lines, and circles are purely theoretical objects, not real objects. They obey the postulates by definition. It becomes theoretical physics, however, if we set up some correspondence between real objects and these theoretical objects, if, for example, we suggest that rays of light in Nature behave like straight lines in geometry. Suddenly we have a wealth of predictions about how real light should behave, and the ability to theorize about optical configurations that have never been built before.

The Alexandrian Greeks invented this conception: a mathematical structure and a correspondence to Nature. That is what physics is. In the work of Archimedes we have other examples of such structures and such correspondences. Nearly two thousand years later Galileo Galilei, through study of Archimedes, came to understand his methods and to pick up where Archimedes had left off. Galileo discovered his own examples of such correspondences. In just a few decades more, physics had reached a recognizably modern form.

It is an intriguing question why there is a 2000 year gap in this story. The answer is almost scary, like science fiction.
The centers of Greek science were conquered by the Romans between about 212 B.C. and 46 B.C. With the arrival of the Romans, original work in the sciences came to an end. The tradition was kept alive through reworking of some of the old texts and commentaries, but active understanding and research stopped. The Romans themselves had no clue. In the hundreds of years that they were in possession of Greek cities, they never even translated Euclid into Latin, for example. That was left to the Europeans of the Middle Ages, eager to recover the knowledge of the classical past, and they translated, at least initially, from Arabic. Why there was not an Arabic language counterpart of Galileo is a particularly intriguing question, with repercussions that we are still feeling. Islamic science inherited Greek science weirdly mixed with later Roman influence, religion, and astrology. In the West, where the story is better documented, one can see that it took about 500 years to separate the Greek substance from the Roman nonsense. Perhaps that is a clue.

What the early translators found was fragmentary. It was clear from citations in works that had survived – to works otherwise unknown – that much classical knowledge was gone forever. Euclid wrote at least four books on optics, for example, but only one of them survives, and that one is probably the least interesting from the point of view of physics: it is a geometrical theory of human vision, and addresses questions that we would probably not put first, such as why distant objects appear indistinct and why they are eventually lost to sight if they are far enough away. Some brilliant works of Archimedes survived, including his derivation of the Law of the Lever, and his theory of buoyancy, “Archimedes’ Principle.” These are results that are still as fresh as when they were first discovered: they are still a part of modern physics.
One of the last original Greek works from the Eastern Mediterranean, written just before the Romans arrived, was Apollonius’ study of the geometry of the conic sections, the curves that you get when you cut a cone with a plane, including the parabola and the ellipse. We don’t know why he was studying this, but it is startling that both these curves were discovered in the early 1600’s to be the paths of objects in nature, moving under the influence of gravity. That key discovery was only possible because these curves were known from the work of Apollonius, some 1700 years earlier. In fact all the works mentioned here were enormously influential in the Renaissance, the rebirth, of physics.

Two lessons emerge from looking at this early history of physics. First, it isn’t enough to know just the results, the answers. The Romans were fond of curious lore, and they knew the Greek results in a sense, but without understanding them. Their misconceptions are sometimes even comical. Vitruvius, a 1st century Roman author, is aware of Eratosthenes’ determination of the radius of the Earth, for example, although he has no idea how it was done. As he describes it, it emerges that he thinks the Earth is a flat disk, with ourselves at the center of it, and that it is the radius of this disk that has been measured. Roman science, such as it was, is basically the same as the later medieval bestiaries and fabulous travelogues. As far as science goes, the Dark Ages began with the rise of Rome, not with its fall. The Romans had no interest in the intellectual process behind the Greek scientific results. They were only interested in the results. But to have a real understanding, you need to know the process by which the answers were found. Science declined in the Roman period because it wasn’t practiced, only copied, with growing incomprehension.

Second, even though classical knowledge in the custody of the Romans was almost completely lost, it was never created again independently. Rather the surviving bits of Greek science were carefully worked over and became the nucleus of the new physics. This is really quite amazing. You would think that physics, being the mathematical description of the elementary processes all around us, would have been discovered many times over, and in many places. This did not happen, though. It was discovered only once, by the Hellenistic Greeks, and we are all their students.

By the mid 1600’s Greek physics, together with new results of Galileo, Kepler, and others, strongly hinted at a larger mathematical theory of Nature. Optics in particular developed rapidly. Then in 1687 Isaac Newton threw open the door to modern physics, in effect, with a new theory of the motion of objects, and a new mathematics to implement it. With his calculus an infinity of curves became available to model Nature, not just the straight line, circle, etc. of classical geometry. All motion became the subject of a mathematical theory, and in particular the motion of the planets, the most famous puzzle in all of mathematical science, was solved and understood. It must have seemed to people living then that human reason was on the threshold of unlimited accomplishment. Writings from this period, the Enlightenment, are wonderfully optimistic in this sense, and documents like the Constitution of the United States of America embody the courage to attack old problems, like governance, with the power of human rationality and a confident new spirit.

Newton’s theory of the universe was the paradigm for understanding all physical phenomena for over 200 years. The picture is basically very simple. The universe consists of small “massy particles.” These particles would move in straight lines at constant speed in the absence of external influences. This is quite a non-obvious assumption, because we virtually never see things moving in straight lines at constant speed. The reason is that all particles are subject to external influences – the effects of the other particles! Each particle exerts a force on each other particle, causing its motion to depart from simple uniform motion in a complicated way. The result is the world we see.

The force in this description includes the universal gravitational force, by which each particle attracts each other particle, a force that Newton described with mathematical precision. It has the intriguing property of getting weaker with the inverse square of the distance between particles (a mysterious proportionality). There are also other forces, like the forces that hold particles together to make solid objects, and forces of contact by which one object resists being penetrated by another one – forces that Newton did not claim to understand in any deep way. These forces too could be treated within his theory in a practical, approximate way.

Through the 1700’s and 1800’s this picture was used to model the behavior of solids, liquids, and gases, now thought of as consisting of massy particles (not necessarily atoms, but not inconsistent with atoms either). In 1785 Coulomb discovered that the electric force, familiar to everyone nowadays as the force of static cling in polymer fabrics, is also an inverse square force between particles, much like gravity, but stronger, sometimes attractive but sometimes also repelling. Experimental research in electrical phenomena revealed that electric currents exert forces on each other, related somehow to the mystery of magnetism. Accidental discoveries revealed that electricity plays a mysterious but crucial role in living systems. Mary Shelley’s Frankenstein, published in 1818, is a window into the strangeness of this new knowledge.

An interesting controversy developed in the early 1800’s over the nature of light. Newton had believed that light, like everything else in his mechanics, consists of particles, but increasingly experiments, notably those of Thomas Young (who also contributed to the decryption of the Rosetta Stone), suggested that light is a wave, with a very small wavelength. There was already a Newtonian theory of sound waves, described initially by Newton himself: the particles of the air, or of a solid or liquid, move in undulatory ways in a sound wave. Thus a Newtonian theory of light waves was also possible, as the undulations of the particles of some medium. This is a bit peculiar, since light, unlike sound, travels through a vacuum, but Newtonian ideas were so firmly established that this seemed to prove that the vacuum is really a kind of tenuous but rigid material! In this context the vacuum was called the ‘ether,’ taking its name from ancient speculations about the material of the heavens, a “fifth element”.

James Clerk Maxwell’s theory of the motions of this material, the stresses and strains in it, became a unified theory of electricity and magnetism, including light waves, to his immense astonishment and satisfaction. This superlatively beautiful theory is still our theory of electricity and magnetism, although we no longer believe it describes a material ether. The theory survives, while the mechanical model that inspired it has fallen away. Maxwell’s theory, as we now understand it, is a break from the Newtonian model of the universe – it includes things which are not massy particles: namely the electric and magnetic fields, permeating space. Light is then undulations of these fields, and hence a new kind of non-mechanical wave.

As Newton’s ideas were developed in the 19th century, and reformulated in various ways, a new quantity, energy, became increasingly important.
Energy is a truly mysterious thing, and it is difficult to recognize it as a single, unified concept, because it is so changeable – it takes so many forms. With this new concept, though, one has a somewhat different picture of the world, as an arena in which energy is always being passed around. The forces that were so important in Newton’s original description are now just the way particles pass energy among each other. But energy can also go from the particles to the electric and magnetic fields, and we don’t say that the particles are exerting forces on the fields. So the Newtonian picture, containing just particles and their mutual forces on each other, becomes a subset of a bigger picture.

Energy also flows from hot to cold bodies without any forces being involved, a mysterious process involving the concept of temperature that is outside the Newtonian mechanical picture. Such energy can then be transferred to other bodies through a force, in the good old Newtonian way. This kind of energy flow, involving both heat and mechanical force, is the physicists’ view of heat engines, like steam engines, that burn fuel to do mechanical work, and the basis of the branch of physics called thermodynamics.

One very successful attempt to bring thermodynamics back within the Newtonian framework is called statistical mechanics, and is still a lively research area. It basically assumes that the mysterious flow of heat, the non-mechanical transfer of energy, really is mechanical, but at a microscopic level, where the motions of particles are essentially random, and must be treated statistically. The phenomenon of Brownian motion, first noticed by the botanist Robert Brown as the apparently ceaseless random motion of pollen grains in water, was recognized only in the 20th century by Einstein as a visible manifestation of this random microscopic mechanical energy.

Brownian motion is an exception, in that it shows what is going on at very small length scales while still being large enough to see in a microscope. Much of what came to occupy physicists in the 20th century is too small to see in any conventional sense, but could still be seen in a different sense: the things themselves are not seen, but their vibrations are seen very easily!

This was first done in identifying the chemical elements in compounds through the “flame test.” A bit of unknown material might produce a green flame, indicating the presence of copper, when held over a Bunsen burner. Robert Bunsen and Gustav Kirchhoff at the University of Heidelberg refined this method by looking at the spectrum of the green light with a prism, and found that typically only certain well defined colors – or wavelengths, or frequencies, which are equivalent ways to say it – were present in the emitted light: in the case of copper, for instance, not just green, but a very precise, definite green, with a definite frequency, and perhaps other definite frequencies at lower intensity, which were only noticeable when the light was dispersed through the prism. This method of looking at the spectrum of flames led to the rapid development of the field of spectroscopy, and large catalogues of the natural frequencies of the chemical elements and their compounds.

These frequencies were useful to know for identification purposes, even while their origin was completely mysterious. A striking example is provided by astronomy. Around 1850 Auguste Comte, the philosopher of Positivism, had given an example of something we could never know, even in principle: the nature of the material that makes up the stars. Just a few years later, Bunsen and Kirchhoff analyzed starlight by the methods of the new spectroscopy, and found exactly the frequencies that correspond to the lighter chemical elements on Earth. One unfamiliar family of frequencies, found in sunlight, was hypothesized to indicate a new, unknown chemical element, and was named helium from the Greek for sun. Helium was later discovered on Earth as well, in natural gas. Thus spectroscopy provided a kind of window into the submicroscopic world, and suggested that for practical purposes we could envision that world as made of oscillators with mysterious, definite frequencies. It is the oscillators that we “see.”

The microscopic nature of these oscillators became clear only gradually, and by a circuitous route. First, experiments in “Crookes tubes,” glass tubes with a good vacuum inside and electrodes leading to the outside where they could be connected to a high voltage source, showed glowing “cathode rays.” The setup is rather like what we now use for fluorescent lighting, except that in a Crookes tube the vacuum is better and you use direct current, not alternating current. The cathode rays turned out to be a stream of negatively charged particles, accelerated by the applied high voltage, striking the residual gas atoms left in the tube and making them oscillate (emit light at their characteristic frequencies). These “electrical” particles were named electrons, and proved to be one of the fundamental constituents of matter, drawn out into space in the Crookes tube, where they could be studied apart from the complications of solid materials.
Second, the Crookes tubes at high enough voltage produced X-rays (Roentgen, 1895), so named because their nature was initially mysterious, much more penetrating than cathode rays, capable of going right out through the glass, and through solid, opaque material as well. Wilhelm Roentgen won the first Nobel Prize in Physics, in 1901, for this marvellous discovery.

Third, at almost the same time as this discovery, natural radioactivity was discovered in certain minerals (Becquerel, 1896). It included penetrating radiation, like X-rays, but with no need for a high voltage source, and also ionizing radiation, not so penetrating, which turned out to be high energy charged particles, like cathode rays, but both positively and negatively charged.

These phenomena were clues to the microscopic nature of matter in themselves, and they also provided tools for studying it, since even without knowing their nature precisely, beams of these radiations could be directed at thin material targets. How would they be deflected? Such experiments are in principle easy to do.

The most important result of such “scattering experiments” was the discovery by Geiger and Marsden (1911) in the laboratory of Ernest Rutherford that the positive charge in matter is concentrated in very small, massive nuclei, while the light electrons spread out to fill up most of the space.

Radioactivity tells us that matter is made of charged particles. Thus the oscillators of spectroscopy must somehow represent oscillations of these charges. It would make sense to try to imagine configurations of charges that would oscillate at the observed frequencies, but such attempts never succeeded. The actual structure is completely different. Rather the observed frequencies correspond to energies of the charge configurations, not frequencies! The reason these two things could be confused is a truly mysterious proportionality between energy and frequency in light.
This proposal of light quanta, or photons, from Albert Einstein won the Nobel prize many years later, in 1921, but when it was proposed in 1905 it seemed crazy even to Einstein’s admirers, as we know from letters. It was the key, though. The frequencies of the oscillators, as inferred from spectroscopy, actually are telling us not about frequencies, but about the differences in energy levels of the microscopic entities of matter. These energies can be computed from amazingly straightforward models, resembling Newtonian models (atoms somewhat like the solar system, for example), with Coulomb’s inverse square force providing the interaction, and everything moving in a space where Euclidean geometry still describes the basic spatial relationships. In the new setting, called quantum mechanics, only certain energies are allowed. Classical analogues can still guide us at the scale of atoms: this is fantastic luck. It means we have a theory – quantum theory – of atoms, molecules, solid state electronics, etc., that is the basis for much of modern technology.

Einstein is best known popularly for his Theory of Relativity. This quintessentially 20th century theory illustrates the unity of physics across the centuries, because it takes up again a question that had been discussed in antiquity, namely how we can tell if we are moving. Einstein’s short answer is that we can’t tell. We can only tell if we are moving relative to something else. Thus the idea of being “at rest” has no absolute meaning. This question arose in antiquity and in Galileo’s time with reference to the motion of the Earth (why don’t we notice it if we are spinning along at hundreds of miles per hour?) and in Einstein’s new formulation it had still other startling consequences that we leave for Chapter 20.

Despite the continued success of Euclidean geometry even at the atomic length scale, the limitations of Euclidean geometry had begun to be found.

A consequence of Einstein’s 1905 relativity theory, pointed out three years later by Hermann Minkowski, is that Euclidean space would henceforth have to be augmented with a fourth dimension, time, and might better be called spacetime. This spacetime is not Euclidean, in the sense that the natural notion of length is quite different from length in familiar geometry: so different in fact that it can be positive, negative, or zero. (Euclidean length is always positive.) A very daring extension of this idea, general relativity, suggests that spacetime is distorted by energy, and that the apparently curved paths followed by particles coasting along under the influence of gravity, like home runs and planetary orbits, are actually the straight lines of the spacetime. This non-Euclidean theory of gravity has been confirmed in a number of crucial experiments.

One of the outstanding problems of modern physics is to reconcile the non-Euclidean geometry of gravitation theory with quantum mechanics. The ideas proposed in connection with this problem are even more daring extensions of geometry, although there is no consensus on what the outcome will be.

Chapter 2

Mathematical Tools

This chapter is all about the mathematics of proportionality, in various guises, with a few related ideas on the side. It starts with the idea of proportionality in arithmetic, and moves to the idea of proportionality in geometry. These are things you already know, so the point isn’t to learn them as if for the first time, but rather to aim at a more sophisticated understanding of them.

On the arithmetic side we count the straight line graphs we get when we graph proportionality relationships. A related idea is graphing data and finding that they fall on a straight line, always a good method for analyzing data if you can use it. The idea of detecting a power law in data by graphing the logarithms may be new to you. It is a very important notion! Changing units is an essential use of proportion, as important as it is familiar.

On the geometric side we recall that similar triangles are proportional to each other, and we especially recall the ratios that define the trigonometric functions like sine and cosine. There is one thing here that might be new, the small angle approximation for trigonometric functions.

We begin with something you learned in grade school, but try to dress it up in an interesting way, to get a feeling for the (rather low) level of mathematics in the early Renaissance, when the story of modern physics begins.

2.1 Proportion

If you had been born into a 15th century Italian merchant’s family, you would have been sent to an abacus school to learn the mathematics you would need for life. This would have been basic arithmetic, followed by the culmination of your mathematical education, the Rule of Three. Apart from some practical geometry, this rule was basically all that higher mathematics had to offer you, but it was surprisingly effective. With the Rule of Three the great fortunes of Islamic civilization and Renaissance Europe were founded. And with the Rule of Three physics was reborn. We still use it, although not by that name.

The Rule of Three is an algorithm for solving problems like “If 3 pounds of wool cost 5 ducats, how much do 7 pounds cost?” As the merchants saw it, you are given three numbers, and you are to find the fourth. The rule says that you multiply two of the numbers, and divide by the third. There were ways to identify which number was which, in any such problem. To look at one of the old abacus school textbooks is to realize that you could learn how to solve such problems with almost no understanding, just by rote. Algebra was invented in part to make it clearer what was going on in this problem, because there is an unknown here, and you are to find it. A graph also helps to make the meaning clear, but this innovation came later. In various forms, we will be meeting this problem again and again, so let us start with it.

The unspoken assumption behind the Rule of Three (using the problem about ducats and pounds as an example) is that ducats D and pounds P are proportional. That is, if you want twice as many pounds, it will cost you twice as many ducats, or

\[ D \propto P \tag{2.1} \]

It is exactly the same thing to say there is some constant of proportionality k, such that

\[ D = kP \tag{2.2} \]

This assumption has nothing to do with the particular numbers given in the statement of the problem. It is more abstract. D and P are variables, related by proportionality.
We even have a diagrammatic representation of this proportionality relationship, Fig 2.1, in which all possible pairs (P,D) are visualized as a straight line through the origin. No particular values have any special importance. It is a picture of the relationship. In the figure, however, we have singled out with dashed lines the values that were given in the problem, P = 3, D = 5, and P = 7, along with x, the value we were to find. Notice that this graphical solution to the problem does not even use arithmetic!

Fig. 2.1: The proportionality between ducats D and pounds P is represented in a graph. (Axes: P horizontal, D vertical, with the unknown x marked on the D axis.)

The constant of proportionality k in Eq (2.2) is the slope of this line, namely k = D/P for any point (P,D) on the line. As you go over by P you go up by D, so slope is a measure of the steepness of the line. But we are given a point on the line, namely (3, 5). Starting at 0, we go over 3 and up 5. Thus we are given k = 5/3 (ducats/pound), the price per pound. Knowing k we can find D for any P. In particular if P = 7 pounds, then D = kP = (5 ducats/3 pounds) × (7 pounds) = 35/3 ducats. As the Rule of Three said, the answer is found by multiplying 5 and 7, and dividing by 3.

Doing out the arithmetic, though, even in so simple a problem, obscures the basic idea, which is not really about numbers. The real idea is given by the relationship between the variables in Eq (2.1) or (2.2), or Fig 2.1. Actually putting in the numbers is not very illuminating. Pity the poor schoolchildren of the Renaissance who did put the numbers into problem after problem of this type, but never learned to read a relationship like Eq (2.2), or a graph like Fig 2.1, summarizing the whole idea!
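The wool problem can also be written as a few lines of code, which makes the role of the constant of proportionality k explicit. The following is only an illustrative sketch: the function name rule_of_three is invented here, and exact fractions are used so that the answer 35/3 is not rounded.

```python
from fractions import Fraction


def rule_of_three(known_p, known_d, new_p):
    """Assume D = k * P, find k from one known pair, then apply it to new_p."""
    # Constant of proportionality: the slope of the line, here ducats per pound.
    k = Fraction(known_d, known_p)
    # Multiply two of the numbers and divide by the third.
    return k * new_p


# "If 3 pounds of wool cost 5 ducats, how much do 7 pounds cost?"
cost = rule_of_three(3, 5, 7)
print(cost)         # exact value as a fraction of a ducat
print(float(cost))  # the same value as a decimal
```

Note that the numbers 3, 5 and 7 enter only in the last lines; the relationship D = kP lives entirely inside the function, which is the point made in the text.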

2.2 Units

What we were really saying in the previous section is that money M is proportional to wool W, i.e.

\[ M \propto W \tag{2.3} \]

These quantities are not numbers, though! Wool is a real thing, and money is a real thing. They are represented by numbers only if we measure them, and to measure them we must choose units, like pounds for wool and ducats for money. The units are arbitrary choices, but the relationship Eq (2.3) is true, quite apart from these choices. Thus proportionality is more than just a numerical relationship. A better way to represent the problem in a graph is Fig 2.2, which is almost the same as Fig 2.1, but represents the real quantities on the axes, and indicates the units in which they have been measured. If we changed units, the numbers would change, but the meaning would be exactly the same.

Fig. 2.2: Money in this transaction is proportional to wool, but the numbers depend on the units you choose. (Axes: Wool in pounds, horizontal; Money in ducats, vertical.)

To convert from one unit to another, you use the Rule of Three once again, because the measures of things in different units are proportional. If 3 ducats is the same quantity as 4 florins, then what is 9 ducats? You multiply 9 ducats by the constant of proportionality k = 4/3 (florins/ducat) to get 9 · 4/3 = 12 florins. A peculiar thing about this conversion is that, since 4 florins = 3 ducats, k, their ratio, is 1! Thus, when we convert units, we are just multiplying by the number 1, in a peculiar form. For the purpose of arithmetic it looks like the number 4/3, but because it has units, and is really 4/3 (florins/ducat), it is the number 1. It would be a serious mistake to omit writing the units! When we convert, we are expressing the same real quantity of money, 9 ducats: the quantity does not change, although the number changes from 9 to 12 (florins).

Converting units kept a lot of Renaissance accountants busy, and it has kept scientists busy too. Gradually, over the centuries, units have been standardized and the process of conversion has, for the most part, been made as simple as is reasonably possible, but there is no escaping the frequent need to do it. It is a skill that we, like Renaissance schoolchildren, should simply acquire, and be able to use accurately.

Since centimeters and meters are related by 100 cm = 1 m, for example, the number 1 in the form

\[ 1 = \frac{100\ \text{cm}}{1\ \text{m}} \tag{2.4} \]

can be used to convert from meters to centimeters. If you are 1.76 m tall, you are

\[ 1.76\ \text{m} = 1.76\ \text{m} \left( \frac{100\ \text{cm}}{1\ \text{m}} \right) = 176\ \text{cm} \tag{2.5} \]

i.e. 176 cm tall. Notice how the unit ‘meters’ cancels in numerator and denominator in the middle step, leaving centimeters. We can also say the conversion factor is 10^2 cm/m, using power of 10 notation, or 10^−2 m/cm, the form we would use to convert from centimeters to meters. The whole point of the metric system is that these changes should just require shifting the decimal point.

To change to a non-metric unit of length like inches, we could use 2.54 cm = 1 in, so that your height in inches is

\[ 176\ \text{cm} = 176\ \text{cm} \left( \frac{1\ \text{in}}{2.54\ \text{cm}} \right) = 69.3\ \text{in} \tag{2.6} \]

In short, to convert units, simply multiply by 1 in the appropriate form, and take time to write everything out carefully. Even within the metric system it would be easy to move the decimal point the wrong way. With the above method, if you take the time to use it, it is almost impossible to make this mistake.

Here is a unit conversion that many people get wrong. Let’s make sure we get it right! If a certain area is 2 square meters, what is it in square centimeters? We do it out carefully:

\[ 2\ \text{m}^2 = 2\ \text{m}^2 \left( \frac{100\ \text{cm}}{1\ \text{m}} \right) \left( \frac{100\ \text{cm}}{1\ \text{m}} \right) = 2 \times 10^4\ \text{cm}^2 \tag{2.7} \]

Just notice that to convert meters squared we need the conversion factor between meters and centimeters twice.

In the metric system every basic unit can be used with prefixes to derive other units, related to the basic unit by a power of ten. Below we give most of these prefixes, together with their combinations with the meter.

Prefix    Effect on meter    Full name             Abbreviation
Mega-     10^6 m             Megameter(?)          Mm
kilo-     10^3 m             kilometer             km
centi-    10^−2 m            centimeter            cm
milli-    10^−3 m            millimeter            mm
micro-    10^−6 m            micrometer, micron    µm
nano-     10^−9 m            nanometer             nm
          10^−10 m           Ångstrom              Å
pico-     10^−12 m           picometer             pm

The Ångstrom is not really a metric unit, despite being related to the meter by a power of 10, but it is a convenient length, being about the size of an atom. It is no longer used, but one sees it in older literature. I have never seen the Megameter actually used. One says thousands of kilometers.
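The "multiply by 1 in the appropriate form" method described above can be sketched in code. This is a hypothetical illustration, not a standard library: the tuple layout and the names CM_PER_M, IN_PER_CM and convert are all invented for this example. The function refuses to convert unless the old unit actually cancels, which is the safeguard the text recommends against moving the decimal point the wrong way.

```python
# Each factor is "the number 1" written as
# (numerator value, numerator unit, denominator value, denominator unit).
CM_PER_M = (100.0, "cm", 1.0, "m")    # 100 cm / 1 m
IN_PER_CM = (1.0, "in", 2.54, "cm")   # 1 in / 2.54 cm


def convert(value, unit, factor, power=1):
    """Multiply by the factor `power` times; the old unit must cancel."""
    num_value, new_unit, den_value, old_unit = factor
    if unit != old_unit:
        raise ValueError(f"unit {unit!r} does not cancel against {old_unit!r}")
    return value * (num_value / den_value) ** power, new_unit


# Height: meters to centimeters, then centimeters to inches (Eqs 2.5 and 2.6).
height_cm, unit_cm = convert(1.76, "m", CM_PER_M)
height_in, unit_in = convert(height_cm, unit_cm, IN_PER_CM)

# Area: square meters need the meter-to-centimeter factor twice (Eq 2.7).
area_cm2, _ = convert(2.0, "m", CM_PER_M, power=2)

print(round(height_cm, 1), unit_cm)
print(round(height_in, 1), unit_in)
print(area_cm2, "cm^2")
```

Carrying the unit name alongside the number is a deliberately crude stand-in for writing the units out by hand; the design choice is simply that a conversion with the wrong factor fails loudly instead of returning a plausible-looking number.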

2.3 Data: Straight Line Plots

In Figure 2.2 it was assumed that the two quantities, wool and money, were proportional. The straight line in that figure is a theoretical line. You could imagine testing this theory by actually observing transactions in the marketplace. Every time someone bought wool, you would record the amount P of wool, and the price D. Then you would plot these data as points in a graph in the (P, D) plane. Each transaction is a point. It might look something like Fig 2.3. In this made-up example we see that the data do indeed lie along a straight line indicating proportionality. Thus, we could conclude, it is established by experiment that P ∝ D. Not only that, but we could introduce the constant of proportionality k, and say D = kP, where the value of k can be read from the data as the slope of the line that fits, indicated as a dashed line in Fig 2.3. This k would be an empirically determined price-per-pound.

Fig. 2.3: A scatter plot of transactions in the wool market. The data suggest proportionality, as indicated by the dashed line. [Axes: D (Ducats) vs. P (Pounds), each running from 0 to 10.]

This way of handling data is surprisingly useful and effective. You can tell in a graph if the points seem to lie along a straight line. As in this fictitious example, you wouldn’t expect empirical data to line up perfectly, but the linear trend is obvious.

To take a more physical example, we know that when you dive down under water, the pressure P increases with depth D. Suppose we have some kind of pressure gauge and we measure actual pressure vs depth data. It might look like Fig 2.4. You can see in the figure that gauge pressure and depth are proportional, even though the units are not indicated. They are unnecessary to make this point. If you wanted to know the constant of proportionality for later use, though, you would need to know the units.

Fig. 2.4: The data indicate gauge pressure is proportional to depth. [Axes: Gauge Pressure P vs. Depth D, both in arbitrary units.]
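The text reads the constant of proportionality off the graph by eye. As a hedged alternative, the sketch below estimates it numerically with a least-squares fit of a line forced through the origin, for which the slope is the sum of the products x·y divided by the sum of the squares x². Least squares is a technique swapped in here, not the author's method, and the readings are invented purely for illustration.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope k of the line y = k*x forced through the origin."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / sum_xx


# Invented (depth, gauge pressure) readings in arbitrary units,
# scattered about a proportional trend.
depths = [1.0, 2.0, 3.0, 4.0, 5.0]
pressures = [0.9, 2.1, 3.0, 4.2, 4.8]

k = slope_through_origin(depths, pressures)
print(f"estimated constant of proportionality k = {k:.2f}")
```

Forcing the line through the origin builds in the assumption of proportionality. If the data really had a nonzero intercept, this fit would hide it, so a plot is still worth looking at first.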

Like all proportionality relationships, the line in Fig 2.4 goes through the origin. That is, when the depth D is zero, the pressure P is zero. Is this really true, though? When we say D = 0, we mean the surface of the water, before we start to go down. Is the pressure zero there? That is what our gauge said, but in fact, it must be measuring pressure with reference to the pressure at the surface, or, to put it more simply, the change in pressure from the pressure at the surface. A closely related concept is the absolute pressure, and if we were to measure absolute pressure somehow, we would find the pressure at the surface is not zero, but a positive value called “atmospheric pressure.” A plot of absolute pressure P vs depth D might look like Fig 2.5. This is just like Fig 2.4, except that now all the pressure values have atmospheric pressure included. In particular, at D = 0, we can read off that the pressure is not zero, but a positive value, which must therefore be the atmospheric pressure. It is still true that the change in pressure ∆P is proportional to the change in depth ∆D, measuring from the surface.

Fig. 2.5: The absolute pressure is a linear function of depth. The change in pressure ∆P is proportional to the change in depth ∆D. [Axes: Absolute Pressure P (atm), ticks at 1, 2, 3, vs. Depth D (meters), ticks at 10, 20, 30.]

This use of the Greek letter ∆ (delta) to indicate a change in a quantity takes getting used to. Here P means absolute pressure, but ∆P means the change in P from some other point. In Fig 2.5 it is indicated that the change is measured from the surface, where D = 0. Thus $\Delta P = P - P_{\text{atm}} = P_{\text{gauge}}$ in this case, and ∆D = D − 0 = D. In particular, it is crucial in using this notation to be clear that we are not multiplying by something called ∆! Rather, we are finding the change from some other (D, P) point, which must be understood if the notation is to make sense. In Fig 2.5 we mean the change from $(0, P_{\text{atm}})$. We can express this linear relationship between P and D by

$$P = P_{\text{atm}} + kD \tag{2.8}$$

This is the slope-intercept form of the equation of a straight line. When D = 0, $P = P_{\text{atm}} = 1$ atm, and $P_{\text{atm}}$ is therefore the intercept on the P axis. The slope k of the line is

$$k = \frac{\Delta P}{\Delta D} \approx 0.1\ \text{atm/m} \tag{2.9}$$

Thus k is also the constant of proportionality relating gauge pressure to depth. This is only saying that the slopes of the lines in Figs 2.4 and 2.5 are the same. They are really the same line, with one of them moved up by $P_{\text{atm}}$.
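A small sketch may make the relation between absolute and gauge pressure concrete. It uses the approximate values quoted above (intercept 1 atm, slope 0.1 atm/m). The function names are hypothetical, and the numbers are only as good as those rounded values.

```python
P_ATM = 1.0   # atm, intercept on the P axis (value quoted in the text)
K = 0.1       # atm per meter, approximate slope from Eq (2.9)


def absolute_pressure(depth_m):
    """Absolute pressure in atm at a given depth, from P = P_atm + k*D."""
    return P_ATM + K * depth_m


def gauge_pressure(depth_m):
    """Change in pressure from the surface value, i.e. Delta P = k*D."""
    return absolute_pressure(depth_m) - P_ATM


for depth in (0.0, 10.0, 20.0, 30.0):
    print(depth, absolute_pressure(depth), gauge_pressure(depth))
```

The two functions differ only by the constant `P_ATM`, which is the statement that the two lines have the same slope and one is shifted up.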

2.4 Uncertainty in Data

The “data” in Figs 2.4 and 2.5 scatter about the straight line, and don’t lie precisely on it. Such scatter is a property of all measurement. It is often called “measurement error”, but this term suggests that there is some kind of mistake. A better name is “measurement uncertainty.” This suggests, correctly, that uncertainty is an inherent property of measured values, and not a mistake. Scientists must simply learn to deal with it.

Measurement uncertainty is probably the reason Aristotelian philosophers believed mathematics could not describe our world. After all, if all measurement is uncertain – and it is! – then how can you trust it? This is a deep question.

The most important method we have to deal with measurement uncertainty is to keep track of its estimated size through the measurement process, and to indicate the size of the uncertainty when we talk about the data. How good are the data? This is a crucial piece of information, which should always be included. In Figs 2.4 and 2.5, although the data are made up, we could imagine that the size of the X’s representing the data points is meant to indicate the uncertainty in each point. An important convention in quoting numerical data is to give only the significant digits, with the last digit uncertain. This is a quick and easy way to indicate uncertainty.

Let us think how the measurement of pressure vs depth might have been made, and estimate the uncertainty. A pressure gauge is lowered down on a rope. When we have paid out 100 meters of rope, we will say the depth is D = 100 m. What is uncertain about this? Well, the rope might not be hanging straight down. Perhaps we are on a boat that is being moved a bit by the waves and wind, or perhaps there are currents down there which carry the rope somewhat sideways. We know that D = 100 m is about right, but we also know that these effects could make a difference of 1 meter or so. We should probably say D = 99 ± 1 m (since these effects tend to make the depth D less than the rope paid out). If we know something about how the pressure P is measured, we could also estimate how uncertain P is. In this way we keep track of the uncertainty. As you do an experiment, you may find checks of consistency. The size of the scatter, that is, the size of the uncertainty, is determined along with everything else. You may even find ways to “beat the uncertainty down” – but never to eliminate it altogether!

A good experimentalist always thinks about the measurement and the uncertainty at the same time, not because the uncertainty invalidates the measurement, but just the reverse: because knowing the uncertainty makes the measurement meaningful. If we can’t estimate the uncertainty, a measured number is meaningless.

Galileo, at the very beginning of modern physics, somehow had a very good feeling for this subtlety of the measurement process. Without it he could not have been so sure – and he was sure – that mathematical relationships are verified in Nature. The conceptual difficulty in making this step can hardly be overstated. It is part of Galileo’s genius that he was able to hold together in his mind the perfection of a simple mathematical theory and the messiness of real data, and to believe that they somehow corresponded. He put it like this:

“Just as the merchant who wants his calculations to deal with sugar, silk, and wool must discount the boxes, bales, and other packings, so the mathematical scientist, when he wants to recognize in the concrete the effects which he has proved in the abstract, must deduct the material hindrances, and if he is able to do so, I assure you that things are in no less agreement than arithmetical computations. The errors, then, lie not in the abstractness or concreteness, not in geometry or physics, but in a calculator who does not know how to make a true accounting.”

For Galileo, dealing with error meant recognizing that data come to us with some extra wrapping! That’s not a bad picture to keep in mind.

Galileo seems to say that our mathematical theories are exact, and that it is only “material hindrances” that make this difficult to see. There are, in fact, a few systems that are so simple that we can (now) make very accurate predictions and also make very accurate measurements, and in these few cases theory and experiment agree, so far as anyone can tell, exactly. These simple systems include the motions of planets and satellites, moving in a vacuum, governed by the gravitational force; and the natural frequencies of atoms in a vacuum, governed by quantum mechanics. One should also mention the behavior of a single electron in a magnetic field.

Apart from these simple systems, though, “material hindrances” play a much bigger role. Physics still has something to say about situations that are not artificial physics experiments, but it is rough and approximate. Textbook problems are often worded as if all quantities were known exactly. In most settings, though, the mathematical model that we call physics is an idealization, deliberately leaving things out to keep the model simple. The things we leave out, if they are truly not very important, may manifest themselves as small systematic discrepancies in what we expect and what we actually observe. They may also manifest themselves as a small apparent randomness around an average value.

Making detailed, precise predictions from physical theories usually requires higher mathematics. That is not what this book is about. Using physics in a rough and ready way, though, does not at all require higher mathematics, but only a knowledge of proportion, and this is the way physics is actually used most often in practice: as a method of approximate prediction.

2.5 Dimension

No matter how we change the units, a length remains a length. Any length is said to have dimension [L]. This is a rather peculiar use of the word “dimension”. It simply means that the quantity in question is a length, and could therefore be measured in meters or inches, or any other unit of length, without committing us to a particular choice. The quantity σ in the preceding section had dimension [L], for instance. An area, for example the base × height of a rectangle, being the product of two lengths, is said to have dimension $[L^2]$: dimension obeys the rules of algebra when you multiply. A product carries the dimensions of its factors. This dimension tells us that the units of area could be m$^2$, or in$^2$. A volume, being the product of three lengths, say length × width × height, has dimension $[L^3]$, and could be measured in units m$^3$ or cm$^3$ (cubic centimeters, or cc’s, as they say).

A conversion factor like (100 cm)/(1 m) in Eq (2.4) is a ratio of two lengths, and hence dimensionless. Such things are pure numbers. A conversion factor, specifically, is always the ratio of a quantity to itself (in different units), so it is always the pure number 1.

There are only two other familiar quantities in physics that behave like length in this way, namely time, which carries the dimension we denote [T], and mass, which carries the dimension we denote [M]. Products of physical quantities that carry these dimensions continue to carry them. Dimension is a property that persists in statements of proportionality. Therefore the arithmetic of physical quantities is not just the ordinary arithmetic of numbers: all physical quantities carry with them their dimension.

It is a mysterious thing that most of the familiar physical quantities, like force, velocity, energy, stress, diffusion constant, momentum, viscosity, etc. are all products of powers of just three kinds of things, [M], [L], and [T]. (We will later introduce one more, electric charge [Q].) This confirms, perhaps more clearly than anything else, that physics is about proportionality. All physical quantities are, dimensionally speaking, products of just a few types of quantities, and products are proportional to their factors.
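The rule that dimension obeys the rules of algebra can be sketched by storing a dimension as its three exponents of [M], [L] and [T]: multiplying quantities adds the exponents and dividing subtracts them. The `Dim` type and helper names below are hypothetical. The example also assumes the familiar facts that velocity is length per time and that force is mass times acceleration, which are not stated in this section.

```python
from collections import namedtuple

# A dimension is recorded as the exponents of [M], [L], [T].
Dim = namedtuple("Dim", ["M", "L", "T"])


def mul(a, b):
    """Dimension of a product: exponents add."""
    return Dim(a.M + b.M, a.L + b.L, a.T + b.T)


def div(a, b):
    """Dimension of a quotient: exponents subtract."""
    return Dim(a.M - b.M, a.L - b.L, a.T - b.T)


MASS = Dim(1, 0, 0)
LENGTH = Dim(0, 1, 0)
TIME = Dim(0, 0, 1)
PURE_NUMBER = Dim(0, 0, 0)

area = mul(LENGTH, LENGTH)           # [L^2]
volume = mul(area, LENGTH)           # [L^3]
velocity = div(LENGTH, TIME)         # [L T^-1], assumed: length per time
acceleration = div(velocity, TIME)   # [L T^-2]
force = mul(MASS, acceleration)      # [M L T^-2], assumed: mass x acceleration

print(area, volume, velocity, force)
print(div(LENGTH, LENGTH) == PURE_NUMBER)   # a ratio of two lengths
```

A ratio of two lengths comes out with all exponents zero, which is the sense in which a conversion factor is a pure number.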

2.6 Position, Time, and (constant) Velocity

If a moving object moves a distance D that is proportional to time t, then we say

$$D \propto t \tag{2.10}$$

or, equivalently

$$D = vt \tag{2.11}$$

for some constant v. Here we are describing a relationship between D and t. D and t are variables, not numbers. As time t increases, the distance D travelled increases proportionately. The constant of proportionality is the speed v. The letter is chosen to suggest “velocity”, a near synonym for speed. A graph of D vs. t would be a straight line, like Fig 2.2, but with the axes labelled distance and time instead of money and wool, measured in appropriate units. The line would have slope v.

Since D is a length, it has dimension [L], and t, being a time, has dimension [T]. This means v must have dimension $[v] = [LT^{-1}]$. Suitable units for it would be meters/second or miles/hour, etc. Here v is just a number, with units, corresponding to the constant speed of the object.

A generalization of this relationship is the position x of something that moves with constant velocity v, where x is the usual coordinate along a line or axis. If we say

$$x = vt \tag{2.12}$$

we are saying much the same thing as before. The difference is that all quantities now could be either positive or negative. Distance is considered to be intrinsically positive, as when we say that the distance between any two positions $x_1$ and $x_2$ is $|x_2 - x_1|$. Negative distance doesn’t make sense, but negative position does make sense: it is to the negative side of x = 0. The word “speed” is also taken to be intrinsically positive, but now we are calling v “velocity”, and it can be either positive or negative, depending on which way the object is travelling. Speed is the absolute value |v|, always positive, whether v is positive or negative. Finally, t no longer means elapsed time, which would always be positive, but rather time on a clock, which is negative if it comes before the time which is arbitrarily called 0.

A further generalization of this relationship is the linear function

$$x = x_0 + vt \tag{2.13}$$

which still describes something moving at constant velocity v, only now not starting at x = 0 at the time t = 0, but rather at $x = x_0$. This could describe something moving just like x = vt, but with a head start $x_0$. Its position differs from x = vt by $x_0$ at all times, as if the two were running along keeping a constant distance between them. The graph of these two relationships is Fig 2.6. The linear relationship $x = x_0 + vt$ is in slope-intercept form because the two constants, v and $x_0$, are the slope and intercept respectively in the graph.

We recall the definition of the slope of a line, given any two points on it, $(t_1, x_1)$ and $(t_2, x_2)$. In this example we know the equation of the line, Eq (2.13), and so we know the slope, but if we didn’t know it, we could reconstruct it from the two points, using the definition

$$\text{slope} = \frac{x_2 - x_1}{t_2 - t_1} = \frac{\Delta x}{\Delta t} \tag{2.14}$$

Fig. 2.6: The graph of $x = x_0 + vt$ (solid) and $x = vt$ (dashed). The slope of the lines, v, is constant and the same for both. [Axes: position x vs. time t, with the intercept $x_0$ marked on the x axis.]

We have used the ∆-notation to indicate the change in position and the change in time. Physically it is obvious that this combination must give v, because it is a displacement (∆x) divided by the time necessary to travel it (∆t), and that is just what we mean by velocity.

The following problem comes up frequently in various disguises: suppose that at time t = 0, A is at position $P_A$ and B is at position $P_B$. A and B move with constant velocities $v_A$ and $v_B$ respectively. When and where do they meet? The problem can be pictured (and solved) graphically as in Fig 2.7. It is clear that there is typically just one time and place where they will meet. If, however, we have the very special case $v_A = v_B$, in which they are moving with the same velocity, then the lines will be parallel (unlike what is shown) and they won’t meet. It is clear that in this case they will keep a constant distance, as in Fig 2.6. We can look for these features in an algebraic solution. The statement of the problem tells us that $x_A$, the position of A, and $x_B$, the position of B, are given, for any time t, by

$$x_A = P_A + v_A t \tag{2.15}$$

$$x_B = P_B + v_B t \tag{2.16}$$

Fig. 2.7: Two objects A (red) and B (blue) move along the x axis with constant velocity and meet at a definite time $t_m$ and position $x_m$. [Axes: position x vs. time t, with $P_A$, $P_B$ and $x_m$ marked on the x axis and $t_m$ on the t axis.]

The condition that they meet is $x_A = x_B$, and using the expressions above, we find that this is a condition on the time t. It is satisfied at $t = t_m$, where

$$t_m = \frac{P_A - P_B}{v_B - v_A} \tag{2.17}$$

Even if we are given a problem of this kind with numbers, it is better to introduce algebraic symbols and do it algebraically (you can put the numbers in at the end, if necessary). One benefit of doing it algebraically is that you don’t find yourself doing numerical arithmetic at every step, but only at the end. The main benefit, though, is that a result like Eq (2.17) contains a lot of useful information that is lost when you substitute numbers. This information, if you take the time to read it, is useful as a check, and it can also suggest other ways of looking at the problem.

Let us take the time to read and interpret Eq (2.17). The first thing to do is check the dimensions, for consistency. On the left hand side we have $t_m$, which is dimensionally [T]. On the right hand side we have [L] in the numerator, and $[LT^{-1}]$ in the denominator. Does the ratio have dimension [T]? It ought to! Then we check common sense. Fig 2.7 shows what it looks like if $P_A > P_B$ and $v_B > v_A$. In that case $t_m > 0$, and we see that Eq (2.17) gives a positive meeting time. If we had had $v_B < v_A$, it is easy to see in Fig 2.7 that the slope of the blue line would be less than the slope of the red line, and the meeting time $t_m$ would be negative. But that is also what Eq (2.17) says, because in this case the denominator would switch signs and become negative, while the numerator would still be positive. Finally, as we imagine $v_B \to v_A$, i.e., the velocities becoming the same, we know that A and B don’t meet, because the lines in the graph would be parallel, and that is also what Eq (2.17) says: as the denominator goes to 0, $t_m \to \infty$. That is how Eq (2.17) tells us there is no meeting in this case.

It is always a good idea to check special cases where the result is obvious. For example, if $P_A = P_B$ then A and B are together at t = 0, and hence it is obvious that $t_m = 0$ in this case – just what Eq (2.17) says. Also, in the limiting case that one of $v_A$ or $v_B$ is very large (the limiting case of one of them going to infinity), then we would have $t_m \approx 0$. That too is common sense: the gap between A and B is closed in almost no time.
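The meeting-time formula and the checks just described can be tried out numerically. The sketch below is only an illustration: the function names and the sample numbers are invented, and returning `None` for equal velocities is a choice made here to signal that no unique meeting time exists.

```python
def position(p0, v, t):
    """Position at time t of something starting at p0 with velocity v."""
    return p0 + v * t


def meeting_time(p_a, v_a, p_b, v_b):
    """Time at which x_A = x_B, i.e. (P_A - P_B) / (v_B - v_A).

    Returns None when the velocities are equal: the lines are parallel,
    so there is no unique meeting time.
    """
    if v_a == v_b:
        return None
    return (p_a - p_b) / (v_b - v_a)


# Invented example: A starts ahead at x = 10 moving at 1, B starts at 0 moving at 3.
t_m = meeting_time(10.0, 1.0, 0.0, 3.0)
x_a = position(10.0, 1.0, t_m)
x_b = position(0.0, 3.0, t_m)
print(t_m, x_a, x_b)                         # both positions should agree

# Checks from the text:
print(meeting_time(10.0, 3.0, 0.0, 1.0))     # v_B < v_A: negative meeting time
print(meeting_time(4.0, 2.0, 4.0, 7.0))      # P_A = P_B: they meet at t = 0
print(meeting_time(10.0, 1.0, 0.0, 1.0))     # equal velocities: no meeting
```

Keeping the formula in symbolic form inside one function is what makes these special-case checks cheap: each is a one-line call with different inputs.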

There is yet another way to read Eq (2.17). The numerator $P_A - P_B$ is the initial relative position of A with respect to B, i.e., the displacement from $P_B$ which ends at $P_A$. The denominator $v_B - v_A$ is the relative velocity of B with respect to A. Their ratio is just the time it takes to cover the relative displacement $P_A - P_B$ at the relative velocity. Amazingly, this illustrates the principle of relativity: we could imagine how the motion of A and B looks to someone who is moving along at speed $v_A$, i.e., the same speed as A. To him, A appears not to be moving. In fact all the velocities are changed, according to this observer, in having $v_A$ subtracted off, so that what was $v_A$ becomes $v_A - v_A = 0$ and $v_B$ becomes $v_B - v_A$. Now how long does it take A and B to come together? Well, A sits still, at an initial distance $P_A - P_B$ from B, and B approaches at velocity $v_B - v_A$. Clearly the time necessary is given by Eq (2.17). By shifting the point of view to the moving perspective, we have made the problem easier.

It is always worth inspecting a result for the way it depends on the variables in the problem. But that is only possible if they are left as variables! It would be a good exercise to solve for the position $x_m$ where A and B meet in this problem and try to interpret the result, just as we interpreted $t_m$ above.

Motion at constant speed is rather special, and most things don’t move with constant speed, at least not for long. As a model of real motions, then, constant speed motion (D ∝ t and its various generalizations) is still not very general. There is an interesting physical example, however, of something that does turn out to move at constant speed, namely light. We say a little about this in the next section.

2.7 The speed of light, and SI units

The speed of light in vacuum is a constant of Nature, and is given its own symbol c. The distance D travelled by light in time t is D = ct where $c \approx 3 \times 10^{8}$ m/s. This is such a high speed that for everyday purposes, not involving clever instrumentation, the speed of light might as well be infinite. Nonetheless, over the centuries, beginning around 1675, people have succeeded in measuring the speed of light with ever increasing accuracy.

In recent years, though, something peculiar has happened. Einstein’s Special Relativity Theory asserts that c is the same for everyone, a true constant of Nature. This theory is now unconditionally accepted. Meanwhile the definition of our standard unit of length, the meter, has been changed several times since its adoption in the French Revolution. Its most recent change was in 1983, when the roles of c and the standard meter were reversed. The speed of light is now precisely $2.99792458 \times 10^{8}$ m/s, and is now understood to be the definition of the meter. The meter is, by definition, the unit of length such that in 1 second light in vacuum travels $2.99792458 \times 10^{8}$ meters. The definition of the second is, as we shall see, on very firm footing, so the meter is now just as firm. The value for c was chosen to agree with the previous definition of the meter, to the accuracy attainable, at the time of the changeover.

By international agreement the units most often used in physics are the Système International units (SI units), based on a standard unit for [L] (the meter, abbreviated m), [T] (the second, abbreviated s), and [M] (the kilogram, abbreviated kg). We will, for the most part, use SI units in this book. Very occasionally we will not write the units with a measured number, but simply say “SI”. This means you have to look at the dimension of the quantity and construct the SI unit from that. If the dimension is $[MLT^{-2}]$, for example, then the SI unit is kg·m/s$^2$. Usually, though, we will write out units in full.

2.8 Dimension and Scaling

In Galileo’s last book, Two New Sciences (1638), he makes a very interesting use of the notion of dimension. His idea is that you either know or you can guess, for some physical quantities, that they are proportional to volume. Such a quantity would grow like volume if it were scaled up. Similarly, quantities proportional to area would grow like area if they were scaled up.

Here is the argument Galileo makes. If you were twice as big, meaning twice as tall, twice as wide, etc., then you would weigh eight times as much, because weight is proportional to volume, and volume $[L^3]$ is proportional to the cube of the linear dimension. On the other hand, what about the strength of your bones and muscles? Galileo argues that strength is proportional to the cross-sectional area of the bone (or muscle), and if the bone is twice as thick, then its cross-section is four times as big, because area $[L^2]$ goes as the square of the linear dimension. Thus you would be eight times heavier, but only four times stronger: you would be effectively weaker, relative to your weight, by a factor of two. Galileo’s argument implies that if you simply scale something up indefinitely, eventually it would be unable to support its own weight and would collapse, as its weight became too much for its strength.

To take an example from architecture, we note that this had actually begun to happen in 16th century Europe. Cities vied with each other to build the biggest and most impressive cathedrals. Eventually there were some disasters (Beauvais, for instance), and the largest planned cathedrals were never finished.

Galileo applied these ideas to animals. Many mammals have essentially the same body plan, but they differ very much in size. In order that the large ones not be too weak, in the sense described above, their bones have to be thicker than you would expect from simple scaling. If the length of the bone goes up by a factor of 2, for example, the thickness should go up not by a factor of 2 but by a factor of $\sqrt{8} = 2^{3/2} \approx 2.8$. Thus the bone would look thicker and clumsier. (Galileo’s own drawing exaggerates the effect to make the point.) We would certainly not expect elephant bones to be simply delicate mouse bones scaled up! The argument is so appealing and so persuasive that it has only quite recently been carefully checked. Real animals turn out to be a bit more complicated than Galileo said. But the idea has been very influential.

The background to this story is amusing. In 1588, some fifty years before he wrote Two New Sciences, Galileo was an unemployed dropout medical student with a keen interest in geometry. Through his own abilities, and perhaps family connections, he managed to be a serious candidate for the open chair of mathematics at the University of Pisa (which belonged to Florence). In what may have been, in effect, a job interview, he was invited to address the Florentine Academy, and to lecture on mathematics, something virtually none of the members knew anything about. He chose to expound the geometry of Dante’s Inferno, a work they all knew intimately, and did so in very entertaining fashion. He got the job! But in the process he committed a blunder that he must have realized soon after. The Inferno, as he describes it, is an enormous region within the Earth, capped by a layer of the Earth’s crust some 400 miles thick. He explicitly says that it would not be in danger of caving in because it is geometrically similar to a large dome in architecture (like the famous Brunelleschi dome on Florence’s own cathedral), but scaled up. This is just the situation that Two New Sciences addresses: scaling up makes the dome weaker, something he hadn’t realized at the time, and in fact makes it obvious that the Inferno would collapse after all!

Thus Galileo must have lived with a painful dilemma all of his life: his marvellous insight into scaling and proportionality was something he had to keep to himself. To reveal it would have been an embarrassment not so much to himself, since after all, he would be the one to correct the error, but really to Florence and the Florentine Academy, for reasons that are too political to go into here. It also might have been dangerous to point out that the Inferno couldn’t actually exist, because the literal existence of a Hell within the Earth was Catholic dogma. His solution was to keep quiet, but to publish at the end of his life, when nothing more could happen to him.
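Galileo's scaling argument earlier in this section is easy to tabulate. The sketch below assumes, as the argument does, that weight grows as the cube of the linear size and strength as the square. The function names are hypothetical, and all quantities are relative factors, not measured values.

```python
def scale_up(factor):
    """Relative weight, strength, and strength-to-weight ratio after
    scaling every linear dimension by `factor`."""
    weight = factor ** 3      # proportional to volume, [L^3]
    strength = factor ** 2    # proportional to cross-sectional area, [L^2]
    return weight, strength, strength / weight


def bone_thickness_factor(length_factor):
    """Thickness factor needed so that strength (thickness squared)
    keeps pace with weight (length cubed)."""
    return length_factor ** 1.5


for factor in (1, 2, 4, 10):
    weight, strength, ratio = scale_up(factor)
    print(factor, weight, strength, ratio, round(bone_thickness_factor(factor), 2))
```

The strength-to-weight ratio falls as 1/factor, which is why simple scaling eventually fails, and the thickness factor for a doubling comes out as the square root of 8 quoted above.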

2.9 Power Laws, and the Logarithm

Galileo guessed that strength should be proportional to $L^2$, where L is a typical length in an animal or a structure. Is that true, though? And what exactly is meant by “strength” here? It was never defined.

Suppose we make some operational definition of the strength S of a beam, say the weight necessary to break it when it is supported in a certain way, etc. Then we could actually measure strength S vs. length L for various beams, all scale models of each other. In this way we could determine experimentally if Galileo was right about $S \propto L^2$. The best way to check this experimentally would be to graph the measured values of S vs. $L^2$. We know how it should look if the two are proportional: the points should lie on a straight line through the origin. Suppose we try this, though, and it doesn’t work: they don’t lie on a straight line through the origin. Galileo’s idea is so plausible that it occurs to us: maybe he just guessed the wrong power in the scaling law. Perhaps it is

$$S \propto L^{\alpha} \tag{2.18}$$

for some other power α, different from 2, perhaps not even an integer. Now we seem to have a tedious problem. We could guess values of α, one after the other, and for each one graph S vs. $L^{\alpha}$ to see if we have proportionality (by looking for a straight line through the origin, as before). This trial and error process could go on for a long time. There is actually a much better way to detect such a power law, though. We should graph log S vs. log L, where log is a logarithm. We’ll see in a moment how this works.

First, though, recall the most important property of logarithms: if A > 0 and B > 0, then

$$\log(AB) = \log(A) + \log(B) \tag{2.19}$$

That is, the logarithm of a product is the sum of the logarithms. It follows that

$$\log(A^2) = \log(A \cdot A) = \log(A) + \log(A) = 2\log(A) \tag{2.20}$$

$$\log(A^3) = \log(A^2 \cdot A) = 2\log(A) + \log(A) = 3\log(A) \tag{2.21}$$

$$\vdots \tag{2.22}$$

and in general

$$\log(A^{\alpha}) = \alpha \log(A) \tag{2.23}$$

Now we return to our power law hypothesis,

$$S = kL^{\alpha} \tag{2.24}$$

which we have rewritten, introducing a constant of proportionality k (unknown). If this power law were true, it would also be true, by taking the logarithm on both sides, that

$$\log(S) = \log(k) + \alpha \log(L) \tag{2.25}$$

This says that if we graph log(S) vs. log(L) we should get a straight line of slope α and intercept log(k), because it is a linear relationship in slope-intercept form. If we just graph the data this way, we check all possible power laws in one graph! A straight line indicates a power law, and in that case the slope tells us the power. The intercept is perhaps not so interesting, but it still tells us the constant of proportionality in the power law, if we want to know it.

Since this procedure may not be very familiar, we illustrate it with made-up data in the table below:

S       L
1.6     1
9.1     2
89.4    5
506     10

A plot of log(S) vs. log(L) shows that the data points lie along a straight line: thus the relationship between S and L is a power law. Picking two points on the line we find that the slope, and hence the power, is about 2.5.

If you try checking this exercise in detail, you may find yourself wondering which logarithm we used. For any positive number except 1 – let us say 10, for concreteness – there is a corresponding logarithm, called logarithm base 10, abbreviated $\log_{10}$. For Fig 2.8 we used the logarithm base e ≈ 2.71828..., also called the natural logarithm, abbreviated ln. It makes no difference for this purpose, though, which logarithm you pick, as long as it is the same choice on both axes. The reason is, logarithms to different bases are proportional! So a different choice of logarithm is like changing the units in the same way on both axes. In particular, the slope α will come out the same for any choice.

Fig. 2.8: A log log plot of the data in the preceding table. The data appear linear with slope about 2.5, indicating $S \propto L^{2.5}$. [Axes: ln(S), from 0 to 5, vs. ln(L), from 0 to 3.]
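The log-log procedure can be checked directly on the made-up table. The sketch below takes natural logarithms of both columns and fits a straight line by ordinary least squares. That fitting method is a substitution made here, since the text simply picks two points on the line, and the variable names are illustrative. A two-point slope using base-10 logarithms is included to show that the choice of base does not change the slope.

```python
import math

# Made-up data from the table in the text.
L_values = [1, 2, 5, 10]
S_values = [1.6, 9.1, 89.4, 506]

xs = [math.log(v) for v in L_values]   # ln(L)
ys = [math.log(v) for v in S_values]   # ln(S)

# Ordinary least-squares line ys = log_k + alpha * xs.
n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
alpha = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
log_k = y_mean - alpha * x_mean
k = math.exp(log_k)

# Slope between the first and last points, this time with log base 10.
two_point_slope = ((math.log10(S_values[-1]) - math.log10(S_values[0]))
                   / (math.log10(L_values[-1]) - math.log10(L_values[0])))

print(f"alpha (least squares, ln)   = {alpha:.2f}")
print(f"alpha (two points, log10)   = {two_point_slope:.2f}")
print(f"k from the intercept        = {k:.2f}")
```

Both estimates of the power should come out close to 2.5, and the intercept recovers a constant of proportionality near the first table entry, since S = k when L = 1.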

2.10 Numbers in Geometry are Ratios

Euclidean geometry does contain numbers, but these numbers are ratios of lengths, or of areas, and hence dimensionless. The numbers of geometry are pure numbers. There is no need for units. The most famous ratio in all geometry is π = C/D, the ratio of a circle’s circumference C to its diameter D. π cannot be written as a decimal fraction, so it is a rather awkward number for arithmetic. A good approximation is 3.14159265... For almost any practical purpose one could use fewer decimal places. The diameter D of a circle is 2R where R is the radius, so

C = πD = 2πR (2.26)

A circle of radius 1 is called a unit circle, and its circumference is 2π. The definition of π uses the notion of geometric similarity, that is, that all circles are similar: they differ only in being scaled up or down. In similar figures, the ratios of corresponding lengths are the same, because they are scaled up or down together. A circle of radius R is similar to a circle of radius 1, and its circumference 2πR is just 2π scaled up by the same factor R. There are familiar formulae that we learn from geometry that seem to involve dimensional quantities, for example

$$\text{Area of circle} = \pi R^2 \tag{2.27}$$
$$\text{Surface area of sphere} = 4\pi R^2 \tag{2.28}$$
$$\text{Volume of sphere} = \frac{4}{3}\pi R^3 \tag{2.29}$$

In these formulae the areas are clearly of dimension $[L^2]$ and the volume of dimension $[L^3]$, as they should be. That is not how Archimedes stated them, though, when he proved them. Rather he said that the ratio of the circle’s area to the square on its radius is π. On his tomb was engraved the sphere inscribed in a cylinder, a wordless reference to his beautiful result above on the volume of the sphere: what he actually said was that the ratio of this volume to the volume of the cylinder is 2/3.

Here is another example: a theorem of Euclid (Elements I, Proposition 32) says that the sum of the interior angles of any triangle is 180°. At least that is how we say it now. Where we say 180°, however, Euclid says “two right angles.” That is, the ratio of the sum of the interior angles of a triangle to the right angle is 2. And a right angle is not just a shorthand for 90°, but rather it is the angle you get where two perpendicular lines cross, and makes no reference to any number whatever. The degree, as a measure of angle, is a very old unit, going back to the Babylonians, and was almost certainly known to Euclid, but he didn’t use it. Thus the famous theorem about the angles of a triangle is another illustration that numbers in geometry are ratios. Except for the ratio 2, numbers here are irrelevant. The idea is best expressed by the picture that accompanies Euclid’s proof, Fig. 2.9.

2.11 The Trigonometric Functions

Similar triangles differ from each other only by an overall scale factor. It is easy to recognize when triangles are similar because scaling leaves the angles the same, and only changes the overall size. Two triangles are similar if and only if they have the same angles. The ratio of corresponding sides is the same for both of them, and thus depends only on these angles, and not on which triangle with these angles one is thinking about. For example in Fig 2.10 we have a/b = A/B, since these are ratios of corresponding sides in similar triangles. This connection between proportionality and similar triangles leads to graphical methods for solving problems. If you look back at Fig 2.1, you can see it in perhaps a new way: our first solved problem used similar triangles, although we called it a graph, so you probably didn’t notice the triangles.

Fig. 2.9: Euclid’s Prop. I.32. The three interior angles of the triangle all appear at point C, where they clearly fit together to make two right angles (one straight angle). CD is the line through C parallel to AB. That the angles β and α which appear at C really are the same as the corresponding angles of the triangle follows from properties of parallel lines. For example, the two angles labelled β are “alternate interior angles” and hence equal.

Fig. 2.10: The triangles are similar, so the ratios of corresponding sides are equal: a/b = A/B.

By far the most important use of this idea is in similar right triangles. Since in these triangles one angle is a right angle, the sum of the other two angles is also a right angle, so that the sum of all three angles together is two right angles. The two acute angles that add to a right angle are called complementary. The triangle is thus specified completely by giving just one acute angle. The other two angles are the complement of that one and a right angle. The ratios of sides in a right triangle depend only on this one angle, and are extremely important geometric ratios to know about. In old days they were carefully tabulated in books, and now they are available on calculators. You put in the angle and get out the ratio.

Angles are frequently given Greek letter names, a continuous tradition from Hellenistic times, like θ (pronounced theta) in Fig 2.11, which shows how the ratios of sides are named. Given an angle θ, one can ask for sin(θ), cos(θ), or tan(θ) (pronounced sine of theta, cosine of theta, tangent of theta). The reciprocals of these are also possible ratios, but are less commonly used. In Fig 2.11 you can see that if we use the complementary angle, the sides we call “adjacent” and “opposite” would be switched (but the hypotenuse is of course still the hypotenuse). Thus the sine of an angle is the cosine of its complement, and the cosine of an angle is the sine of its complement. The tangent of an angle is the reciprocal of the tangent of its complement, since switching the two sides turns b/a into a/b. This reciprocal a/b is called the cotangent of θ, but we will not use it. The sine and cosine are related through a very useful identity. Referring to Fig 2.11, recalling the Pythagoras Theorem $a^2 + b^2 = c^2$, and dividing both sides by $c^2$, we have

$$\cos^2\theta + \sin^2\theta = 1 \tag{2.30}$$

Throughout the Renaissance these functions of an angle provided a living to mathematicians and military engineers. These methods could be used, for example, to determine the height of a cliff or a tower without climbing it. To determine the height b of the tower in Fig 2.11, you would pace off the distance a to the tower, and also, by sighting, determine the angle θ.

Fig. 2.11: The trigonometric ratios in a right triangle with hypotenuse c, side a adjacent to the angle θ, and side b opposite it: sin(θ) = b/c, cos(θ) = a/c, tan(θ) = b/a. Remember the mnemonic SOH-CAH-TOA: Sine is Opposite over Hypotenuse, Cosine is Adjacent over Hypotenuse, Tangent is Opposite over Adjacent.

Then since tan(θ) = b/a, you compute the height as b = a tan(θ). To use the method, of course, you need some way to compute tan(θ) from the measured θ. That was where the expertise came in.

Galileo spent a good part of his early career developing streamlined methods to make measurements and computations of this sort, using a “Military Compass” of his own invention. His students were young noblemen who might need this kind of knowledge in military campaigns. His thorough knowledge of such geometric methods was the perfect preparation for him to interpret what he saw through the telescope, when in 1609 his life was changed by this invention, and he turned it on the moon and Jupiter. He used the idea of similar triangles to measure the height, not of a tower on earth, but of mountains on the moon!
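The tower measurement described above, b = a tan(θ), is a one-line computation today. Here is a small sketch of it; the function name `height_from_sighting` and the sample measurements are hypothetical, chosen only for illustration, and the sighting angle is assumed to be given in degrees.

```python
import math


def height_from_sighting(distance, angle_deg):
    """Height b of a tower, from the paced-off distance a and the sighting angle theta.

    Uses b = a * tan(theta). The angle is given in degrees and converted to
    radians, because math.tan expects radians.
    """
    return distance * math.tan(math.radians(angle_deg))


# Hypothetical measurements: 50 m paced off, top of the tower sighted at 35 degrees.
a = 50.0
theta_deg = 35.0
b = height_from_sighting(a, theta_deg)
print(f"distance a = {a} m, angle = {theta_deg} degrees, height b = {b:.1f} m")
```

A quick sanity check is the angle 45°, where tan(θ) = 1 and the height must equal the paced-off distance.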

2.12 Angular Measures

The two sectors of a circle in Fig. 2.12 differ only by scaling: that is, although one of them is bigger, they have the same angle θ. Thus they are geometrically similar, and the ratio of the arclength to the radius is the same in both: s/r = S/R. This is much like the observation in Fig. 2.10, except that here one of the lengths is measured along the curved arc of the circle.

Fig. 2.12: The two sectors, one with radius r and arc s, the other with radius R and arc S, are similar. The ratio s/r = S/R is the radian measure of θ.

The ratio S/R determines the angle θ, and is called the radian measure of θ. For many purposes this way of measuring angle is simpler and more practical than the more familiar degree measure. In any case, both measures are in common use, and we have to be ready to use either one. In Fig. 2.13 we see the angle 90° with an arc added to make it a quarter sector of a circle. Since the circumference of the circle would be 2πR, one quarter of that, the length of the arc, would be πR/2. Thus the radian measure of this angle is π/2.

Fig. 2.13: The radian measure of 90° is π/2: a quarter circle of radius R has arc length πR/2.

By the same argument, 1/8 of the circle is the angle 45° or π/4 radians, 60° is 1/6 of the circle or π/3 radians, and 30° is 1/12 of the circle or π/6 radians. It is worth practicing the conversions for these special angles so that you can do them almost without thinking.

Since conversion of units is so important, let us do it out carefully one more time. We choose some convenient angle for which we know the measure in both systems: say the quarter circle, which is 90° or π/2 radians. Then the ratio of these two angles, which is just the ratio of an angle to itself, is the number 1, only written in a very peculiar way:

$$1 = \frac{\pi/2 \text{ radians}}{90^\circ} = \frac{\pi \text{ radians}}{180^\circ} \tag{2.31}$$

It looks as if it might have been better to say that the half circle is 180° or π radians: this is how most people remember it. In any case, we now can multiply any angle given in degrees by the number 1 in the above form to convert it to radians. Let us check it on one of the special angles we already know: we will find the radian measure of 60°,

$$60^\circ = 60^\circ \left(\frac{\pi \text{ radians}}{180^\circ}\right) = \frac{\pi}{3} \text{ radians}. \tag{2.32}$$

The units, radians and degrees, follow the rules of algebra in this computation. In particular, degrees occur in both numerator and denominator, and hence cancel, leaving just radians in the result. The units tell us we have the number 1 in the right form to make the conversion. Let us try this on an angle we haven’t converted yet, 1 radian. What is that in degrees? For this purpose we need the reciprocal of the conversion factor (which is still just the number 1, being the ratio of one and the same angle to itself),

$$1 \text{ radian} = (1 \text{ radian}) \left(\frac{180^\circ}{\pi \text{ radians}}\right) \approx 57.3^\circ. \tag{2.33}$$

By being careful to write the units along with every number, we can be sure of making the conversion correctly. In this example, radians cancel in numerator and denominator, leaving a result with the units degrees. In terms of radian measure, the angles of a triangle add up to π. In an equilateral triangle, for example, three equal angles add up to π, so each is π/3.
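The conversion by "multiplying by the number 1" translates directly into code. This is only an illustrative sketch: the constant and function names are invented here, and the conversion factors are written out explicitly to mirror the text, although Python's `math.radians` and `math.degrees` do the same job.

```python
import math

# The number 1, written as (pi radians)/(180 degrees) and as its reciprocal.
RADIANS_PER_DEGREE = math.pi / 180.0
DEGREES_PER_RADIAN = 180.0 / math.pi


def to_radians(angle_deg):
    """Convert degrees to radians: degrees cancel, radians remain."""
    return angle_deg * RADIANS_PER_DEGREE


def to_degrees(angle_rad):
    """Convert radians to degrees: radians cancel, degrees remain."""
    return angle_rad * DEGREES_PER_RADIAN


# The special angles from the text, as fractions of pi.
for deg in (30, 45, 60, 90, 180):
    print(f"{deg:3d} degrees = {to_radians(deg) / math.pi:.4f} * pi radians")

print(f"1 radian = {to_degrees(1.0):.1f} degrees")
```

Comparing the printed fractions of π with π/6, π/4, π/3, π/2 and π is a quick way to practice the special angles.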

Fig. 2.14: Special triangles, with sides of length 1 marked.

2.13 Trigonometric functions of special angles

Use Fig. 2.14 to show that

$$\sin(\pi/6) = \cos(\pi/3) = 1/2 \tag{2.34}$$
$$\sin(\pi/3) = \cos(\pi/6) = \sqrt{3}/2 \tag{2.35}$$
$$\sin(\pi/4) = \cos(\pi/4) = \sqrt{2}/2 \tag{2.36}$$
$$\tan(\pi/6) = 1/\sqrt{3} \tag{2.37}$$
$$\tan(\pi/3) = \sqrt{3} \tag{2.38}$$
$$\tan(\pi/4) = 1 \tag{2.39}$$
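A numerical check is no substitute for the geometric argument, but it is a convenient way to catch a misremembered value. The sketch below compares each exact value in Eqs. (2.34) to (2.39) with what Python's `math` module computes; the list name `SPECIAL_VALUES` and the helper `largest_discrepancy` are made up for this illustration.

```python
import math

# (label, value computed by the math module, exact value from Eqs. 2.34-2.39)
SPECIAL_VALUES = [
    ("sin(pi/6)", math.sin(math.pi / 6), 1 / 2),
    ("cos(pi/3)", math.cos(math.pi / 3), 1 / 2),
    ("sin(pi/3)", math.sin(math.pi / 3), math.sqrt(3) / 2),
    ("cos(pi/6)", math.cos(math.pi / 6), math.sqrt(3) / 2),
    ("sin(pi/4)", math.sin(math.pi / 4), math.sqrt(2) / 2),
    ("cos(pi/4)", math.cos(math.pi / 4), math.sqrt(2) / 2),
    ("tan(pi/6)", math.tan(math.pi / 6), 1 / math.sqrt(3)),
    ("tan(pi/3)", math.tan(math.pi / 3), math.sqrt(3)),
    ("tan(pi/4)", math.tan(math.pi / 4), 1.0),
]


def largest_discrepancy(values=SPECIAL_VALUES):
    """Largest absolute difference between computed and exact values."""
    return max(abs(computed - exact) for _, computed, exact in values)


for label, computed, exact in SPECIAL_VALUES:
    print(f"{label:10s} computed = {computed:.6f}   exact = {exact:.6f}")
print("largest discrepancy:", largest_discrepancy())
```

Any discrepancy that shows up should be at the level of floating-point rounding, many orders of magnitude smaller than the values themselves.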

These special angles are frequently useful in practical problems.

2.14 Small angle approximations

If θ is a small angle, then the small angle approximations apply:

$$\sin\theta \approx \theta \tag{2.40}$$
$$\tan\theta \approx \theta \tag{2.41}$$
$$\cos\theta \approx 1 \tag{2.42}$$

Here ≈ means “approximately equal.” For example, if π/6 is small, then we would find sin(π/6) ≈ π/6 = 0.5236, which is not so different from the exact value sin(π/6) = 0.5000. Of course this approximation only works if we use radian measure! In units of degrees this angle is 30, which is not at all a good approximation to 0.5000. In fact thirty degrees, or π/6 radians is not even a particularly small angle, which is why the approximation is not especially accurate even when we do it right. The meaning of “small” here is “much less than 1 radian.” The best way to get a feeling for it is perhaps to compute sin θ for various θ less than 0.1 or so. You will find that the approximation is excellent for angles as small as this, and the smaller the angle, the better.
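The suggestion to compute sin θ for various small θ is easy to follow with a few lines of code. The sketch below is one way to do it; the function name `small_angle_report` is invented for this example, and the three sample angles (π/6, 0.1 and 0.01 radians) are simply convenient choices.

```python
import math


def small_angle_report(theta):
    """Return (sin, tan, cos, relative error of sin(theta) ≈ theta); theta in radians."""
    s = math.sin(theta)
    relative_error = abs(theta - s) / s
    return s, math.tan(theta), math.cos(theta), relative_error


for theta in (math.pi / 6, 0.1, 0.01):
    s, t, c, err = small_angle_report(theta)
    print(
        f"theta = {theta:.4f}  sin = {s:.6f}  tan = {t:.6f}  "
        f"cos = {c:.6f}  relative error of sin ≈ theta: {err:.2e}"
    )
```

The relative error is a few percent at π/6 and shrinks rapidly as the angle gets smaller, which is the sense in which "small" means "much less than 1 radian."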

Fig. 2.15: The small angle approximation is the statement that the lengths AC, AD, and BD are all approximately equal. In the picture it is hard to see the difference, so it is clear that they are approximately equal. Which of these is sin θ, which is tan θ, and which is θ itself?

We can understand the small angle approximation in a picture, as seen in Fig. 2.15. That figure also alerts us to two more special angles, namely 0 and π/2, the limiting case of complementary angles as the triangle degenerates to a line. In this limit the small angle approximation is exact, and we have special values like those in the previous section, but perhaps even more important,

$$\sin(0) = \cos(\pi/2) = 0 \tag{2.43}$$
$$\cos(0) = \sin(\pi/2) = 1 \tag{2.44}$$
$$\tan(0) = 0 \tag{2.45}$$

To see this, just imagine the triangle in Fig. 2.15 collapsing to a line. The small angle approximation, in case the angle is small but not zero, is a good example of accepting a little messiness (it is not quite exact) for the sake of simplicity (it is easy). Quick, what is the tangent of 0.01 radians?

It is typical of physics to work with models that are believed to be exact, and then to cut corners to get quick, informative answers. It is precisely because the underlying theory is exact that we can use approximations with confidence: we can’t be very far off! And if necessary, of course, we can always calculate more exactly. We will use the small angle approximation a lot in the next chapter.

Problems

Many of these problems use terms and concepts not defined yet. We put them here to point out that there are things one can say without knowing any details, just using the notions of proportionality and dimension.

Proportionality

For each of problems 2.1-2.5 include both a graphical solution, like Fig 2.1, and a numerical solution.

2.1 In an electromagnet, the magnetic field B is proportional to current I. If B is $5 \times 10^{-2}$ Tesla when I is 3 Amperes, what current do you need for a field of 0.2 Tesla?

2.2 In a circuit, the current I is proportional to applied voltage V. Suppose that when V = 0.4 Volts, I = 5 milliamperes. What current will you have if V is 1.1 Volts?

2.3 The energy E collected by a solar cell is proportional to time t. If it collects 900 Joules in 5 seconds, how long will it take to collect 3000 Joules?

2.4 Pressure P in a liquid is proportional to depth d. If the pressure is $1.2 \times 10^{5}$ Pascals at a depth d = 80 cm, what will be the pressure at a depth of 2 meters?

2.5 Electrical resistance R in a wire is proportional to length L. If a 2.4 m wire has a resistance of 0.2 Ohms, how long should a wire be to have a resistance of 1 Ohm?

Dimensions and Units

In the following problems, understand x to be a length [L], t to be a time [T], and m to be a mass [M]. For each new symbol, determine its dimension and its SI unit. Each problem depends on the one before it, so work carefully!

2.6 x = vt, so v has dimension ... and SI unit ...

2.7 v = gt, so g has dimension ... and SI unit ...

2.8 F = mg so F has dimension ... and SI unit ...

2.9 $P x^2 = F$, so P has dimension ... and SI unit ...

2.10 P = S/t so S has dimension ... and SI unit ...

Conversion of Units

Another system of units, similar to the SI system, is the cgs system. In the cgs system, the units for [L], [M], and [T] are the centimeter, the gram, and the second respectively (hence cgs). One must frequently figure out how to make these conversions. Give the unit in each system, and the conversion factor for converting in each direction, for the following physical quantities.

2.11 Area.

2.12 Volume.

2.13 Force. The dimension of force is $[M L T^{-2}]$. The SI unit is called the Newton, and the cgs unit is called the dyne, so this problem asks for the conversion factor for Newtons to dynes, and vice versa.

2.14 Energy. The dimension of energy is $[M L^2 T^{-2}]$. The SI unit is called the Joule, and the cgs unit is called the erg, so this problem asks for the conversion factor for Joules to ergs, and vice versa.

2.15 Mass density. The dimension of mass density is $[M L^{-3}]$.

2.16 Viscosity. The dimension of viscosity is $[M L^{-1} T^{-1}]$. The SI unit of viscosity is the Pascal-second, and the cgs unit is the Poise, so figure out how to convert Pascal-seconds to Poises, and vice versa. The viscosity of water is about 1 centipoise. What is that in Pascal-seconds?

Scaling

2.17 In the 19th century, the body mass index (BMI), a kind of numerical statistic associated with the human body, was defined as BMI = $M/h^2$, where M is the mass and h is the height in SI units. One sometimes hears that a healthy BMI is between 20 and 25.

(a) Using Galileo’s ideas, say how the BMI should scale with linear size. What does that suggest about the usefulness of the BMI in identifying well proportioned bodies? Write a few sentences about this.

(b) A young boy had the following BMIs at different ages:

Age    4      9      13
BMI    17.8   21.0   25.1

People who knew him thought he was always a bit overweight, but his BMI changed a surprising amount. What is going on? Hint: use your answer from (a).

2.18 A water filtering system uses a fixed volume V of a material that absorbs water contaminants onto its surface. The rate at which it removes contaminants is proportional to the surface area of the material, so we would like to make this area as large as possible.

(a) If the material were made into a sphere, it would have minimal surface area. What is this area? How does it scale with volume V?

(b) Suppose instead that the material is made into N identical smaller spheres, each with volume V/N. What is the surface area of one such sphere? What is the total surface area of all N spheres? In particular, how does the total surface area depend on N? Why should the material be in the form of a fine powder?

Position, Time and Velocity

2.19 The most famous paradox of Zeno says that Achilles, who is a fast runner, cannot catch a slow moving tortoise if the tortoise has a head start. The argument goes like this. Suppose Achilles can run 10 m/s and the tortoise only 1 m/s, but the tortoise has a 10 m head start. Then it takes 1 s for Achilles to cover this 10 m, but in that time, the tortoise has moved on another 1 m. In the time 0.1 s it takes Achilles to cover this distance, the tortoise has moved on yet again, by 0.1 m. In the time it takes Achilles to cover this distance, the tortoise has moved on ... You get the idea. It looks as if Achilles will never catch the tortoise!!

(a) Represent the positions of Achilles $x_A$ and the tortoise $x_T$ as linear functions of time, as in Section 2.6.

(b) Solve for the time $t_m$ at which Achilles catches the Tortoise (by this method it looks as if he does catch him).

(c) Express $t_m$ as a decimal fraction, and point out its relationship to Zeno’s paradox.

2.20 Two trains are heading towards each other on the same track. They are 60 miles apart, one going 40 mph and the other 20 mph.

(a) How much time is there to avert the trainwreck? Do the problem two ways, a hard way and an easy way. Notice that the two speeds are of course positive, because speed is always positive, but the trains are travelling in opposite directions, so the velocities must have opposite signs.

(b) If you didn’t make a graph in part (a) showing what happens, modelled on Fig 2.7, make one now.

(c) Where does the wreck occur, if it does?

2.21 Carry out the project suggested near the end of Section 2.6: that is, find $x_m$ algebraically in terms of $P_A$, $P_B$, $v_A$, and $v_B$, and simplify it as much as possible. Interpret its common sense meanings in words.

Power Laws

Test the data below by making a graph to see if there is a power law relationship $y = C x^{\alpha}$ for some constant C. If there is a power law, determine the power α.

2.22
x     2        5        10       20
y     56.6     89.4     126.5    178.9

2.23
x     2        5        10       20
y     0.283    1.12     3.16     8.94

2.24
x     2        5        10       20
y     1.8682   1.4192   1.1527   0.9363

2.25 In The Almagest, Ptolemy’s system of the world, hundreds of stars are listed and their brightnesses given as visual magnitude 1, 2, 3, 4, 5, or 6. Here 1 is the brightest and 6 is the dimmest. The number of stars in each category is given in the table below. It looks as if Ptolemy was not interested in listing all the dim stars in magnitude 5 and 6, since there are many more than the ones he names. The brighter stars, though, are fairly represented in the table. The number clearly grows with visual magnitude for magnitudes 1-4. Is it a power law? Show a graph that helps decide this question.

Magnitude          1     2     3     4     5     6
Number of Stars    15    45    206   476   217   49

Ratios in Geometry

2.26 The meter, still the SI unit of length, was originally defined so that the distance from the North Pole to the Equator along the longitude line through Paris should be $10^7$ meters. (a) What is the radius of the Earth in meters, assuming the Earth is a sphere? (b) What are the advantages and disadvantages of defining the length standard in this way?

2.27 In the 3rd century B.C.E. Eratosthenes measured the size of the earth by comparing astronomical sightings with ground based measurements. The story as it comes down to us (probably oversimplified) is that the sun at noon on midsummer’s day shines directly down into the wells in Syene, on the Nile. Therefore the sun is directly overhead. But in Alexandria, some 500 miles to the north, the sun is not overhead, but rather 7½° south of the overhead position. (a) Explain how the shadow of a vertical stick can be used to measure the sun’s angle at Alexandria. (b) Explain the use of Fig 2.16 in the argument, and use it to estimate the size of the Earth from the Alexandria data.

2.28 A tower is 200 m away from you, over level ground. The top of the tower appears 20° above the horizontal from where you stand. How high is the tower? How might your own height affect your answer? Include an informative sketch that shows the idea of the computation.

2.29 A square is constructed on the hypotenuse of a right triangle with a 30° acute angle in it. What is the ratio of the area of the square to the area of the triangle?

Fig. 2.16: Noon rays of sunlight at Alexandria (A) and Syene (S). The angle θ is somewhat exaggerated here. It is actually only 7½°. The Earth’s equator is shown as a dashed line.

2.30 On Archimedes’ tomb was engraved a sphere inscribed in a cylinder. The cylinder was a right cylinder over a circular base, like a tin can, and to say the sphere was “inscribed” means that the cylinder was just big enough to contain the sphere, touching it at the top and bottom, and around the equator. Sketch this famous figure, and show that Archimedes’ own formulae imply that the sphere has 2/3 the volume of the cylinder.

2.31 Explain how the Pythagoras theorem and the special triangles in Fig 2.14 imply the values of the trigonometric functions for π/6, π/4, and π/3. Which of these trigonometric values, would you say, are the easiest to remember?

Chapter 3

Geometrical Optics

Around the year 1420 a few painters in Italy began using geometry to plan and design their paintings. The result was an artistic revolution, called perspective painting. They had discovered, or re-discovered, a mathematical description of how we see, and in particular why things appear large or small to us. What they realized – and they had Euclid’s Optics as an authority for it – is that the size we see is really angular size.

This idea is a bit tricky, because seeing has an important psychological component too. Something large appears small if it is far away, to be sure, but if we know it is large, we make a mental adjustment. The most dramatic example of this phenomenon must be the appearance of the full moon. On the horizon at nightfall it looks enormous, but later, high in the sky at midnight, it looks much smaller. And yet its angular size never changes! (It is hard to believe this until you have measured it yourself. As you read this chapter, think of a way to measure the angular size of the moon, and try it out on the next full moon.) Why the moon seems to change size is still a subject for debate – perhaps when we see the moon on the far horizon, where large things should look small, being so far away, and even at that great distance the moon doesn’t look small, we interpret it as meaning that the moon must be huge. Later, when it is high in the sky, and we have no cues about its great distance, we have no reason to believe it is large, and thus it looks smaller. (This is not the only theory.)

The example of the full moon is the exception, though. What the Renaissance painters discovered is that on the whole, we do see by geometry. The experimental proof was in their paintings. Constructed by geometry, they looked “real”. One of the first examples, the Trinity fresco of Masaccio in Santa Maria Novella in Florence, painted around 1425, can still be seen today. This fresco astounded Masaccio’s contemporaries – they knew it was painted on a flat wall, but, as they said, it was like looking through the wall – so realistic was the illusion of depth and distance.

3.1 Angular Size

The secret to painting realistically is to make everything in the painting have the same angular size it would have if the scene were real. So what exactly is angular size? Well, the angular size of the moon is about one half degree, or about 0.0087 radian. This means the angle formed at the eye by rays of light coming from the extreme edges of the moon is 0.0087 radian. The situation is illustrated in Fig. 3.1 for the angular size of any sphere. The angle θ at the eye is also called the angle subtended by the sphere. In the small angle approximation it is θ ≈ 2R/D, where R is the radius of the sphere and D is the distance to the sphere. More generally, for any object, not necessarily a sphere, the angle subtended at the eye, in small angle approximation, is the height H of the object divided by its distance D, as in Fig. 3.2. This theory assumes that light travels in straight lines, and that we can use Euclidean geometry!

Notice how the small angle approximation was used in Figs. 3.1 and 3.2 to get the main idea into a simple form. The angular size of something is just its linear size divided by its distance, 2R/D in the case of a sphere of radius R, and H/D in the case of something of height H. It would work the same way for an object of width W: it would subtend angle W/D at the distance D. This makes angular size a truly simple concept. The formula is not quite exact, however! The actual angular size of the moon is

$$\theta_{\text{moon}} = 2\sin^{-1}\left(\frac{R_{\text{moon}}}{D_{\text{moon}}}\right) \approx \frac{2R_{\text{moon}}}{D_{\text{moon}}} \tag{3.1}$$
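To see how little is lost in the small angle approximation, one can evaluate both sides of Eq. (3.1) numerically. This sketch assumes rounded values for the moon's radius and mean distance (about 1737 km and 384,400 km); these numbers are not given in the text and are supplied here only for illustration, as are the function names.

```python
import math

# Assumed, rounded values (not from the text): moon radius and mean distance in km.
R_MOON_KM = 1737.0
D_MOON_KM = 384_400.0


def angular_size_exact(radius, distance):
    """Exact angle subtended by a sphere: theta = 2 * asin(R / D), in radians."""
    return 2.0 * math.asin(radius / distance)


def angular_size_small_angle(radius, distance):
    """Small angle approximation: theta ≈ 2R / D, in radians."""
    return 2.0 * radius / distance


exact = angular_size_exact(R_MOON_KM, D_MOON_KM)
approx = angular_size_small_angle(R_MOON_KM, D_MOON_KM)

print(f"exact:       {exact:.8f} rad = {math.degrees(exact):.4f} degrees")
print(f"approximate: {approx:.8f} rad = {math.degrees(approx):.4f} degrees")
print(f"relative difference: {abs(exact - approx) / exact:.1e}")
```

With these assumed values both formulas give an angle of roughly half a degree, and they differ only far beyond the precision of any naked-eye measurement.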

Is it worth using the exact formula to figure out the angular size of something, including the symbol $\sin^{-1}$ that you might not even understand? For many purposes, no. We will usually assume we are describing something that is far enough away so that small angle approximation applies. Thus we will often say = (equals) where strictly speaking we should say ≈ (approximately equals).

Fig. 3.1: The angle θ subtended at the eye E by a sphere of radius R at a distance EO = D can be computed by studying this figure. In small angle approximation it is θ = 2(θ/2) ≈ 2 sin(θ/2) = 2R/D, just the diameter of the sphere divided by its distance.

Fig. 3.2: The angle θ subtended at the eye E by an object of height H at a distance D is θ = 2(θ/2) ≈ 2 tan(θ/2) = 2(H/2)/D = H/D.

Now suppose you want to represent the full moon realistically in a painting. How should you paint it? Well, it should have angular size 0.0087 to the viewer. That means it should be painted with a height H, or width W (or just call it diameter, since the moon is round) such that H/D = 0.0087, where D is the distance of the viewer. This points out a problem with perspective painting. The whole construction must assume that the viewer is at some definite distance D. In most locations where you would put a painting, though, there is no way to force the viewer to stand in a particular place. Fortunately this turns out not to be a critical consideration. The psychological aspects of seeing make us quite tolerant about being in the wrong place: the realism effect still works. So just assume a reasonable distance for D, like 2 meters. Then the diameter of the moon in the painting should be 2 × 0.0087 = 0.0174 m = 1.74 cm.

Of course most paintings don’t try to represent things life-size. More often the whole scene is scaled so that things only keep the same relative proportion to each other, but appear, say, 2/3 the size they actually would. Again, we adjust to this psychologically, and we still say the painting looks realistic. We even seem to have a particular fondness for miniature paintings. Perhaps it is a kind of nonverbal joke that things look both real and not real (because they are small).

Our way of seeing is all about proportions. That is the secret of perspective painting. Suppose a scene contains two human figures, each about the same height H in reality, but at different distances, $D_1$ and $D_2$.
Then in the real scene they have angular sizes $\theta_1 = H/D_1$ and $\theta_2 = H/D_2$. Their relative angular sizes are therefore $\theta_1/\theta_2 = D_2/D_1$ (the common factor H cancels out in the ratio). Note common sense: the nearer figure subtends the bigger angle. This same proportion must be observed in the painting, where the figures will be painted with heights $H_1$ and $H_2$, and viewed at distance D, so that they have angular sizes $H_1/D$ and $H_2/D$. The ratio of these is $H_1/H_2$, and it must agree with $D_2/D_1$ as found above: therefore

$$\frac{H_1}{H_2} = \frac{D_2}{D_1} \tag{3.2}$$

For instance, if the second figure is twice as far away in the real scene, so that $D_2/D_1 = 2$, then the height of the second figure in the painting must be only half that of the first figure in the painting, so that $H_1/H_2 = 2$. The second figure looks smaller in the real scene because it is farther away, and it looks smaller in the painting because it is painted smaller (but it is not farther away). The relative angular size, however, is the same in both cases. The angle H/D gets smaller as D gets larger (at fixed H), but it also gets smaller as H gets smaller (at fixed D). That is the phenomenon that is being exploited here.

Everyone understands this qualitatively, but we are giving a genuinely mathematical description. This extra step requires an effort, of course, and the masters of Renaissance painting took this idea much farther than we are taking it here. In doing so they were laying the foundation for the rebirth of theoretical physics.
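The two painting rules of this section, painted size = angular size × viewing distance and H1/H2 = D2/D1, can be put into a few lines of code. The sketch below uses the moon example from the text (angular size 0.0087 radian, viewer at 2 m); the two figures at 3 m and 6 m are hypothetical distances chosen for illustration, and the function names are invented here.

```python
def painted_size(angular_size_rad, viewing_distance):
    """Size to paint an object so that it subtends the given angle at the viewer.

    Small angle approximation: size = angle * distance, in the units of the distance.
    """
    return angular_size_rad * viewing_distance


def painted_height_ratio(d1, d2):
    """Ratio H1/H2 of painted heights for equal-height figures at real distances d1, d2."""
    return d2 / d1


# The full moon, painted for a viewer standing 2 m from the canvas.
moon_diameter_m = painted_size(0.0087, 2.0)
print(f"painted moon diameter: {moon_diameter_m * 100:.2f} cm")

# Two figures of equal real height, at hypothetical distances of 3 m and 6 m.
ratio = painted_height_ratio(3.0, 6.0)
print(f"the nearer figure must be painted {ratio:.1f} times as tall as the farther one")
```

Only the ratio of the distances enters the second rule, which is why a painter can scale the whole scene up or down without spoiling the realism.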

3.2 The Eye

In Renaissance theories of how we see, the eye was just a point, like the point E in Fig. 3.1. To understand optics, though, it is essential to know how the eye works in more detail than that! In Fig. 3.3 the point E has been replaced by a schematic diagram of the eye, with its own extended structure, looking at an object of height H at a distance D, so that the object has an angular size θ = H/D. We suppose rays of light from the extremities of the object enter the pupil of the eye and hit the retina as shown. This amounts to a theory of how the eye works. WARNING: there are significant problems with this theory already! Try to spot them – we will improve the theory as we go.

The image occupies the region H' on the retina. How big is the image H'? Using the small angle approximation, and assuming that the angle θ outside the eye is also the angle inside the eye, as it would be if the rays were straight lines, we have

$$\theta = \frac{H'}{D'} \tag{3.3}$$

where D' ≈ 2.5 cm is the distance from the pupil of the eye to the retina, i.e., just the diameter of the eye. Therefore

$$H' = D'\theta \tag{3.4}$$

This is the kind of statement that we want to be able to read with understanding. It says that the size of the image (H') is proportional to the angular size (θ) of the object, with a constant of proportionality D', the diameter of the eye. This is what we have been claiming right along: what we see (H') is proportional to angular size (θ), i.e., $H' \propto \theta$. It isn’t a statement about numbers: it is a statement about a relationship, in this case just the simple relationship of proportionality. Perceived size and angular size are proportional. This is a mathematical theory.

Fig. 3.3: A first attempt at a theory of the eye: an object of height H at distance D subtends the angle θ at the pupil and forms an image of height H' on the retina, a distance D' behind the pupil. WARNING: this is a useful start, but even the physics is not this simple, not to mention the eye!

Cameras work the same way: the images that form on the film, or on the detector array of a digital camera, also have a linear size proportional to the angular size of the object being photographed. Of course the eye is not a simple camera, and we don’t literally sense the pattern on the retina (in particular we don’t see the images upside down, as they actually form!) But these images are the raw material of seeing, and their sizes reflect angular sizes. That information is largely retained in the mental processing of vision.
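The proportionality H' = D'θ is simple enough to evaluate directly. The sketch below uses the eye diameter D' ≈ 2.5 cm quoted in the text together with θ = H/D; the function name and the sample object (a 1.8 m tall person seen from 10 m) are hypothetical, and the same first-attempt, straight-ray theory of the eye is assumed.

```python
EYE_DIAMETER_CM = 2.5  # D', the pupil-to-retina distance quoted in the text


def retinal_image_size_cm(height, distance, eye_diameter_cm=EYE_DIAMETER_CM):
    """Image size H' = D' * theta on the retina, in cm.

    The angular size is theta = height / distance (small angle approximation);
    height and distance may be in any unit, as long as it is the same for both.
    """
    theta = height / distance
    return eye_diameter_cm * theta


# Hypothetical example: a person 1.8 m tall, seen from 10 m and from 20 m.
for d in (10.0, 20.0):
    print(f"distance {d:4.1f} m -> image height {retinal_image_size_cm(1.8, d):.3f} cm")

# The full moon, with angular size 0.0087 radian.
print(f"full moon -> image diameter {EYE_DIAMETER_CM * 0.0087 * 10:.2f} mm")
```

Doubling the distance halves the image, exactly as the proportionality says, and the image of the full moon comes out as only a fraction of a millimeter across.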

3.3 Binocular Vision and Parallax

Another way we use our ability to measure angle with our eyes is to measure distance. When we fix attention on a nearby point P with both eyes, the eyes individually see the point P in slightly different directions. This difference, the parallax angle, is a measure of how close P is, as you can see in Fig. 3.4. In fact, if P is a distance D away, then the parallax angle is θ = H/D where H is the distance between the eyes, about 10 cm. Since H is fixed by our anatomy, a measure of θ is a measure of D (or, you might say, of 1/D).

Fig. 3.4: Two eyes, separated by H, look at a point P at distance D. They do not agree on the angular position of P, differing by the angle θ, called the parallax angle. This difference shows up as an offset on the retina. The top eye does not image P at the point A, as the bottom eye does, but at a different point B. This offset is thus a measure of the angle θ. Since θ = H/D and H is the fixed distance between the eyes, the visual system has measured the distance to P, namely D = H/θ.

How do the eyes measure θ? Fig. 3.4 points out that if the eyes look straight ahead, then the two images of P are at points on the retina that don’t correspond. In principle the brain could process the information in this form, but actually, in this configuration we see a double image and we don’t get very good distance information. When we bring the two images together, by crossing our eyes, muscles turn our eyes toward each other until the point P that we are paying attention to is imaged at corresponding points on the two retinas. How far we have to turn our eyes is a measure of θ. We sense it by sensing the strain in the muscles that move the eyes, and the eyes provide the feedback so that we know when we have turned far enough. This complicated set of sensations is integrated in the brain into a sense of distance.

For distant objects we don’t have to use these muscles – the parallax angle is essentially zero, and both eyes can look in the same direction. That is, the light arriving from a distant point comes along parallel lines.
Starlight is certainly like this, for example, because stars are very distant, but even something 10 meters away has such a small parallax that our visual system can’t measure it. Measuring distance by measuring parallax doesn’t work for D much bigger than H, because then all you can say is that θ = H/D ≈ 0, and if all you can say is that θ ≈ 0, then all you can say is that D is essentially infinite. Here “infinite” just means “much bigger than H,” i.e. much bigger than 10 cm. We can measure the distance to objects within a couple of meters by this method, but for objects farther away we must use other cues.

In other situations where parallax is used for measuring distance, not unconsciously with the eyes, but by sighting and triangulation, the way surveyors do it, H is called the “baseline” for the measurement. What we are pointing out here is that if you can’t measure the parallax angle θ very accurately, then you can’t determine distances D that are much bigger than the length of the baseline H. Surveyors work hard to measure θ accurately, and so extend the usefulness of the method to larger D. Astronomers do too.
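The claim that parallax fails for D much bigger than H can be made concrete with a small Python sketch. It assumes the angle θ is only known to within some fixed uncertainty; the value 0.005 rad used here is a made-up illustration, not a measured property of the visual system, and the function names are hypothetical.

```python
def parallax_angle(baseline, distance):
    """Small-angle parallax theta = H / D, in radians."""
    return baseline / distance


def distance_bounds(baseline, distance, dtheta):
    """Range of distances consistent with a parallax known to within +/- dtheta."""
    theta = parallax_angle(baseline, distance)
    nearest = baseline / (theta + dtheta)
    # If theta is no larger than its uncertainty, D is unbounded above.
    farthest = baseline / (theta - dtheta) if theta > dtheta else float("inf")
    return nearest, farthest


H = 0.10        # m, the eye separation quoted in the text
DTHETA = 0.005  # rad, hypothetical angular uncertainty (an assumption)
for D in (0.5, 2.0, 10.0, 50.0):
    lo, hi = distance_bounds(H, D, DTHETA)
    print(f"true D = {D:5.1f} m  ->  between {lo:.2f} m and {hi:.2f} m")
```

For nearby points the two bounds nearly coincide. As D grows, the upper bound runs away, and once θ is smaller than its own uncertainty all one can say is that D is "essentially infinite".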

3.4 Wide Open Pupils

Up until now, we have treated the pupils of the eye as if they were very small, so that only one well defined ray comes through, intersecting the retina at a well defined point. A very small pupil is sometimes called a “pinhole”, and a pinhole camera can be built that actually works with a pinhole in place of a lens, just as these diagrams suggest. If our eyes really were like this, no one would need glasses, certainly an advantage. In bright sunlight you may see more clearly, even if you normally need glasses, just because your pupils stop down and are more like pinholes.

The disadvantage of pinholes for eyes is that, under normal lighting conditions, very little light would come through them. Of course under these conditions our pupils open up to let in more light, but that means the imaging system has to be more complicated than in Fig. 3.3. We propose an improved model for the eye in Fig. 3.5, where all the rays from a distant star are imaged at the same point, so that the star is seen as a nice sharp point and not as a smudge. As we pointed out in the last section, rays of light from a distant point are parallel. But then, since the rays intersect in a single point on the retina, they cannot remain parallel inside the eye: they must change direction as they enter the eye. The surface of the cornea of the eye, the outer transparent layer where light enters, has a precisely determined shape to be sure that this focussing happens correctly. We have not tried to show the details of this shape in the figure.

Fig. 3.5: Eye receiving light from a distant star. WARNING: this is still very schematic.

Fig. 3.6: A myopic eye.

If the cornea has the wrong shape, an extremely common condition, then the result may be more like Fig. 3.6, where the image of the star is spread over a noticeable area on the retina. This eye is myopic, near-sighted. As we will see, if the object were brought nearby, then the image would move back toward the retina, and for a near enough object it would be imaged on the retina. This is the experience of near-sighted people, who can at least see nearby things clearly. A far-sighted eye has the opposite problem – the image is behind the retina, and bringing the object closer only causes the image to move even further back, making the focussing discrepancy worse. Laser surgery on the cornea can now change the shape of the cornea and correct these focussing errors. We will look at the geometry of this situation in more detail later.

Fig. 3.7: The good eye of Fig. 3.5 without an adapting lens. As the object point P moves closer, the image moves back, off the retina.

3.5 The Lens of the Eye

By the remarks of the preceding section, the good eye in Fig. 3.5 should have trouble seeing nearby objects! It can see a star as a sharp point, to be sure, but if an object comes close, like the point P in Fig. 3.7, the image moves back and forms behind the retina. As you can tell in Fig. 3.7, the retina itself gets a spread out, blurry image of P. Our eyes have an adaptive feature to help us see nearby objects clearly, the lens, a transparent component of the eye connected to muscles that can change its shape. The cornea has a fixed shape, and that is the problem. The lens, inside, can change its shape and thus change the direction of the rays, making them focus on the retina as shown schematically in Fig. 3.8. We do not attempt to show the lens or the actual path of the light inside the eye – but the rays cannot be simple straight lines. They must make additional deviations, just as they do at the cornea in entering the eye. These additional deviations are small corrections to bring the light to a good focus on the retina.

Fig. 3.8: The good eye of Fig. 3.5 with a lens (only the effect of the lens is shown).

We can sense the muscular strain in the lens muscles when we focus on something nearby, and this is another visual cue we have about distance. Of course there is a limit to how much correction the lens can apply. If you try to look at a point too close to your eye, you will not be able to see it clearly. It is a natural consequence of aging that the lens becomes less capable of applying this correction, so older folks have to get more elaborate external corrective lenses to make up for the failing internal one.

If you take the lens off a camera, the camera is useless: no image forms. But if your lenses were surgically removed, you could still see (sort of). The reason is that the analogue of the camera lens is really the cornea, the outer surface of the eye, and you would still have that. That is the main component of our image-forming system. The shape of the cornea wouldn’t be right, because the construction of the eye assumes the lens will be in the system, but you could wear glasses to correct for its absence. Our “lens” is an auxiliary part of the system that cameras don’t have. By the way, how do cameras solve the problem that the lens solves for the eye?

3.6 Refraction

Image formation in Fig. 3.5 depends on the bending of the rays at the air-cornea interface. This is the phenomenon of refraction. Now we look at refraction in more detail. The basic phenomenon is shown in Fig 3.9. A light ray passes from one transparent material into another: it makes the angle θ on one side and the angle θ' on the other side, measured from the normal direction. (“Normal” means perpendicular to the interface.) The two angles are different, but they are connected by a simple rule called Snell’s Law, discovered by the Dutch mathematician Willebrord Snell in the early 1600’s.

Fig. 3.9: Refraction at a horizontal interface. A light ray changes direction as it goes from the top medium, characterized by index of refraction n, to the bottom medium, characterized by index of refraction n'. The two direction angles are measured from the normal direction, perpendicular to the interface, the dashed vertical line. The picture would look exactly the same if the ray were going the other direction, from bottom to top.

Snell must have carefully measured the angles θ and θ' on both sides, and then made various guesses about the data that he collected. The table below shows what he might have measured at an air-water interface, and how he might have made his discovery. The angles are in radians.

θ        θ'       θ/θ'    sin θ    sin θ'   sin θ/sin θ'
0.300    0.224    1.34    0.296    0.222    1.33
0.600    0.439    1.37    0.565    0.425    1.33
0.900    0.630    1.43    0.783    0.589    1.33
1.200    0.777    1.55    0.932    0.701    1.33

As the angle θ in the first column of the table increases, the second angle θ' also increases. You might guess that there is a simple proportionality between the two angles, and in fact the third column shows that they are roughly proportional, but the ratio isn’t quite constant, and to be proportional, they would have to be related by a constant. Now consider the sines of the angles in the fourth and fifth columns, instead of the angles themselves, and aha! The sines are proportional. Their ratio, in the last column, is a constant, which turns out to be 1.33. This number describes refraction at an air-water interface! It is called the relative index of refraction of the interface. More generally, Snell’s law says that this works for any interface: the sines of the angles are proportional in refraction, and the relative index depends on which materials you use. The simplest way to write the law is

n sin θ = n' sin θ' (3.5)

where n and n' are the indices of refraction of the two materials individually. Solving for the ratio, sin θ/sin θ' = n'/n, we see that the relative index of refraction is the ratio of the indices of refraction of the two materials. Thus every transparent material is characterized by its index of refraction, and interfaces between one material and another are characterized by their ratio.

The table below shows typical values of the index of refraction for a few materials. Since the experiment only measures the ratios, we get to choose some special transparent medium and give it the index 1 by definition. Initially air was given this special status. But as the theory developed to describe more phenomena than just refraction, it became clear that a better choice would be to give the vacuum the index n = 1 by definition, and then, by measurement, air under normal room conditions has index slightly greater than 1, as shown in the table. For most purposes you can take the index of refraction of air to be 1.

Medium    n
Air       1.00028
Ice       1.31
Water     1.33
Glass     1.5-2.0

As the table implies, there are many kinds of glass, and they differ quite a lot in their index of refraction. Also, when water freezes, its index of refraction changes slightly. In fact the index of refraction for a given material depends on temperature, and also on color(!), so the numbers in the table are merely representative. The index of refraction turns out to be a very interesting property, and summarizes in an average way quite a lot about how the atoms of a material interact with light. That is not at all obvious just from the phenomenon of refraction alone. For hundreds of years this index was measured for all kinds of materials under all kinds of conditions without any deep knowledge of why it had the values it did. The experimental value was enough to characterize a material for optical purposes, and so it was worth measuring, even if it was, in a deeper sense, a mystery.

In fact, Snell’s Law is a good example of a mysterious proportionality in Nature, with the generality that is typical of physics. It is an idea that is useful across the sciences. Biologists are aware of the index of refraction in microscopy, and geologists use it to identify minerals. Chemists use it to characterize compounds. For physicists it is just one manifestation of a richly detailed theory of matter that is now quite well understood. This old idea from 17th century Holland is now woven into a much larger theoretical framework of atoms, molecules, crystals, and electric fields. But even without knowing any of this, we will find Snell’s Law has interesting things to say about how the eye works.
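The discovery described above, that the ratio of the angles drifts while the ratio of the sines stays fixed, can be replayed in a few lines of Python. This sketch assumes a relative index of exactly 1.33 for the air-water interface and generates the "measured" angle θ' from Snell's Law itself, so it illustrates the pattern rather than reproducing real data; the computed θ' values may differ from the table of angle measurements in Section 3.6 by a unit in the last digit.

```python
import math

N_REL = 1.33  # relative index for air -> water, as quoted in the text


def refracted_angle(theta, n_rel=N_REL):
    """Angle theta' in the second medium, from sin(theta) = n_rel * sin(theta')."""
    return math.asin(math.sin(theta) / n_rel)


def snell_row(theta, n_rel=N_REL):
    """One table row: theta, theta', theta/theta', sin theta, sin theta', sine ratio."""
    theta_p = refracted_angle(theta, n_rel)
    return (theta, theta_p, theta / theta_p,
            math.sin(theta), math.sin(theta_p),
            math.sin(theta) / math.sin(theta_p))


for theta in (0.3, 0.6, 0.9, 1.2):  # angles in radians
    print("  ".join(f"{value:.3f}" for value in snell_row(theta)))
```

The third number in each printed row creeps upward with θ, while the last one does not change at all.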

3.7 Focal Length

Our theory of how the eye works in Fig. 3.5 is not much of a theory – it looks more like a cartoon. This figure has more content, though, than you might think, if we just use geometry and Snell’s Law to dig that content out. We redraw the essential idea in Fig. 3.10, putting the star on the horizontal axis, for simplicity, and only drawing two parallel rays. We assume the cornea has a spherical shape, of radius R, and that the rays intersect behind the cornea a distance f, the focal length, the distance the light travels to reach a focus.

Fig. 3.10: Two parallel rays, DB and EC, are refracted at a spherical interface of radius R and are brought to a focus at the point A, a distance f behind the interface.

This will be a more elaborate argument than anything we have done up to now. The steps are not difficult individually, however – try to check each step of the argument, and then see how it all fits together. We start with Snell’s Law in small angle approximation: in that case Eq (3.5) becomes

nθ = n'θ' (3.6)

This says the two angles in refraction, θ and θ', are simply proportional, and since in that case θ = (n'/n)θ', the constant of proportionality is just the relative index of refraction. You can check this in the first table in Section 3.6, third column, even though the angles aren’t particularly small. For an angle as small as 0.3 the approximation is quite good, and it just gets better for smaller angles (not shown in the table). Thus for small angles, Snell’s Law is particularly simple.
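How good is the small angle form of Snell's Law? The following Python sketch compares θ' from Eq (3.6) with θ' from the exact law, Eq (3.5), for an air-water interface. The function names are illustrative, and the indices n = 1 and n' = 1.33 are the values used in Section 3.6.

```python
import math


def refracted_angle_exact(theta, n, n_prime):
    """theta' from the exact law, n sin(theta) = n' sin(theta')."""
    return math.asin(n * math.sin(theta) / n_prime)


def refracted_angle_small(theta, n, n_prime):
    """theta' from the small angle form, n theta = n' theta'."""
    return n * theta / n_prime


def relative_error(theta, n=1.0, n_prime=1.33):
    """Fractional error of the small angle estimate of theta'."""
    exact = refracted_angle_exact(theta, n, n_prime)
    return abs(refracted_angle_small(theta, n, n_prime) - exact) / exact


for theta in (0.03, 0.1, 0.3, 0.6, 1.2):  # radians
    print(f"theta = {theta:4.2f}  relative error = {relative_error(theta):.4f}")
```

At θ = 0.3 the error is under one percent, and it shrinks rapidly for smaller angles. At θ = 1.2 the approximation is off by well over ten percent.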

In Eq (3.6), with reference to Fig 3.10, the index n refers to the medium on the right and the index n' refers to the medium on the left. If the sphere is an eyeball, then it is air on the right and eyeball on the left (or ‘vitreous humor’, the fluid inside the eye). If n is the index of refraction of air, we would have n ≈ 1, but let us keep calling it n for generality. Then at the end, just by taking n = 1.33 instead of 1, we can also tell what happens when we open our eyes under water, at a water-cornea interface. To begin checking the argument, locate the angles θ and θ' in Fig. 3.10 and be sure you see how they really are the usual angles in refraction.

Now we read off some geometrical relationships from Fig. 3.10.

OB = R (3.7)
AC = f (3.8)
BC = R sin θ = f tan α (3.9)
α = θ − θ' = θ (1 − n/n') = θ (n' − n)/n' (3.10)

It takes a little work to find all these relationships in the figure – think of it as a puzzle and try to solve the puzzle. The last one, Eq (3.10), uses Snell’s Law, in small angle approximation, Eq (3.6). Now, since the angles θ and α are small, Eq (3.9) simplifies, by the small angle approximation, to Rθ = fα, or

f = θR/α = (n'/(n' − n)) R (3.11)

This is a result! It predicts where the rays focus, i.e. the distance f, just knowing the radius of curvature R of the cornea and the indices of refraction n and n'. That is all we need in order to evaluate the right hand side.

There is a very interesting and peculiar thing about Eq (3.11). The final result, on the far right, does not have θ in it. The angle θ cancels out in the ratio θ/α, which occurs in an intermediate step of the calculation. This means that all the parallel rays like DB that we didn’t show in the figure determine the same focal length f, i.e. they all intersect ray EA at the same point A. Each such ray has its own θ, as you can verify by sketching in a typical parallel ray between the two that are already there, for example. The argument above leads to the same focal length f for all such rays, since the final result doesn’t depend on θ. Thus the rays must behave like the rays in Fig 3.5, all coming together to a single focal point.
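One way to check that θ really drops out is to trace rays without the small angle approximation and watch where they cross the axis. The Python sketch below does this for a ray parallel to the axis at height h above it, using the exact form of Snell's Law at the sphere. The construction (the depth of the point B behind the front of the sphere, plus h/tan α) is our own reading of Fig. 3.10, and the function names are hypothetical.

```python
import math


def axis_crossing_distance(h, R, n, n_prime):
    """Distance behind the front of the sphere at which a ray, parallel to the
    axis at height h, crosses the axis after exact refraction."""
    theta = math.asin(h / R)                             # angle of incidence at B
    theta_p = math.asin(n * math.sin(theta) / n_prime)   # exact Snell's Law
    alpha = theta - theta_p                              # angle of refracted ray to the axis
    depth_of_b = R * (1.0 - math.cos(theta))             # how far B lies behind the front
    return depth_of_b + h / math.tan(alpha)


def paraxial_focal_length(R, n, n_prime):
    """The small angle result, f = n' R / (n' - n), Eq. (3.11)."""
    return n_prime * R / (n_prime - n)


R, n, n_prime = 1.0, 1.0, 4.0 / 3.0  # air to a water-like medium
print(f"Eq. (3.11): f = {paraxial_focal_length(R, n, n_prime):.4f} R")
for h in (0.01, 0.05, 0.1, 0.3, 0.5):
    print(f"h = {h:4.2f} R  ->  crosses axis at {axis_crossing_distance(h, R, n, n_prime):.4f} R")
```

Rays close to the axis all cross very near f = 4R, as Eq (3.11) predicts. Rays far from the axis cross noticeably closer in, which shows that the independence from θ is a property of the small angle approximation, not of a sphere in general.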

3.8 Interpreting Relationships

It is easy to make a mistake in a complicated argument like this, so it is essential, when we finally get to a result like the one in Eq (3.11), to check common sense, by interpreting the meaning.

The first thing is to check dimensions: f on the left is a length (dimension [L]), so the right side must also be a length. We see that it is: R is a length (dimension [L]), and the ratio of indices of refraction is a pure number. So our result passes the dimensional check, that both sides of the equation must have the same dimension. We also see something that we might even have anticipated, if we had been clever: the focal length f is a multiple of the size of the sphere. It had to come out that way! The focal length is a length, after all, and the only length in the problem, the way it comes to us in Fig. 3.10, is the radius of the sphere, R. What length could be the result of a calculation using properties of the sphere? Only some multiple of R. This observation is surprisingly useful: knowing just the dimensions of what you are looking for (f in this case), think what could even conceivably be the result. Often, as here, there is basically only one possibility. This argument is called “dimensional analysis”, and we will certainly see it again.

Now we consider special or limiting cases. This is another crucial thing to do with any expression like Eq (3.11). What would happen if n = n'? This would mean that θ = θ', so that there really isn’t any refraction after all – the rays go right through the interface without changing direction. Clearly in that case they wouldn’t come to a focus, because they are all parallel. And that is just what Eq (3.11) says: as n' → n, the denominator goes to zero, and f → ∞. The focal point recedes away to infinity, which is a way of saying the rays are parallel.

Finally what would happen if n' >> n? This means an enormous index of refraction in the left hand medium, which means θ' ≈ 0, i.e., the rays on the left would be essentially along the normal direction. Thus all rays would converge to the point O, which is at the distance R from the interface, being the center of the sphere. And again, that is just what Eq (3.11) says: if n' >> n, then we can ignore n in the denominator, and the fraction in parentheses is essentially 1. In this case, Eq (3.11) says f = R, which we just saw was correct.

Taking some time to check a result like this, to interpret it, is a very important part of learning physics. It is how you begin to learn to read relationships like Eq (3.11). And let us emphasize again that what we are talking about is relationships. If we had done all this with specific choices of numbers, we would just get some number as the result, and there would be no way to see if it made sense or not. It wouldn’t mean anything, or rather it would mean something too restricted and specific to be of any interest. But a relationship like Eq (3.11) is full of meanings that we still haven’t extracted, even though we have begun to see what it means.
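The two limiting cases can also be watched numerically. This is a minimal Python sketch, assuming Eq (3.11) in the form f = n'R/(n' − n); the function name and the particular sequence of indices are illustrative choices only.

```python
def focal_length(R, n, n_prime):
    """Focal length of a spherical interface, f = n' R / (n' - n), Eq. (3.11).
    Returns infinity when the two indices are equal (no refraction)."""
    if n_prime == n:
        return float("inf")
    return n_prime * R / (n_prime - n)


R, n = 1.0, 1.0
print("n' approaching n: the focus recedes to infinity")
for n_prime in (1.5, 1.1, 1.01, 1.001, 1.0):
    print(f"  n' = {n_prime:<6}  f = {focal_length(R, n, n_prime):.1f} R")

print("n' much bigger than n: the focus approaches the center of the sphere")
for n_prime in (2.0, 10.0, 100.0, 1.0e6):
    print(f"  n' = {n_prime:<9}  f = {focal_length(R, n, n_prime):.6f} R")
```

The first loop shows f growing without bound as the two media become alike, and the second shows f settling down to R. The dimensional check is visible too: every printed value is a pure number times R.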

3.9 The focal length of the eye

We know the focal length f of the eye: it is the diameter of the eye, because the focal point is on the retina. Thus if the eyeball is really a sphere of radius R, then f = 2R. Comparing with Eq (3.11), we deduce that the factor in parentheses must have the value 2, i.e. n'/(n' − n) = 2. Since n = 1 in air, we can solve for n', the index of refraction of the vitreous humor, and we find n' = 2 (check that this is the solution). This is quite remarkable: we predict what must be inside the eye, almost by pure thought!

As soon as we do this, though, we realize it can’t be true. The index n' = 2 is just too big. It is true that very dense glass might have an index close to n = 2, and diamond has an index n = 2.42, but these values are unusual. The vitreous humor undoubtedly has a high water content, and water has quite a small index, just 1.33 – it seems highly improbable that Nature could somehow add something to water that would bring the index up to the value 2. In fact, if we put in n' = 4/3, the approximate value for water, and n = 1, for air, which is surely closer to the truth, we find n'/(n' − n) = 4, so that f ≈ 4R. And yet f is the diameter of the eyeball! What is going on??

The only possible resolution of this puzzle is that the eyeball is not a sphere, and in particular the radius of curvature R of the cornea must be considerably smaller than half the diameter of the eye, smaller by roughly a factor of 2. The eye must look more like Fig 3.11, with a highly curved cornea superimposed on a basically spherical eyeball of larger radius. The focal length is determined by the cornea alone, so now we see how it could be that f ≈ 4R: the R in Fig 3.10 is not the radius of the eyeball, it is the radius of curvature of the cornea. We even realize that this shape is a familiar fact about the eye. It is the reason that contact lenses don’t float freely over the whole surface of the eyeball, but are confined to just the cornea, where their shape is molded to fit.
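The arithmetic of this puzzle is short enough to write out. The Python sketch below inverts Eq (3.11) in two ways: for the index n' that a spherical eyeball would need, and for the corneal radius R implied by a water-like index. The function names are hypothetical; the eye diameter of 2.5 cm is the value D' quoted in Section 3.2, and n' = 4/3 is the rough water value used in the text, so the resulting radius is an estimate within this simple model, not a measured one.

```python
def required_index(f_over_R, n=1.0):
    """Index n' needed so that f / R = n' / (n' - n) takes the given value."""
    return n * f_over_R / (f_over_R - 1.0)


def cornea_radius(f, n, n_prime):
    """Radius of curvature R that gives focal length f, from f = n' R / (n' - n)."""
    return f * (n_prime - n) / n_prime


# If the eyeball were a sphere, f = 2R would demand an implausibly large index.
print(f"index needed for f = 2R: n' = {required_index(2.0):.2f}")

# With a water-like index the same focal length implies a much smaller R.
eye_diameter_cm = 2.5
R_cm = cornea_radius(eye_diameter_cm, n=1.0, n_prime=4.0 / 3.0)
print(f"cornea radius for n' = 4/3: R = {R_cm:.3f} cm")
print(f"compare half the eye diameter: {eye_diameter_cm / 2:.3f} cm")
```

The corneal radius comes out at about half the radius of the eyeball, which is the "factor of 2" mentioned above.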
We noted earlier that we could also use Eq (3.11) to think about how we see when we open our eyes under water. In this case n = 4/3, the value for water. If also n' = 4/3, we would have zero in the denominator, and f would be infinite, a case we already ran into as a special case in the previous section. This corresponds to no refraction at the interface, and no image formation. This is not what happens, though. We can see under water, just not very clearly. It must be that n' for the vitreous humor is appreciably larger than 4/3, the value for pure water. This is to be expected on other grounds as well. Adding solutes to water typically raises the index of refraction. Thus n' > n, and that is why we can see under water! If you imagine n increasing from 1 (the value for air) in Eq (3.11), the denominator of the fraction in parentheses would get smaller, and the fraction itself would get bigger. This means f would get bigger, so the focal point would move back, off the retina – we are effectively farsighted under water. If you don’t know what it is like to be farsighted, just open your eyes under water. Of course if you wear goggles or a diving mask, you can suddenly see sharply again under water. Why? Because now you have an air-cornea interface, not a water-cornea interface, and the relative index of refraction at the interface is just what it should be for good vision.

Fig. 3.11: For reasons given in the text, the radius of curvature R of the cornea must be smaller than the radius f/2 of the eyeball. (The dotted circle is included in this figure to help visualize the radius of curvature R of the cornea, and does not correspond to any structure in the interior of the eye.)

3.10 Virtual Images

The curvature of the cornea is essential for the formation of the image on the retina. That is clear in Eq (3.11), which can even describe what would happen if the cornea were flat. Since R can be anything in this relationship, suppose R is very large. A flat interface is the limiting case as R goes to infinity: a small piece of an enormous sphere looks flat. But if we let R → ∞ in Eq (3.11), then also f → ∞, that is, the image goes to infinity, which is to say it doesn’t form at any finite place. So a flat interface wouldn’t produce an image. That is why our eyes have a curved surface, and why a camera has a curved lens.

You might very plausibly think, therefore, that a flat interface cannot form an image. (Isn’t that what we just said?) But, in a certain amusing sense, that is not true! We will show not only that a flat interface does form an image, but that you are even very familiar with this phenomenon. We hasten to add that this paradox is possible only because we are going to redefine the word “image” slightly: the image formed will be a virtual image.

The example we will consider is what you see when you look down into shallow water, through the flat air-water interface. You see a virtual image of the rocks, shells, etc. on the bottom. The geometry of the situation is shown in Fig 3.12. Once again we inspect the figure and read off geometric information:

BP = D (3.12)
BP' = D' (3.13)
AB = D tan θ = D' tan θ' (3.14)

Fig. 3.12: Light from a point P under water, as it emerges into air, seems to come from the point P'. Thus P appears to be at the depth D' instead of the true depth D.

We imagine looking straight down into the water, so that the rays our eyes receive are nearly vertical, that is, the angles θ and θ' are small. (They are not drawn small in the figure, in order to spread things out so that you can see the geometrical relationships, but now we specialize to the case of small θ.) In this case we can use the small angle approximation in Snell’s Law and find

D'/D = tan θ/tan θ' ≈ θ/θ' ≈ n'/n (3.15)

We see that the ratio of depths D'/D does not depend on θ, i.e., it does not depend on which ray we choose. It only depends on the two media (air and water) through their indices of refraction. All the rays intersect at the same depth D', as the rays on the right suggest (as long as the angles are small). That means the emerging light from P seems to come from a different point, P'. Our visual system can estimate distance from the geometry of the rays we receive, as we have already noticed. Hence water looks shallower than it actually is! The visual evidence about where the bottom is comes to us from the rays we actually receive, which emanate from P', not the rays in the water that emanate from P.
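As a rough numerical illustration of the relation AB = D tan θ = D' tan θ', the Python sketch below evaluates the apparent depth D' for rays leaving P at various angles θ in the water. It applies that same construction (the emerging ray extended back to the vertical line through P) at angles that are not small. This is a simplification we are assuming for illustration, since the full geometry of viewing at an angle is more complicated. The pool depth of 2 m is a made-up example.

```python
import math


def apparent_depth_small_angle(D, n_water=4.0 / 3.0, n_air=1.0):
    """Small angle result for the apparent depth: D' = D * n' / n."""
    return D * n_air / n_water


def apparent_depth(D, theta_water, n_water=4.0 / 3.0, n_air=1.0):
    """Apparent depth from D tan(theta) = D' tan(theta'), with exact Snell's Law.
    theta_water is the ray's angle from the vertical in the water, in radians,
    and must be below the critical angle."""
    if theta_water == 0.0:
        return apparent_depth_small_angle(D, n_water, n_air)
    theta_air = math.asin(n_water * math.sin(theta_water) / n_air)
    return D * math.tan(theta_water) / math.tan(theta_air)


D = 2.0  # m, hypothetical true depth
print(f"small angle result: D' = {apparent_depth_small_angle(D):.3f} m")
for theta in (0.01, 0.2, 0.4, 0.6, 0.8):
    print(f"theta = {theta:4.2f} rad  ->  D' = {apparent_depth(D, theta):.3f} m")
```

For nearly vertical rays D' is very close to 3/4 of the true depth, independent of θ. For steeper rays this construction gives smaller and smaller values of D'.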

The point P' is called a virtual image of the point P. When we look into water, we are really looking at this virtual image. In the case of water and air we have n' = 1 and n = 4/3. Thus the depth we see, D', is only 3/4 of the true depth. When you look for yourself (a still swimming pool is an excellent place to see this effect) you may have the impression that the effect is even more extreme than this – the pool looks quite shallow. This is because you are probably looking down at an angle, and not straight down. The geometry is more complicated to work out in this case, although the idea is exactly the same: every point on the bottom produces a virtual image that you see, and the virtual image is even shallower in the general case than in the straight-down case.

A special limiting case of this last effect is easy to see. Because the index of refraction of water is greater than that of air, light rays bend away from the vertical as they emerge into air. That is clear in Fig. 3.12. For a large enough angle, the ray in air will have bent away from the vertical by π/2, that is, it will just skim the surface, as shown in Fig. 3.13. According to Snell’s Law, if θ' = π/2, so that sin θ' = 1, the corresponding angle in water, which we call θc for “critical angle”, obeys

n' = n sin θc (3.16)

Fig. 3.13: A ray from P at the critical angle θc emerges in air to skim the surface. A nearby ray from P would intersect the horizontal at the virtual image P'. This is essentially where the point P would appear to be, to someone looking along the water at a very shallow angle.

If n' = 1, for air, and n = 4/3, for water, we have θc = sin⁻¹(3/4) = 0.8481 radians. This is about 48.6°. (You might very well wonder what happens to rays from P at larger angles than this! We’ll return to this question.) Meanwhile just notice that if you look into the water at a grazing angle, along the ray in Fig. 3.13, you see the virtual image P' at the surface of the water! This observation confirms, in a limiting case, our impression that when we look into water at an angle, the water looks really shallow. It is just the virtual image that we are looking at.

This phenomenon of the critical angle is something to keep in mind whenever you think about a light ray going from high to low index of refraction. The critical angle θc is always determined by the picture in Fig. 3.13 and the corresponding relationship Eq. (3.16).

If we think of light rays going the other direction, in Figs. 3.12 and 3.13, and perhaps coming to the eye of a fish at the point P, we notice that the fish sees the whole upper world of the air confined to a cone of half-angle θc about the vertical, a phenomenon sometimes called “Snell’s window”. You can look through Snell’s window yourself if you can swim down to a reasonable depth: turn over on your back, look up, and you will see the surface above you illuminated in a bright circle out as far as the critical angle. Then it goes dark.
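The critical angle of Eq. (3.16) is a one-line calculation, sketched here in Python. The second function, which converts the critical angle into the radius of the bright circle seen from a given depth, is our own small extension of the Snell's window remark; the depth of 3 m and the function names are hypothetical.

```python
import math


def critical_angle(n_dense, n_rare=1.0):
    """Critical angle in radians, from n_rare = n_dense * sin(theta_c), Eq. (3.16)."""
    if n_rare >= n_dense:
        raise ValueError("a critical angle exists only going from high to low index")
    return math.asin(n_rare / n_dense)


def snell_window_radius(depth, n_dense=4.0 / 3.0, n_rare=1.0):
    """Radius of the bright circle on the surface, seen from the given depth."""
    return depth * math.tan(critical_angle(n_dense, n_rare))


theta_c = critical_angle(4.0 / 3.0)  # water to air
print(f"critical angle: {theta_c:.4f} rad = {math.degrees(theta_c):.1f} degrees")
print(f"Snell's window from 3 m down: radius {snell_window_radius(3.0):.2f} m")
```

Asking for a critical angle going from low to high index raises an error here, which mirrors the physics: in that direction the ray always gets through.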

3.11 Thin Lenses

In thinking about the eye, we have encountered many of the basic ideas of geometrical optics. We have kept details about the real eye to a minimum, mentioning them only when physical principle required it. It is interesting to notice in this special example how physics and biology differ in their emphases. Biologists treat the eye with less explicit emphasis on geometry. Physicists have little interest in most of the anatomical structure of the eye, and treat the essential parts as geometrically as possible. It should be clear that these approaches complement each other, and that each has useful things to say.

A glass lens, like a simple camera lens or a magnifying glass, must be much simpler to understand than the eye, so simple that a biologist wouldn’t even be interested. This is something we ought to be able to understand pretty completely. In principle we could trace rays through any lens, of any shape, using Snell’s Law in its exact form at interfaces, and in this way we could learn exactly what any lens does. High quality lenses and lens systems are designed by this process. As usual, though, physicists use quick approximations to get at the essential properties of common lens shapes. We will assume that the interfaces are spherical surfaces, each characterized by its own radius of curvature R, just as we have done above, and we will also assume that the lenses are thin, so that we can say that a lens is located at some definite position, without distinguishing between its front and back surfaces, even though, strictly speaking, they are at slightly different places. We will also assume that light rays make small angles with the normal at the lens surfaces, so that we can use small angle approximation to describe refraction. This is just what we have been doing right along. The only thing new here is the requirement that the lens should be thin.

Fig. 3.14: The focal length of a plano-convex lens

A thin lens is characterized by one number, its focal length f, as illustrated in Fig. 3.14, the distance behind the lens at which an image forms of a point at infinity. It is precisely where the film ought to be located in a camera to take a sharp picture of a star. One also speaks of the focal plane, the plane located a distance f behind the lens. In a camera, the film actually lies in the focal plane when the camera is focused on infinity, and an image of the night sky would show sharp star images at many places. For a plano-convex lens like the one in Fig. 3.14 we can use geometry and Snell’s Law in the small angle approximation to find the focal length from the radius of curvature R and the index of refraction, much as we did in Section 3.7. It turns out to be

f = nR/(n' − n) (3.17)

where n = 1 refers to the air around the lens, and n' refers to the material of the lens. We would use n = 4/3 if the lens were in water. Try checking the common sense of Eq (3.17), using the same ideas that we used to check Eq (3.11). Is it dimensionally correct, for example?
Here is an odd thing, which turns out to be generally true: the focal length of this lens is the same if the light comes in from the left and focuses on the right, as it is if the light comes through from the right and focuses on the left, although the ray tracing argument is different in the two cases. It is a good puzzle to try to do the geometry and find f both ways. It comes out the same either way, just Eq (3.17). The image formed in the focal plane is called a real image, in contrast to the virtual image that we saw in the last section. When we say real image, we are emphasizing that it would actually appear, visibly, on a screen or on film placed in the focal plane. The image on a movie screen is a real image. Similarly, the image that forms on the retina is a real image.

Fig. 3.15: Light from a distant point diverges after going through a plano-concave lens as if coming from a virtual image a distance f behind the lens.

A plano-concave lens creates a virtual image of a distant star as shown in Fig 3.15. The parallel rays from a distant point diverge after passing through the concave lens, as if they were coming from a point behind the lens. The distance to this point might again be called f, the focal length of the (concave) lens. The same kind of geometrical argument, using Snell’s Law in small angle approximation, leads once again to the same formula for f, Eq. (3.17)! This is really a surprise. It is as if convex and concave lenses were somehow the same, mathematically, although they seem to behave so differently. Of course they are not really the same: one is convex, the other concave, one forms a real image, the other a virtual image. In particular the foci are on opposite sides of the lens.

The following sign convention turns out to be a way to put all this together. We think of the light flowing through the lens. If the focus occurs “downstream” from the lens (to the left of the lens in our case, as in Fig 3.14), corresponding to a real image, then we call f positive. But if the focus occurs “upstream” from the lens (to the right of the lens as we have drawn it in Fig. 3.15), corresponding to a virtual image, then we call f negative. Similarly, if the spherical surface is convex, like a sphere seen from the outside, we call the radius of curvature R positive, but if the spherical surface is concave, like a sphere seen from the inside, we call R negative. Now everything works! The same formula, Eq (3.17), describes both plano-convex and plano-concave lenses, but for the concave lens R is negative, and so the formula makes f negative, telling us that the lens forms a virtual image, upstream from the lens, as in Fig. 3.15. The concavity is represented by the negative sign in R.

There is a possibility that we haven’t considered in Eq (3.17). We have always assumed that n' − n is positive, since we think of n' as referring to a glass lens, and n referring to air, or perhaps water, and then n' > n. But suppose n' < n, as would be the case for a lens-shaped air bubble in glass. Then if R > 0, corresponding to a convex “air lens” in glass, we would find f < 0 in Eq (3.17), since n' − n < 0, so that we predict a diverging lens and a virtual image, in spite of the convex shape. Is this what actually happens? Yes! The sign of f does tell us how the lens behaves, even in cases that we hadn’t explicitly intended. And the (negative) focal length f is still given correctly by Eq (3.17)! The sign conventions tell us how to choose the sign of R and how to interpret the sign of f.
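The sign conventions for Eq (3.17) can be tried out numerically. The following is a minimal sketch, not part of the original argument: the function name, the radius of 10 cm, and the glass index 1.5 are illustrative assumptions.

```python
def plano_focal_length(R, n_lens, n_medium=1.0):
    """Eq. (3.17): f = n R / (n' - n) for a thin lens with one flat face.

    R        radius of curvature (positive for convex, negative for concave)
    n_lens   index of refraction n' of the lens material
    n_medium index of refraction n of the surrounding medium
    """
    return n_medium * R / (n_lens - n_medium)


R = 10.0        # cm, assumed for illustration
N_GLASS = 1.5   # assumed typical value for glass

# Convex glass lens in air: f > 0, real image downstream.
print(plano_focal_length(R, N_GLASS))
# The same lens in water (n = 4/3): still converging, but much weaker.
print(plano_focal_length(R, N_GLASS, n_medium=4.0 / 3.0))
# Concave glass lens in air (R < 0): f < 0, virtual image upstream.
print(plano_focal_length(-R, N_GLASS))
# Convex "air lens" inside glass (n' < n): f < 0 despite the convex shape.
print(plano_focal_length(R, 1.0, n_medium=N_GLASS))
```

Only the signs and relative sizes of the printed values matter here; the particular numbers depend on the assumed R and index.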

3.12 Object and Image

We have seen that if the object is at infinity, then the image is at the focal length (i.e., in the focal plane). This is really the operational definition of the focal plane, and also a way to compute f from geometry. The concept is illustrated in Figs. 3.14 and 3.15.

But what if the object is not at infinity? There is a simple relationship between the object position o, the image position i, and the focal length f, called the thin lens equation

$$\frac{1}{i} + \frac{1}{o} = \frac{1}{f} \qquad (3.18)$$

Here i and o are measured from the position of the lens, and there are sign conventions: o is positive if it is upstream from the lens and negative if it is downstream. On the other hand i is positive if it is downstream from the lens and negative if it is upstream. These conventions are chosen so that the simplest situation, with an object upstream forming an image downstream, corresponds to both o positive and i positive.

Fig. 3.16: An object of height h at o produces an inverted real image of height h' at i. The focal length of the lens is f.

As you might expect, Eq (3.18) is a consequence of geometry, and not difficult to prove. We will go through it at the end of this section. Meanwhile, let us do the more important and interesting job of interpreting the meaning of Eq (3.18). First of all, check dimensions: i, o, and f are all lengths, so they carry dimension [L]. Therefore each term has dimension $[L^{-1}]$, and thus the equation makes dimensional sense. All terms have the same dimension.

Next, check special cases: if the object is at infinity, like a star, then 1/o = 0 and Eq (3.18) is just 1/i = 1/f, that is, i = f, or to put it into words, the image is at the focal length. That is correct, of course! And if f happens to be negative, because the lens is concave, then i is negative, that is, it is upstream from the lens, and must be a virtual image (Fig. 3.15 again).

We can check something we only asserted before in Fig. 3.16. When the object comes in from infinity to some nearer position, the image moves back from the focal plane, farther away from the lens. That is clear in Fig 3.16, and it is also clear in Eq (3.18). If f is positive, and 1/o increases from zero, then 1/i has to decrease in order that 1/o and 1/i continue to add up to the same positive value 1/f. That means i increases: the image moves back.

Now let us see in detail what Fig. 3.16 means, and how it leads to the thin lens equation, Eq (3.18). The diagram actually shows how the point A leads to its image D by following just two special rays from A to see where they intersect. Other rays would also intersect there, but it is not as easy to describe them. The first special ray is the one that goes through the middle of the lens, which we show as a straight, undeviated line AOD. It is straight because the normal directions to the two surfaces of the lens are parallel at the middle (they are both horizontal), so that whatever deviation happens at the front surface is undone at the back surface. Strictly speaking there should be a little jog through the lens, but the lens is thin, so we ignore that. Also the normal isn’t quite horizontal if we enter just above the middle, but in the small angle approximation we don’t enter very far above the middle, so we ignore that little effect too. Thus we have one special ray, straight through the middle, AOD.

The second special ray is AE. It goes horizontally from A to the lens. We know what happens on the other side: it goes through the focal point F at distance f behind the lens (compare Fig. 3.14). So with very little effort we draw those two rays, and where they intersect at D is the image of A.

Now we dig out the geometrical information in the diagram. From the similar triangles ∆OCD ∼ ∆OAB we see

$$\frac{i}{h'} = \frac{o}{h} \qquad (3.19)$$

This tells us that the image is a kind of magnified version of the object, with h'/h = i/o. Then from the similar triangles ∆DCF ∼ ∆OFE we have

$$\frac{i - f}{h'} = \frac{f}{h} \qquad (3.20)$$

The thin lens equation Eq. (3.18) follows from Eqs (3.19) and (3.20) by algebra. [A quick way: divide the left side of Eq. (3.20) by the left side of Eq. (3.19), and the right side of Eq. (3.20) by the right side of Eq. (3.19). These are equals divided by equals, so they are equal.] Since the thin lens equation only involves the distance o, and not anything else about the object, all points of the object are imaged in the plane at position i, not just the point A at D. In particular, the image of the point B is C.

Eq. (3.19), which is easy to see in Fig 3.16, is used, with a sign convention, to express the magnification by a single lens,

$$\text{Single Lens Magnification} = -\frac{i}{o} \qquad (3.21)$$

The minus sign is the sign convention. With this convention, the magnification is negative when the image is inverted and positive when it is erect.

If you hold up a convex lens at a distance from your eye and look through it, you see an inverted version of the scene in front of you. This is because you are actually looking at the inverted real image at the position i, in front of the lens. The rays don’t stop at i, of course, they continue on, and what gets to your eye comes from the real image as if there were really something there. If you move closer to it, eventually you get too close to focus properly, and it blurs. This inverted real image is also what slide and movie projectors produce at the screen. In this case the magnification is enormous. How is that achieved? And why don’t you see the image upside-down?

We haven’t considered the other signs possible for i, o, and f, but the thin lens equation holds for all possibilities. We will run into them in the next section on systems of lenses.
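The special cases discussed in this section can be checked by solving Eq. (3.18) for i numerically. This is a minimal sketch with assumed function names and an assumed focal length of 10 cm; it simply applies Eqs. (3.18) and (3.21) with the sign conventions stated above.

```python
import math


def image_distance(o, f):
    """Solve the thin lens equation 1/i + 1/o = 1/f for i.

    o = math.inf stands for an object at infinity; the result is math.inf
    when the object sits exactly in the focal plane (o == f).
    """
    if math.isinf(o):
        return f
    inv_i = 1.0 / f - 1.0 / o
    return math.inf if inv_i == 0 else 1.0 / inv_i


def magnification(o, f):
    """Eq. (3.21): -i/o, negative for an inverted image (finite o only)."""
    return -image_distance(o, f) / o


f = 10.0  # cm, assumed for illustration
print(image_distance(math.inf, f))    # object at infinity: image at f
print(image_distance(math.inf, -f))   # concave lens: virtual image upstream
for o in (30.0, 20.0, 15.0):          # object approaching the lens
    print(o, image_distance(o, f), magnification(o, f))
```

As the object distance shrinks from 30 to 15 (in the same units as f), the printed image distance grows, which is the "image moves back" behavior read off from Eq. (3.18).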

3.13 Optical Systems

At the end of the last section we described what you see when you look through a convex lens held up at a distance from your eye. You see an inverted real image in front of the lens. That means that the image on your retina is really the image of an image. The convex lens forms a real image somewhere in space, and then that image becomes the object that you look at. That is how optical systems of lenses work: each lens forms an image that then becomes the object for the next lens as the light flows through the system. It is like input-output systems strung together, with the output of one being the input for the next. In this section we will consider a number of optical systems. Remember: the eye is part of the system!

3.13.1 The Magnifying Glass

When you use a magnifying glass, you hold a convex lens close to the thing you are magnifying, closer than the focal length f, in fact. The result is a virtual image behind the lens, instead of a real image in front of it. This follows immediately from the thin lens equation Eq. (3.18) if we solve for i:

$$i = \frac{of}{o - f} \qquad (3.22)$$

If 0 < o < f, then i is negative, i.e., the image is virtual. The situation is illustrated in Fig. 3.17. Again we can use two special rays to locate the image of the point A, the ray AE (which, extended, goes through the focal point F) and the central ray AO. After passing through the lens the ray AE becomes horizontal, by definition of the focal point, and extending these special rays backwards, they seem to come from the point C, which is therefore the virtual image of A.

When we look through a magnifying glass, we see the virtual image, with its magnified height h' instead of the real object, with its height h. It is not clear in Fig. 3.17, however, that the virtual image will really look bigger, because it is also farther away. What we see, after all, is angular size. So Fig. 3.17, although it is suggestive, does not provide a completely satisfactory explanation for how a magnifying glass works. We should really think about the angular size of what we see, and for that we must introduce the distance d from our eye to the magnifying glass. Then without the magnifying glass we see an object of height h at distance d + o with angular size

$$\theta = \frac{h}{d + o} \qquad (3.23)$$

and with the magnifying glass we see the virtual image of height h' at distance d + |i| with angular size

$$\theta_M = \frac{h'}{d + |i|} \qquad (3.24)$$

Fig. 3.17: An object of size h at o closer to a convex lens than the focal length f forms a magnified virtual image of size h' at i.

We must be careful to use |i|, the magnitude of i instead of i itself, because i itself is negative, but it is the magnitude |i| that says how far it is from the lens. Of course, since we know i is negative, we can just change the sign to get |i| = of/(f − o), from Eq (3.22). The ratio of the angular size with the lens to the angular size without the lens is the angular magnification:

$$\frac{\theta_M}{\theta} = \frac{h'(d + o)}{h(d + |i|)} = \frac{\frac{1}{d} + \frac{1}{o}}{\frac{1}{d} + \frac{1}{o} - \frac{1}{f}} \qquad (3.25)$$

The rather complicated expression on the far right of Eq (3.25) follows from some algebra, using Eq (3.22) to eliminate i and the magnification relation between h and h', namely h'/h = |i|/o from Fig. 3.17.

As always, when we encounter a complicated result, arrived at by a long argument that might have introduced some mistakes, or might have failed to include an essential feature, we try to check the relationship for common sense. First dimensions: the ratio on the left hand side is of course dimensionless, so the right hand side should be as well. But each term in the numerator of the final expression is $[L^{-1}]$, and each term in the denominator is as well, so the ratio is dimensionless.

Next, we notice that if f > 0, as it is for a convex lens, then the denominator is less than the numerator, because it is the same except with 1/f subtracted, and therefore the angular magnification is a number greater than 1, corresponding to actual magnification. This is, in a sense, what we wanted to be sure of. We also notice that if f < 0, as it is for a concave lens, then subtracting 1/f really means adding something: the denominator is greater than the numerator, and the magnification is less than 1. The concave lens makes things look smaller. This too is a familiar fact of experience. If f is infinite, as it is for a flat piece of glass, the object looks just the same whether the glass is there or not, because the 1/f in the denominator is zero, and the angular magnification is 1.
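For readers who want to see the "some algebra" behind Eq. (3.25), here is one possible route, written out under the assumption 0 < o < f, so that |i| = of/(f − o) and h'/h = |i|/o as stated above. It is a sketch of the steps, not necessarily the route the derivation originally took.

```latex
\begin{align*}
\frac{\theta_M}{\theta}
  &= \frac{h'}{h}\cdot\frac{d+o}{d+|i|}
   = \frac{|i|}{o}\cdot\frac{d+o}{d+|i|},
   \qquad |i| = \frac{of}{f-o} \\[4pt]
  &= \frac{f}{f-o}\cdot\frac{d+o}{d+\dfrac{of}{f-o}}
   = \frac{f(d+o)}{d(f-o)+of}
   = \frac{fd+fo}{fd+fo-od} \\[4pt]
  &= \frac{\dfrac{1}{o}+\dfrac{1}{d}}{\dfrac{1}{o}+\dfrac{1}{d}-\dfrac{1}{f}}
   \qquad\text{(dividing numerator and denominator by } odf\text{)}
\end{align*}
```

The last line is the right-hand side of Eq. (3.25) with the terms in a different order.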
For a typical hand-held magnifying glass, with f ≈ 20 cm, say, we might have o = f/2 and d = f, leading to an angular magnification of 3/2. This seems about right. By moving o towards f, i.e., moving the object towards the focal plane 20 cm behind the lens we could boost the angular magnification to 2, but that’s it. As o → f, the virtual image goes to infinity, where it is comfortable for the eye to look at, and still a nice angular size, because it grows larger as it moves away. In this case, 1/o and 1/f cancel in the denominator of Eq. (3.25), and the angular magnification simplifies to

$$\frac{\theta_M}{\theta} = 1 + \frac{d}{f} \quad (\text{if } o = f) \qquad (3.26)$$

Now suppose we bring our eye right up to the magnifying glass, i.e., let d → 0. Then the angular magnification would be 1. What good is that? you may ask. The virtual image at infinity, with the lens, has the same angular size as the object a distance f from our eyes would have without the lens – if we could see it! The point is, though, for short enough focal length f, we couldn’t see an object there. It would be too close to our eyes for our internal lens to accommodate and bring to a focus on our retinas. But we can see the virtual image at infinity produced by the lens, with a relaxed eye, and it is just like being a distance f away from the object, which is to say, really close. In effect we are using the magnifying glass to supplement our internal lens, which is not what we do when we use a magnifying glass casually. This is what the eyepiece of a microscope does – our eye is right at the lens, the lens has a very short focal length, and we are looking at something in the focal plane, much too close to see ordinarily, but comfortable to see with the lens.

Most textbooks use the term angular magnification to describe this special application of the magnifying glass, and they compare the angular size you could achieve by viewing an object at distance f (with the lens, of course) with the angular size you would have to settle for at distance dmin, the minimum distance at which you can focus on things, conventionally estimated at 25 cm, but varying from person to person. This comparison says you could make things look bigger by the factor dmin/f with the lens, by reducing the distance to the object, but of course you don’t just introduce the lens, you also physically move the object, and that is where the angular magnification comes from. Microscope eyepieces are described by their magnification in this second sense. A 10× eyepiece magnifies things 10 times, meaning it has a focal length f = 2.5 cm. At the distance 2.5 cm, things look ten times bigger than they do at the distance 25 cm, where you would otherwise have to view them.

This is a good place to point out the perils of learning results without thinking about where they come from. Suppose you conscientiously learn that a magnifying glass produces an angular magnification dmin/f, where dmin = 25 cm is the near point of the eye and f is the focal length of the lens. Then someone hands you a convex lens with f = 50 cm. You go to the formula, and find an angular magnification of 25/50 = 1/2, that is, the lens should make things look smaller. But when you try it, you find that this convex lens makes things look bigger – just like every other magnifying glass! Where did you go wrong??
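The two senses of "angular magnification" used in this subsection can be put side by side numerically. The sketch below uses assumed function names; the sample distances for the f = 50 cm lens (o = 25 cm, d = 50 cm) are illustrative choices, not values from the text.

```python
def angular_magnification(o, d, f):
    """Eq. (3.25): angular size with the lens over angular size without it.

    o: object-to-lens distance, d: eye-to-lens distance, f: focal length,
    all in the same unit. The derivation in the text assumes 0 < o <= f.
    """
    base = 1.0 / d + 1.0 / o
    return base / (base - 1.0 / f)


def near_point_magnification(f, d_min=25.0):
    """The second sense, d_min / f: an object in the focal plane seen through
    the lens, compared with the bare object held at the near point d_min."""
    return d_min / f


f = 20.0  # cm, the hand-held magnifier of the text
print(angular_magnification(o=f / 2, d=f, f=f))  # the case o = f/2, d = f
print(angular_magnification(o=f, d=f, f=f))      # o = f, so Eq. (3.26): 1 + d/f
print(near_point_magnification(2.5))             # the 10x eyepiece

# The f = 50 cm lens: the two definitions compare different situations.
print(near_point_magnification(50.0))
print(angular_magnification(o=25.0, d=50.0, f=50.0))
```

The last two lines differ because d_min/f compares an object at distance f through the lens with the same object moved to d_min, whereas Eq. (3.25) compares the same object, in the same place, with and without the lens.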

3.13.2 The Microscope

We have already said, in effect, how a microscope works: just put together what we already know about how a convex lens forms images, both real and virtual. A microscope consists of two lenses, in principle. The first lens, the “objective”, forms a highly magnified, inverted real image. Since the magnification is −i1/o1, we want i1 to be much larger than o1. Therefore, by the thin lens equation Eq. (3.18) o1 must be just slightly larger than f1, the focal length of the objective lens. Then i1 forms at a position downstream from the objective by a large multiple of f1, and to keep the microscope a reasonable size, this means that f1 must be small, i.e., the radius of curvature of the objective must be small. Then, as we have just described above, the second lens, the eyepiece, functions as a magnifying glass to inspect this inverted real image by forming a virtual image of it at infinity that the eye can see. The magnification of the result, over what you could see without the microscope, is the product of the magnifications of the two parts of the system separately, (−i1/o1)(dmin/f2). The distance between the two lenses is i1 + f2 as we have described it, but f2 is small, to get good magnification from the eyepiece, so the distance between the lenses is essentially just i1, and that is also the length L of the microscope barrel that holds the lenses in place at either end. Also, as we have already noted, o1 is essentially f1, the focal length of the objective. Therefore the angular magnification of a microscope is essentially

$$\text{Angular Magnification} = -\frac{L\,d_{\min}}{f_1 f_2} \qquad (3.27)$$

Let us interpret this result – what does it really mean? First, the minus sign reminds us that the image is inverted. We check dimensions: the ratio of lengths on the right is dimensionless: good. We get better magnification by choosing smaller focal lengths, i.e. more highly curved lenses. We get better magnification by making the microscope long – why is that? Because the farther away the real image gets from the objective, the bigger it gets. (Of course that also makes the instrument bigger and clumsier: maybe better to keep it compact and work on making good lenses.) And finally, oddly, we seem to get better angular magnification if dmin is larger! What is that about? Well, that part is just common sense: microscopes are even more helpful to people who can’t hold things close to see them!

In good microscopes the objective is still a single lens, as we have described it, but the eyepiece is often a little optical system in itself. The reason is that the designers are correcting for chromatic aberration. Without this, objects seen in the microscope seem to have a colored halo. It all goes back to the index of refraction n' of the glass. Unfortunately the index of refraction depends slightly on the color of the light, so that according to formulae like Eq (3.17), the focal length is not the same for all colors. If you get a nice sharp image for one color, nearby colors are blurry, and you see them all together. In the next section we will see how a little system could be achromatic, i.e. not subject to this problem, even while it is made from glass that does have this problem.
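To get a feeling for how good the approximations behind Eq. (3.27) are, one can compare it with the product (−i1/o1)(dmin/f2) computed from the thin lens equation. The numbers below (f1 = 0.4 cm, f2 = 2.5 cm, L = 16 cm) are hypothetical values chosen only for illustration, and the function names are assumptions.

```python
def microscope_magnification_product(f1, f2, i1, d_min=25.0):
    """(-i1/o1) * (d_min/f2), with o1 obtained from 1/i1 + 1/o1 = 1/f1."""
    o1 = 1.0 / (1.0 / f1 - 1.0 / i1)  # slightly larger than f1 when i1 >> f1
    return (-i1 / o1) * (d_min / f2)


def microscope_magnification_approx(f1, f2, L, d_min=25.0):
    """Eq. (3.27): -L d_min / (f1 f2), using i1 ~ L and o1 ~ f1."""
    return -L * d_min / (f1 * f2)


f1, f2, L = 0.4, 2.5, 16.0  # cm, hypothetical instrument
product = microscope_magnification_product(f1, f2, i1=L)
approx = microscope_magnification_approx(f1, f2, L)
print(product, approx)
print("relative difference:", abs(approx - product) / abs(product))
```

With these assumed values the two results differ by a few percent, which reflects replacing o1 by f1; the difference shrinks as L/f1 grows.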

3.13.3 Two lenses together

A general thin convex lens that is curved on both sides can be made by gluing together two plano-convex lenses, which are curved on only one side, along their flat sides. This means a general lens might be thought of as a system of two lenses (we might think of them as “half-lenses”), an amusing example of a two lens system, because both lenses are in the same place. We can find the focal length f of the resulting system from the focal lengths f1 and f2 of the two plano-convex lenses separately by computing the image of an object at infinity. We do this one lens at a time. Since the object is at infinity, 1/o = 0. Thus the first lens forms an image at i1 = f1. This image is the object for the next lens. We therefore have o2 = −i1 = −f1. The minus sign is because i and o have opposite sign conventions: where i is positive, o is negative. Then from the thin lens equation Eq. (3.18) for the second lens,

$$\frac{1}{i_2} = \frac{1}{f_2} - \frac{1}{o_2} \qquad (3.28)$$

but i2 is just f, the focal length of the system, because it is where the image of the distant object is. Using o2 = −f1 we find

$$\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} \qquad (3.29)$$

a nice result on how lenses combine. (Caution: it was essential that they were at the same place! More general configurations of two lenses are not this simple.)

If we put together two plano-convex lenses of index of refraction n' and radii of curvature R1 and R2 (not necessarily the same) to make a lens with these radii of curvature characterizing the two faces, then by Eq. (3.17) and Eq. (3.29), the focal length of the complete lens (in a surrounding medium with index of refraction n) is given by

$$\frac{1}{f} = \frac{n' - n}{n}\left(\frac{1}{R_1} + \frac{1}{R_2}\right) \qquad (3.30)$$

This is called the lensmaker’s equation, because it tells you how to make a lens of a desired focal length. As before, the sign convention is that R is positive for a convex face and negative for a concave face. It works for both. That is, we could also use half-lenses that are plano-concave, with negative R. (Caution: some books use opposite sign conventions for the two faces, so that R is positive for a convex face on one side, but positive for a concave face on the other. This seems to me unnecessary, and extremely confusing!) Can you check the common sense of the lensmaker’s equation? What are some of the meanings hidden in this relationship?

Finally, nothing says we have to use the same glass for the two half-lenses. If the glasses are different, the lensmaker’s equation becomes

$$\frac{1}{f} = \frac{n'_1 - n}{nR_1} + \frac{n'_2 - n}{nR_2} \qquad (3.31)$$

The indices of refraction $n'_1$ and $n'_2$ may depend on color, but if $n'_1$ increases when $n'_2$ decreases, the effect might cancel out in the sum, so that f doesn’t change. We just have to find special glass with the right chromatic property. The result is an achromatic lens system. (Like every good idea, there must have been a fortune in this for somebody, probably many fortunes.)
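One way to check the common sense of the lensmaker’s equation is to confirm numerically that it agrees with combining two half-lenses via Eqs. (3.17) and (3.29). This is a minimal sketch; the function names, the radii of 10 cm and 15 cm, and the index 1.5 are assumptions made for illustration.

```python
import math


def plano_focal_length(R, n_lens, n_medium=1.0):
    """Eq. (3.17) for a single half-lens: f = n R / (n' - n)."""
    return n_medium * R / (n_lens - n_medium)


def combine_in_contact(f1, f2):
    """Eq. (3.29): two thin lenses at the same place, 1/f = 1/f1 + 1/f2."""
    return 1.0 / (1.0 / f1 + 1.0 / f2)


def lensmaker_focal_length(R1, R2, n_lens, n_medium=1.0):
    """Eq. (3.30), with R positive for a convex face, negative for concave.

    A flat face can be passed as R = math.inf, since 1/inf evaluates to 0.0.
    """
    inv_f = (n_lens - n_medium) / n_medium * (1.0 / R1 + 1.0 / R2)
    return 1.0 / inv_f


R1, R2, n_glass = 10.0, 15.0, 1.5  # cm, cm, assumed glass index
halves = combine_in_contact(plano_focal_length(R1, n_glass),
                            plano_focal_length(R2, n_glass))
print(halves, lensmaker_focal_length(R1, R2, n_glass))   # the two should agree
print(lensmaker_focal_length(R1, math.inf, n_glass))     # flat face: Eq. (3.17)
print(lensmaker_focal_length(R1, -R2, n_glass))          # convex-concave lens
```

The second print illustrates one of the meanings hidden in the relationship: with one face flat, Eq. (3.30) reduces to Eq. (3.17).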

3.13.4 The Astronomical Telescope

A simple configuration of two convex lenses, with focal lengths f1 and f2, makes a telescope for looking at distant things, hence the name astronomical telescope. We follow the light through the system, denoting the first lens (the objective lens of the telescope) by the subscript 1. Since o1 = ∞, we have 1/o1 = 0, so that i1 = f1. Therefore the image of the first (objective) lens is a distance f1 downstream from the first lens. Now we want i2 = ∞, because the image at i2 is the object for the eye to look at. Thus, by the thin lens equation for the second lens, o2 = f2. That is, the object for the second lens must be a distance f2 upstream from the second lens. That object, recall, is just the image formed by the first lens, f1 downstream from the first lens. So the length of the telescope is f1 + f2, and the two lenses must be mounted as shown in Fig 3.18. One should imagine an eye (not shown) looking through the telescope, of course. The angular magnification can be read off the figure.

Fig. 3.18: This telescope consists of two lenses separated by f1 + f2. The angular magnification is, in magnitude, θ2/θ1 = f1/f2, or −f1/f2 if we use the sign convention that an inverted image gets a minus sign. The lines AB and BC through the middle of the lenses to the focal plane show how the angles are related. The red lines represent actual light rays going through the telescope from a distant object, like a star.

Since

$$BF = f_1\theta_1 = f_2\theta_2 \qquad (3.32)$$

in small angle approximation, the angular magnification is just −f1/f2, the minus sign indicating that the image is inverted. If you switch the roles of the lenses, by looking through the telescope the wrong way, it makes things look smaller instead of bigger. To make a telescope with high magnification, you should have a long focal length objective and a short focal length eyepiece. For reasons we will explore in more detail later, the best telescope for many astronomical purposes is not necessarily the one with the highest magnification. In particular, high magnification spreads the image out and makes it dimmer. Thus it might appear bigger, but also harder to see.
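The step-by-step reasoning for the telescope (image of one lens becomes the object of the next) can be followed in a short script. This is a sketch under assumed focal lengths of 100 cm and 5 cm, with assumed function names; it uses only the thin lens equation and the result −f1/f2.

```python
import math


def image_distance(o, f):
    """Thin lens equation solved for i; math.inf denotes 'at infinity'."""
    if math.isinf(o):
        return f
    inv_i = 1.0 / f - 1.0 / o
    return math.inf if inv_i == 0 else 1.0 / inv_i


def astronomical_telescope(f1, f2):
    """Follow light from a distant object through lenses spaced f1 + f2 apart.

    Returns (length, angular magnification, final image distance i2).
    """
    length = f1 + f2
    i1 = image_distance(math.inf, f1)  # objective images the star at f1
    o2 = length - i1                   # that image lies this far upstream of lens 2
    i2 = image_distance(o2, f2)        # eyepiece sends it to infinity
    return length, -f1 / f2, i2


print(astronomical_telescope(100.0, 5.0))  # long objective, short eyepiece
print(astronomical_telescope(5.0, 100.0))  # looking through it the wrong way
```

In both calls the final image distance comes out infinite, as required for relaxed viewing, while the magnitude of the magnification is greater than 1 in the first call and less than 1 in the second.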

3.13.5 Galilean Telescope

When Galileo made his telescopes, beginning in the summer of 1609, he had virtually no idea how lenses worked. His description of his trial and error method suggests that he failed to find what we call the astronomical telescope, and instead found a configuration that we now call the Galilean telescope. Its main virtue is that the image is right side up. This was no doubt an important consideration when he tried to convince others that the image seen through it was indeed a faithful representation of reality.

Fig. 3.19: Galilean telescope: the caption to Fig. 3.18 applies verbatim, with the understanding that f2 < 0.

From the mathematical point of view, the astronomical telescope and the Galilean telescope are the same, except that the Galilean telescope uses a diverging lens for the eyepiece, so that f2 < 0. As before the length of the telescope is f1 + f2, but since f2 < 0, this length is less than f1. The angular magnification is as before −f1/f2, but this is actually positive now, indicating an image right side up.

3.14 Mirrors

Reflection in a mirror can also be described very simply with geometrical optics. The simple law of reflection, known in Hellenistic times, is that the incident ray and the reflected ray make equal angles with the normal at a reflecting surface,

$$\theta_i = \theta_r \qquad (3.33)$$

illustrated in Fig 3.20. This law, together with the concepts of real image, virtual image, etc. that we have already encountered, amounts to a theory of mirrors.

Fig. 3.20: The law of reflection: θi = θr

Fig 3.21 shows that light originating at a point A and reflecting from a mirror seems to come from a point B, located symmetrically opposite A on the other side of the mirror. When we look into a mirror, we are therefore seeing a virtual image.

Fig. 3.21: Incident and reflected rays make equal angles with the normal to the mirror CD. As a result, reflected light from A seems to come from B.

Spherical mirrors (reflecting spherical surfaces) obey relationships much like those for lenses. They are characterized by a focal length, for example. The focal length of a spherical mirror is the distance f in front of the mirror at which a distant object forms an image. The concept is illustrated in Fig 3.22 for a concave mirror, showing the real image in front of the mirror. A convex spherical mirror, on the other hand, produces a virtual image of a distant point, and the location of that virtual image, a distance R/2 behind the mirror, defines the focal length in the convex case. By analogy with lenses, we introduce a sign convention and call the focal length negative in this case. The formula

$$f = R/2 \qquad (3.34)$$

holds for all mirrors if we consider the radius of curvature R negative for convex mirrors and positive for concave mirrors. R is infinite for flat mirrors. This simple, purely geometrical relationship replaces Eq (3.30), the lensmaker’s equation, for lenses.

Fig. 3.22: The focal length of a concave mirror is R/2, where R is the radius of curvature. Here O is the center of the spherical surface AB of radius R. The red line DAF is a typical light ray, coming in horizontally and going through the focal point F.

Exactly the same kinds of geometrical arguments that we used for lenses lead to the law of image formation for spherical mirrors, and it is the same as for lenses! We recall the relationship here:

$$\frac{1}{i} + \frac{1}{o} = \frac{1}{f} \qquad (3.35)$$

where f is understood to be given by Eq (3.34). The sign convention is that both i and o are positive in front of the mirror, corresponding to the simplest case, illustrated in Fig 3.23.

Fig. 3.23: A concave mirror of radius R forms a real image of height h' at i of an object of height h at o. Here O is the center of curvature for the mirror, and f = R/2.

The two special rays that are used to find the image in Fig 3.23 are the ray through O, which reflects straight back, and the horizontal ray, which reflects through the focal point F. The image is where they intersect. Other rays leaving the object would also intersect at the image point, but they are more difficult to describe. From similar triangles in the figure one can deduce Eq (3.35) in just the same way that we deduced Eq (3.18) for thin lenses. Just as for a single lens, the magnification of the image is h'/h = i/o in magnitude, or

$$\text{Mirror magnification} = -i/o \qquad (3.36)$$

using the sign convention that makes the magnification negative for an inverted image. Unlike the case for lenses, this relationship is not obvious in the figure. It follows from similar triangles and some algebra.

A concave mirror, having f > 0, forms a real inverted image in front of the mirror, since i > 0 for o > f, by Eq (3.35). A convex mirror, having f < 0, forms a virtual upright image behind the mirror, since i < 0 in this case, again by Eq (3.35). You can see both of these images by looking into the two sides of a spoon.

A Newtonian reflecting telescope uses a concave mirror in place of an objective lens to form a real image, which is then inspected with an eyepiece. The main advantage is that with a mirror in place of a lens, there is no chromatic aberration, because the light does not go through an objective lens with its complicated index of refraction. Also, it may be easier to fabricate a good mirror surface than to make glass that is flawless not just on its surfaces but also in its interior.
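The spoon experiment can be imitated numerically with Eqs. (3.34) to (3.36). The sketch below assumes a radius of curvature of 4 cm for the bowl of the spoon and an object 30 cm away; both numbers, and the function name, are illustrative assumptions.

```python
import math


def mirror_image(o, R):
    """Image distance and magnification for a spherical mirror.

    Uses f = R/2 (Eq. 3.34), 1/i + 1/o = 1/f (Eq. 3.35), and
    magnification -i/o (Eq. 3.36). R > 0 is concave, R < 0 convex,
    and R = math.inf is a flat mirror (1/f then evaluates to 0.0).
    """
    f = R / 2.0
    inv_i = 1.0 / f - 1.0 / o
    i = math.inf if inv_i == 0 else 1.0 / inv_i
    return i, -i / o


o = 30.0  # cm, assumed distance of your face from the spoon
print(mirror_image(o, 4.0))       # concave side: i > 0, magnification < 0
print(mirror_image(o, -4.0))      # convex side: i < 0, magnification > 0
print(mirror_image(o, math.inf))  # flat mirror: image as far behind as you are in front
```

The signs reproduce the statements above: the concave side gives a real inverted image, the convex side a virtual upright one, and the flat mirror a virtual image of unit magnification located symmetrically behind the surface.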

3.15 Spherical Aberrations

The theory that light is described by rays that are straight lines in Euclidean geometry is at least a candidate for an exact theory of light, but we have been treating it approximately, using the small angle approximation. Now we look, briefly, at how the exact theory differs from our approximate treatment in the case of a spherical mirror.

In Fig 3.24 we draw rays reflecting from a spherical mirror, like the typical ray in Fig 3.22, but more of them. The focal point of the mirror is shown with a black dot labeled F. The mirror is taken large enough that for some rays the angle θ (referring to Fig 3.22) is not small. We notice that, contrary to the small angle approximation, the rays do not accurately intersect at F. The ones with small θ, close to the symmetry axis, do intersect there, but the rays farther from the axis, with large θ, are noticeably off.

Fig. 3.24: The image of a distant point formed by a spherical mirror. The cusp shape AFB is called a light caustic.

The way the rays intersect suggests a cusp-like arrowhead figure, with the focal point F at the tip. You may even have noticed bright reflections with this shape, in the bottom of a coffee cup, for example, if its shiny interior surface acts like a round mirror. These shapes are called “light caustics”, and arise as imperfect focal points, like the one here. It looks as if geometrical optics could also describe light caustics, somehow, when the optics are a little bit “off” – we do not pursue this very interesting idea here, but it has been the starting point for some fascinating applied mathematics.
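How far off the outer rays are can be estimated with a short calculation. The sketch below is a hedged illustration, not taken from the text: it assumes a ray parallel to the axis, and takes θ to be the angle between the axis and the radius drawn to the point where the ray strikes the mirror (which may or may not be exactly the θ of Fig 3.22). The triangle formed by the center of curvature, the point of reflection, and the axis crossing is then isosceles, so the reflected ray crosses the axis at a distance R/(2 cos θ) from the center, that is, R − R/(2 cos θ) from the mirror surface.

```python
import math


def axis_crossing(theta, R=1.0):
    """Distance from the mirror surface (measured along the axis) at which
    a reflected ray crosses the axis.

    theta: angle in radians between the axis and the radius drawn to the
           point where the incoming axis-parallel ray hits the mirror
           (an assumed definition, see the lead-in).
    R:     radius of curvature of the mirror.
    """
    return R - R / (2.0 * math.cos(theta))


# Small angles reproduce the paraxial focal length R/2; large angles do not.
for deg in (1, 5, 10, 20, 30, 45):
    crossing = axis_crossing(math.radians(deg))
    print(f"theta = {deg:2d} deg: crosses the axis {crossing:.4f} R from the mirror")
```

For small θ the crossing point approaches R/2, the focal point F, and it moves steadily toward the mirror as θ grows, which is the spreading that produces the caustic.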

The problem we have noticed here, called spherical aberration, is a serious issue for instrument makers. The unfortunate truth is that spherical optical components are not quite the right shape. Spherical lenses suffer from the same problem as spherical mirrors. If they are small enough, the discrepancy may not matter, but if they are so large that the small angle approximation is no longer very good, then they should be made in a better shape, not a spherical shape. For a mirror, that better shape turns out to be a parabola, and one sometimes hears about parabolic mirrors. Shapes different from spheres are invariably more expensive to manufacture, so high quality optics is not cheap.

One motivation for using large optical elements, even though it requires more careful attention to shape, is that large elements allow more light through, just as our eyes admit more light when our pupils open wider. More light means brighter images.

An ingenious idea for making a parabolic mirror cheaply, for use as a telescope, is to rotate a circular pool of mercury. The shiny liquid naturally assumes a parabolic profile when it rotates, and you can even control the focal length (which is just half the radius of curvature at the center) by controlling the rotation speed. Unfortunately, since the mirror must be horizontal, such a telescope can only point vertically. That is probably its main disadvantage, although keeping the surface smooth and free of vibrations might also be a technical challenge. In principle such a telescope could be looking straight up, waiting for interesting objects to pass into its narrow field of view, or it could look at a steerable plane mirror that directs light from other directions straight down.
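The link between rotation speed and focal length can be made quantitative. The sketch below relies on a standard result that is not derived in this text: a liquid rotating at angular speed ω has a surface of height ω²r²/(2g) at distance r from the axis, which is a parabola of focal length f = g/(2ω²). The function names and the sample focal lengths are illustrative assumptions.

```python
import math

G = 9.8  # gravitational acceleration near the Earth's surface, in m/s^2


def focal_length(omega):
    """Focal length (m) of a liquid mirror rotating at omega (rad/s),
    assuming the surface height is omega^2 r^2 / (2 g)."""
    return G / (2.0 * omega ** 2)


def rotation_period(f):
    """Rotation period (s) needed to obtain focal length f (m)."""
    omega = math.sqrt(G / (2.0 * f))
    return 2.0 * math.pi / omega


for f in (1.0, 2.0, 5.0):
    print(f"f = {f:.1f} m: one turn every {rotation_period(f):.2f} s")
```

Under these assumptions a longer focal length calls for slower rotation, and focal lengths of a few meters correspond to periods of a few seconds.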

3.16 Reflection and Refraction

Up to now we have treated refraction at an interface and reflection at a mirror as if they were two different things. In fact, though, when light refracts at an interface, only some of it is refracted – the rest is reflected. That is, every such interface is also a kind of mirror. You know of course that you can see your reflection in transparent glass.

Interfaces at which both reflection and refraction occur are often called “dielectric” interfaces, to distinguish them from metal surfaces. To emphasize their reflective property, we could even call such an interface a “dielectric mirror.” There is nothing new to say about the geometrical optics of a dielectric mirror, though – it is just like any other mirror.

There is still an important question to consider, however: what fraction of the light is reflected and what fraction is transmitted at a dielectric interface? This question presupposes a way of measuring “how much” light we have, a measure of brightness. The best measure of brightness makes use of the notion of energy, and what we really mean by the question is, what fraction of the energy is reflected and what fraction is transmitted? We can’t make this question more precise yet, but we will describe the answer anyway.

Reflection and refraction occur because of a difference in the index of refraction, a “mismatch”. For normal incidence, if the relative index of refraction at the interface is n, the reflected fraction, or “reflection coefficient”, is

\[ R = \left( \frac{n - 1}{n + 1} \right)^2 \tag{3.37} \]

Note that if the two media had the same index of refraction, the relative index of refraction (their quotient) would be 1, and the reflection coefficient would be R = 0, i.e., no reflection. This is sometimes used as a quick way to measure the index of refraction of an unknown glass or mineral. Just immerse it in a series of clear oils with different indices of refraction – if you can’t see it in the oil, then there must be an index match!
Seeing it, after all, means we see the light reflected from the interface, but in a matched oil, there is no reflected light.

Let us do a numerical example. If window glass has index n = 1.5 and air has index 1, then the relative index of refraction is n = 1.5, and R = 1/25 = 4/100. In other words, only 4% of the incident energy is reflected at the air-glass interface. If no energy is absorbed by the glass, then 96% is transmitted through the interface.

Of course there is a glass-air interface at the back of the window. You should verify that R is the same whether we go from air into glass or from glass into air. That means once again 4% of the energy is reflected at this second interface. The energy incident there is not all the energy initially incident, only 96% of it, but that is almost all of it. Therefore roughly 8% of the originally incident energy is reflected, and 92% transmitted.

Specialty glass can have an index of refraction as high as n = 2. For such glass R = 1/9, meaning over 10% of the incident energy is reflected at each interface. That may not seem like much more than the usual 4%, but it is close to three times higher reflectivity. Architects like such glass: it makes their buildings more opaque, seen from outside, and not like fishbowls. The higher reflectivity is very noticeable when you come up to a glass door of such glass.

The formula above, Eq (3.37), is for the reflection coefficient at normal incidence. For light incident at any other angle, R is larger than this. We have met a case where R actually becomes 1, that is, all the light is reflected!

This is for the case of light incident on a dielectric interface at an angle $\theta \ge \theta_c$, where $\theta_c$ is the critical angle. Recall from the discussion around Fig 3.13 that the critical angle is the incident angle at which the refracted ray just barely gets into the second medium, by skimming along the surface. For an incident angle even larger, it can’t get into the other medium – it is all reflected. This phenomenon is called total internal reflection.

A clever application of total internal reflection is to make excellent mirrors with just glass, and no silver or other metal to coat them. Consider the simple prism in Fig 3.25. The critical angle at the air-glass interface, if n = 1.5 for glass, is $\theta_c = \sin^{-1}(2/3) = 0.7297$ radians $= 41.8^\circ$. Light incident normally at the left face is totally reflected because its angle of incidence on the long face is 45°, which is larger than $\theta_c$. If you look through a prism, the totally reflecting face looks silvery, and it is hard to resist turning it over to be sure that it actually isn’t.

Fig. 3.25: Total internal reflection in a glass prism
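The numbers quoted in this section are easy to reproduce. The sketch below evaluates Eq (3.37) for window glass and for n = 2 glass, checks that R is the same in both directions across the interface, and computes the critical angle for the prism. The function names are illustrative, and the two-surface estimate deliberately ignores light that bounces back and forth inside the pane, which is an assumption of this sketch.

```python
import math


def reflection_coefficient(n):
    """Reflected energy fraction at normal incidence, Eq (3.37)."""
    return ((n - 1.0) / (n + 1.0)) ** 2


def critical_angle(n):
    """Critical angle (radians) for light inside a medium whose index
    relative to the medium outside is n > 1."""
    return math.asin(1.0 / n)


R_window = reflection_coefficient(1.5)      # air into glass
R_reverse = reflection_coefficient(1 / 1.5)  # glass into air
R_dense = reflection_coefficient(2.0)       # specialty glass

# Two surfaces in a row, neglecting multiple internal reflections.
transmitted = (1.0 - R_window) ** 2
reflected = 1.0 - transmitted

theta_c = critical_angle(1.5)

print(f"R (n = 1.5)       = {R_window:.4f}")
print(f"R (glass to air)  = {R_reverse:.4f}")
print(f"R (n = 2)         = {R_dense:.4f}")
print(f"two-surface loss  = {reflected:.4f}")
print(f"critical angle    = {theta_c:.4f} rad = {math.degrees(theta_c):.1f} deg")
```

The two-surface loss comes out slightly under 8%, consistent with the "roughly 8%" in the text, and 45° exceeds the computed critical angle, as the prism argument requires.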

3.17 Fermat’s Principle

There is an old idea, going back to Socrates, that what Nature does must be somehow for the best. In Plato’s dialogue Phaedo Socrates even expresses his disgust for the physical ideas of Anaxagoras, because they don’t tell him what he really wants to know, namely how it is that what Nature does is for the best. In modern physics, however, this idea has become absolutely fundamental. Almost all physical theories can be formulated in a simple way, expressing that what Nature actually chooses to do is somehow best. Such formulations are called variational principles.

Perhaps the first use of a variational principle was a new formulation of geometrical optics by Pierre Fermat in the 17th century, now called Fermat’s Principle. Geometrical optics follows from three basic rules: light travels in straight lines in a homogeneous medium, it reflects at an interface according to the law of reflection Eq (3.33), and it refracts at an interface according to Snell’s Law Eq (3.5). There seems to be no particular reason that light should do this and not behave in some totally different way. It seems arbitrary. But these three rules, in turn, all follow from one single idea, Fermat’s Principle, which almost seems to explain what is going on: a light ray between two points takes the path of shortest time. One could argue (and one did) that God would not waste time with His light, and that is why light behaves the way it does! To put it in a more neutral way, light seems to optimize the travel time in getting from one place to another, and this implies all of geometrical optics, including extensions of the theory that we could not have treated before. This example illustrates very neatly the simplicity and power of variational principles.

Behind Fermat’s principle lies the assumption that light travels with some definite speed in each medium. With this assumption it is clear how Fermat’s principle implies that light travels in straight lines: the way to get from A to B in minimum time at fixed speed is to take the shortest path, and that of course is a straight line.

It is not so clear how Fermat’s principle implies the Law of Reflection, but a clever geometrical argument explains this. Given two points A and D in a single homogeneous medium, and a nearby mirror, we ask for the shortest path from A to D via the mirror. The path should go from A to some point C on the mirror by a straight line, of course – any other path would take more time, unnecessarily. And then it should go from C on the mirror to D by another straight line. Thus the only freedom we have in searching for the shortest path is the freedom to choose the point C on the mirror.

Fig. 3.26: What choice for C minimizes the travel time along ACD? Hint: not the one shown!

As illustrated in Fig 3.26, we introduce the point B symmetrically opposite A on the other side of the mirror. The mirror is the plane that perpendicularly bisects the line segment AB. Thus the triangle ACB is isosceles for any choice of C on the mirror, and hence the distance ACD is the same as the distance BCD for any choice of the point C. We need only choose C to minimize BCD, but that is easy: BCD must be a straight line, and that implies the law of reflection.

In this way of looking at the rays reflected from a mirror, it is obvious why there is a virtual image of A at B. Every ray from A that reflects from the mirror continues along the direction that comes from B. This is easier to see via the variational argument, perhaps, than it was in Fig 3.21.

How does Fermat’s principle imply Snell’s Law? Now we ask for the shortest time path from a point A in one medium to a point B in another medium. You might think the ray should just be the straight line from A to B, as before, but if B is in a “slow” medium, and A is in a “fast” medium, the shortest time path from A goes to a point on the interface relatively close to B, insofar as that is possible, and then travels a shorter distance in the slow medium. Actually solving for the shortest time path requires calculus, so we just give the result: if the light speed in a medium with index of refraction n is c/n, where c is the speed of light in vacuum, then the shortest time path obeys Snell’s law. The bending of the ray toward the normal as it enters a medium of higher index of refraction is just its way of reducing the distance it must travel in the slow medium.

Snell’s Law only follows from Fermat’s Principle if the light speed is c/n, where n is the index of refraction, and naturally one wonders if it is really true that light travels at a rate proportional to $n^{-1}$ in transparent materials. The answer is yes. Fermat could not have known that, of course. Experimental confirmation came only centuries later.
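Although the exact solution requires calculus, the claims about shortest-time paths can be checked numerically by searching for the best crossing point. The sketch below is only an illustration under stated assumptions: the coordinates, distances, and indices are made up for the example, light speed in each medium is taken to be c/n, and `argmin` is a simple ternary search written here rather than a library routine.

```python
import math


def argmin(func, lo, hi, tol=1e-9):
    """Ternary search for the minimum of a function with a single dip on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Refraction: A = (0, a) in medium 1, B = (d, -b) in medium 2, interface y = 0.
a, b, d = 1.0, 1.0, 2.0
n1, n2 = 1.0, 1.5  # medium 2 is the "slow" one


def refraction_time(x):
    """Travel time multiplied by c, for a path crossing the interface at (x, 0)."""
    return n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)


x = argmin(refraction_time, 0.0, d)
sin1 = x / math.hypot(x, a)            # sine of the angle from the normal, medium 1
sin2 = (d - x) / math.hypot(d - x, b)  # sine of the angle from the normal, medium 2
print(f"n1 sin(theta1) = {n1 * sin1:.6f}, n2 sin(theta2) = {n2 * sin2:.6f}")

# Reflection: A = (0, a) and D = (d, h) on the same side of the mirror y = 0.
h = 3.0


def reflection_path(xc):
    """Length of the path A -> C = (xc, 0) -> D."""
    return math.hypot(xc, a) + math.hypot(d - xc, h)


xc = argmin(reflection_path, 0.0, d)
angle_in = math.atan2(xc, a)       # angle of incidence, from the normal
angle_out = math.atan2(d - xc, h)  # angle of reflection, from the normal
print(f"incidence = {math.degrees(angle_in):.4f} deg, "
      f"reflection = {math.degrees(angle_out):.4f} deg")
```

In the refraction case the best crossing point lies past the midpoint, closer to B, as the text argues, and the two products n sin θ agree to within the search tolerance. In the reflection case the two angles agree, which is the law of reflection.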
Now we can consider how light travels through complicated materials, like inhomogeneous glass, where the index of refraction may not be constant, but instead changes gradually with position. This could happen if the glass was not well mixed, so that the concentrations of important constituents are high at some places and low at others. Fermat’s Principle tells us that the actual path taken by light rays will avoid the regions of high n, because these take longer time to get through, and favor the regions of low n. The actual rays will be curves. A familiar example of this is the water mirages one sees on highways on hot summer days. In this case, light is travelling through air that is heated from below by the ground. The index of refraction of air is very close to n = 1, of course, the value for vacuum, but the amount by which it is greater than 1 is essentially proportional to the air’s density. You could imagine that each air molecule contributes to the index of refraction, and the more molecules there are, the larger n is. The hot air near the ground expands and is less dense, so there are fewer molecules, and n is less. The cool air above is more dense, so there are more molecules, and n is more. Thus light coming from the sky near the horizon does not come straight to your eye, but follows a curved path that comes near the ground, favoring the region of low n, and it may even appear that the sky is reflected in the highway ahead of you. It looks like a reflection from water – the road looks wet.

3.18 Wavefronts: A Dual Theory of Light

In this final section we look at our results in geometrical optics in a “dual” way, adding wavefronts to our diagrams for light rays. The wavefronts are surfaces that are perpendicular to the rays. We only draw one example, since this is only meant to be a vague hint, suggesting a completely different picture of what light is.

Starting from the rays we can draw the wavefronts by just drawing curves perpendicular to the rays. Corresponding to rays leaving the point A, for example, we get spherical wavefronts centered on A, shown in blue in Fig 3.27.

Fig. 3.27: A convex lens forms a real image at B of an object at A. The rays show how one pictures this in geometrical optics, and the wavefronts give an alternative, equivalent picture.

The effect of the lens is to convert these wavefronts to new spherical wavefronts centered on B. The formation of the image can then be thought of as these new spherical wavefronts advancing toward B, carrying their energy, until the energy is concentrated at B. In some ways this is a more satisfying picture, since it was never really clear in geometrical optics why a place where rays cross should correspond to bright light, but in the wave picture we seem to see why that would be true: the wave is concentrated there.

The thin lens equation gets an interesting new interpretation in this picture. We write it as

\[ \frac{1}{i} = -\frac{1}{o} + \frac{1}{f} \tag{3.38} \]

Here o refers to the distance from the lens to A, and i refers to the distance from the lens to B. Now think of each term in the thin lens equation as a curvature. The term 1/o is the curvature of a sphere of radius o, i.e., the curvature of the wavefront from the object A at the position of the lens. Similarly the term 1/i is the curvature of a sphere of radius i, i.e., the curvature at the position of the lens of a wavefront centered at B. What the thin lens equation seems to say is that the lens of focal length f adds curvature 1/f to the curvature of a wavefront that reaches it. The resulting, now differently curved, wavefront then goes on to form an image determined by its new curvature. There is a sign convention to watch out for – curving one way is positive, the other way is negative.

We once derived the focal length of a combination of two thin lenses put together at the same place, having individual focal lengths $f_1$ and $f_2$. The result for the new focal length f was

\[ \frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} \tag{3.39} \]

In terms of curvatures, this is a triviality! Each lens adds its characteristic contribution to the total curvature – that is all it says.

This excursion into wavefronts was not meant to be a complete explanation of anything, just a provocative first look at an alternative theory. We will actually return to the wave theory of light. For now let us just ask ourselves if geometrical optics tells us anything about what light really is. Does light really consist of rays obeying Snell’s Law, and so forth? The success of the theory might tempt us to say yes, there really are light rays, but the sudden suggestion of a different picture that looks as if it would lead to all the same results should make us cautious about ascribing reality to our constructions.
This is an example of the wave-particle duality that so intrigued the discoverers of quantum mechanics, and continues to be a subject of fascination. In this context we could ask, is light a ray or a wave? The most cautious answer is that our theories tell us nothing about what light really is. Rather they are just correspondences between mathematical structures like geometry and real things like light, that are successful in ordering our understanding about what happens around us. Such correspondences are all physics has to offer.

Problems

Not every problem is like one discussed in the text. Be ready to make sketches and interpret the geometry of the situation. Where an estimate is called for, use reasonable numbers, and make your choices clear.

Angular Size

3.1 Estimate the angular size, both height and width, of someone you see standing across the street.

3.2 A high-tech camera is said to have such good resolution of detail that it can see a dime at a distance of a mile. Clearly this description is talking about an angle. What angle is it?

3.3 The Moon’s angular size as seen from Earth is about 1/2°. If the Moon is 60 Earth radii distant from Earth, what is its actual size?

3.4 The Sun’s angular size as seen from Earth is almost exactly the same as the Moon’s. (a) How can that be, if the Sun is actually much bigger? (b) If the Sun’s radius is 100 times that of the Earth, how far away is it? (c) We know the angular size of the Sun and Moon are the same because the Moon just covers the sun in a total solar eclipse. Actually, though, it sometimes fails to cover, leaving a bright ring of Sun showing around it, even at the moment of perfect alignment (a so-called annular eclipse). Why is this?

3.5 The Greek philosopher Anaxagoras was accused of blasphemy when he suggested that the Sun was not a god, but rather a hot rock, as big as the Peloponnesus. How far away did he (implicitly) think the Sun was?

3.6 If the Moon is 60 Earth radii from the Earth, what is the parallax angle relating two observers, of whom one sees the Moon setting on the western horizon and the other sees the Moon rising on the eastern horizon? What is the angular size of the Earth as seen from the Moon? What is the relation between these two questions?

3.7 A surveying crew makes observations from a north-south baseline of length 100 m. A tall tree is due east, as seen from one end of the line, but it is 1° north of east as seen from the other end. (a) How far away is the tree? (b) How different would the answer be if we replace 1° by 0.9° (a change of 10%), realizing our angle measurement is uncertain by about this much?

3.8 An old argument against the motion of the Earth is that if the Earth moved around the Sun, then the stars would show parallax: the nearest ones would seem to move back and forth against the background of the more distant ones. But this is not observed. At least it wasn’t until the 19th century, when at last parallax was observed in a star. Even in the closest stars, the parallax is only about 1 second of arc (where 60 seconds is a minute and 60 minutes is 1°). This is a very small angle, and to detect such a small motion is technically a challenge. Roughly how far away are the nearest stars, according to this observation? Give the distance in astronomical units (AU), the distance from the Earth to the Sun. (Hint: the baseline for this parallax measurement is 2 AU: why?)

Eyes

3.9 A camera focussed “on infinity”, i.e., set to capture a nice sharp point image of a star, will not be able to take a sharp picture of something close up. Explain why with a diagram, and say how cameras are constructed to solve this problem.

Snell’s Law

3.10 Suppose a ray of light is incident on an interface at angle π/3 from the normal and exits at angle π/6 from the normal. What is the relative index of refraction at the interface? Which medium has the greater index of refraction?

3.11 Suppose a ray of light goes through a flat plate of glass, but not normal to the plane of the glass (i.e., at an angle). (a) Sketch the path of the ray in case there is air on both sides of the glass. (b) Sketch the path of the ray in case there is air on one side and water on the other. (c) Show that in both cases the direction of the ray when it exits the glass is the same as if the glass had not been there. What does this imply about what you see through a pane of glass?

3.12 Fig 3.28 shows a light ray going through a prism in a symmetrical way, arranged to make the deviation of the ray from its initial direction a minimum.

Fig. 3.28: In this sketch, a light ray goes through a prism in a peculiarly symmetrical way, entering and exiting at the same angle θ1. At this angle its deviation by the prism away from its original direction is a minimum (non-obvious fact).

(a) Make a careful drawing to show that the deviation at the first interface is θ1 − θ2. Since the deviation is the same at both interfaces, the total deviation is 2(θ1 − θ2).

(b) Use Euclidean geometry to show that θ2 = α/2.

(c) Use Snell’s Law to show that the correct angle θ1 for a minimum deviation ray obeys sin θ1 = n sin(α/2), and find θ1 for an ice prism with n = 1.31 and α = 60°, an angle that actually occurs in ice crystals in the atmosphere.

(d) Thus compute the angle of minimum deviation for a 60° ice prism. Could this have anything to do with the halo around the moon, a ring that sometimes is seen at an angular distance of 22° from the moon?

3.13 In Fig 3.29, show that the relative index of refraction is the ratio

\[ n = AB : CD \tag{3.40} \]

This is how Snell originally expressed his law.

Fig. 3.29: A light ray is refracted at an interface. The ratio of AB to CD is the relative index of refraction, a characteristic of the interface: that is a geometric way to look at Snell’s Law.

Focal Length of the Eye

3.14 Fish eyes are essentially spherical, unlike ours. Discuss the problem of the focal length of the fish eye, following the discussion in Section 3.9, not forgetting that, of course, fish eyes must work in water. How can we be sure that there is some essential structure inside the fish eye which is not part of the model in that Section? What could it be?

Virtual Images

3.15 Suppose an optometrist looks, with unaided eye, into the relaxed eye of a patient, and sees the (suitably illuminated) structures on the patient’s retina. Just like someone looking into water, the optometrist is really looking at a virtual image. Where is that virtual image located? (This question requires us to follow certain rays back to see where they appear to emanate from.) Sketch a diagram as part of your answer.

Thin Lenses

3.16 (a) How is Eq (3.17) different from Eq (3.11)? What factor relates the two expressions for focal length? What is physically different in the two configurations that are being described? (b) By pointing out what happens at the plane interface of the plano-convex lens, explain the factor you noticed in (a).

3.17 Say in words what Eq (3.17) means, and point out things about it that make common sense, including dimension, limiting cases, etc.

Object and Image

3.18 Take the object position o to be positive (upstream from the lens) and find the image position i in case f > 0 (convex lens). Do this for several representative values of o: f/4, f/2, f, 2f, 10f. Summarize in words what this says about looking at something through a convex lens.

3.19 Take the object position o to be positive (upstream from the lens) and find the image position i in case f < 0 (concave lens). Do this for several representative values of o: -f/4, -f/2, -f, -2f, -10f. (Note that these are positive values for o!) Summarize in words what this says about looking at something through a concave lens.

3.20 Describe how a slide projector forms a real image on a screen, and propose realistic values for i, o, and f. What is the magnification in your proposal (including the usual sign convention)? Include a diagram.

3.21 A burning glass (convex lens) has a focal length of 15 cm. How big is the focussed image of the sun that it forms on a sheet of paper? (Hint: consider rays through the center of the lens to locate the image.)

Optical Systems

3.22 The magnification of a convex lens of focal length f is often said to be dmin/f, where dmin is the near point of the eye, typically 25 cm. A 50 cm focal length lens, however, actually magnifies things. The magnification is not 25/50=1/2. What is going on?

3.23 From the verbal description in subsection 3.13.2 make an informative diagram showing how a microscope works.

3.24 Use the lensmaker’s equation to design a reasonably thin lens with f = 2.5 cm. (It could be a microscope eyepiece.) Make an accurate sketch of your design, and give all necessary specifications.

3.25 What is the focal length of a system of two convex lenses, of individual focal lengths f1 and f2, if they are separated by a distance d? Measure the length from the second lens, as you follow the path of the light rays “downstream.” Check the dimensions of your result, and verify common sense. In particular check the special cases d = 0 and d = (f1 + f2), and comment.

3.26 Design an astronomical telescope that is small enough to carry around easily and that magnifies by a factor of -8. Make an accurate sketch and give all relevant specifications.

3.27 As in the problem above, design a Galilean telescope that is small enough to carry around easily and that magnifies by a factor of 8. Make an accurate sketch and give all relevant specifications.

Mirrors

3.28 Use Euclidean geometry to show that the law of reflection implies the picture in Fig 3.21.

3.29 When you look into a shiny sphere, like a Christmas ornament, you see yourself and the objects around you reflected. Use the relation between o, i, and R (object position, image position, and radius of the sphere) to locate the image for various object positions o: R/2, R, 2R, 10R. Be careful with sign conventions! Here R is the radius of the sphere, which can only be positive, but what is the sign convention for finding the focal length f for the sphere? And is o really positive here?

3.30 A limiting case of the spherical mirror with radius R is the flat mirror. Discuss Eqs (3.34) and (3.35) in this limiting case. Do they make sense? Explain what these relations mean.

3.31 Prove Eq (3.36) from the geometry of image formation in the case of a concave mirror forming a real image (as in Fig 3.23).

Reflection and Refraction

3.32 One design for binoculars has, for each eye, an objective lens, an eyepiece lens, and two internal prisms. If each of these optical components is made of crown glass, with n = 1.5, and each has an entrance surface and an exit surface, and the components are not coated with any special anti-reflecting surface layer, how much energy is lost by reflection?

3.33 It is possible to delay one optical pulse with respect to another by sending it through a fiber optic cable. The extra travel time for the pulse through the cable is the delay, and it can be controlled by choosing the length of the fiber. Suppose we want a delay of 1 µs, and the fiber is made of glass with n = 1.5. How long should the fiber be?

Chapter 4

Time and Oscillation

Time is one of those things that seems elementary and obvious until you try to say what it is. No other science claims time itself as its subject: this topic is pure physics. Insights into time and its measurement are useful across the sciences, from potassium-argon dating of ancient minerals in geology, to femtosecond laser pulses in the study of reaction kinetics in chemistry. But what is time, really?

Albert Einstein startled the world of physics in 1905 with his first paper on Special Relativity Theory. You might expect that this would be an impossibly difficult paper to read, but consider the following passage, taken from it verbatim: “If I say that ‘the train arrives here at 7 o’clock,’ that means, more or less, ‘the pointing of the small hand of my watch to 7 and the arrival of the train are simultaneous events.’” This seems way too simple! Why is there a sentence like this in a deep scientific paper? Einstein is emphasizing that what we mean by time depends on what we use for clocks. Previously physics had assumed that time exists independently, and clocks simply measure it. Einstein is pointing out that clocks define time, which has no other meaning or reality. Do you accept that? It seems a bit strange, but modern physics has had to adopt this notion of time. Einstein’s idea is now part of the bedrock of physics. Time is defined by clocks, and not the other way around.

So what is a clock? That will be our topic. At a minimum, we can say, a clock should have some measurable property that we will take to be proportional to time. Then by measuring that property, we are measuring time.

An old metaphor for time is a stream or a river, flowing smoothly along. Taking this metaphor literally, we can collect flowing water, somehow, and the amount of water collected is a measure of the elapsed time – we have invented a water clock. An hour glass with flowing sand is a variation on this idea. We will not spend any time on these clocks, since they turn out to be surprisingly complex systems, for all their apparent simplicity. The consequence is that they are not very reliable clocks. On the other hand, just because of their complexity, fluid flow and granular flow turn out to be challenging and interesting phenomena to study in their own right. For now we turn to simpler clocks.

4.1 Angular Clocks

Another metaphor for time that can be made into a clock is a turning wheel. Nature even provides such a clock in the apparent turning of the heavens, carrying the sun, moon, planets, and stars across the sky, from East to West, as if they were on a great spherical dome. Thus the metaphor isn’t just a metaphor, it is, in a sense, real. When you look south, where the Sun, Moon, and zodiacal stars appear from typical latitudes in the northern hemisphere, you see this daily rotation going in the direction we call clockwise. Of course now we say that the Earth is the wheel, not the heavens, and the apparent clockwise motion of the heavens is really due to the Earth turning the other way, counterclockwise (as seen from above the northern hemisphere). But mechanical clocks were already being built in the late middle ages, before the motion of the Earth was understood, as analogue devices to mimic the heavenly motions. Since the heavens turn clockwise, clocks also turn clockwise, a convention that has never changed.

If you use a turning wheel for a clock, the quantity you measure is θ, the angle through which it has turned. Since we are assuming it is a clock, θ is proportional to t, time. That is

\[ \theta = \omega t \tag{4.1} \]

The constant of proportionality ω (“omega”) is called the angular speed. Using the abbreviation [T] for the dimension of time, we see that ω has dimensions $[T^{-1}]$, since t has dimension [T] and θ is dimensionless. Typical units for ω would therefore be radians per second, degrees per hour, cycles per minute, etc. (Recall that angles are dimensionless, but still are measured in some dimensionless unit, like radians or degrees. A cycle is another unit of angle, one full turn of the wheel, the same as 2π radians or 360 degrees.) It is ambiguous, and potentially confusing, but if the unit for ω is radians per second, it is frequently expressed without saying radians explicitly. That is, 50 $\text{s}^{-1}$ means 50 radians per second.
The default unit for angle is radians. Needless to say, wall clocks with a second hand, minute hand, and hour hand, are angular clocks. We read an angle and interpret it as a time. In such a clock the second hand moves with ω = 1 cycle/minute, and the other hands just keep track of how many cycles the second hand has made.
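As a small illustration of Eq (4.1), the sketch below reads an angular clock: it converts an angular speed from cycles per minute to radians per second, and then turns a measured angle into an elapsed time. The function names are ours, chosen only for this example.

```python
import math

def cycles_per_minute_to_rad_per_s(omega_cpm):
    """Convert an angular speed from cycles/minute to radians/second."""
    return omega_cpm * 2 * math.pi / 60.0

def elapsed_time(theta, omega):
    """Invert theta = omega * t; theta and omega must use the same angle unit."""
    return theta / omega

# Second hand of a wall clock: 1 cycle per minute.
omega_second_hand = cycles_per_minute_to_rad_per_s(1.0)

# A quarter turn (pi/2 radians) of the second hand corresponds to 15 seconds.
t_quarter_turn = elapsed_time(math.pi / 2, omega_second_hand)
print(omega_second_hand, t_quarter_turn)
```

The only thing that matters is that θ and ω are expressed in the same angular unit; the choice of radians here is the default mentioned above, not a requirement.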

4.1.1 The Solar Clock

In a certain sense, which turns out to be a bit subtle, the angular position of the Sun is our most basic clock. Hours, minutes, and seconds are defined as fractions of the day, and one solar day is the time it takes the Sun to go from one noon to the next. Everyone knows this, right? This has been the meaning of time for most of human history, defined by the Sun as a clock. The Sun’s angular position can even be measured quickly and easily with a sundial. We interpret the measured angle as a time by using Eq (4.1), with $\omega_{\mathrm{Sun}}$ = 1 cycle/day.

Here is an example. Suppose we measure angle from the zenith, so that θ = 0 corresponds to noon. Then when θ = 15°, the elapsed time t, by definition, is such that

$$15^\circ = \omega t = \left(\frac{1\ \mathrm{cycle}}{\mathrm{day}}\right)\left(\frac{360^\circ}{\mathrm{cycle}}\right) t = \left(\frac{360^\circ}{\mathrm{day}}\right) t \tag{4.2}$$

so that t = (15°/360°) day = 1/24 day = 1 hour. That is, it is 1 PM. It would have been simpler to notice that $\omega_{\mathrm{Sun}}$ = 15°/hour. Of course the unit ruled on a sundial is hours, not degrees, so that you don’t even have to make this conversion of units, which we only do here as a reminder.

Sun time is close to being the time we use in everyday life, but it is not quite right. Now we define time with a different clock, and we adjust it to make it agree with sun time on average. Why would we complicate things like that?

4.1.2 The Sidereal Clock

Any star could also define a clock, and hence a time, by its angular position, which changes over the course of a night, much the way the Sun’s angular position changes over the course of a day. This time is called sidereal time, and we would read the sidereal clock by reading the angular position of the star and using Eq (4.1), just as we did in the example above. Of course we would have to know the constant of proportionality $\omega_{\mathrm{Sidereal}}$ for this clock, which turns out to be slightly greater than 1 cycle per day! The reason is that the stars, in their apparent East to West motion, gradually gain on the Sun, or to put it another way, the Sun slips back toward the East, losing ground among the stars, slipping back once around the Zodiac in a year (this is the definition of the year). So the sidereal clock turns faster.

4.1.3 Solar vs. Sidereal

To relate these two clocks, we need to know how many days there are in a year: 365.2422, approximately. This is the result of an observational program which has been carried out over literally thousands of years, as one civilization after another tried to bring its calendar into synchrony with the seasons. We can take this number as well established. That the extra fractional part of this number, namely 0.2422, is about 1/4 = 0.25 means that we put one extra day into the calendar every four years. That the fractional part is really closer to 1/4 − 3/400 = 0.2425 means that three times every 400 years we don’t put in the extra day. As you see, this is still not quite the right correction. Since 0.2422 = 1/4 − 3/400 − 3/10000, we should also refrain from adding the extra day an additional three times every 10,000 years, but there is no agreement about exactly when to do that, and maybe we won’t have to worry about it.

Now we can relate the solar and the sidereal clocks. Using the conversion factor 1 = 365.2422 days/year, we find that $\omega_{\mathrm{Sun}}$ = 365.2422 cycles/year. But $\omega_{\mathrm{Sidereal}}$ = 366.2422 cycles/year, because by definition, the year ends when the stars exactly lap the Sun, like a runner who catches her opponent on a racetrack by gaining an entire lap. Thus

$$\frac{\omega_{\mathrm{Sidereal}}}{\omega_{\mathrm{Sun}}} = \frac{366.2422}{365.2422} = 1.00274 \tag{4.3}$$

This is the factor by which the sidereal clock runs faster than the solar clock, it appears.

If this were the whole story, it would just mean that we have two different clocks, provided to us by Nature, that run at slightly different rates. It would be like having two different units to measure time, the solar day, and the sidereal day, with the solar day being slightly longer. (Here by “sidereal day” we mean the time for a star to go from one zenith, or “transit”, to the next, to complete one cycle.) We would read the sidereal clock in sidereal days, we would read the solar clock in solar days, but we could easily convert units, so both would tell us the time, and they would agree about what time means.
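The arithmetic in this section is easy to check numerically. The sketch below is only an illustration: it assumes the standard Gregorian leap-year rule (the text says only that three leap days are dropped every 400 years; the rule that these are the century years not divisible by 400 is supplied here), and it uses the value 365.2422 quoted above.

```python
DAYS_PER_YEAR = 365.2422  # solar days per year, the value quoted in the text

def is_leap_year(year):
    """Gregorian rule (assumed): every 4th year, minus 3 century years in 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Average calendar year implied by the rule: 365 + 1/4 - 3/400 days.
leap_years_in_400 = sum(is_leap_year(y) for y in range(2000, 2400))
mean_calendar_year = 365 + leap_years_in_400 / 400

# Eq (4.3): the stars complete one more cycle per year than the Sun does.
ratio = (DAYS_PER_YEAR + 1) / DAYS_PER_YEAR

# Length of the sidereal day, in seconds of (mean) solar time.
sidereal_day_s = 86400 / ratio

print(leap_years_in_400, mean_calendar_year)
print(round(ratio, 5), round(sidereal_day_s, 2))
```

On these numbers the sidereal day comes out a little less than four minutes shorter than the solar day, which is just another way of stating Eq (4.3).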

The actual story is more subtle, though. The angular speeds $\omega_{\mathrm{Sun}}$ and $\omega_{\mathrm{Sidereal}}$ must be understood as average speeds, averaged over the entire year. The measurement of the length of the year, on which all of this is based, is just a measurement of the total accumulated angle in each clock over the year, and says nothing about whether the clocks ran at constant rate. For a single clock this is not even a question, as we could define the meaning of time by insisting that ω is constant. But with two clocks we can ask whether they keep the same time, and the answer in this case is that they do not. The solar clock sometimes speeds up and sometimes slows down with respect to the sidereal clock, or, alternatively, the sidereal clock sometimes slows down and sometimes speeds up with respect to the solar clock. What we can be sure of is that they do not determine the same time, even if we correct for the difference in average angular speeds.

A quick way to be sure of this is to consider the average angular speeds over just half the year instead of the whole year. In going from the vernal equinox to the autumnal equinox, the Sun goes exactly halfway around the Zodiac, so this is half a year. How many solar days are in half a year? Well, the day on which the Sun reaches the equinox is a little bit variable, since the year is not an integer number of days – this moment, which is announced in the newspapers each Spring and Fall, indicating the onset of a new season, is about six hours later each year until leap year jumps it back by 24 hours. But the autumnal equinox is always around Sept. 22, and the vernal equinox is always around March 21. So you can just count the days to find out how many days there are in half a year. I get 180 days from Sept. 22 to March 21, and 185 days from March 21 to Sept. 22.
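The day count can be reproduced with a calendar library. This is a minimal sketch using the nominal equinox dates given above; the particular years are an arbitrary choice for illustration, picked so that the winter interval contains an ordinary 28-day February.

```python
from datetime import date

# Nominal equinox dates from the text; the years are an arbitrary example.
autumnal_previous = date(2004, 9, 22)
vernal = date(2005, 3, 21)
autumnal = date(2005, 9, 22)

# Subtracting two dates gives a timedelta; .days is the whole-day count.
winter_half = (vernal - autumnal_previous).days
summer_half = (autumnal - vernal).days

print(winter_half, summer_half)
```

The two halves differ by five days, so the two clocks cannot both be running at a constant rate.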
So in addition to running slower than the sidereal clock on average, the solar clock runs particularly slowly in the winter, and speeds up a bit in the summer, or perhaps it runs at constant rate, and it is the sidereal clock which speeds up in the winter and slows down in the summer. In any case, if we choose one over the other, we are making a choice about what time is, and there is no purely logical way to decide. This dilemma arises whenever we have two candidates for clocks that do not agree. In choosing between these two clocks we cannot use logic, because they are logically equivalent. We use physics. That is, we have a way of physically understanding the motions of the clocks, and this understanding includes an explanation for why one of them is not keeping good time and the other one is. Of course time, which we are defining here, is part of the theories that we use to make the decision! So in the end, time is defined in such a way that it becomes part of a coherent system to make mathematical sense of the world. But it is a choice. It is subject to international agreement, for example. Here is how we understand the two clocks, solar and sidereal. The sidereal clock uses the stars as a reference, and we assume the stars are essentially stationary from the point of view of the Earth, because of their great distance D. Their positions are angular positions on the dome of the sky, and even if they should move by a distance H, the angular size of that displacement H/D is essentially zero, because D is essentially infinite. The Earth may move by some relatively small distance H, but again the parallax angle H/D is essentially zero. Thus the stars provide a stationary reference for observers on Earth. Their apparent motion from East to West, the motion of the clock, is entirely due to the rotation of the Earth. 
But our understanding of rigid body rotation says that a rotating rigid sphere, isolated from any outside influence, rotates at constant angular speed forever. This is even true if the sphere is revolving around the Sun in a gravitationally bound orbit. We are clearly calling on more physics here than you could be expected to know, but these conclusions follow from Newton’s theory of motion. We therefore think we understand why the sidereal clock is a good clock, and should be taken seriously. The angular speed $\omega_{\mathrm{Sidereal}}$ is the angular speed of the turning Earth, and physics strongly suggests that it should be essentially constant.

[Figure 4.1 labels: the Sun; the Earth, with point A marked, at Day 0 and at Day 1]

Fig. 4.1: The Earth turns (counterclockwise) through more than one full rotation between one noon and the next because it has also advanced in its orbit. The extra angle that it must turn is the angular distance that it advances.

On the other hand we also have good reason to think the solar clock would not be a good clock. Fig 4.1 shows how far the Earth must turn between one noon and the next at some fixed point A on the Earth’s surface. The Earth has to make more than one complete rotation because while it is rotating it is also revolving in its orbit around the Sun. That is why the solar day is a little longer than the sidereal day. The extra angle the Earth must turn is essentially the angle through which it moves along its orbit.

Fig. 4.1 is oversimplified, however. In the first place, we have made the distance the Earth moves along its orbit larger than it really is, just to make its effect on the solar day big enough to see easily. More important, as we describe in a little more detail in the next section, the Earth’s orbit is slightly elliptical, and the Earth travels faster through some parts of the orbit than others (faster during the Northern Hemisphere winter, in fact). This is the important thing: even while it rotates at a constant rate, the Earth speeds up and slows down in its orbit. When it moves fast (in winter) it must turn farther in a solar day, and this takes more time. Thus the solar day is of variable length. There is even another complication, coming from the tipping of the rotation axis with respect to the orbital plane, but we have seen enough to know that the solar clock is variable and complicated.

Thus until recently (1967, to be exact) our civilization used the sidereal clock, scaled by the factor in Eq (4.3) to give mean solar time. With this definition, the sun at noon (by the clock) would be sometimes ahead of the zenith and sometimes behind. The amount by which the solar clock is ahead of or behind clock time through the year is graphed in Fig 4.2, what is sometimes called the Equation of Time.

[Figure 4.2 axes: T (minutes) on the vertical axis, from −15 to 15; day N on the horizontal axis, marked at 100, 200, and 300]

Fig. 4.2: The Equation of Time, the amount T by which a sundial is ahead of or behind mean solar time, as a function of day N in the year, with N = 1 corresponding to Jan. 1
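For readers who want to reproduce the general shape of Fig 4.2, here is a rough sketch. The formula is not from this text: it is a widely quoted empirical approximation, assumed here for illustration only, and it should be trusted to no better than a minute or so. Its sign convention matches the caption (positive when the sundial is ahead of mean solar time).

```python
import math

def equation_of_time_minutes(n):
    """Approximate equation of time in minutes for day-of-year n (1 = Jan. 1).

    Assumed empirical fit, not a formula derived in the text. The sin(2b)
    term has a half-year period and the other two terms a one-year period.
    """
    b = math.radians(360.0 / 365.0 * (n - 81))
    return 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)

# Sample a few days of the year: mid-February, mid-April, mid-July, November.
for n in (45, 105, 200, 305):
    print(n, round(equation_of_time_minutes(n), 1))
```

The fit gives a sundial roughly a quarter of an hour behind in mid-February and roughly a quarter of an hour ahead in early November, consistent with the range of the vertical axis in the figure.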

4.1.4 Aside on Kepler’s Laws

The properties of planetary orbits alluded to in the previous section were discovered in the early 1600s by Johannes Kepler, who published the first two laws in 1609 and the third in 1619. The three Kepler laws were shown by Newton in 1687 to follow from his Law of Universal Gravitation, and in this way the problem of planetary motion – at least in its simplest form – was solved once and for all. The Kepler laws are

I. The orbit of a planet is an ellipse with the Sun at one focus (see Fig 4.3).

II. The planet moves in such a way that the line joining it to the Sun sweeps out equal areas in equal times (see Fig 4.4).

III. The period T and the semimajor axis a of its elliptical orbit are such that T² ∝ a³.

[Figure 4.3 labels: center O, focus S, semimajor axis a, semiminor axis b]

Fig. 4.3: An ellipse is characterized by two lengths, the semimajor axis a and the semiminor axis b. The ellipse has two foci, one indicated by the point S above and another symmetrically located to the left of the center O. The distance OS is √(a² − b²), as indicated in the figure. A special case of the ellipse is the circle, for which a = b is the radius. In this case the foci and the center coincide. If the ellipse is a planetary orbit, the sun is at S.

[Figure 4.4 labels: the Sun S; points A, B, C, D on the orbit; two shaded regions]

Fig. 4.4: Kepler’s 2nd Law: In an elliptical planetary orbit, equal areas are swept out in equal times. Since the two shaded regions are equal in area, the times to sweep them out were equal, which means the planet moved faster along the arc AB, where it was near the Sun, than along CD, where it was far away.

The Earth is a planet, and the Earth’s period T is 1 year. Until 1976 its semimajor axis a was the astronomical unit (AU) by definition, so the constant of proportionality in Kepler’s Third Law was exactly 1 yr²/(AU)³. In 1976 the definition of the AU was changed slightly! Now 1 AU is the average distance of the Earth from the Sun, in a certain sense. Since the Earth’s elliptical orbit is almost circular, though, i.e., a ≈ b, referring to Fig 4.3, the difference between a and 1 AU is very small, and to excellent approximation we can still take the constant to be 1 yr²/(AU)³.
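With the constant taken to be 1 yr²/(AU)³, Kepler’s Third Law becomes a one-line calculation. The sketch below also evaluates the focal distance OS of Fig 4.3. The function names are ours, and the semimajor axis used for Jupiter (about 5.2 AU) is an assumed illustrative value that does not come from the text.

```python
import math

def focal_distance(a, b):
    """Distance OS from the center of an ellipse to a focus: sqrt(a^2 - b^2)."""
    return math.sqrt(a * a - b * b)

def period_years(a_au):
    """Kepler's Third Law with constant 1 yr^2/AU^3: T^2 = a^3, so T = a^1.5."""
    return a_au ** 1.5

# Earth: a = 1 AU gives T = 1 yr, by the choice of units.
print(period_years(1.0))

# Jupiter: a of about 5.2 AU (assumed value) gives a period near 12 years.
print(round(period_years(5.2), 2))

# A circle (a = b) has both foci at the center, so OS = 0.
print(focal_distance(1.0, 1.0))
```

Working in years and astronomical units is what makes the constant of proportionality disappear; in SI units the same law would carry a dimensional constant.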

4.2 Atomic Clocks

In the 20th century new clocks were developed (atomic clocks), using principles of quantum mechanics, that promised very high accuracy. Like the rigid, spherical Earth turning in isolation, these clocks used an essentially simple, isolated timekeeper (cesium atoms at low density in a near-vacuum, and more recently rubidium atoms). Atomic clocks, however, do not agree perfectly with the sidereal clock. This poses the question all over again: which clock do you believe? It comes back to questions of physics. Are there reasons why the sidereal clock might be irregular, and in particular might be slowing down, as the comparison with atomic clocks suggests? Well, yes: the Earth is not quite rigid, is not quite spherical, and is not quite isolated. This complicates its rotation, and suggests ways it could slow down. On the other hand there is no plausible reason why cesium atoms should be gradually speeding up. So it seems that we should trust the atomic clocks, and even use them to learn things about the dynamics of the Earth, by studying how it fails to be a perfect clock.

In 1967 Jocelyn Bell and Anthony Hewish, radio astronomers, detected what came to be known as pulsars, astronomical sources of regular pulsed radio signals. For a brief time there was even speculation that these were signals from an alien civilization (the first name for pulsars was LGM: Little Green Men) but soon it became clear that the pulses were extremely regular, and could not contain a message. Their regularity suggested that they must come from rigid rotating objects, essentially angular clocks, with the nice feature that on each rotation they send us a pulse, like the light flashes from a lighthouse as it rotates. If this is the case (and it is still our model for pulsars) they might be fantastically good clocks, because whatever they are, they must be much more massive than the Earth, and so it would be much harder to slow them down or speed them up. Perhaps they really turn at constant angular speed. In the comparison with atomic clocks, though, pulsars lost out. Within a few years it was found that at least some pulsars slow down slightly with respect to atomic clocks, and hence we must conclude that they really do slow down in reality. This says something about the environment in which they are spinning, perhaps, or about their dynamics (“starquakes” have been observed in pulsars, sudden changes in shape that affect their angular speed ω – at least that is the most plausible interpretation). Thus pulsars did not become our time standard, but if they had been discovered 50 years earlier, they probably would have.

Finally, the atomic clocks do not agree among themselves! They are basically all the same construction, so how do we choose? Once again physics has something to say. Einstein’s Special Relativity Theory actually predicts that they shouldn’t all run at the same rate, although they should all run at constant rates. Part of reading the clock is making these relativistic corrections. There is a correction for latitude: clocks nearer the North Pole are moving slower, as the Earth turns, than clocks nearer the Equator, and this turns out to affect the clock rate. It is now a well understood effect. Also the altitude of the clock makes a difference, because according to Einstein’s General Relativity Theory, the gravitational field affects the clock rate, and the gravitational field is weaker at higher altitude. This is also well understood. These corrections, which everyone agrees should be made, bring the clocks to near agreement. If there were some systematic discrepancy left among the clocks, it would indicate some new physics phenomenon that we don’t yet understand, but there doesn’t seem to be such a thing in the data. The remaining discrepancy looks random. Therefore the time shown by all the clocks in the network, which consists of over fifty clocks, at widely separated locations, is simply averaged. The result is: time!

4.3 GPS: Global Positioning System

The basic idea for GPS comes from Euclid Book I, Proposition 22, which says that you can construct a triangle knowing the lengths of its sides. Call the sides D₁, D₂, and D₃. As Fig 4.5 shows, we can lay out the length D₁, with endpoints A and B. Then if the point P is known to be a distance D₂ from A and a distance D₃ from B, we swing an arc with radius D₂ and center A and an arc with radius D₃ and center B. Where these intersect is the point P, the third vertex of the triangle.

[Figure 4.5 labels: side D₁ joining A and B; sides D₂ and D₃ meeting at P]

Fig. 4.5: A triangle is determined by the lengths of its sides, the principle of the Global Positioning System.

In GPS the points A and B are two satellites, whose positions are known. The point P is a lost hiker trying to figure out her position. That position would be determined if she could just find her distance D₂ and D₃ from the two satellites. The system accomplishes this by very accurate timing! Signal pulses are sent out from the satellites and travel with the speed of light. They are eventually received at P, and the delay due to travel time is a measure of the distance the signal travelled. This goes to the very notion of speed: the distance the signal travels is proportional to the time, and the constant of proportionality is the speed, denoted c in the case of the speed of light. Thus

$$D_2 = c\,t_2 \tag{4.4}$$

$$D_3 = c\,t_3 \tag{4.5}$$

where t₂ and t₃ are the two measured delays, and c is the known speed of light. The longer the delay, the farther away she must be. She switches on the receiver. It measures the delays, computes D₂ and D₃ using the known value of c, and then does the construction of Fig 4.5 to determine her position.

The above is the basic idea. Actually implementing it calls for a lot of cleverness, and the system is still rapidly developing and improving, but it is basically just Euclidean geometry and accurate timing. One problem that might occur to you is that our world is three dimensional, and the above construction seems to take place in just a plane (two dimensions). That is a fair criticism: in fact, there have to be three satellites, and three delays t₂, t₃, and t₄ to locate the receiver P in three dimensions. The picture is still basically the same, but instead of two circles of radius D₂ and D₃ intersecting to determine P, we would have three spheres of radius D₂ = ct₂, D₃ = ct₃, and D₄ = ct₄ intersecting to determine P.

We would like the GPS to locate things within a few meters. How good must the timing be? To answer this, we must finally take account of what the speed of light actually is, numerically: about 3 × 10⁸ m/s. So the time to go 3 meters is

$$t = \frac{3\ \mathrm{m}}{3 \times 10^{8}\ \mathrm{m/s}} = 10^{-8}\ \mathrm{s} \tag{4.6}$$

That is 0.01 microsecond, or 10 nanoseconds, a very short time! Any error in timing by just that small amount produces an error of 3 meters in position.

The conceptually simplest way of measuring a delay like t₂ from satellite #2 would be for satellite #2 to emit a pulse at a predetermined time that is known to everyone. Then the receiver at P notes the time the pulse is received, which is a later time, of course, and notes the difference from the known time it was sent. This is conceptually simple, but technically much too hard: everyone would have to be carrying atomic clocks to make a comparison like that with a precision of 10 nanoseconds! Instead, the receiver just compares arrival times of signals from different satellites, but does not assume that it knows the real time. To make up for this missing datum, it needs to use a signal from a fourth satellite. Over a short time, like a second, it can make meaningful comparisons with a precision of 10 nanoseconds, but over long times, not being an atomic clock, it goes out of synchrony with standard time. With four satellites all sending pulses close together, it gets three accurate time differences, enough to locate P in three dimensional space. The receiver acts more like a stopwatch than a clock, so that it only has to be a good timer for a brief interval, not stable and accurate over a long time. You could think of the first pulse as starting the stopwatch, and the other three pulses then get measured with respect to the first one.

Let us consider two more technical points about the GPS that the engineers have had to deal with. First, the signals from the satellite do not propagate through a vacuum – eventually they come through the atmosphere, with its index of refraction n. In fact n is not even constant, but depends on the properties of the air. In the ionosphere, a region in the upper atmosphere, n is anomalously large. And since the light speed is really c/n, and not c, as we have been assuming, we are likely to overestimate the distances unless we correct for the characteristics of the atmosphere. Therefore a mathematical model of the index of refraction of the atmosphere is part of the GPS, built into the receivers!

Finally, do not forget that we are assuming we know exactly where the satellites are, probably within a few centimeters. How is this possible? Well, Newton’s laws of motion can predict pretty accurately where they must be, and signals sent between the satellites of the system and ground stations can monitor any drift away from the prediction and update the system with true positions. The network of atomic clocks governs the satellite pulses, since it is essential that they be properly synchronized.
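The planar construction of Fig 4.5 and the timing estimate of Eq (4.6) can be sketched in a few lines. This is only an illustration of the geometry, under stated assumptions: it works in two dimensions, treats the delays as known exactly, and uses made-up satellite positions; the function names are ours. It is not a description of how a real receiver computes its position.

```python
import math

C = 3.0e8  # speed of light in m/s, the rounded value used in the text

def distance_from_delay(t, n=1.0):
    """Distance travelled in time t at speed c/n (n = index of refraction)."""
    return C / n * t

def locate(a, b, d2, d3):
    """Intersect a circle of radius d2 about a with one of radius d3 about b.

    Returns the two candidate points P, mirror images across the line AB.
    """
    ax, ay = a
    bx, by = b
    d1 = math.hypot(bx - ax, by - ay)        # length of side AB
    x = (d1**2 + d2**2 - d3**2) / (2 * d1)   # distance from A along AB
    h = math.sqrt(max(d2**2 - x**2, 0.0))    # offset from AB (clamped at 0)
    ux, uy = (bx - ax) / d1, (by - ay) / d1  # unit vector from A to B
    fx, fy = ax + x * ux, ay + x * uy        # foot of the perpendicular from P
    return (fx - h * uy, fy + h * ux), (fx + h * uy, fy - h * ux)

# Hypothetical satellites 5e7 m apart, with delays of 0.1 s and 4/30 s.
A, B = (0.0, 0.0), (5.0e7, 0.0)
d2 = distance_from_delay(0.1)
d3 = distance_from_delay(4.0 / 30.0)
p_upper, p_lower = locate(A, B, d2, d3)
print(p_upper, p_lower)

# Eq (4.6): a 10 nanosecond timing error is a 3 meter distance error.
print(distance_from_delay(1e-8))
```

Two circles meet in two points, so the planar construction alone leaves a mirror-image ambiguity; extra information, such as a further satellite, is needed to pick the right one. Passing a value of n greater than 1 to `distance_from_delay` shows the effect described above: for the same delay, the true distance is shorter than ct.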
Taken all together, the GPS is a truly amazing invention. As a resource for economic efficiency, not to mention saving lives, its value is almost impossible to exaggerate. Yet the GPS makes essential use of physics that was until recently considered almost impossibly abstract, and without practical importance, like relativity theory. It also makes essential use of Euclidean geometry, the very beginning of physics. It should be seen not as an achievement of our time alone, but as a culmination of a very long project, over most of human history.

4.4 Longitude

An older version of the GPS problem (the problem of determining location) is to determine longitude. This problem too requires Euclidean geometry, good clocks, and celestial references, so it is a kind of precursor to the GPS, before the days of artificial satellites and atomic clocks.

The basic idea is shown in Fig 4.6, showing the Earth from above the North Pole, so that the angle in the figure is longitude. Light from a star is observed at a location A on Earth. The star appears at an angle θ from its zenith at A. Now suppose that at this very moment that same star is seen at the zenith over G, Greenwich Observatory (the zero of longitude). Then geometry says that the longitude angle at A is θ.

[Figure 4.6 labels: locations A and G on the Earth; the angle θ marked at A and at the center of the Earth]

Fig. 4.6: The Earth is viewed from above the North Pole. A star at essentially infinite distance (note parallel rays) is seen at its zenith in Greenwich G, and at the same moment at an angle θ (measured from the local zenith) at A. The longitude at A is therefore θ.

Another way to make the observation is to time the star from the moment of its zenith over Greenwich to the moment of its zenith over location A. Since the sidereal clock (the rotating Earth) moves at 1.0027 cycle/day, you could convert this time into the angle moved.

This makes it seem as if it would be easy to determine longitude. You just observe any star that happens to be at the zenith over Greenwich right now, and its angular departure from the zenith, as seen from where you are, is your longitude. The hard part is in the phrase “right now.” How would you know that a given star just happens to be at the zenith at Greenwich right now? You could have a table of stars with the times of their zeniths at Greenwich (Greenwich time). But you would still need a clock that told you Greenwich time. Then you would be all set. You observe a star at the time given for it in the table according to the Greenwich clock. To do it, though, you need that clock. If the clock is wrong, then you make the observation at the wrong time, the star has moved, and you get the wrong angle.

Nowadays anyone can make a rough measurement of longitude. You could call a friend in London on the telephone and ask where the sun is in the sky. At the same time you note where the sun is in your sky. (You could each use sundials to measure the angle.) The difference is your longitude. Or you could note the Sun’s position at some convenient time by your watch at your home on the East coast, fly from the East coast to the West coast of the US, and when you get there check the Sun’s position a second time against your watch. You see that the Sun is not as far along in its daily motion as the watch says it should be. The discrepancy is the difference in longitude between the East coast and the West coast. You have to trust that your watch is still keeping good East coast time, but even the cheapest watch can do that now. If you are satisfied with a very crude measurement, the assignment of time zones is a rough indication of longitude, so you wouldn’t have to measure anything, just compare local clocks on the West coast with your East coast watch: 3 hours discrepancy means 45° in longitude. (Reality check: the longitude difference between Boston and San Francisco is about 51°, but when we use time zones we can only get an answer in integer multiples of hours, i.e. 15°. With that understanding, that the value obtained from time zones could be off by as much as 15°, these two values agree.)
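Converting a difference in clock readings to a difference in longitude is one more use of θ = ωt. The sketch below is a minimal illustration; the function name and its default argument are ours.

```python
def longitude_difference_deg(hours, cycles_per_day=1.0):
    """Angle turned in the given time, theta = omega * t, in degrees.

    cycles_per_day = 1.0 is the (mean) solar clock; use 1.0027 when the
    elapsed time was measured between transits of a star, not the Sun.
    """
    return 360.0 * cycles_per_day * hours / 24.0

# Three time zones between the East coast and the West coast.
print(longitude_difference_deg(3))

# One hour of turning measured against the stars.
print(longitude_difference_deg(1, cycles_per_day=1.0027))
```

The time-zone estimate can only come out as a multiple of 15°, which is why it should be compared with the true figure of about 51° only to that precision.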

It requires an effort to imagine a time when none of this was possible: no nearly instantaneous communication to distant places, and no clocks that could be transported while accurately keeping the time of their original location. Without one or other of these things, the measurement of longitude is nearly impossible. In the late middle ages in Europe it was known that Japan was about 10,000 miles to the East, but what is that in longitude? It turns out that Tokyo is about 140° east of London, not even half way around the world. But Columbus apparently convinced himself that it was closer to 300°, or most of the way around, so that Japan was out in the Atlantic somewhere. To believe this he had to believe that the spherical Earth is only about half the size it actually is. This is to say that ignorance on the question of longitude was profound. The Hellenistic Greeks had known the true size of the Earth, but 1700 years later no one knew.

In addition to Eratosthenes’ accurate determination of the size of the Earth (using latitude, much easier), the Hellenistic Greeks had made some measurements of longitude by the following method. It occasionally happened that a lunar eclipse was observed at cities widely separated from each other and people noted the time (probably not very accurately, but at least approximately). The eclipse functioned retrospectively like a phone call, an instant communication – both cities were seeing the same thing at the same moment, but the local times were different. The difference in local times is precisely a measurement of longitude, as we have seen. With patience, as lunar eclipses conveniently happened, it would be possible to build up an accurate map of the Earth this way. But it is not the kind of method one can call upon as needed, to determine the longitude of a ship, for example, for purposes of navigation.

The determination of longitude by navigators seems to require a good clock, one that can run on board a ship and keep Greenwich time to better than 1 second for a whole voyage. Eventually such clocks were built, and given the Greek name chronometers. Their invention, in England, contributed significantly to British naval power. This is a wonderful story, memorably told by Dava Sobel in her well known book Longitude, but it doesn’t have much more to teach us about physics. We will tell instead about an attempt that failed, but an ingenious one, and quite instructive.

4.5 The Moons of Jupiter

In January 1610 Galileo, making the first telescopic observations of Jupiter, noticed four bright little stars, as he called them, that seemed to accompany Jupiter. They seemed to oscillate from one side of the planet to the other as he observed them over successive nights and weeks, getting sometimes a little ahead of the planet and sometimes a little behind. It took him a few days to realize what he was seeing. The little stars were actually moons of Jupiter, making circular orbits around the planet, but he was seeing the circles from the side. Schematically the situation was like Fig 4.7, where we imagine Galileo looking in along the x-axis and seeing only the y-coordinate of each moon. Each moon is, in effect, a little angular clock. The angle θ of the moon in its orbit is called its “phase”, perhaps by analogy with our terrestrial moon.

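[Figure 4.7 labels: axes x and y; Jupiter J at the center; moon M on a circle of radius 1 at angle θ, and its projection on the y-axis]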

Fig. 4.7: A moon M in circular orbit of radius 1 is observed from the side. The observer sees only the projection of the orbit on the y-axis. If θ is the phase of the moon in its orbit, the observer sees y = sin θ. This is the observed displacement of the moon M from Jupiter J, sometimes positive, sometimes negative.

When θ = 0 the moon is seen against Jupiter, but when θ = π/2 the moon is at maximum displacement from Jupiter. It goes back to zero displacement at θ = π, and reaches maximum displacement on the other side of Jupiter when θ = 3π/2. This makes the clock a little tricky to read. You don’t really see the phase angle θ, you only see the projection onto the y-axis, which is sin θ. Galileo had to learn to look at the displacement of the moon and realize what it meant in terms of the angle θ. That is what Fig 4.7 shows how to do. If you only see the clock from the side, you actually can’t tell the angle θ unambiguously. We have shown the clock with phase π/4, with M in the first quadrant, but there is another location, in the second quadrant, namely θ = 3π/4, where it has the same y-displacement from Jupiter. This is just a little complication to be aware of.

In calling this displacement sin θ, we are actually extending the definition of sin θ beyond its original definition as a ratio of sides in a right triangle. That definition would apply to the θ in Fig 4.7, which is in the first quadrant, and we have even indicated the appropriate right triangle, but for angles larger than π/2 that definition wouldn’t make sense. The extension of the definition to any angle is shown in Fig 4.8.

Fig. 4.8: The cosine and sine of an angle θ are defined to be the x and y coordinates of the point on the unit circle at angle θ, measured counterclockwise from the x-axis. For the angle θ shown here the cosine is negative and the sine is positive.

With this definition, which clearly agrees with the old definition for angles in the first quadrant, the sine and cosine can be either positive or negative, depending on the angle. You can see why the sine and cosine are sometimes called the “circular functions.” The tangent is now defined more generally by

$$\tan\theta = \frac{\sin\theta}{\cos\theta} \tag{4.7}$$

We notice how this picture includes some familiar facts about the sine and cosine, for example cos(0) = 1 and sin(0) = 0, but also some perhaps unfamiliar facts, like cos(π) = −1 and sin(π) = 0. The identity $\cos^2\theta + \sin^2\theta = 1$ that we pointed out in Eq (2.30) still holds, because the point (cos θ, sin θ) is a point on the circle of radius 1, whatever quadrant it may be in.

Each moon of Jupiter is essentially an angular clock, and its phase angle obeys θ = ωt for some angular speed ω. Thus θ is proportional to time, and the axis in the graph in Fig. 4.7 could be considered a kind of time. Galileo’s observations of the displacement of the moon from Jupiter, if plotted versus time, would look like that graph, but wouldn’t stop after one cycle. In fact, he went on taking data on the moons for years, in order to establish their ω’s as accurately as possible, to be able to read time from the clock. The moons were fascinating in themselves, but he also had a practical motivation: he saw the moons as the solution to the longitude problem! Here was a clock that everyone could see and agree on! It could play the role that Greenwich time was to play later. Comparing local time to Jupiter-moon-time would be a determination of longitude. You wouldn’t need to carry a clock with you, because the clock was already there, wherever you went, visible in the sky.
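The unit-circle definition, and the ambiguity in reading a clock seen only from the side, can be checked numerically. What follows is a minimal Python sketch; the function names `unit_circle_point` and `phases_for_displacement` are illustrative choices, not anything from the text.

```python
import math

def unit_circle_point(theta):
    """Return (cos θ, sin θ), the point on the unit circle at angle theta (radians)."""
    return (math.cos(theta), math.sin(theta))

def phases_for_displacement(y):
    """Return the two phases in [0, 2π) that share the side-on displacement y = sin θ.

    Assumes -1 <= y <= 1. The two answers coincide when y = ±1.
    """
    first = math.asin(y) % (2 * math.pi)
    second = (math.pi - math.asin(y)) % (2 * math.pi)
    return first, second

# A moon at phase π/4 and one at 3π/4 show the same displacement.
x, y = unit_circle_point(math.pi / 4)
print(phases_for_displacement(y))

# cos²θ + sin²θ = 1 in every quadrant; here θ = 4 radians is in the third.
cx, cy = unit_circle_point(4.0)
print(cx**2 + cy**2)
```

An observer who sees only y cannot choose between the two returned phases without watching whether the displacement is growing or shrinking.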

Sadly, this idea never really worked in the way Galileo had hoped. He perhaps underestimated the difficulty most people had even seeing the moons of Jupiter, much less making an accurate determination of their positions. And of course one can only make the measurement at night, in good weather, during those hours when Jupiter is in a good position to view. It was not a general solution to the longitude problem.

But the thoroughly documented orbits of the moons of Jupiter did lead to a wonderful discovery around 1675, long after the death of Galileo, one that would have delighted him. The moons occasionally enter the shadow of Jupiter, and later re-emerge, in Jovian lunar eclipses. These occur at precisely identifiable times, because the moon rather suddenly disappears! In observing these the Danish astronomer Ole Roemer noticed that the eclipses occur early when Jupiter appears far from the Sun in the sky (“in opposition”), and they occur late when Jupiter appears near the Sun in the sky (“in conjunction”). He realized that it wasn’t the moons speeding up and slowing down, to get to their eclipses early or late. Rather, the time for the light to travel to inform us was different in the two cases, taking longer when Jupiter is farther away. He had determined that light travels with finite speed! From the amount of the time delay t and the very roughly known size of the solar system D, he could even estimate the speed of light c from D = ct. Much later still, good terrestrial determinations of the speed of light c and the time delay t would determine the size of the solar system D to high accuracy.

4.6 Period, Frequency and Amplitude

Fig 4.9 shows how the four moons discovered by Galileo behave in time. As in the previous section these are displacements ahead of and behind Jupiter as a function of time, and they are sine curves. The curves all start at a hypothetical time t = 0 when all the moons have phase angle zero. Because their ω’s are different, they rapidly get “out of phase,” that is, their phase angles are soon all different. In this context ω is called angular frequency instead of angular speed. It is angular speed if you are looking at the circular orbit face on, but it is angular frequency if you are looking from the side, as we imagine here. The units are typically radians/sec, often denoted simply $\mathrm{s}^{-1}$.

If we call the displacement y, the equation for such a curve is

y = A sin(ωt) (4.8)

The constant A is the amplitude, and it corresponds to the maximum displacement in the graph. Thus it is the same as the radius of the moon’s orbit, as we saw in Fig 4.7. The smallest orbit has radius 1 here, since the smallest amplitude is 1. (This just means we are measuring the other radii in units of that one.) The largest amplitude is about 4.5, which means that one of the moons is 4.5 times as far from Jupiter as the closest one. The sine function itself oscillates between 1 and −1, but when you multiply by some other amplitude A, it oscillates between A and −A.

The period is the time it takes the moon to complete one cycle. If we call it T and measure the angle in radians, we must require ωT = 2π, because the circular function sin θ just repeats its values again after θ reaches 2π, having completed a full circle. Therefore the period T is related to angular frequency by

$$T = \frac{2\pi}{\omega} \tag{4.9}$$

Note how the dimension works. The period has dimension $[T]$, and angular frequency has dimension $[T^{-1}]$.

[Fig. 4.9 plot: vertical axis “Apparent displacement from Jupiter”, from −5 to 5; horizontal axis “Time (days)”, from 0 to 30.]

Fig. 4.9: The apparent displacements of the moons of Jupiter away from the planet, as functions of time, are sinusoidal curves. The amplitude of each curve is the radius of the corresponding moon’s orbit. The displacements are scaled so that the smallest orbit has radius 1. It is interesting to notice that the moons which are farther from Jupiter take longer to complete one cycle than the ones which are closer.

Finally, the frequency is

$$f = \frac{1}{T} \tag{4.10}$$

with typical unit cycles/sec, now called Hertz (abbreviated Hz). The 1 in the definition of f is really 1 cycle, and the period T is typically given in seconds, the time for one cycle. It is always a possible source of confusion that people may say “frequency” when they mean ω, “angular frequency.” These two frequencies differ by a factor 2π, since ω = 2πf. This is the same factor 2π that occurs in the conversion factor 1 = 2π radians/cycle. It is always a good idea to say “angular frequency” when you are talking about ω if there is any possibility of confusion.

We can also notice that for these moons the large amplitude curves are also the long period curves, and the small amplitude curves have shorter periods. This means that the inner moons get around the cycle faster, as you would expect. In fact the moons’ periods T and orbital radii a obey Kepler’s Third Law, $T^2 \propto a^3$! Galileo had all the data he needed to verify this mysterious fact, but alas, he never paid much attention to Kepler. He apparently never read Kepler’s books, and he never noticed the power law relationship, which would have delighted him.
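The relations among T, f and ω, and the Kepler proportionality for the four moons, can be tried out with a short Python sketch. The periods and orbital radii below are approximate modern values supplied here as an assumption; they are not data from the text, and the function names are illustrative.

```python
import math

# Approximate modern values (assumed here, not taken from the text):
# orbital period in days, orbital radius in thousands of km.
MOONS = {
    "Io":       (1.769, 421.7),
    "Europa":   (3.551, 671.0),
    "Ganymede": (7.155, 1070.4),
    "Callisto": (16.689, 1882.7),
}

def angular_frequency(period):
    """ω = 2π / T, in radians per unit of the period."""
    return 2 * math.pi / period

def frequency(period):
    """f = 1 / T, in cycles per unit of the period."""
    return 1 / period

def kepler_ratio(period, radius, ref_period, ref_radius):
    """(T/T_ref)² / (a/a_ref)³, which equals 1 if T² ∝ a³ holds."""
    return (period / ref_period) ** 2 / (radius / ref_radius) ** 3

ref_T, ref_a = MOONS["Io"]
for name, (T, a) in MOONS.items():
    print(f"{name:9s} a/a_Io = {a / ref_a:5.2f}  "
          f"omega = {angular_frequency(T):6.3f} rad/day  "
          f"Kepler ratio = {kepler_ratio(T, a, ref_T, ref_a):.4f}")
```

Measuring radii in units of the innermost orbit, as Fig 4.9 does, makes the outermost radius come out near 4.5, and every Kepler ratio comes out very close to 1.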

4.7 Velocity in Orbit, Projected

We continue with the topic of how circular motion looks when viewed from the side, i.e., sinusoidal oscillation, now considering speed, or velocity. A moon in a circular orbit of radius R with a period T goes around the entire circumference of the orbit 2πR in time T, so its constant speed is

$$v = \frac{2\pi R}{T} = \omega R \tag{4.11}$$

Notice how the units work: the dimensionless unit “radians” has disappeared! R has dimension $[L]$, ω has dimension $[T^{-1}]$, and v has dimension $[LT^{-1}]$, with units perhaps $\mathrm{m\,s^{-1}}$. But the derivation requires that ω be in radians per second.

The velocity v = ωR can be pictured as in Fig 4.10. The moon moves along the circular orbit, and its velocity at any moment is tangent to the circle. Looking from the side, however, we see only $v_y$, the projection of this velocity on the y axis, which is

$$v_y = \omega R\cos\theta \tag{4.12}$$

Thus the projected velocity oscillates. The projected velocity is first positive, then negative, then positive again as the moon orbits.


Fig. 4.10: The projection on the y axis of the velocity ωR of an orbiting moon is ωR cos θ
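One way to convince oneself of Eq (4.12) is to compare it with the rate of change of the side-on displacement y = R sin(ωt), estimated numerically. This is a hedged Python sketch: the values of ω and R are arbitrary illustrations, and the function names are not from the text.

```python
import math

def displacement(t, omega, R):
    """Side-on displacement y = R sin(ωt)."""
    return R * math.sin(omega * t)

def projected_velocity(t, omega, R):
    """Projected velocity v_y = ωR cos(ωt), Eq. (4.12) with θ = ωt."""
    return omega * R * math.cos(omega * t)

def numerical_velocity(t, omega, R, dt=1e-6):
    """Centered-difference estimate of the rate of change of y."""
    return (displacement(t + dt, omega, R) - displacement(t - dt, omega, R)) / (2 * dt)

omega, R = 2.0, 3.0   # illustrative values in arbitrary units
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, projected_velocity(t, omega, R), numerical_velocity(t, omega, R))
```

The two columns agree closely, and the sign changes as the moon passes its maximum displacement.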

The words “speed” and “velocity” are nearly interchangeable in everyday use, but in physics “speed” is usually just a nonnegative number, while “velocity” may be either positive or negative, as here, with the sign indicating direction. More generally velocity may be a vector, an arrow, like that in Fig 4.10, giving not just the speed, but the direction in space.

4.8 Pendulums

The kind of time dependence seen in Fig 4.9 is surprisingly common in Nature, and not just because one might be looking at circular orbits from the side. Many things oscillate in a sinusoidal way even if there is no circle involved. The displacement of a pendulum as it swings left and right, like the one in Fig 4.11, is essentially sinusoidal in time, with a definite angular frequency ω. This makes a pendulum a clock, and one reads the clock by reading the phase, i.e., the completely fictitious angle it would have if the back and forth motion of the pendulum bob were circular motion, but seen from the side. This phase angle is not to be confused with the real angle the pendulum makes with the vertical direction, which corresponds to displacement, not phase! Since we have always called the phase angle θ, let us emphasize that the displacement angle of the pendulum is a different thing by giving it a different Greek letter, φ. We could even let φ be the oscillating variable and write

$$\phi = \phi_0 \sin(\omega t) \tag{4.13}$$

Fig. 4.11: A pendulum of length L

For small swings of a pendulum, this is a pretty accurate description of how the pendulum angle φ behaves in time. The constant $\phi_0$ is the amplitude, the maximum angle the pendulum makes with the vertical. The pendulum swings between $\phi_0$ and $-\phi_0$. The period T of the pendulum is the time for it to swing through a complete cycle. One can measure time in units of T by just counting the periods: the pendulum is a kind of clock. The angular frequency, if you should need it, is ω = 2π/T, by Eq (4.9).

A pendulum makes a very good clock, in fact. One can test this by comparing pendulums with each other. They all keep quite consistent time, even if they have different periods, at least if the amplitude of their swings is small. That is, they are all described by Eq (4.13), with different ω’s and (small) $\phi_0$’s. Galileo was one of the first to notice this, and he used his own pendulum clocks in experiments. It seems quite amazing that such a simple clock had not been noticed and perfected long before! But in fact a scramble for rights to this invention ensued. Galileo in his old age hoped his son Vincentio might somehow get the rights to it, but the Dutch mathematician and physicist Christian Huyghens is usually credited with doing the most to put the pendulum clock into general use.

Huyghens did considerable work on the problem of making a pendulum that always had the same period T, no matter whether it oscillated with a small amplitude $\phi_0$ or a large one. The dependence of period on amplitude is slight, and Galileo, who was usually a very good observer, apparently didn’t even notice it, but in fact as the amplitude gets larger, the period gets slightly longer. Thus, for good consistent timekeeping, the amplitude of the swing should be small.

4.8.1 The Period of a Pendulum

One of the most remarkable things about a simple pendulum clock is that it doesn’t matter what you make it out of. A little weight on the end of a string of length L makes a pendulum, but it is completely irrelevant what the little weight is. The period T of the pendulum depends only on the length L. If you make a pendulum that is 1 meter in length, for example, it will have a period of almost exactly 2 seconds. Try it! This means you could be cast away on a desert island (with a meter stick) and still construct a simple pendulum clock that keeps known time. If you like your eggs cooked for exactly three minutes, for example, you could do it. Of course you could also calibrate a pendulum clock by comparing it with an astronomical clock. Early pendulum clocks were calibrated by counting swings for an entire sidereal day, determined by the successive transits of a star!

Longer pendulums have longer periods, in a very precise way. This relationship was noticed in the middle 17th century. It can be established with no other clocks than the pendulums themselves. Just let two pendulums of different lengths run at the same time, and count how many swings each makes in some definite time, for example in the time that the shortest pendulum makes 100 swings. Measure length in units of the shortest pendulum. Then you might collect the following data:

L    # of swings
1    100
2    71
3    58
4    50

Notice that when the pendulum is made 4 times longer, the period becomes 2 times longer, so it only makes half as many swings. We could also say the frequency is only 1/2 what it was. More generally, the relationship is

$$T \propto \sqrt{L} \tag{4.14}$$

That is, the period is proportional to the square root of the length. This is a clear example of a mysterious proportionality in Nature, pure physics. We try different ways of expressing it. It would be the same thing, for example, to say

$$\omega \propto L^{-1/2} \tag{4.15}$$

That is, the angular frequency is proportional to the reciprocal of the square root of the length ($1/\sqrt{L}$). We could replace ω (angular frequency) by f (frequency) in the statement above, because ω and f are proportional to each other, and hence both are proportional to $L^{-1/2}$, although with different constants of proportionality. We could also square both sides to get rid of the square root and say

$$\omega^2 \propto L^{-1} \tag{4.16}$$

These are all equivalent statements. Writing the last statement of proportionality as an equality introduces a constant of proportionality g:

$$\omega^2 = \frac{g}{L} \tag{4.17}$$

or equivalently, for the period of a pendulum,

$$T = 2\pi\sqrt{\frac{L}{g}} \tag{4.18}$$
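The swing counts in the table and the period formula can both be reproduced in a few lines. The Python sketch below assumes g = 9.8 m/s² as a default value; the function names are illustrative, not part of the text.

```python
import math

def swings_table(lengths, base_swings=100):
    """Swings made by each pendulum while the unit-length one makes base_swings.

    Uses T ∝ √L, so the number of swings scales as 1/√L.
    """
    return [(L, round(base_swings / math.sqrt(L))) for L in lengths]

def period(L, g=9.8):
    """T = 2π √(L/g), Eq. (4.18); L in meters, g in m/s² (9.8 is an assumed default)."""
    return 2 * math.pi * math.sqrt(L / g)

for L, n in swings_table([1, 2, 3, 4]):
    print(L, n)               # reproduces 100, 71, 58, 50

print(period(1.0))            # close to 2 seconds for a 1 m pendulum
```

Rounding 100/√L to the nearest whole swing gives exactly the counts listed above, which is how one would recognize the square-root law in real data.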

Let us check dimensions in Eq (4.17). The left side has dimensions $[T^{-2}]$ and L is a length. Therefore the dimensions of g, which we may denote [g], are $[g] = [LT^{-2}]$. This looks a little bit like a speed, but speed has dimensions $[LT^{-1}]$; notice the difference! The units of g identify it as an acceleration, and in fact g is called the acceleration due to gravity. It occurs in physics wherever the Earth’s gravitational field plays a role. In this case, for example, it is obvious that the reason the pendulum swings at all is that gravity pulls it down. An amusing limiting case of Eq (4.18) is the case g → 0. Then T → ∞. What is this telling us?

From the experimental observation that L = 1 meter gives T ≈ 2 seconds, we find

$$g \approx 10\ \mathrm{m/s^2} \tag{4.19}$$

A more precise measurement of T gives the better experimental determination g = 9.8 m/s² to two significant digits. You will seldom see g quoted to higher accuracy than two digits, because, although g is a constant at any given location, it actually varies over the surface of the Earth. This variation affects the third digit. Thus gravity is not quite constant as you travel, although its variation is subtle. Gravity is less at high altitude, but even at sea level it varies with latitude.

The variation of g was discovered in just the way you might expect. As soon as Huyghens and others had managed to build good pendulum clocks, they were optimistically taken on board ships as possible chronometers to determine longitude. There were even some early successes: in 1664 a certain Captain Holmes, carrying one of the first of these clocks, and running short of water, correctly determined that he was closer to the Cape Verde Islands off Africa than he was to the Caribbean, and turned East rather than West. But it was soon clear that these clocks could not give correct longitude, and one reason was that they didn’t run at a constant rate.
The period T of a pendulum is given by Eq (4.18), but g depends on where you are. Several expeditions investigated this effect, and it turns out that g depends on latitude. A typical result, quoted here from Newton’s Principia Mathematica, was that of Edmund Halley (of Halley’s Comet), in 1677, who found that a pendulum clock that had a period of exactly 2 seconds in London, by comparison with the sidereal clock, had to be readjusted at the island of St. Helena. The pendulum had to be made shorter by 1/8 inch to keep the same period, a larger adjustment even than the clockmaker had allowed for. That can only mean that g at St. Helena is considerably smaller than g in London, smaller, in fact, by the same factor as L is smaller, since g/L is kept constant. We will look at the details of this in the next section, with quick approximate methods for handling such issues.

By the time Newton was writing about this, in the 1680s, he understood exactly what was going on, which is more than we can fully explain here, but we can outline it. The main reason that g is observed to be smaller at more equatorial latitudes is the rotation of the Earth. This effect alone, though, which Newton computed, is not enough to explain the size of the observed effect. The Earth must also be oblate, bulging a bit at the Equator, which also has the effect of lowering g there. In fact, one expects the Earth to bulge like that if it is rotating, and by just the amount that the data required! Finally, Newton could not rule out that the pendulums had been warmer in the tropics than in London, so that L had become larger because of thermal expansion, and therefore had to be shortened, another contribution in the same sense as what is observed. He estimated this effect quantitatively and showed that it was small compared to the other effects, but not negligible, and he kept this small correction, which made the agreement with theory almost exact.

Thus pendulum measurements, combined with Newton’s new theory of motion, had demonstrated conclusively that the Earth rotates (still controversial at the time), and had revealed a subtle departure of the Earth’s shape from a sphere. The pendulum, despite its simplicity, is a very delicate and precise instrument. You can measure its period with high precision if you are just willing to take enough time and count enough swings, and the period tells you about the interesting quantity g.

4.9 The Binomial Approximation for Perturbations

A small change in g causes a small change in T, the period of a pendulum. Physicists have quick and easy methods for estimating such effects. Any physicist would look at Eq (4.18) and realize that if g were 1% smaller, then T would be about 0.5% larger, or more generally, whatever the fractional change in g, the fractional change in T would be about half of that, and in the other direction. How does one see this?

It all goes back to the binomial theorem, which gives a formula for expanding the nth power of a binomial. We only need it in a very special case, in which the binomial is (1 + x) and x is small, $|x| \ll 1$. In this case, which we could call a “small x” approximation,

$$(1 + x)^n \approx 1 + nx \qquad (|x| \ll 1) \tag{4.20}$$

It is as if the exponent n “slid down” and multiplied x. We will call this the binomial approximation. This is an extremely useful approximation, just like the small angle approximation, and should be memorized! Fortunately its form makes it easy to remember.

As a numerical example, $(1.01)^2 \approx 1 + 2(0.01) = 1.02$. The exact value is 1.0201, different by only one part in $10^4$. The approximation is good because x here is 0.01, which is much less than 1. The binomial approximation works just as well if n is not an integer. For example $\sqrt{1.01} = (1.01)^{0.5} \approx 1 + (0.5)(0.01) = 1.005$. The exact value is 1.004987..., differing by less than 2 parts in $10^5$. For x as small as 0.01 you can use the binomial approximation with confidence.

How do we use the binomial approximation to see how g affects T for a pendulum? We write Eq (4.18) as
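The numerical examples above are easy to extend. This short Python sketch compares the approximation with the exact power for a few values of x; the larger values of x are added here only to illustrate where the approximation starts to fail, and the function names are illustrative.

```python
def binomial_approx(x, n):
    """Small-x approximation (1 + x)^n ≈ 1 + n x, Eq. (4.20)."""
    return 1 + n * x

def relative_error(x, n):
    """Fractional difference between the approximation and the exact value."""
    exact = (1 + x) ** n
    return abs(binomial_approx(x, n) - exact) / exact

for x, n in [(0.01, 2), (0.01, 0.5), (0.1, 2), (0.5, 2)]:
    print(f"x={x:<5} n={n:<4} exact={(1 + x) ** n:.6f} "
          f"approx={binomial_approx(x, n):.6f} rel.err={relative_error(x, n):.1e}")
```

For x = 0.01 the error is at the level of parts in ten thousand or smaller, while for x = 0.5 it exceeds ten percent, which is why the condition |x| ≪ 1 matters.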

$$T = 2\pi L^{1/2} g^{-1/2} \tag{4.21}$$

Now if g changes to g′ (i.e., some other value), then T changes to T′, where

$$T' = 2\pi L^{1/2} {g'}^{-1/2} \tag{4.22}$$

We introduce the notation g′ = g + ∆g, where ∆g (‘delta g’) is the little change in g, equal to the difference g′ − g. The symbol ∆ here means “change in”, a kind of peculiar concept. It does not mean we are multiplying g by something! We know from observations that g does change a little bit when you change latitude, so ∆g is just a name for this change. It is one single quantity, not a product! Depending on how g changes, ∆g could be either positive or negative. Similarly we shall write T′ = T + ∆T, where ∆T is the change in the period. Then Eq (4.22) becomes

$$T + \Delta T = 2\pi L^{1/2}(g + \Delta g)^{-1/2} = 2\pi L^{1/2} g^{-1/2}\left(1 + \frac{\Delta g}{g}\right)^{-1/2} \tag{4.23}$$

$$T + \Delta T \approx T\left(1 - \frac{1}{2}\frac{\Delta g}{g}\right) \tag{4.24}$$

We used the binomial approximation and Eq (4.21) in the last step. Finally, by algebra, this reduces to

$$\frac{\Delta T}{T} \approx -\frac{1}{2}\frac{\Delta g}{g} \tag{4.25}$$

which is just what we said at the beginning. If ∆g/g = −0.01, for example, corresponding to g becoming 1% less, then ∆T/T ≈ 0.005, corresponding to T becoming 0.5% more.

Exactly the same argument, just changing the letters, shows that if $y = Ax^n$, where A is a constant, then a change ∆x in x implies a change ∆y in y given by

$$\frac{\Delta y}{y} \approx n\frac{\Delta x}{x} \tag{4.26}$$

This is the most useful form of the binomial approximation. In the case above, n was −1/2. We can use this idea to see how changing the length L of a pendulum by a small amount ∆L affects the period T:

$$\frac{\Delta T}{T} \approx \frac{1}{2}\frac{\Delta L}{L} \tag{4.27}$$

because T is proportional to the 1/2 power of L. In the voyages described by Newton, both quantities changed, because the ships went to locations where g was different, and then the pendulum length L was adjusted to keep T the same. The cumulative effect of the two changes is

$$\frac{\Delta T}{T} \approx -\frac{1}{2}\frac{\Delta g}{g} + \frac{1}{2}\frac{\Delta L}{L} \tag{4.28}$$

and if ∆T = 0, because the adjustment in L was made to cancel the change in g, we find

$$\frac{\Delta L}{L} = \frac{\Delta g}{g} \tag{4.29}$$

The argument above gives this only approximately, but it is actually exactly true, because it is really g/L that is being kept constant by way of this adjustment, so that L must be shortened by exactly the same factor that g is made smaller. If g is smaller by 1%, through being multiplied by 0.99, then L is also.
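The perturbation estimate can be compared with an exact evaluation of the period formula. In this Python sketch the values L = 1 m and g = 9.8 m/s² are illustrative assumptions, as is the 1% change in g.

```python
import math

def period(L, g):
    """T = 2π √(L/g), Eq. (4.18)."""
    return 2 * math.pi * math.sqrt(L / g)

L, g = 1.0, 9.8          # illustrative values (meters, m/s²)
dg_over_g = -0.01        # g becomes 1% smaller

# Exact fractional change in T versus the estimate of Eq. (4.25).
exact = period(L, g * (1 + dg_over_g)) / period(L, g) - 1
estimate = -0.5 * dg_over_g
print(exact, estimate)   # both close to 0.005

# Shortening L by the same factor as g restores the period.
restored = period(L * (1 + dg_over_g), g * (1 + dg_over_g))
print(restored - period(L, g))
```

The estimate and the exact change differ only in the fifth decimal place, and the restored period matches the original up to rounding error, as the exactness of Eq (4.29) requires.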

Halley’s expedition, to take one typical example, found that L had to be shortened by 1/8 inch in a pendulum that had a 2 second period (so L ≈ 3 feet long). Thus we find ∆L/L ≈ 1/300. The typical change in g is therefore about one part in 300, or a few tenths of a percent. Why does g change like this?

4.10 Pendulums and the Rotation of the Earth

Whatever the cause of ∆g, it must lead to something with the right dimension, namely $[LT^{-2}]$, an acceleration. What is there about the Earth that has dimensions of $[T]$, or $[L]$? Well, the Earth is a sphere, so the only length that characterizes it is its radius R. (You might object, why not its diameter, 2R? That would be fine too. We are just estimating magnitudes, by looking for things with the right dimensions. Pure numbers like 2 cannot be ruled in or ruled out by these arguments.) For something with the dimension $[T]$ we have its daily period, the day, and also its annual period, the year. We might also think of the lunar period, the month, as possibly having something to do with g. Equivalently we could consider the angular speed ω in each case, having dimension $[T^{-1}]$. Now to make ∆g with its dimension $[LT^{-2}]$ from these quantities, we can only use the combination

$$\Delta g = C\omega^2 R \tag{4.30}$$

where C is a pure number. Typically the pure numbers that arise in theoretical physics are around 1. Nature seems not to use pure numbers that are very large or very small, on the whole. That is a slightly mysterious fact. Nature seems to like numbers like π and 2, that are “of order 1,” as physicists say, like the numbers that come from geometry as typical ratios. So we assume that C is not as big as 100 or as small as 0.01, but has the “order of magnitude” 1.

Let us now estimate ∆g from the expression above. Since for the Earth $R \approx 6 \times 10^6$ m, and the angular rotation speed is $\omega \approx 2\pi/86400 \approx 7 \times 10^{-5}\ \mathrm{s}^{-1}$, we find $\omega^2 R \approx 3 \times 10^{-2}\ \mathrm{m/s^2}$ as an estimate for the typical size of ∆g we should expect if the rotation of the Earth has something to do with it. Since g ≈ 10 m/s², this is a change of about 0.3%, just what was found by Halley’s expedition! Even without a detailed theory of how the rotation of the Earth affects pendulums, we can see that there probably exists such an effect. Dimensional analysis alone has told us.
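The arithmetic of this estimate takes only a few lines. The Python sketch below uses the rounded values from the text and sets the unknown pure number C to 1, which is an assumption made purely for illustration.

```python
import math

R_EARTH = 6.0e6          # m, the rounded radius used in the text
DAY = 86400.0            # s, one day, as in the text
G_SURFACE = 10.0         # m/s², rounded as in the text

omega = 2 * math.pi / DAY
delta_g = omega ** 2 * R_EARTH      # Eq. (4.30) with C = 1 (an assumption)

print(f"omega     = {omega:.2e} 1/s")
print(f"omega^2 R = {delta_g:.2e} m/s^2")
print(f"fraction  = {delta_g / G_SURFACE:.2%} of g")
```

The fraction comes out at a few tenths of a percent, the same size as the pendulum adjustment that Halley had to make.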
Let us also estimate, in the same way, how much the Earth bulges at the Equator because of its rotation. This would be a change in radius ∆R of the equatorial circle. ∆R should grow with ω, because if ω were zero, the bulge would also be zero. Now ω has dimension $[T^{-1}]$, and to get a length $[L]$, we have to cancel the time dimension. The simplest combination using the quantities we have been talking about that grows with ω and has dimension $[L]$ is

$$\Delta R = C\left(\frac{\omega^2 R}{g}\right)R \tag{4.31}$$

This expression has common sense features which make it plausible. If gravity were stronger, g would be bigger, and since g is in the denominator, the bulge would be less. That makes sense. Stronger gravity would hold things in place better. Also, because of R in the numerator, the bulge would be bigger if the Earth were bigger, but that is just part of scaling things up. Since we have already evaluated the expression in parentheses, and found that it is about 0.003, we can see that the bulge will be of the order of a few tenths of a percent of the radius R, which would be hard to see, as a visible bulging of the Earth, but still amounts to $0.003 \times 6 \times 10^6$ m ≈ 20 km, which is more than the rise of the highest mountains.

How much would g change between the bottom of a mountain and the top? Here we would not expect the rotation of the Earth to have anything to do with the effect. Presumably gravity would be weaker at the top of the mountain whether the Earth were rotating or not. The effect should be roughly proportional to the height of the mountain H, which is a length. To get something with the dimensions of g, we must use g itself, the only quantity left with the dimension $[T]$ in it:

$$\Delta g = C\,\frac{H}{R}\,g \tag{4.32}$$

For a 6 km mountain, which is high, but realistic, $H/R \approx 10^{-3}$. Thus if g = 9.80 m/s² at the bottom of the mountain, we shouldn’t be surprised if it is only 9.79 m/s² at the top, a change of about a part in $10^3$. This is not a real theory, of course, because we don’t know how to compute g from anything more fundamental, but it is an estimate of how big an effect there could be, using dimensional analysis. It turns out that there is an altitude effect of roughly this size. (Records set at the Mexico City Olympics are given a special notation in some accounts, because of the high altitude and the slightly smaller g. Presumably if g is less, you could jump higher, farther, etc. How much higher? How much farther?)

Perhaps more surprisingly, g varies in an unpredictable way if we can measure it to more than three decimal places. Geologists use “gravimeters”, sensitive instruments for measuring g, to see these local variations in g. Local dense deposits in the Earth can show up as local regions of larger g, and similarly, regions of lower density can show up as local regions of smaller g. Commercial gravimeters measure g using a unit which is about 1 part per million, so the variations that actually are seen are typically 10 to 100 parts per million. A gravimeter is like a crude eye, looking into the Earth, but without any ability to focus, able to report only that there is something interesting nearby or there isn’t.
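The bulge and altitude estimates of this section can be evaluated the same way. In this Python sketch the pure number C is set to 1 in both formulas, which is an assumption; the function names are illustrative.

```python
R_EARTH = 6.0e6               # m, rounded radius used in the text
ROTATION_FRACTION = 0.003     # ω²R/g, the value found in the text

def bulge(radius, fraction, C=1.0):
    """ΔR = C (ω²R/g) R, Eq. (4.31); C = 1 is an assumption."""
    return C * fraction * radius

def altitude_change_in_g(height, radius, g, C=1.0):
    """Δg = C (H/R) g, Eq. (4.32); C = 1 is an assumption."""
    return C * (height / radius) * g

print(bulge(R_EARTH, ROTATION_FRACTION) / 1000, "km")          # about 18 km
print(altitude_change_in_g(6000.0, R_EARTH, 9.80), "m/s^2")    # about 0.01 m/s^2
```

Both results are order-of-magnitude statements only, since dimensional analysis cannot fix the value of C.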

4.11 Simple Harmonic Oscillators

Any quantity x that oscillates in a sinusoidal way is called a simple harmonic oscillator. The word “harmonic” indicates that it has a well defined angular frequency ω, so that the time dependence of x is

$$x = A\sin(\omega t) \tag{4.33}$$

for some constant amplitude A and angular frequency ω. There is one slight generalization we should also consider. In Eq (4.33) we notice that the phase θ = ωt is 0 at t = 0, but in general the zero of time could correspond to any phase in the oscillation. Thus the most general simple harmonic oscillation is

$$x = A\sin(\omega t + \delta) \tag{4.34}$$

The extra phase angle δ is called the phase shift. If δ = π/2, for example, then the oscillation starts at its maximum value A at time t = 0, instead of starting at 0. For this particular phase shift π/2, the resulting oscillation could be written more simply as A cos(ωt), i.e.,

$$A\sin(\omega t + \pi/2) = A\cos(\omega t) \tag{4.35}$$

The cosine function is just the sine function with a phase shift π/2. If we graph both the sine and the cosine on the same axes, as in Fig 4.12, we see that the cosine is literally the sine shifted by π/2.

Fig. 4.12: The sine and cosine curves are the same except for a phase shift of π/2. The formula is sin(θ + π/2) = cos(θ). Adding π/2 to the argument of the sine shifts the sine curve (solid) to the left by π/2, as indicated by the arrow, where it coincides with the cosine curve (dashed).

Another special phase shift is π. The effect of a phase shift π is

$$A\sin(\omega t + \pi) = -A\sin(\omega t) \tag{4.36}$$
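Both phase-shift identities can be spot-checked numerically. The following Python sketch uses arbitrary illustrative values for the amplitude, angular frequency and sample times; the function name `sho` is not from the text.

```python
import math

def sho(t, A, omega, delta=0.0):
    """Simple harmonic oscillation x = A sin(ωt + δ), Eq. (4.34)."""
    return A * math.sin(omega * t + delta)

A, omega = 2.0, 3.0      # illustrative amplitude and angular frequency
for t in (0.0, 0.3, 1.7):
    shifted_quarter = sho(t, A, omega, math.pi / 2)
    shifted_half = sho(t, A, omega, math.pi)
    # Eq. (4.35): a shift of π/2 turns the sine into a cosine.
    print(shifted_quarter, A * math.cos(omega * t))
    # Eq. (4.36): a shift of π turns the sine upside down.
    print(shifted_half, -sho(t, A, omega))
```

Each printed pair agrees to within rounding error at every sample time.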

Perhaps you can see in Fig 4.12 that shifting the sine curve by twice π/2, which is the same as shifting the cosine curve by π/2, gives the sine curve “upside-down.” This phase shift of π turns out to be a very important idea later on when we look at how waves interfere.

The velocity of a simple harmonic oscillator is

$$v = \omega A\cos(\omega t + \delta) \tag{4.37}$$

by the argument of Section 4.7. Thus the oscillating velocity is π/2 out of phase with the oscillating position. The velocity is maximal (either positive or negative) when the position of the oscillator is zero, and the velocity is zero when the position of the oscillator is maximal (either positive or negative). This last statement just says that the oscillator stops at the extreme limit of each swing: perhaps that is obvious.

The world is full of real oscillators, meaning things with a sinusoidal time dependence. All of them are potentially clocks. Old-fashioned watches have a little wheel that oscillates back and forth, driven by a coiled watch spring. Modern quartz watches have a crystal that oscillates at a very high frequency, so fast, and with such a small amplitude, that you couldn’t see it with your eye. And for many purposes, atoms behave like oscillators, in the sense of having built-in definite frequencies.

Physicists have a kind of metaphor for all simple harmonic oscillators: the mass-on-a-spring, shown in Fig 4.13. The idea is that if the mass moves left, the spring is compressed, and if the mass moves right, the spring is stretched. Either way, the spring forces the mass back the other direction. The spring is said to exert a “restoring force” on the mass, tending to restore it to that special place where the spring is neither compressed nor stretched, its relaxed position. The mass always overshoots, however, first on one side, then on the other, and hence it oscillates.
An analysis of the mass on a spring using Newton’s laws of motion predicts that the angular frequency will be

$$\omega = \sqrt{\frac{k}{m}} \tag{4.38}$$

where m is the mass and k is a constant representing the stiffness of the spring. The period of the oscillator is then

$$T = 2\pi\sqrt{\frac{m}{k}} \tag{4.39}$$

Fig. 4.13: A mass m on a spring with spring constant k.

Even if the oscillator is more complicated than this, or even looks nothing at all like this, physicists keep this simple picture in mind, because, in a sense, all oscillators are alike, abstractly, so one might as well have one standard picture of an oscillator. We have not talked about either force or mass, so this is just a look ahead, but everyone has an intuitive idea of what is meant by mass. It is worth noticing in these expressions that if m gets larger for fixed k, then ω goes down and T goes up: the more massive oscillator is more sluggish. In fact $T \propto \sqrt{m}$ for fixed k. Also, if k gets larger, i.e., the spring gets springier, then ω goes up: springier means faster oscillation, $\omega \propto \sqrt{k}$ for fixed m.

Let us see how this works in the case of the one oscillator we know, the pendulum. The pendulum bob must have some mass m, and it is on a string of length L. The frequency ω is such that

$$\omega^2 = \frac{g}{L} = \frac{k}{m} \qquad (4.40)$$

On the right hand side we are insisting that we want to regard the pendulum as a mass m on a spring k, even though there is no actual spring. We see that we can do it if we just understand the “spring constant” to be

$$k = \frac{mg}{L} \qquad (4.41)$$

In this case it is potentially confusing to regard the pendulum as a mass on a spring, because what we find is that the spring constant k is proportional to m. In the ratio k/m = g/L, the mass m cancels, so that the period of a pendulum is actually independent of m. Unless you remember that when you change m in a pendulum you also are changing k, you might erroneously expect the pendulum to run slower if it is more massive: not true! For this oscillator there is no way to change m without also changing k in a way that exactly compensates. Fig 4.13 suggests that the spring and the mass are independent things, but for the pendulum the mass m, through its weight, is the spring, somehow. It is a characteristic property of gravity to pull harder on larger masses, just compensating for their greater sluggishness (or inertia, to use the technical word). We take up the topic of mass and weight in Chapter 5.
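The cancellation of m can be checked numerically. The following is only a minimal Python sketch, not part of the original argument: the value g = 9.8 m/s², the spring stiffness of 10, the 1 m pendulum length, and all function names are illustrative assumptions. It evaluates the period formula of Eq (4.39) for an ordinary spring with fixed k, and again for a pendulum whose effective spring constant is k = mg/L.

```python
import math

G = 9.8  # m/s^2, assumed value of the acceleration due to gravity


def spring_period(m, k):
    """Period T = 2*pi*sqrt(m/k) of a mass m on a spring of stiffness k."""
    return 2 * math.pi * math.sqrt(m / k)


def pendulum_spring_constant(m, length):
    """Effective 'spring constant' k = m*g/L of a pendulum."""
    return m * G / length


def pendulum_period(m, length):
    """Period of a pendulum, treated as a mass on its effective spring."""
    return spring_period(m, pendulum_spring_constant(m, length))


if __name__ == "__main__":
    # Ordinary spring, fixed k (illustrative value): T grows like sqrt(m).
    for m in (1.0, 4.0):
        print(f"spring,   m = {m}: T = {spring_period(m, 10.0):.3f} s")
    # Pendulum: k grows in proportion to m, so T does not change.
    for m in (1.0, 4.0):
        print(f"pendulum, m = {m}: T = {pendulum_period(m, 1.0):.3f} s")
```

Under these assumptions, quadrupling the mass doubles the period of the ordinary spring, while the pendulum period stays the same.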

4.12 Exponential Decay

So far we have described oscillation as if it kept the same amplitude A forever, and never ran down. In fact, though, most oscillators, like pendulums, do run down. A good model for this is to assume that on each successive swing its amplitude is some fixed fraction r of the previous amplitude. Thus r is the ratio of successive amplitudes, assumed constant (and less than one). If the oscillator starts from rest with displacement $A_0$, then on the next swing it will only reach the displacement $A_1 = rA_0$. On the second swing it reaches displacement $A_2 = rA_1 = r^2 A_0$, and so forth. In general the nth swing will reach displacement

$$A_n = r^n A_0 \qquad (4.42)$$

This is a discrete example of exponential decay. It is called exponential because the variable n, which is the number of oscillations, and therefore proportional to time, occurs in the exponent. Whether a given oscillator really behaves like this as it runs down is, of course, a question for experiment to decide, but it is always a reasonable guess.

If you did this experiment on a real oscillator, and noted An for n = 0, 1, 2, ..., how could you tell if it was exponential decay? The naive method is to look at successive ratios, like A1/A0, A2/A1, etc. to see if they were all the same r. Because of the unavoidable little inaccuracies of the measurement process, though, the ratios will certainly not all be the same, even if this is exponential decay. A much better method is to re-express Eq (4.42) by taking a logarithm: log(An) = n log(r) + log(A0) (4.43)

This says that in exponential decay, log(An) is a linear function of n. Thus we just plot log(An) vs. n and see if it is a straight line. If it is, then the slope is log(r), and thus we determine r (even if the data scatter a bit about the line, so that individual ratios are not constant). This method illustrates a very useful idea in data analysis generally. If you suspect some relationship, like Eq (4.42) above, an excellent test is to plot data in such a way that they will lie along a straight line if the suspicion is correct. The reason is that you can always recognize a straight line! Other curves are much less obvious to identify. We have seen this idea before, in Sections 2.3 and 2.9. In the case of the exponential relationship in Eq (4.42), the logarithm produces the linear relationship with n in Eq (4.43), bringing n down out of the exponential. The idea is very much like that in Section 2.9, where we had a similar test for a power law, using a log-log plot. The test for an exponential decay is sometimes called a semilog plot, because we only take the logarithm of An, but not n. We illustrate the method below, using the fake data in the table.
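The straight-line test can also be carried out numerically rather than by eye. The sketch below, in Python, is one hedged way to do it: it fits an ordinary least-squares line to ln(An) versus n, using the six amplitudes from the table in this section (repeated in the code so that it is self-contained). The helper name fit_line and the choice of least squares are assumptions; the text itself only asks for a straight line drawn through the plotted points.

```python
import math

# Amplitudes A_n for n = 0..5, copied from the table in this section.
amplitudes = [2.092, 1.674, 1.298, 1.065, 0.913, 0.747]


def fit_line(xs, ys):
    """Ordinary least-squares straight line y = intercept + slope * x."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


swings = list(range(len(amplitudes)))
log_amplitudes = [math.log(a) for a in amplitudes]  # natural logarithm

slope, intercept = fit_line(swings, log_amplitudes)
r = math.exp(slope)       # fitted ratio of successive amplitudes
a0 = math.exp(intercept)  # fitted initial amplitude

# The naive test for comparison: the individual ratios scatter noticeably.
ratios = [b / a for a, b in zip(amplitudes, amplitudes[1:])]

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"r = {r:.3f}, A0 = {a0:.2f}")
print("successive ratios:", [round(x, 3) for x in ratios])
```

The fitted slope and intercept come out close to the line quoted in Eq (4.44), while the individual ratios range from roughly 0.78 to 0.86, which is why the fit is the better test.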

n    An       ln(An)
0    2.092     0.738
1    1.674     0.515
2    1.298     0.261
3    1.065     0.063
4    0.913    -0.091
5    0.747    -0.292

The data in the semilog plot in Fig 4.14 lie approximately on the straight line

$$\ln(A_n) \approx 0.711 - 0.205\,n \qquad (4.44)$$

Fig. 4.14: Plot of ln(An) vs n from the accompanying table.

We chose the natural logarithm here. Any other choice of base for the logarithm would just change the scale on the vertical axis, leaving the picture the same. The inverse of the natural logarithm is the exponential function (base e). Applying this function to both sides we find An itself in a standard exponential form,

$$A_n \approx e^{0.711 - 0.205n} = e^{0.711}\,(e^{-0.205})^n \qquad (4.45)$$

In the terms that we began with,

$$A_0 \approx e^{0.711} \approx 2.04 \qquad (4.46)$$

$$r \approx e^{-0.205} \approx 0.815 \qquad (4.47)$$

That is, the oscillator starts with an amplitude that is about 2, and each successive swing is only about 81% the displacement of the previous one.

If we think of steady oscillation as being circular motion seen from the side, then decaying oscillation can be thought of as spiral motion seen from the side. The exponential decay of the oscillation is what turns the circle into a spiral, as the amplitude shrinks in time. The equation of such an exponentially decaying oscillation, looking in from the side, is

$$y = A e^{-\alpha t} \sin \omega t \qquad (4.48)$$

as shown in Fig 4.15. The radius A of the circle, which had been constant before, is now exponentially decaying, and is replaced by $A e^{-\alpha t}$, where α is a constant, called the decay rate. If the decay rate happens to be 0, then the exponential factor is $e^0 = 1$ and we are back to the previous case, of simple harmonic motion. But if α > 0, then when the time increases by one period T, the amplitude is multiplied by $r = e^{-\alpha T}$. This is essentially the case we thought about earlier, where on each swing the amplitude is cut down by some factor r.

Fig. 4.15: A spiral in toward the origin, as in graph (a), looks like an oscillation decaying in time, as in graph (b), when observed from the side. The decaying oscillation in (b) has the equation $y = A e^{-\alpha t} \sin \omega t$, where ω is the angular frequency, and α is the decay rate. The exponentially decaying amplitude $A e^{-\alpha t}$ is indicated with dashed lines. Compare Fig 4.7.

The decay constant α has a simple interpretation. It has dimension $[T^{-1}]$, like ω, since αt, like ωt, must be dimensionless. Thus

$$\tau = \frac{1}{\alpha} \qquad (4.49)$$

is a time, called the decay time. It is the time it takes for the amplitude to fall to 1/e ≈ 0.368 of its initial value, that is, for the oscillation to die down very noticeably. Notice that it is not the time for the oscillation to stop. In fact, the oscillation never stops in this model, it just gets smaller and smaller, and the time for it to get very noticeably smaller is about τ. Since there are now two times describing the oscillator, T the period, and

τ the decay time, there is a natural dimensionless ratio

$$Q = \frac{\tau}{T} \qquad (4.50)$$

called the quality of the oscillator. (Engineers just refer to “the Q” of the oscillator.) It is the decay time measured in periods, which is simply the number of oscillations the oscillator makes before it has essentially run down. The oscillator in Fig 4.15 looks as if it has a Q of about 3. This is a very low quality oscillator! It hardly oscillates at all before it has essentially quit. A church bell, on the other hand, may ring for many seconds, at a frequency of hundreds of Hz, so it could easily have a Q of 1000 or more. When the Apollo 12 mission left the Moon in November, 1969, it deliberately crashed the landing module Intrepid into the lunar surface to generate oscillations to be detected by seismometers left behind. The results were amazing and puzzling: the Moon rang like a bell! Its Q was estimated at 4000-6000. Earthquakes generate oscillations of the Earth, but Q down here is more like 200. Think about that next time you feel a tremor.

Exponential decay can also be used as a clock. It isn’t as convenient as an oscillation that just ticks along forever, but if you know the decay time τ, and if you can monitor the decay, then you have a measure of time. If, for example, you wait long enough for the amplitude to fall to 1/e of its initial value, then you must have waited for one decay time τ. If you wait for it to fall by another factor 1/e, so that it is now less than the initial amplitude by the factor $1/e^2$, then you must have waited 2τ. In that sense, exponential decay is a clock: the logarithm of the amplitude is a linear function of time. The best name for it might be a logarithmic clock. The logarithm of the decaying quantity is interpreted as the time.

The simplest way to read a logarithmic clock is to use the notion of half-life, denoted T1/2. The half life is a time, not too different from the decay time, actually, but simpler to think about. It is the time for the exponentially decaying quantity to fall to half its original value. Since the decay time is the time for the exponentially decaying quantity to fall to 1/e of its original value, these ideas are essentially the same, but 1/2 is a little more familiar than 1/e. Here is how to use it. Suppose something is decaying exponentially, and it has been reduced to 1/4 its original value. How much time has passed? Answer: two half lives, because it has been reduced by (1/2) and then by (1/2) again, for a total reduction of $(1/2)^2 = 1/4$. If it were reduced to 1/8 of its original value, three half lives must have passed, because $1/8 = (1/2)^3$.

4.13 Dating by Radioactive Decay

Natural radioactivity gives us a number of logarithmic clocks. The most familiar of these is probably Carbon 14, a radioactive isotope of carbon with a half life T1/2 ≈ 5700 years. By measuring the amount of C14 remaining in a sample, we can determine how old the sample is (lots of details to fill in here!).

As we just mentioned, the half life is not a very new concept: it is essentially just the decay time τ, and simpler to think about. In any exponential decay, there is some quantity that decays in time t by the factor $e^{-\alpha t}$. The half life T1/2 is just the time at which this factor is 1/2, the time for the quantity to fall to 1/2 its original value. Thus

$$e^{-\alpha T_{1/2}} = \frac{1}{2} \qquad (4.51)$$

and hence, taking the natural logarithm on each side,

$$-\alpha T_{1/2} = -\ln 2 \qquad (4.52)$$

so that

$$T_{1/2} = \tau \ln 2 \qquad (4.53)$$

using τ = 1/α. The proportionality factor between half life T1/2 and decay time τ is ln(2) ≈ 0.693, which is smaller than one, so that the half life is a bit less than the decay time. That makes sense because in a half life the quantity falls to 0.5 times its original value, but in a decay time it falls to 0.368 times its original value, which takes a little longer.
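Reading a logarithmic clock amounts to inverting the decay law. As a hedged sketch in Python (the function names are hypothetical, and the 5700-year figure is the half life quoted above), the elapsed time follows from writing the remaining fraction as (1/2) raised to the number of half lives, so that t = T1/2 log2(1/fraction):

```python
import math


def half_life_from_decay_time(tau):
    """Half life T_1/2 = tau * ln 2, as in Eq (4.53)."""
    return tau * math.log(2)


def elapsed_time(fraction_remaining, half_life):
    """Time for a decaying quantity to fall to the given fraction of its
    original value: t = T_1/2 * log2(1 / fraction)."""
    return half_life * math.log2(1.0 / fraction_remaining)


if __name__ == "__main__":
    t_half = 5700.0  # years, the C14 half life quoted in the text
    for fraction in (0.5, 0.25, 0.125, 1 / math.e):
        years = elapsed_time(fraction, t_half)
        print(f"fraction {fraction:.3f} remaining -> about {years:.0f} years")
```

The fractions 1/2, 1/4 and 1/8 give one, two and three half lives, as in the counting argument above, and the fraction 1/e gives back the decay time τ = T1/2/ln 2.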

Thus, in the case of C14, the decay time is τ = T1/2/ln 2 ≈ 8200 years. That means the decay rate is $\alpha = 1/\tau = 1.2 \times 10^{-4}\ \mathrm{yr}^{-1} = 3.8 \times 10^{-12}\ \mathrm{s}^{-1}$. The number of decays in a short time is proportional to the decay rate, as the name suggests, but also to the number of C14 nuclei, since each one of them is a candidate to decay. The decay rate is really telling us what fraction of the nuclei decay per second, per year, etc., depending on units. If we have a sample of N0 nuclei, then decay events happen at the rate αN0, called the activity, proportional both to α and to N0. The SI unit of activity is the becquerel (Bq), 1 decay per second. In one year, using α as computed above, the fraction $1.2 \times 10^{-4}$ or 0.012% of the C14 nuclei in our bodies, or in any other sample, undergo radioactive decay. So how radioactive does that make us?

The natural abundance of C14 is about 1 part in $10^{12}$, that is, 1 atom in every $10^{12}$ atoms of carbon is the radioactive isotope. Since the atomic weight of carbon is about 12, one mole of carbon atoms has a mass of 12 grams, so in 1 kg of carbon, there are about 80 moles, or $5 \times 10^{25}$ atoms, of which $N_0 = 5 \times 10^{13}$ are C14. Using the decay rate α found above, we find the activity will be about αN0 ≈ 200 decays per second (200 Bq). We are pretty hot! (Note: if the notions of atomic weight and mole are unfamiliar to you, just accept the result. We will deal with these issues more carefully in Chapter 9, and we will say more about radioactivity in Chapter 20.) Now suppose we had a 1 kg sample of carbon and it was only giving us 100 decays per second instead of 200: having only half its original activity, it must have only half of its original C14 atoms, and hence it must be 5700 years old. (In practice one uses much smaller samples, and the number of decays observed per second is proportionately less. We are skipping over all the practical details of how you would actually monitor a sample for its activity.)
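The arithmetic of this estimate can be laid out step by step. The Python sketch below is an illustration under stated assumptions, not a dating procedure: Avogadro's number and the number of seconds in a year are standard values not given in the text, the function names are hypothetical, and none of the practical corrections to radiocarbon dating are included.

```python
import math

AVOGADRO = 6.022e23         # atoms per mole (standard value, assumed here)
SECONDS_PER_YEAR = 3.156e7  # approximate length of a year in seconds
HALF_LIFE_YEARS = 5700.0    # C14 half life quoted in the text
C14_ABUNDANCE = 1e-12       # about 1 part in 10^12, as in the text
CARBON_MOLAR_MASS = 12.0    # grams per mole


def decay_rate_per_second(half_life_years):
    """Decay rate alpha = ln 2 / T_1/2, converted to units of 1/s."""
    return math.log(2) / (half_life_years * SECONDS_PER_YEAR)


def c14_activity(carbon_mass_grams):
    """Activity alpha * N0, in decays per second (Bq), of fresh carbon."""
    carbon_atoms = carbon_mass_grams / CARBON_MOLAR_MASS * AVOGADRO
    n0 = carbon_atoms * C14_ABUNDANCE  # number of C14 nuclei
    return decay_rate_per_second(HALF_LIFE_YEARS) * n0


def age_from_activity(measured_bq, fresh_bq):
    """Age in years, assuming pure exponential decay from fresh_bq."""
    return HALF_LIFE_YEARS * math.log2(fresh_bq / measured_bq)


if __name__ == "__main__":
    fresh = c14_activity(1000.0)  # 1 kg of carbon
    print(f"activity of 1 kg of fresh carbon: about {fresh:.0f} Bq")
    print(f"age if the activity has halved: {age_from_activity(fresh / 2, fresh):.0f} years")
```

Without the intermediate rounding, the activity comes out a little under 200 Bq, consistent with the rounded figure above, and halving the activity corresponds to exactly one half life.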

The above describes the basic idea, but there are many details still to fill in. Perhaps the most pressing one is, if C14 is decaying away at this rate, why is there any of it around? The answer is that it is continually being produced by cosmic rays, in a process by which Nitrogen-14, the principal constituent of the atmosphere, is converted to C14. When C14 decays, it goes back to being N14. Atmospheric C14 reacts with oxygen in the atmosphere to create a radioactive form of carbon dioxide that is taken up by plants, which are eaten by herbivores, which are eaten by carnivores. In this way the atmospheric C14 is exchanged among all living things, and its abundance agrees with its abundance in the atmosphere, which, as we said above, is about 1 part in $10^{12}$. After a plant or animal dies, though, it stops this exchange process of respiring and eating. Whatever C14 it has is all that it is ever going to have, and that starts to decay away. This is why formerly living things can be dated.

The first tests of this idea looked at samples of known age, for example wood from the sarcophagus of an Egyptian pharaoh of a historically datable dynasty. The most useful comparison has been with dendrochronology, the dating of wood samples by tree rings. Visible patterns of more and less growth year by year in tree rings, due to random annual variations in weather, have been pieced together to give precisely datable samples going back in some cases over 10,000 years. The basic result is that radiocarbon dating works, but there are small corrections necessary for precise work, and in some cases large corrections. Underwater life, for example, is not exposed directly to the atmosphere, but it is exposed to dissolved carbonate ions which, coming from ancient rocks, have no C14. Thus underwater specimens have less C14 than expected and without a correction would appear older than they really are. Occasional episodes of volcanism can temporarily inject a lot of ancient CO2 into the atmosphere, none of it radioactive: this dilutes C14 and so makes everything alive at the time appear slightly older than it is, in the sense that it has less C14 than it should. The Age of Industrialization, starting around 1800, did the same thing, in burning fossil fuels, so C14 became relatively less abundant than it was before. On the other hand, nuclear weapons tests in the atmosphere between 1945 and 1963 produced a lot of C14, so abundances now are unnaturally high. The study of this “pulse” of C14 has revealed how carbon moves through the biosphere, a small benefit from an otherwise regrettable episode.

Problems

Angular Clocks

4.1 What is the angular velocity of the minute hand and the hour hand of a clock? Give the answer in several different units.

4.2 The summer and winter solstices occur around June 21 and December 21, with a variability of about 18 hours (which sometimes changes the date). Find the number of solar days between the solstices, and say what it means for the speed of the earth in its orbit. Why is this result so different from the number of solar days between the equinoxes?

Global Positioning

4.3 In Fig 4.5 the satellites at A and B are separated by 500 km, and the delays registered by the receiver at P are t2 = 1 ms and t3 = 1 ms respectively (i.e., both are 1 ms). Where is the point P? (Assume $c = 3 \times 10^8$ m/s, and ignore the complexities of the atmosphere.)

Longitude

4.4 If in making a longitude measurement your clock is off by 1 second from GMT, how great an error do you make in the determination? By how great a distance do you err if you are on the equator? If you are at 45° N latitude? If you are at the North Pole? Make your reasoning clear.


4.5 Suppose that in making a measurement of local time, for the purpose of comparing with GMT, and hence determining longitude, you measure the position of a known star, but get the position wrong by 1 minute of arc. How far off will the longitude determination be? Roughly how great a distance does that correspond to? Make your reasoning clear.

Circular Functions

4.6 Use Fig 4.8 to find the cosine, sine, and tangent of the angles 3π/4, 2π/3, 5π/4, 5π/3, and 8π/3, and sketch a circle with these positions indicated.

4.7 Sketch the graphs, on the same set of axes, of the functions 2 sin t and sin 2t.

4.8 Investigate in Fig 4.9 whether the radii R and the periods T of the moons of Jupiter obey a power law, i.e., whether $R \propto T^{\alpha}$ for some α.

Velocity in Orbit

4.9 Estimate from Fig 4.9 the speeds of the moons of Jupiter in units of the speed of the innermost moon. Make it clear how you are doing this.

4.10 Copy from Fig 4.9 the graph of the observed position R sin θ of the outermost moon of Jupiter over time, and on the same time axis graph the projected component of velocity ωR cos θ. Check that the projected velocity really is zero at the time when the moon “stops” at its extreme elongation from Jupiter and turns back the other direction.

Pendulum

4.11 Suppose we set out to make a pendulum that takes exactly 1 second on each swing, so that its period is T = 2 s.

(a) Describe how to check by counting swings from one transit of a star to the transit on the following night that the period is exactly 2 seconds. (b) How long will this pendulum be, in meters? (c) It was once proposed to make this length the standard unit of length, on the grounds that it could be easily reproduced anywhere. Would this actually work?

4.12 Suppose you take a pendulum clock, which is running accurately at your old location, to a new location where g, the acceleration due to gravity, is less by 0.1%. How many minutes will the clock gain or lose in a day at the new location? How could you adjust the length of the pendulum to make the clock run accurately again?

4.13 Suppose your legs are 3 ft long, and your stride (length of one step) is also 3 ft. (a) If your legs swing like pendulums of length L = 3 ft, with no lag between one swing and the next, how fast do you walk, in miles per hour? (b) In fact, it is not reasonable to expect a 3 ft leg to swing like an L = 3 ft pendulum, because what we mean by that is a mass M on the end of a 3 ft string. But in a leg, there is mass all the way along the length. Only a little bit of the mass is at a distance 3 ft from the pivot. If the mass is uniformly distributed along the length L, then it would swing like a pendulum of length $\frac{2}{3}L$, a result that requires calculus. What would be the effect on your answer in (a) if we put in this more realistic length? (c) In fact, the mass of the leg is not uniformly distributed along the length, but is concentrated in the upper part of the leg. This would produce a further correction in the direction of the correction from (a) to (b). Comment on the common sense of these results: are the walking rates reasonable?

Binomial Approximation

Use the binomial approximation to get quick answers to the following questions.

4.14 If the angular speed ω of a wheel goes down by 1%, what happens to the period T ?

4.15 If the radius R of a sphere becomes 1% greater, what happens to the volume V ?

4.16 If the radius R of a sphere becomes 1% smaller, what happens to the surface area A?

4.17 If the surface area S of a sphere becomes 1% larger, what happens to the volume V ?

Simple Harmonic Oscillators

4.18 Two oscillators have the same amplitude A and frequency ω, but they oscillate with a constant phase difference π. Graph their motion as a function of time, on the same set of axes. Also write formulas for the functions of time that give their positions.

4.19 Two oscillators have the same amplitude A, but their frequencies are in the ratio 2:1. Each starts initially at position 0, at t = 0. Write formulas for functions that give their positions as a function of time, and sketch a graph of their motion on one set of axes.

4.20 Two oscillators have the same amplitude A and frequency ω, but they differ in phase by π/2. Graph their motion as a function of time on one set of axes.

4.21 An oscillator can be thought of as a mass m on a spring of stiffness k. How does the frequency of the oscillator change if the mass is doubled but the spring stays the same? Sketch a graph of the oscillation for the original m and also for the doubled mass 2m, on the same time axis. (Label the graphs!)

4.22 An oscillator can be thought of as a mass m on a spring of stiffness k. How does the frequency of the oscillator change if the spring stiffness k doubles? Sketch a graph of the oscillation for the original k and also for the doubled stiffness 2k, on the same time axis. (Label the graphs.)

4.23 A mass of 100 kg is oscillating with a period of 10 s. What must be the stiffness k of the “spring” that causes it to oscillate? (We use quotation marks, because it might not be a real spring: the mass might be hanging on a long rope and swinging as a pendulum, for example.)

4.24 A 3 kg mass oscillates 5 times per second from A to B and back again. What must be the stiffness k of the “spring” that causes it to oscillate?

Exponential Decay

4.25 Show by graphical means that the following data for successive amplitudes An of a pendulum running down are consistent with exponential decay, and in particular use the slope of your graph to find the factor r relating each amplitude to the next. Check common sense against the original data for An.

n    An       ln(An)
0    4.29      1.46
1    2.87      1.06
2    1.88      0.634
3    1.29      0.251
4    0.923    -0.080
5    0.645    -0.439

4.26 1 µg of Strontium-90 is about $6.7 \times 10^{15}$ atoms. The half-life of Strontium-90 is about 28 years. Using the ideas of Section 4.13, find the activity of a 1 µg sample in decays per second. Make your reasoning clear.

Chapter 5

Mass, Weight, and Equilibrium

The theory of the balance of forces and torques in equilibrium, what we now call statics, was fully worked out in antiquity. Nowadays the notion of force seems more complicated, occurring, as it does, not only in statics but also in dynamics, the theory of motion. The earlier theory took force as a simple concept, one that we are all intuitively clear about, with weight being the simplest force of all. We will follow this tradition and assume that “force” requires no deep explanation. Archimedes’ Law of the Lever can be used to give an operational definition of static force.

5.1 Archimedes

In one of the most beautiful works of Hellenistic physics, Archimedes proved his famous Law of the Lever. It is illustrated in Fig 5.1. It says that weights in proportion p : q balance where the beam connecting them is divided in proportion q : p. Nowadays this result is just a special case of Newton’s theory, and would be called the balance of torques. It is worth looking at Archimedes’ argument in isolation, though. Historically, Archimedes’ results stood alone for centuries, as one of very few indications that Nature admits a mathematical theory.

Fig. 5.1: The Law of the Lever: Weights in proportion p : q balance where the beam is divided in proportion q : p. The balance point is indicated here by the triangular fulcrum. (In the example drawn, weights 3 and 2 hang on arms of length 2 and 3.)

The book in question is On the Equilibrium of Plane Figures. It deals with the kind of situation illustrated in Fig 5.1, weights on possibly unequal arms of a balance apparatus. The beam of the apparatus itself has no weight (an interesting abstraction already). The form of the book is much like the form of Euclid’s Elements. It begins with postulates, and the rest is deduction from the postulates. We describe just a little of the book.

Postulate I says that equal weights at equal distances from the fulcrum balance. It might seem that this is obviously true, but when you imagine trying to verify it, you realize there is a problem. Suppose you take weights that you think are equal at equal distances from the fulcrum, and they do not balance. What do you conclude? That Postulate I is false, or that the weights weren’t equal after all? You can’t tell! That is why Postulate I is a postulate. Its status has nothing to do with empirical verification. It is completely abstract. The equal weights it talks about are theoretical objects. There is so far no way to know when weights are equal! Postulate I is true within the theory, by definition. That is what it means to be a postulate.

Postulate III says that if weights balance, and you remove something from one side, then the weight goes down on the other side. From just these two postulates Archimedes proves his Proposition I: if two weights balance at equal distances, then they are equal (giving, finally, a way to tell that they are equal). Here is the proof: Suppose two weights that balance at equal distances are not equal. Then we can remove weight from the heavier one until they are equal. But in removing weight from one side of weights that balance, we know the other side will go down, by Postulate III. This contradicts Postulate I, since the now equal weights at equal distances do not balance.
Thus our supposition that the weights were not equal must be wrong, and therefore they are equal!

The weights in this theory are suspended from their centers (or “centers of gravity”), the place where each balances individually. It is assumed that the weights keep the same value even if you change their shape. This is the key to proving the Law of the Lever. For the example in Fig 5.1, where the masses are in proportion 3 : 2, divide the mass 3 into 2 × 3 = 6 equal parts and the mass 2 into 2 × 2 = 4 equal parts, so that all 10 parts are equal. Now reshape them, keeping the centers of the two masses in the same place, and re-assemble them into a single rod of length 10, where the beam had length 5. It will look as shown in Fig 5.2. The balance point is now the center of the rod, by symmetry, and that is just the point determined by the Law of the Lever.

Fig. 5.2: Archimedes’ proof of the Law of the Lever, using the example of Fig 5.1. The original weights of 3 and 2 have been reshaped into a rod. Each mass has contributed segments to the rod, and the centers of each mass are in the original places. A dotted line shows where the left mass ends and the right mass begins. The balance will now obviously be at the midpoint of the rod, just the location determined by the Law of the Lever.

In fact, if the left mass were p and the right mass were q, where p and q are integers, then take a unit of length such that the beam has length p + q, and form the rod out of 2p + 2q equal segments, of total length 2p + 2q. Its midpoint can be found by starting from either end and counting p + q segments. But starting from the left you count p segments to the center of the left mass, and then q segments further on. And starting from the right you count q segments to the center of the right mass, and then p segments further on. And in this way you see that the original beam is divided as q : p.
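The counting argument can be traced mechanically. The following Python sketch is only an illustration of the construction just described, with hypothetical function names and a coordinate x (an assumption of the sketch) measured along the beam from the left weight. It reshapes integer weights p and q into a rod, takes the rod's midpoint as the balance point, and compares the result with the division q : p stated by the Law of the Lever.

```python
from fractions import Fraction


def archimedes_balance_point(p, q):
    """Balance point from the reshaping argument, for integer weights p and q.

    The weight p hangs at x = 0 and the weight q at x = p + q. Each weight
    is spread into unit segments centered on its own suspension point:
    2p segments for the left weight and 2q for the right one.
    """
    left_start, left_end = -p, p           # 2p segments centered on x = 0
    right_start, right_end = p, p + 2 * q  # 2q segments centered on x = p + q
    assert left_end == right_start         # the two pieces join into one rod
    # By symmetry the uniform rod balances at its midpoint.
    return Fraction(left_start + right_end, 2)


def lever_law_balance_point(p, q):
    """Balance point if the beam of length p + q is divided as q : p."""
    beam = p + q
    return Fraction(q * beam, p + q)  # left arm is q/(p+q) of the beam


if __name__ == "__main__":
    for p, q in [(3, 2), (1, 1), (5, 7)]:
        x = archimedes_balance_point(p, q)
        print(f"weights {p}:{q} -> arms {x}:{p + q - x}")
```

For the weights 3 and 2 of Fig 5.1, the midpoint of the rod falls 2 units from the left weight and 3 units from the right one, so the arms are in proportion 2 : 3.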

This result means you can, in principle, easily measure one weight W1 in units of another weight W2, because that is just the proportion W1 : W2, and

$$\frac{W_1}{W_2} = \frac{L_2}{L_1}, \qquad (5.1)$$

where L1 and L2 are the distances along the beam from each weight to the fulcrum when you get the weights to balance. The ratio of the weights is represented visibly on the beam as the ratio of the two arms. The two lengths L1 and L2 are called the lever arms for the weights W1 and W2 respectively.

In addition to being a measuring instrument, the lever is a practical device for balancing a large weight by a small weight. A small weight W2 can be “leveraged” by the ratio L2/L1 to balance a large weight W1, to lift it, for example. The ratio L2/L1 (pure number) is sometimes called the “mechanical advantage” of the arrangement. Archimedes’ result is what is needed to design practical devices to handle such jobs.

The theory is still a bit abstract. In real applications the beam itself has weight, for example. In fact, though, it is easy to include the weight of the beam. You find the actual balance by finding the point A where W1 and W2 would balance by themselves, as we have done, and then, regarding the beam as a weight W3, with its center at B, the balance of the entire assembly will be at the point that divides the segment AB in proportion W3 : (W1 + W2). That is, you just use the Law of the Lever again.

Archimedes is mentioned in two places in Roman histories, once in the biography of the Roman general Marcellus in Plutarch’s Lives, and again in Polybius’ history of the Punic Wars. These accounts, which are rather similar, were written well over a hundred years after the death of Archimedes, when Roman rule and Roman ways were firmly established. Plutarch emphasizes Archimedes’ impracticality, a common Roman complaint about Greek intellectuals. He seems to express a Roman frustration that the Greeks produced theoretical works and not blueprints for useful devices. The Romans seem never to have appreciated how practical and useful the Greek abstractions were to anyone who understood them. These same accounts also contradict themselves, in a way, because the reason they talk about Archimedes at all is their interest in the war machines he devised at the siege of Syracuse. According to Plutarch it was largely Archimedes’ ingenuity that kept the Romans out. When the city finally fell, Archimedes was murdered by a soldier.

5.2 Torque and Force

Nowadays we rearrange Eq (5.1), multiplying both sides by W2L1, and write the balance condition as

W1L1 = W2L2 (5.2)

We read it as saying that the torque W1L1 due to W1, which tends to rotate the beam in the conventional positive direction (counterclockwise) is equal to the torque W2L2 due to W2, which tends to rotate the beam in the conventional negative direction (clockwise). Here torque is the product of the weight and the distance, the distance being the distance out from the fulcrum, also called the “lever arm.” Since these torques balance, the beam balances. We can also write it as

W1L1 − W2L2 = 0 (5.3)

We read this as saying that the total torque about the fulcrum is zero. We have to put in each torque with its appropriate sign, saying which direction it tends to twist the beam. The condition of balance is called equilibrium, a Latin word that refers precisely to this balance of torques, literally “equal-balance.” More generally, the notion of equilibrium has been extended to denote any condition of balance or stationarity.

Weight is the prototype for the more general notion called force. Weight is a force, but not every force is a weight. Archimedes’ lever provides a way to compare weights with other, more general, forces. For example, when

there is a weight W1 at one end of the lever, and you balance it by pushing down on the other end with your hand, you can read off from the position of the fulcrum what force you are exerting. It is the same force as the weight W2 that would balance W1, namely W2 = (L1/L2)W1. The two forces are related by the mechanical advantage. We represent the situation somewhat abstractly in Fig 5.3. The beam is shown subject to forces W1 and W2 at the two ends A and B. We do not say whether these forces are weights or some other force. We also show an upward force F exerted by the fulcrum, which supports the beam.

Fig. 5.3: Abstract representation for the situation in Fig 5.1 (labels: ends A and B, fulcrum O, lever arms L1 and L2, forces W1, W2 and F)

There are actually two conditions of balance,

W1L1 − W2L2 = 0 (5.4)

F − W1 − W2 = 0 (5.5)

The first of these is the balance of torques. The force F does not appear in Eq (5.4) because it exerts no torque about O: its lever arm is zero. (To be careful, we should emphasize that we are balancing torques about the point O: the lever arms L1 and L2 are measured from that point.) The second equation, Eq (5.5), is the balance of vertical forces, expressing the fact that the fulcrum, by pushing up, balances the downward forces which would otherwise cause the beam to fall. We call forces “up” positive and forces “down” negative, an arbitrary sign convention. Notice that if we used the opposite sign convention and called “down” positive and “up” negative, we would still get the same second equation, just multiplied through by −1. Either way, Eq (5.5) says F = W1 + W2. That is, the fulcrum force F balances the full weight (or whatever W1 and W2 are).

It is interesting to notice that the torques balance about any point, not just O. Suppose we compute torques about the point A, meaning we measure lever arms from A. Then the force W1 exerts no torque, because its lever arm is zero. The balance of torques, using the same sign convention for positive and negative torque, is now

0 = FL1 − W2(L1 + L2) (5.6)

But putting in F = W1 + W2, from Eq (5.5), we find after cancellation of the term W2L1 that what is left in Eq (5.6) is exactly Eq (5.4), the Law of the Lever. Alternatively, since the Law of the Lever must hold about both point O and point A, Eq (5.5) is not really a new, second condition. Using two statements of the Law of the Lever, Eq (5.4) and Eq (5.6), we find Eq (5.5) by subtracting:

FL1 − W2(L1 + L2) = 0 (5.7)

W1L1 − W2L2 = 0 (5.8)

(F − W1 − W2)L1 = 0 by subtraction (5.9)

In the last equation, since L1 ≠ 0, it must be that F − W1 − W2 = 0, just Eq (5.5). This kind of internal consistency is necessary for the theory to make sense.

The diagram in Fig 5.3 is an example of a free body diagram, an essential tool in reasoning correctly about forces and torques. It strips away all detail from the situation and keeps only the minimum information about the shape of the body together with the forces that act on that body.

Fig. 5.4: Free body diagram for a weight W1 on a string with tension T1

Let us do an even simpler free body diagram, one for just the weight W1, shown in Fig 5.4. In this case both forces act along a line through the center, so the lever arms are zero (measured from this point), and the torques are zero, and hence automatically balance. The only condition is that the force up should balance the force down. The force down is of course the weight W1, but the weight is hanging on a string, which exerts a force T1 up (the force due to a string is called tension, hence the letter T). The condition of balance is

T1 − W1 = 0 (5.10)

that is, T1 = W1: the tension exactly balances the weight. Similarly, a weight W2 would hang in equilibrium on a string with tension T2 = W2.

Now we return to Fig 5.3. If this is a free body diagram for the beam alone, which is represented by a single horizontal line, then the forces exerted on it should be the force F due to the fulcrum, and forces at either end due to the strings. That is, we imagine the weights W1 and W2 are hanging on strings from the beam, but are not considered part of the beam. The beam itself is only contacted by the strings, which pull on it with their tension forces T1 at the left end, and T2 at the right end. As we have just seen, however, from looking at free body diagrams for the weights, these tension forces are equal to the weights, and since the strings are below the beam (and strings can only pull, not push), the forces due to the strings on the beam are down. (We are assuming the strings themselves have negligible weight!) Thus the free body diagram in Fig 5.3 is correct, but perhaps slightly confusing. It seems to suggest that there are weight forces acting on the beam at the ends, but really the weight forces act on different objects, the weights themselves. The forces in Fig 5.3 are tension forces, due to strings attached to the beam, and just happen to be equal to the weights.

Forces and torques are best understood from free body diagrams. The word “free” means we choose an object and draw it free of all the other things around it, in isolation. Then we add the forces that act on the chosen object due to other objects. Anything that touches our chosen object can in principle exert a force, and so should be represented by a force in the diagram. That is why there are three forces in Fig 5.3: the chosen object is the beam, and the things that touch it are the two strings and the fulcrum.
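As a numerical check of the free body diagram for the beam, the short Python sketch below evaluates the net torque about several points for a balanced, weightless beam. The function names and the sample numbers (W1 = 6 N, L1 = 1 m, L2 = 3 m) are illustrative assumptions, not values from the text. Torques are counted positive counterclockwise and forces positive upward, as above.

```python
def balance_lever(w1, l1, l2):
    """Return (w2, f) for a weightless beam in equilibrium.

    w2 follows from the Law of the Lever, Eq (5.4), and the fulcrum
    force f from the balance of vertical forces, Eq (5.5).
    """
    w2 = w1 * l1 / l2
    f = w1 + w2
    return w2, f


def net_torque(pivot, w1, w2, f, l1, l2):
    """Net torque (counterclockwise positive) about a point on the beam.

    Positions are measured to the right of end A, so W1 acts at 0,
    the fulcrum at l1, and W2 at l1 + l2. Upward forces are positive.
    """
    forces = [(0.0, -w1), (l1, f), (l1 + l2, -w2)]
    return sum((position - pivot) * force for position, force in forces)


# Hypothetical numbers, chosen only for illustration.
w1, l1, l2 = 6.0, 1.0, 3.0
w2, f = balance_lever(w1, l1, l2)
print(f"W2 = {w2} N, F = {f} N")
for pivot in (0.0, l1, l1 + l2, 2.5):
    torque = net_torque(pivot, w1, w2, f, l1, l2)
    print(f"pivot at {pivot} m from A: net torque = {torque:.3g} N m")
```

Taking the pivot at 0 reproduces Eq (5.6), and taking it at l1 reproduces Eq (5.4). Once the forces balance, the net torque vanishes about every point, which is the internal consistency noted above.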
In Fig 5.4 the chosen object is a weight, and it is touched only by one other object, a string, from above. That accounts for the force T1 in the diagram. The weight force W1 is not due to anything that touches the weight, but rather to the Earth, which mysteriously exerts a force at a distance without touching: thus all objects have an extra force on them, not due to something touching them, but rather due to the Earth, pulling them down.

Let us go back to Fig 5.3 and see what difference it makes to include the weight W3 of the beam, which we have neglected so far. This force acts at the center of the beam, which is at the position (L2 − L1)/2 to the right of the point O (this is its lever arm). The balance of torques says

$$W_1 L_1 - W_2 L_2 - \left(\frac{L_2 - L_1}{2}\right) W_3 = 0 \tag{5.11}$$

and thus, by algebra, in order to balance, the beam should be divided in the ratio

$$\frac{L_1}{L_2} = \frac{W_2 + W_3/2}{W_1 + W_3/2} \tag{5.12}$$

We check common sense. First, both sides of this relation are dimensionless. Second, if W3 is negligible, the right hand side is just W2/W1, the usual Law of the Lever. As another limiting case, suppose W3 is much bigger than the weights W1 and W2, as if these were perhaps just mosquitos that had landed on the (heavy) beam. Ignoring W1 and W2 we have L1/L2 = 1, that is, the beam balances in the middle. This is clearly correct, because the mosquitos are too small to affect the balance very much. These common sense checks strongly suggest that the expression is correct. If W3 is neither very large nor very small, the balance position will be somewhere between the two limiting cases that we just checked.

5.3 Spring Forces: Hooke’s Law

In Fig 5.4 the upward force T1 of a string under tension could also have been provided by a spring, with the weight hanging on the spring, instead of a string. In terms of the balance of forces, the diagram would have been the same. What is meant by a spring is something that noticeably deforms (stretches, in this case) when it exerts a force.

Denote by x the extension of the end of the spring from its relaxed position. Then we can ask how much the spring stretches for a given weight W when it is in equilibrium. It seems clear that if the weight were more, then x would also be more. In fact, for most springs, which means most deformable materials, the amount of extension is proportional to the weight:

x ∝ W (5.13)

an observed fact that is called Hooke’s Law, after Robert Hooke, a contemporary of Isaac Newton. We could also express this as an equality by introducing a constant of proportionality

W = kx (5.14)

where k is a constant corresponding, in a precise way, to the stiffness of the spring. If k is large, the spring is stiff, in the sense that W must be large to get even a small deformation x. The final form of Hooke’s Law, encoding the above information in the most useful way, says that the force FS exerted by a spring is

FS = −kx (5.15)

The force exerted by a spring is proportional to the extension x, and it is in the opposite direction to the deformation. That is the meaning of the minus sign. If the spring is stretched downward by a weight, then it pulls upward on the weight. If the spring is underneath and is compressed by the weight, then the deformation x is still downward, and the force on the weight is still upward, opposing gravity, just as in Fig 5.4, again. Hooke’s Law for most springs describes both compression and stretching, with the same k. The Hooke’s Law force Eq (5.15) is also called a restoring force, since it opposes any displacement away from the rest position. The larger the deformation of the spring, in either direction, the harder it pushes back, and the force is only zero when the spring has its relaxed length.

The constant k in Hooke’s Law is the same spring constant that appears in Eqs (4.38) and (4.39), for the angular frequency and period of a spring oscillator. In that setting, a stiff spring, with large k, has a high frequency and a short period. In what follows, though, we think just about the equilibrium of a spring. We will return to the spring as an oscillator in Chapter 6.

A spring of known spring constant k is a convenient scale for weighing things. It can be calibrated using weights that have already been measured and checked with a balance. It would have marks showing how far it stretches for various weights, the distances being proportional to the weights, if it really obeys Hooke’s Law. Thereafter one can use it as a scale in place of the balance. It works because you can see by the deformation of the spring how much force the spring exerts, and hence the weight that it is supporting. The two systems for weighing, the spring scale and the balance, seem interchangeable, and of course both are used in practice. The remarkable thing is that they measure different things! Only one of them actually measures weight.
That is a subtlety that we take up in the next section.
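The calibration procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the calibration data are invented, and the least-squares fit of a line through the origin is one reasonable way to estimate k, not a method given in the text.

```python
def fit_spring_constant(weights, extensions):
    """Estimate k in W = k x by least squares for a line through the origin."""
    numerator = sum(w * x for w, x in zip(weights, extensions))
    denominator = sum(x * x for x in extensions)
    return numerator / denominator


def weight_from_extension(k, extension):
    """Read a weight off the calibrated scale using Hooke's Law, W = k x."""
    return k * extension


# Hypothetical calibration data: known weights (N) and measured extensions (m).
known_weights = [1.0, 2.0, 3.0]
measured_extensions = [0.020, 0.041, 0.059]

k = fit_spring_constant(known_weights, measured_extensions)
print(f"estimated spring constant: {k:.1f} N/m")
print(f"an extension of 0.050 m reads as about {weight_from_extension(k, 0.050):.2f} N")
```

If the measured points did not lie close to a straight line through the origin, that would be a sign that the spring does not really obey Hooke’s Law over this range.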

5.4 Weight and Mass

A sensitive spring scale can be transported to different latitudes and altitudes, along with some test weight W, just to check its behavior. If you should do this, you would find that the test object does not keep the same weight! According to the spring scale, W is less in equatorial regions, and it is less at high altitude. If we didn’t have some reasonable explanation for what is happening, we might think that the spring is somehow affected by being in different places, but that doesn’t seem very plausible. Suppose, on the contrary, that the spring constant k is the same everywhere, i.e., the spring is not affected. Then we have to believe the spring scale. If it says the weight is different, it really must be different.

Weight is the force due to gravity, of course, and we already know that g, the measure of the Earth’s gravity, varies from place to place. This is the simple explanation for why the weight of an object really does vary. Weight is proportional to g, and g varies. That is,

W = mg (5.16)

where g is the acceleration due to gravity, and m is a constant of proportionality, called the mass. In ordinary speech we use “mass” and “weight” quite interchangeably, but in physics we make a careful distinction. Mass and weight don’t even have the same units. Mass is measured in grams, or, in the International System of Units (SI), kilograms, abbreviated kg. The dimension associated with mass is [M]. From Eq (5.16) we see that weight, and more generally force, has dimension [MLT⁻²], recalling that [g] = [LT⁻²]. Thus the SI unit for weight, or force, is the kg·m/s², also called the newton, abbreviated N.

A spring scale measures weight, and so should be calibrated in newtons. In the US the spring scale might be calibrated in pounds, because the pound is also a unit of weight, and hence force. A balance, on the other hand, if it is really used for measuring, comes with standard weights, labelled by their mass in grams. When we balance an unknown W2 against a standard weight W1, we know

$$\frac{L_1}{L_2} = \frac{W_2}{W_1} = \frac{m_2 g}{m_1 g} = \frac{m_2}{m_1} \tag{5.17}$$

That is, the balance determines mass, not weight! Whatever g is at the place where the measurement is made, it cancels out. Its value is irrelevant. The masses do not change as they are transported, and if they balance in one location, they balance in all locations. This is why chemists use balances, not spring scales, to measure quantities of materials. 100 grams of a reagent has the same meaning everywhere, but 1 newton weight of reagent has a slightly variable meaning, depending on location. The two quantities are roughly the same however, because the weight of m = 100 grams = 0.1 kg is about 0.1 × 9.8 = 0.98 N ≈ 1 N, using the typical value g = 9.8 m/s² in W = mg.

In the spring scale, the spring force kx and the weight mg balance, i.e.

$$x = \frac{mg}{k} \tag{5.18}$$

where x is the extension of the spring. It is this extension that you read on the scale. When the scale is used for weighing, we know k is constant, so this says that x is proportional to mg, i.e., we read the weight mg. But we can also keep both k and m constant, and then we read this as saying that x is proportional to g. This is how gravimeters are constructed. They are just sensitive spring scales, with a fixed mass m to weigh. We read the local value of g.
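To make the distinction concrete, the sketch below compares what a spring scale and a balance report when the same masses are carried to places with different g. The spring constant, the masses, and the two values of g (roughly the sea-level values near the equator and near the poles) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def spring_extension(mass, g, k):
    """Extension of a spring scale in equilibrium, x = m g / k, Eq (5.18)."""
    return mass * g / k


def balance_arm_ratio(m_standard, m_unknown, g):
    """Arm ratio L1/L2 = W2/W1 for a balance, Eq (5.17); g cancels."""
    return (m_unknown * g) / (m_standard * g)


def g_from_gravimeter(extension, mass, k):
    """Invert Eq (5.18) to read the local g from a spring of known k and m."""
    return k * extension / mass


K = 50.0                          # N/m, hypothetical spring constant
G_EQUATOR, G_POLE = 9.78, 9.83    # m/s^2, approximate illustrative values

for g in (G_EQUATOR, G_POLE):
    x = spring_extension(0.100, g, K)
    ratio = balance_arm_ratio(0.100, 0.250, g)
    print(f"g = {g} m/s^2: spring extension = {x * 1000:.2f} mm, balance ratio = {ratio:.3f}")
```

The spring extension changes with g while the balance ratio does not, which is the sense in which the spring scale measures weight and the balance measures mass.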

We have seen how the SI unit of [T], the second, is defined, and how the SI unit of [L], the meter, is defined. How is the SI unit of [M], the kilogram, defined? Remarkably, it is by an arbitrary choice! When the metric system was invented, during the French Revolution, it was intended that the gram be the mass of 1 cubic centimeter of water, and this is still approximately true, but for precise work this is apparently not good enough to be a definition. The kilogram is now, by definition, the mass of a platinum-iridium cylinder kept at the Bureau International des Poids et Mesures (BIPM) in Sèvres, France, in a vacuum, to prevent, to the extent possible, chemical changes on its surface which might incorporate extra mass from the atmosphere. There are copies of it, which balance with it, to the precision attainable, in other countries. The weight of the standard kilogram in newtons would be found by multiplying by the local value of g.

We are more accustomed to measuring weight in pounds. The definition of the pound (lb) now depends on the SI system! By definition

1 lb = 4.448221615 N (5.19)

The British system, which is of course very ancient, did not distinguish between weight and mass. Rather confusingly, one sometimes hears mass expressed in pounds (or pound mass, a more careful way to say it). One pound mass is the mass that weighs a pound. This convention relies on the choice of an arbitrary representative value for g, namely 9.80665 m/s², and with this conversion factor 1 pound mass is 0.45359237 kg, or 453.59237 g. We will only use these conversions to re-express forces approximately in pounds, in order to get a more intuitive idea of them. It will be quite sufficient to know 1 kg ≈ 1/0.454 ≈ 2.2 pound mass for this purpose: i.e., on the surface of the Earth, 1 kg weighs about 2.2 pounds.

5.5 Springs in Parallel and Series

We continue using free body diagrams to understand the notion of force in equilibrium. Suppose a weight W = mg is supported by two springs, with spring constants k1 and k2, as in Fig 5.5, and imagine that each spring is stretched by the same amount x. Let L1 and L2 be the distances from the center of the weight at which the springs are attached, as indicated in the free body diagram.

Fig. 5.5: A weight is supported on two unequal springs, each stretched by an amount x. The situation is summarized in the free body diagram on the right.

Then in equilibrium the balance of torques about the center and the balance of forces require

−k1xL1 + k2xL2 = 0 (5.20)

k1x + k2x − W = 0 (5.21)

Notice that the weight W does not contribute to the torque about the center, because its lever arm about the center is zero. The first relation says L2/L1 = k1/k2, which just says where to attach the springs so that the springs will stretch by the same x when the weight is suspended. If this condition were not observed, the springs would have to stretch unequal amounts in equilibrium to balance the torques, and the weight would hang crookedly. The second relation says

W = (k1 + k2)x (5.22)

This says that the springs are stretched by an amount x proportional to the weight W, with the constant of proportionality k1 + k2. It is as if the two springs were effectively a single spring with an effective spring constant

keff = k1 + k2 (5.23)

In particular, if we have two springs k, then keff = 2k. This arrangement of springs is called “springs in parallel”, and what we have found is that the result is an effectively stronger spring. Each spring only has to support a fraction of the total weight W. Spring number 1 exerts a force

$$F_1 = k_1 x = \left(\frac{k_1}{k_1 + k_2}\right) W \tag{5.24}$$

taking x from Eq (5.22). The fraction in parentheses is clearly less than 1, so F1 is less than the total weight W, and similarly for the force F2 exerted by the other spring. Notice that the stronger spring bears proportionately more weight in this configuration. The two springs divide up the load.

Now consider a different configuration of two springs, springs in series, as shown in Fig 5.6. A weight is suspended from the two springs. The free body diagram for the weight alone has two forces, the force k1x1 from the spring which touches it from above, and the weight W, due to the Earth. Here x1 is the extension of the first spring. Note that the second spring does not touch the weight, and therefore does not contribute a force to the free body diagram for the weight. The second free body diagram is for the compound object consisting of the weight together with the first spring. This object is touched by the second spring, from above, which exerts a force k2x2, where x2 is the extension of the second spring. It has the same weight W, since we assume the weight of the first spring is negligible. In equilibrium the forces must balance, so we deduce

W = k1x1 = k2x2 (5.25)

Fig. 5.6: Two springs in series support a weight W. The free body diagrams are for the weight W and for the compound object consisting of the weight W together with the first (massless) spring.

Notice that in series each spring supports the full weight W , unlike the case in parallel, where the weight was divided between the springs. The series arrangement is in effect a single compound spring with an effective spring constant keff . The extension of the compound spring is the sum of the extensions of the individual springs,

$$x = x_1 + x_2 = \frac{W}{k_1} + \frac{W}{k_2} = \frac{W}{k_{\text{eff}}} \tag{5.26}$$

and thus

$$\frac{1}{k_{\text{eff}}} = \frac{1}{k_1} + \frac{1}{k_2} \tag{5.27}$$

quite a different keff from the parallel case. If we have two springs k, then in series keff = k/2. The series spring is less stiff than either of the springs individually. By algebra,

$$k_{\text{eff}} = \left(\frac{k_1}{k_1 + k_2}\right) k_2 \tag{5.28}$$

Since the expression in parentheses is less than 1, keff is less than k2. Exchanging the suffixes 1 and 2 we see that keff is also less than k1. To sum up, two springs in parallel make a stronger spring, as in Eq (5.23), and two springs in series make a weaker spring, as in Eq (5.27). We will see analogues of these two expressions in many other situations where two components can combine in parallel or series.
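The two combination rules are easy to compare numerically. The Python sketch below is a minimal illustration. The function names, the spring constants (100 N/m and 300 N/m) and the 20 N load are assumptions made up for the example.

```python
def k_parallel(k1, k2):
    """Effective spring constant for springs in parallel, Eq (5.23)."""
    return k1 + k2


def k_series(k1, k2):
    """Effective spring constant for springs in series, Eq (5.27)."""
    return 1.0 / (1.0 / k1 + 1.0 / k2)


def parallel_load_share(k1, k2, weight):
    """Forces carried by each spring in parallel, as in Eq (5.24)."""
    x = weight / k_parallel(k1, k2)   # common extension, from Eq (5.22)
    return k1 * x, k2 * x


# Hypothetical values, for illustration only.
k1, k2, weight = 100.0, 300.0, 20.0

print(f"parallel: k_eff = {k_parallel(k1, k2):.1f} N/m")
print(f"series:   k_eff = {k_series(k1, k2):.1f} N/m")
f1, f2 = parallel_load_share(k1, k2, weight)
print(f"parallel load shares: F1 = {f1:.1f} N, F2 = {f2:.1f} N")
print(f"attachment ratio L2/L1 = k1/k2 = {k1 / k2:.3f}")
```

In parallel the stiffer spring carries the larger share of the load, while in series each spring carries the full weight and the combination is softer than either spring alone.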

5.6 Newton’s Third Law

Our treatment of springs in series depended upon a rather non-obvious choice, illustrated in Fig 5.6 at the far right, the choice of weight+spring in the rightmost free body diagram. With that choice the argument was easy. But suppose we had not thought to make that choice: would we have been stuck? The answer is no. Any choice of body for the free body diagram will work, but there is one additional thing we have to know about such diagrams and the forces in them. This additional thing is Newton’s Third Law. It says that if A exerts a force on B, then B exerts a force on A that is equal in magnitude and opposite in direction. In symbols we could say FAB = −FBA.

Fig. 5.7: Another way to think about springs in series

We illustrate how to use Newton’s Third Law by returning to the problem of springs in series in Fig 5.7 but making a different choice. As in Fig 5.6 we will keep the free body diagram for the weight W, which tells us that W = k1x1 in equilibrium, but now for a second diagram we take as the free body the first spring. Its free body diagram is shown rightmost in Fig 5.7. When we think about the first spring as a free body, we have to think what forces are exerted on it by other bodies. The Earth can exert a force even without touching it, of course, but we are assuming the weight of the spring is negligible, so its small weight does not appear in the diagram. Now we think what objects touch this spring. The second spring touches it from above, and pulls up with the force k2x2. Also the weight W touches it from below. What is the force on the first spring due to the weight? That is where Newton’s Third Law comes in. We already know that the force on the weight due to the first spring is k1x1, as shown in the free body diagram for the weight (the middle figure). That tells us what the force is due to the weight on the first spring: it is k1x1 again, but down, not up. These two k1x1 forces are the pair of forces described by Newton’s Third Law. For every force on a free body, there is a paired force, pointing the other direction, on some other body, the body responsible for the force in the first place. The two k1x1 forces in Fig 5.7 are a good example of a Newton’s Third Law pair. Since the first spring is just suspended in equilibrium, the balance of forces on it, reading from the diagram, says k1x1 = k2x2, in agreement with Eq (5.25). The rest of the analysis of springs-in-series goes just as before.

Let us think what else Newton’s Third Law says about this situation. There is a force W (down) on the weight, due to the Earth. That means there is a force −W on the Earth, due to the weight (the minus sign meaning opposite, or in this case up). That is not something we would notice, but it means “weight” is really a mutual attraction between the Earth and individual masses. Newton’s Third Law says things attract each other gravitationally. It is a hint of Newton’s “universal gravitation”, the theory that everything attracts everything.

Finally, where is the Newton’s Third Law pair for the force k2x2 in Fig 5.7? Well, k2x2 is a force up, due to spring #2 on spring #1. We have not drawn a free body diagram for spring #2, but if we did, it would have a force k2x2 down, due to spring #1, exactly the paired force associated to the force on spring #1 due to spring #2. If you have followed this discussion, you know how to use Newton’s Third Law. It reminds us that every force in a diagram is the force exerted by some other body, because there must be another body where the paired force occurs. To give just one more example, the second spring is perhaps supported by a hook. In order to balance the force k2x2 down, exerted by the first spring, there must be a force k2x2 up, exerted by the hook on the second spring. Therefore there is also a force k2x2 down exerted by the second spring on the hook, etc. Newton’s Third Law is sometimes described as “the law of action and reaction”. This is not a very useful formulation. It is not even clear what it means. Is the force FAB on A due to B the “action,” and the force FBA on B due to A the “reaction”? Does one of them cause the other? Is one of them primary and the other secondary? The honest answer is no. They occur as a pair. That is all Newton’s Third Law says.

5.7 Young’s Modulus

A solid slab of material, like steel, obeys Hooke’s Law in the sense that if you try to stretch it, by pulling it from both sides with a force F, it will stretch by a small amount ∆x proportional to F. A steel wire is a convenient geometry to try this out. If you hang a weight from a wire, it will stretch, by an amount proportional to the weight. It is, in effect, a spring.

Of course the geometry of the material makes a difference. A thick slab might deform only imperceptibly, while a long thin wire might stretch very noticeably. That is, if we write Hooke’s Law as F = keff ∆x, where keff is the spring constant, we know that keff will be much greater for the slab than for the wire, even though both are made of steel. Thus keff is not a material property of steel, but rather it depends on things like the thickness and the length of the steel.

There is, however, a kind of spring constant k that is a material property, namely the k that describes the “spring” connecting two neighboring atoms in the material. We are assuming that applying a force to try to separate these two atoms would produce a displacement proportional to the force, i.e., that Hooke’s Law also applies at the atomic scale. This amounts to a model of a solid in which the atoms are masses and they are connected in a regular structure by springs. Each spring has a length which is the interatomic distance ℓatom, whatever that may be. When we stretch the material, we are really stretching all these tiny springs.

Imagine the interatomic springs in a steel wire that are oriented along the long direction of the wire. There would also be springs perpendicular to these, connecting atoms across the width of the wire, but they would not be stretched by a weight hanging on the wire, so we ignore them. In any cross-section of the wire the number of lengthwise springs is proportional to the cross-sectional area A. When the wire supports a weight, all these springs are stretched in parallel, so their effective strength adds up: keff for the wire is proportional to A. But the number of lengthwise springs in a wire of length L is proportional to L. When the wire supports a weight, all these springs are stretched in series, so their effective strength is less: the effective k of the wire is proportional to 1/L. Thus

$$k_{\text{eff}} \propto \frac{A}{L} \tag{5.29}$$

or, putting in the constant of proportionality,

$$k_{\text{eff}} = \frac{AY}{L} \tag{5.30}$$

where Y is called Young’s modulus. Thus when we stretch the wire with a force F, the amount of stretching ∆x obeys

$$F = \frac{AY}{L}\,\Delta x \tag{5.31}$$

This relation is usually re-arranged to read

$$\frac{F}{A} = Y\,\frac{\Delta x}{L} \tag{5.32}$$

and interpreted as follows.
The left hand side, F/A, a force per unit area, is called the stress. Its SI unit is N/m², also called the pascal, abbreviated Pa. (In Chapter 8 we will meet the notion of pressure, which is also a stress, also measured in pascals.) On the right hand side we have the dimensionless combination ∆x/L, called the dimensionless strain (or just strain). It is the fractional change in length. What this argument about microscopic springs really says is that stress should be proportional to strain. The constant of proportionality is Young’s modulus Y, a material property. Notice that Y has the dimension of stress. Eq (5.32) is, of course, an experimentally testable proposition, and for small enough strain it is found to be true, as if there really are little springs!

Here is a numerical example. Young’s modulus for steel is about 2 × 10¹¹ Pa. A cylindrical steel wire with radius 1 mm and length 10 m has keff = (2 × 10¹¹)(π × 10⁻⁶)/10 ≈ 6 × 10⁴ N/m. This means a 10 kg mass (weighing 98 N) will stretch the wire about ∆x ≈ 1.6 × 10⁻³ m, or between 1 and 2 millimeters. The strain is ∆x/L ≈ 1.6 × 10⁻⁴, and the stress is Y∆x/L ≈ 3.2 × 10⁷ Pa. This value is actually getting to be a bit large! The yield stress of steel is about 2.5 × 10⁸ Pa. This is the applied stress that would make the material begin to deform permanently, not in a springy way. In our example the applied stress is less than the yield stress, but not by much. The breaking stress is about 4 × 10⁸ Pa. We are about an order of magnitude less than the breaking stress in the example. Realistically, though, imperfections in an actual wire may make it weaker than our estimate in practice, and more liable to break. Engineers always design structures with a safety factor, so that applied stresses will be well below the yield stress and breaking stress.
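The arithmetic of this example can be reproduced with a short Python sketch. The numbers are the ones quoted above (with g = 9.8 m/s²). The function and variable names are hypothetical, and the "safety factor" here is simply taken to be the ratio of yield stress to applied stress, which is an assumption about what one might want to compute rather than an engineering standard.

```python
import math


def wire_spring_constant(youngs_modulus, radius, length):
    """Effective spring constant of a cylindrical wire, k_eff = A Y / L, Eq (5.30)."""
    area = math.pi * radius ** 2
    return area * youngs_modulus / length


Y_STEEL = 2e11          # Pa, Young's modulus quoted in the text
YIELD_STRESS = 2.5e8    # Pa, yield stress quoted in the text
radius, length = 1e-3, 10.0   # m
mass, g = 10.0, 9.8           # kg, m/s^2

force = mass * g                                   # weight of the load, N
k_eff = wire_spring_constant(Y_STEEL, radius, length)
stretch = force / k_eff                            # Hooke's Law, F = k_eff * dx
strain = stretch / length                          # dimensionless
stress = Y_STEEL * strain                          # Eq (5.32), equals F / A
safety_factor = YIELD_STRESS / stress              # assumed definition, see lead-in

print(f"k_eff   = {k_eff:.3g} N/m")
print(f"stretch = {stretch:.3g} m")
print(f"strain  = {strain:.3g}")
print(f"stress  = {stress:.3g} Pa")
print(f"yield stress / applied stress = {safety_factor:.1f}")
```

Changing the radius or the length shows the scaling directly: doubling the radius quadruples keff, while doubling the length halves it.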

5.8 The Force Between Atoms

If we think about the little springs in more detail, we can relate Young’s modulus to the microscopic strength k of an interatomic spring. Let ℓatom be the interatomic spacing, and consider again a cylindrical wire. Then in the cross-sectional area A, each lengthwise spring occupies area (ℓatom)², and hence there are A/(ℓatom)² springs in a cross-section. This makes a kind of compound spring of strength kA/(ℓatom)², where k is the interatomic spring constant. (We are just recapitulating the reasoning above that said the effective spring constant is proportional to A, only this time keeping the constant of proportionality.) Now these effective springs, each with spring constant kA/(ℓatom)², are in series when we put them together to make a wire of length L. The number of them is L/ℓatom, and hence the effective spring constant is found by dividing,

$$k_{\text{eff}} = \frac{kA}{L\,\ell_{\text{atom}}} \tag{5.33}$$

Comparing with Eq (5.30), we see that Young’s modulus is related to the interatomic spring constant k by

Y = k/ℓatom (5.34)

(check dimensions!) If we use Y = 2 × 10¹¹ Pa, for steel, and estimate ℓatom = 5 × 10⁻¹⁰ m for the interatomic spacing (5 Å), then we find the interatomic spring constant is

k = ℓatom Y = (5 × 10⁻¹⁰ m)(2 × 10¹¹ Pa) = 100 N/m (5.35)

a very peculiar fact! After all, the newton and the meter are chosen to be convenient units at our human scale. One doesn’t expect a constant associated with the atomic scale to have an everyday value in SI units. This accident certainly makes it easy to remember, though.

Of course the little spring with this k cannot stretch very far. In the numerical example of the previous section we found that a strain of 1.6 × 10⁻⁴, corresponding to a stress 3.2 × 10⁷ Pa on the wire, was already getting close to the yield stress, the breakdown of Hooke’s Law. If we extrapolate to the breaking stress, 4 × 10⁸ Pa, we find that the strain 2 × 10⁻³ is definitely too large. This corresponds to a stretching of the little atomic spring by ∆x/ℓatom = 2 × 10⁻³, or ∆x ≈ 10⁻¹² m, using ℓatom ≈ 5 Å, as before. The corresponding force, to stretch the atomic spring too far, is

$$F = k\,\Delta x = 100\ \text{pN} \quad \text{(force to break atomic spring)} \tag{5.36}$$

i.e., about 100 piconewtons. Interestingly, a similar value, tens of piconewtons, is found in biophysical settings, as the typical force necessary to separate two atoms or molecules which are adhering without having formed a chemical bond. Living cells, when they crawl over a surface, form molecular adhesions with their substrate, and the force necessary to break these adhesions is typically tens of pN. The force necessary to pull membrane proteins out of membranes is similar. Clever methods to pull on the ends of single RNA molecules can detect the breaking of adhesions between one part of the RNA molecule and another, to pull it out straight. Again the force necessary is tens of piconewtons. The molecules in these experiments are certainly not the constituents of steel, but the strength of adhesions is about the same. We get the impression that non-specific adhesions between atoms can be broken with a force of some tens of piconewtons, at least in order of magnitude.

This observation is especially interesting because we don’t have a good theory for it. Our understanding of how individual atoms interact is quantum mechanics, and quantum mechanics uses the notion of energy, not force. We understand covalent chemical bonds better than we understand these somewhat ill-characterized situations of atoms which adhere simply because they are close to each other. Even in the steel, the breaking strength is determined not by the perfect crystalline structure but by defects in this structure, grain boundaries between crystals, for example. Perfect crystals could survive much higher strains than $2 \times 10^{-3}$, the breaking value we estimated for steel, and 100 piconewtons is certainly not enough to separate the atoms in a molecule – rather it can separate two nearby molecules from each other, leaving the molecules intact.
This scale, in between quantum and classical physics, is still a research area.

Problems

Law of the Lever

5.1 Use Archimedes’ result to show that if a weight q balances a weight p at the two ends of a lever of length L, then the balance point is a distance pL/(p + q) from q and a distance qL/(p + q) from p. Sketch how this looks, taking p = 3 and q = 4.

5.2 Explain how a bottle opener works (the kind that pries off a bottle top) in terms of forces and torques. Make a free body diagram for the bottle opener, and give numerical estimates for the forces, assuming equilibrium.

5.3 Describe the forces on the lower jaw when you bite something. Make a sketch of a free body diagram for the lower jaw bone and give numerical estimates of the forces, assuming equilibrium. Distinguish the force down due to the upper jaw at the hinge where the two bones meet, and the force up of the muscle that pulls the jaw closed.

5.4 When you row, you pull one end of the oar, the water pushes the other end, and at the oarlock the boat pushes the oar. Make a free body diagram for the oar, and give numerical estimates for the forces, assuming equilibrium. Why is the boat pushed forward?

5.5 Suppose you hold a 20 lb. weight in your right hand, with your elbow at your side and your forearm extended horizontally. Sketch a free body diagram for the forearm, and estimate the forces, assuming equilibrium. Distinguish the force down of the humerus (upper arm bone) at the joint, and the force up of the biceps muscle.


5.6 Estimate the forces on a diving board when a diver walks out to the end of it. Make a free body diagram of the board, of course! Note that a diving board typically is fastened at the back and also is contacted by a support underneath, closer to the back than the front, which may even be adjustable (in position).

5.7 A seesaw is pivoted at the middle. How can three children, weighing 40 lbs., 60 lbs., and 80 lbs., distribute themselves on the seesaw so that there is a child at each end and the seesaw balances? Find all possible ways.

Newton’s Third Law

5.8 In Fig 5.8 three masses are shown hanging in equilibrium one below the other on strings. Find the tensions in the strings by more than one argument.

Hooke’s Law

5.9 A 100 g mass is suspended on a spring, and the spring stretches by 5 cm. (a) Find the spring constant k in N/m, making clear your assumptions. (b) Assuming that this k is the same one that governs the frequency of small oscillations about equilibrium, find the angular frequency ω.

5.10 A 3 kg mass is suspended on a spring, and the spring stretches by 20 cm. (a) Find the spring constant k in N/m, making clear your assumptions. (b) Assuming that this k is the same one that governs the frequency of small oscillations about equilibrium, find the angular frequency ω.

5.11 A 2 kg mass is suspended on a spring and stretches it a certain amount. An additional 1 kg mass is added, and the spring stretches an additional 8 cm. Find k, the spring constant of the spring, making clear your assumptions.

Fig. 5.8: Three masses (m1, m2, m3) hang one below the other on strings

5.12 (a) Show that 3 springs in parallel, each with spring constant k, act together like a spring with spring constant 3k.

(b) Show that N springs in parallel, each with spring constant k, act together like a spring with spring constant Nk.

5.13 (a) Show that 3 springs in series, each with spring constant k, act together like a spring with spring constant k/3.

(b) Show that N springs in series, each with spring constant k, act together like a spring with spring constant k/N.

Weight and Mass

5.14 Show how the definition of the pound mass follows from the definition of the pound and the choice of a standard representative value of g. (Note that the representative value of g is a kind of fiction, since at any location g has a definite value that is not a matter of choice, and will not be equal to the representative value.)

Young’s Modulus

5.15 Young’s modulus for copper is about $1.2 \times 10^{11}$ Pa, and the yield stress is about $1 \times 10^{8}$ Pa. Suppose you have a 1 m length of copper wire, with diameter 0.5 mm. What is the heaviest weight you can suspend on it without exceeding the yield stress? How much will the wire stretch?

Atomic Forces

5.16 Use the ideas of Section 5.8 to estimate the force to break an “atomic spring” as $S_b \ell_{\text{atom}}^2$, where $S_b$ is the breaking stress for the bulk material. A more direct argument to this result is also possible – if you can simplify and shorten the argument, by all means do so. Use the data for steel to evaluate this expression and confirm the estimate we gave of the breaking strength: 100 pN.

Chapter 6

Mechanical Energy and Motion

In this chapter we introduce the idea of energy. In particular we see how the notion of energy can replace the notion of force in equilibrium. Instead of saying that forces and torques balance, we say that a certain energy is minimized. We also see how to describe certain kinds of motions, like falling and oscillation, using the notion of energy. This is a hint of a very general trend in physics – the replacement of force methods with energy methods. In quantum mechanics one hardly runs into the notion of force, but the idea of energy is everywhere.

6.1 Gravitational Potential Energy

Everyone knows that things tend to fall down, and to find as low a place as possible. This statement about equilibrium was one of the basic foundations of Aristotle’s physics. Aristotle’s physics has nothing to teach us now about physics, but it does represent what seemed like common sense for roughly two thousand years. It confirms that we all intuitively understand the tendency of things to move downward.

How does Nature choose what should go at the bottom, in case there should be a choice about it? In Aristotle’s physics the answer was that Earth, understood here as one of the four elements, should have the lowest place. Next would be Water, another element, so that the equilibrium arrangement is Earth on the bottom and Water just above that. In the Aristotelian scheme most materials are compounded of four elements, including Air and Fire, so admixtures of these could make wood, for example, less liable to sink than pure water, with the result that wood floats. In fact pure Air and Fire want to go up, not down, in Aristotle’s physics, perhaps to explain why the air stays overhead and does not fall to the ground.

In the writings of Galileo we can see how a careful observer and experimenter, who was himself educated as an Aristotelian, finally convinced himself that Air does not in fact tend to go up, but has weight, just as Water does, and tends to go down, as all matter does. Galileo eventually even weighed a quantity of Air, and describes how he did it in his last book Two New Sciences.

In wrestling with these ideas Galileo interpreted Archimedes’ Law of the Lever in a startling new way. This new interpretation answers the question of what should go down in case there is a choice. We can recognize in this argument the beginnings of the idea of gravitational potential energy. Needless to say, Galileo’s formulation was mathematical, in contrast to the wholly non-mathematical formulations of Aristotle.

[Figure: a lever with masses m1 and m2 on arms L1 and L2, rotated through the angle ∆θ]

Fig. 6.1: If the lever rotates, one mass rises and the other falls

Galileo’s idea refers to a lever like that in Fig 6.1 and asks what happens if the lever should rotate about the fulcrum. Clearly one mass goes up and the other goes down, but at different speeds, because the two arms are different. The longer arm corresponds to the greater speed. The balance condition says that a small mass with a long lever arm balances a large mass with a small lever arm, but Galileo read this as saying that a small mass with a large speed balances a large mass with a small speed, replacing “lever arm” with “speed” (they are proportional, after all, in a rotation – recall Eq (4.11)). Nowadays we look at how far the masses rise or fall in the rotation (this is also proportional to speed). Galileo’s idea then becomes, a large mass rising a small amount balances a small mass sinking a large amount. This idea is captured in the idea of gravitational potential energy Ug. For a single mass m, the gravitational potential energy is

$$U_g = mgh \tag{6.1}$$

where h is the height of the center of the mass above some definite level. Ug is a quantity that contains both the amount of mass m, and the height h, but it is only the product that matters, so that large m can compensate for small h, or vice versa. For a system of masses, Ug is just the sum of the energies of the components separately. In particular, for a system of two masses, Ug = m1gh1 + m2gh2.

In the case of the lever in Fig 6.1, the change in Ug when the lever rotates is

$$\Delta U_g = m_1 g\,\Delta h_1 + m_2 g\,\Delta h_2 = (-m_1 g L_1 + m_2 g L_2)\sin(\Delta\theta) \tag{6.2}$$

What does this mean? Ug for the system changes because the heights change. These changes in height are ∆h1 and ∆h2, one negative and one positive, and are both of the form L sin(∆θ). We see that if the lever was in balance, that is, if m1gL1 = m2gL2, then ∆Ug = 0, that is, Ug does not change when the lever rotates. But if m1 is larger than it should be to balance, we know that m1 will go down, that is, the lever will rotate in the direction shown. Since m1gL1 > m2gL2 in this case, ∆Ug < 0, that is, Ug decreases. On the other hand, if m2 is too large, m2 goes down, so that ∆θ < 0 (the opposite of what is shown). Since m1gL1 < m2gL2 in this case, and sin(∆θ) < 0, we find ∆Ug < 0 in this case too. To summarize, if Ug can decrease, it does so. That explains why the lever tips the way it does. If it cannot decrease, then it is in balance.
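The three cases just described can be tried numerically. The following is a minimal sketch of Eq (6.2); the function name, the masses, and the arm lengths are hypothetical choices made only for illustration, and a positive dtheta is taken to mean that m1 goes down, as in Fig 6.1.

```python
import math


def lever_delta_U(m1, L1, m2, L2, dtheta, g=9.8):
    """Change in gravitational potential energy of the lever, Eq (6.2).

    dtheta > 0 means m1 goes down (the direction drawn in Fig 6.1).
    """
    return (-m1 * g * L1 + m2 * g * L2) * math.sin(dtheta)


# Balanced lever: m1 L1 = m2 L2, so rotating changes nothing.
balanced = lever_delta_U(3.0, 4.0, 4.0, 3.0, dtheta=0.01)

# m1 too heavy: it goes down (dtheta > 0) and U_g decreases.
m1_heavy = lever_delta_U(4.0, 4.0, 4.0, 3.0, dtheta=0.01)

# m2 too heavy: it goes down (dtheta < 0) and U_g decreases again.
m2_heavy = lever_delta_U(3.0, 4.0, 5.0, 3.0, dtheta=-0.01)

print(balanced, m1_heavy, m2_heavy)
```

In both unbalanced cases the lever turns in the direction that makes the change in Ug negative, and in the balanced case the change is zero.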

Let us look at this in a less formal way. The lever has the choice of tipping one way or the other. Either way, the gravitational potential energy of one of the masses goes down, and that of the other mass goes up. What the lever actually does is tip to the side where the lowering of the potential energy of one mass more than compensates for the raising of the potential energy of the other, so that the net effect is to lower the gravitational potential energy of the whole system. The behavior of the system is predicted by the principle that it seeks to lower its gravitational potential energy. We notice that this also “explains” why things fall: they are lowering their gravitational potential energy.

This idea can even be formulated as a variational principle: the equilibrium of a system of masses subject to gravity is the configuration that minimizes the gravitational potential energy. Otherwise, if the system can lower its gravitational potential energy, it will move to do so, and it is not in equilibrium. (Note: this variational principle turns out to be a very good way to think about equilibrium, but in more complex situations it must include also other kinds of energy, not just gravitational potential energy.)

Let us use this variational principle to find the equilibrium of two masses m1 and m2 mounted on the rim of a wheel of radius R free to rotate, as in Fig 6.2. This turns out to be a kind of generalization of the lever. The strategy will be to find how much the gravitational potential energy changes if the wheel rotates by a small amount ∆θ. If, by rotating, the energy could be lowered, we know that this is not equilibrium. We will work in small angle approximation, because we only want to know if the energy could be lowered by even a small amount. In a rotation, each mass moves a distance R∆θ along the circumference (this is just the definition of radian measure). In small angle approximation, this is the same as moving along the tangent line. The change in height is not this entire distance R∆θ, but only the projection on the vertical direction, which brings in a factor sin θ. Thus

$$\Delta U_g = m_1 g\,\Delta h_1 + m_2 g\,\Delta h_2 = (m_1 g R \sin\theta_1 - m_2 g R \sin\theta_2)\,\Delta\theta \tag{6.3}$$

Note that one mass goes down and one goes up. If the quantity in parentheses is different from zero, then ∆θ can be chosen to make Ug decrease, that is, the energy can be lowered by turning one way or the other. But if the quantity in parentheses is zero, then the energy cannot be lowered. The energy must be at a minimum. This is the equilibrium condition. We write it out below:

$$m_1 g R \sin\theta_1 = m_2 g R \sin\theta_2 \tag{6.4}$$

Fig. 6.2: A wheel of radius R with two masses m1 and m2 mounted on the rim is free to rotate and find its equilibrium. The lever arm for computing the torque about the hub of the wheel is L1 = R sin θ1 for m1 and L2 = R sin θ2 for m2.

Referring to Fig 6.2, we see that the equilibrium condition can also be written as

m1gL1 = m2gL2 (6.5)

where L1 and L2 are the displacements of the masses from the center, projected onto the horizontal line. This is just Archimedes’ Law of the Lever! We see that the lever arm in a situation like this is not the entire distance of the mass from the fulcrum, or pivot, which is R for both masses, but just its projection onto the horizontal, or its horizontal component. Here L1 = R sin θ1 and L2 = R sin θ2, so it is the sine of the angle that accomplishes this projection. We used this relation from trigonometry in arriving at Eq (6.5).
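As a rough numerical illustration of Eqs (6.3) and (6.4), the sketch below finds an angle for m2 that balances a given m1 on the wheel and confirms that the first-order energy change then vanishes. The function names and the sample masses are assumptions made for this example. Note that the inverse sine returns only one of the possible angles, so this is not a complete catalogue of equilibria.

```python
import math


def wheel_delta_U(m1, theta1, m2, theta2, R, dtheta, g=9.8):
    """First-order change in U_g for a small rotation dtheta, Eq (6.3)."""
    return (m1 * g * R * math.sin(theta1)
            - m2 * g * R * math.sin(theta2)) * dtheta


def balancing_angle(m1, theta1, m2):
    """One angle theta2 (radians) with m1 sin(theta1) = m2 sin(theta2), Eq (6.4).

    Returns None when no angle can balance (the required sine exceeds 1).
    """
    s = m1 * math.sin(theta1) / m2
    if abs(s) > 1:
        return None
    return math.asin(s)


# Hypothetical example: a 1 kg mass with its full lever arm (theta1 = 90 degrees)
# balanced by a 2 kg mass.
theta2 = balancing_angle(1.0, math.pi / 2, 2.0)
print(math.degrees(theta2))
print(wheel_delta_U(1.0, math.pi / 2, 2.0, theta2, R=0.5, dtheta=0.01))
```

The heavier mass sits where its horizontal lever arm R sin θ2 is half that of the lighter mass, which is the Law of the Lever again.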

We now have two characterizations of equilibrium. We can balance forces and torques, or we can minimize potential energy.

6.2 Spring Potential Energy

In the last section we saw that the balance of torques in a lever could be replaced by a different condition, namely that the gravitational potential energy be a minimum. This seems like a completely different way of looking at the lever, but it leads to the same equilibrium condition. In this section we do the same thing for the equilibrium of a weight hanging on a spring. There too the equilibrium can be described as the minimum of a potential energy, but now we must introduce a new term, the potential energy of the spring.

The spring potential energy Us should be a minimum when the spring itself is in equilibrium, and that is when x, its extension away from its relaxed length, is zero. The simplest expression that is a minimum at x = 0 is $x^2$. This is positive for all $x \neq 0$, and thus clearly a minimum at x = 0. The spring potential energy Us is just proportional to this, and therefore quadratic in extension x. As we shall see, the constant of proportionality has to be k/2, where k is the usual spring constant. That is,

$$U_s = \frac{1}{2} k x^2 \tag{6.6}$$

Now we consider the potential energy of a mass hanging on a spring, including both the gravitational potential energy and the spring potential energy. The mass and spring hang vertically, along the conventional y axis. Therefore let y be our name for the vertical position of the mass, measured from the position the weight would have if the spring were not stretched. Thus y is both the height of the mass and the deformation of the spring. If y is positive, the mass is above the relaxed position of the spring, so the spring is compressed. If y is negative, the mass is below the relaxed position of the spring, and the spring is stretched. The total potential energy is

$$U = U_g + U_s = mgy + \frac{1}{2} k y^2 \tag{6.7}$$

Now we ask for what value of y the potential energy is a minimum. The minimum is easy to find by a trick from algebra called “completing the square.” The purpose of completing the square is to represent a quadratic expression like Eq (6.7) as a constant plus a square, i.e., by algebra,

$$U = \frac{1}{2} k \left( y + \frac{mg}{k} \right)^2 - \frac{m^2 g^2}{2k} \tag{6.8}$$

The coordinate y occurs only in the squared expression, and that term is of course never negative. Its smallest value is zero, and that occurs when

$$y = -\frac{mg}{k} \tag{6.9}$$

This is exactly the equilibrium position. The negative value means the spring is stretched downward, as we know it will be, and the amount of the extension agrees with what we found in Eq (5.18) from the balance of forces. Thus it seems that we can either say the forces balance in equilibrium, or we can say the potential energy is minimized in equilibrium. These formulations are equivalent.
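The minimum can also be located by brute force, which makes a useful check on the algebra. In the sketch below the mass, the spring constant, and the search grid are arbitrary illustrative choices, and the function names are our own. It scans the total potential energy of Eq (6.7) over a range of y and compares the lowest point with y = −mg/k.

```python
def total_potential(y, m, k, g=9.8):
    """Total potential energy of a mass on a vertical spring, Eq (6.7)."""
    return m * g * y + 0.5 * k * y**2


def completed_square(y, m, k, g=9.8):
    """The same energy written as a square plus a constant, Eq (6.8)."""
    return 0.5 * k * (y + m * g / k) ** 2 - (m * g) ** 2 / (2 * k)


# Hypothetical spring and mass.
m, k, g = 0.5, 49.0, 9.8

# Scan y from -0.3 m to 0 in steps of 1 mm and keep the lowest energy.
ys = [i * 1e-3 for i in range(-300, 1)]
y_min = min(ys, key=lambda y: total_potential(y, m, k, g))

y_eq = -m * g / k   # equilibrium predicted by Eq (6.9)
print(y_min, y_eq)
```

The scan and the formula agree to within the grid spacing, and the two ways of writing U give the same number at any y.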

6.3 The Potential Energy of a Pendulum

A pendulum bob of mass m may swing for a while, but eventually it runs down and reaches equilibrium at the minimum of gravitational potential energy, hanging motionless straight down. We will call this lowest possible position y = 0, and we will measure height on the y axis from this point. Non-equilibrium positions of the pendulum correspond to y > 0, and gravitational potential energy mgy > 0.

There is another way to look at the energy of the pendulum, using x, the horizontal coordinate of the pendulum bob, instead of y, the vertical coordinate, since they are not independent of each other. The relation between x and y is shown in Fig 6.3. From the Pythagorean Theorem,

$$L^2 = x^2 + (L - y)^2 \tag{6.10}$$

and solving for y in terms of x we have

$$y = L - (L^2 - x^2)^{1/2} = L - L\left(1 - \frac{x^2}{L^2}\right)^{1/2} \approx L - L\left(1 - \frac{x^2}{2L^2}\right) = \frac{x^2}{2L} \tag{6.11}$$

where we used the binomial approximation Eq (4.20), assuming $|x/L| \ll 1$ (small displacement of the pendulum). This says the graph of y is approximately a parabola for small |x|. In fact, of course, the graph of y is a circle, not a parabola, so the restriction to small |x| is necessary.

Fig. 6.3: A pendulum’s position can be described by x and y coordinates. Here the origin of coordinates is taken at the equilibrium position.

Since x and y are related in this way for the pendulum, we can express the potential energy Ug as

$$U_g = mgy \approx \frac{1}{2}\left(\frac{mg}{L}\right) x^2 \tag{6.12}$$

The gravitational potential energy of the pendulum looks just like the potential energy of a spring if you write it in terms of the horizontal displacement x instead of the vertical displacement y, because it is quadratic in x. The effective spring constant $k_{\text{eff}}$ of the pendulum is the expression in parentheses, $k_{\text{eff}} = mg/L$. Remarkably, we have already seen this in a different way: from the behavior of the pendulum as a harmonic oscillator, we found that it has this “spring constant” in Section 4.11.

Thus the pendulum’s potential energy, which is really gravitational potential energy, looks like a spring’s potential energy when it is expressed in terms of the horizontal coordinate. The pendulum is a kind of “gravity spring.”
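To get a feeling for how small |x| has to be, one can compare the exact height of the bob on its circle with the parabola of Eq (6.11). The sketch below assumes a pendulum of length 1 m and a handful of sample displacements, both chosen only for illustration; the function names are our own.

```python
import math


def height_exact(x, L):
    """Exact height of the bob above its lowest point (a circle of radius L)."""
    return L - math.sqrt(L**2 - x**2)


def height_parabola(x, L):
    """Small-displacement approximation y = x^2 / (2L), Eq (6.11)."""
    return x**2 / (2 * L)


def relative_error(x, L):
    """Fractional amount by which the parabola underestimates the height."""
    exact = height_exact(x, L)
    return (exact - height_parabola(x, L)) / exact


L = 1.0  # pendulum length in meters (illustrative)
for x in (0.05, 0.1, 0.3, 0.6):
    print(f"x/L = {x / L:.2f}  exact = {height_exact(x, L):.5f} m  "
          f"parabola = {height_parabola(x, L):.5f} m  "
          f"error = {100 * relative_error(x, L):.2f}%")
```

For displacements of a tenth of the length the two agree to well under one percent, while at x/L = 0.6 the parabola is off by about ten percent, so the “gravity spring” picture is a small-swing statement.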

6.4 Falling, and Kinetic Energy

When a mass m falls, it loses gravitational energy Ug = mgh, because its height h decreases. On the other hand, it also gains what we might informally call oomph. Galileo explicitly wonders about this in his writings, citing the example of pile drivers – the greater the height h from which the pile driver falls, the better it is at knocking the pile into the ground. It has more oomph.

Another example that intrigued Galileo is the behavior of a pendulum when you block the string. Quite remarkably, if the pendulum swings down from a height h above the minimum, then it rises to the same height h on the other side, even if you block the string, as illustrated in Fig. 6.4. It still has just enough oomph to get up to the original level.

Fig. 6.4: A blocked pendulum still rises to the same level. Here the pendulum starts from rest at A, the string is blocked at P, and the pendulum rises to B.

We now understand oomph as a form of energy, called kinetic energy, frequently notated K. In the examples above, K increases as Ug decreases in such a way that the total energy E, meaning kinetic energy plus potential energy, is constant, i.e.

E = Ug + K = constant (6.13)

In the case of the blocked pendulum, the mass m regains exactly its original height at the moment that it stops, because then K is zero and all the energy is once again in the form Ug. If mgh has the same value as originally, then the height h must also be the same.

Galileo knew other examples of things that behaved like this too. A smooth ball, like a billiard ball, rolling down a hard, smooth ramp, rolls up again to the same height if it is guided smoothly onto another ramp sloping upward. Its energy becomes kinetic as it loses potential, and then it becomes potential as it loses kinetic, but the total is always the same.

The simple relationship in Eq (6.13) can be pictured in a graph, as in Fig 6.5. Although the relationship is simple, the graph is rather abstract. The independent variable is the height h, but it is represented on the horizontal axis. The gravitational energy Ug is proportional to h, so its graph is a straight line through the origin. The constant total energy E is the same no matter what height h the mass m has, so its graph is a horizontal line at that constant value. The difference between these two lines is the kinetic energy K, so what the graph really shows, rather indirectly, is K at any h. Even more indirectly this says how fast m is moving at any h. In particular, m comes to rest at h = H, where K = 0. This height is called a turning point, because something thrown upward with energy E would reach that height (and no farther) before turning to fall back.

Fig. 6.5: The gravitational potential energy Ug = mgh is shown as a function of height h. The difference between Ug and the constant level E is the kinetic energy K. At h = H, the value of K is zero, so at that height the mass m would come to a stop (turning point).

In Newton’s theory the kinetic energy of a mass m has a simple expression

$$K = \frac{1}{2} m v^2 \tag{6.14}$$

where v is the speed. Clearly K is zero if v is zero, at a turning point, for example, and K and v increase together, so K has the general characteristics we would expect from the examples. It tells us “how much oomph” m has, due to its motion. This form could have been guessed by dimensional analysis! If we look at the dimension of Ug, recalling $[g] = [LT^{-2}]$, we find $[U_g] = [ML^2T^{-2}]$. If we ask for something with this dimension involving v, with its dimension $[LT^{-1}]$, we see that the only possible combination is $mv^2$. Of course dimensional analysis can’t tell us that there is also a factor 1/2.
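The dimensional argument can be carried out mechanically by bookkeeping the exponents of M, L, and T. The sketch below is one possible way to do this, with dimensions stored as tuples of exponents; the representation, the helper names, and the search range of powers are our own choices for illustration. It searches for products of powers of m and v that have the dimension of energy.

```python
# A dimension is a tuple of exponents (M, L, T).
MASS = (1, 0, 0)
LENGTH = (0, 1, 0)
TIME = (0, 0, 1)


def mul(a, b):
    """Dimension of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))


def power(a, n):
    """Dimension of a quantity raised to the n-th power: exponents scale."""
    return tuple(x * n for x in a)


ACCEL = mul(LENGTH, power(TIME, -2))      # [g] = [L T^-2]
SPEED = mul(LENGTH, power(TIME, -1))      # [v] = [L T^-1]
ENERGY = mul(MASS, mul(ACCEL, LENGTH))    # [U_g] = [m][g][h]

# Try m^a v^b for small integer powers and keep those with energy dimension.
matches = [(a, b)
           for a in range(-3, 4)
           for b in range(-3, 4)
           if mul(power(MASS, a), power(SPEED, b)) == ENERGY]

print(ENERGY)
print(matches)
```

Only the combination with one power of m and two powers of v survives, in agreement with the argument above; the numerical factor 1/2 is invisible to this kind of reasoning.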

With this expression for K, Newton’s theory says that E = Ug + K is constant under certain conditions, especially the condition that friction should be negligible. Let us assume that friction is negligible and see what we can learn from E = constant. Suppose a mass m falls from a height H, being initially at rest. This situation corresponds exactly to Fig 6.5. What is the speed of m when it hits the ground? If we measure height h from the ground, then h = H initially, and h = 0 at the end. Also v = 0 initially (m starts at rest) and we don’t know v at the end. The statement that E = Ug + K is constant then says that

$$E_i = mgH + 0 = E_f = 0 + \frac{1}{2} m v_f^2 \tag{6.15}$$

where subscripts i and f mean “initial” and “final”. Solving for $v_f$ we have

$$v_f = \sqrt{2gH} \tag{6.16}$$

A quick dimensional check shows that this expression is dimensionally consistent. (Always check dimensions!) The expression also passes a common sense check, because if m falls from a higher H it will be going faster – clearly correct. A possibly surprising thing is that m has cancelled out of the expression, and in the absence of friction everything should fall at the same speed, independent of mass. This familiar fact is easy to check by dropping things together: they do fall together. Finally, try a numerical common sense check: what would $v_f$ be for something that falls off a 10 m house? Taking $g \approx 10\ \text{m s}^{-2}$, we find $v_f \approx 14$ m/s. Does that seem right? In units that may be more familiar, this is about 30 miles per hour. It seems reasonable.

Turning the problem around, suppose we throw something vertically upward at speed $v_i$. How high will it go? The energy calculation is exactly the same as before, except that we switch the indices i and f. Now it starts with kinetic energy and ends with potential energy. Solving for the height H in Eq (6.16) we have

$$H = \frac{v_i^2}{2g} \tag{6.17}$$

Do the dimensions agree on both sides? Does the expression make sense? By the same computation as the one in the previous paragraph, we see that if we can throw something like a baseball upward at 30 miles per hour, it will go about 10 meters high. Most people could probably do that, but not much more. If g were less, on the other hand, we could do a lot better. On the moon we could throw 6 times higher, because g is only one sixth its terrestrial value, and g is in the denominator of Eq (6.17). (Reasoning like this from proportionality is much quicker and easier than putting in a lot of numerical values and doing arithmetic.)

Caution: some students seeing the example above generalize too quickly and imagine there must be some principle like “kinetic energy equals potential energy.” Nothing like that is true. If the total energy is zero, as sometimes happens, then it is true that “kinetic energy is the negative of potential energy.” But that is obvious: they have to add to zero. More generally all you can say, given that energy is conserved, is that “kinetic energy + potential energy is constant.” That constant is the total energy. Another way to say it is that “potential energy lost is kinetic energy gained” (so that the sum stays constant). Maybe that is what they mean to say.
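The numerical checks in this section are easy to reproduce. The following sketch evaluates Eqs (6.16) and (6.17) with the rounded value of g used in the text; the function names and the conversion constant are our own additions, and friction is neglected throughout.

```python
import math

METERS_PER_MILE = 1609.344


def impact_speed(H, g=9.8):
    """Speed after falling from rest through height H, Eq (6.16)."""
    return math.sqrt(2 * g * H)


def max_height(v, g=9.8):
    """Height reached by something thrown straight up at speed v, Eq (6.17)."""
    return v**2 / (2 * g)


def to_mph(v):
    """Convert a speed in m/s to miles per hour."""
    return v * 3600 / METERS_PER_MILE


g_earth = 10.0                       # rounded value, as in the text
v = impact_speed(10.0, g_earth)      # falling off a 10 m house
print(f"v_f = {v:.1f} m/s = {to_mph(v):.0f} mph")

# Throwing upward at that same speed, on the Earth and on the moon.
h_earth = max_height(v, g_earth)
h_moon = max_height(v, g_earth / 6)
print(f"Earth: {h_earth:.1f} m, moon: {h_moon:.1f} m")
```

The mass never appears as an argument of either function, which is the cancellation noted above, and the factor of 6 for the moon comes out of the proportionality without any further arithmetic.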

6.5 Velocity v in falling

There is a slight awkwardness about these computations, because we are assuming that it makes sense to talk about speed v even though v is continually changing. Before, when we talked about v, it was in the context of constant speed (like the speed of light). There v was the constant of proportionality in case D ∝ t. But if v isn’t constant, it certainly isn’t a constant of proportionality, so what is it? Most people feel that they understand what speed v means even if v is changing. It is what the speedometer on a car shows, for example. Still, a careful definition requires calculus, and we have made a conscious choice not to use calculus. That means certain topics are off limits. We have run into one of those limits here: we can compute v for an object that falls a distance H, using the method of energy – we have found $v \propto \sqrt{H}$, in fact, in Eq (6.16) – but we can’t say precisely what v means! (A similar comment would have been in order after Eq (4.37).)

Most introductory physics texts spend quite a lot of time on this topic, and do far more with it. In that sense they are paying more attention to the history of physics than we do, because this topic really was very important historically, and led quite directly to the invention of calculus and Newton’s mechanics. If these connections intrigue you, by all means learn calculus and do this right. You will find that the notion of instantaneous v is identical with the notion of derivative in calculus. Without calculus we can only make assertions, and we will keep these to a minimum.

Galileo says that for a long time he thought that the velocity of a falling mass should be proportional to the height H that it falls, but Galileo frequently meant “oomph” when he said velocity, that is, kinetic energy K, and it is actually true that K ∝ H for an object that falls from rest, as in Eq (6.15). He describes it as one of the great discoveries of his life when he finally realized that in this case

$$v \propto t \tag{6.18}$$

where t is the time the mass falls from rest. That is, for a falling body, v is proportional to time, not space. As an equality,

$$v = gt \tag{6.19}$$

The constant of proportionality is g, the acceleration due to gravity (finally seen here in the context that gives it its name). Since $v \propto \sqrt{H}$, Eq (6.16), and $v \propto t$, Eq (6.19), we have also

$$\sqrt{H} \propto t \tag{6.20}$$

If falling were motion at constant speed, we would have H ∝ t, but instead we have Eq (6.20). Squaring both sides and keeping track of the constant of proportionality from Eqs (6.16) and (6.19), we find more precisely

$$H = \frac{1}{2} g t^2 \tag{6.21}$$

for the height H fallen in time t, starting from rest. This famous result is due to Galileo, except that Galileo didn’t pay much attention to the numerical value of g. What seemed significant to him was $H \propto t^2$ in the case of free fall from rest. He often expressed it in the way that he apparently discovered it experimentally. If we put in t = 0, 1, 2, 3, 4, ... we find H = 0, 1, 4, 9, 16, ... in some units. Then in each unit of time we have the differences ∆H = 1, 3, 5, 7, .... This progression of the odd integers, for the ∆H fallen in each successive second, delighted him. It seemed to show very clearly how m was picking up speed at a constant rate in time, faster in each successive second. The accelerated fall of an object dropped from rest is shown in Fig 6.6.
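Galileo’s odd integers drop out of Eq (6.21) in a few lines. The sketch below is illustrative only: the function names are ours, and the value of g is arbitrary because it cancels when distances are measured in units of the distance fallen in the first second.

```python
import math


def distance_fallen(t, g=9.8):
    """Height fallen from rest in time t, Eq (6.21)."""
    return 0.5 * g * t**2


def speed_after(t, g=9.8):
    """Speed after falling from rest for time t, Eq (6.19)."""
    return g * t


# Distance covered in each successive second, in units of the first one.
unit = distance_fallen(1)
increments = [round((distance_fallen(t) - distance_fallen(t - 1)) / unit, 6)
              for t in range(1, 5)]
print(increments)

# Consistency of v = g t with v = sqrt(2 g H) at t = 2 s.
print(speed_after(2), math.sqrt(2 * 9.8 * distance_fallen(2)))
```

The last line shows that the two descriptions of the speed, one in terms of time and one in terms of height fallen, give the same value, as they must if Eqs (6.16), (6.19) and (6.21) are consistent.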

6.6 Universal Gravitation

Newton’s universal gravitational force law says that a particle of mass m1 attracts a particle of mass m2 with a force that falls off like the inverse square of the distance r between them, i.e.

$$F_{\text{grav}} = \frac{G m_1 m_2}{r^2} \tag{6.22}$$

Fig. 6.6: An object falling from rest is shown at equal intervals of time. The distance fallen in each unit of time (1, 3, 5, 7, ...) is given in units of the first distance. This picture expresses $H \propto t^2$, where H is total distance fallen, and t is time.

Here G is a constant of Nature, called Newton’s gravitational constant. Remarkably, even large objects attract each other this way if they are symmetrical spheres. In this case the r in Eq (6.22) is the center-to-center distance between the spheres. This extended case covers the case of the (spherical) Sun and planets. Associated with the gravitational force law is a gravitational potential energy

$$U_G = -\frac{G m_1 m_2}{r} \tag{6.23}$$

(Check dimensions!) We can use $U_G$ just as we have used other potential energies, to see how speed changes in falling.

First, though, we should clear up something that might be bothering you. Didn’t we already have an expression for the gravitational force on a mass $m_2$, namely its weight $m_2 g$? And didn’t we already have an expression for its gravitational potential energy, namely $m_2 g h$? And aren’t these expressions quite different from what we are saying now, in this section? Remarkably, no! For particles near the surface of the (spherical) Earth, these expressions agree, if $m_1$ is the mass of the Earth $M_E$, and $r$ is the radius of the Earth $R_E$. That is, it must be that

$$g = \frac{G M_E}{R_E^2} \qquad (6.24)$$

In this case Eq (6.22) just says $F_{\mathrm{grav}} = m_2 g$, and also, by the binomial approximation Eq (4.20),

$$\frac{\Delta U_G}{U_G} = -\frac{\Delta r}{r} \qquad (6.25)$$

which means for $r = R_E$

$$\Delta U_G = \frac{G M_E m_2}{R_E^2}\,\Delta r = m_2 g\,\Delta r \qquad (6.26)$$

using Eq (6.24). Since $\Delta r$, the change in distance from the center of the Earth, is just another name for change in height $h$, these “new” expressions are really the old expressions for objects near the surface of the Earth!
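To see how well the two descriptions agree, here is a rough numerical check of Eqs (6.24) and (6.26). It is only a sketch: the values of G, the Earth's mass and the Earth's radius are commonly quoted figures that do not appear in the text, and the test mass and height change are arbitrary illustrative choices.

```python
# Rough numerical check of Eqs (6.24) and (6.26), using commonly quoted values.
G = 6.674e-11      # m^3 kg^-1 s^-2, Newton's gravitational constant
M_E = 5.972e24     # kg, mass of the Earth
R_E = 6.371e6      # m, mean radius of the Earth

g = G * M_E / R_E**2   # Eq (6.24)

def U_G(m2, r):
    """Potential energy of mass m2 at distance r from the Earth's center, Eq (6.23)."""
    return -G * M_E * m2 / r

m2 = 1.0     # kg, illustrative test mass
dr = 100.0   # m, illustrative change in height

exact = U_G(m2, R_E + dr) - U_G(m2, R_E)  # exact change in U_G
approx = m2 * g * dr                       # Eq (6.26), the familiar m g h
rel_err = abs(exact - approx) / approx

print(g)        # should come out near 9.8 m/s^2
print(rel_err)  # small, of order dr / R_E
```

The relative difference is of order Δr/R_E, so the simpler expression m₂gh is an excellent approximation for any height that is small compared with the radius of the Earth.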

Graphing the potential energy $U_G$ in case $m_1$ is the Earth, in Fig 6.7, we see how an object dropped from a great height would pick up kinetic energy $K$ as it fell, assuming no friction. The fall ends at $r = R_E$, of course, when the dropped object actually hits. Interpreting this graph is just like interpreting the other energy graphs we have seen. Notice that the total energy $E$ in this example is negative, meaning there is a turning point.

The case of a planet, of mass $m$, in circular orbit about the Sun, of mass $M$, can also be summarized in terms of energy, but now the distance $r$ of the planet does not change, because $r$ is the constant radius of the circle. The planet has constant kinetic energy $K$, but this refers to its speed in its circular orbit. It turns out that for a circular orbit at radius $r$ the kinetic energy is

$$K = -\frac{U_G}{2} = \frac{GMm}{2r} \quad \text{(kinetic energy in circular orbit)} \qquad (6.27)$$

just half the potential energy, in magnitude. Therefore the total energy is

$$E = K + U_G = \frac{U_G}{2} = -\frac{GMm}{2r} \qquad (6.28)$$

Fig. 6.7: A mass $m$ falls from rest at height $H$ above the center of a sphere of mass $M$ and radius $R$. Its total energy is $E$, and the energy graph shows how its kinetic energy $K$ increases as it approaches the surface at $R$. This could represent an object falling (without friction) from a great height above the Earth. Compare Fig 6.5, which shows only a small region near $r = R$, appropriate for heights $H \approx R$ (i.e., close to the Earth’s surface). (The graph plots $U_G = -GMm/r$ against $r$, with $R$ and $H$ marked on the $r$ axis and the energies $E$ and $K$ indicated.)

The same considerations apply, with the masses and r changed, to the Moon in circular orbit about the Earth, to artificial Earth satellites, and to the moon systems of other planets. This simple relation between K, UG and E, involving just a factor of 2, is a result from Newtonian mechanics called the Virial Theorem. It holds in a more general form for any bound orbit in a potential proportional to 1/r. Since the electrostatic force between electrical charges, like the proton and the electron, also corresponds to a 1/r potential, the Virial Theorem reappears in the Bohr model of the hydrogen atom!
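The energy bookkeeping of Eqs (6.27) and (6.28) can be illustrated with numbers. The sketch below assumes a hypothetical 1000 kg satellite in circular orbit 400 km above the Earth's surface; the mass, the altitude, and the values of G, M_E and R_E are assumptions made here and are not taken from the text.

```python
import math

G = 6.674e-11    # m^3 kg^-1 s^-2, commonly quoted value
M_E = 5.972e24   # kg, mass of the Earth
R_E = 6.371e6    # m, radius of the Earth

m = 1000.0         # kg, hypothetical satellite mass
r = R_E + 4.0e5    # m, hypothetical orbit radius (400 km above the surface)

U = -G * M_E * m / r   # potential energy, Eq (6.23)
K = -U / 2             # kinetic energy in circular orbit, Eq (6.27)
E = K + U              # total energy, Eq (6.28)

# Speed implied by K = (1/2) m v^2.
v = math.sqrt(2 * K / m)

print(U, K, E)  # E is negative and equal to U / 2
print(v)        # several kilometers per second
```

The total energy is negative, as it must be for a bound orbit, and its magnitude equals the kinetic energy.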

6.7 Energy of an Oscillator

The spring potential energy $U_s = (1/2)kx^2$, where $x$ is the displacement of a mass $m$, is characteristic of a simple harmonic oscillator with spring constant $k$. It has a minimum at $x = 0$, but if the oscillator is not in equilibrium at $x = 0$, then it must be oscillating with angular frequency $\omega = \sqrt{k/m}$ between $x = A$ and $x = -A$, for some amplitude $A$, like a pendulum that is disturbed. A graph of $U_s$ vs. $x$, shown in Fig 6.8, is a good way to organize what is happening.

Fig. 6.8: The potential energy $U_s = (1/2)kx^2$ of an oscillator. If the oscillator has amplitude $A$, its total energy is $E = (1/2)kA^2$, corresponding to $U_s$ at the turning points $\pm A$. The kinetic energy $K$ is the difference between $E$ and $U_s$.

This is just like Fig 6.5, but with $U_s$ replacing $U_g$. Now the constant energy is

$$E = U_s + K = \text{constant} \qquad (6.29)$$

The constant total energy $E$ is graphed as a horizontal line. As in Fig 6.5, the graph shows indirectly, through the kinetic energy $K$, the speed of the oscillator at every point of its oscillation. At the turning points of the oscillations, $x = \pm A$, the spring potential energy takes the value $E = (1/2)kA^2$, and the kinetic energy $K = 0$. At other places in the oscillation $K > 0$, meaning the oscillator is in motion. $K$ is maximal at $x = 0$, where $U_s$ is minimal.

We actually know how $x$ and $v$ behave in time for a simple harmonic oscillator. From Eqs (4.34) and (4.37) we have

$$x = A \sin(\omega t + \delta) \qquad (6.30)$$

$$v = \omega A \cos(\omega t + \delta) \qquad (6.31)$$

Therefore the energies, as a function of time, are

$$U_s = \frac{1}{2} k x^2 = \frac{1}{2} k A^2 \sin^2(\omega t + \delta) \qquad (6.32)$$

$$K = \frac{1}{2} m v^2 = \frac{1}{2} m \omega^2 A^2 \cos^2(\omega t + \delta) \qquad (6.33)$$

Since $\omega^2 = k/m$ and $\sin^2\theta + \cos^2\theta = 1$, we find

$$U_s + K = \frac{1}{2} k A^2 \qquad (6.34)$$

and this is clearly a constant in time, the total energy $E$. It coincides with $U_s$ at the turning points, when $K = 0$ and all the energy is potential energy.

Both Us and K oscillate in time, but in such a way that their sum is constant. The energy gets passed back and forth from kinetic to potential and back to kinetic, forever, as the oscillator oscillates, as shown in Fig 6.9.
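The exchange of energy between $U_s$ and $K$ can be tabulated directly from Eqs (6.30) to (6.34). This is a minimal sketch; the spring constant, mass, amplitude and phase are arbitrary illustrative values, not numbers from the text.

```python
import math

k = 4.0      # N/m, illustrative spring constant
m = 1.0      # kg, illustrative mass
A = 0.5      # m, illustrative amplitude
delta = 0.3  # rad, illustrative phase

omega = math.sqrt(k / m)   # angular frequency
T = 2 * math.pi / omega    # period

def energies(t):
    """Return (Us, K) at time t for the oscillator defined above."""
    x = A * math.sin(omega * t + delta)          # Eq (6.30)
    v = omega * A * math.cos(omega * t + delta)  # Eq (6.31)
    Us = 0.5 * k * x**2                          # Eq (6.32)
    K = 0.5 * m * v**2                           # Eq (6.33)
    return Us, K

E_expected = 0.5 * k * A**2  # Eq (6.34)

# Sample one full period in eighths and add the two energies at each time.
samples = [energies(i * T / 8) for i in range(9)]
totals = [Us + K for Us, K in samples]

for (Us, K), total in zip(samples, totals):
    print(round(Us, 4), round(K, 4), round(total, 4))
```

Each row shows a different split between potential and kinetic energy, while the last column stays at the same value, $(1/2)kA^2$.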

6.8 Oscillators Losing Energy

When a simple harmonic oscillator is not in equilibrium, it has energy $E$ greater than the minimum of $U_s$. It oscillates between turning points $x = \pm A$. To reach equilibrium it must somehow lose this excess energy, represented in Fig 6.8 by the level $E$. Oscillators left to themselves do run down, like a pendulum given a push and then left alone. Oscillators do lose energy somehow. In Newton’s theory, this can only be due to friction forces. In the case of a pendulum, there might be air friction. If the pendulum is a mass on a string, there might be rubbing of the fibers of the string against each other as the pendulum swings. Such friction has the effect of dissipating energy $E$, i.e., causing $E$ to go down. One can picture this in Fig 6.8 as the level $E$ literally going slowly down: the turning points $x = \pm A$ approach $x = 0$ from either side. This corresponds to what we see, as the amplitude of the swing becomes gradually less. Eventually the oscillator reaches equilibrium, $E = 0$, and $A = 0$. A good oscillator may go through thousands of oscillations as it runs down, however. A (theoretical) ideal oscillator would lose no energy and would oscillate forever at the fixed amplitude $A$, never reaching equilibrium.

Fig. 6.9: The potential energy $U_s = kx^2/2$ and the kinetic energy $K = mv^2/2$ for an oscillator add to the constant $E$ at every time $t$. (The time axis is marked at $0$, $T/2$ and $T$.)

This same picture is used in quantum mechanics. The difference is that the oscillator cannot lose its energy E gradually, so one cannot think of the horizontal line moving gradually down. Instead, the energy E changes in quantum jumps, the same definite amount of energy each time. What we call the equilibrium here is called the “ground state” in quantum mechanics, but it is still the same thing: the state of lowest energy. The quantum oscillator reaches the ground state by losing quanta of energy in discrete jumps.

In the 1900 paper that was the beginning of theoretical quantum mechanics, Max Planck proposed this mechanism for the way a simple harmonic oscillator loses energy, with the further stipulation that the quantum jump in energy would be proportional to frequency, $\Delta E \propto \omega$. Putting in a constant of proportionality, we now say

$$\Delta E = \hbar\omega \qquad (6.35)$$

where $\hbar = h/2\pi$ (pronounced h-bar), and $h$ is called Planck’s constant. It turns out to have the fantastically small value $\hbar \approx 10^{-34}$ kg m$^2$ s$^{-1}$. In our everyday macroscopic world we don’t notice energy losses on this tiny scale. So as far as we can tell, an ordinary pendulum, with $\omega \approx 1$ s$^{-1}$, loses energy smoothly and gradually, not in jumps. At the scale of molecules, though, frequencies may be much higher, easily of the order $\omega \approx 10^{14}$ s$^{-1}$. Then $\Delta E \approx \hbar\omega \approx 10^{-20}$ kg m$^2$ s$^{-2}$, and in a molecule a change in energy of $10^{-20}$ kg m$^2$ s$^{-2}$ might be very noticeable. At the level of molecules the discreteness of $\Delta E$ is crucial to understanding energy transfer.

Fig. 6.10: A quantum oscillator loses energy in discrete jumps of $\hbar\omega$.

The quantum mechanical picture is dominated by the notion of energy. Fig 6.10 is the right way to think of a simple harmonic oscillator in the context of quantum mechanics. The oscillator can only have one of the discrete set of energies indicated in the figure, separated by $\hbar\omega$, where $\omega$ is the frequency of the oscillator. That is why the oscillator can only lose energy in this discrete amount (or quantum). The idea of a friction force does not come into this picture at all. It is a question of current research to understand friction forces in terms of the more fundamental quantum description.
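The order-of-magnitude estimates in this section are easy to reproduce. The sketch below uses the commonly quoted value of the reduced Planck constant; the 1 kg pendulum bob raised by 1 cm is a hypothetical example added only to give a sense of scale.

```python
hbar = 1.055e-34  # J s (kg m^2 s^-1), reduced Planck constant

def quantum_jump(omega):
    """Energy quantum Delta E = hbar * omega, Eq (6.35), in joules."""
    return hbar * omega

dE_pendulum = quantum_jump(1.0)     # ordinary pendulum, omega ~ 1 s^-1
dE_molecule = quantum_jump(1.0e14)  # molecular vibration, omega ~ 1e14 s^-1

# For scale: a hypothetical 1 kg bob raised 1 cm has energy m g h of about 0.1 J.
E_pendulum = 1.0 * 9.8 * 0.01
n_quanta = E_pendulum / dE_pendulum

print(dE_pendulum)  # of order 1e-34 J
print(dE_molecule)  # of order 1e-20 J
print(n_quanta)     # number of quanta in the swinging pendulum
```

With so many quanta in an ordinary pendulum, losing them one at a time is indistinguishable from a smooth decline, which is why the classical picture of gradual energy loss works so well at everyday scales.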

6.9 A Chemical Bond

Fig. 6.11: The potential energy $U$ of two atoms is shown as a function of their separation $x$. If the two atoms have energy $E$ as shown, then they could separate, but if they lose enough energy, they will be in a bound state, bound to each other at roughly the separation where $U$ has its minimum. (The graph marks an unbound state with energy $E$ and kinetic energy $K$, and several bound states below it.)

To see how these ideas are used in quantum chemistry, imagine that the interaction of two atoms is described by a potential energy U(x), where x is their separation, as shown in Fig 6.11. If the total energy of the system is E, then the difference between E and U(x) is the kinetic energy of the system, as a function of separation.

The unbound state has an energy E so large that K > 0 for large x. Thus there is enough energy for the system to separate into two atoms arbitrarily far apart. The atoms are not bound to each other. If, however, the system should lose energy and make a transition to one of the bound states, the atoms could not separate. Their motion in the classical picture would have turning points where K = 0. They would be oscillating in the “potential well”, that region around the minimum of U. The potential well is not a perfect parabola, but it would still be a reasonable approximation to model this situation as a quantum oscillator. The allowed energy levels would not be exactly evenly spaced, though, and there would be only a finite number of them.

We have not said how one would know U(x) for the two atoms. Finding and improving this kind of description for atoms and molecules is still a research area. The meaning of U(x) is quite straightforward to interpret if we happen to know what it looks like as a graph, as in this case. In this example there were bound states because of the minimum in U(x), giving the possibility of turning points. It may also happen that U does not have a minimum. In this case there are no bound states, and a molecule could not form.

Problems

Gravitational Potential Energy

6.1 (a) How does the sinking of a stone in water illustrate the tendency of systems to minimize their gravitational potential energy? Remember that it is not just the stone that is moving, it is also the water. (b) How does the rising of a bubble in water illustrate the tendency of systems to minimize their gravitational potential energy?

6.2 (a) Suppose two equal weights W are at equal distances L from the center of a balance, but the (massless) beam that they are connected to is not horizontal. Rather it has turned on its fulcrum and is inclined from the horizontal by an angle θ. Use the ideas and idealizations of Section 6.1 to see if, starting from this position, the gravitational potential energy could be lowered by a further rotation of the beam through a small angle ∆θ. Which way will the beam rotate? A picture will help. (Of course if the gravitational potential energy cannot be lowered, then the system is in equilibrium, also a theoretical possibility.)

Spring Potential Energy

6.3 (a) In the example of the mass hanging on a spring, with the potential energy given in Eq (6.7), there is a given mass $m$, a spring constant $k$ and of course the acceleration due to gravity $g$. Find a combination of these three quantities that has the dimension of length. This is a natural length for the problem, that simply comes along with the problem somehow. Call it $y_N$, where the $N$ stands for “natural”. (b) Similarly, find a combination of $m$, $k$ and $g$ that has the dimension of energy. This is a natural energy in the problem, call it $U_N$. (c) $y_N$ and $U_N$ would make good units of length and energy for this problem. Find the equilibrium $y$ in units of $y_N$ and the equilibrium energy $U$ in units of $U_N$. (d) Graph the potential energy $U$ given in Eq (6.7) as a function of $y$.

Pendulum Potential Energy

6.4 Explain in words, and using a picture, why a pendulum, meaning a mass m on a string of length L, is effectively a less stiff “spring” if L is longer.

Falling

In the problems below, assume energy is conserved (i.e., ignore friction), and explain your reasoning in detail.

6.5 (a) If in falling 10 m from rest an object reaches a speed 30 mph, what speed will it reach in falling 20 m? Use proportional reasoning, but note that speed is not proportional to distance fallen. (b) If in falling 10 m from rest an object gains kinetic energy 17 J, what kinetic energy will it gain in falling 20 m? (c) If the objects in (a) and (b) are one and the same, what is the mass m of the object? (d) What is the value of g at this location?

6.6 Use conservation of energy to find the speed v of an object that falls a distance H from rest, but unlike the treatment in Section 6.4, choose the h-coordinate so that h = 0 at the starting point. Since also v = 0 at the starting point, the total energy must be E = 0. Nonetheless, the result for v after falling a distance H must be the same as before, since this is a physical fact, and has nothing to do with how we choose to measure height. Make a clear and careful argument, and include a graph like Fig 6.5, with appropriate changes.

6.7 The profile of a sledding hill is shown in Fig 6.12. Assuming friction is negligible, how fast will a sled be moving on the flat at the bottom if it starts from rest at the top? Note that the figure can also be considered a graph of gravitational potential energy Ug vs horizontal position.

Fig. 6.12: A sledding hill is 20 m high. What will be the speed of a sled at the bottom?

6.8 Suppose a pendulum of length L = 10 m and mass m = 2.4 kg is pulled back to an angle 5° from the vertical and released from rest. (a) Taking the zero of height to be where m hangs at equilibrium, what is its potential energy as it is released? How different would this value be if you used the approximation in Eq (6.12)? (b) With what speed does the mass m swing through its lowest point?

6.9 The height H you can throw depends on the speed v you can throw (upward) and the local value of g. Without referring to the text, but only using dimensional analysis, show that if you could throw twice as fast, you could throw four times as high.

6.10 (a) Show how Eq (6.21) follows from Eq (6.16) and Eq (6.19). (b) If you drop a stone into a well and hear a splash 1 s later, the water is about 16 feet down. Suppose you drop in a stone and hear a splash 2 s later. How far down is the water?

6.11 (a) A stone of mass m is dropped from a high bridge, 20 m above the surface of a river. Ignoring air friction, how fast is the stone moving when it hits the water? (b) Suppose someone had thrown the stone straight down from the bridge at an initial speed 10 m/s. In this case how fast is the stone moving when it hits the water? (c) Give a common sense argument explaining why the answers to (a) and (b) do not differ by 10 m/s.

6.12 Two masses fall from the same height H to level ground at height 0, but the first mass simply drops down starting from rest, while the second is projected horizontally at high speed. Which hits the ground first, and why?

Universal Gravitation

6.13 In this problem, assume the Earth is at rest in space, and not in a complicated solar system! Ignore air friction too. (a) How fast would something hit the Earth if it fell in from a very great distance? (b) How fast would you have to shoot something straight up to give it enough energy to escape the Earth (i.e., not run into a turning point)?

6.14 (a) In 1798 Henry Cavendish, in a delicate and beautiful experiment, measured Newton’s gravitational constant G. Cavendish actually measured the tiny force of gravitational attraction between two lead spheres a known distance apart! Explain how this determines G. (b) Explain how knowing G, and of course knowing the acceleration g due to gravity at the Earth’s surface, and the radius of the Earth, RE, Cavendish was able to deduce the mass of the Earth. (c) Check the accepted numerical values for these things. Does everything agree?

6.15 (a) Use the Virial Theorem to determine the speed of a mass $m$ in circular orbit about a much greater mass $M$ at radius $r$. (b) How long will it take $m$ to get once around the orbit? This is, of course, the period $T$. You should find $T^2 \propto r^3$, Kepler’s 3rd Law. (c) We know $T$ and $r$ for the Earth in its orbit around the Sun. Explain how your result in (b) determines the mass of the Sun in terms of things we know about the Earth’s orbit, like its period (1 year) and its radius (1 AU). Evaluate the mass of the Sun.

Energy of an Oscillator

6.16 (a) If the amplitude of a harmonic oscillator is A, then it oscillates from A to −A, a distance of 2A, in half a period. Dividing distance by time, what is the average speed of the oscillator? (b) What is the maximum speed of the oscillator in its oscillation? (c) The ratio of the answers in (a) and (b) is the average speed expressed in units of the maximum speed, a pure number. Find it, and verify that it is less than one, as common sense suggests, and the same for all harmonic oscillators.

6.17 In Section 4.12 we described how an oscillator might swing to an amplitude A always smaller than the previous amplitude by the same factor r < 1, thus running down in a manner that is called exponential decay. Show that in such an oscillator the energy also decays exponentially, being less on each swing by a constant factor (but not the factor r that describes the decay of the amplitude).

Chapter 7

Vector Quantities

In the previous chapter we saw how things fall when they fall straight down. A more complicated kind of falling is projectile motion, the motion of something flying through the air, like a ball when you throw it. Galileo discovered how to describe both kinds of falling, and showed that projectile motion is just a simple generalization of vertical falling.

Projectile motion requires two dimensions to describe. There is motion both vertically and horizontally. It was Galileo’s great insight that these two motions take place independently. That turned out to be the very simple answer to an old question: what is the path of missiles through the air? It is really quite amazing that this answer wasn’t discovered earlier. Galileo himself seems to have been surprised that he was the first to understand it. More generally, the problem of projectile motion is a good example for thinking about motion in two dimensions.

7.1 Projectile Motion

The trajectory of an object that is thrown horizontally in the gravitational field is very simple if you look at the horizontal components and vertical components separately. The amazing fact is that the horizontal motion and the vertical motion are completely independent! While the object moves horizontally at constant speed, it falls vertically in just the same way that it would drop straight down. If we introduce horizontal and vertical coordinates $(x, y)$, with $y$ increasing down, the motion is

$$x = v_{0x} t \qquad (7.1)$$

$$y = \frac{1}{2} g t^2 \qquad (7.2)$$

where $v_{0x}$, the horizontal component of velocity, is constant. This motion, shown in Fig 7.1, traces out a parabola, a curve known to the Hellenistic Greeks.

Fig. 7.1: An object thrown horizontally falls vertically as if it were simply dropping down, while it moves horizontally equal distances in equal times. The object’s position is shown here at equal intervals of time. The points lie on a parabola. (The figure marks the vertical positions 1, 4, 9, 16 on the $y$ axis and the successive drops 1, 3, 5, 7.)

When we say $x$ and $y$ in Eqs (7.1) and (7.2), or in Fig 7.1, we mean the coordinates of a particle in the $x$-$y$ plane. We can also consider the pair $(x, y)$ to be the two components of the displacement vector, meaning the displacement of the particle from the origin. Here $x$ and $y$ could be either positive or negative, just as in the case of 1-dimensional displacements.

7.2 Vector Addition

The first mention of the idea of combining two motions along two different directions seems to be in the Questions on Mechanics attributed to Aristotle (but, on stylistic grounds, actually not by him). Sometimes the author is called pseudo-Aristotle. In any case, this little book of problems raises the question of something that moves along a line while the line itself is moving. The picture is in Fig 7.2. An object moves along the line from A to B, but as it moves, the line itself moves up. By the time the object reaches B, the line has moved up to coincide with the line CD, so the object reaches the point D. It has actually moved along the diagonal AD.

Fig. 7.2: While an object moves rightward from A to B along the line AB, the line itself moves upward to position CD, so that when the object reaches B, the point B itself has reached D. The effect is that the object moved from A to D, along the diagonal. Halfway through the motion, the line is halfway up, and the object is at point E.

Nowadays we would express this idea with vectors (arrows) for the displacements, as in Fig 7.3. The idea is the same as in Fig 7.2, but perhaps easier to interpret. A displacement horizontally along the vector $\vec{r}_1$ combines with the displacement vertically along vector $\vec{r}_2$ to give the net displacement along the diagonal, called the vector sum of the two displacements. The vector sum results from putting the two addends tail-to-head. When we look at it this way, we say displacement is a vector. This implies that we can add displacements as vectors to get net displacement.

Fig. 7.3: The vector addition of displacements in Fig. 7.2. The displacement rightward along $\vec{r}_1$ combines with the displacement upward along $\vec{r}_2$ to give a net displacement which is diagonally up. This displacement is the vector sum $\vec{r}_1 + \vec{r}_2$, the cumulative effect of the two displacements.

A similar thing can be said about velocity, which can also be regarded as a vector. In Fig 7.2 there was a horizontal motion, that is, a horizontal velocity $\vec{v}_1$ that combined with a vertical motion, that is, a vertical velocity $\vec{v}_2$, to produce the actual net velocity along the diagonal $\vec{v}_1 + \vec{v}_2$. The picture in Fig 7.4 shows how these two velocities combine.

The length of the velocity vector is what we mean by speed. It is just a number, and in fact a non-negative number (with units). We may say $v_1$ or $|\vec{v}_1|$ for this length. But if we mean the vector, we will always put an arrow over the top, as in $\vec{v}_1$. It is a very good idea to do this when you are writing out problems, too, so that your notation tells you which items are vectors and which are ordinary numbers (also called scalars). The velocity vector is not a number, or scalar. Velocities add (in the sense of vectors), but speeds do not add. In Fig 7.4 we could get the speed of the composite motion from $v_1$ and $v_2$ by the Pythagoras theorem, but if the vectors $\vec{v}_1$ and $\vec{v}_2$ are not perpendicular, then this isn’t true either. In fact, the pseudo-Aristotle book seems particularly surprised to notice that two velocities corresponding to high speeds can add as vectors to make a low speed, as in Fig 7.5.

Fig. 7.4: A horizontal velocity and a vertical velocity combine to give the net velocity, the vector sum, in a diagonal direction.

Fig. 7.5: If two velocities are somewhat anti-aligned, their vector sum may be shorter than either of them. This corresponds to adding two velocities and finding a new velocity with smaller speed than either summand.

The extreme case of anti-alignment occurs when two vectors point in exactly opposite directions. In this case vector addition is more like subtraction! Since motion along a line is one dimensional, we wouldn’t actually need to use vectors. We could just express displacements x and velocities v as numbers (either positive or negative along the line, depending on direction). This is what we have been doing in previous chapters, where motion was along just one direction, for the most part. Alternatively, we could say that when we allow position x and velocity v to be either positive or negative, indicating direction, we are actually using vectors, but because they always point along the same line, we don’t need to remind ourselves that they are vectors by using the special vector notation (arrow over the top). That is just for two and three dimensions.

As a physical realization of the pseudo-Aristotle idea, we could think of someone walking on the deck of a ship, while the ship itself is moving. Here the moving ship replaces the moving line. Someone watching the motion from a stationary position on shore would see the velocity of the person as the vector sum of the walker’s velocity with respect to the deck plus the ship’s velocity. The idea of velocities cancelling is easy to picture in this case. Imagine that someone walks toward the stern of the ship at the same velocity that the ship moves forward. Then, to someone on shore, the walker seems to be walking in place, as if on a treadmill. The two velocities add to zero.
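The ship example can be written out with components. This is a minimal sketch in which velocities are represented as (x, y) pairs; the helper names `add` and `speed` and all the numerical values are chosen here for illustration and do not come from the text.

```python
import math

def add(u, v):
    """Vector sum of two 2-D vectors given as (x, y) components."""
    return (u[0] + v[0], u[1] + v[1])

def speed(v):
    """Length of a velocity vector, a non-negative number."""
    return math.hypot(v[0], v[1])

# Perpendicular case, as in Fig 7.4 (hypothetical values in m/s).
ship = (2.0, 0.0)      # ship moving forward, relative to the shore
walker = (0.0, 1.5)    # sailor walking across the deck, relative to the ship
seen_from_shore = add(walker, ship)

# Anti-aligned case: walking toward the stern as fast as the ship moves forward.
walker_aft = (-2.0, 0.0)
in_place = add(walker_aft, ship)

print(seen_from_shore, speed(seen_from_shore))
print(speed(walker) + speed(ship))  # the sum of the speeds is a different number
print(in_place, speed(in_place))
```

In the perpendicular case the speed seen from shore follows from the Pythagoras theorem and is smaller than the sum of the two speeds. In the anti-aligned case the two velocities add to zero, the "walking in place" situation described above.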

7.3 Velocity and Speed

As we said above, speed is the length of the velocity vector. The scalar quantity speed carries less information than the velocity vector, because all information about the direction of the vector has been lost. Still, sometimes all one wants to know is the speed. In case a velocity is known as the vector sum of two perpendicular vectors, as in Fig 7.4, one can find the speed using the Pythagoras theorem. From Fig 7.4, we see that the velocity $\vec{v}_1 + \vec{v}_2$ is the hypotenuse of a right triangle with sides (lengths) $v_1$ and $v_2$. This means, from geometry, that the speed squared, $|\vec{v}_1 + \vec{v}_2|^2$, is $v_1^2 + v_2^2$.

In case $\vec{v}_1$ and $\vec{v}_2$ are not perpendicular, though, as in Fig 7.5, we would have to use a generalization of the Pythagoras theorem called the Law of Cosines. In this generality the speed squared is

$$|\vec{v}_1 + \vec{v}_2|^2 = v_1^2 + v_2^2 - 2 v_1 v_2 \cos(\theta_{12}) , \qquad (7.3)$$

where $\theta_{12}$ is the angle between the vectors $\vec{v}_1$ and $\vec{v}_2$ when you place them tail-to-head, that is, the angle inside the triangle at the point where they join. If these two vectors are perpendicular, as in Fig 7.4, then $\theta_{12} = \pi/2$, and thus $\cos\theta_{12} = 0$, so that Eq (7.3) is just the Pythagoras theorem. But in Fig 7.5, there is an acute angle between the vectors, and thus the speed is less than it would be if they were perpendicular, because the last term in Eq (7.3) represents something subtracted off.

Vectors are frequently expressed in terms of their projections onto standard axes. Usually the $x$-axis is taken horizontal, pointing to the right, and the $y$-axis as vertical, pointing up. A vector with projections $v_x$ and $v_y$ onto these axes can be called $\vec{v} = (v_x, v_y)$. The projections can be either positive or negative, indicating the directions. The two components together tell everything about the vector.

In three dimensions we would have a third axis, the $z$-axis, and a projection $v_z$ onto that axis. Then the velocity vector, in terms of these projections, or components, would be $\vec{v} = (v_x, v_y, v_z)$. The speed squared in this case, the most general case, is

$$|\vec{v}|^2 = v_x^2 + v_y^2 + v_z^2 , \qquad (7.4)$$

by the Pythagoras theorem.

Kinetic energy $K$ of a mass $m$ depends only on the speed $|\vec{v}|$, and not on all the details of the vector $\vec{v}$. In the most general case it is therefore

$$K = \frac{m|\vec{v}|^2}{2} = \frac{m}{2}\left(v_x^2 + v_y^2 + v_z^2\right) . \qquad (7.5)$$
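As a consistency check, the speed of a vector sum can be computed two ways: from components, and from the Law of Cosines, Eq (7.3). The sketch below reads $\theta_{12}$ as the angle inside the triangle formed when the two arrows are placed tail-to-head (our reading of the text), so that it equals $\pi$ minus the angle between the two directions of motion. The function names and numerical values are illustrative assumptions.

```python
import math

def speed_from_components(v1, v2):
    """Speed of v1 + v2 from the components of the sum (Pythagoras theorem)."""
    return math.hypot(v1[0] + v2[0], v1[1] + v2[1])

def speed_from_law_of_cosines(s1, s2, theta12):
    """Speed of the sum from Eq (7.3); theta12 is the angle inside the
    tail-to-head triangle, assumed here to be pi minus the angle between
    the two directions of motion."""
    return math.sqrt(s1**2 + s2**2 - 2 * s1 * s2 * math.cos(theta12))

s1, s2 = 3.0, 4.0            # illustrative speeds
phi = math.radians(150.0)    # angle between the two directions of motion
v1 = (s1, 0.0)
v2 = (s2 * math.cos(phi), s2 * math.sin(phi))
theta12 = math.pi - phi      # acute, as in Fig 7.5

a = speed_from_components(v1, v2)
b = speed_from_law_of_cosines(s1, s2, theta12)

# Perpendicular case: Eq (7.3) reduces to the Pythagoras theorem.
c = speed_from_law_of_cosines(s1, s2, math.pi / 2)

m = 2.0                      # kg, illustrative mass
K = 0.5 * m * a**2           # kinetic energy, Eq (7.5)

print(a, b, c, K)
```

The two methods agree, and for these somewhat anti-aligned velocities the resulting speed is smaller than either of the original speeds, as in Fig 7.5.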

7.4 Galilean Relativity

It is a fascinating and very physical fact that when you are on a moving ship, or a moving train, or a moving plane, so long as the motion is in a straight line at constant velocity, you do not feel that you are moving. If you drop something, for example, it seems to fall straight down, just as if you were at rest. Galileo noticed this in connection with the problem of whether the Earth moves or is at rest. On the basis of examples like moving ships, Galileo argued that we could be moving and yet not feel it, or even have any way to demonstrate that we are moving.

The statement that we cannot tell by experiment whether we are moving smoothly along or not is the statement of the Principle of Relativity. It says that we can tell if we are moving relative to something else, but we cannot tell if we are moving in any absolute sense. In fact, it has no operational meaning to say we are moving in an absolute sense.

The effect of changing point of view to another point of view, moving with respect to the first point of view, is very simple in Galileo’s picture of it. All velocities simply get a certain constant velocity added on. This constant velocity just expresses the relative velocity of the two points of view. In Fig 7.4, for example, the velocity $\vec{v}_1$ might be the velocity of a sailor walking across the deck. With reference to the ship, this is the sailor’s velocity. But if the ship is moving with velocity $\vec{v}_2$ with respect to the shore, then an observer on shore gets the velocities with reference to the shore by adding on $\vec{v}_2$ to all velocities with reference to the ship.

7.5 Falling and Relativity

Simple falling, as shown in Fig 6.6, and projectile motion, as shown in Fig 7.1, seem to be rather different things, but in the Theory of Relativity they are the same! The first shows something dropping straight down, and the second shows something that has been thrown horizontally to the right. How could these be the same? In Fig 6.6, something that is initially at rest is released, and in equal units of time drops straight down, falling distances 1, 3, 5, ... in successive units of time. But who is to say that it was initially at rest? From the point of view of someone moving smoothly to the left, this object, even before it was dropped, is moving smoothly to the right. When it is dropped, it will continue to move smoothly to the right, from this point of view, because this is really just the observation of Fig 6.6 from the point of view of someone moving to the left. The result is Fig 7.1! The Principle of Relativity says that both these points of view are equally valid. Einstein frequently used this example in explaining his Relativity The- ory to popular audiences. As he put it, suppose someone drops a small stone from a railway carriage, while it is moving smoothly at constant velocity. With respect to the railway carriage, the stone falls straight down, as in Fig 6.6. But someone outside the train on the embankment who “observes the misdeed” sees the stone falling as in Fig 7.1, because with reference to the embankment the stone is moving along with the velocity of the railway carriage even before it is dropped. Now, Einstein asks, what is its actual path in space, a straight line or a parabola? He argues that this question makes no sense. Objects do not move with reference to space, he says, but only with reference to other objects. Thus you seem to get different answers depending on what reference objects you use, the carriage or the embank- ment, but since they all describe the same physical thing, namely the falling stone, they all must be consistent. 
The Theory of Relativity is about how to reconcile apparently different points of view, and also about how to use the possibility of switching points of view to get new insight. As we will see in Chapter 20, the famous formula $E = mc^2$ follows from an argument of this type.

For our purposes, the only difference between one point of view and another is that all velocities change in a simple way when we change points of view. Namely, they all appear to have a certain constant vector velocity added (in the sense of adding vectors). This new velocity is just the one that relates the two different points of view. In going from Fig 6.6 to Fig 7.1, a horizontal velocity is added, corresponding to the horizontal velocity of the railway carriage. We could imagine viewing Fig 6.6 from other points of view as well, say from the point of view of someone moving smoothly up. Then before being released, the object in Fig 6.6 is moving smoothly down, and it continues to do this after being released, while the falling motion is superimposed on it. This leads to the more general equation for motion in the y direction (with positive y down)

$$y = v_{0y} t + \frac{1}{2} g t^2 \tag{7.6}$$

where $v_{0y}$ is the additional constant velocity due to the change in point of view. This generalizes Eq (7.2) to the case of something that is not thrown horizontally, but is projected downwards. Taking $v_{0y}$ negative, we get the case of something projected upwards. Comparing with Eq (6.19) we see that the y-component of velocity in the new point of view would not be just $gt$ but

$$v_y = v_{0y} + gt \tag{7.7}$$
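The change of viewpoint behind Eqs (7.6) and (7.7) can be checked with a few lines of Python. This is only an illustrative sketch, not part of the original argument: it assumes g = 9.8 m/s², takes positive y down, and uses hypothetical function names (`simple_fall`, `change_viewpoint`, `fall_seen_from`) chosen for this example. It starts from simple falling, as in Fig 6.6, and re-describes each position for an observer for whom all velocities have a constant vector added.

```python
G = 9.8  # acceleration due to gravity in m/s^2 (assumed value)

def simple_fall(t):
    """Position (x, y) of an object released from rest, with positive y down."""
    return (0.0, 0.5 * G * t**2)

def change_viewpoint(position, t, added_velocity):
    """Re-describe a position for an observer for whom every velocity
    appears to have the constant vector added_velocity added to it."""
    x, y = position
    ux, uy = added_velocity
    return (x + ux * t, y + uy * t)

def fall_seen_from(added_velocity, t):
    """Simple falling as it appears from the new point of view."""
    return change_viewpoint(simple_fall(t), t, added_velocity)

# Observer moving left at 3 m/s: the object gains 3 m/s to the right (Fig 7.1).
for t in (0.0, 1.0, 2.0, 3.0):
    x, y = fall_seen_from((3.0, 0.0), t)
    print(f"t = {t:.0f} s: x = {x:5.1f} m, y = {y:5.1f} m")

# Observer moving up at 2 m/s: the object gains v0y = 2 m/s downward, Eq (7.6).
v0y, t = 2.0, 1.5
_, y = fall_seen_from((0.0, v0y), t)
print(abs(y - (v0y * t + 0.5 * G * t**2)) < 1e-12)
```

The horizontal shift reproduces the parabola of projectile motion, and the vertical shift reproduces Eq (7.6), from the same falling motion.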

The constant velocity $v_{0y}$ has been added to express how the object moves from the new point of view. The Principle of Relativity says that anything that happens, like simple falling, is really just one representative of a whole family of things that could happen. One gets other members of the family by imagining how the first one would look if it were viewed from a point of view moving at constant velocity with respect to the initial point of view. Since all such points of view are equally good, all members of the family are things that would actually happen. Fig 6.6 and Fig 7.1 are members of the same family, related just by a shift in point of view. That is the sense in which they are really the same.

7.6 Falling and Impulse

In one dimension, a good way to think of how a (constant) force $F$ changes velocity $v$ of a mass $m$ over some time $\Delta t$ is to compute the impulse $I = F \Delta t$. This quantity takes into account not just the force $F$ but also the time $\Delta t$ over which it acts. Both are important in determining how it changes velocity. One can think of impulse $I$ as being a kind of “kick” administered to a mass $m$. The last thing to know is that the response to this kick is determined by $m$, the mass. More massive things (having more inertia $m$) respond less, with the result

$$\Delta v = \frac{I}{m} = \frac{F \Delta t}{m} \tag{7.8}$$

for the change in velocity $v$. Since both force $F$ and velocity $v$ have a direction, the kick is in the direction of $F$, and the change in velocity $v$ is in that same direction.

In the case of gravity, the force on $m$ is $F = W = mg$ down (just the weight). Thus in a time $\Delta t$, the impulse is $I = mg \Delta t$. Dividing by $m$, the change in velocity is $\Delta v = I/m = g \Delta t$ (down). This is just the same as Eq (6.19) (where $\Delta t$, the elapsed time, is simply called $t$). The response $\Delta v$ is the same for all masses $m$ because the force $W$, and hence the impulse $I$, is proportional to $m$, but then to find the response we divide by $m$.

More generally, in two dimensions, force $\vec{F}$ and velocity $\vec{v}$ are vectors, and thus impulse $\vec{I}$ is also a vector. Mass $m$ is a scalar, but the acceleration due to gravity $\vec{g}$ is a vector. The vectors all have a direction. Velocity certainly has a direction, and force does also. Weight $\vec{W} = m\vec{g}$, the force due to gravity, is down, for example, so $\vec{g}$ is a vector down. If we take the down direction to be positive, then we could say

$$\vec{W} = (0, mg) \tag{7.9}$$

expressing the vector in terms of its horizontal and vertical components. The impulse due to this force over time $\Delta t$ is then

$$\vec{I} = \vec{W} \Delta t = (0, mg \Delta t) \tag{7.10}$$

and the change in velocity is $\vec{I}/m$, that is,

$$\Delta \vec{v} = (0, g \Delta t) . \tag{7.11}$$

That is, the y-component of velocity changes in time, but the x-component does not change, since there is no x-component of force. If you compare the beginning of this section with the end, you will see that the two parts are basically saying the same thing twice. The first part treats the falling problem as 1-dimensional, considering only the vertical direction. The second part, when we switch to vectors, keeps track of two dimensions, and in particular points out that the horizontal component of velocity is constant (zero change).
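As a small numerical illustration of Eqs (7.9)-(7.11), the sketch below computes the impulse of the weight and the resulting change in velocity for several masses. It assumes g = 9.8 m/s² and positive y down; the function names and the sample masses are hypothetical, chosen only for this example.

```python
G = 9.8  # acceleration due to gravity in m/s^2 (assumed value)

def weight(m):
    """Weight vector (horizontal, vertical) with positive y down, Eq (7.9)."""
    return (0.0, m * G)

def impulse(force, dt):
    """Impulse of a constant force acting for a time dt, Eq (7.10)."""
    return (force[0] * dt, force[1] * dt)

def velocity_change(m, dt):
    """Change in velocity: impulse divided by mass, Eq (7.11)."""
    ix, iy = impulse(weight(m), dt)
    return (ix / m, iy / m)

# The impulse grows with m, but the response (0, g*dt) does not depend on m.
for m in (0.1, 1.0, 50.0):
    ix, iy = impulse(weight(m), 2.0)
    dvx, dvy = velocity_change(m, 2.0)
    print(f"m = {m:5.1f} kg: impulse = ({ix:.1f}, {iy:.1f}) N s, "
          f"dv = ({dvx:.1f}, {dvy:.1f}) m/s")
```

The horizontal component of the change is zero for every mass, and the vertical component is the same for every mass, as the text argues.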

7.7 More on Projectile Motion

If, instead of taking the down direction to be positive, we take the up direction to be positive, then the projection of the force due to gravity, $\vec{W} = m\vec{g}$, on this axis, being down, is $-mg$, and the formulae of projectile motion look slightly different, $g$ being replaced by $-g$. We collect these formulae here, in slightly more general form,

$$v_x = v_{0x} \quad \text{(constant)} \tag{7.12}$$

$$v_y = v_{0y} - gt \tag{7.13}$$

$$x = x_0 + v_{0x} t \tag{7.14}$$

$$y = y_0 + v_{0y} t - g t^2/2 \tag{7.15}$$

and say what they mean. The horizontal component of velocity $v_x$ is constant. The subscript 0, as in $v_{0x}$, will always indicate a constant quantity. In $v_y$, for instance, $v_{0y}$ is the value of $v_y$ at the initial time $t = 0$, just some number, indicating by its sign whether the projectile was going up or down at that moment. The velocity $v_y$ itself is not constant, but continually decreasing due to the force of gravity, according to the impulse theory of the preceding section. We see this in the term $-gt$ in the formula for $v_y$. The x coordinate changes linearly in time, and its value at $t = 0$ is the quantity $x_0$. Finally the y coordinate shows the usual falling behavior, but starting at $y_0$ at time $t = 0$, not necessarily at $y = 0$.

If we know where the projectile starts, i.e., the position $(x_0, y_0)$, and its initial velocity $(v_{0x}, v_{0y})$, then Eqs (7.12)-(7.15) tell exactly how the projectile moves. Frequently, though, these things will not be given explicitly. Someone may ask how high a projectile rises above its starting point, for example. This is asking for the difference $y - y_0$ when $y$ has its maximum value. It would be enough to take $y_0 = 0$ and simply find the maximum value of $y$. Or one may ask for the distance a projectile goes from its starting point before hitting the ground again. In this case one may take $x_0 = 0$ and find $x$ at the time the projectile hits. Or one may be told the initial speed $v_0$ of the projectile and the angle $\theta$ above the horizontal with which it is projected. In this case one must find the components of the initial velocity $\vec{v}_0$ from a picture.

Fig. 7.6: The components of the initial velocity vector are found by trigonometry. (The figure is a right triangle with hypotenuse $v_0$, horizontal side $v_{0x}$, vertical side $v_{0y}$, and angle $\theta$ between $v_0$ and $v_{0x}$.)

Since the speed $v_0 = |\vec{v}_0|$ is the length of the hypotenuse of the right triangle in Fig 7.6, we have

$$v_{0x} = v_0 \cos\theta \tag{7.16}$$

$$v_{0y} = v_0 \sin\theta \tag{7.17}$$

Now we may ask how long a time a projectile rises before beginning to fall back. This is just the time from $t = 0$, when it is launched, to the time when $v_y = 0$. This occurs (solving for $t$ in Eq (7.13)) at time $t = v_{0y}/g$. Thus it rises for a time $v_{0y}/g$. We check dimensions and common sense: it is dimensionally a time, it increases with $v_{0y}$ and decreases with $g$.

How high does a projectile go? This is just $y - y_0$ at time $v_{0y}/g$, namely $v_{0y}^2/(2g)$, as we also found in Eq (6.17), where the problem was considered in one dimension. We could also do this problem using energy methods. The constant energy is

$$E = mgy + (m/2)(v_{0x}^2 + v_y^2) \tag{7.18}$$

Initially $y = y_0$ and $v_y = v_{0y}$. At the highest point $y$ is unknown and $v_y = 0$. Thus

$$mgy_0 + (m/2)(v_{0x}^2 + v_{0y}^2) = mgy + (m/2)v_{0x}^2 \tag{7.19}$$

Solving for $y - y_0$ we find once again $v_{0y}^2/(2g)$.

How far does a projectile go before hitting the ground again, if it is launched over level ground? The time to fall back to earth is the same as the time to rise, so the total time is now $2v_{0y}/g$, and the horizontal distance is $x - x_0 = 2v_{0x}v_{0y}/g$. Again we check dimensions and common sense, as well as some special cases. Suppose $v_{0y} = 0$, corresponding to a horizontal launch. Does it make sense that the total distance is then zero? Suppose $v_{0x} = 0$. Does this case make sense? Many projectile problems are variations on these.
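The results of this section (time to rise, maximum height, and distance over level ground) can be collected in a short Python sketch. It is an illustration under stated assumptions rather than a definitive treatment: g = 9.8 m/s² is assumed, the up direction is positive, the launch speed of 20 m/s at 30° is an arbitrary example, and all function names are hypothetical. The last lines check the closed-form answers against Eq (7.15) and against the energy argument of Eqs (7.18)-(7.19).

```python
import math

G = 9.8  # acceleration due to gravity in m/s^2 (assumed value)

def components(v0, theta):
    """Eqs (7.16)-(7.17): velocity components from speed v0 and angle theta (radians)."""
    return v0 * math.cos(theta), v0 * math.sin(theta)

def rise_time(v0y):
    """Time at which vy = v0y - g t reaches zero, from Eq (7.13)."""
    return v0y / G

def max_height(v0y):
    """Rise y - y0 above the starting point."""
    return v0y**2 / (2 * G)

def level_range(v0x, v0y):
    """Horizontal distance x - x0 over level ground."""
    return 2 * v0x * v0y / G

def height(t, v0y, y0=0.0):
    """Eq (7.15), with the up direction positive."""
    return y0 + v0y * t - G * t**2 / 2

def energy_per_mass(y, vx, vy):
    """Eq (7.18) divided by the mass m."""
    return G * y + 0.5 * (vx**2 + vy**2)

v0x, v0y = components(20.0, math.radians(30.0))  # arbitrary example launch
t_up = rise_time(v0y)
print(f"rise time  = {t_up:.2f} s")
print(f"max height = {max_height(v0y):.2f} m")
print(f"range      = {level_range(v0x, v0y):.2f} m")

# Consistency checks against Eq (7.15) and the energy argument of Eq (7.19).
print(math.isclose(height(t_up, v0y), max_height(v0y)))
print(math.isclose(height(2 * t_up, v0y), 0.0, abs_tol=1e-9))
print(math.isclose(energy_per_mass(0.0, v0x, v0y),
                   energy_per_mass(max_height(v0y), v0x, 0.0)))
```

Setting `v0y = 0` or `v0x = 0` in `level_range` gives zero, which is one way to explore the special cases raised at the end of the section.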

7.8 Impulse and Conservation of Momentum

The impulse law Eq (7.8) for the way velocity changes due to a force is the closest we will come to stating Newton’s 2nd Law of motion, the fundamental law of Newtonian mechanics. To repeat, it says that velocity changes in response to a force acting through time, and the response of a mass $m$ is inversely proportional to mass, i.e., larger masses respond with less change in their velocity.

This law has a particularly simple meaning if it describes two masses $m_1$ and $m_2$ exerting forces on each other, with all other forces, due to other masses, negligible. In this case the only force on $m_1$ is $\vec{F}_{12}$, the force on 1 by 2, and the only force on $m_2$ is $\vec{F}_{21}$, the force on 2 by 1. Furthermore, by Newton’s 3rd Law, $\vec{F}_{12} = -\vec{F}_{21}$. We are now representing the forces as vectors. The impulse law says that the change in velocity of the masses will be

$$\Delta \vec{v}_1 = \frac{\vec{F}_{12} \Delta t}{m_1} \tag{7.20}$$

$$\Delta \vec{v}_2 = \frac{\vec{F}_{21} \Delta t}{m_2} \tag{7.21}$$

and therefore, multiplying in the first equation by $m_1$ and in the second by $m_2$, we see

$$m_1 \Delta \vec{v}_1 = -m_2 \Delta \vec{v}_2 \tag{7.22}$$

The quantity $m\vec{v}$, called momentum, has appeared as a consequence of seeing how the two masses interact. What one loses in $m\vec{v}$ the other gains. Thus the total momentum, defined as

$$\text{Total momentum} = m_1 \vec{v}_1 + m_2 \vec{v}_2 \tag{7.23}$$

doesn’t change. This is called the law of conservation of momentum. Notice that momentum is a vector. For a single mass it has the direction of the velocity vector of that mass, and for a system of masses it is a vector sum.

We should notice all the conditions that apply to our derivation of this law. In the first place, the impulse method as we have given it is only for a constant force. Second, the interaction we described was between two particles only, with no forces from any other particles. (This might be a good description in a collision where two things exert much larger forces on each other than anything else does.) It is possible to relax these conditions considerably, and conservation of momentum holds much more generally than this short description would suggest.
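The derivation of Eqs (7.20)-(7.23) can be followed numerically. The sketch below is a minimal illustration under the same conditions as the text: two masses, a constant force between them for a time Δt, and no other forces. The masses, velocities, force, and time are arbitrary example values, and the function names are hypothetical.

```python
def interact(m1, v1, m2, v2, f12, dt):
    """Apply the impulse law to two masses that push only on each other.

    f12 is the (constant) force on 1 by 2; by Newton's 3rd Law the force
    on 2 by 1 is -f12. Velocities and forces are (x, y) pairs.
    """
    f21 = (-f12[0], -f12[1])
    new_v1 = (v1[0] + f12[0] * dt / m1, v1[1] + f12[1] * dt / m1)  # Eq (7.20)
    new_v2 = (v2[0] + f21[0] * dt / m2, v2[1] + f21[1] * dt / m2)  # Eq (7.21)
    return new_v1, new_v2

def total_momentum(m1, v1, m2, v2):
    """Vector sum m1*v1 + m2*v2, Eq (7.23)."""
    return (m1 * v1[0] + m2 * v2[0], m1 * v1[1] + m2 * v2[1])

# Arbitrary example values (SI units).
m1, v1 = 2.0, (3.0, 0.0)
m2, v2 = 5.0, (-1.0, 2.0)
f12, dt = (-40.0, 10.0), 0.05

before = total_momentum(m1, v1, m2, v2)
w1, w2 = interact(m1, v1, m2, v2, f12, dt)
after = total_momentum(m1, w1, m2, w2)

print("velocities after:", w1, w2)
print("momentum before: ", before)
print("momentum after:  ", after)
```

Each velocity changes, and by different amounts because the masses differ, but the two momentum vectors agree up to rounding.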

7.9 Impulse and Circular Motion

When something of mass $m$ moves at constant speed in a circle, its velocity vector $\vec{v}$ is always changing direction (turning). Even though the speed is constant, the vector is changing, and the change $\Delta\vec{v}$ in a short time $\Delta t$ is just the impulse $\vec{F} \Delta t$ divided by the mass $m$. It takes a little thought to make this intuitive, but the force $\vec{F}$ and the resulting impulse, as well as the change $\Delta\vec{v}$, are in this case perpendicular to the velocity, towards the center of the circle. The velocity $\vec{v}$ itself is along the circle. Figure 7.7 shows how a change toward the center can turn $\vec{v}$ in the way that it actually does turn.

Fig. 7.7: When an object moves in a circle of radius $R$ at constant speed, its velocity is continually changing in the centripetal direction, toward the center. The vector addition, adding $\vec{v} + \Delta\vec{v}$ to produce the later $\vec{v}$, is shown at the right. Strictly speaking, the figure must be understood in small angle approximation only.

If we think just about the magnitudes, the speed $v$ is $\omega R$, where $\omega$ is the angular velocity and $R$ is the radius of the circle (recall Eq 4.11). In time $\Delta t$ the mass $m$ moves through angle $\omega \Delta t$, but from Fig 7.7, in small angle approximation, this angle is also $(\Delta v)/v$. Thus

$$\frac{\Delta v}{v} = \omega \Delta t = \frac{v}{R} \Delta t \tag{7.24}$$

and therefore

$$\Delta v = \frac{v^2}{R} \Delta t \tag{7.25}$$

We see from the right side of Eq 7.25 that the impulse over this short time must be $(mv^2/R)\Delta t$ and hence that the force causing the mass $m$ to move in a circle is centripetal, with magnitude

$$|\vec{F}_{\text{centripetal}}| = \frac{m v^2}{R} \tag{7.26}$$

Note that this says nothing about what the force is that makes $m$ move in a circle. It only says that, whatever it is, it must have this magnitude to be consistent with the observed $v$ and $R$. To make use of this idea, we should think of examples where we know something else about the force.

Here is a famous example that makes use of these ideas of centripetal force and circular motion. A planet with mass $m_P$ in circular orbit of radius $R$ about the Sun is subject to the attractive gravitational force $G m_P m_S/R^2$, where $m_S$ is the mass of the Sun. Since this is just the centripetal force that causes it to move in that orbit, it must be that the speed $v$ of the planet in its orbit is such that the magnitude comes out right, namely

$$\frac{G m_P m_S}{R^2} = \frac{m_P v^2}{R} \tag{7.27}$$

Multiplying through by $R/2$, we can express this relationship in terms of energies,

$$\frac{G m_P m_S}{2R} = \frac{1}{2} m_P v^2 . \tag{7.28}$$

That is, the kinetic energy of the planet is just half its potential energy $-G m_P m_S/R$, in magnitude. We had met this way of looking at it as the Virial Theorem, Eq 6.27. Thus, solving for $v$ in either of the above equations, we have two equivalent ways to know the speed of planets in their orbits. (This relationship implies Kepler’s Third Law, relating the period $T$ to the radius $R$ of the orbit, Problem 7.12.)

Problems

Projectile Motion

7.1 A marble is batted horizontally off a table at a speed of 3 m/s. The table is 1 m high. How far from the table will the marble hit the floor?

7.2 For a fireworks display, the rockets are to be fired at an angle of 70° above the horizontal, and they should reach a height of 100 meters. What should be the speed of the rockets as they are launched? (Assume that they are given this speed almost instantaneously, and just coast upward after that).

7.3 (a) With what speed should a projectile be launched if it is to carry 1 km over level ground? Assume that it is launched at an angle θ = π/4. (b) How high will this projectile go?

7.4 A projectile is launched at an angle θ = π/3 above the horizontal. It takes 2 seconds to reach its zenith. (a) How high does it go? (b) How far does it go horizontally before hitting the ground?

7.5 A 1 kg mass is thrown upward at an angle θ = π/4 to the horizontal with a speed of 10 m/s. What is the minimum kinetic energy K that it has in its flight?

Velocity and Speed

7.6 Fig 7.8 shows mutually perpendicular x, y, and z axes, and a vector $\vec{v}$, with its projections $v_x$, $v_y$, and $v_z$ onto the axes. Use the figure to justify Eq (7.4) for the length of the vector $\vec{v}$.

Fig. 7.8: The velocity vector $\vec{v}$ is projected onto three mutually perpendicular axes.

7.7 (a) Use the Law of Cosines in case θ = 0 to find the speed corresponding to the sum of two collinear velocities $\vec{v}_1$ and $\vec{v}_2$. Also draw the corresponding picture. (b) Repeat in case θ = π.

Impulse

7.8 The gravitational force is down, so the impulse it gives to any mass $m$ is also down, and hence it changes only the vertical component of velocity $v_y$, leaving the horizontal component $v_x$ alone. Suppose the velocity vector of a mass $m$ has vertical component $v_{0y}$ and horizontal component $v_{0x}$ initially, and let the gravitational force $mg$ act for a time $t$, changing the vertical component.
(a) Find the impulse delivered to $m$ in time $t$.
(b) Find $v_y$ after the time $t$.
(c) Find $v_x$ after the time $t$.
(d) Check common sense: are your answers correct in the special case $t = 0$?
(e) Initially the kinetic energy was $K = \frac{m}{2}(v_{0x}^2 + v_{0y}^2)$. What is the kinetic energy after the time $t$?
(f) How much does $K$ change in the time $t$? Can you interpret this result?

7.9 (a) A superball, of mass 0.2 kg, is dropped from a height of 1 m. With what speed does it hit the floor? (b) Suppose it rebounds upward with the same speed. What is the change in velocity? (hint: not zero!) (c) Suppose the collision with the floor lasts for the short time ∆t = 0.01 s. What force must have been acting over this time to account for the change in velocity? Note: this force has nothing to do with the ball’s weight!

Momentum and Relativity

7.10 Two equal masses $m$ colliding and bouncing away from the collision might behave as shown in Fig 7.9. Since one mass has velocity $v$ and the other has velocity $-v$, the total momentum is $mv + m(-v) = 0$. If the only forces on these masses are the forces $F_{12}$ and $F_{21} = -F_{12}$ that they exert on each other, the total momentum should be conserved in the collision, and we see that it is: after the collision the momentum is $m(-v) + mv = 0$, the same as before. Now imagine how this collision would appear to an observer moving with velocity $v$, i.e., with the mass on the left initially, who therefore sees this mass initially at rest, and not moving. How would the other mass look, and how would the situation look after the collision, to an observer who always moves smoothly to the right with speed $v$? Draw pictures showing the collision from this point of view. Show that momentum is also conserved according to this observer (as it must be, according to the Principle of Relativity).

Fig. 7.9: Two equal masses approach each other with equal speed, collide, and bounce away with the same speeds. How would this collision look from a moving point of view? (In the figure the two masses $m$ have velocities $v$ and $-v$ before the collision, and $-v$ and $v$ after it.)

7.11 The previous problem described an elastic collision, so-called because the kinetic energy is the same before and after (no energy lost). Consider the situation of a perfectly inelastic collision shown in Fig 7.10. Answer the questions of the previous problem in this case, and also show that although the kinetic energy before and after the collision is different for the two observers, the change in kinetic energy (the energy lost in the collision) is something they agree on.

Fig. 7.10: In a perfectly inelastic collision, the two masses shown colliding stick together. Clearly the kinetic energy of the system goes down. How would this collision look from a moving point of view? (In the figure the two masses $m$ have velocities $v$ and $-v$ before the collision, and are shown together after it.)

Impulse and Circular Motion

7.12 (a) Kepler’s Third Law says that the period $T$ and radius $R$ of planetary orbits about the Sun are related by $T^2 = kR^3$ for some constant $k$. Show that Kepler’s Third Law follows from Newton’s theory of universal gravitation, and determine the constant $k$ in terms of $G$, Newton’s gravitational constant, and $M_S$, the mass of the Sun. (b) Use the fact that the Earth, with an orbital radius $R \approx 1.5 \times 10^{11}$ m, orbits the Sun in 1 year to determine the mass of the Sun. You will need to know $G \approx 6.67 \times 10^{-11}$ N·m$^2$/kg$^2$.

7.13 (a) Suppose you whirl a mass $m$ in a horizontal circle on the end of a 50 cm cord. If $m = 0.1$ kg and makes 1 revolution per second, what is the horizontal component of tension in the cord? (Only the horizontal component is centripetal). (b) The cord in part (a) pulls along its length, which means the tension force also has an upward, or vertical, component. What is the vertical component of tension in the cord? (Hint: something is supporting the mass, keeping it from falling.) (c) What angle does the cord make with the vertical as the mass $m$ whirls around?

Chapter 8

Density and Fluids

King Hiero of Syracuse commissioned a crown of gold, but when it was delivered he suspected the goldsmith of cheating him. The crown had the correct weight of the gold he had given, but what if the goldsmith had kept some of the gold and made up the weight with less precious silver? He told his friend Archimedes of his suspicions. The most famous story of Archimedes tells how he detected the forgery of the crown. (This is the one where he jumps out of the bath shouting ‘Eureka!’ and runs naked through the streets.) The 1st century Roman author Vitruvius tells this story. According to him, Archimedes’ great insight came when he stepped into the bath and noticed how the displaced water overflowed, giving a way to measure volume, and hence density. If the crown were not pure gold, it would not have the density of gold, and thus the forgery could be proven.

8.1 Mass Density

Weight is proportional to volume for a pure substance, like gold. If you have twice as much gold (by volume), it weighs twice as much. Since weight is also proportional to the more fundamental quantity mass, we can also say mass M is proportional to volume V , that is,

M ∝ V (8.1)

The constant of proportionality is called mass density, and is often given the Greek letter $\rho$ (‘rho’), so that

$$M = \rho V \tag{8.2}$$

The dimension of $\rho$ is $[\rho] = [M L^{-3}]$. A mass density has to be multiplied by a volume to become a mass.

Mass density is an example of a “material property.” It is a characteristic of the material, and can be measured and tabulated for future use. It is common to give this property in the cgs unit g/cm$^3$. In SI units (kg/m$^3$) the numerical value is larger by a factor 1000. Below are the densities of a few familiar materials:

Substance    Mass Density (g/cm$^3$)
Water         1.0
Aluminum      2.7
Iron          7.9
Copper        9.0
Silver       10.5
Gold         19.3
Mercury      13.5
Lead         11.4

The least imaginative way to measure the density of a substance is to take a known mass $M$, to measure its volume $V$, perhaps by measuring the volume of water it displaces, and then to take the ratio $\rho = M/V$. According to Vitruvius, this is what Archimedes did. Later readers have suspected that in this version the story has been dumbed down to the level of Roman comprehension. One internal piece of evidence: Vitruvius actually does not even seem to understand the concept of ratio. Rather, according to him, Archimedes took a lump of gold of the same weight as the crown, and compared the water it displaced with the water the crown displaced. The crown displaced more water, being less dense than gold.

Galileo noticed that the rather childish reasoning in Vitruvius’ story was not really worthy of Archimedes (‘that godlike man’), especially when one considers that Archimedes’ own theories suggest a much more sensitive method. In an unpublished essay written at about the age of 20, Galileo suggests that what Archimedes actually did was much more interesting than the Vitruvius story. To understand the idea, we have to know about Archimedes’ Principle.

8.2 Archimedes’ Principle

If a substance is less dense than water, it floats, and if it is more dense, it sinks. This is just one consequence of Archimedes’ Principle, which says that in equilibrium a substance immersed (or partially immersed) in a fluid is buoyed up by a force equal to the weight of the fluid it displaces. This insight is simple, general, applicable, and useful. You would think that something proved thousands of years ago could be improved on now, but really, Archimedes’ Principle is perfect as it is. There is nothing to add. This is all the more amazing when you look at Archimedes’ proof, which is so simple that you almost think there must be something wrong with it. Somewhat surprisingly, the fluid is taken to have a spherical surface, with the center of the Earth as center. In most applications we would be looking at such a small volume of water that its surface would look flat, but Archimedes is correct that strictly speaking the equilibrium surface is curved. This assumption does not play an essential role in the argument, but it is necessary to know about it in order to understand Archimedes’ diagrams.

Fig. 8.1: An object floating on the left side displaces a volume of water equal to the symmetrically located volume on the right. Since the system is in equilibrium, the two weights must be the same. The volume ABCD might be very small, despite appearances, just water in a pail, for example.

Archimedes’ only postulate in his book On Floating Bodies is that in a fluid “that part which is thrust the less is driven along by that which is thrust the more.” Thus in equilibrium the “thrust” must balance. In Fig 8.1 the only difference between the left side and the right side is that a certain volume of water on the right has been replaced by an object on the left which also protrudes out of the water. These must weigh the same amount, or one side will be “thrust the more.” Thus the weight of the water displaced is the same as the weight of the object, and since the object is in equilibrium, it is buoyed up by a force equal to its weight. Thus the object is buoyed up by the weight of the water displaced.

This law of buoyancy is also shown to hold in more general situations. Suppose you push the floating object under water and hold it there with a force $F$. The thrust on the left will be the weight of the object $W_o$ plus your force $F$. It is balanced by the weight $W_w$ of the symmetrically located volume of water, corresponding to the water displaced by the object, which is now more than before, since the whole object is immersed. Since $F + W_o = W_w$ in equilibrium, you must push down with a force

$$F = W_w - W_o \tag{8.3}$$

Similarly, if an object does not float, but sinks, you could keep it from sinking by exerting a force $F$ up such that $-F + W_o = W_w$, and therefore

$$F = W_o - W_w \tag{8.4}$$

You don’t need to support the whole weight $W_o$ with your force $F$, but only the excess of that weight over the weight of the displaced water. It is as if the object had lost as much weight as the weight of the displaced water.

In both these examples the weight $W_w$ of the water displaced acts like a force up, a buoyant force. In a free body diagram for an immersed object we should include it as a force $F_b$ (b for buoyant) due to the adjoining fluid, as in Fig 8.2. From the diagram we can read off the net force on the object, meaning the sum of both forces. If the object has volume $V$ and density $\rho_o$, then its mass is $\rho_o V$, and its weight is $W_o = \rho_o V g$. Suppose it is completely immersed. Then the net force is

$$F_{net} = W_o + F_b = W_o - W_w = (\rho_o - \rho_w) V g \tag{8.5}$$

where $\rho_w$ is the density of water. Note that the volume of displaced water is also $V$ if the object is completely immersed, so the weight of the displaced water is $\rho_w V g$. The object is in equilibrium if these forces balance, i.e., $F_{net} = 0$, which happens if $\rho_o = \rho_w$. If $\rho_o > \rho_w$, then $F_{net} > 0$ and the object sinks. If $\rho_o < \rho_w$, then $F_{net} < 0$ and the object rises. (We implicitly took the down direction to be positive when we represented weight by a positive quantity and represented $F_b$ by the negative of a weight.) The behavior is entirely controlled by the relative densities.

Fig. 8.2: An immersed object is subject to two forces, its weight $W_o$, and the buoyant force $F_b$ due to the adjoining fluid, equal in magnitude to the weight of the fluid displaced. It is called $-W_w$ here, since weight is a force down, but the buoyancy force is up.

The ratio ρo/ρw is called the specific gravity of the object. It is just density measured in units of the density of water. Specific gravity greater than 1 implies the object will sink, and less than 1 implies the object will float. The values in the table of the previous section are specific gravities, since they are given in units where ρw = 1.
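Eq (8.5) and the specific gravity rule lend themselves to a short numerical sketch. The Python below is illustrative only: it assumes g = 9.8 m/s², takes down as positive as in the text, and converts three densities from the table in Section 8.1 to SI units. The entry for wood (0.6 g/cm³) is an assumed value that does not come from the table, the 1 liter volume is arbitrary, and the function names are hypothetical.

```python
G = 9.8             # acceleration due to gravity in m/s^2 (assumed value)
RHO_WATER = 1000.0  # kg/m^3, i.e. 1.0 g/cm^3 in SI units

# Densities in kg/m^3. The first three are from the table in Section 8.1;
# the value for wood is an assumption made for this example only.
DENSITY = {"aluminum": 2700.0, "iron": 7900.0, "gold": 19300.0, "wood": 600.0}

def net_force(rho_object, volume):
    """Eq (8.5): net force on a fully immersed object, with down positive."""
    return (rho_object - RHO_WATER) * volume * G

def specific_gravity(rho_object):
    """Density measured in units of the density of water."""
    return rho_object / RHO_WATER

def behavior(rho_object):
    """Sink, rise, or stay put, decided by the specific gravity alone."""
    sg = specific_gravity(rho_object)
    if sg > 1:
        return "sinks"
    if sg < 1:
        return "rises"
    return "in equilibrium"

volume = 1.0e-3  # one liter, in m^3 (arbitrary example)
for name, rho in DENSITY.items():
    print(f"{name:9s} specific gravity {specific_gravity(rho):5.2f}, "
          f"net force {net_force(rho, volume):7.2f} N, {behavior(rho)}")
```

A positive net force is the force needed to hold a sinking object up, Eq (8.4), and the magnitude of a negative one is the force needed to hold a buoyant object under, Eq (8.3).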

The weight $W_o$ in Fig 8.2 is shown acting at the center of gravity of the object (we are assuming that the object is the same density everywhere, so that it would balance at its geometrical center). The buoyant force $F_b$ is shown acting at what would be the center of gravity of the fluid volume, if the object were not there. The fluid is homogeneous, so this is the same geometrical center. That is, both forces act at the same place, and the torques about this point automatically balance. There is no unbalanced torque, and hence no tendency for the object to turn, even if it sinks or is buoyed up. In principle, though, these two forces act at two different points, and it is only the simplicity of this object that makes the two points coincide in this case.

Fig. 8.3: A symmetrically shaped object is denser at one end than at the other. When it is immersed there is an unbalanced torque about the center, because the buoyant force operates where the (homogeneous) fluid volume would balance if the object weren’t there, namely at the center. (The density is given in units of the left hand density. The actual value is irrelevant.) In the figure the left end is labeled ρ = 1 and the right end ρ = 2, with the forces $F_b$ and $W_o$ marked.

In Fig 8.3 we imagine an immersed object that has a symmetrical shape but an asymmetrical distribution of mass (denser on the right than on the left). Again the weight acts at the center of gravity, where the object would balance, and the buoyant force acts where the corresponding fluid volume would balance, but now the two points are different, and there is a net torque. Whether this object is buoyant and rises, or is denser and sinks, it will twist in the clockwise direction.

There would be no unbalanced torque in Fig 8.3 if the object were aligned vertically instead of horizontally. This is true whether the center of gravity is directly above the geometrical center or below it. In the first case, though, the equilibrium position is unstable, because as soon as the object tips a little bit, as in Fig 8.4 on the left, there is an unbalanced torque that tends to turn it in the direction it has tipped. Any accidental tipping gets amplified. If the center of gravity is below the geometrical center, as in Fig 8.4 on the right, the unbalanced torque tends to restore the (rotational) equilibrium. This is the stable alignment. One could imagine the object seeking to minimize its gravitational potential energy. If the denser part is on top, it can lower its potential energy by turning over. If the denser part is on the bottom, it tends to stay there.

Fig. 8.4: The object in Fig 8.3 is imagined to be immersed at an angle. If the denser part is above, the unbalanced torque tends to turn the object over. If the denser part is below, the unbalanced torque tends to keep it below. Thus the left side is close to the unstable vertical alignment, and the right side is close to the stable vertical alignment.

One obvious application of these ideas is to the stability of ships. The buoyant force that holds them up is applied at the center of the displaced fluid, which is somewhere below the water line. In a careless design, the center of gravity of the ship might very well be above this, especially if the ship has an elaborate superstructure. Such a ship would be unstable. To solve this problem, old sailing ships carried a permanent load of rocks in their hold for ballast. This brought the center of gravity low enough for stability. Modern racing yachts have a heavy keel for the same reason.

A familiar example of this phenomenon is a floating cube. Since the cube protrudes out of the water, it is clear that a symmetrically floating cube would have its center of gravity above the point where the buoyant force acts. This alignment is unstable. As a result, the cube tips and floats at an odd angle. We will follow historical treatments in talking about rectangular solids floating in a geometrically simple way, as if they were somehow forbidden to tip, but when you think about it, you realize that tipping makes the problem much more complicated!

8.3 Galileo’s Balance

Fig. 8.5: Galileo’s balance: the specific gravity of the crown is $L/x$. (Details of the argument are given in the text.) The crown in the figure is apparently only a little bit denser than aluminum!

Galileo argued that Archimedes would have tested the crown of Hiero using a combination of Archimedes’ Principle and the Law of the Lever, as shown in Fig 8.5. The measurement occurs in two steps. First the crown, with weight $W_o$, hanging at a distance $L'$ from the fulcrum, is balanced by a counterweight $W$ at the distance $L$ from the fulcrum on the other side.

The balance of torques tells us

$$W_o L' = W L \tag{8.6}$$

Now a pail of water is brought up and the crown is immersed. In effect the crown loses weight $W_w$, the weight of the displaced water. The counterweight is moved in by a distance $x$ so that the crown once again balances. The balance condition is now

$$(W_o - W_w) L' = W (L - x) \tag{8.7}$$

Subtracting Eq (8.7) from (8.6) we have

$$W_w L' = W x \tag{8.8}$$

By Eq (8.6), we have $L' = L(W/W_o)$. Putting this expression for $L'$ into Eq (8.8), and dividing both sides by $WL$, we have

$$\frac{W_w}{W_o} = \frac{x}{L} \tag{8.9}$$

Since $W_w/W_o = \rho_w/\rho_o$, the specific gravity $\rho_o/\rho_w$ is

$$\frac{\rho_o}{\rho_w} = \frac{L}{x} \tag{8.10}$$

The specific gravity of the crown, which is essentially the unknown density of the crown, is represented visually by the position $x$ of the counterweight. In Fig 8.5 the crown has specific gravity a little more than 3, and is certainly not gold!

For some reason Galileo never published this elegant idea. He was, in fact, surprisingly secretive, for a man who later became such a public figure. Looking back, though, we can see in this youthful essay that Archimedes had, so to speak, come to life again in the work of Galileo.
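As a numerical check of Eqs (8.6) through (8.10), the following Python sketch simulates the two weighings and confirms that the counterweight shift $x$ gives the specific gravity as $L/x$. The function names and the sample numbers (a 1 kg gold crown with specific gravity 19.3, arm lengths of 0.20 m and 0.30 m) are illustrative assumptions, not values from Galileo’s essay.

```python
def counterweight_shift(W_o, W_w, L_prime, L):
    """Distance x the counterweight moves inward after the crown is immersed.

    W_o: weight of the crown in air; W_w: weight of the displaced water;
    L_prime: crown's distance from the fulcrum; L: counterweight's distance.
    """
    W = W_o * L_prime / L                   # Eq (8.6): W_o L' = W L
    return L - (W_o - W_w) * L_prime / W    # Eq (8.7) solved for x


def specific_gravity(L, x):
    """Eq (8.10): rho_o / rho_w = L / x."""
    return L / x


# Illustrative (assumed) numbers: a gold crown, specific gravity 19.3.
g = 10.0                      # m/s^2, rounded
W_o = 1.0 * g                 # weight of a 1 kg crown, in N
W_w = W_o / 19.3              # the displaced water weighs 19.3 times less
L_prime, L = 0.20, 0.30       # arm lengths in m (assumed)

x = counterweight_shift(W_o, W_w, L_prime, L)
print(f"x = {100 * x:.2f} cm, specific gravity = {specific_gravity(L, x):.1f}")
```

Note that the result does not depend on $L'$ or on the counterweight $W$: only the ratio $L/x$ matters, which is what makes the balance so convenient to read.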

8.4 Galileo’s Proof of Archimedes’ Principle

Galileo may have been unsatisfied with Archimedes’ argument for Archimedes’ Principle, even while he was convinced that it must be true. He gave another proof. The idea, translated into modern terms, is that a floating body in equilibrium, being free to move in any way, actually moves to minimize the gravitational potential energy (of the whole system).

Fig. 8.6: How much does the gravitational potential energy change if an object moves vertically by the amount $\Delta y$? The object has height $h$, cross-sectional area $A$, and it has sunk a depth $D$ into the fluid.

The situation is pictured in Fig 8.6. The object has height $h$, cross-sectional area $A$, and therefore volume $V = Ah$, mass density $\rho_o$, and therefore mass $\rho_o A h$. It has sunk a depth $D$ into the water. As before, we compute the change in gravitational potential energy in case the object moves. If this can be negative, then we know it is not at the equilibrium, because it can move to lower its energy. The effect of moving the object down a distance $\Delta y$ in terms of energy is the same as taking a slice of small thickness $\Delta y$ off the top and putting it on the bottom, which would lower the energy of the object by an amount $\rho_o A \Delta y\, g h$. At the same time, though, one must take a “slice” of water of the same dimensions and move it up to the surface (where it spreads out – the level of the water goes up slightly). This raises the potential energy of the water by $\rho_w A \Delta y\, g D$. The change in the gravitational energy $U_g$ of the system is therefore

$$\Delta U_g = (\rho_w D - \rho_o h) A \Delta y\, g \tag{8.11}$$

If the quantity in parentheses is not zero, then one can choose $\Delta y$ to make $\Delta U_g < 0$, so that the potential energy can be lowered. Thus the equilibrium condition is $\rho_w D - \rho_o h = 0$, or

$$D = \frac{\rho_o h}{\rho_w} \tag{8.12}$$

This tells us the depth of the object in its equilibrium position. We check the weight of the displaced water in equilibrium: its volume is $AD$, so its weight is, using the equilibrium $D$ from Eq (8.12),

$$A D \rho_w g = A \left( \frac{\rho_o h}{\rho_w} \right) \rho_w g = A h \rho_o g \tag{8.13}$$

This is exactly the weight of the object. The condition that the gravitational energy should be minimized has thus led back to Archimedes’ Principle.
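The energy argument can be tried out numerically. The Python sketch below evaluates Eq (8.11) and Eq (8.12) for a hypothetical wooden block; the densities, dimensions, and function names are assumptions chosen only for illustration.

```python
def energy_change(rho_o, rho_w, h, D, A, dy, g=10.0):
    """Eq (8.11): change in gravitational energy when the object sinks by dy."""
    return (rho_w * D - rho_o * h) * A * dy * g


def equilibrium_depth(rho_o, rho_w, h):
    """Eq (8.12): depth at which the energy change vanishes."""
    return rho_o * h / rho_w


# Assumed example: a wooden block (600 kg/m^3) floating in water (1000 kg/m^3).
rho_o, rho_w = 600.0, 1000.0
h, A, g = 0.10, 0.04, 10.0          # height (m), area (m^2), g (m/s^2)

D = equilibrium_depth(rho_o, rho_w, h)
weight_object = rho_o * A * h * g
weight_displaced = rho_w * A * D * g

print(f"equilibrium depth D = {D:.3f} m")
print(f"object: {weight_object:.1f} N, displaced water: {weight_displaced:.1f} N")

# Floating too high (depth only D/2): sinking a little (dy > 0) lowers the energy.
print(energy_change(rho_o, rho_w, h, 0.5 * D, A, dy=0.001) < 0)
```

If the block sits too deep instead, the sign of the parenthesis in Eq (8.11) flips, and it is a small upward move (negative `dy`) that lowers the energy.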

8.5 Buoyancy and Pressure

The buoyancy force $F_b$ on an object is surely due to the adjoining water, but Archimedes’ proof gives no clue how the water actually exerts this force. The energy method does not even mention force. We now understand the force exerted by a fluid in equilibrium in terms of the concept of pressure. Pressure $P$ is a force per unit area, with dimension $[P] = [M L^{-1} T^{-2}]$. The SI unit of pressure is the Newton per meter squared (N/m$^2$), also called the Pascal, abbreviated Pa. Thus if the pressure is 10 Pa, the force on 1 square meter would be 10 Newtons, and the force on 0.1 square meter would be 1 N. A pressure has to be multiplied by an area to be a force, so it is a kind of “force density.” Such a thing is also called a stress. Pressure is called a normal stress, because the force due to pressure acts normally to the surface (i.e., in the normal direction, perpendicular to the surface).

Fig. 8.7: The fluid pushes inward on an object of height $h$ and cross-sectional area $A$, having sunk to a depth $D$. The net force due to the fluid is a force $PA$ up, where $P$ is the pressure at depth $D$. The horizontal forces balance.

Let us see how this force supports a buoyant object, like the one in Fig 8.6. We redraw it, showing the normal forces due to pressure in Fig 8.7. On every part of the surface contacted by the fluid there is a force due to pressure $P$, directed normally into the surface. The horizontal forces balance, but on the bottom surface there is a force inward, which is to say up, equal to $PA$, where $P$ is the pressure at the depth $D$ on the surface $A$. This must be the buoyant force! We already know its magnitude is the weight $W_w = \rho_w A D g$ of the water displaced, by Archimedes’ Principle. Thus the pressure at depth $D$ is

$$P = \frac{W_w}{A} = \frac{\rho_w A D g}{A} = \rho_w D g \tag{8.14}$$

We notice that this equilibrium pressure, or hydrostatic pressure, as it is also called, is proportional to depth $D$. This makes sense: as you go down, the pressure goes up. $P$ is also proportional to $\rho$, in case it is some fluid different from water. The pressure in liquid mercury, for example, would be about 13 times greater than in water at the same depth.

This same hydrostatic pressure accounts for why the fluid is at rest in equilibrium. The volume of fluid above an area $A$ is supported by the force $PA$ (up). That is, the object in Fig 8.7 could be removed, and fluid allowed into its place. That fluid becomes, in effect, the object, and is buoyed up in exactly the same way, by the same hydrostatic pressure. Since that volume would weigh much more if the fluid were mercury, the pressure has to be correspondingly more in mercury than in water. This gives a very simple interpretation of hydrostatic pressure: at any depth $D$, it is whatever is necessary to hold up the weight of the fluid above. For a fluid of density $\rho_f$ the pressure at depth $D$ is

$$P = \rho_f D g \tag{8.15}$$

Fig. 8.8: An object completely immersed in a fluid is buoyed up by a force equal to the weight of the fluid displaced, because the pressure below is greater than the pressure above.

If we know how pressure translates into force on a surface, Eq (8.15) implies Archimedes’ Principle. In Fig 8.8 we consider an object completely immersed. The horizontal pressure forces balance, but the pressure on the bottom (depth $D_2$) is greater than the pressure on the top (depth $D_1$). Using the hydrostatic pressure from Eq (8.15) we have the buoyant force

$$F_b = \rho_f D_2 g A - \rho_f D_1 g A = \rho_f h g A = \rho_f V g \tag{8.16}$$

using $h = D_2 - D_1$ for the height and $V = hA$ for the volume of the object. Note that $\rho_f$ is the density of the fluid! Thus, in magnitude, $F_b$ is the weight of the fluid displaced. It doesn’t matter whether the object is shallow or deep, whether the pressures are large or small. All that matters is the difference of the pressure on the top and bottom, and this difference is always the same for this object, leading to a buoyant force that is the same at any depth. Intuitively you might think that at great depth the enormous pressure on the top would drive the object down more, but this is not true: the even more enormous pressure on the bottom more than makes up for this effect.

It might occur to you that at high pressure the object could be compressed, and have a smaller volume. Then it would displace less fluid and the buoyant force would be less. This is true! We have ignored compressibility in the above discussion, but a clever toy called a Cartesian diver exploits this effect. The object traps an air bubble, and this bubble is quite compressible. The bubble is created to make the object just barely buoyant, so that it floats, but if you can increase the pressure on the whole system, the bubble is compressed, the object becomes more dense than water, and it sinks. Releasing the pressure allows the bubble to expand and displace more water, and the object floats again.
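To see the depth independence concretely, here is a small Python sketch that computes the buoyant force directly from the pressures on the top and bottom faces, as in Eq (8.16). The block dimensions and function names are assumed for illustration, and compressibility is ignored, as in the main argument.

```python
def hydrostatic_pressure(rho_f, depth, g=10.0):
    """Eq (8.15): pressure at a given depth in a fluid of density rho_f."""
    return rho_f * depth * g


def buoyant_force(rho_f, A, h, depth_top, g=10.0):
    """Eq (8.16): pressure force on the bottom minus pressure force on the top."""
    p_top = hydrostatic_pressure(rho_f, depth_top, g)
    p_bottom = hydrostatic_pressure(rho_f, depth_top + h, g)
    return (p_bottom - p_top) * A


rho_water = 1000.0            # kg/m^3
A, h = 0.02, 0.5              # assumed block: 0.02 m^2 cross-section, 0.5 m tall

for depth_top in (0.0, 1.0, 100.0):
    F_b = buoyant_force(rho_water, A, h, depth_top)
    print(f"top face at {depth_top:6.1f} m: F_b = {F_b:.1f} N")
```

Each line of output reports the same force, equal to the weight $\rho_f V g$ of the displaced water, even though the individual pressures differ by orders of magnitude.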

8.6 More on Hydrostatic Pressure

The fluid in Fig 8.9 is in equilibrium, although it would be possible to get confused, wondering how the small weight of water on the left can “balance” the great weight of water on the right. The answer is that the equilibrium condition for a fluid is a condition on pressure, given in Eq (8.15), and not a condition on weight. It is true that the force on the floor of the righthand compartment is much greater than the force on the floor of the lefthand compartment, but that is because its area is greater: the pressure is the same. In fact, if the pressure were to be greater on the floor at the right, the fluid in the connecting tube would be “thrust the more” from the right, and would move from right to left. In equilibrium, however, the fluid is not moving, and hence the pressure must not change as one moves horizontally. The pressure depends only on depth.

Fig. 8.9: The fluid is in equilibrium, despite the greater weight on the right.

A more surprising application of the hydrostatic pressure in Eq (8.15) is the siphon, shown in Fig 8.10. If we think about the hydrostatic pressure in the top vessel, we measure depth from the top surface. Continuing into the tube we have a region of negative depth, where we are above the top surface, before following the tube down to the second vessel, where the hydrostatic pressure of Eq (8.15) is positive again (we will revise this picture slightly in the next section, but not in a way to change the analysis). This fairly large positive pressure is the hydrostatic pressure we would find if the tube were closed off at the bottom. If we think of the hydrostatic pressure in the bottom vessel, however, we find a lesser value, because we measure from its top surface, which is lower. In particular, the hydrostatic pressure at the mouth of the tube, in case the tube is closed off, is less than the pressure in the tube. If the tube is now slightly opened, so that the equilibrium is only slightly disturbed, the higher pressure inside drives fluid out into the lower vessel, and the fluid slowly transfers from the upper to the lower vessel.

Fig. 8.10: The fluid is not in hydrostatic equilibrium, but flows in the direction that it is “thrust the more.” If it were in equilibrium, as it would be with the bottom of the tube closed off, the pressure in the tube would be higher than the pressure just outside the tube.

8.7 Atmospheric Pressure

It is easy to understand that deep-sea creatures live in a world of high pressure, but it was only in the 17th century that it was understood that we too live in a deep sea: a sea of air. We too are subject to a hydrostatic pressure, sufficient to hold up the column of air over any area down here on the ground. That pressure, atmospheric pressure, is roughly $P_{\mathrm{atm}} \approx 10^5$ N/m$^2$. Thus on every square centimeter of surface there is a force inward, due to the atmosphere, of about 10 Newtons, about the weight of 1 kg, more than 2 pounds. Since the surface of a human body has an area of well over a square meter, that is, more than 10,000 square centimeters, there is a force distributed on our bodies of tens of thousands of pounds, tending to crush us. Why don’t we feel it? One answer is that the fluid in our tissues is all at this same pressure, so there is no particular danger of actually being crushed. Rather we are in hydrostatic equilibrium with this ambient pressure. It is not of any use to us to be aware of it, and we have not evolved any sensory organs to detect it. We do, of course, detect sudden deviations from equilibrium.

Atmospheric pressure, so easy to forget, should really have been included in our hydrostatic pressure result, Eq (8.15). If we think about how hydrostatic pressure supports a column of water, we should realize that the pressure force $PA$ on the bottom of the column must balance not only the weight of the water, but also the force $P_{\mathrm{atm}}A$ on the top. That means the hydrostatic pressure at depth $D$ is actually

$$P = P_{\mathrm{atm}} + \rho D g \tag{8.17}$$

This correction to Eq (8.15) does not change any of our previous examples in any significant way, though. The reason is that it was always a difference of pressures that was the important thing. In understanding buoyancy, for example, we looked at the difference in pressures between the top and bottom of an object: when we take the difference, the constant $P_{\mathrm{atm}}$ cancels, leaving us with the same expression that we would have without it.

In fact, it is hard to think of an everyday example where the actual value of the pressure matters. Usually all that matters is the difference of the pressure from $P_{\mathrm{atm}}$, which is the term we have been emphasizing. This difference is often called gauge pressure, since it is what a typical pressure gauge measures, by comparing the pressure (in a tire, say) with the ambient atmospheric pressure. If your tire pressure is 26 pounds per square inch (psi), according to a tire gauge, that is above and beyond the atmospheric pressure of about 14 psi. It would be technically correct, but not a good idea, to insist to your mechanic that your tire pressure is really 40 psi. This could only be confusing. On the other hand, in the context of the ideal gas law, which we will meet soon, you must use the true pressure of 40 psi!

One place where the difference between Eq (8.15) and (8.17) might appear significant is in our discussion of the siphon. When the depth $D$ becomes negative (above the surface of the top vessel), Eq (8.15) gives a negative pressure $P$, but Eq (8.17) gives a positive pressure. In the end it is only a difference of pressures that is relevant, and $P_{\mathrm{atm}}$ cancels out, but one might worry that perhaps negative pressure doesn’t make any sense. We will see, however, that true pressure can be negative, even if it isn’t actually negative in this case.
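The distinction between gauge pressure, Eq (8.15), and true pressure, Eq (8.17), is easy to illustrate with a few lines of Python. The function names, the depths, and the 1 m siphon rise below are assumptions made for the sake of the example; the rounded constants follow the text.

```python
P_ATM = 1.0e5   # Pa, the rounded value used in the text


def gauge_pressure(rho, depth, g=10.0):
    """Eq (8.15): pressure relative to the atmosphere."""
    return rho * depth * g


def true_pressure(rho, depth, g=10.0, p_atm=P_ATM):
    """Eq (8.17): gauge pressure plus atmospheric pressure."""
    return p_atm + gauge_pressure(rho, depth, g)


rho_water = 1000.0
top, bottom = 2.0, 2.5          # assumed depths (m) of an object's top and bottom

# P_atm cancels in any pressure difference, so buoyancy is unaffected.
diff_gauge = gauge_pressure(rho_water, bottom) - gauge_pressure(rho_water, top)
diff_true = true_pressure(rho_water, bottom) - true_pressure(rho_water, top)
print(diff_gauge, diff_true)

# In a siphon tube 1 m above the upper surface (depth = -1 m, assumed):
# the gauge pressure is negative, but the true pressure is still positive.
print(gauge_pressure(rho_water, -1.0), true_pressure(rho_water, -1.0))

# The tire example, in psi: gauge reading plus atmosphere gives true pressure.
tire_gauge_psi, atm_psi = 26.0, 14.0
print(tire_gauge_psi + atm_psi)
```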

8.8 The Barometer

Evangelista Torricelli, a student of Galileo, showed how to measure $P_{\mathrm{atm}}$, essentially by measuring the gauge pressure of the vacuum (the true pressure is 0). His barometer is shown schematically in Fig 8.11.

Fig. 8.11: A tube evacuated at the top indicates the ambient pressure by the height to which a fluid rises: a barometer.

In equilibrium the pressure at the level of the fluid in the reservoir is $P_{\mathrm{atm}}$, and the pressure at the top of the tube (a vacuum) is 0. Thus, from the equilibrium of the fluid column,

$$P_{\mathrm{atm}} = \rho g H \tag{8.18}$$

where $H$ is the height of the fluid column (what we would call depth $D$ measuring from the top of the column to the level of the reservoir). This fluid column weighs exactly as much as a column of the entire atmosphere of the same horizontal cross-section. The equilibrium condition determines $H$:

$$H = \frac{P_{\mathrm{atm}}}{g\rho} \tag{8.19}$$

We see that $H \propto P_{\mathrm{atm}}$, so a measurement of $H$ is a measurement of $P_{\mathrm{atm}}$ in some units. Torricelli measured $H$ and found that $P_{\mathrm{atm}}$ varies a little bit from day to day! Barometric pressure is a very familiar part of our weather reports now, but at the time this discovery must have been completely unexpected.

Let us estimate the size of a water barometer. Putting everything in SI units, we have $P_{\mathrm{atm}} \approx 10^5$ N/m$^2$, $g \approx 10$ m/s$^2$, and $\rho \approx 10^3$ kg/m$^3$, giving $H \approx 10$ meters. This is rather tall! Torricelli’s barometer was a glass tube (so that he and his neighbors could see the water level) sticking out through a hole in the roof of his house. A much more practical device uses mercury. Since the density of mercury is over 13 times that of water, and $H \propto 1/\rho$, a mercury barometer is shorter by a factor of 13, and can sit comfortably on a table. It is about 0.76 meters tall. In the US the height is still given in inches of mercury, and is about 30 inches, but varies with the weather, of course. In the eye of a hurricane it can be 29 inches.

Yet another unit of pressure is the “atmosphere.” One atmosphere is, by definition, 760 millimeters of mercury, referring to the above barometer design. The millimeter of mercury, as a unit of pressure, is also called the Torr, for Torricelli.

Aristotle believed that “Nature abhors a vacuum”, and that what is going on here is that the fluid is being sucked upward by the vacuum, and is trying to fill it. This is so intuitively appealing that one must make a conscious effort to think about it in terms of pressure, a pressure that we are intuitively unaware of. The vacuum does nothing to suck the fluid upward. Rather the atmospheric pressure outside pushes the fluid into the tube. This is how you drink through a straw too – contrary to intuition! Basically, there is no such thing as suction in these examples. Try to imagine that! What we call suction here is really just the creation of a pressure less than atmospheric pressure.
There is such a thing as suction in the sense that it is possible to create negative pressure. This would be tension in a fluid. There is still a pressure force, but it points the other direction, pulling normally on a surface, not pushing. The best familiar example is tall trees. Atmospheric pressure can only lift the sap in trees to a height of 10 m, as we have just seen, but many trees are taller than that. The reason is that there is a mechanism in the crown of the tree (evaporation) for continually removing fluid, like a pump, putting the fluid beneath under enough tension to literally suck the sap up!
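The barometer estimates of this section, and the claim that sap in a tall tree must be under tension, can be reproduced with a short Python sketch. The mercury density of 13,600 kg/m$^3$ and the 30 m tree height are assumed illustrative values, and the function names are invented for this example; sap is treated as water at rest.

```python
def column_height(p_atm, rho, g):
    """Eq (8.19): height of the fluid column that atmospheric pressure supports."""
    return p_atm / (g * rho)


def pressure_at_top(height, rho, g, p_atm):
    """True pressure at the top of a static column, Eq (8.17) with D = -height."""
    return p_atm - rho * g * height


P_ATM, G = 1.0e5, 10.0                    # rounded values from the text
RHO_WATER, RHO_MERCURY = 1.0e3, 13.6e3    # kg/m^3 (mercury value assumed)

print(f"water barometer:   H = {column_height(P_ATM, RHO_WATER, G):.1f} m")
print(f"mercury barometer: H = {column_height(P_ATM, RHO_MERCURY, G):.2f} m")

# A 30 m column of sap (assumed tree height): the pressure at the top is negative.
p_top = pressure_at_top(30.0, RHO_WATER, G, P_ATM)
print(f"pressure at the top of a 30 m column: {p_top:.0f} Pa")
```

With these rounded inputs the mercury column comes out slightly short of the 0.76 m quoted above; less heavily rounded values of $P_{\mathrm{atm}}$ and $g$ close the gap.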

8.9 Bernoulli’s Principle

The Renaissance engineers knew how to build ornamental fountains for their aristocratic patrons. In particular, they knew that the reservoir that feeds the fountain must be as high as you want the fountain to rise. This sounds very much like conservation of energy. If we think of following a small volume $\Delta V$ of water through the system, we can imagine it starting at the top of the reservoir, at height $H$, where it has initial gravitational potential energy $U_g = \rho \Delta V g H$. The water is drawn off through a pipe at the bottom, and the reservoir is fed by a stream at the top, so that it is always full. Our volume $\Delta V$ gradually sinks lower in the reservoir, approaching the pipe, but not moving quickly at all. Thus it loses its potential energy, but it doesn’t gain kinetic energy. Then, as it goes into the pipe, it picks up speed $v$, meaning kinetic energy $K = \frac{1}{2}\rho \Delta V v^2$. If the pipe is horizontal, this speed is just enough to bring it back to height $H$ in the fountain, so it must be that $K$ is just $U_g$, as if energy were conserved.

And yet we know that when the water lost potential energy, it didn’t gain kinetic energy $K$ right away. There was an intermediate time when it was at rest at the bottom of the reservoir. When it finally gained kinetic energy $K$, it was because it moved from the high pressure of the tank to the low pressure of the pipe: the unbalanced pressure accelerated it through the pipe. This tells us how it managed to “remember” its original height: it loses $U_g$ but goes to higher pressure $P$. Then it moves to lower pressure $P$ but gains $K$. It is trading off these quantities among each other, but something is staying the same, so that in the end, when it is all $U_g$ again, it is the same as before, that is, the same height $H$.

The above discussion is not a proof of anything, but rather an example of a situation governed by Bernoulli’s Principle, a rather sophisticated result of the theory of fluid motion. It says that in a steady flow, if there is no appreciable friction, then the quantity

$$P + \rho g h + \frac{1}{2}\rho v^2 = \text{constant} \tag{8.20}$$

along a streamline, that is, it is constant as you follow some $\Delta V$ along, $h$ being its height at any time, and $v$ being its speed. The quantity $\Delta V$ has disappeared from the statement: the three terms each have the dimension of energy density, not energy. If you multiply each term by a volume $\Delta V$, then it has the dimension of energy. It is rather a surprise to see $P$, which we think of as a force per unit area, a kind of surface force density, appearing as an energy per unit volume, a volume energy density, but dimensionally these are the same.

In the example of the fountain, the energy density is initially $\rho g H$. As we follow our $\Delta V$ to the bottom it loses this energy density, but the pressure $P$ increases, and sure enough: at the bottom, $P$ has increased by the hydrostatic value $\rho g H$. Thus Bernoulli’s principle includes the notion of hydrostatic pressure, in case $K = 0$. Then, in our example, $P$ essentially goes to $P_{\mathrm{atm}}$ in the pipe, and the kinetic energy density $\frac{1}{2}\rho v^2$ takes the value $\rho g H$, just the energy density necessary to give $\Delta V$ the energy $\rho \Delta V g H$, the amount it must have to reach a turning point at height $H$. Each term in the statement of Bernoulli’s Principle, Eq (8.20), is the important one at some time in this process. The conserved quantity moves around among all three terms.
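The bookkeeping in the fountain example can be laid out numerically. The Python sketch below evaluates the left-hand side of Eq (8.20) at four stages of the water’s journey; the 5 m reservoir height and the names used are assumptions for illustration, and friction is ignored as the principle requires.

```python
import math


def bernoulli_sum(P, rho, g, h, v):
    """Left-hand side of Eq (8.20): P + rho*g*h + (1/2)*rho*v**2."""
    return P + rho * g * h + 0.5 * rho * v**2


RHO, G, P_ATM = 1000.0, 10.0, 1.0e5   # water, rounded values
H = 5.0                               # assumed reservoir height in m

v_pipe = math.sqrt(2 * G * H)         # speed for which (1/2) rho v^2 = rho g H

stages = {
    "top of reservoir":    bernoulli_sum(P_ATM, RHO, G, H, 0.0),
    "bottom of reservoir": bernoulli_sum(P_ATM + RHO * G * H, RHO, G, 0.0, 0.0),
    "in the pipe":         bernoulli_sum(P_ATM, RHO, G, 0.0, v_pipe),
    "top of fountain":     bernoulli_sum(P_ATM, RHO, G, H, 0.0),
}
for name, value in stages.items():
    print(f"{name:20s} {value:.0f} J/m^3")
```

The total is the same at every stage; only the term that carries the extra $\rho g H$ changes, from height to pressure to speed and back to height.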

8.10 Applications of Bernoulli’s Principle

8.10.1 Force of the wind

How hard does the wind push on you? Enough to lean into? Enough to knock you off your feet? Of course it matters how fast it is blowing, and whether you stand facing it or sideways. We will model this situation by assuming Bernoulli’s Principle applies, since air in motion does not seem to be slowed by friction. We imagine a region $\Delta V$ in the air moving horizontally with speed $v$ until it encounters an obstacle (like you!) and is brought to rest. What does Bernoulli’s Principle say?

Since the height $h$ doesn’t change, the term $\rho g h$ is constant, and can be considered part of the constant on the right hand side of Eq (8.20). Far away from you, $P = P_{\mathrm{atm}}$, and the kinetic energy density is $\frac{1}{2}\rho v^2$. Thus, when the air is brought to rest, the pressure is

$$P = P_{\mathrm{atm}} + \frac{1}{2}\rho v^2 \tag{8.21}$$

and the extra force on you, apart from the force $P_{\mathrm{atm}}A$ which is always there, is

$$F_{\mathrm{wind}} = (P - P_{\mathrm{atm}})A = \frac{1}{2}\rho v^2 A \tag{8.22}$$

To estimate its value we need to know the mass density $\rho$ of the air, and we need to assume a wind speed $v$ and an area $A$ that you present to the wind. Notice $F_{\mathrm{wind}} \propto v^2$, so that when the wind speed doubles, the force quadruples! A very rough estimate for $\rho$ comes from the barometer. We noticed that a column of water 10 meters high weighs the same as a column of atmosphere of the same cross-sectional area $A$. If we take the atmosphere to be 10 km high, then the column of atmosphere has 1000 times more volume, and yet it weighs the same. Therefore its density must be 1000 times less than that of water, 1000 kg/m$^3$, so that $\rho \approx 1$ kg/m$^3$ for air. Despite the crudity of this estimate, the result is about right, and the argument is a kind of mnemonic. The area $A$ you present to the wind might be $A \approx 0.3$ m$^2$, and the wind might be blowing at $v \approx 10$ m/s, a bit more than 20 mph. Then the force of the wind would be $F \approx 15$ N, a bit more than 3 pounds force, not very much, but definitely noticeable. A hurricane force wind of 4 times this speed would exert 16 times more force, more than 50 pounds – you could certainly lean into it.

These rough estimates seem to agree with common experience, and suggest that we really do understand essentially what is going on here: the wind exerts a pressure force on us, because the pressure on the windward side of an obstacle, where the air is brought to rest, is higher. We must also assume that the pressure on our back, the leeward side, is just $P_{\mathrm{atm}}$, because in the shelter of the obstacle there are no streamlines terminating – rather the wind streams past us, leaving a sheltered region roughly in hydrostatic equilibrium. It is then the difference between the pressure on front and back that we experience as the force of the wind, as in Eq (8.22).
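The wind-force estimate is simple enough to tabulate. This Python sketch applies Eq (8.22) with the rough values used above ($\rho \approx 1$ kg/m$^3$, $A \approx 0.3$ m$^2$); the function name and the conversion factor of about 4.45 N per pound are additions for illustration.

```python
def wind_force(rho, v, A):
    """Eq (8.22): extra force on an obstacle of area A that stops air moving at speed v."""
    return 0.5 * rho * v**2 * A


RHO_AIR = 1.0        # kg/m^3, the rough estimate from the text
AREA = 0.3           # m^2
N_PER_POUND = 4.45   # newtons per pound of force (approximate)

for v in (10.0, 20.0, 40.0):
    F = wind_force(RHO_AIR, v, AREA)
    print(f"v = {v:4.0f} m/s: F = {F:5.0f} N = {F / N_PER_POUND:4.1f} lb")
```

Each doubling of the speed multiplies the force by 4, so the 40 m/s row is 16 times the 10 m/s row.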

8.10.2 Flow Past an Airfoil

Fig. 8.12: Steady flow past an airfoil, shown here in cross section, can lead to different speeds above and below, and hence, by Bernoulli’s principle, different pressures. (In the figure the incoming flow has speed $v_0$ and pressure $P_{\mathrm{atm}}$; above the airfoil the values are $v_1$, $P_1$, and below it $v_2$, $P_2$.)

The most spectacular and counterintuitive consequence of Bernoulli’s Principle is the lift on an airplane wing, or airfoil. As sketched in Fig 8.12, flow past the airfoil may result in different speeds above and below. The quantity $P + \frac{1}{2}\rho v^2$ is constant along the flow lines, and has the same value on each flow line, namely $P_{\mathrm{atm}} + \frac{1}{2}\rho v_0^2$, its value before it encounters the airfoil. If $v_1 > v_2$, then necessarily $P_1 < P_2$, and the pressure is greater on the underside of the wing. Thus the pressure force $P_2 A$ up is greater than the pressure force $P_1 A$ down, and the net force due to the air on the wing is up. That is lift!

In Fig 8.12 the wing is imagined to be stationary while the air streams past it to the left, as in a wind tunnel. According to the principle of relativity, however, all that matters here is that the air is moving relative to the wing. It would be the same if the air were stationary and the wing moved to the right, as in flight, the view we would have of this phenomenon if we were to move uniformly to the left. In that view the air on top of the wing might be approximately stationary, but the air under the wing would be to some extent moved along by the wing and compressed, making the pressure higher.

It is amazing that aircraft are able to stay aloft. We should not leave this example without a numerical estimate, just to be sure that we have basically understood it. Suppose that $v_0 \approx 200$ m/s (about 450 mph), and that $v_1 \approx 210$ m/s, $v_2 \approx 190$ m/s, so that the difference in the two speeds is 10% of the average speed. We take $\rho \approx 1$ kg/m$^3$ for air. Then the pressure imbalance is

$$P_2 - P_1 = \frac{1}{2}\rho (v_1^2 - v_2^2) \approx 4000 \text{ N/m}^2 \tag{8.23}$$

The weight of 100 passengers, if each, together with luggage, has a mass 100 kg, is about $10^5$ N (taking $g \approx 10$ m/s$^2$). The wing area $A$ necessary for $(P_2 - P_1)A \approx 10^5$ N is 25 m$^2$. The load would be more than just the passengers, of course, since the plane itself weighs a lot, so let us say the wing area should be several times this. It is still in the realm of plausibility. Jetliner wings are roughly this size. This estimate also lets us know why planes have to fly fast. If $\Delta P \propto v_0^2$, as is suggested here, then cutting airspeed by a factor of 2 cuts lift by a frightening factor of 4.
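The lift estimate can be checked in a few lines of Python, using exactly the rough numbers of the text. The function names are invented for this sketch, and the last line assumes, as the text suggests, that all speeds scale together with the airspeed.

```python
def pressure_difference(rho, v_top, v_bottom):
    """Eq (8.23): P_2 - P_1 = (1/2) rho (v_1^2 - v_2^2)."""
    return 0.5 * rho * (v_top**2 - v_bottom**2)


def wing_area_needed(load, delta_p):
    """Area A for which the pressure imbalance delta_p supports the load."""
    return load / delta_p


RHO_AIR, G = 1.0, 10.0
delta_p = pressure_difference(RHO_AIR, 210.0, 190.0)
load = 100 * 100.0 * G          # 100 passengers at 100 kg each, in N
print(f"pressure imbalance: {delta_p:.0f} N/m^2")
print(f"wing area for the passengers alone: {wing_area_needed(load, delta_p):.0f} m^2")

# Halving every speed cuts the pressure difference, and hence the lift, by 4.
print(pressure_difference(RHO_AIR, 105.0, 95.0) / delta_p)
```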

8.11 Flow in Pipes

Liquids are essentially incompressible in flow (i.e., the density $\rho$ is constant), and this has an interesting consequence for flow in pipes. If we consider any section of pipe with fluid filling it and flowing through it, the amount of fluid entering this section must just equal the amount leaving it, since there is no space for any extra fluid to go. We picture steady flow through a constriction in Fig 8.13, paying special attention to the section between the heavy vertical lines. In a time $\Delta t$ a volume $V$ enters, of length $v_1 \Delta t$ and cross-section $A_1$, i.e.,

$$V = A_1 v_1 \Delta t \tag{8.24}$$

and the same volume $V$, of length $v_2 \Delta t$ and cross-section $A_2$ leaves, where $v_1$ and $v_2$ are the speeds at the entrance and exit. Since these two volumes are equal,

$$A_1 v_1 = A_2 v_2 = I = \text{constant} \tag{8.25}$$

This tells us how speed changes with the cross-section of the pipe. Equivalently

$$v \propto \frac{1}{A} \tag{8.26}$$

Fig. 8.13: Flow in a pipe with changing diameter: the volume entering this segment in a short time $\Delta t$, namely $A_1 v_1 \Delta t$, must equal the volume exiting, $A_2 v_2 \Delta t$.

As the area $A$ goes down, the speed $v$ goes up. The constant of proportionality is $I$. A nozzle like that in Fig 8.13, constricting the cross-sectional area, may be explicitly intended to give a large exit speed. You have probably done this yourself with a garden hose, constricting the exit with your thumb.

The volume $V$ that flows through any cross-section $A$ of the pipe is proportional to $\Delta t$, according to Eq (8.24). The constant of proportionality is $I = Av$, the volumetric flow rate, also called current, the same everywhere along the pipe. (Check that $[I] = [L^3 T^{-1}]$, so that $I$ could have units of liters/second, gallons/hour, etc.) You could collect the fluid coming out of the pipe, and the volume collected would increase with time at the constant rate $I$. Knowing $A$ for the pipe, and measuring volume flow rate $I = Av$ by collecting fluid, you could find the speed $v = I/A$ by dividing.

If we multiply the volume flow rate $I = Av$ by the mass density $\rho$ we have $\rho v A$, the mass current. Multiplying by $\rho$ essentially converts volume to mass. The dimension of mass current is $[M T^{-1}]$, with SI unit kg/s. Its significance is the rate at which mass passes through the area $A$. In particular, it is the rate at which mass accumulates if you collect it at the end of the pipe.
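As an illustration of Eq (8.25), the following Python sketch works through the garden-hose example. The hose dimensions and the speed are assumed values chosen for round numbers, and the function names are invented for this example.

```python
def flow_rate(A, v):
    """Volumetric flow rate (current) I = A v, as in Eq (8.25)."""
    return A * v


def exit_speed(A1, v1, A2):
    """Speed at cross-section A2 given speed v1 at A1, from A1 v1 = A2 v2."""
    return A1 * v1 / A2


# Assumed garden-hose numbers: 2 cm^2 hose, 1 m/s, thumb leaves 0.25 cm^2 open.
A1, v1, A2 = 2.0e-4, 1.0, 0.25e-4     # m^2, m/s, m^2
RHO_WATER = 1000.0                    # kg/m^3

I = flow_rate(A1, v1)
v2 = exit_speed(A1, v1, A2)
print(f"I = {I * 1000:.2f} liters/s, mass current = {RHO_WATER * I:.2f} kg/s")
print(f"exit speed v2 = {v2:.1f} m/s")
```

Reducing the opening to one eighth of the hose area multiplies the speed by 8, while the current $I$ is unchanged.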

8.11.1 Venturi Flow Meter

If friction at the walls and in the interior of the flow is unimportant, the steady flow in a pipe obeys Bernoulli’s Principle. That means we can also say how the pressure changes in the pipe. Taking the pipe horizontal, so that the height $h$ is a constant, we have

$$P_1 + \frac{1}{2}\rho v_1^2 = P_2 + \frac{1}{2}\rho v_2^2 \tag{8.27}$$

in Fig 8.13. One could measure the pressure difference in the tube

$$\Delta P = P_2 - P_1 = \frac{1}{2}\rho (v_1^2 - v_2^2) = \frac{1}{2}\rho v_1^2 \left( 1 - \frac{A_1^2}{A_2^2} \right) \tag{8.28}$$

using $v_2 = A_1 v_1 / A_2$ from Eq (8.25). Since $\Delta P \propto v_1^2$, with a known constant of proportionality, this measurement of $\Delta P$ actually measures $v_1$, the flow speed into the constriction. Such a device is called a Venturi flow meter. It could be used to measure wind speed, for example, a case where the volumetric method would not be very appropriate.
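A Venturi meter reading can be inverted for the flow speed in a couple of lines. The Python sketch below combines Bernoulli’s Principle for a horizontal pipe with $v_2 = A_1 v_1 / A_2$ from Eq (8.25), so that $P_2 - P_1 = \frac{1}{2}\rho v_1^2 (1 - A_1^2/A_2^2)$. The areas and the 10 m/s test speed are assumptions, and the function names are invented for this example.

```python
import math


def venturi_delta_p(rho, v1, A1, A2):
    """P_2 - P_1 for inlet speed v1 at area A1, with the constriction at area A2."""
    return 0.5 * rho * v1**2 * (1.0 - (A1 / A2) ** 2)


def venturi_speed(rho, delta_p, A1, A2):
    """Invert the relation above to recover v1 from a measured P_2 - P_1."""
    return math.sqrt(2.0 * delta_p / (rho * (1.0 - (A1 / A2) ** 2)))


RHO_AIR = 1.0                 # kg/m^3, rough value for air
A1, A2 = 4.0e-4, 1.0e-4       # assumed areas in m^2 (A2 is the constriction)

delta_p = venturi_delta_p(RHO_AIR, 10.0, A1, A2)
print(f"P2 - P1 = {delta_p:.0f} N/m^2")        # negative: pressure drops in the throat
print(f"recovered v1 = {venturi_speed(RHO_AIR, delta_p, A1, A2):.2f} m/s")
```

Because $A_1 > A_2$, both the measured $\Delta P$ and the factor $1 - A_1^2/A_2^2$ are negative, so their ratio under the square root is positive.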

8.11.2 Poiseuille Flow

The flow speed $v = I/A$, found from the current $I$, should be understood as a kind of average $v$. In Fig 8.13 we drew the fluid motion as if $v$ were constant over the cross-section $A$. Such a flow is called “plug flow”, because the fluid moves like a solid plug in sections where $A$ is constant. But it may very well happen that the flow is faster in the center of the pipe and slower near the pipe wall. In that case $v = I/A$ is really an average speed. The flow rate $I$ says nothing about how the flow is distributed across the cross-section.

Fig 8.14 shows a pattern of flow in a pipe called Poiseuille flow. The flow speed is zero on the wall of the pipe and has a maximum in the middle. In this case it turns out that the volumetric method gives $v = I/A = v_{\max}/2$. Like all averages, $v$ is somewhere between the extreme values, 0 and $v_{\max}$, but it is just a coincidence that $v$ is exactly the arithmetic mean of the two. If you measured $v$ by the volumetric method, you might be surprised to find that some impurity introduced into the pipe actually appears at the other end in only half the time you expected.
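The result $I/A = v_{\max}/2$ can be verified numerically by adding up the flow through thin rings. The Python sketch below assumes the standard parabolic profile for a circular pipe of radius $R$, $v(r) = v_{\max}(1 - r^2/R^2)$, which is zero at the wall and largest on the axis; the radius, the centerline speed, and the function names are illustrative assumptions.

```python
import math


def poiseuille_speed(r, R, v_max):
    """Parabolic profile: v_max on the axis (r = 0), zero at the wall (r = R)."""
    return v_max * (1.0 - (r / R) ** 2)


def flow_rate(R, v_max, n=10_000):
    """Sum v dA over thin rings of radius r and width dr (midpoint rule)."""
    dr = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += poiseuille_speed(r, R, v_max) * 2.0 * math.pi * r * dr
    return total


R, v_max = 0.01, 2.0              # assumed pipe radius (m) and centerline speed (m/s)
A = math.pi * R**2
I = flow_rate(R, v_max)
print(f"average speed I/A = {I / A:.4f} m/s, v_max/2 = {v_max / 2:.4f} m/s")
```

The outer rings carry slow fluid but have large area, which is why the average comes out as low as half the centerline speed.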

8.11.3 Current Density

Fig. 8.14: Poiseuille flow has a parabolic flow profile, that is, the volume flowing through the cross-section $A$ in time $\Delta t$ is bounded by a parabola. The volumetric flow rate for this flow turns out to be $I = A v_{\max}/2$.

We began this section with a method for finding the velocity of the flow $v$ from the volume current $I$, namely $v = I/A$. Now we turn that around. We think of $v$ as a kind of density of current, that must be multiplied by a geometrical factor $A$ (the area of the pipe) to be a current $I = vA$. In the section above, on Poiseuille flow, we even saw that the current density $v$ varies over the cross-sectional area $A$, so some parts of $A$ have more current going through than other parts. The center has the most current going through, in the sense that if we took a little area $\Delta A$ located on the centerline of the pipe, it would have more current $v \Delta A$ through it than the same $\Delta A$ would have if it were located near the wall of the pipe (where $v$ is less). The dimension of volume current density is volume per unit time per unit area. That turns out to be just $[L/T]$: velocity!

This notion of current density is surprisingly useful and important. We will run into it wherever there is some current spread out over an area. Solar energy is best described as an energy current density, for example, because there energy is arriving at some rate from the Sun, but it is spread out over the area that collects it. Electric current in a wire can be described just like volume current in a pipe: the electric current density might look like Poiseuille flow, for example, highest in the middle. Or maybe electric current is like plug flow. Or maybe it is completely different, with most of the current density on the surface of the wire and not much in the middle – these would be different current densities, even with the same current. How the current is actually distributed is, of course, an experimental question.

Fig. 8.15: Flow over a plane solid surface is a simple example of shear flow. The volume that flows through the area of height $D$ in time $\Delta t$ is shown in blue. The rate of shear strain is $\dot{\gamma} = v_{\max}/D$. The flow could be created by a horizontal surface contacting the fluid from above, moving at speed $v_{\max}$, dragging the top layer of fluid along with it.

8.12 Shear Stress and Viscosity

In the Poiseuille flow of Fig 8.14 layers of fluid slide on other layers, since they are not all moving together at the same speed. This sliding of fluid layers on each other is called shear flow. The simplest geometry for shear flow is shown in Fig 8.15, where horizontal layers slide on each other. In this example the speed of flow (in the $x$ direction) is proportional to the height $y$ above the solid floor,

$$v_x \propto y \tag{8.29}$$

The subscript $x$ on the speed $v$ indicates that it is in the $x$ direction. The constant of proportionality is called the rate of shear strain, or more informally shear rate, often denoted $\dot{\gamma}$ (‘gamma dot’). Thus

$$v_x = \dot{\gamma}\, y \tag{8.30}$$

Note the dimension $[\dot{\gamma}] = [T^{-1}]$, like an angular speed. This quantity is a measure of how fast the fluid is being sheared, that is, how fast it is being distorted. As we shall see, fluids resist being sheared fast: the faster they are sheared, the more they resist. If you do it slowly, they resist less. In the limit as you do it very slowly, they don’t resist at all, and in this respect are different from solids, which resist even static distortion (Hooke’s Law).

The shear flow in Fig 8.15 might be a model for the flow at the bottom of a river. The fluid doesn’t move on the river bed, but as you go up, into the river itself, the fluid speed increases, shearing the fluid. In this situation the fluid exerts a force on the river bed in the $x$ direction, as if it were tending to drag the solid surface along. By Newton’s Third Law, the river bed must exert a force on the fluid to hold it back (this is the “resistance” of the fluid to shearing, alluded to above). The drag force on the river bed is best described by a stress, i.e., a force per unit area, because it is distributed over the surface: every little area on the bottom feels a drag force proportional to its area. Similarly, if there is a solid surface on the top, dragging the fluid along at speed $v_{\max}$, it exerts a force on the fluid in the $x$ direction and the fluid exerts a force on the surface in the $-x$ direction. This shear stress is like pressure, in that it is a stress, but pressure is a normal stress, and shear stress is a tangential stress. The corresponding force is tangential to the surface, not normal to it. Recall that the SI unit of stress is the Pascal or $\mathrm{N/m^2}$, force per unit area. Multiplying by an area in $\mathrm{m^2}$, you get a force in Newtons.

This tangential shear stress is frequently given the name $\sigma_{xy}$ (‘sigma’). For normal fluids the shear stress is proportional to the rate of shear strain $\dot{\gamma}$, i.e.,

$$\sigma_{xy} \propto \dot{\gamma} \tag{8.31}$$

This makes precise what we meant by the fluid resisting more if it is sheared faster. At higher shear rate, the stress is proportionately higher. The constant of proportionality is called viscosity, given the symbol $\eta$ (‘eta’). Thus

$$\sigma_{xy} = \eta\,\dot{\gamma} \tag{8.32}$$

Viscosity is a material property of the fluid. It determines how much tangential force the fluid exerts on a surface in a given shear flow, or, by Newton’s Third Law, how hard you have to push a fluid tangentially to shear it at a given rate. Note the dimension of viscosity: $[\eta] = [ML^{-1}T^{-1}]$. The SI unit of viscosity is the kg/m·s or Pa·s.

Let us do a numerical example to make this idea concrete. The viscosity of water is about $\eta \approx 10^{-3}$ Pa·s. This rather small value suggests that water does not exert much tangential force in shear flow at our human scale, and that turns out to be true. In fact water is a good lubricant, and a smooth wet floor can be dangerously slippery. We will do a rough estimate of the force on a shoe moving along the floor at $v_{\max} = 1$ m/s in case there is a thin water layer. Suppose the layer is $D = 10^{-3}$ m thick (1 mm). Then, referring to Fig 8.15, and imagining the shoe sole as the solid surface at the top of the fluid region, separated from the floor below by the fluid layer, we see that $\dot{\gamma} = v_{\max}/D \approx 10^{3}\ \mathrm{s^{-1}}$. Multiplying $\dot{\gamma}$ by $\eta$ for water, we find the tangential shear stress $\sigma \approx 1$ Pa, according to Eq (8.32). On a shoe of area $A \approx 300\ \mathrm{cm^2} \approx 3 \times 10^{-2}\ \mathrm{m^2}$, we have a force of only $\sigma A \approx 3 \times 10^{-2}$ N. It is no wonder the floor is slippery! We are accustomed to push tangentially with a force of many pounds against the floor when we walk. If there is a water layer, a much smaller force creates a shear flow like the one in Fig 8.15, and we slip. If the water layer is thinner than we estimated, the rate of shear strain would be proportionately larger. If for example it were only $10^{-6}$ m thick (1 micron), then the force we estimated would be 1000 times larger, 30 N – that’s still only a few pounds.
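The wet-floor estimate is easy to redo for other layer thicknesses. The following is a minimal sketch of Eq (8.32) applied to the numbers above; the function and variable names are hypothetical, chosen only for this illustration.

```python
def shear_force(eta, speed, gap, area):
    """Tangential force from Eq (8.32): stress = eta * (speed / gap), times area."""
    shear_rate = speed / gap     # rate of shear strain, in 1/s
    stress = eta * shear_rate    # shear stress, in Pa
    return stress * area         # force, in N

ETA_WATER = 1e-3   # Pa*s, viscosity of water used in the text
SHOE_AREA = 3e-2   # m^2, the 300 cm^2 shoe sole

force_1mm = shear_force(ETA_WATER, speed=1.0, gap=1e-3, area=SHOE_AREA)
force_1um = shear_force(ETA_WATER, speed=1.0, gap=1e-6, area=SHOE_AREA)
print(force_1mm, force_1um)  # about 0.03 N and about 30 N
```

Because the force goes as $1/D$, each factor of ten in layer thickness changes the force by a factor of ten.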

A similar dangerous effect is ‘hydroplaning’ in automobiles: if you drive at high speed into a layer of water on the highway, the tires may lose contact with the road surface, and the customary friction with the roadway is replaced by the shear stress of the water, which is essentially zero. Tires have tread patterns mainly to give water a channel to get out from under the tire.

The lesson of this computation is that friction in water, at least at our length scale, is really quite small. It is tempting to ascribe the resistance of the water that we feel in swimming, for example, to friction, and hence to viscosity, but that is not really where it comes from. That force of resistance is almost entirely due to pressure (normal stress) not shear stress (tangential stress). Bernoulli’s Principle gives a better way to think about it, as in the problem of the force of the wind in Section 8.10.1. Replace the wind by water.

8.13 Stokes Flow

At the small length scale of one-celled organisms, viscous resistance in swimming is dominant. If we are swimming and take a stroke or give a kick, we can coast along through the water, especially if we hold a streamlined position. But if a paramecium stops moving its cilia it comes to a dead stop immediately. It does not coast at all. Streamlining would be no advantage to a paramecium or any other small creature.

The viscous drag force $F$ on a small sphere of radius $R$ moving at speed $v$ through a fluid of viscosity $\eta$ was calculated in the mid 19th century by George Stokes, who found

$$F = 6\pi\eta R v \tag{8.33}$$

in the direction opposite to $v$ (i.e., tending to slow the sphere down). This is the force which in fact does slow a small sphere down in extremely short time. If, however, there are other forces on the sphere, like gravity and the buoyant force, then they will balance the Stokes force and the sphere will move with constant speed $v$, proportional to $1/\eta$. This suggests a way to measure viscosity: observe the slow sinking through a fluid of a small sphere of known material and size. The speed indirectly tells you the viscosity: the smaller the speed, the greater the viscosity.

The viscous force on a small object moving through a fluid is surprisingly insensitive to shape. A flat disk of radius $R$ is quite a different shape from a round sphere, but the force on it is about the same as the force on the sphere – and it doesn’t matter much whether it is moving face first or edge first! All that matters is its typical linear dimension, the radius $R$ in both these examples.
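The viscosity measurement suggested above can be sketched in a few lines. The balance used here, weight minus buoyant force equals the Stokes force of Eq (8.33), is the one described in the text; the sphere, the fluid density, and the "measured" sinking speed are hypothetical numbers chosen for illustration, and the function names are not from the text.

```python
import math

G = 9.8  # m/s^2

def net_weight(radius, rho_sphere, rho_fluid, g=G):
    """Weight minus buoyant force on a sphere, in N."""
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return volume * (rho_sphere - rho_fluid) * g

def terminal_speed(eta, radius, rho_sphere, rho_fluid):
    """Speed at which the Stokes force 6*pi*eta*R*v balances the net weight."""
    return net_weight(radius, rho_sphere, rho_fluid) / (6.0 * math.pi * eta * radius)

def viscosity_from_speed(speed, radius, rho_sphere, rho_fluid):
    """Invert the same balance: infer eta from an observed sinking speed."""
    return net_weight(radius, rho_sphere, rho_fluid) / (6.0 * math.pi * radius * speed)

# Hypothetical measurement: a 1 mm radius steel ball (7800 kg/m^3) sinking at
# 1 cm/s through a thick liquid of density 1260 kg/m^3.
eta = viscosity_from_speed(speed=0.01, radius=1e-3, rho_sphere=7800.0, rho_fluid=1260.0)
print(eta)  # roughly 1.4 Pa*s, over a thousand times the viscosity of water
```

Halving the observed speed would double the inferred viscosity, which is the $v \propto 1/\eta$ statement above.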

8.14 Poiseuille Flow Revisited

Bernoulli’s Principle does not apply to Poiseuille flow. Since Poiseuille flow is a shear flow, there must be shear stress, which is a kind of friction. In the steady flow of Fig 8.14 the cross-section $A$ is constant, so also $v$ is constant, but contrary to what Bernoulli’s Principle would predict, $P$ is not constant. Rather there is a pressure difference between one end of the pipe and the other, and this is what pushes the fluid through the pipe, in spite of friction. (Without friction, the fluid would just coast through at constant $v$, and the pressure $P$ would be constant. Bernoulli’s Principle describes only this frictionless case.)

Flow in a large pipe, at moderate speed, like the flows in household plumbing, obeys Bernoulli’s Principle to good approximation. We don’t need pumps to push water through our horizontal pipes. It pretty much coasts through. Because the viscosity of water is so small, the friction is not very important. Also, in our larger arteries, the pressure of the blood is related to the pressure at the heart by Bernoulli’s Principle. That is why it makes sense to measure it in the arm. In large blood vessels, friction is unimportant. In a narrow pipe at low speed, however, like capillary flow in the cardiovascular system, Bernoulli’s Principle certainly does not hold, even approximately, and an appreciable pressure drop is necessary to push the flow through the capillaries, just the pressure difference between the arterial side and the venous side of our circulatory system.

So when does Bernoulli’s Principle apply? And when is friction important? Apparently we cannot simply say that water has small viscosity and forget about it at this smaller scale. We can understand this situation with the help of dimensional analysis. A key quantity to think about is the pressure drop $\Delta P$ from one end of a pipe to the other in Poiseuille flow, pushing a steady current $I$ through the pipe.

Let the pipe have length $\ell$ and radius $R$. How does $\Delta P$ depend on these quantities? Well, in terms of dimensions, $\Delta P$ is a stress, i.e., $[\Delta P] = [ML^{-1}T^{-2}]$. It seems reasonable that $\Delta P \propto \ell$, because if we follow the pipe with another identical one, making a pipe twice as long, we would need the same pressure drop again to drive $I$ through the second pipe. That is, if $\ell$ became twice as long, $\Delta P$ would be twice as big too. Also $\Delta P \propto I$, because higher pressure would push more current through (by making the fluid flow faster). The dimension of current is $[I] = [L^{3}T^{-1}]$. In order to get the dimension $[M]$, we must also introduce the viscosity $[\eta] = [ML^{-1}T^{-1}]$, but that makes sense, because if the viscosity were higher, like in thick oil or syrup, we would need more pressure to create the same current $I$. Then in order for the dimensions to agree we must have, for some dimensionless constant $C$,

$$\Delta P = C\left(\frac{I\eta\ell}{R^4}\right) \tag{8.34}$$

The factor $R^4$ is dimensionally necessary, $R$ being the only other length in the problem. It also makes sense: if the pipe is wider, the pressure to drive current $I$ would be less, dramatically less, in fact. If the pipe is twice as wide, $\Delta P$ is less by a factor 16. For Poiseuille flow, the constant $C$ turns out to be $8/\pi \approx 2.5$, but the actual value is not so important. We just note that it is “of order 1.”

By this dimensional argument we know something about $\Delta P$. Now is $\Delta P$ small or not? Remember that Bernoulli’s Principle predicts $\Delta P = 0$ for steady flow in a horizontal pipe, since it says

$$P + \frac{1}{2}\rho v^2 = \text{constant} \tag{8.35}$$

and the second term is constant. Bernoulli’s Principle is approximately true if the change in $P$, that is, $\Delta P$, is much less than the (truly constant) second term $(1/2)\rho v^2$. We therefore find the ratio and ask if it is much less than 1:

$$\frac{\Delta P}{\frac{1}{2}\rho v^2} = \frac{2C\pi\eta\ell}{\rho R^2 v} \tag{8.36}$$

Here we used $\Delta P$ from Eq (8.34) and $I = \pi R^2 v$ to express the current $I$ in terms of speed $v$. Is this ratio small? We notice that the length $\ell$ of the pipe is in the numerator. This means that for a long enough pipe, the ratio is as large as we want, and Bernoulli’s Principle certainly fails. That is, friction is important in a long enough pipe, but that is just common sense. Of course the fluid eventually loses energy in that case. But let us take a short pipe, with length $\ell$ some small multiple of $R$, say $\ell = cR$, with $c$ a pure number of order 1. Then, ignoring all the dimensionless constants, and just keeping the dimensional factors,

$$\frac{\Delta P}{\frac{1}{2}\rho v^2} \sim \frac{\eta}{\rho R v} \tag{8.37}$$

Is this small? We can estimate it for flow in the aorta, where we have argued Bernoulli’s Principle holds. The viscosity of blood is perhaps 5 times that of water, so $5 \times 10^{-3}$ Pa·s. Also $\rho$ for blood is essentially the same as water, $10^{3}\ \mathrm{kg/m^3}$, $R \approx 10^{-2}$ m, and $v \approx 1$ m/s. The right side of Eq (8.37) is then the dimensionless number $5 \times 10^{-4}$. This is small – and we see why Bernoulli’s Principle holds to good approximation.
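It may be reassuring to see that restoring the dimensionless constants dropped in Eq (8.37) does not change the conclusion for the aorta. The sketch below evaluates the full ratio of Eq (8.36) with the Poiseuille value $C = 8/\pi$; the choice $\ell = R$ (that is, $c = 1$) and the function name are assumptions made only for this illustration.

```python
import math

def pressure_drop_ratio(eta, rho, radius, speed, length, C=8.0 / math.pi):
    """Eq (8.36): Delta P / (rho v^2 / 2) = 2 C pi eta l / (rho R^2 v)."""
    return 2.0 * C * math.pi * eta * length / (rho * radius ** 2 * speed)

# Aorta values from the text, with the assumed pipe length l = R.
aorta_ratio = pressure_drop_ratio(eta=5e-3, rho=1e3, radius=1e-2, speed=1.0, length=1e-2)
print(aorta_ratio)  # about 8e-3, still much less than 1
```

With $C = 8/\pi$ the prefactor $2C\pi$ is 16, so the full ratio is 16 times the estimate of Eq (8.37) when $\ell = R$, and it grows in proportion to the length of the pipe.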

On the other hand, we can do the same estimate for flow in a capillary. Then $R \approx 5 \times 10^{-6}$ m and $v \approx 10^{-3}$ m/s, and we find the right side of Eq (8.37) is the dimensionless number 1000, very large compared to 1. Bernoulli’s Principle doesn’t hold here!

8.15 The Reynolds Number

The number we were computing at the end of the last section is the reciprocal of a very famous quantity, the Reynolds number. We define it here,

$$Re = \frac{\rho R v}{\eta} \tag{8.38}$$

where $\eta$ and $\rho$ are the viscosity and mass density of a fluid, $R$ is a typical length characterizing the flow, like the radius of the pipe, and $v$ is a typical speed in the flow. Taking reciprocals of the quantities we computed in the last section, we note that $Re \approx 2000$ in the aorta and $Re \approx 0.001$ in the capillaries. If $Re \gg 1$, then we expect Bernoulli’s Principle to hold, and friction to be unimportant. On the other hand, if $Re \ll 1$, then friction dominates the flow. This just restates the result of the last section in terms of $Re$ instead of $1/Re$.

Now it is easy to see why the viscosity of water is not very important at the human scale of $R \approx 1$ m and $v \approx 1$ m/s: we find $Re \approx 10^{6}$, which is much greater than 1. On the other hand, for single celled organisms, with size perhaps $10^{-5}$ m, accustomed to move at $10^{-5}$ m/s, we have $Re \approx 10^{-4}$. This is much less than 1, so viscosity dominates their world. In fact, whenever you look at things moving in a microscope, you are looking at a situation dominated by viscosity, with small Reynolds number.

The Reynolds number helps us decide for a given flow whether viscosity is important or not. If $Re$ is less than one, viscosity is very important. If $Re$ is greater than 100, viscosity is not important – ah, if only life were that simple! As $Re$ gets larger, viscosity should become less and less important, and Bernoulli’s Principle should become more and more accurate. But as Osborne Reynolds showed in the 1880s, in careful experiments involving pipe flow, something amazing happens when $Re$ becomes larger than 2000 or so – the flow suddenly makes a transition to a new kind of flow, turbulence.

This transition to turbulence happens in a pipe at high enough $Re$, which means large enough radius $R$, large enough speed $v$, small enough viscosity $\eta$, or any combination of these that makes $Re$ sufficiently large. In turbulent flows, Bernoulli’s Principle fails for a new reason: the new complicated flow includes shear flow at all length scales, including the very small scales where viscosity is important. That is, the radius $R$ of the pipe is not the only thing setting the length scale. The flow itself can set a new length scale!
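The four Reynolds numbers quoted in this section can be reproduced directly from Eq (8.38). This is a minimal sketch using the values given in the text; the dictionary layout and the labels are illustrative only.

```python
def reynolds(rho, length, speed, eta):
    """Reynolds number of Eq (8.38): Re = rho * R * v / eta."""
    return rho * length * speed / eta

RHO = 1e3  # kg/m^3, used for both water and blood in the text

cases = {
    "aorta":       reynolds(RHO, 1e-2, 1.0,  5e-3),   # blood, R = 1 cm, v = 1 m/s
    "capillary":   reynolds(RHO, 5e-6, 1e-3, 5e-3),   # blood, R = 5 microns, v = 1 mm/s
    "human scale": reynolds(RHO, 1.0,  1.0,  1e-3),   # water, R = 1 m, v = 1 m/s
    "single cell": reynolds(RHO, 1e-5, 1e-5, 1e-3),   # water, 10 microns at 10 microns/s
}

for name, re in cases.items():
    regime = "viscosity dominates" if re < 1 else "viscosity is a small effect"
    print(f"{name:12s} Re = {re:.0e}  ({regime})")
```

The two-way classification printed here is only the crude rule of thumb from the text; it says nothing about turbulence, which sets in at large $Re$.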

We still have the proportionality relationship

$$\Delta P \propto I \tag{8.39}$$

between pressure drop $\Delta P$ and current $I$ in the pipe, as in Eq (8.34), argued there by dimensional analysis, but now the pure number $C$ in that relationship need not be “of order 1”. It might involve the large, dimensionless Reynolds number. We can still write this proportionality relationship as

$$\Delta P = rI \tag{8.40}$$

but here $r$ is a resistance that we cannot actually calculate, or even estimate, from first principles. It is called a resistance, because for fixed pressure drop $\Delta P$, larger $r$ means smaller $I$ (their product is fixed). In turbulence $r$ is unexpectedly large, and this represents a departure from Bernoulli’s Principle (which, you recall, says $\Delta P = 0$ for horizontal flow through a cylindrical pipe, i.e., $r = 0$).

Turbulent flow is not well understood theoretically, even though most familiar flows are turbulent. This means we are surrounded by fluid flow phenomena that we don’t fully understand. Just to give one example, the resistance $r$ of a pipe to turbulent flow depends in a mysterious way on the roughness of the interior wall, something that is completely irrelevant in Poiseuille flow. In turbulent flow the motion is chaotic, even if there is some overall average speed of the fluid as a whole. Within the turbulent flow the whole fluid is rapidly mixed, and there is no such thing as a smooth flow line. The average flow of a turbulent fluid down a pipe looks like plug flow, but within this average flow the fluid is doing incredibly complicated things. In the flow past an airfoil in Fig 8.12, the flow lines behind the airfoil would be impossible to draw. (Leonardo da Vinci attempted careful drawings of turbulent flow in rivers: it is interesting to see how he visualized it, very swirly!)

There is no fully reliable mathematical model of turbulent flow, so this represents a problem physics has identified, but not yet solved. In the absence of a good mathematical theory, designers and engineers must build expensive prototypes and test them in real flows, wind tunnels, etc. One extremely useful consequence of the theory as we have given it is that you don’t have to use full-scale prototypes for such tests. You can make little scale models and adjust $v$, $\eta$ and $\rho$ to make the Reynolds number the one you are interested in. The resulting scaled flow should model the real flow.

One practical consequence of turbulence, as we have already noted, is increased resistance $r$ to flow in pipes or channels, as in Eq (8.40). Flow in fire hoses is turbulent, and the pressure drop down the hose due to turbulence means that the water that emerges out the nozzle does not have all the kinetic energy $K$ that it could, and hence it cannot go as high as it should. For unknown reasons, if a small amount of polymer is dissolved in the water, the turbulence is inhibited, and the turbulent resistance $r$ is less, meaning the water can now reach higher, say to the sixth floor of a burning building. This example hints at how useful it would be to understand turbulence better.

Another practical consequence of turbulence is mixing. In industrial processes one frequently wants to mix two solutions. This happens very fast if the mixture is made turbulent. Nowadays there is a lot of work on microfluidic devices, essentially networks of very small pipes, with perhaps some moving parts, maybe etched on a silicon chip. One problem in these devices is getting two fluids to mix: being small, these devices operate at low Reynolds number, so the flows are never turbulent. In the absence of turbulence, our usual method for mixing things doesn’t work.

8.16 Resistance in Series and Parallel

The relationship in Eq (8.40), expressing the proportionality between pressure difference and volume current in a single pipe, can be extended to networks of pipes. In Fig 8.16 we imagine two pipes in parallel, with resistances $r_1$ and $r_2$, connecting fluid at pressure $P_2$ to pressure $P_1$. Because of the pressure difference $\Delta P = P_2 - P_1$, there will be a flow in each pipe, and the total current flowing will be

$$I = I_1 + I_2 = \frac{\Delta P}{r_1} + \frac{\Delta P}{r_2} = \Delta P\left(\frac{1}{r_1} + \frac{1}{r_2}\right) \tag{8.41}$$

Since for the system as a whole we would define the resistance $r$ by $\Delta P = rI$, the total resistance $r$ is given by

$$\frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2} \tag{8.42}$$

A simpler way to say this is to define the conductance $g = 1/r$ where $I = g\,\Delta P$, and to note that in parallel the conductances add, since each pipe conducts current.

Fig. 8.16: Two pipes in parallel, with resistances $r_1$ and $r_2$, connect a reservoir at pressure $P_2$ with a reservoir at pressure $P_1$. The total current $I$ through the system is shared between the two pipes. Since the fluid is incompressible, $I = I_1 + I_2$.

When one pipe follows another, the pipes are said to be in series, as in Fig 8.17. Taking the resistances of the individual pipes to be $r_1$ and $r_2$, we see that the pressure drop in the first pipe, $r_1 I$, plus the pressure drop in the second pipe, $r_2 I$, must be the total pressure drop $\Delta P = P_2 - P_1$. Thus

$$\Delta P = I(r_1 + r_2) \tag{8.43}$$

so that the resistance $r$ of the series combination of pipes is just

$$r = r_1 + r_2 \tag{8.44}$$

Thus in series, the resistances add.

Fig. 8.17: Two pipes in series, with resistances $r_1$ and $r_2$, connect reservoirs at pressures $P_2$ and $P_1$. Each pipe carries current $I$. The pressure drops continuously as you move from left to right through the pipe, and in particular the junction between the two pipes (dotted line) is at some intermediate pressure.

When we study electrical current we will see exactly these same relationships. The role of pressure difference will be taken over by voltage difference (potential difference), and the role of volume current will be taken over by electric current. The role of pipes (with their resistance) will be taken over by resistors.
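The two combination rules, Eq (8.42) for parallel pipes and Eq (8.44) for pipes in series, fit in a few lines of code. This is a minimal sketch; the helper names and the sample resistances (in arbitrary units) are hypothetical.

```python
def series(*resistances):
    """Eq (8.44): resistances in series add."""
    return sum(resistances)

def parallel(*resistances):
    """Eq (8.42): in parallel the conductances 1/r add."""
    return 1.0 / sum(1.0 / r for r in resistances)

r_series = series(1.0, 2.0)        # 3.0
r_parallel = parallel(3.0, 6.0)    # 2.0, smaller than either pipe alone

# N identical pipes in parallel have resistance r / N.
r_bundle = parallel(*[10.0] * 5)   # 2.0
print(r_series, r_parallel, r_bundle)
```

The last line illustrates a special case of Eq (8.42): a bundle of $N$ identical pipes in parallel has resistance $r/N$.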

8.17 The Human Circulatory System

Blood flows through arteries and veins the way water flows through pipes. Physics ought to have something to say about this. The branching network of the arteries is like a network of resistances. Physiologists have distinguished various levels in this branching scheme, beginning with the aorta, the single large artery from the left ventricle of the heart, then the large arteries, the small arteries, the arterioles, and the capillaries. The walls of the arteries include smooth muscle. The arterioles, in particular, are under involuntary control of the nervous system, and can open up or constrict. Flow in the arterioles is Poiseuille flow, so the resistance depends sensitively on the radius $R$ of the arteriole (like $1/R^4$). Thus the body can selectively control resistance.

From the capillaries blood flows into venules (i.e., small veins), then veins, and back to the right atrium of the heart. The veins are more passive, and many contain valves to keep blood from flowing backward. The volume of the venous side is greater than that of the arterial side. In fact, almost 80% of the blood at any moment is in the veins! The output of the right ventricle into the pulmonary artery circulates blood to the lungs. Again there is a branched network down to the pulmonary capillaries, where the blood is oxygenated before going back to the left atrium of the heart.

One remarkable observation follows from the incompressibility of the blood. The volume flow into any part of the system must, on average, equal the volume flow out, as in Eq (8.25). Thus the flow into the systemic arteries from the heart must equal the flow out through the capillaries. The heart pumps about $I = 6$ liters/minute of blood, i.e., $I = 10^{-4}\ \mathrm{m^3/s}$, into the aorta, with radius $r_0 = 10^{-2}$ m, and hence area $A_0 = 3 \times 10^{-4}\ \mathrm{m^2}$. Thus the average velocity of flow in the aorta is $u_0 = I/A_0 = 0.3$ m/s. On the other hand, the velocity of flow in the capillaries is $u_c \approx 10^{-3}$ m/s, and the radius of a capillary is about $r_c \approx 3 \times 10^{-6}$ m, and hence area $A_c = 3 \times 10^{-11}\ \mathrm{m^2}$. The total area of the capillaries is $A = N_c A_c$, where $N_c$ is the number of capillaries in the body. We can compute $N_c$, because $N_c A_c u_c = I = 10^{-4}\ \mathrm{m^3/s}$ implies

$$N_c \approx \frac{10^{-4}\ \mathrm{m^3/s}}{(10^{-3}\ \mathrm{m/s})(3 \times 10^{-11}\ \mathrm{m^2})} \approx 3 \times 10^{9}\ \text{capillaries}, \tag{8.45}$$

a number that would be hard to obtain any other way.
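The chain of estimates leading to Eq (8.45) can be replayed without rounding at each step. The sketch below uses the input values quoted in the text; the variable names are illustrative, and because the areas are computed as $\pi r^2$ rather than rounded, the results differ slightly from the rounded figures above.

```python
import math

I = 6e-3 / 60.0                 # heart output: 6 liters/minute, in m^3/s

r_aorta = 1e-2                  # m
A_aorta = math.pi * r_aorta ** 2
u_aorta = I / A_aorta           # average speed in the aorta, m/s

r_cap = 3e-6                    # m, radius of one capillary
u_cap = 1e-3                    # m/s, flow speed in a capillary
A_cap = math.pi * r_cap ** 2

# Incompressibility: N_cap * A_cap * u_cap must equal I.
N_cap = I / (A_cap * u_cap)

print(f"u_aorta = {u_aorta:.2f} m/s, N_cap = {N_cap:.1e}")
```

The capillary count is sensitive to the assumed capillary radius, since it scales as $1/r_c^2$.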

When you have your blood pressure taken, the two numbers, say 130/80, are the systolic and diastolic pressures, measured in mm of Hg (gauge pressures). The beating heart creates pulses of higher pressure (here 130) on top of a resting pressure (here 80). On the venous side the pressure is not much more than atmospheric pressure (gauge pressure 0). Thus the pressure drop from the heart through the capillaries – the pressure that drives the flow – is about $\Delta P = 100$ mm Hg $\approx 1.3 \times 10^{4}$ Pa. The measurement of blood pressure reminds us that Bernoulli’s principle, $P + \rho g h + \frac{1}{2}\rho u^2 = \text{constant}$, holds through the large arteries. Furthermore, as we will see, the average blood speed $u$ stays quite constant through this part of the system. Therefore, if you measure the pressure $P$ in a large artery at the height of the heart (call this $h = 0$), then you are essentially measuring pressure $P$ at the heart. At a different height $h$, the measurement would differ from the value at the heart by the hydrostatic pressure $\rho g h$. Clinicians usually measure blood pressure in the upper arm, at mid-chest height.

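Knowing $\Delta P$ and $I$ for the flow, we can find the resistance $R_A$ for the arterial system,

$$R_A = \frac{\Delta P}{I} \approx \frac{1.3 \times 10^{4}\ \mathrm{Pa}}{10^{-4}\ \mathrm{m^3/s}} \approx 1.3 \times 10^{8}\ \mathrm{Pa \cdot s/m^3} \tag{8.46}$$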

Is this resistance perhaps due to the capillaries, which are so narrow? The resistance of a single capillary can be computed from what we know of Poiseuille flow, using Eq (8.34),

$$R_c = \frac{8\eta\ell_c}{\pi r_c^4} \approx \frac{8\,(5 \times 10^{-3}\ \mathrm{Pa \cdot s})(3 \times 10^{-4}\ \mathrm{m})}{\pi\,(3 \times 10^{-6}\ \mathrm{m})^4} \approx 4 \times 10^{16}\ \mathrm{Pa \cdot s/m^3} \tag{8.47}$$

We have used values for blood viscosity $\eta \approx 5 \times 10^{-3}$ Pa·s, the length of a capillary $\ell_c \approx 3 \times 10^{-4}$ m, and the radius of a capillary $r_c \approx 3 \times 10^{-6}$ m. Now the capillaries taken all together are $N_c$ such resistances in parallel, so by Eq (8.42), the resistance of all the capillaries is

$$R = \frac{R_c}{N_c} \approx \frac{4 \times 10^{16}}{3 \times 10^{9}} \approx 10^{7}\ \mathrm{Pa \cdot s/m^3} \tag{8.48}$$
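The comparison between Eq (8.46) and Eq (8.48) can be laid out side by side. This is a rough sketch using the input values from the text; the mm Hg conversion factor is approximate, the variable names are illustrative, and the unrounded results land within the same order of magnitude as the rounded estimates above rather than matching them digit for digit.

```python
import math

MMHG_TO_PA = 133.3               # approximate conversion, Pa per mm Hg

delta_p = 100 * MMHG_TO_PA       # driving pressure drop, Pa
I = 1e-4                         # heart output, m^3/s
R_arterial = delta_p / I         # Eq (8.46), in Pa*s/m^3

eta_blood = 5e-3                 # Pa*s
l_cap = 3e-4                     # m, length of one capillary
r_cap = 3e-6                     # m, radius of one capillary
N_cap = 3e9                      # number of capillaries, from Eq (8.45)

# Poiseuille resistance of one capillary: Eq (8.34) with C = 8/pi.
R_one_capillary = 8 * eta_blood * l_cap / (math.pi * r_cap ** 4)

# N_cap identical resistances in parallel: Eq (8.42).
R_all_capillaries = R_one_capillary / N_cap

print(f"R_arterial        = {R_arterial:.1e} Pa*s/m^3")
print(f"R_one_capillary   = {R_one_capillary:.1e} Pa*s/m^3")
print(f"R_all_capillaries = {R_all_capillaries:.1e} Pa*s/m^3")
```

Because $R_c$ goes as $1/r_c^4$, a modest change in the assumed capillary radius shifts the capillary total considerably, which is one reason the uncertainty in this comparison is large.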

The uncertainty in this computation is rather large, but in comparing Eq (8.46) with Eq (8.48), it does seem that there is appreciably more resistance in the arterial system than just the resistance of the capillaries. That is consistent with the observation that in the arterioles there is a smooth muscle mechanism to change the resistance. That wouldn’t make much sense if the arteriole resistance were negligible! We will see in the next section a model in which the resistance is rather uniform through the whole system, in a sense to be explained.

When you exercise hard, your muscles need more oxygen, and hence more blood. The arterioles supplying these muscles open wider. That creates a problem, though, as diagrammed in Fig 8.18. In the branched network, the only relevant branch to look at is the one that separates arteries serving the exercising muscles from everything else. Suppose $R_1$ is this exercising part, and let $I_1$ and $I_2$ be the currents through $R_1$ and $R_2$. When its arterioles open wider, $R_1$ decreases. The problem is that the pressure drop across $R_1$ is the same as the pressure drop across $R_2$. Thus $I_1 R_1 = I_2 R_2$, and therefore

$$\frac{I_2}{I_1} = \frac{R_1}{R_2} \tag{8.49}$$
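Equation (8.49), combined with the fact that the two branch currents add up to the heart's output $I$, fixes how the flow divides between the branches. The sketch below explores this with made-up resistances in arbitrary units; it treats the total current $I$ as given, and the function name and numbers are hypothetical.

```python
def branch_currents(total_current, r1, r2):
    """Split a total current between parallel branches with I1*R1 = I2*R2."""
    i1 = total_current * r2 / (r1 + r2)
    i2 = total_current * r1 / (r1 + r2)
    return i1, i2

I_rest = 1.0
i1_rest, i2_rest = branch_currents(I_rest, r1=2.0, r2=2.0)   # equal split at rest

# Exercise: the arterioles in branch 1 open, halving R1, with I unchanged.
i1_ex, i2_ex = branch_currents(I_rest, r1=1.0, r2=2.0)

# Total output needed to bring I2 back to its resting value with the new R1.
I_needed = i2_rest * (1.0 + 2.0) / 1.0

print(i2_rest, i2_ex, I_needed)  # I2 drops from 1/2 to 1/3; I must rise to 1.5
```

In this toy case halving $R_1$ cuts the current to the rest of the body by a third, and the total output has to rise by half to make up for it.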

When $R_1$ goes down, $I_2$ goes down relative to $I_1$, and since these two together sum to the total current output $I$ of the heart, i.e.,

$$I_1 + I_2 = I \tag{8.50}$$

it means that the muscles have “stolen” blood from the rest of the body (including the brain). Now that $I_2$ is a smaller fraction of the total $I$, the only way to bring $I_2$ back to its proper value is to increase $I$, the total output of the heart. On each stroke the heart empties (most of) the left ventricle, a fixed volume, so the only way to increase $I$ is for the heart to make more strokes, i.e., to beat faster. And of course that is what happens when we exercise!

Fig. 8.18: Flow through the systemic arteries is indicated schematically, with hydrodynamic resistance, i.e., subnetworks of arteries, indicated by the resistor symbol. The resistance $R_1$ is the subnetwork serving an exercising muscle. It is in parallel with another subnetwork $R_2$ which has only normal demand. When $R_1$ goes down, the current $I$ supplied by the heart goes preferentially to $R_1$, starving $R_2$.

We know that the cross-sectional area of the arterial system goes up even as the individual vessels get smaller, because the blood slows down, reaching a speed of only $u_c \approx 1$ mm/s in the capillaries. The way the cross-sectional area goes up, though, is interesting. It is shown in Fig 8.19. That diagram is not so easy to interpret, but it appears that the area stays quite constant until the vessels are about 200 µm in diameter. In the large vessels, then, the average blood speed does not slow down. Then the area begins to increase (to a total area larger than indicated there, in vessels that are smaller – a little extrapolation is necessary), and the blood slows down. That is, there seem to be two regimes, a constant area regime, and a growing area regime, with the crossover at diameter ≈ 200 µm. That is also roughly where the crossover occurs from a Bernoulli’s Principle regime, in large vessels, to a Poiseuille flow regime in small vessels. This observation will be part of the fractal model of circulation in the next section.

Fig. 8.19: The cross-sectional area of the arterial system as a function of the diameter of vessels (total cross-sectional area, in units of $10^{-4}\ \mathrm{m^2}$, plotted against vessel diameter in µm on logarithmic axes). The increase in area at small vessel size implies the slowing down of the flow. (Redrawn from The Mechanics of the Circulation, C.G. Caro, et al., Oxford University Press, 1978, Fig. 12.4.)

There is one more thing we should say about the circulation in the large vessels, where Bernoulli’s Principle applies. This is also the regime where you can feel your pulse, and where you can measure the systolic and diastolic pressures, i.e., where the pressure oscillates appreciably. The large vessels actually bulge when the systolic pressure pulse occurs, and this bulge propagates along the arteries like a wave. (Sometimes you can even see this in arteries near the skin.) A peculiar thing can happen to a wave when it runs into some change in its medium: it can reflect. This could happen where the arteries branch into smaller arteries, for example, since the branch represents a kind of interruption. A reflected systolic wave would represent blood flowing the wrong way(!) or really, it would subtract from blood flowing the right way. It would be a kind of design flaw in the system. Thus one should expect that branching is designed to minimize reflection of systolic waves in the large vessel regime. This is a physical insight into the problem which is also part of the fractal model of the next section.

8.18 A Fractal Model of Circulation

Warm-blooded animals maintain a constant temperature, typically above the ambient temperature, despite losing energy by flow of heat to the outside. Since this heat loss is through the surface, it goes as the square of the linear size, or as the 2/3 power of the volume (or mass). This led, in the 19th century, to the expectation that the nutritional requirements B of animals would grow with mass like

B ∝ M 2/3 Bergmann0s Law (8.51)

That B grows more slowly than M, as Bergmann’s Law implies, is a familiar fact. Large animals need proportionately less food than small ones. Humans eat roughly 1/50 their weight each day, but small mice may eat half their weight each day! Since it is not difficult to determine the food requirements of animals as a function of their mass, there is now a lot of data on this question, and the result is surprising. The actual empirical law is

$$B \propto M^{3/4} \tag{8.52}$$

and it is accurately obeyed not just by mammals but also by cold-blooded animals, and even such diverse life forms as insects, plants and bacteria! Clearly there is some other need being fueled by food intake, and it is more demanding than balancing heat flow through the surface, since it grows slightly faster than that particular need, which is still there, for some animals.

In 1997 Geoffrey West, James Brown, and Brian Enquist (WBE) suggested that this law, which we will call the 3/4 law, was a consequence of the need to supply three-dimensional bodies through one-dimensional networks, for which the human circulatory system is a convenient example, duplicated in one way or another in virtually every living thing. In geometrical terms, they were suggesting that the circulatory system is a fractal, a set that does not have the dimension that naively it should have. We begin by explaining this idea briefly.

The dimension of a set S describes how it scales. A clever way to measure this is to think of covering S with spheres of diameter $\ell_1$. Let the number necessary be $N_1$. Now take smaller spheres, of diameter $\ell_2$, and let the number necessary to cover be $N_2$, etc. In this way you find N as a function of sphere diameter $\ell$. Now you ask how N scales with $\ell$ by graphing ln N vs. ln $\ell$. If this is a straight line with slope $-\alpha$, then the fractal dimension is $\alpha$. (The minus sign is because as the size of spheres goes down, the number of them goes up.) This means $N \propto \ell^{-\alpha}$, and so N, the measure of S, scales with the exponent $-\alpha$. It is clear for geometrically simple sets like the line segment AB or the square surface ABCD in Fig 8.20, that $\alpha$ agrees with the usual notion of dimension.

Fig. 8.20: Using spheres with $2^{-1} = 1/2$ the diameter, it takes $2^1 = 2$ times as many to cover the line AB, and $2^2 = 4$ times as many to cover the square ABCD, because these objects have dimension 1 and 2 respectively.
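The covering recipe can be tried numerically. The following is only a sketch under stated assumptions: it covers with square grid boxes rather than spheres (which changes N by a constant factor but not the slope), it represents the segment and the square by dense samples of points, and the names `box_count` and `covering_dimension` are ours, chosen for illustration.

```python
import math

def box_count(points, size):
    """Count the grid boxes of side `size` that contain at least one point."""
    return len({(math.floor(x / size), math.floor(y / size)) for x, y in points})

def least_squares_slope(xs, ys):
    """Slope of the best straight line through the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def covering_dimension(points, sizes):
    """Return alpha, where ln N versus ln(size) has slope -alpha."""
    log_sizes = [math.log(s) for s in sizes]
    log_counts = [math.log(box_count(points, s)) for s in sizes]
    return -least_squares_slope(log_sizes, log_counts)

# Dense point samples standing in for a unit segment AB and a unit square ABCD.
segment = [((i + 0.5) / 1024, 0.3) for i in range(1024)]
square = [((i + 0.5) / 128, (j + 0.5) / 128) for i in range(128) for j in range(128)]
sizes = [2.0 ** -j for j in range(1, 6)]  # box sides 1/2, 1/4, ..., 1/32

dim_segment = covering_dimension(segment, sizes)
dim_square = covering_dimension(square, sizes)
print(round(dim_segment, 3), round(dim_square, 3))
```

The two estimates come out very close to 1 and 2. The same function can be handed any other set of sample points, which is what makes this definition of dimension useful for less simple sets.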

This definition of dimension turns up some sets that behave very peculiarly! One of the easiest examples to see clearly is the “Koch snowflake”, shown in Fig 8.21. Let the length of the initial line segment in the construction (not shown) be 1. Then using spheres of diameter $\ell = 1, 1/3, 1/3^2, 1/3^3, \ldots$, we need $N = 1, 4, 4^2, 4^3, \ldots$ to cover the snowflake. Graphing N vs. $\ell$ on log-log paper gives a slope $-\alpha = -\ln 4/\ln 3 \approx -1.26$. Thus the Koch snowflake, although it seems to be made out of one-dimensional segments, has fractal dimension 1.26, which is greater than 1.

Fig. 8.21: The Koch snowflake has triangular excursions 1/3 of the way along each side! It can be built up by starting with a straight segment (not shown), (a) introducing a triangular excursion, then (b) introducing triangular excursions into each of its 4 sides, then (c) doing the same for each of the resulting 16 sides, etc. Its fractal dimension is $\ln 4/\ln 3 \approx 1.26$, and not 1, as one might naively expect.

The WBE theory assumes that the arterial system (or its analogue in plants) is a fractal branching network of essentially one-dimensional tubes that has fractal dimension 3. The network starts with the aorta, of length $\ell_0$. This then branches into n large arteries, where n is a fixed parameter, characterizing the network. For the sake of concreteness, let us suppose n = 3, but continue to call it n. Each of the n arteries of level 1, of length $\ell_1$, branches into n smaller arteries. (There are therefore $n^2$ of these smaller arteries.) These, in turn, after length $\ell_2$, each branch into n still smaller arteries, etc. The number of arteries at each level is $1, n, n^2, n^3, \ldots$. In general, at the kth level there are $n^k$ arteries. Eventually, after N branchings, we get to the capillaries, so

$$n^N = N_c \tag{8.53}$$

where $N_c$ is the number of capillaries. If we try to fit this scheme to the human circulatory system, knowing $N_c \approx 3 \times 10^9$, and taking n = 3 for the number of branches at each branching, we find N = 20, since $3^{20} \approx 3.5 \times 10^9$. The fractal branching scheme is shown in Fig 8.22.

[Figure labels: tube lengths $\ell_0$, $\ell_1$, $\ell_2$, etc.; level k = 0, 1, 2, ..., N; capillaries at level N.]

Fig. 8.22: The branched arterial network is shown pulled out nearly straight. At each branching, the $n^k$ tubes at level k, of length $\ell_k$, each produce n new tubes, each of length $\ell_{k+1}$. The capillaries are shown greatly magnified!

What is not shown there is that each level, down to the capillaries, covers the same region of space. In fact, even the tissues that make up the walls of the larger arteries are nourished by capillaries! This suggests that the fractal dimension of the network is 3, so that for each k we can cover the body by spheres of size $\ell_k$ with a number that goes as $\ell_k^{-3}$. But the number of these spheres is $n^k$. Thus

$$n^k \propto \ell_k^{-3} \tag{8.54}$$

or

$$\ell_k \propto n^{-k/3} \tag{8.55}$$

Putting in the constant of proportionality, we have

$$\ell_k = \ell_0 n^{-k/3} \tag{8.56}$$
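The counting behind Eq (8.53) and the scaling in Eq (8.56) are easy to check by machine. This is a minimal sketch, not part of the WBE theory itself: the function names are ours, and taking N to be the smallest number of branchings that reaches the capillary count is an assumption about how to round.

```python
def levels_to_capillaries(n, n_capillaries):
    """Smallest number of branchings N for which n**N reaches n_capillaries."""
    levels = 0
    tubes = 1
    while tubes < n_capillaries:
        tubes *= n
        levels += 1
    return levels

def length_ratio(n, k):
    """The ratio l_k / l_0 given by Eq (8.56)."""
    return n ** (-k / 3)

n = 3
N = levels_to_capillaries(n, 3e9)  # human capillary count quoted in the text
print(N, n ** N)

# Dimension-3 check: (number of tubes) x (tube length)^3 is the same at every level.
products = [n ** k * length_ratio(n, k) ** 3 for k in range(N + 1)]
print(min(products), max(products))
```

For n = 3 this gives N = 20, and the products all equal 1 up to rounding, which is just Eq (8.54) restated: the number of tubes grows exactly as fast as the inverse cube of their length.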

At each level, the length of the artery goes down by the same factor, $n^{-1/3}$. This, as we have seen, follows from the first assumption of the WBE theory, that the arterial system is a fractal of dimension 3. Let us try this on the human example, using n = 3. If the aorta has length $\ell_0 = 0.5$ m, the capillaries, at level k = N = 20, would have length $\ell_c \approx 0.5 \times 3^{-20/3} \approx 3 \times 10^{-4}$ m. This is about right!

The second assumption of the WBE theory is that the radii of the arteries are designed so as to minimize the energy required to pump blood through, consistent with getting blood down to the capillaries. The third and last assumption of the theory is that the capillaries are essentially the same in all organisms, designed for the efficient exchange of dissolved substances with surrounding tissue.

In plants there is a very simple rule about the radii of branches of the “arterial” system: the cross-sectional area doesn’t change when branching occurs, because it is always the same “pipes” (xylem) as you go up the stem (the “aorta”), only various pipes are redirected into various branches, without changing their areas.

The same rule, that cross-sectional area stays constant, applies to humans (and all animals with beating hearts) in the upper part of the arterial system, where the flow obeys Bernoulli’s Principle. We have seen this in Fig 8.19. Amazingly, this follows from the principle of minimizing the energy required to pump the blood. The only resistance in the upper part of the circulation comes from reflection of the systolic wave at places where branching occurs, and the condition for no reflection is exactly that the area stay the same at the next level!
Thus in both plants and animals, although for different reasons, we have areas staying the same from one level to the next,

$$\pi r_k^2 = n \pi r_{k+1}^2 \qquad \text{(larger arteries)} \tag{8.57}$$

so that

$$r_{k+1} = r_k n^{-1/2} \qquad \text{(larger arteries)} \tag{8.58}$$

Thus

$$r_k = r_0 n^{-k/2} \qquad \text{(larger arteries)} \tag{8.59}$$

and the radius becomes less by the same factor, $n^{-1/2}$, at each branching. Each new, smaller artery has cross-section less by the factor $n^{-1}$, but there are n of them, so the total cross-section stays the same.

In animals this pattern holds only until the flow becomes Poiseuille, which in humans occurs when $r_k \approx 100\ \mu$m. Taking n = 3 and $r_0 = 10^{-2}$ m (the radius of the aorta), this implies k = 8, since $10^{-2} \cdot 3^{-8/2} \approx 1.2 \times 10^{-4}$ m $\approx 120\ \mu$m. Thus in humans there are 12 more branchings (for a total of 20) to get down to the capillaries.
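The human numbers quoted so far can be reproduced in a few lines. This is a sketch using the values given in the text (n = 3, aorta length 0.5 m, aorta radius 1 cm, N = 20). Choosing the level whose radius is closest to 100 µm on a logarithmic scale is our own way of making “≈ 100 µm” precise.

```python
import math

n = 3
N = 20
aorta_length = 0.5   # metres
aorta_radius = 1e-2  # metres

def length_at_level(k):
    """Tube length at level k from the dimension-3 rule."""
    return aorta_length * n ** (-k / 3)

def large_artery_radius(k):
    """Tube radius at level k from the constant-area rule."""
    return aorta_radius * n ** (-k / 2)

capillary_length = length_at_level(N)

# Level at which the constant-area radius is closest (in ratio) to 100 micrometres.
target = 100e-6
k_switch = min(range(N + 1),
               key=lambda k: abs(math.log(large_artery_radius(k) / target)))

print(capillary_length)
print(k_switch, large_artery_radius(k_switch))
```

The capillary length comes out a little over 3 × 10⁻⁴ m, and the switch to Poiseuille flow falls at k = 8, where the radius is about 120 µm.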

We already know the lengths $\ell_k$ and the number of arteries $n^k$ for each level of this part of the system. To minimize its Poiseuille flow resistance turns out to require

$$r_{k+1} = r_k n^{-1/3} \qquad \text{(smaller arteries)} \tag{8.60}$$

a result known since the 1920’s as Murray’s Law. Now the area does not stay the same at branches. Rather, if $A_k$ is the area at level k,

$$A_{k+1} = n \pi r_{k+1}^2 = n^{1/3} \pi r_k^2 = n^{1/3} A_k \tag{8.61}$$

The area grows with the factor $n^{1/3}$ at each branching. Since the diameter of vessels goes down by the factor $n^{-1/3}$ at each branching, we can say $A_k \propto r_k^{-1}$ in this regime, and that is clearly visible in Fig 8.19! It shows up as the slope −1 in the log-log plot for the smaller vessels. It is easy to check that the resistance of each level in this part of the system is the same, and since these resistances are in series, they simply add. Compare Eqs (8.46) and (8.48).

If we ask how small the capillaries are according to this rule, continuing our computation with the data for humans, we should start with $r_8 = 120\ \mu$m, computed above according to the rule for large arteries, and then follow 12 more branchings by the new rule for small arteries, finding $r_c = r_8 \times 3^{-12/3} = 120/81 \approx 1.5\ \mu$m. Again, this is about right! This fractal scheme seems uncannily accurate.

We really shouldn’t be working our way down to the capillaries to see what the theory says about them, because the third assumption of the theory, mentioned above, is that the capillaries are always the same, and conceptually should be the starting point. We should work our way up from them, expressing other things in terms of them. We illustrate this principle by deriving the 3/4 law, $B \propto M^{3/4}$. We will assume the constant-area rule holds all the way down to the capillaries. This is true for plants, but not for animals. It turns out that for animals one can replace “capillaries” by the smallest vessels for which the constant-area rule holds (100 µm radius), so the derivation is essentially the same.
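The small-artery numbers, and the claim that every level in this regime has the same resistance, can be checked with a short sketch. It assumes the standard Poiseuille result that the resistance of a single tube is proportional to its length divided by the fourth power of its radius, with the $n^k$ tubes of a level acting in parallel. The function names and the choice to work in relative units are ours.

```python
n = 3
aorta_length = 0.5                   # metres, as in the human example
r8 = 1e-2 * n ** (-8 / 2)            # radius at level 8 from the large-artery rule

def small_artery_radius(k):
    """Radius at level k >= 8, following Murray's Law from level 8 on."""
    return r8 * n ** (-(k - 8) / 3)

def level_resistance(k):
    """Relative Poiseuille resistance of level k: n**k equal tubes in parallel."""
    length = aorta_length * n ** (-k / 3)
    single_tube = length / small_artery_radius(k) ** 4
    return single_tube / n ** k

capillary_radius = small_artery_radius(20)
resistance_ratios = [level_resistance(k + 1) / level_resistance(k)
                     for k in range(8, 20)]

print(capillary_radius)
print(min(resistance_ratios), max(resistance_ratios))
```

The capillary radius comes out near 1.5 µm, and the ratio of resistances between successive levels is 1 to within rounding. The shrinking length and the growing number of tubes exactly compensate the fourth power of the shrinking radius.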

The nutritional requirement of any organism is proportional to the volume rate of flow from the heart $Q_0$, which is the rate at which the organism is supplied with nutrient,

$$B \propto Q_0 = \pi r_0^2 u_0 = n^N \pi r_c^2 u_c \tag{8.62}$$

Here $u_0$ is the mean speed of flow in the aorta, $u_c$ is the mean speed in the capillaries, etc. The point is that having expressed it in terms of the constants $u_c$ and $r_c$ describing the capillaries, the only thing that varies from one organism to another is the factor $n^N$, i.e.,

$$B \propto n^N \tag{8.63}$$

Similarly the mass M of the organism is proportional to the total volume $V_b$ of the blood, and this in turn can be written in terms of the capillary dimensions $r_c$ and $\ell_c$,

$$\begin{aligned}
M \propto V_b &= \pi r_0^2 \ell_0 \left(1 + n^{-1/3} + n^{-2/3} + n^{-3/3} + n^{-4/3} + \ldots\right) && (8.64)\\
&\approx C \pi r_0^2 \ell_0 && (8.65)\\
&\approx C \pi \left(n^{N/2} r_c\right)^2 \left(\ell_c n^{N/3}\right) && (8.66)\\
&\approx C \pi r_c^2 \ell_c n^{4N/3} && (8.67)
\end{aligned}$$

Here C is a number of order 1, about 3 if n = 3 and about 5 if n = 2. Thus, after expressing everything in terms of capillary quantities, we have

$$M \propto n^{4N/3} \tag{8.68}$$
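The way B and M grow together can also be seen numerically, without the approximation that the series has fully converged. This sketch works in units where the capillary radius, length, and flow speed are all 1 (our choice, since only the scaling matters). It builds organisms with different numbers of levels N and measures the log-log slope of B against M.

```python
import math

def supply_and_mass(n, N):
    """Return (B, M) up to constant factors, in capillary units."""
    B = n ** N * math.pi                    # n^N capillaries, each carrying pi*r_c^2*u_c
    r0 = n ** (N / 2)                       # aorta radius from the constant-area rule
    l0 = n ** (N / 3)                       # aorta length from the dimension-3 rule
    series = sum(n ** (-k / 3) for k in range(N + 1))
    M = math.pi * r0 ** 2 * l0 * series     # blood volume summed over all levels
    return B, M

def loglog_slope(n, N_small, N_large):
    """Slope of ln B against ln M between two organisms of different size."""
    B1, M1 = supply_and_mass(n, N_small)
    B2, M2 = supply_and_mass(n, N_large)
    return (math.log(B2) - math.log(B1)) / (math.log(M2) - math.log(M1))

exponent = loglog_slope(3, 10, 30)
print(exponent)
```

The slope comes out just under 0.75. The small shortfall is the effect of the finite series, which matters less and less as N grows.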

Comparing Eq (8.63) and (8.68), we see that $n^N \propto M^{3/4}$, so $B \propto M^{3/4}$: we have the 3/4 law!

Problems

Density

8.1 Give a careful argument for converting mass density in cgs units to SI units. Why do you think cgs units are often used for mass density in spite of the trend to SI?

8.2 The acceleration due to gravity g is proportional to the mass of the Earth $M_E$. If the Earth were more massive, g would be larger than it is. In Newton’s theory of universal gravitation, in fact,

$$g = \frac{G M_E}{R_E^2} \tag{8.69}$$

where $R_E$ is the radius of the Earth, and G is Newton’s gravitational constant, a constant of Nature. Newton’s constant was first measured by the gifted English experimentalist Henry Cavendish in 1798. It is still not known to high accuracy, but it is about $6.67 \times 10^{-11}$ in SI units. From these data, determine the average mass density of the Earth. Is your answer plausible?

8.3 Find your own volume, what you might call your “personal space”, assuming that your density is that of water.

Archimedes’ Principle

8.4 When people say “this is just the tip of the iceberg”, they are referring to the fact that most of an iceberg’s volume is under water, and the tip that is visible is only a very small fraction of the iceberg. Look up the data you need and determine what fraction it is. Be clear what data you are using, and make your reasoning clear.

8.5 Name some familiar metals that will float in mercury, and find two metals that will sink in mercury.

8.6 (a) Draw a diagram like Archimedes’ own diagram in Fig 8.1 to illustrate why, if you push a buoyant object of weight $W_o$ down into water with a force F, to completely submerge it, the equilibrium condition is $F + W_o = W_w$, where $W_w$ is the weight of the water displaced. Give the argument in words that should accompany the diagram. (b) Similarly, draw a diagram like Archimedes’ own diagram in Fig 8.1 to illustrate why, if you support a submerged dense object with a force F (upward), the equilibrium condition is $-F + W_o = W_w$. Give the argument in words.

8.7 (a) Two liquids, with densities $\rho_1 < \rho_2$, immiscible in each other, are allowed to come to equilibrium in a container. Use minimization of energy to argue that the liquid with density $\rho_2$ will be on the bottom, and the interface between the two liquids will be a horizontal plane.

(b) A rectangular solid, with intermediate density $\rho_0$, i.e., satisfying $\rho_1 < \rho_0 < \rho_2$, is placed into the container. Show that it can float (fully submerged) at the interface between the two liquids, and if its height is h, then it sinks a distance D into the denser liquid given by

$$D = \left(\frac{\rho_0 - \rho_1}{\rho_2 - \rho_1}\right) h \tag{8.70}$$

(c) Show that the above expression makes sense in various ways, including limiting cases.

8.8 (a) When you step on a scale, what you read is actually less than your true weight! Estimate the size of this effect, realizing that you are immersed in a fluid, the atmosphere, with a mass density roughly 1 kg/m$^3$. (b) Could scales be engineered to correct for this effect and read true weight? What would be required?

8.9 The hot air in a hot air balloon may have a density only $0.9\rho_a$, where $\rho_a \approx 1$ kg/m$^3$ is the density of the ambient air. What must be the volume of a balloon that could lift your mass m ≈ 50 kg? Assume the skin of the balloon, supports, basket, etc. taken all together weigh as much as you do. Ignore your own buoyancy and the buoyancy of the solid parts of the balloon.

Pressure

8.10 Sketch a graph of the pressure as a function of depth in the two-liquid system described in Problem 8.7. Explain in words why your graph looks the way it does.

8.11 (a) With a force F you support a rectangular object of mass M by holding it in one hand. Draw a free body diagram for this situation and find the force F . (b) The rectangular object is actually a small aquarium, filled with water of mass M (the glass bottom and sides of the aquarium are so thin that they have negligible mass). Draw a free body diagram for just the glass of the aquarium, including, of course, the pressure force PA on the bottom of the aquarium due to the water, since in this way of thinking of it the water is now an external object that contacts the glass. Show that you deduce the same force F as in part (a).

8.12 DANGER: DO NOT TRY THIS. A simple idea for providing yourself air under water is a hollow tube extending to the surface to breathe through. Why is this a bad idea for all but the shallowest descents? Or to put it another way, why are snorkels short?

Bernoulli’s Principle

8.13 (a) An above-ground water tank of height H develops a hole near the bottom. If it is full of water, what is the speed of the water exiting the tank through the hole? Give an expression showing the proportionalities involved, and evaluate your expression in case H = 5 m.

(b) A bathysphere lowered into the deepest trench in the ocean, a depth of about 10 km, develops a leak. What is the speed of the water coming in through the leak? (Compare with the speed of sound in air, about 340 m/s.)

8.14 Use Bernoulli’s Principle to estimate the horizontal force on you if you are standing waist-deep in a stream flowing at 1 m/s. Is the result plausible?

8.15 Draw a picture of a siphon connected to an overhead reservoir, label the picture with appropriate dimensions (give letter names), and give an expression for the speed with which the fluid should exit the siphon if there are no frictional losses in the system. Does it matter how the entrance of the siphon is positioned in the reservoir?

8.16 Water comes out of a tap of area $A_0$ with speed $v_0$ (straight down). (a) Use Bernoulli’s Principle to show that when the water has fallen a distance h, it has speed $v = \sqrt{v_0^2 + 2gh}$. (b) Show the same thing using conservation of energy for a small part of the water, of mass m. (c) Since the water speeds up as it falls, and yet the volume current $I = A_0 v_0 = Av$ is constant, it must be that the area A of the flowing stream of water becomes smaller as v gets larger. Find A as a function of distance fallen h, and graph it. Also look carefully at water coming out of a tap!

Shear Stress and Viscosity

8.17 (a) Following the arguments in Section 8.12, explain why shear rate and shear viscosity have the dimensions they do. (b) Suppose that (in a horizontal pipe) volume current I, with dimension [L$^3$/T], and pressure drop ∆P (down the pipe) are proportional, as in

$$\Delta P = rI \tag{8.71}$$

Find the dimension of the pipe resistance r. (c) Suppose resistance r in a pipe is proportional to the viscosity of the fluid and to the length of the pipe. These factors alone don’t have the right dimensions to be resistance. What is the dimension of the missing factor?

Stokes Flow

8.18 When a small sphere of radius R sinks in a fluid of viscosity η, the gravitational force, the buoyant force, and the Stokes drag force exactly balance. (a) Make a free body diagram for the sphere, showing the three forces. (b) Find the speed v of the sphere in terms of its density ρ from the condition that the forces should balance, i.e., that the net force should be zero. (c) Verify explicitly that your result has the dimension of a velocity.

Pressure, Current, and Resistance: ∆P = Ir

The problems in this section are analogous to DC circuit problems in electronics. The pressure difference ∆P from one end of a pipe to another is analogous to the voltage difference ∆V that might be applied by a battery across a resistor. The resulting volume current I is analogous to the electrical current I. The pipe resistance r is analogous to electrical resistance R. Because of this analogy we indicate pipes by the electrical resistance symbol, like a wiring diagram, and give values in SI units without naming them explicitly.
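The circuit analogy suggests a small set of helper functions. This is only a sketch: the function names are ours, the combination rules are the familiar ones for resistors in series and in parallel carried over to pipes, and the numbers are invented for illustration rather than taken from the problems below.

```python
def series(*resistances):
    """Pipes end to end: the same current passes through each, so resistances add."""
    return sum(resistances)

def parallel(*resistances):
    """Pipes side by side: the same pressure drop drives each, so currents add."""
    return 1 / sum(1 / r for r in resistances)

def current(pressure_drop, resistance):
    """Volume current from the relation (pressure drop) = (current) x (resistance)."""
    return pressure_drop / resistance

# Illustrative network: pipes of resistance 10 and 40 side by side, followed by a pipe of 12.
pair = parallel(10, 40)
network = series(pair, 12)
total_current = current(100, network)
print(pair, network, total_current)
```

Note that the parallel pair has a resistance smaller than either of its pipes, since a second path can only make it easier for fluid to get through.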

8.19 In Fig 8.23 the points A and B are connected in parallel by 3 pipes, with resistances 100, 200, and 500 (SI) respectively. (a) If the pressure drop from A to B is ∆P = 10 (SI), what is the SI current through each pipe individually? (b) What is the total current through all 3 pipes together? (c) What is the effective resistance of the network, relating the pressure drop to the total current? How does it compare to the individual pipe resistances?

8.20 In Fig 8.24 there is a pressure drop of 30 (SI) from A to B, and the resistances are as indicated.

[Figure labels: points A and B; resistances 100, 200, 500.]

Fig. 8.23: Three pipes connect A to B, with individual SI resistances indicated.

[Figure labels: points A, C, and B; resistances 100, 300, 200.]

Fig. 8.24: Pipes connect A to B, with individual SI resistances indicated. The point C is at the junction of one pipe with another.

(a) What is the pressure drop from A to C? What is the pressure drop from C to B? (Check: they must add to the full 30 from A to B.)

(b) What is the total current through the network? (c) What is the effective resistance of the network?

8.21 Assuming Bernoulli’s principle holds in the large arteries, what blood pressure would you measure in the carotid artery 30 cm above the heart in a subject whose blood pressure taken in the standard way was 130/80?

Chapter 9

Temperature, Heat, and Internal Energy

When Fire was thought to be one of the four elements, a popular idea due to Plato was that the “atoms” of Fire were tiny tetrahedra, like the one in Fig 9.1. This seems a peculiar idea now, but it once made a kind of intuitive sense. The points of the tetrahedron, being sharp, could cause injury in the form of burns. Galileo suggested that these pointed tetrahedra could act like little knives to melt metals, by cutting a solid into so many small particles that it would flow like a fluid.

Fig. 9.1: According to Plato, the atoms of Fire are tetrahedra.

The theory of the four elements, with atoms in the shape of the Platonic solids, was already quite archaic in Galileo’s time. In this theory all materials are a mixture of Fire, Earth, Water, and Air, and burning a material just means liberating the Fire that is in it. In the 18th century, advances in chemistry began to reveal what the real chemical elements are, but the old idea nonetheless persisted in a slightly different form. Now heat was imagined to be a fluid, although perhaps not a chemical substance, that flowed out of materials when they burned, and warmed things by flowing into them. This fluid was called “phlogiston.” In many ways it was just the old element Fire. The 18th century chemists were frustrated in not being able to purify this substance, the way they could purify other things. They could make phlogiston move from one body to another, in what they (and we) call the flow of heat, but they could never isolate it. That is, they could never get heat by itself, not associated with another substance.

This ancient idea was finally laid to rest by the American Benjamin Thompson, who did his scientific work in Europe, having left the Colonies in 1776 as a Tory. He had a distinguished career in England and later Bavaria, where he acquired the title Count Rumford, and among other duties was responsible for overseeing the manufacture of cannons. These cannons began the process as solid cylinders, which were then bored by turning them against an abrasive drill to hollow them out. They became very hot and could be continually cooled with water, which itself became hot, and so on. Rumford pointed out that if this were the flow of some substance from the cannon to the water, then the cannon seemed to have an infinite supply of it. On the other hand, the substance never appeared except when the cannon was in motion, being turned. Stop the turning and the phlogiston soon stopped coming out. And no matter how much phlogiston a cannon lost, it was in no way a different material after it cooled off. All this strongly suggested to Rumford’s mind that phlogiston was not a substance at all, but rather a kind of motion, and that the motion was being supplied by the turning apparatus. The motion could communicate itself from one substance to another at a level too small to see. It could never be isolated, because motion is always motion of some thing. This is essentially our modern idea of what the flow of heat is: the flow of internal energy from hot to cold bodies. The only difference is that the word “motion” has been replaced by the words “internal energy.”

In modern statistical mechanics, some or all of the internal energy would be the random kinetic energy K of atoms, that is, motion.

Rumford’s insight became precise and quantitative in 1843 in the work of James Prescott Joule, for whom the SI unit of energy is named. Joule used a weight W of initial height h to power a rotary paddle wheel device that churned water as the weight descended. In this way he took the potential energy $U_g = Wh$ of the weight, and transferred it all to the water. The water was thermally insulated from the room in which the experiment was done, and Joule found that the result of giving the water energy in the known amount $\Delta E = U_g$ was to raise its temperature T by a definite small amount ∆T. It didn’t matter whether the energy was transferred quickly or slowly (within practical limits), and if twice as much energy was transferred, the temperature change was twice as much, indicating a proportionality

∆E ∝ ∆T (9.1)

Knowing this proportionality he could interpret a change in temperature ∆T as a change ∆E in the internal energy of the water. With this insight the notion of energy finally took on the central importance it has for us today. This idea is now called the Law of Conservation of Energy, or the First Law of Thermodynamics. It says that energy, if we just keep track of it in all its forms, including potential energy, kinetic energy, and internal energy, never increases or decreases, but simply moves around. In Joule’s experiment it went from being potential energy of a weight to being internal energy of the water. As the water gradually equilibrated with the room, despite the thermal insulation around it, we would say that the energy slowly became internal energy of the entire room and its contents, diffusing into everything. This observation is a hint of the Second Law of Thermodynamics, which says that the flow of heat, meaning the transfer of internal energy because of a temperature difference, has a peculiar quality, tending to make energy more diffuse, and never to concentrate it.
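To get a feeling for the size of the effect Joule was measuring, here is a rough sketch. The constant of proportionality between ∆E and ∆T is not given in the account above, so the code assumes the standard handbook value for water, about 4186 J per kilogram per degree. The weight, height, and mass of water are invented for illustration.

```python
SPECIFIC_HEAT_WATER = 4186.0  # J/(kg K): standard value, an assumption not taken from the text

def temperature_rise(weight_newtons, drop_height_m, water_mass_kg):
    """Temperature change of the water if all of U_g = W h ends up as its internal energy."""
    energy = weight_newtons * drop_height_m
    return energy / (water_mass_kg * SPECIFIC_HEAT_WATER)

# Hypothetical run: a 100 N weight descending 2 m, churning 1 kg of water.
single_drop = temperature_rise(100.0, 2.0, 1.0)
double_drop = temperature_rise(100.0, 4.0, 1.0)
print(single_drop, double_drop)
```

The rise is only a few hundredths of a degree, which gives some idea of how careful the thermometry had to be. Doubling the energy doubles the temperature change, which is the proportionality (9.1).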

Is this diffusion of energy the motion of some thing? It is tempting to say so, but that really takes us back to the phlogiston theory. Energy is certainly not a material thing, nor is it simply motion. It is a thing in the sense that it is conserved, and hence has a kind of integrity, but that is all one can say.

9.1 Temperature

In the account of Joule’s experiment we took for granted a notion of temperature. Of course by Joule’s time there were thermometers that he used to measure the temperature change ∆T in the churned water (he used mercury thermometers). What is temperature, though?

The most important idea in the theory of temperature is the concept of thermal equilibrium. This is the notion that when two systems are able to exchange internal energy through the flow of heat, they will do so until an equilibrium is established, and after that no further change occurs. Once they have reached this equilibrium, we say that they are at the same temperature. It is postulated that if A is in equilibrium with B and B is in equilibrium with C, then A is in equilibrium with C. You might say, this is obvious! If A is the same temperature as B and B is the same temperature as C, then of course A is the same temperature as C! That would be missing the meaning of this idea, though, because you are already thinking of temperature as a number, measured by a thermometer. The statement about the mutual equilibrium of A, B, and C is a statement about how things actually interact and behave, and has nothing to do with thermometers. Conceivably A and C brought into contact would start exchanging internal energy through the flow of heat, even if the pairs A and B, and B and C, would not. Then it would be impossible to characterize temperature by a number. This hypothetical case seems never to occur, however, for any systems A, B, and C, and therefore we can assign a single number, temperature, to their equilibrium. This is sometimes called The Zeroth Law of Thermodynamics, to make the point that the other laws rely upon it, and also that this one apparently wasn’t recognized as a logical necessity until later. It is a bit subtle! It helps, though, to have the notion of thermal equilibrium firmly in mind as we think about temperature.

You are probably sitting in a room where nothing too dramatic has happened for a while. The objects in the room have had plenty of time to exchange internal energy through the flow of heat, and they have probably reached a mutual equilibrium. That means they are all at the same temperature (by the definition of temperature). Try touching a few things around you. I find that some things feel noticeably cooler than others. Ceramic and metal feel colder than wood and fabric. But they are not! They are all at the same temperature! What is going on?

We do not actually sense temperature. What we sense is the flow of heat, driven by temperature difference. Since you are warmer than the other things in the room (and not in thermal equilibrium), you exchange internal energy with them when you come in contact: internal energy flows from you to the things you touch. The temperature difference is always the same, no matter what you choose, but the flow of internal energy is not always the same. Apparently there is an energy current J driven by the temperature difference ∆T , but the current depends on what you are touching. This is so much like the flow of a fluid driven by a pressure difference that we can model it in the same way as Eq (8.40), with a ‘thermal resistance’ r, so that

∆T = rJ (9.2)

Here J, the energy current, would have SI units Joules/sec (also called Watts). This relation is usually expressed in terms of the thermal conductance κ = 1/r (‘kappa’), the reciprocal of thermal resistance, as

J = κ∆T (9.3)

This says that an energy current J arises proportional to temperature difference ∆T, with the constant of proportionality κ depending on the situation. It is just a rough idea here, but it is phenomenologically true for small temperature differences ∆T. This is sometimes called “Newton’s Law of Cooling”.

What we sense as ‘hot’ or ‘cold’ seems to be J. This is an energy current, representing energy coming in or going out at some rate. It does seem as if we should be aware of this, for reasons of basic survival, especially a very rapid flow of energy (which we would sense as very hot or very cold). Nature has equipped us with the means to detect these internal energy flows. If ∆T is fixed in Eq (9.3), then the different currents J that we feel correspond to different thermal conductances κ. Since metals are good conductors of heat, they feel cool at room temperature (big κ, big current out), while wood, which is not a good conductor of heat, does not (small κ, small current out).

The energy flow J in response to a temperature difference ∆T is always from the hotter body to the cooler body. Since we haven’t even said what ‘hotter’ and ‘cooler’ mean, this is really a definition of those terms! The flow J is always in the direction to move the system toward thermal equilibrium. Thus it can never happen that A is hotter than B, B is hotter than C, and C is hotter than A, because then the energy would flow around cyclically, never reaching equilibrium, contrary to the Zeroth Law. Rather the flow J tends to raise the temperature of the cooler body and lower the temperature of the hotter body, until they are the same. Of course we know examples of this, but here the idea is asserted as a general law of Nature, applying to everything.
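The approach to equilibrium can be followed step by step with Eq (9.3). The sketch below is a toy model under several assumptions of ours: each body is given a fixed “heat capacity” (the constant of proportionality between ∆E and ∆T for that body), the conductance κ is constant, time advances in small steps, and all the numbers are invented for illustration.

```python
def relax(temp_a, temp_b, heat_cap_a, heat_cap_b, kappa, dt, steps):
    """Let two bodies exchange energy through the current J = kappa * (T_a - T_b)."""
    for _ in range(steps):
        energy_current = kappa * (temp_a - temp_b)   # joules per second, from a to b
        temp_a -= energy_current * dt / heat_cap_a    # the body giving up energy cools
        temp_b += energy_current * dt / heat_cap_b    # the body receiving energy warms
    return temp_a, temp_b

# Hypothetical bodies: a at 80 degrees with capacity 500 J/degree,
# b at 20 degrees with capacity 1500 J/degree, joined by kappa = 5 W/degree.
final_a, final_b = relax(80.0, 20.0, 500.0, 1500.0, kappa=5.0, dt=1.0, steps=2000)
print(final_a, final_b)
```

Both temperatures settle at 35 degrees, the value fixed by conservation of energy, and the current J dies away as the difference between them shrinks. A larger κ gets there faster but does not change where the bodies end up.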

The energy flow J must arise in response to a temperature difference ∆T if there is any physical process whatsoever that could move energy from the hotter system to the cooler one. One may try to make the thermal conductance very small, as in thermos bottles that use silvered interiors and vacuum between glass layers, etc., to try to thwart the transfer of energy from inside to outside, but Nature will find a way. The hot coffee does eventually cool off. An interesting case is the Sun. It is much hotter than we are, but it is separated from us by millions of miles of vacuum. Does this vacuum insulate us? Not at all! There is a mechanism for transferring energy through the vacuum, namely sunlight, and hence there is an energy current J from the Sun to us, solar energy. A solar collector of 1 square meter, designed to absorb this energy, may receive an energy current of almost 1000 J/s (a kilowatt!). This transfer is in the direction that would eventually bring us to thermal equilibrium (warming us up and cooling the Sun), but since the Sun itself is far from equilibrium, burning nuclear fuel, this thermal equilibrium is far in the future. Before the source of the Sun’s energy was known, it was a mystery how it could have avoided cooling off in the geologic time that was the (more or less) known age of the Earth.

These considerations of thermal equilibrium bring us to a rather peculiar view of temperature: temperature is a measure of the tendency of a system to give up its internal energy through the flow of heat. When two systems are in contact, and one of them has a greater tendency than the other to give up internal energy, so that heat flows from it to the other one, then we say it is at a higher temperature. We do not say that it has more or less energy, and we cannot even speak of the phlogiston it may contain, since we do not believe in that. We say only that it has a greater tendency to give up internal energy, for whatever reason. And heat has nothing to do with temperature, but only with temperature difference. Heat may flow at low temperature just as well as at high temperature. ‘Heat’ has nothing to do with ‘hot’!

9.2 Thermometers

Every material property (density or viscosity, for example) depends on temperature. That makes almost everything a possible thermometer. Every property responds to changes in temperature. We ourselves are crude but sensitive thermometers. If our core body temperature changes by just a few degrees Fahrenheit, we die! Most material properties don’t change so dramatically in just a few degrees Fahrenheit, which means inert matter is, on the whole, much less sensitive to temperature than living systems are. When we construct useful thermometers, we have to look for rather small changes in material properties.

The familiar mercury thermometer uses the relative change in the density of mercury with temperature as a measure of temperature. What you actually see is the small change in volume V of a fixed mass M. Since mass density $\rho = M V^{-1}$, with exponent −1 on V, a small change in mass density ρ is related to the corresponding small change in volume V by

$$-\frac{\Delta\rho}{\rho} = \frac{\Delta V}{V} \tag{9.4}$$

recalling the argument leading to Eq (4.26). The minus sign makes sense, because if the volume goes up, the density goes down, i.e., the changes are in opposite directions. Our familiar Celsius and Fahrenheit temperature scales define change in temperature ∆T to be, at least approximately, proportional to this quantity, i.e.

$$\frac{\Delta V}{V} \propto \Delta T \tag{9.5}$$

This idea only defines the change in temperature T, so we could still choose T arbitrarily at one particular V, and then measure changes from that. The constant of proportionality is also a choice, determining what we mean by the unit of temperature (the degree). The Fahrenheit and Celsius temperature scales differ precisely in these choices. In the Celsius temperature scale the value T = 0 is assigned to the freezing point of water, while in the Fahrenheit scale T = 0 was originally assigned (very arbitrarily!) to a particularly cold winter day. Also 1 degree on the Celsius scale is 1.8 degrees on the Fahrenheit scale. This means that Celsius temperature C and Fahrenheit temperature F are related by

$$F = \frac{9}{5}\,C + 32 \tag{9.6}$$

since C = 0 and F = 32 are the same temperature (freezing point of water). In the Celsius temperature system, using mercury in the thermometer, Eq (9.5) becomes

$$\frac{\Delta V}{V} = \beta\,\Delta T \tag{9.7}$$

with $\beta = 1.8 \times 10^{-4}$/°C. The constant of proportionality β (‘beta’) is the thermal coefficient of volume expansion of mercury in inverse degrees Celsius. Since 1 = 1 °C/1.8 °F, we can also say $\beta = 1.0 \times 10^{-4}$/°F.

The mercury thermometer no longer defines the Celsius system. The actual definition is different, and by that definition, Eq (9.7) is only approximate. The reason is that the relative volume changes of different materials as they are warmed up are only approximately proportional to each other, not exactly. This means each material would give a slightly different notion of temperature, if it were used to define temperature. Conceivably we could just choose some material, like mercury, and use it, but we would know that the resulting temperature wasn’t a very fundamental quantity, that it was a bit arbitrary, and that it depended on the peculiarities of mercury. There is actually a much better thermometer available, one that is independent of the properties of any material. The temperature defined in this way is called absolute temperature. We will see it in the next section.

Before we leave the mercury thermometer, let us see how it is that we can measure the very tiny change in relative volume, on the order of 1 part in $10^4$, that corresponds to 1 degree. The secret is to rearrange Eq (9.7) to read

$$\Delta V = V\beta\,\Delta T \tag{9.8}$$

Since ∆V ∝ V, we can make ∆V bigger by making V very large. In the mercury thermometer, this large V is in a sort of reservoir bulb at one end. The change ∆V is confined to a cylindrical volume with very small cross-sectional area A, so that the change shows up as a change in length ∆ℓ, i.e., ∆V = A∆ℓ. Then the change you actually read is

$$\Delta\ell = \frac{V\beta\,\Delta T}{A} \tag{9.9}$$

By making V big and A small, you can “amplify” a small change ∆T to be a large change ∆ℓ, despite the small coefficient β.
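To get a feeling for the amplification in Eq (9.9), here is a minimal numerical sketch. The bulb volume and capillary cross-section are illustrative assumptions, not values given in the text; only β is the value quoted above for mercury.

```python
# Sketch of Eq (9.9): column rise = V * beta * dT / A.
# The geometry below is assumed for illustration only.

BETA_MERCURY = 1.8e-4  # thermal coefficient of volume expansion of mercury, per degree C


def column_rise(bulb_volume_m3, capillary_area_m2, delta_t_c, beta=BETA_MERCURY):
    """Return the change in column length (meters) for a temperature change delta_t_c."""
    delta_v = bulb_volume_m3 * beta * delta_t_c   # Eq (9.8)
    return delta_v / capillary_area_m2            # Eq (9.9)


V = 0.2e-6    # assumed bulb volume: 0.2 cubic centimeters, in m^3
A = 0.02e-6   # assumed capillary cross-section: 0.02 square millimeters, in m^2

rise = column_rise(V, A, 1.0)
print(f"rise per degree C: {rise * 1000:.2f} mm")
```

With these assumed dimensions, one Celsius degree moves the column by a couple of millimeters, which is easy to read by eye. Doubling V or halving A doubles the rise, as the proportionality says.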

Another practical consideration in a real mercury thermometer is that the glass containing the mercury also expands when the temperature goes up (although its coefficient of thermal expansion is much less). If the cross-sectional area A of the cylindrical volume gets larger, because of this expansion of the glass, then the length ∆ℓ that we read will not be as large for a given ∆V. A properly calibrated thermometer will of course take all this into account in the way its temperature scale is marked off. One thing you may have noticed is that the mercury in a thermometer may first go down before it goes up to a new, higher temperature. The reason is that the glass is warmed first, and it expands. Only later does the energy current J reach the mercury and cause it to expand.

9.3 The Gas Thermometer

The thermal properties of solids and liquids are peculiar to each substance, but gases are all very much alike. This is the basis of a thermometer that is independent of substance. A fixed quantity of gas in thermal equilibrium at a fixed temperature has the property that its pressure P and its volume V obey the simple relation

$$PV = \text{constant} \tag{9.10}$$

This means that if the pressure goes up, the volume goes down, in just such a way that their product is always the same, and this is true for all gases (at least as long as they are not too dense: there may be small deviations from this relationship at high pressure). If the temperature of the thermal equilibrium is raised, then the gas expands (at constant pressure P), so the new value of PV is larger, but once again it is constant at this new value when you change P, since V changes in a corresponding way, as long as the new thermal equilibrium is maintained. Since this product PV depends on temperature, it suggests defining a temperature T by

$$PV \propto T \tag{9.11}$$

where the constant of proportionality is still to be chosen. For a fixed quantity of gas, the choice of constant c in

$$PV = cT \tag{9.12}$$

determines the size of the absolute degree, and this is customarily chosen to be the same size as the Celsius degree. That completely determines T, which is called the Kelvin temperature.

[Figure: PV plotted against temperature; the horizontal axis is marked −273, 0, 100 on the T (Celsius) scale and 0, 273, 373 on the T (Kelvin) scale.]

Fig. 9.2: Extrapolating from measurements of PV at two known Celsius temperatures locates the absolute zero on the Celsius scale. We imagine doing this with two different gas thermometers. The lines are graphs showing proportionality of PV to a new temperature scale (Kelvin) with its zero at −273 °C, as in Eq (9.12). A measurement of PV can thus be interpreted as a Kelvin temperature, although the constant of proportionality c in Eq (9.12) is different for each thermometer.

Unlike the other temperature scales, the absolute (Kelvin) temperature scale has a definite zero, called absolute zero, which is not a matter of choice. It is the temperature at which PV would go to zero! One can determine it by measuring PV at 100 °C (boiling point of water), then at 0 °C (freezing point of water), where the value is less, and finally extrapolating to the Celsius temperature at which it would be zero, as in Fig 9.2. The result is that the absolute zero occurs at about −273.15 °C, which is called 0 K, the K standing for the Kelvin temperature scale, after William Thomson, Lord Kelvin. Thus Kelvin (absolute) temperature K and Celsius temperature C are related by

$$K = 273.15 + C \tag{9.13}$$
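The extrapolation of Fig 9.2 can be sketched in a few lines. The PV readings below are hypothetical numbers in arbitrary units, chosen only to be consistent with an ideal gas; they are not measurements from the text. The second thermometer is assumed to hold twice as much gas.

```python
def absolute_zero_celsius(pv_at_0, pv_at_100):
    """Extrapolate the line through (0 C, pv_at_0) and (100 C, pv_at_100) down to PV = 0."""
    slope = (pv_at_100 - pv_at_0) / 100.0   # change in PV per Celsius degree
    return -pv_at_0 / slope                 # Celsius temperature at which PV vanishes


# Hypothetical readings (arbitrary units) for a first gas thermometer.
pv0, pv100 = 100.0, 136.61
t_zero = absolute_zero_celsius(pv0, pv100)

# A second thermometer with twice the gas has twice the PV at each temperature.
t_zero_big = absolute_zero_celsius(2 * pv0, 2 * pv100)

print(f"absolute zero from thermometer 1: {t_zero:.1f} C")
print(f"absolute zero from thermometer 2: {t_zero_big:.1f} C")
```

Both lines cross PV = 0 at the same Celsius temperature even though their slopes differ, which is the sense in which the zero is not a matter of choice.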

9.4 Avogadro’s Hypothesis

In Fig 9.2 we imagine using two gas thermometers to locate the absolute zero, one with twice the volume V of the other at the same pressure P. Thus the constant of proportionality c in Eq (9.12) is twice as big for one thermometer as for the other. In 1811, Amedeo Avogadro suggested that the volume V of a gas (at constant P), and hence the constant c, is proportional to N, the number of molecules of the gas. This was long before the very existence of molecules was established! This idea was suggested by observations like what happens when water is decomposed into hydrogen and oxygen. There is twice as much hydrogen as oxygen (by volume). Suppose each water molecule consists of two hydrogens and an oxygen (as we now know it does). Then the 2 : 1 ratio of hydrogen to oxygen obtained by electrolysis would show up as a 2 : 1 ratio of hydrogen to oxygen volumes. The two gas thermometers in Fig 9.2 could have been a hydrogen thermometer and an oxygen thermometer, using the hydrogen (upper line) and oxygen (lower line) from decomposing some fixed quantity of water. Accepting Avogadro’s hypothesis, we say that the reason that c is twice as big for the upper line in the graph is that the hydrogen has twice the number N of molecules. Thus for each thermometer c ∝ N, where N is the number of molecules, and writing it with a constant of proportionality, we have c = kN, where k is a constant, the same for all gas thermometers, i.e., all gas samples, independent of amount and substance. This k is now called Boltzmann’s constant, and is often written $k_B$ to distinguish it from all the other uses of the letter k. Thus we write Eq (9.12) as

$$PV = N k_B T \tag{9.14}$$

where N is the number of gas molecules. Eq (9.14) is called the ideal gas law.

When Avogadro made his suggestion, he had no idea how large N might be in a typical gas thermometer, or how to determine it: it would certainly be a very large number. Since P, V, and T would be of order 1 in sensible, laboratory units, $k_B$ must be a very tiny number in those units. We now know

$$k_B \approx 1.38 \times 10^{-23}\ \mathrm{J/K} \tag{9.15}$$

The units here are Joules per Kelvin, Kelvin meaning degrees on the Kelvin scale.

The physical chemists of Avogadro’s day made a different use of his hypothesis. Suppose you take two equal volumes (at the same T and P) of two different (pure) gases, and one of them weighs more. How can that be? Since they each have the same number of molecules, by Avogadro’s hypothesis, it can only be that the molecules of one gas weigh more than the molecules of the other, and you determine their ratio when you weigh them, even without knowing the absolute number of molecules. In the case of hydrogen and oxygen, for example, a volume of oxygen weighs 16 times as much as the same volume of hydrogen (at the same T and P). It had already been noticed that hydrogen seemed to be the lightest gas, so it would make sense, perhaps, to make it the unit of weight for this purpose: molecular weight. This is essentially what we do to this day, only we now (essentially) use the hydrogen atom as the unit, and we know that the hydrogen molecule is H₂, so the hydrogen molecule has molecular weight 2, more or less by definition, and then the oxygen molecule O₂ has molecular weight 16 times that, namely 32.

Turning this observation around, suppose you had 2 grams of hydrogen gas and 32 grams of oxygen gas (just naming their molecular weights in each case). Then you would have the same number of hydrogen molecules as oxygen molecules. The actual number would perhaps be unknown, but whatever it is, it is the same for both. This number is called Avogadro’s number $N_A$, and Avogadro’s number of the molecules of any gas is called 1 mole of that gas, a word derived from ‘molecule’. In practice, this means that a mass equal (in grams) to its molecular weight is 1 mole, for any gas. This definition does not require that we be able to count molecules, or even that we know the value of $N_A$. (Note: the term molecular weight is unfortunate: it is the mass of a mole, in grams, not the weight.) As late as 1900 Avogadro’s number was not known very accurately, and a few people even doubted the very existence of molecules! We now know

$$N_A \approx 6 \times 10^{23} \tag{9.16}$$

If a gas consists of N molecules, then there are $n = N/N_A$ moles of the gas. Thus we can rewrite the ideal gas law Eq (9.14) as

$$PV = nRT \tag{9.17}$$

where n is the number of moles of a gas, and

$$R = N_A k_B \approx 8.314\ \mathrm{J/(K \cdot mole)} \tag{9.18}$$

is the gas constant. The gas constant R can be determined experimentally from measurements on any gas with known molecular weight, without knowing $N_A$, so the version of the ideal gas law in Eq (9.17) was useful long before the version in Eq (9.14).

Here is a numerical example. Suppose we have a 1 liter container of some gas at atmospheric pressure and at room temperature 20 °C. What does it weigh? The given information would only be enough to find how many moles we have, but not to find the weight. To do that we would have to know the molecular weight, to find the mass, and the local value of g, to find the weight. From the ideal gas law the number of moles n is

$$n = \frac{PV}{RT} = \frac{(10^5\ \mathrm{N/m^2})(10^{-3}\ \mathrm{m^3})}{(8.3\ \mathrm{J/(K \cdot mole)})(293\ \mathrm{K})} = 0.041\ \mathrm{mole} \tag{9.19}$$

(Notice how the units work.) This is as much as we can say without more information. If we happen to know that the gas is nitrogen (N₂), with molecular weight MW = 28 g/mole, then the mass is m = n·MW ≈ 1 g. The weight would be $mg \approx (10^{-3}\ \mathrm{kg})(10\ \mathrm{m/s^2}) = 0.01$ N.

Since the air we live in is mostly N₂, at atmospheric pressure, and roughly room temperature, just like this sample, it would be the same computation to find the density of our air, which is therefore about 1 g per liter, or 1 kg/m³, the same estimate we arrived at once before, in section 8.10.1. The method in this section, however, is in principle very precise, since V, P, and T can all be measured very accurately, and the average molecular weight of air is also known very accurately.
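The numerical example above can be reproduced directly. This is only a sketch using the same rounded inputs as the text (atmospheric pressure taken as 10⁵ N/m², g as 10 m/s²); the variable names are ours.

```python
R = 8.314        # gas constant, J/(K*mole), Eq (9.18)
P = 1.0e5        # atmospheric pressure, N/m^2 (rounded, as in the text)
V = 1.0e-3       # 1 liter expressed in m^3
T = 293.0        # 20 C expressed in kelvin (rounded)
MW_N2 = 28.0     # molecular weight of N2, g/mole
G = 10.0         # acceleration of gravity, m/s^2 (rounded)

n = P * V / (R * T)                # moles, from PV = nRT
mass_g = n * MW_N2                 # grams
weight_n = (mass_g / 1000.0) * G   # newtons
density = (mass_g / 1000.0) / V    # kg/m^3

print(f"n       = {n:.3f} mole")
print(f"mass    = {mass_g:.2f} g")
print(f"weight  = {weight_n:.4f} N")
print(f"density = {density:.2f} kg/m^3")
```

Carrying more digits gives a mass slightly above 1 g and a density slightly above 1 kg/m³, consistent with the rounded estimates in the text.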

9.5 Heat Capacity

Even before the flow of heat was well understood, it was possible to quantify it. The amount of heat that raises 1 gram of water by 1 °C was called 1 calorie, abbreviated ‘cal’. When Joule found that the mechanical energy 4.18 J dissipated in water also raises 1 gram of water by 1 °C, it became clear that the calorie is just another unit of energy, and that

$$1\ \mathrm{cal} = 4.18\ \mathrm{J} \tag{9.20}$$

Although the calorie is not the SI unit of energy, it is still in wide use because it is so convenient. Conversion to SI units uses the relation above, of course. The continued use of the calorie poses a subtle pitfall for the unwary, because it suggests that there is something distinct, called “heat”, that is measured in calories, as opposed to other kinds of energy. No no no! That would be phlogiston, and we no longer believe in it. Energy that flows into a system as heat can come out as energy in some other form, because in the end it is just energy, and energy is not trapped in any particular form. And energy added mechanically, as in Joule’s experiment, can raise the temperature just as well as the flow of heat can.

There is a different unit of energy, also called the Calorie, or more properly the “big calorie,” that is 1 kilocalorie, or 1000 calories, abbreviated with a capital C as ‘Cal’.

$$1\ \mathrm{Cal} = 1000\ \mathrm{cal} \tag{9.21}$$

The energy content of food is always given in big calories, so to convert to Joules, you must first convert to calories by multiplying by 1 = 1000 cal/Cal, then multiply by 1 = 4.18 J/cal.

Any energy can be conveniently measured if it can be converted into the internal energy of water, because it will show up as a rise in temperature. Suppose the temperature of 1 kg of water is raised by 2 K. (Since the Kelvin degree and the Celsius degree are the same, we might as well use Kelvins.) Then each gram of water is warmer by 2 K, and since there are 1000 grams, the energy added to the water was 2000 cal, from whatever source. We are using Eq (9.1), which said

$$\Delta E \propto \Delta T \tag{9.22}$$

but now we know the constant of proportionality:

$$\Delta E = C\,\Delta T \tag{9.23}$$

The constant C, called the heat capacity, is 1000 cal/K in the example above. Putting in ∆T = 2 K, we find ∆E = 2000 cal. Clearly, though, the heat capacity C is proportional to the mass M of the water, i.e.

$$C \propto M \tag{9.24}$$

To heat twice as much water by the same 2 K would take twice as much energy. Thus it makes more sense to think of heat capacity as

$$C = Mc \tag{9.25}$$

where the constant of proportionality c has units cal/(K·g), and is called the specific heat. It is really the heat capacity per gram. For water, we know $c_{water}$ = 1 cal/(K·g), by the definition of the calorie. For any substance the specific heat c is a material property, and can be looked up in handbooks.

Suppose a lump of metal at 30 °C is dropped (carefully) into 1 kg of water at 20 °C. Since the two are at different temperatures, and they are in contact, they will exchange internal energy and come to thermal equilibrium. Suppose the temperature of the equilibrium is 22 °C. Then, as we have just noticed, this means that ∆E = 2000 cal for the water, and since energy is conserved, this means that the metal lost 2000 cal, i.e., ∆E = −2000 cal for the metal. (Experimentally, one must carefully insulate the water so that all heat flow is internal to the system, and no heat flows out to the room. This would include even the containing vessel, a bit of an idealization.) Since ∆T = −8 K for the metal, its heat capacity is C = ∆E/∆T = 250 cal/K. This by itself does not tell us very much, but if we weigh the metal and determine that its mass is M = 2700 g, then we find the specific heat of the metal is c = C/M = 0.093 cal/(K·g). This is close to the specific heat of zinc or copper. Perhaps the metal is brass, which is an alloy of the two.

The principle in the example above is that energy is conserved, so that the energy that moves into the water must be the same as the energy that moves out of the metal. Here is another example of that type. Suppose 100 g of copper at 30 °C is dropped into 1 kg of water at 20 °C. The heat capacity of the water is $C_w$ = 1000 cal/K, and the heat capacity of the copper, using the specific heat of copper $c_{cu}$ = 0.092 cal/(K·g), is $C_{cu}$ = (100 g)(0.092 cal/(K·g)) = 9.2 cal/K. The total change in energy is zero, since the energy only moves from one to the other, so

$$0 = \Delta E = C_w\,\Delta T_w + C_{cu}\,\Delta T_{cu} \tag{9.26}$$

and thus

$$\frac{\Delta T_{cu}}{\Delta T_w} = -\frac{C_w}{C_{cu}} \tag{9.27}$$

This says the temperature changes are in inverse ratio to the heat capacities. The minus sign means one temperature goes down while the other one goes up (we knew that). The large heat capacity changes less in temperature, while the small heat capacity changes more. In fact, in this example, the right hand side is very large in magnitude, more than 100, because $C_w$ is so much greater than $C_{cu}$. That means that almost all the temperature change is the copper cooling off. The temperature change of the water is less than 1/100 that of the copper, a very slight warming up.

A large quantity of water, like the one in this example, is sometimes called a ‘heat bath’, because it maintains a nearly constant temperature due to its large heat capacity, even while it exchanges energy with smaller systems. By analogy, any large heat capacity, like a large enough block of copper, could be a ‘heat bath’, controlling the temperature of an experiment.
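Both calorimetry examples above reduce to a few lines of arithmetic. The sketch below simply repeats the numbers given in the text; the variable names are our own.

```python
C_WATER = 1.0   # specific heat of water, cal/(K*g), by definition of the calorie

# Example 1: unknown lump of metal (2700 g, initially 30 C) dropped into
# 1 kg of water at 20 C; equilibrium reached at 22 C.
m_water = 1000.0                                  # g
de_water = m_water * C_WATER * (22.0 - 20.0)      # energy gained by the water, cal
de_metal = -de_water                              # energy conservation
heat_capacity_metal = de_metal / (22.0 - 30.0)    # C = dE/dT, cal/K
specific_heat_metal = heat_capacity_metal / 2700.0   # c = C/M, cal/(K*g)

# Example 2: ratio of temperature changes for 100 g of copper in 1 kg of water.
c_w = m_water * C_WATER        # heat capacity of the water, cal/K
c_cu = 100.0 * 0.092           # heat capacity of the copper, cal/K
ratio = -c_w / c_cu            # Delta T_cu / Delta T_w, Eq (9.27)

print(f"metal: C = {heat_capacity_metal:.0f} cal/K, c = {specific_heat_metal:.3f} cal/(K*g)")
print(f"copper/water temperature-change ratio: {ratio:.1f}")
```

The ratio in the second example comes out a little beyond −100, which is why the water barely warms while the copper does nearly all of the changing.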

In the example above we can solve for the final equilibrium temperature, which we already know will be approximately the original temperature of the water, 20 °C. Let $T_w$, $T_{cu}$, and $T_e$ be the initial temperatures of the water and the copper, and the final equilibrium temperature. Then Eq (9.26) says

$$0 = (T_e - T_w)\,C_w + (T_e - T_{cu})\,C_{cu} \tag{9.28}$$

(We wrote the temperature changes as the final temperature minus the initial temperature in each case.) Then, solving algebraically for Te, we find

$$T_e = \frac{T_w C_w + T_{cu} C_{cu}}{C_w + C_{cu}} \tag{9.29}$$
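Eq (9.29) is easy to evaluate numerically. The following sketch uses the copper-in-water numbers from the example; the function name is ours, chosen for illustration.

```python
def equilibrium_temperature(t_a, c_a, t_b, c_b):
    """Weighted average of two initial temperatures, weighted by heat capacity (Eq 9.29)."""
    return (t_a * c_a + t_b * c_b) / (c_a + c_b)


T_w, C_w = 20.0, 1000.0    # water: initial temperature (C), heat capacity (cal/K)
T_cu, C_cu = 30.0, 9.2     # copper: initial temperature (C), heat capacity (cal/K)

T_e = equilibrium_temperature(T_w, C_w, T_cu, C_cu)

# Check against Eq (9.28): the energy changes of the two bodies cancel.
energy_balance = (T_e - T_w) * C_w + (T_e - T_cu) * C_cu

print(f"T_e = {T_e:.2f} C")
print(f"water change  = {T_e - T_w:+.2f} K")
print(f"copper change = {T_e - T_cu:+.2f} K")
```

The energy balance evaluates to zero up to rounding error, confirming that the weighted average is exactly the temperature at which the energy lost by the copper equals the energy gained by the water.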

This is a weighted average of the initial temperatures, so it is somewhere in the middle. The weights (multiplying $T_w$ and $T_{cu}$) are the heat capacities ($C_w$ and $C_{cu}$), and we have already noticed that $T_w$ is heavily weighted, since $C_w$ is much larger than $C_{cu}$. That is why the average comes out to be very close to $T_w$. In fact, evaluating the expression, we have $T_e$ = (20 · 1000 + 30 · 9.2)/(1000 + 9.2) = 20.09 °C. The change in temperature of the copper, −9.91 °C, is more than 100 times the change in temperature of the water, 0.09 °C.

9.6 Molar Heat Capacities

Specific heat, or heat capacity per gram, is the practical measure of heat capacity, but the more fundamental measure is heat capacity per mole. We convert specific heat c to molar specific heat $c_M$ simply by multiplying by the molecular weight MW (i.e., grams/mole). Right away we find something quite amazing: many, if not most, solids have the same molar specific heat, around 6 cal/(K·mole) ≈ 3R, where R is the gas constant, Eq (9.18). (Note that the gas constant R does have the right units to be a molar specific heat.) This peculiar fact, called the Law of Dulong and Petit, is illustrated in the table below:

Material    c (cal/(K·g))    MW (g/mole)    cM (cal/(K·mole))
Al          0.216            27.0           5.83
Cu          0.092            63.5           5.85
Fe          0.107            55.85          5.98
Au          0.031            197            6.09
Pb          0.031            207            6.40

What does it mean? Let us interpret this by idealizing it slightly, taking the molar heat capacity to be $c_M = 3R$, the same for all solids. Imagine we have two solids, X and Y, in thermal equilibrium, and suppose that X consists of $n_X$ moles and Y consists of $n_Y$ moles. Then their heat capacities are $C_X = n_X c_M$ and $C_Y = n_Y c_M$. If we add energy ∆E to the system, then some of it, $\Delta E_X$, goes to X and the rest, $\Delta E_Y$, goes to Y. But since they are in thermal equilibrium, the temperature change ∆T is the same for both. Thus $\Delta E_X = n_X c_M \Delta T$ and $\Delta E_Y = n_Y c_M \Delta T$, so

$$\frac{\Delta E_X}{\Delta E_Y} = \frac{n_X}{n_Y} = \frac{N_X}{N_Y} \tag{9.30}$$

That is, each solid gets energy in proportion to the number of moles it contains, which is proportional to the number of molecules it contains ($N_X$ and $N_Y$). For example, if X has twice as many molecules as Y, then X gets twice as much energy as Y. On average, then, each molecule gets the same energy! This is called equipartition of energy, and gives a very simple picture of what is happening microscopically. The energy spreads out equally over all the molecules.
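The Law of Dulong and Petit can be checked against the table by multiplying each specific heat by its molecular weight and comparing with 3R. In this sketch R is converted to calories using 1 cal = 4.18 J from Eq (9.20); because the tabulated specific heats are rounded, the products may differ from the table’s last column in the final digit.

```python
R_CAL = 8.314 / 4.18    # gas constant in cal/(K*mole), using Eq (9.18) and Eq (9.20)

# (material, specific heat in cal/(K*g), molecular weight in g/mole), from the table
TABLE = [
    ("Al", 0.216, 27.0),
    ("Cu", 0.092, 63.5),
    ("Fe", 0.107, 55.85),
    ("Au", 0.031, 197.0),
    ("Pb", 0.031, 207.0),
]

# Molar specific heat c_M = c * MW, and its ratio to the idealized value 3R.
molar = {name: c * mw for name, c, mw in TABLE}
ratios = {name: c_m / (3.0 * R_CAL) for name, c_m in molar.items()}

for name in molar:
    print(f"{name}: c_M = {molar[name]:.2f} cal/(K*mole), c_M / 3R = {ratios[name]:.3f}")
```

All five ratios come out within several percent of 1, even though the specific heats per gram differ by a factor of about seven.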

The molar heat capacities of gases are different from those of solids, but again, they show a surprising simplicity. For noble gases, like helium and argon, in a container of definite, fixed volume, the molar heat capacity is just half that of solids, namely $\frac{3}{2}R$. This means that if 1 mole of helium is in equilibrium with 1 mole of aluminum, then when you add a little energy ∆E, twice as much goes to the aluminum as to the helium. Each aluminum atom gets twice as much energy, on average, as a helium atom, when the system reaches its equilibrium temperature. Why? This does not seem like equipartition. Why do the helium atoms not get their share? The main constituents of the air, N₂ and O₂, each have molar heat capacity about $\frac{5}{2}R$ in a container of fixed volume. This is almost as large as that of a solid, 3R, so each molecule of the air would get almost as much energy as each molecule of a solid with which it is in equilibrium. In the next section we give a simple statistical model of what is happening here.

9.7 Statistical Model for Molar Heat Capacity

Statistical mechanics takes the view that internal energy is just the familiar kinds of mechanical energy, kinetic energy and potential energy, but at a microscopic scale that we can’t see, and continually exchanged among the molecules in a random fashion. Thus the internal energy of a gas would just be the kinetic energy of the molecules. Their interaction with their container would be collisions at the wall, where they might pick up some energy or lose some energy, but statistically they would have a definite average energy over time. The molecules might also collide with each other, redistributing the energy randomly over all the molecules of the gas.

The kinetic energy of a single molecule of mass m is

$$K = \frac{1}{2} m \left(v_x^2 + v_y^2 + v_z^2\right) \tag{9.31}$$

This is just Eq (6.14), but $v^2$, the speed squared, is the sum of three terms, referring to the components of velocity along the x, y, and z directions. For a real molecule in a real gas these terms will be rapidly changing in time, but the statistical theory gives a very simple rule for their average at temperature T, denoted by angle brackets ⟨ ⟩:

$$\left\langle \frac{m v_x^2}{2} \right\rangle = \left\langle \frac{m v_y^2}{2} \right\rangle = \left\langle \frac{m v_z^2}{2} \right\rangle = \frac{1}{2} k_B T \tag{9.32}$$

where $k_B$ is Boltzmann’s constant. Thus the average kinetic energy ⟨K⟩ of one molecule is the sum of three equal terms, giving

$$\langle K \rangle = \frac{3}{2} k_B T \tag{9.33}$$

and the internal energy E of 1 mole of a gas is just the total energy of Avogadro’s number of such molecules,

$$E = \frac{3}{2} k_B T N_A = \frac{3}{2} R T \tag{9.34}$$

where R is the gas constant. This gives a simple relationship between the internal energy E and the temperature T of a gas. A change of temperature by ∆T implies a change of energy $\Delta E = \frac{3}{2} R\,\Delta T$, and the molar heat capacity, the constant of proportionality in this relationship, is $\frac{3}{2}R$, just what is observed for the inert gases.

The number “3” in the molar heat capacity $\frac{3}{2}R$ comes from the three quadratic terms in the kinetic energy ⟨K⟩ of a molecule. Those, in turn, were there because the molecule is free to move in any of three dimensions of space. We say the molecule has 3 degrees of freedom, each represented by a quadratic term in the energy. Now we ask ourselves how it can be that the air has molar heat capacity $\frac{5}{2}R$, not $\frac{3}{2}R$. Could there be 5 degrees of freedom for air molecules instead of 3? Yes! The molecules of the air are N₂ and O₂, each molecule containing two atoms. Each atom has its own kinetic energy (that would be 3 degrees of freedom for each, 6 degrees of freedom in all), except that along the line connecting them they must have the same component of velocity, because they stay together, so that leaves only 5 degrees of freedom. The two new degrees of freedom correspond to rotation of the molecule. Thus the molar heat capacity is $\frac{5}{2}R$, and a measurement of heat capacity, which is a laboratory scale measurement, is really telling us something about the geometry of the molecules, and that they are rotating!

The molecules of a solid cannot translate along independently, like the molecules of a gas, but they can still have kinetic energy K if they are oscillating in place about their (mechanical) equilibrium positions. They must, in effect, be on springs that provide a restoring force and always push them back where they belong in the solid. A simple model of this situation says that each molecule is a simple harmonic oscillator with energy

$$K + U_S = \frac{1}{2} m \left(v_x^2 + v_y^2 + v_z^2\right) + \frac{1}{2} k \left(x^2 + y^2 + z^2\right) \tag{9.35}$$

Note there are 3 degrees of freedom for velocity, because the molecule can be moving in any direction, and 3 degrees of freedom for position, because the molecule can be displaced in any direction. With 6 degrees of freedom in all, each represented by a quadratic term in the energy, the average energy at temperature T of a molecule is

$$\langle K + U_S \rangle = 6 \times \frac{1}{2} k_B T = 3 k_B T \tag{9.36}$$

and the molar internal energy is

$$E = N_A \langle K + U_S \rangle = 3 N_A k_B T = 3RT \tag{9.37}$$

corresponding to molar heat capacity 3R.

Now we can see why the molecules of a solid get more energy than the molecules of an inert gas when the two substances are in equilibrium. The solid has “more places to put the energy”, having both kinetic and potential energy. The gas molecules have only kinetic energy. There is an equipartition of energy, but it is an equipartition among the degrees of freedom, not among the molecules. Each degree of freedom gets energy $\frac{1}{2} k_B T$ at temperature T.

It is instructive to look at water in this connection. We know the specific heat of water is 1 cal/(K·g), by definition of the calorie, and the molecular weight of H₂O is 1 + 1 + 16 = 18 g/mole. Thus the molar specific heat of water is 18 cal/(K·mole), or about 9R. This is an enormous molar specific heat! It suggests that the water molecule, in the liquid state, has 18 degrees of freedom. What could they all be? It is true that the molecule consists of 3 atoms, but it is as if they could all move independently as oscillators, in every direction, as if the molecule didn’t constrain them at all. It suggests that the water molecule, as a constituent of the liquid state, is remarkably dissociated. Most molecules are quite rigid, like N₂ and O₂ in the atmosphere, which behave like rigid rotators, with one degree of freedom suppressed. The water molecule by itself is rigid like any other small molecule, but in the liquid state it seems to lose this rigidity. The anomalously large heat capacity of water is still not really understood. The problem hinted at here is sometimes called the problem of the “structure of water.”
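The counting rule of this section (each quadratic degree of freedom contributes ½R to the molar heat capacity) can be written as a short sketch. The function names are ours, and the conversion of water’s molar specific heat uses 1 cal = 4.18 J from Eq (9.20).

```python
R = 8.314          # gas constant, J/(K*mole)
J_PER_CAL = 4.18   # Eq (9.20)


def molar_heat_capacity(degrees_of_freedom):
    """Equipartition estimate: each quadratic degree of freedom contributes R/2."""
    return 0.5 * degrees_of_freedom * R


def implied_degrees_of_freedom(c_molar):
    """Invert the rule: how many degrees of freedom a measured molar heat capacity suggests."""
    return 2.0 * c_molar / R


for label, dof in [("noble gas", 3), ("N2 or O2", 5), ("simple solid", 6)]:
    print(f"{label}: {dof} degrees of freedom -> {molar_heat_capacity(dof):.1f} J/(K*mole)")

# Liquid water: 18 cal/(K*mole), converted to J/(K*mole).
c_water_molar = 18.0 * J_PER_CAL
dof_water = implied_degrees_of_freedom(c_water_molar)
print(f"water: {c_water_molar:.1f} J/(K*mole) -> about {dof_water:.1f} degrees of freedom")
```

Run in reverse on water, the rule returns a value very close to 18, which is the number the text finds so puzzling for a three-atom molecule.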

9.8 Phase Transitions

We defined the heat capacity to be the constant of proportionality in Eq (9.1), ∆E ∝ ∆T. But suppose we added a little energy ∆E to a substance, intending to measure the corresponding ∆T, and we found ∆T = 0: we would conclude that the heat capacity C was infinite! Such a system would be a perfect heat bath, capable of absorbing energy and not changing its temperature T at all. That sounds like some kind of idealization, but it is actually completely commonplace. A mixture of ice and water in equilibrium at 0 °C is such a thing. This is the temperature at which ice and water coexist. When you add energy ∆E, it doesn’t go into raising the temperature. Rather it goes into converting some ice into liquid water, at the same temperature. This melting transition requires energy. If we try to warm the water, by putting it in contact with something warmer and allowing heat to flow in, but keeping the ice and water well stirred and in equilibrium with each other, then heat flows at 0 °C from the water to the ice, melting the ice. Similarly, if you remove energy from the ice water mixture, by bringing it into contact with something colder, for example, its temperature does not go down. Rather, as heat flows out of the system, liquid water converts to ice at the same temperature. This is a flow of heat at low temperature, without any change in temperature: a concept sufficiently different from ordinary language to require some thought.

A transition from one state to another, differing in molar internal energy at the transition temperature, is called a first order phase transition. The internal energy difference between the two phases is called the latent heat. In the case of a solid/liquid transition, it is called the latent heat of fusion (from the Latin word for melting). The latent heat of fusion for water is 80 cal/g, or 1440 cal/mole. The liquid/gas transition is another first order phase transition. The latent heat of vaporization for water is 540 cal/g, or 9720 cal/mole. These are surprisingly large energies. The 80 calories necessary to melt a gram of ice at 0 °C would have warmed 1 gram of melt water almost to boiling. Where does all this energy go?
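To put the latent heats in perspective, the sketch below compares them with the energy needed simply to warm liquid water, using only the figures quoted above. The variable names are ours.

```python
L_FUSION = 80.0          # latent heat of fusion of water, cal/g
L_VAPORIZATION = 540.0   # latent heat of vaporization of water, cal/g
C_WATER = 1.0            # specific heat of liquid water, cal/(K*g)
MW_WATER = 18.0          # molecular weight of water, g/mole

# Temperature rise that the melting energy would produce in 1 g of liquid water.
equivalent_rise = L_FUSION / C_WATER          # kelvin

# Energy to take 1 g of water from 0 C to 100 C, for comparison.
warm_0_to_100 = C_WATER * 100.0               # cal

molar_fusion = L_FUSION * MW_WATER            # cal/mole
molar_vaporization = L_VAPORIZATION * MW_WATER   # cal/mole

print(f"melting 1 g of ice takes as much energy as warming 1 g of water by {equivalent_rise:.0f} K")
print(f"boiling away 1 g takes {L_VAPORIZATION / warm_0_to_100:.1f} times the energy of warming it from 0 C to 100 C")
print(f"per mole: fusion {molar_fusion:.0f} cal, vaporization {molar_vaporization:.0f} cal")
```

Melting costs nearly as much as the whole trip from freezing to boiling, and vaporizing costs several times more.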

Fig 6.11 illustrates how a water molecule might be bound to a neighboring molecule, at a fairly well defined distance, in the solid, ice. Its interaction with its neighbor is described by a potential energy, and it is in a low energy state, nearly the minimum of that potential energy. Without acquiring more energy, it can only oscillate by a small amount, between turning points that are very close together. But with more energy, it reaches the higher states, with larger amplitude oscillations, and with even more energy it can break free altogether. Thus it is a plausible model that the energy we add to melt a solid goes into exciting the molecules to higher oscillator states until the bound structure can no longer hold together. The melting transition is not well understood, however.

In the vaporization transition, the molecules must end up with the average kinetic energy of a gas at the transition temperature. By the equipartition argument, though, they already have this energy! Thus the internal energy needed to vaporize a liquid must be used to separate molecules from each other, much as in the melting transition. That the latent heat of vaporization is much larger than the latent heat of fusion tells us that the molecules are energetically bound to their neighbors in the fluid almost as strongly as they are in the solid.
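The latent heats quoted for water can be checked with a few lines of arithmetic. This is only a sketch: the molar mass of 18 g/mole and the specific heat of 1 cal/(g K) for liquid water are assumptions supplied here, consistent with the figures in the text but not stated there.

```python
# Latent heats of water as quoted in the text, in cal/g.
LATENT_FUSION = 80.0
LATENT_VAPORIZATION = 540.0

# Assumptions supplied for this sketch (not stated in the text).
MOLAR_MASS_WATER = 18.0      # g/mole
SPECIFIC_HEAT_WATER = 1.0    # cal/(g K)


def per_mole(latent_cal_per_g, molar_mass=MOLAR_MASS_WATER):
    """Convert a latent heat from cal/g to cal/mole."""
    return latent_cal_per_g * molar_mass


fusion_molar = per_mole(LATENT_FUSION)
vaporization_molar = per_mole(LATENT_VAPORIZATION)

# Temperature rise of 1 g of liquid water given the energy that melts 1 g of ice.
warming = LATENT_FUSION / SPECIFIC_HEAT_WATER

# How much more energy it takes to vaporize than to melt the same amount.
ratio = LATENT_VAPORIZATION / LATENT_FUSION

print(f"latent heat of fusion:       {fusion_molar:.0f} cal/mole")
print(f"latent heat of vaporization: {vaporization_molar:.0f} cal/mole")
print(f"80 cal would warm 1 g of water by {warming:.0f} K")
print(f"vaporization / fusion = {ratio:.2f}")
```

The ratio of about 7 is the numerical content of the statement that the latent heat of vaporization is much larger than the latent heat of fusion.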

A technique called differential scanning calorimetry (DSC) looks for phase transitions systematically in samples of newly synthesized compounds. It starts at a low temperature, adds a little energy ∆E, and monitors the corresponding change in temperature ∆T. The ratio ∆E/∆T is the heat capacity C. It does this again and again, gradually raising the temperature and keeping track of C. If we come to a phase transition, the ratio C is suddenly very large, because the energy doesn’t go into raising the temperature, but rather into latent heat. Thus in the scan, which might be automated and simply produce a graph of C at each temperature, there would be spikes at the transition temperatures.

You might think this technique would be quite superfluous, because surely one can look at a sample and see if it has melted or vaporized! The point is, though, these aren’t the only phase transitions. There are whole classes of compounds, including biologically interesting compounds like lipids, that show transitions between liquid crystalline phases. These are different liquid phases, so they might look just the same to the casual eye, but the heat capacity shows that phase transitions occur at special transition temperatures. Naturally one wonders what is happening microscopically! Typically the molecules are partially ordering in some way, but not becoming as perfectly ordered as they are in a crystal. In a material that forms a nematic liquid crystalline phase, for example, the molecules are elongated, and quite rigid. In the normal liquid phase they are densely packed together, but completely disordered in orientation. At the transition to the liquid crystalline phase (going down in temperature) they acquire a statistical tendency to point in the same direction, even though they still slide past each other like the molecules in any liquid.
In the DSC measurement, ∆E added at the transition temperature goes to break the orientational order, not to raise the temperature. The name “liquid crystal” very neatly captures the idea of partial ordering within a fluid phase.
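The scanning procedure described above can be imitated with a toy model. The sketch below is not a model of a real calorimeter: it takes 1 g of ice at −10 °C, adds energy in equal steps, and records C = ∆E/∆T at each step. The function names are hypothetical, and the specific heats (0.5 cal/(g K) for ice, 1 cal/(g K) for water) are approximate values assumed for illustration; only the 80 cal/g latent heat comes from the text.

```python
import math


def temperature_after(energy, mass=1.0, t_start=-10.0, t_melt=0.0,
                      c_ice=0.5, c_water=1.0, latent=80.0):
    """Temperature (deg C) of a sample that began as ice at t_start,
    after `energy` calories have been added to it."""
    e_melt_begins = mass * c_ice * (t_melt - t_start)
    e_melt_ends = e_melt_begins + mass * latent
    if energy <= e_melt_begins:          # still all ice, warming up
        return t_start + energy / (mass * c_ice)
    if energy <= e_melt_ends:            # ice and water coexist
        return t_melt
    return t_melt + (energy - e_melt_ends) / (mass * c_water)


def dsc_scan(d_energy=0.5, n_steps=190):
    """Return (temperature, C) for each step, where C = dE/dT.
    A step with no temperature change is recorded as infinite C."""
    scan = []
    for k in range(n_steps):
        t0 = temperature_after(k * d_energy)
        t1 = temperature_after((k + 1) * d_energy)
        c = d_energy / (t1 - t0) if t1 > t0 else math.inf
        scan.append((t0, c))
    return scan


scan = dsc_scan()
spike = [t for t, c in scan if math.isinf(c)]
print("C below the transition:", scan[0][1], "cal/K")
print("C above the transition:", scan[-1][1], "cal/K")
print("steps with no temperature change:", len(spike))
print("energy absorbed in the spike:", len(spike) * 0.5, "cal")
```

In this idealization the spike is infinitely tall; what is finite is the energy absorbed while the temperature stands still, and that energy is the latent heat.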

9.9 Entropy

Words like “order” and “break the orientational order” at the end of the last section have a precise technical meaning in the concept of entropy. The entropy S of a system is a definite, numerical measure of its disorder. When heat flows into a system, its entropy increases, as you might suppose.

In the differential scanning calorimeter, we did not say how the energy ∆E was added to the sample, but suppose it were added as heat. To emphasize the way the energy was added, as a flow of heat and not in some other way, ∆E is called ∆Q in such a case. This should not be read as a “change in Q” however, as there is no quantity Q to change. Rather it is a change in E, the energy, by the method of heat flow. Keeping the meaning of ∆Q firmly in mind, we can now define how much the disorder S increases when heat flows into a system at temperature T. Amazingly, the symbol ∆S really means the change in a well defined quantity S. It is

$$\Delta S = \frac{\Delta Q}{T} \tag{9.38}$$

By keeping track of the entropy changes ∆S in the quantity S, starting from some convenient state, one can know the entropy S of a system in any equilibrium state. The discovery that there is such a quantity S, defined by Eq (9.38), was one of the great surprises of 19th century physics. Its interpretation as molecular disorder came even later. It still seems a peculiar and elusive concept.

Only in the 20th century did it become clear that S = 0 at T = 0, that is, the disorder is zero at absolute zero. This is the Third Law of Thermodynamics. Of course the interpretation of S as disorder makes this very plausible. Knowing S at T = 0, Eq (9.38) tells us how to find it at any other temperature, because as we add heat (and therefore disorder), the sample warms up.

The simplest place to understand Eq (9.38) is precisely at a first order phase transition, because in that situation, as we add energy in the form of heat, the temperature T doesn’t change.
That means we can say exactly how much a mole of water increases its disorder when it melts: it is just the latent heat of fusion divided by the transition temperature,

$$(\Delta S)_{\text{fusion}} = \frac{(\Delta Q)_{\text{fusion}}}{T_{\text{melt}}} \approx \frac{1440\ \text{cal/mole}}{273\ \text{K}} = 5.27\ \text{cal/(K·mole)} = 2.65\,R \tag{9.39}$$

Notice that molar entropy can be expressed as a multiple of the gas constant R, which has the same units. Does the disorder increase by more than this or less when water boils? Do other substances behave similarly?

Problems

Thermometers

9.1 Most materials expand as they are heated and contract as they are cooled, but a notable exception to this behavior is water near its freezing point.

(a) What familiar phenomenon, observable on lakes in cold climates in the winter, makes it certain that water does not contract when it freezes, like most materials, but rather expands? Explain carefully.

(b) This expansion of ice just continues a trend that is also visible in the liquid state: water expands as it is cooled in the narrow temperature range from 4° to 0° (Celsius). Above 4° water expands as it is heated, like most other materials. Sketch a qualitative graph (no need to look up real data) of the volume of a fixed mass of water between 0° and 10°, labelling the axes. Describe how to use the volume of the water as a thermometer in this temperature range, and what difficulty you might have with such a thermometer.

9.2 (a) Design a mercury thermometer, small enough to fit most of it under a human tongue, such that 1 °F corresponds to a change in length of 1 cm in the visible mercury column.

(b) The volume expansion coefficient of alcohol is 1.1 × 10⁻⁴/°C. How would your thermometer perform if alcohol were substituted for mercury?

Gas Thermometers

9.3 (a) What is the molecular weight of air, if it is 80% nitrogen (molecular weight 28) and 20% oxygen (molecular weight 32)? Explain your reasoning clearly.

(b) Find the temperature if a mass of 5 g of air occupies 2000 cm³ at atmospheric pressure 10⁵ Pa.

(c) The computation in (b) assumes that the air is a gas. Check consistency: how does the temperature you found compare with the liquefaction temperatures of oxygen and nitrogen?

9.4 We have used the convenient approximation 1 kg/m³ for the density of air under normal conditions. Describe how to determine this density more accurately, and use this method to find a better value.

9.5 A thin, tough beach ball of diameter 40 cm is inflated to a pressure of 5 atmospheres with air at room temperature. Before it was blown up it weighed 1 N. What does it weigh after it is blown up? Include a free body diagram, clearly labelled. (Don’t neglect to consider buoyancy).

9.6 One frequently hears that 1 mole of gas at standard temperature and pressure occupies a volume 22.4 liters. Explain why this is true.

Heat Capacity

9.7 (a) 50 g of copper at 100 °C and 50 g of lead at 0 °C are brought together and heat flows from the copper to the lead until they reach the same temperature. Assume both metals obey the Law of Dulong and Petit. The molecular weight of copper is about 64 and of lead is about 207. Find the equilibrium temperature.

(b) Why is the resulting temperature not 50 °C?

(c) More generally, if equal masses m of two different materials, substance A with molecular weight (MW)_A at temperature T_A and substance B with molecular weight (MW)_B at temperature T_B, exchange heat until they come to equilibrium, what will be the final temperature (assuming the Law of Dulong and Petit)?

9.8 By how many degrees should a lake warm up in one day if the main energy input is solar energy, with an energy current density of 1 kW/m²? Assume all this energy is captured by the lake, and ignore other energy inputs and losses. Since the lake is horizontal, and not oriented perpendicular to the Sun’s rays, we are overestimating the energy current density on the lake surface, so assume the day is effectively only 4 hours long to correct for this. Also assume the water is mixed to a depth of 2 m, but the water below does not get any of the energy input.

9.9 Can you warm up a nail appreciably by hitting it with a hammer? Estimate the kinetic energy of a hammer and assume it is all delivered to a 5 g nail made of steel. (You can take the molecular weight to be about 60 and assume the Law of Dulong and Petit.) What is the change in temperature per blow?

Chapter 10

Thermodynamics

We can warm ourselves at a fire: that’s been obvious since we lived in caves. But we can also get a fire to do things for us: that is a recent discovery. Hero of Alexandria, an author about whom we know virtually nothing, and who might have lived any time between the 2nd century B.C.E and the 3rd century C.E., describes toy-like devices that operated by steam power. This is a hint that the Alexandrian Greeks might have been on the verge of their own industrial revolution before the Romans got there. We will never know about this. It seems unlikely they would have been satisfied with just toys if they had had more time. When our industrial revolution began, in the 18th century, people quickly tried to improve the early engines, to make them more powerful and efficient. In doing this they were investigating, without even fully realizing it, fundamental questions about the flow of energy. In Section 9.9 we emphasized that energy might be transferred to an object as the flow of heat, and gave this particular kind of transfer the name ∆Q. We could also transfer the same energy to a similar object by lifting it up higher: this gives it additional gravitational potential energy. These two ways of transferring energy are clearly quite different though. In particular the first one increases the entropy S of the object (its internal disorder), while the second one doesn’t. If the second object were to convert its extra gravitational potential energy to internal energy, perhaps by falling onto the floor, then it might indirectly end up in the same state as the first one. The first one could not so spontaneously use its extra energy to join the second one on its high shelf, however.


A transfer of energy that is not the flow of heat – roughly speaking – is said to be mechanical work. Mechanical work involves just a few degrees of freedom: when we lift something up, for example, we are operating with only one degree of freedom, the height. The flow of heat, on the other hand, spreads energy over Avogadro’s number of degrees of freedom. This distinction, involving many degrees of freedom or few, seems to be crucial.

10.1 Work

There is a simple interpretation of how a mass m at height h got its mechanical energy U_g = mgh. Someone put it there, and in doing so gave it that energy. Similarly, if a spring has energy (1/2)kx², it is because someone has compressed or stretched it by a distance x, and has thereby given it that energy. In both cases, we envision a process, of lifting, or compressing, that transfers energy to the mass m, or the spring. To lift something you need to support its weight mg as you move it up through a distance h. The energy you give to m is just the product of the force mg you exert (up, to support it) and the displacement h (also up, since you are lifting it). This is a recipe for finding gravitational potential energy mgh, as the work you would do to place m at height h, where

Your Work on m = Force up × Displacement up (10.1)

The mass has energy U_g = mgh because you did that much work on it. At the same time, it did negative work on you! It pushes down with force mg on your hand, while your hand moves up a distance h. Since the force in this case is opposite to the displacement, the work is negative. You have lost energy mgh while the mass m has gained energy mgh. It is much less obvious that you have lost energy, but m has clearly gained energy, and that energy must have come from somewhere. The whole process is really a transfer of energy from you to m. When things push on each other, they are generally doing work on each other, either positive or negative, and hence transferring energy.

When you compress a spring by a distance x, you must push as hard as the spring pushes on you, that is, with the force F = kx. The work you do as you push through distance x is just the potential energy the spring acquires, namely U_s = (1/2)kx². You can think of the work you do as energy stored in the spring. That energy has been transferred from you to the spring, so you have lost that much energy: the spring did negative work on you, because it pushed one way, but the displacement was the other way. The factor 1/2, which might seem surprising, is explained to some extent in Fig 10.1. For a constant force, the rule Work = Force × Displacement gives the area under the graph of force vs. displacement, and the area under the graph turns out to be the appropriate generalization when the force changes with displacement, like the spring force F = kx, which is proportional to displacement. For the spring that area is a triangle of base x and height kx, hence the factor 1/2. We emphasize that the work done on a spring in this sense exactly agrees with the potential energy U_s of the spring.

[Figure 10.1: two graphs of force F vs. displacement. Left: the constant force mg plotted against height y, enclosing area mgh up to y = h. Right: the spring force kx plotted against x, enclosing area kx²/2.]

Fig. 10.1: The work done in lifting a weight or compressing a spring is the area under the graph of force vs. displacement. This generalizes the simple rule Force × Displacement.

10.2 P∆V Work

In the context of thermodynamics the most important work done by a gas is the work it does in expanding or contracting its volume. In steam engines the gas in question would be the steam, and its expansion is the whole point of the engine: that is where it does something useful. The gas is at a pressure P, and hence exerts a normal force F = PA on every area A of the containing vessel. To change the volume, some part of the area A must move normally through a displacement ∆ℓ. Then the work done would be PA∆ℓ = P∆V, recognizing the change in volume ∆V = A∆ℓ. The simplest geometry for this is a piston, as in Fig 10.2. There is only a single degree of freedom, the position of the piston in the cylinder.

As we are imagining it, the gas does positive work on the piston, because it pushes the piston in the same direction as the displacement. Notice, however, that the piston does negative work on the gas, because the piston pushes inward, by Newton’s Third Law, while the displacement that we have drawn is outward, opposite in direction. This means that the energy of the gas in the cylinder goes down. That energy has been transferred to the piston in the positive work that the gas did. In case the pressure P changes as the volume changes, the meaning of P∆V is the area under the graph of P vs. V, just like the work in other situations (see Fig 10.1).

[Figure 10.2: gas in a cylinder pushing with force PA on a piston, which moves through a displacement ∆ℓ.]

Fig. 10.2: The gas does work P∆V on the piston when it expands by ∆V = A∆ℓ. The piston does work −P∆V on the gas. If P changes appreciably during the process, P∆V should be understood as the area under the P vs. V graph.

10.3 Various Processes

We will consider various processes involving the gas in the piston in Fig 10.2. The gas is really a kind of metaphor for any system that can do work, have work done on it, receive heat, or lose heat, which means any macroscopic system at all. We are really thinking more abstractly than it may appear about how energy moves around, because the details of the system will play no essential role. At the same time, gas in a cylinder is a very concrete and practical example of what the theory means. The basic observation is that in a process in which the gas does work P ∆V , and receives a flow of heat ∆Q, its internal energy changes by

∆E = ∆Q − P ∆V (10.2)

The minus sign is because when the gas expands (∆V > 0), it does work on the piston, but the piston does negative work on it. Both ∆E and ∆V refer to the gas.

10.3.1 Adiabatic Process: ∆Q = 0

As we noticed in Eq (9.3), heat flows in response to a temperature difference, but the constant of proportionality, the conductance κ, may be large or small. There is nothing to prevent κ from being very small, or to prevent us from doing the process quickly, so that there is little time for heat to flow. Under these conditions it is a reasonable model to assume ∆Q = 0. We must still carry out the process smoothly, so that the different parts of the gas stay in thermal equilibrium with each other. Such a process is called adiabatic. Under these conditions, ∆E = −P∆V. The internal energy of the gas goes down as the gas expands, and the temperature T also goes down. Since PV ∝ T goes down, and V goes up, P certainly goes down. Thus the work P∆V must be understood as the area under the P vs. V graph, as in Fig 10.3. The work done on the piston comes from the internal energy of the gas. This example clearly shows that internal energy, spread over Avogadro’s number of degrees of freedom, can become energy associated with a single degree of freedom. The motion of the piston could be used to raise a weight, for example.

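[Figure 10.3: graph of P (vertical axis) vs. V (horizontal axis), with two dotted isotherms labelled T_high and T_low and a solid adiabatic curve running between them.]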

Fig. 10.3: The work done in an adiabatic expansion is the area under the P vs. V (solid) curve. The dotted curves are PV = constant, that is, isotherms corresponding to a high temperature and a low temperature.

If the gas is compressed adiabatically by the piston, then ∆V < 0. In this case ∆E = −P ∆V > 0, and the internal energy of the gas increases, by the amount of work done on it by the piston. Its temperature goes up. You have probably noticed this effect in using a hand pump.
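The warming in an adiabatic compression can be followed step by step on a computer. The sketch below assumes a monatomic ideal gas such as helium, with PV = nRT and E = (3/2)nRT, and applies ∆E = −P∆V over many small volume steps. The function name and the starting values are hypothetical. The closed form used for comparison, T V^(2/3) = constant, is the standard result for this case; it is not derived in the text.

```python
R = 8.314  # gas constant in J/(K mole)


def adiabatic_temperature(t_initial, v_initial, v_final,
                          n_moles=1.0, steps=100_000):
    """Final temperature after changing the volume with no heat flow.

    Assumes a monatomic ideal gas: P V = n R T and E = (3/2) n R T.
    Each small step applies dE = -P dV and converts dE into dT.
    """
    temperature = t_initial
    volume = v_initial
    dv = (v_final - v_initial) / steps
    for _ in range(steps):
        pressure = n_moles * R * temperature / volume
        d_energy = -pressure * dv                  # work done on the gas
        temperature += d_energy / (1.5 * n_moles * R)
        volume += dv
    return temperature


# Compress 1 mole from some volume to half that volume, starting at 300 K.
t_compressed = adiabatic_temperature(300.0, 1.0, 0.5)
t_formula = 300.0 * (1.0 / 0.5) ** (2.0 / 3.0)     # from T V^(2/3) = constant

print(f"step-by-step result: {t_compressed:.1f} K")
print(f"closed-form result:  {t_formula:.1f} K")
```

Halving the volume raises the temperature from 300 K to roughly 476 K, which is why a hand pump gets noticeably warm. Running the same function with a final volume larger than the initial one shows the cooling on expansion.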

In the adiabatic process we are considering (note: the process must be done reversibly, as we clarify below), ∆S = 0, by Eq (9.38), since ∆Q = 0. That is, the entropy, or disorder of the gas, doesn’t change. This seems surprising, since the temperature T goes down in the expansion. Isn’t lower temperature associated with lower disorder? After all, S = 0 at T = 0. It is true that the gas has more orderly motions at lower temperature if everything else stays the same, but here, as T goes down, V, the volume, goes up. Now the gas is more disordered in the sense that it has more room V to spread out. What we see in the adiabatic process is a tradeoff between order in space, enforced by the walls of the piston, and orderly motions, associated with temperature. As the gas is compressed adiabatically, it becomes more ordered in space (more confined), but less ordered in its motions, because it is at higher temperature. As it expands, it becomes more ordered in its motions, but less ordered in space (less confined). These two contrary tendencies exactly cancel in the measure of disorder called entropy. The adiabatic process is a constant entropy process, isentropic: ∆S = 0.

10.3.2 Isothermal Processes, ∆T = 0

An isothermal process is one that takes place at constant temperature T . This could happen if the gas is in thermal contact with a heat bath, that is, a large heat capacity at the temperature T . For a gas obeying the gas law Eq (9.17), ∆T = 0 means ∆E = 0, that is, the internal energy E is also constant. This is especially easy to see in case E ∝ T , as in the statistical theory of Eq (9.34), but it is true for any ideal gas. In a process like this

0 = ∆E = ∆Q − P ∆V (10.3)

The work done by the gas is again the area under the P vs. V curve, Fig 10.4. In this case, if the gas does positive work P∆V, the energy does not come from its internal energy. Rather its internal energy stays the same, and the energy comes from heat flowing into the gas, which means, from the heat bath! We could think of it as an expansion like the adiabatic expansion, except that now the thermal conductance κ to the outside is large, so that instead of cooling as it expands, the gas receives heat to keep its temperature constant. We could imagine it cooling just slightly as it expands, but as this happens, the temperature difference with the heat bath drives a flow of heat from the heat bath to warm the gas back up. The high thermal conductance between the gas and the heat bath means they can never differ significantly in temperature. The pressure P in the gas goes down as it expands, since V goes up, and PV = constant.

[Figure 10.4: graph of P (vertical axis) vs. V (horizontal axis), showing a single curve labelled T = constant.]

Fig. 10.4: The work done in an isothermal process is the area under the P vs. V curve. In this case the curve is the isotherm PV = constant.

Similarly, if we do work on the gas, compressing it isothermally, the internal energy does not go up. Rather the slight temperature increase that we might initially create drives a heat flow to the bath. The work we do on the gas goes into the heat bath as internal energy. The pressure in the gas goes up as the volume goes down, since PV = constant.

We can also follow the change in entropy ∆S in an isothermal process. By Eq (9.38), when the gas expands and energy flows in as heat ∆Q, the entropy of the gas increases by ∆S = ∆Q/T. This might be a surprise: the temperature stays the same but the disorder increases. Why? The answer is that the volume has increased in the expansion, so the gas is more disordered in space, spreading out more. This is an effect that we already noticed in the adiabatic expansion, but there it was in connection with a contrary effect as the temperature went down, and here we see it by itself.
The ∆Q entering the gas at temperature T is exactly the ∆Q that leaves the heat bath at temperature T, so the entropy of the heat bath goes down by the same amount that the entropy of the gas goes up, and the change in entropy for the entire system of heat bath plus gas is zero.

The actual value of the work done in the isothermal expansion of an ideal gas is surprisingly important. As we already said, it is the area under the P vs. V curve in Fig 10.4, and it can be found by calculus. It is

$$P\,\Delta V = nRT \ln\!\left(\frac{V_f}{V_i}\right) \tag{10.4}$$

where P∆V is our notation for the work done, V_f and V_i are the final volume and the initial volume respectively, and ln is the natural logarithm, a function available on calculators. This is also ∆Q, since ∆E = 0 in this process, and hence the isothermal change in entropy is

$$\Delta S = \frac{\Delta Q}{T} = nR \ln\!\left(\frac{V_f}{V_i}\right) = nR \ln\!\left(\frac{P_i}{P_f}\right) \tag{10.5}$$

The form involving the initial pressure P_i and the final pressure P_f follows because PV = constant in an isothermal expansion. We see that S increases in the expansion because the logarithm is an increasing function of its argument: larger final volume V_f means larger S.

We can understand this result without calculus using dimensional analysis. The work done is an energy, so it must be proportional to RT on dimensional grounds. It must also be proportional to n, the number of moles of gas, because the heat necessary to maintain temperature T is proportional to the number of molecules: this energy gets equipartitioned over all degrees of freedom. Since T is constant, the only variable is V, and on dimensional grounds the result can depend only on the dimensionless ratio V_f/V_i, i.e., the work done must be nRT f(V_f/V_i), where f is some unknown function. But the work done in expanding from V_A to V_B, plus the work done in expanding from V_B to V_C, where these are any volumes, must be the same as the work done in expanding from V_A to V_C, i.e., the function f must have the property f(V_C/V_A) = f(V_B/V_A) + f(V_C/V_B), or more simply f(xy) = f(x) + f(y). You may recall that the logarithm has this property. In fact, up to a constant multiple, it is the only continuous function that does. This argument does not determine a dimensionless constant multiplying everything, but this is the same as saying that we have not determined what the base should be for the logarithm. It turns out to be the natural logarithm (base e). We will return to this result on the entropy change associated with a change in volume or density at the end of this chapter.
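Both claims about the isothermal work, that it is the area under the isotherm and that it adds up correctly over successive expansions, can be checked numerically. The sketch below sums thin strips under P = nRT/V using the midpoint rule and compares the total with Eq (10.4). The function names, the choice of 1 mole at 300 K, and the volumes are illustrative assumptions.

```python
import math

R = 8.314  # gas constant in J/(K mole)


def isothermal_work_numeric(n_moles, temperature, v_initial, v_final,
                            steps=100_000):
    """Area under the isotherm P = nRT/V, summed strip by strip
    with the pressure evaluated at the middle of each strip."""
    dv = (v_final - v_initial) / steps
    work = 0.0
    for k in range(steps):
        v_mid = v_initial + (k + 0.5) * dv
        work += n_moles * R * temperature / v_mid * dv
    return work


def isothermal_work_formula(n_moles, temperature, v_initial, v_final):
    """Eq (10.4): work = n R T ln(V_f / V_i)."""
    return n_moles * R * temperature * math.log(v_final / v_initial)


n, T = 1.0, 300.0
w_numeric = isothermal_work_numeric(n, T, 1.0, 2.0)
w_formula = isothermal_work_formula(n, T, 1.0, 2.0)

# Additivity: expanding 1 -> 2 and then 2 -> 3 must equal expanding 1 -> 3.
w_two_stage = (isothermal_work_formula(n, T, 1.0, 2.0)
               + isothermal_work_formula(n, T, 2.0, 3.0))
w_one_stage = isothermal_work_formula(n, T, 1.0, 3.0)

# Eq (10.5): the entropy change of the gas is the heat absorbed divided by T.
delta_s = w_formula / T

print(f"area under the isotherm: {w_numeric:.2f} J")
print(f"n R T ln(Vf/Vi):         {w_formula:.2f} J")
print(f"two stages vs one:       {w_two_stage:.2f} J vs {w_one_stage:.2f} J")
print(f"entropy change:          {delta_s:.3f} J/K")
```

Only the ratio of the volumes enters, so the units chosen for V do not matter, which is the point of the dimensional argument above.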

10.3.3 A Constant Pressure Process

Suppose we have a mole of helium gas in the cylinder of Fig 10.2, and we want to add enough energy, via heat flow ∆Q, to raise its temperature some fixed amount ∆T. How much heat do we need? Well, ∆E ∝ ∆T, where E is the internal energy of the helium, and the constant of proportionality is the molar heat capacity C_M. We even know C_M = (3/2)R for helium, as discussed in the text around Eq (9.34). Thus you might very justifiably say that the heat required will be ∆Q = (3/2)R∆T, a little more than 12 J to raise the temperature 1 K.

But now we say we will do this keeping the helium at constant pressure P, by maintaining a constant force on the piston from the outside. What difference could that make? you might ask. Well, since PV ∝ T for the helium, and T is increasing, and P is constant, we see that V must increase, that is, the gas must expand (at constant pressure). In doing so, it does work P∆V on the piston, and this represents a loss of internal energy from the gas. In fact, since ∆E = ∆Q − P∆V = (3/2)R∆T, we find

$$\Delta Q = \frac{3}{2} R\,\Delta T + P\,\Delta V \tag{10.6}$$

so that the heat ∆Q required is more than you thought! In fact, since P∆V = R∆T, by the ideal gas law Eq (9.17), we find ∆Q = (5/2)R∆T, as if the molar heat capacity C_M were (5/2)R and not (3/2)R. We need more than 20 J to raise the temperature by 1 K. The reason is that not all the heat we add goes into internal energy of the gas, and hence into increasing the temperature. Some of it comes out again as work done in expansion of the gas. That is why we have to put in some extra heat.

To make sense of this somewhat confusing situation, we should realize that the original question (how much heat do you need to raise the temperature by ∆T?) was not precise enough. We have to say exactly what the process is. If it is a constant volume process, then we should use the heat capacity we already knew about, C_V = (3/2)R, where the subscript V means that this is the heat capacity at constant volume. In this process, the pressure goes up as the temperature goes up, and all the added heat goes into raising the temperature. But if the process is at constant pressure, then we must use C_P = C_V + R, the heat capacity at constant pressure. If we keep track of all the energy, including the work done in expansion in this process, the confusion goes away.

In principle C_V and C_P are different for solids and liquids as well. In fact, though, as we have seen, solids and liquids typically expand very little as they are heated, only a part in 10⁴ or so, per degree. Thus the work done in expansion is tiny. And since this “work done” is just the difference C_P∆T − C_V∆T, we see that C_P ≈ C_V. What makes gases different is that they expand a lot at constant pressure when they are heated.
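The energy bookkeeping for helium at constant volume and at constant pressure is short enough to write out. This sketch assumes one mole of a monatomic ideal gas and uses R = 8.314 J/(K mole); the function name and its `process` argument are hypothetical.

```python
R = 8.314  # gas constant in J/(K mole)


def heat_required(delta_t, n_moles=1.0, process="volume"):
    """Heat needed to raise a monatomic ideal gas by delta_t kelvin.

    The heat is found from dQ = dE + P dV, with dE = (3/2) n R dT.
    At constant volume no work is done; at constant pressure the
    ideal gas law gives P dV = n R dT.
    """
    delta_e = 1.5 * n_moles * R * delta_t
    if process == "volume":
        work_by_gas = 0.0
    elif process == "pressure":
        work_by_gas = n_moles * R * delta_t
    else:
        raise ValueError("process must be 'volume' or 'pressure'")
    return delta_e + work_by_gas


q_volume = heat_required(1.0, process="volume")
q_pressure = heat_required(1.0, process="pressure")

print(f"constant volume:   {q_volume:.2f} J per kelvin, C_V = {q_volume / R:.1f} R")
print(f"constant pressure: {q_pressure:.2f} J per kelvin, C_P = {q_pressure / R:.1f} R")
print(f"difference:        {q_pressure - q_volume:.3f} J, which is R")
```

The two answers differ by exactly the work done in the expansion, which is the content of C_P = C_V + R.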

10.3.4 Reversible and Irreversible Processes

The adiabatic and isothermal processes, as we have described them, are reversible. If we compress a gas adiabatically, the work we do is stored in the internal energy of the gas, the temperature goes up, and the pressure goes up. Allowing the gas to expand again, we get back the work we invested, as the gas expands and cools back to its initial temperature. Both processes are represented in Fig 10.3, differing only in the direction we move along one and the same P vs. V curve (the solid one in the figure).

The same remarks apply to the isothermal process, using Fig 10.4. We can do work on a gas to compress it isothermally, and the pressure goes up. The energy is stored in the heat bath(!) Allowing the gas to expand again isothermally, we get that work back. It is interesting to notice that this process includes a reversible flow of heat, between the gas and the heat bath at the same temperature.

The prototype irreversible process is the spontaneous flow of heat from hot to cold, i.e. between two genuinely different temperatures. Once this happens, there is no way simply to reverse it. It is true that there exist devices which act as refrigerators, moving energy from cold to hot, leaving the cold even colder, but this is not a simple reversal of the spontaneous flow of heat. Refrigerators are machines that require their own flows of energy.

When heat ∆Q flows from a hotter body, at temperature T_h, to a colder body, at temperature T_c < T_h, the total entropy of the two bodies, taken together, increases. By Eq (9.38), the entropy of the hotter body changes by ∆S_h = −∆Q/T_h. This is a negative change (decrease) in entropy, because heat is flowing out. The entropy of the colder body changes by ∆S_c = ∆Q/T_c. This is a positive change in entropy, because heat is flowing in. The total change in entropy is

$$\Delta S = \Delta S_h + \Delta S_c = \Delta Q \left(\frac{1}{T_c} - \frac{1}{T_h}\right) > 0 \tag{10.7}$$

It is positive because T_c < T_h, so that 1/T_c > 1/T_h. Since heat flows spontaneously from hot to cold, but not from cold to hot, entropy can increase in such a process, but not decrease.

Finally, processes with friction are irreversible. The prototype is Joule’s experiment with the paddlewheel and the water. The paddlewheel does work F∆x on the water, by pushing it with some force F through some displacement ∆x. The water is churned around, and is not in thermal equilibrium as this happens, but when it finally reaches equilibrium, it is in the same state as could have been attained by the addition of heat ∆Q equal to the work done on the water. Thus its entropy has increased, by Eq (9.38). (Notice that heat was not in fact added. The entropy of the final state can only be calculated, however, by thinking of some other process, involving the addition of heat, that would get to that state.) Once the water has settled down, at a higher temperature, and a higher entropy, it is clear that the reverse process, in which the water pushes the paddlewheel back to its initial position, will not occur.

Let us think about the effect of friction between the piston and the cylinder in the adiabatic expansion of a gas, Fig 10.3. The gas does work P∆V on the piston, as illustrated there, but because of friction, not all the force PA is available to, say, raise a weight. Some of the force PA is balanced (opposed) by the friction force f on the piston due to the cylinder. The remaining force PA − f is available to do work on the weight, and thus the work done is only (PA − f)∆ℓ. The remaining energy taken from the gas, f∆ℓ, which we could call “the work done by the friction force f”, eventually brings the piston and cylinder to a state that could have been achieved by adding heat ∆Q = f∆ℓ (even though that is not in fact how it happened). The entropy of the cylinder and piston has therefore increased. The entropy of the gas has not changed. The process cannot be simply reversed, because the energy stored in the raised weight is not enough to compress the gas back to its original volume.

In general, in an energy transfer with friction, system A does work W on system B, but the change in the (few) mechanical degrees of freedom of B leaves it in a position only to do (in return) a smaller amount of work W′ < W on system A. The energy which is not represented in the mechanical degrees of freedom of B is “thermalized”, that is, spread over Avogadro’s number of degrees of freedom of B, as if it had been transferred as heat.

One possible model for how the entropy increases in the piston and cylinder with friction (system B) is to think microscopically about how friction is caused. There could be little rough spots on the moving surfaces that catch, stick, and then slip. In the process, little vibrations are excited in certain degrees of freedom of the solid. When this happens, the energy is surely not equipartitioned, so the system is not in thermal equilibrium. The degrees of freedom that have extra energy are, in effect, hotter. As thermal equilibrium is attained, heat flows from the hot degrees of freedom to the cold ones, and this flow of energy from hot to cold is irreversible, and increases entropy. How such friction processes actually operate is still a research topic, as is the description of systems with many degrees of freedom not in equilibrium.

To summarize, in irreversible processes, ∆S > 0: that is, total entropy increases. In reversible processes, total entropy does not change, i.e., ∆S = 0. The statement that ∆S ≥ 0 in every process that actually occurs, whether reversible or irreversible, is the Second Law of Thermodynamics.

10.4 Heat Engines

The steam engine was the inspiration for the most brilliant and beautiful conception in thermodynamics, the ideal engine of Sadi Carnot. Carnot published this idea at the age of 28 in 1824, with the title Reflections on the Motive Power of Heat. He died only a few years later, of cholera. (To be fair, the first really clear statement of what Carnot had glimpsed was due to Rudolf Clausius some 20 years later.)

Carnot drew on an analogy between steam power and hydropower. In the case of hydropower, you have water at a height H1, and there is also a lower level available, at height H2 < H1. You exploit the difference in heights when you let the water fall from H1 to H2, doing work along the way (turning a turbine, for example). Note that it does no good to have water at H1 if there is not a lower level for it to move to. The analogy with steam power is the following: in a steam engine (or any heat engine) you have a material at high temperature T1, and you also have a lower temperature T2 available, and heat will spontaneously flow from one to the other. Along the way you can divert some of it into useful work. Note that no heat flows if you do not have the lower temperature T2. The difference in temperature drives the flow of heat, just as the difference in gravitational potential energy drives the flow of water. It makes no sense in the context of hydropower to let water fall freely, without putting in some kind of turbine that extracts useful work. Similarly it makes no sense in the context of heat engines to let heat flow irreversibly from T1 to T2. Rather, the heat should flow through a mechanism that extracts work. That is what a heat engine is. Carnot solved the problem of making the best possible mechanism, in the sense of getting the most work for a given flow of heat. Carnot’s insight is that the best you can do is to make the engine reversible.
And amazingly, any reversible heat engine operating between T1 and T2 is equivalent to any other. They all do exactly the same thing, regardless of their construction, whether they are steam engines or devices of the distant future, using materials and methods yet undreamed of! The proof of this amazing result is simple. It is basically the picture in Fig 10.5. The diagram shows two reversible engines operating between the high temperature T1 and the low temperature T2. On the left, heat flows from T1 to T2 and some of it is diverted into work W = Q1 − Q2. The surprise comes on the right: the output work of the first engine becomes the input of the second engine, operating in reverse! Since this engine is reversed, it takes in heat from the low temperature T2 and exhausts it at the high temperature T1 (it is operating as a refrigerator). The two engines together become a kind of compound engine. What is its effect on the two heat baths?

[Figure 10.5: two engines between the heat baths T1 and T2, labeled with heats Q1, Q2 (left engine), Q1′, Q2′ (right engine), and work W.]

Fig. 10.5: Reversible heat engine 1 on the left takes in heat Q1 at T1, exhausts heat Q2 at T2, and performs work W = Q1 − Q2 on reversible heat engine 2 on the right, which takes in heat Q2′ at T2 and exhausts heat Q1′ at T1. The argument in the text shows Q1′ = Q1 and Q2′ = Q2, no matter how the engines are made.

The heat Q1′ exhausted at T1 by engine 2 cannot be greater than the heat Q1 taken in by engine 1, since then the net effect would be that heat has been tricked into flowing from cold to hot. But neither can the heat Q1′ exhausted at T1 by engine 2 be less than the heat Q1 taken in by engine 1, because the engines are reversible, and reversing both of them we would again find heat spontaneously flowing from cold to hot. Since Q1′ > Q1 is impossible, and Q1′ < Q1 is impossible, it can only be that Q1′ = Q1, i.e., the heat exhausted by engine 2 is the same as that taken in by engine 1. Thus if the engines were disconnected from each other and made to run in the same direction, they would both take in the same heat at T1, perform the same work, and exhaust the same heat at T2.

We can go further and say exactly how efficient this reversible engine is, defining efficiency e as

$$ e = \frac{W}{Q_1} \tag{10.8} $$

i.e., the fraction of the high temperature heat Q1 that we can turn into work W. For the reversible engine the value is called the Carnot efficiency, the highest efficiency attainable by an engine operating between temperatures T1 and T2. Irreversible engines are less efficient than this: for example, zero efficiency corresponds to W = 0, which means Q1 = Q2, and the heat simply flows irreversibly from T1 to T2. We have computed the change in entropy ∆S in this case, in Section 10.3.4, and we recall that it is positive. Of course ∆S > 0 is the characteristic feature of an irreversible process. As we imagine making a heat engine more efficient for fixed Q1, but still irreversible, W increases as Q2 decreases, but Q2 is still large enough that the entropy change is positive,

$$ \Delta S = -\frac{Q_1}{T_1} + \frac{Q_2}{T_2} > 0 \tag{10.9} $$

The limiting case, the best engine, is the case ∆S = 0, the reversible engine, with Q2 as small as it can be. To get still more work, i.e., to be even more efficient, the engine would have to decrease total entropy, something that never happens.
(Such an engine, powering a reversible refrigerator, would make heat spontaneously flow from cold to hot.) For the reversible engine, the total change in entropy of the two heat baths is

$$ \Delta S = -\frac{Q_1}{T_1} + \frac{Q_2}{T_2} = 0 \tag{10.10} $$

and therefore

$$ \frac{Q_2}{Q_1} = \frac{T_2}{T_1} \tag{10.11} $$

The Carnot efficiency e is therefore, using W = Q1 − Q2,

$$ e = \frac{W}{Q_1} = \frac{Q_1 - Q_2}{Q_1} = 1 - \frac{Q_2}{Q_1} = 1 - \frac{T_2}{T_1} \tag{10.12} $$

an astonishingly simple result. The efficiency depends only on the absolute temperatures. If, for example, we can use steam at 500 K, with heat flowing to room temperature 300 K, then the optimal efficiency, the Carnot efficiency, is 1 − 300/500 = 2/5 = 0.40, or 40%. More than half the energy available from the fuel that heats the high temperature bath is exhausted as waste heat Q2 at room temperature, doing no good to anyone. If we could use hotter steam, we could do better! Of course higher temperatures may be impractical for other reasons, but Carnot’s insight tells us what we must do to have even a hope of high efficiency.
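The steam example can be checked numerically. The following is only a sketch in Python: the function names and the heat intake of 1000 J per cycle are illustrative assumptions, not values from the text. It computes the Carnot efficiency and the entropy change of the two baths, both for the reversible engine and for a less efficient one.

```python
def carnot_efficiency(t_hot, t_cold):
    """Carnot efficiency e = 1 - T2/T1, for absolute temperatures in kelvin."""
    if not (t_hot > t_cold > 0):
        raise ValueError("need t_hot > t_cold > 0 (absolute temperatures)")
    return 1.0 - t_cold / t_hot


def bath_entropy_change(q1, q2, t_hot, t_cold):
    """Total entropy change of the two heat baths: -Q1/T1 + Q2/T2."""
    return -q1 / t_hot + q2 / t_cold


t1, t2 = 500.0, 300.0  # the steam example: 500 K steam, 300 K room
q1 = 1000.0            # hypothetical heat intake per cycle, in joules

e = carnot_efficiency(t1, t2)
w_max = e * q1         # the most work obtainable from q1
q2_min = q1 - w_max    # the least waste heat that must be exhausted

print(f"Carnot efficiency: {e:.2f}")
print(f"max work: {w_max:.0f} J, min waste heat: {q2_min:.0f} J")
print(f"entropy change, reversible engine: "
      f"{bath_entropy_change(q1, q2_min, t1, t2):.3f} J/K")
# A less efficient engine (W = 200 J, so Q2 = 800 J) raises total entropy.
print(f"entropy change, W = 200 J engine: "
      f"{bath_entropy_change(q1, 800.0, t1, t2):.3f} J/K")
```

Any choice of Q2 below `q2_min` would make the entropy change negative, which is the numerical face of the statement that no engine can beat the Carnot efficiency.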

10.4.1 The Carnot Cycle

The heat engines in Carnot’s argument are perhaps a little too abstract! What makes us think that such machines could be built, even in principle? In this section we describe an actual device that would operate as a reversible engine (to the extent that we can minimize friction, of course). This makes it clear that the Carnot efficiency can be approximated in real devices.

We imagine using the cylinder and piston of Fig 10.2. The machine will run in a cycle, taking in some fixed amount of heat Q1, doing some corresponding work W, exhausting the rest as heat Q2, and repeating, over and over. The first step, taking in heat Q1 reversibly, is just an isothermal expansion of the gas at temperature T1, in the course of which an equal amount of work, namely Q1, is done on the piston. Since this is more work than we ultimately get out, some of that energy will later be put back into the gas to complete the cycle. In the next step we must get reversibly down to the low temperature T2. This can be done with an adiabatic expansion: more work is done, and the gas cools to T2. Now heat Q2 must be exhausted isothermally and reversibly: this requires an input of work equal to Q2. Finally the gas must be heated reversibly back to the high temperature T1: this can be done by compressing it adiabatically (which again requires that we do work on the gas, giving energy back). Since each step is reversible, as discussed in Section 10.3, this Carnot cycle must have the Carnot efficiency of an optimal engine.

Actually carrying out these steps might be cumbersome, requiring thermal insulation for the adiabatic steps, and then thermal contact with different heat reservoirs for the isothermal steps. It is not proposed as a practical engine, but rather as a proof of existence, a proof of concept.
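The four steps can be followed numerically for an ideal gas. The sketch below rests on assumptions that are not spelled out in this section: one mole of a monatomic ideal gas (γ = 5/3), isothermal work nRT ln(Vf/Vi), and the adiabatic relation T V^(γ−1) = constant. The function name and the starting volumes are hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def carnot_cycle(t_hot, t_cold, v_a, v_b, n=1.0, gamma=5.0 / 3.0):
    """Return (Q1, Q2, W) for one cycle; volumes in m^3, temperatures in K.

    a -> b: isothermal expansion at t_hot (heat Q1 taken in)
    b -> c: adiabatic expansion, gas cools to t_cold
    c -> d: isothermal compression at t_cold (heat Q2 exhausted)
    d -> a: adiabatic compression, gas returns to t_hot
    """
    # Assumed adiabatic relation: T * V**(gamma - 1) stays constant,
    # so both adiabatic steps change the volume by the same factor.
    stretch = (t_hot / t_cold) ** (1.0 / (gamma - 1.0))
    v_c = v_b * stretch
    v_d = v_a * stretch

    # On an isotherm the internal energy of an ideal gas is unchanged,
    # so the heat exchanged equals the work n R T ln(V_final / V_initial).
    q1 = n * R * t_hot * math.log(v_b / v_a)
    q2 = n * R * t_cold * math.log(v_c / v_d)

    # The work done in the two adiabatic steps cancels (equal and opposite
    # temperature changes), so the net work per cycle is Q1 - Q2.
    return q1, q2, q1 - q2


q1, q2, w = carnot_cycle(500.0, 300.0, v_a=1.0e-3, v_b=2.0e-3)
print(f"Q1 = {q1:.1f} J, Q2 = {q2:.1f} J, W = {w:.1f} J")
print(f"W/Q1 = {w / q1:.3f}, compared with 1 - T2/T1 = {1 - 300.0 / 500.0:.3f}")
```

Under these assumptions the ratio W/Q1 comes out the same whatever starting volumes are chosen, as the general argument requires.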

10.4.2 Refrigerators, Heat Pumps

In examining Carnot’s idea, we noticed that a heat engine running backward is a refrigerator. Mechanical work is used to move heat from cold to hot. We can see how that might work in practice by following the Carnot cycle in reverse. At low temperature the gas expands isothermally – this is where heat flows reversibly into the gas from the low temperature bath. Now the gas is isolated thermally and adiabatically compressed, by an input of work, to raise its temperature. The hot gas can now give up its heat reversibly as it is compressed still more, in thermal contact with the high temperature bath. The gas is now isolated again and allowed to expand and cool adiabatically to the low temperature, where the cycle begins again with isothermal expansion.

Metaphorically, it is like bailing out a basement. You must do work to raise water from the basement, so that you can dump it out at a higher level, just as you must do work to raise the temperature of the gas, so that you can dump heat to a higher temperature bath. Work changes the temperature, just as work changes the height. For a given input of work W, the best refrigerator would remove the most heat Q2 at temperature T2. How much would that be? (See the problem section.)

In refrigeration it is the low temperature end that is of interest. We want to remove energy from it, even though there is no colder place for heat to flow. The same machine accomplishes a different task if we focus on the high temperature end. The refrigerator dumps energy into the high temperature bath, in effect trying to warm it. In this application it is called a “heat pump”, transferring energy from a most unlikely place, the low temperature bath. An application of heat pumps that is actually used is the heating of buildings. The low temperature bath is the wintry outdoors, and the heat pump runs as a refrigerator, trying to cool the outdoors and pumping the heat indoors.
Again, it is a nice problem to see how much heat Q1 you can get for a given expenditure of work W . With a reversible heat pump we can certainly anticipate Q1 > W , since a not very smart (but widely used) way to heat space is just to dissipate W in friction, getting the same energy W as if it had been a flow of heat. Surely a clever mechanism can do better than that!

10.5 Life at Fixed Temperature

Living systems are vitally concerned with energy transfer. That makes it seem as if thermodynamics ought to be relevant to them. On the other hand, living systems never extract work from two temperature baths, and a cylinder of compressed gas also does not seem a very useful model for a living thing. How, if at all, does thermodynamics apply to life?

The thermodynamic concept which emerges as crucial for this purpose is entropy. The concept of entropy S, as in Eq (9.38), was actually discovered through Carnot’s argument about heat engines. But as a measure of a system’s disorder it has much more general applicability. Knowing about entropy we can understand, in very general terms, how living systems succeed in building structure. Organisms live at one temperature T. Thus reversible transfers of energy involving a system A obey

∆E = ∆Q − ∆W = T ∆S − ∆W (10.13)

Here E is the internal energy of A, ∆Q is heat flowing into A, and ∆W is work done by A on some other system B, hence a loss of energy to A. We used Eq (9.38) to replace ∆Q with T ∆S for a reversible process. Now living systems need subsystems like A which are capable of doing work ∆W on other subsystems. Remember that work is an investment of energy in just a few degrees of freedom, not spread over Avogadro’s number of degrees of freedom. That is how structure can be built. So one should see the term −∆W above as “structure”, at least conceptually. The more negative it is, the more work subsystem A has done, and the more it has contributed to structure. We highlight this term by rearranging Eq (10.13),

−∆W = ∆E − T ∆S (10.14)

So how can A make its contribution? We see two ways. First, A may have a lot of internal energy E initially, and it may give it up in doing work, so that ∆E is negative. That is not such a surprising thing. If it has energy, it can transfer its energy to something else. But second, A may have small entropy S, that is, it may be quite ordered. In that case, it can increase its entropy, even without changing its energy, and in the process −T ∆S contributes to −∆W : structure. The entropy of A goes up, but if A is building a structure in subsystem B, the entropy of B may go down. Only the total change in entropy is guaranteed not to decrease. In this case a disordering of A can create order in B. The energy comes from the constant temperature environment! This is really quite far from what intuition would suggest. The combination of energy and entropy that occurs so naturally above is called the Helmholtz free energy

F = E − TS (10.15)

Work done by a system at fixed temperature reduces its Helmholtz free energy, and work done on a system at fixed temperature increases its Helmholtz free energy. A subsystem A with large Helmholtz free energy is able to do work to create structure in another subsystem B because this work is really a transfer of Helmholtz free energy from A to B, and thus might be either an increase in the internal energy E of B (not so interesting), or a decrease in the entropy S of B (interesting)! Either process would increase the Helmholtz free energy of B. (Note that Helmholtz free energy is not conserved: it spontaneously decreases in irreversible processes. So we can think of “transferring” it with the attendant possibility of “spilling” some of it.) Thus living systems need access to subsystems of high Helmholtz free energy, i.e., food, oxygen, etc. They use this free energy to create order, like partitioning ions on one side of a membrane. That free energy can then be used to create high free energy molecules like ATP. It is all transfer of Helmholtz free energy. That is what thermodynamics has to say.

10.6 Life at Fixed Temperature and Pressure

Organisms typically live not only at fixed T but also at fixed P . This means some of the work done by a subsystem A might simply go into expansion, that is, into P ∆V work, not usually associated with useful structure. To highlight the interesting work done, it is useful to introduce a slight generalization of the Helmholtz free energy called the Gibbs free energy,

G = E − TS + PV (10.16)

In any process that ends up at the same T and P , the change in G is

∆G = ∆E − T ∆S + P ∆V = ∆Q − ∆W − T ∆S + P ∆V (10.17)

In a reversible process we find, using Eq (9.38),

−∆G = ∆W − P ∆V (10.18)

which is just the “interesting” work done (i.e., with the “uninteresting” P ∆V subtracted off). That is, a subsystem A that does interesting work does so precisely by giving up Gibbs free energy (∆G is negative, so that −∆G, the interesting work, will be positive). When a system reaches its minimum G, by doing all the work it can, its usefulness is over. Since in irreversible processes G goes down more than it needs to for the given work done, this amounts to a kind of “spilling” of free energy, and as usual the most efficient use of energy is in reversible processes. In constant P processes for which ∆V is zero, the change in Gibbs free energy and the change in Helmholtz free energy agree. Otherwise one should simply remember that ∆E contains the term −P ∆V, which may or may not be worth splitting off as “interesting” or “not interesting.” This observation is essentially the same as in Section 10.3.3.

Problems

Work

10.1 (a) A 600 g spring with k = 200 N/m is compressed by 30 cm and released. It oscillates and eventually dies down, having dissipated internally the energy it was given. Assume negligible flow of heat to the surroundings. By how much does its temperature rise? (Assume a molecular weight of 60 and the Law of Dulong and Petit.) (b) Repeat in case the compression was 60 cm.

10.2 (a) A hammer and a nail, each made of steel, fall to the floor from a height of 3 m. The hammer is 1 kg and the nail is 5 g. How much does the temperature of each one rise? Assume no friction with the air and no transfer of heat to the surroundings. Also take the molecular weight to be 60, and assume the Law of Dulong and Petit. (b) What is an easy way to see that the temperature changes in (a) are the same?

10.3 How much P ∆V work is done per mole by a substance that converts from solid to gas at 0 °C and atmospheric pressure? Ignore the volume of the solid.

10.4 Suppose a pressure difference ∆P = Phigh − Plow between the ends of a pipe drives a volume current I according to Eq (8.40), ∆P = rI. Here r is the resistance of the pipe. In a short time t, during which a small volume ∆V = It moves into the pipe at the upstream end and the same volume moves out at the downstream end, the work done on the fluid in the pipe is W = Phigh ∆V − Plow ∆V = (∆P)It. Thus the work done is proportional to time t, and the rate at which work is done (called power, SI unit Watts) is (∆P)I. (a) Verify that the SI unit of (∆P)I is J/s, or Watts (W). (b) Show that the rate at which work is done can also be expressed as I²r or (∆P)²/r, where r is the pipe resistance. (c) Use the estimates from Section 8.17 to find the rate at which work is done (by the heart) to force blood through the human circulatory system. (d) Compare your result in part (c) with the rate of energy input available from food, about 2000 Cal/day (express in SI). Is the result reasonable?

10.5 (a) In a reversible constant temperature expansion of a gas, what happens to its internal energy? Its entropy? Its pressure? Give an explanation in words in each case. (b) In a reversible adiabatic expansion of a gas, what happens to its internal energy? Its entropy? Its pressure? As in (a) give an explanation in words. (c) Suppose we are sloppy, and the gas expands adiabatically, but irreversibly. What happens to its internal energy? Its entropy? Its pressure? Justify in words!

10.6 The best refrigerator, running in a cycle and moving heat from low temperature T2 to high temperature T1 with input of work W, would extract the most heat Q2 from the low temperature reservoir for given W. Describe this best possible refrigerator, and find how much heat Q2 it can extract.

Chapter 11

Statistical Physics

Since even a very small sample of matter is made of many atoms or molecules, it is natural to think of using statistics, that is, averages, to describe some of its properties. We have already mentioned the statistical theory of molar specific heat in Section 9.6. In this chapter we look at some other uses of statistical ideas in physics, including the simple but effective experimental technique of averaging repeated measurements to reduce error due to randomness in the measurement process.

11.1 Ideal Solutions as Ideal Gases

The solute molecules in a solution are much like the molecules of a gas. They are free to move around randomly within certain confines, and the associated disorder (entropy) is just like the corresponding disorder in a gas. They strike the walls of their container and thus exert a pressure just like the molecules of a gas. Albert Einstein was one of the first to make this point. The meaning of “ideal” in an ideal solution is that the solute molecules do not interact with each other, the same as the meaning of “ideal” in an ideal gas. This will be the case if the density of solute molecules (concentration) is not too large. The pressure Π of the solute molecules, called osmotic pressure, is related to concentration c by, essentially, the ideal gas law,

Π = cRT (11.1)

where R is the gas constant and T is absolute temperature. In SI units, c = n/V would be expressed in moles/m³, where n is the number of moles of solute molecules in volume V. The ideal gas law in this context is sometimes called the van’t Hoff relation. If we measure concentration C = N/V in molecules/m³, with N the number of molecules in volume V, so that C = NA c, then recalling the Boltzmann constant kB = R/NA, we could also write the van’t Hoff relation as

Π = C kB T (11.2)
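To get a feeling for the size of osmotic pressure, here is a small numerical sketch in Python. The concentration of 300 moles/m³ and the temperature of 310 K are assumed, illustrative values (roughly the order of magnitude for total dissolved ions in body fluids), not numbers taken from the text. The sketch also checks that the molar form and the molecular form of the van’t Hoff relation agree.

```python
R = 8.314        # gas constant, J/(mol K)
N_A = 6.022e23   # Avogadro's number, 1/mol
K_B = R / N_A    # Boltzmann constant, J/K


def osmotic_pressure(c_mol_per_m3, temperature):
    """van't Hoff relation in molar form: pressure in Pa = c R T."""
    return c_mol_per_m3 * R * temperature


c = 300.0   # assumed solute concentration, moles/m^3
T = 310.0   # assumed temperature, K

pi_molar = osmotic_pressure(c, T)
pi_molecular = (c * N_A) * K_B * T   # molecular form, with C = N_A c

print(f"osmotic pressure (molar form):     {pi_molar:.4g} Pa")
print(f"osmotic pressure (molecular form): {pi_molecular:.4g} Pa")
print(f"in atmospheres: {pi_molar / 101325.0:.2f}")
```

With these assumed numbers the pressure is several atmospheres, which makes it plausible that an unbalanced osmotic pressure can strain a thin membrane.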

One very interesting analogy with the ideal gas is the case of a solution with a semi-permeable membrane boundary. Imagine that the solute is in aqueous solution, and that the container is permeable to water but not to the solute molecules (this is the meaning of semi-permeable). It might be a red blood cell, for example, with Na+ and Cl− ions inside. These ions exert an osmotic pressure on the cell membrane. If there is the same concentration outside, then there is a pressure outside to balance the pressure inside (a so-called isotonic medium), but if it is pure water outside, the semi-permeable membrane admits water, diluting the solution inside, the cell swells up, and the osmotic pressure of Na+, say, does isothermal work

$$ P \Delta V = N k_B T \ln\left(\frac{C_i}{C_f}\right) \tag{11.3} $$

where Ci and Cf are the initial and final concentrations of Na+, and similarly for other solute ions, by exactly the arguments that led to Eq (10.4). In fact, the osmotic pressure in this case is enough to swell the membrane into a sphere and rupture it, so that Cf would have to refer to some moment before this happens.

The solute molecules in a solution are different from the molecules of an ideal gas in that they interact with solvent molecules. There is no analog of the solvent for a gas. This means that in a solvent, call it solvent 1, there is some average energy of interaction E1 per solute molecule, and in a second solvent there could be a different average energy of interaction E2. Suppose the two solvents are separated by a semi-permeable membrane which is impermeable to the solvent molecules, but allows the solute molecules to go through. How will they partition themselves? One might think that if E1 < E2 then any molecules in solvent 2 have excess energy that they could give up by going to solvent 1, thus lowering their energy until they can lose no more, finding themselves then at equilibrium. This, however, would be forgetting what equilibrium is for a system with many degrees of freedom. In fact, at constant temperature, the system minimizes not its energy, but its Helmholtz free energy at equilibrium, and this involves not just the energy, but also the entropy. By the argument that led to Eq (10.5) we know that changing the concentration from C1 to C2 at temperature T implies an entropy change of

$$ \Delta S = N k_B \ln\left(\frac{C_1}{C_2}\right) \tag{11.4} $$

Thus, dividing by N and recalling Eq (10.15), we find the change in Helmholtz free energy per molecule in going from solvent 1 to solvent 2 is

$$ \Delta F = E_2 - E_1 - k_B T \ln\left(\frac{C_1}{C_2}\right) \tag{11.5} $$

If this is different from zero, then solute molecules can lower the free energy of the system, i.e., produce ∆F < 0, by going from one solvent to the other.
At equilibrium no further lowering of F is possible, and thus the expression above must be zero, so that the equilibrium concentrations obey

$$ k_B T \ln\left(\frac{C_1}{C_2}\right) = E_2 - E_1 \tag{11.6} $$

We check common sense: if E1 < E2, so that we might naively expect all solute molecules to go to solvent 1, we find E2 − E1 > 0, so that C1 > C2. That is, the solute molecules do tend to go to the lower energy environment, but not entirely. The entropy cost of putting all the solute molecules in one solvent is too high (the system is too ordered), and the actual equilibrium is a kind of compromise. The strangest thing about the entropy term, perhaps, is that the entropy change per molecule,

$$ \Delta S = k_B \ln\left(\frac{C_1}{C_2}\right) \tag{11.7} $$

depends on the concentrations C1 and C2. It is as if the entropy change for adding a molecule to a solution at concentration C were

∆S = −kB ln C (11.8)

Then removing a molecule from concentration C1 and adding one to concentration C2 gives Eq (11.7). The work you can get from the system per molecule depends on the way the molecules are apportioned, even though they do not interact with each other! In particular they do not exert forces on each other. The contribution of the entropy to the free energy is a little bit like the effect of a repulsive force between the molecules proportional to temperature, but that is not at all what is actually going on. There is no force between the molecules in this model of an ideal solution, but they behave a bit as if there were. The statistical tendency of the molecules to spread out suggests a repulsive force that “makes” them do that, even though there is no such thing.
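The compromise between energy and entropy can be made concrete with numbers. In the Python sketch below, the energy difference E2 − E1 = 2 kB T and the temperature of 300 K are hypothetical choices made only for illustration. The sketch solves Eq (11.6) for the equilibrium ratio C1/C2 and then checks that the free energy change per molecule of Eq (11.5) vanishes at that ratio, while it is negative when the solute is spread evenly.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def concentration_ratio(e1, e2, temperature):
    """Equilibrium ratio C1/C2 obtained by solving Eq (11.6)."""
    return math.exp((e2 - e1) / (K_B * temperature))


def free_energy_change(e1, e2, c1, c2, temperature):
    """Change in F per molecule going from solvent 1 to solvent 2, Eq (11.5)."""
    return e2 - e1 - K_B * temperature * math.log(c1 / c2)


T = 300.0             # assumed temperature, K
e1 = 0.0              # energy per solute molecule in solvent 1 (reference)
e2 = 2.0 * K_B * T    # hypothetical: solvent 2 costs 2 kB T more per molecule

ratio = concentration_ratio(e1, e2, T)
print(f"equilibrium C1/C2 = {ratio:.2f}")

# At equilibrium, moving a molecule across the membrane changes nothing.
df_eq = free_energy_change(e1, e2, ratio, 1.0, T)
print(f"Delta F at equilibrium, in units of kB T: {df_eq / (K_B * T):.2e}")

# With equal concentrations, moving a molecule from solvent 2 to solvent 1
# (the reverse direction, hence the minus sign) lowers the free energy.
df_equal = -free_energy_change(e1, e2, 1.0, 1.0, T)
print(f"Delta F for 2 -> 1 at equal concentrations, in kB T: "
      f"{df_equal / (K_B * T):.2f}")
```

The ratio is finite: even with a clear energy preference for solvent 1, some solute stays in solvent 2.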

11.2 Statistical Mechanics

The observations of the previous section have far-reaching application throughout biology, chemistry, and physics more generally. If we solve for the ratio of concentrations in Eq (11.6) we find

$$ \frac{C_1}{C_2} = e^{-\Delta E / k_B T} \tag{11.9} $$

where ∆E = E1 − E2 is the difference in energy, per solute molecule, between one environment and the other. Now we stop and interpret this expression. The concentrations can be thought of as proportional to the probabilities that a molecule in thermal equilibrium will occupy space in one solution or the other, as if it were a matter of chance where a molecule finds itself. Higher probability in one region would mean higher concentration in that region. Thus we can think of Eq (11.9) as telling us the relative probability of being in region 1 or region 2. It is as if the probability P of being in a given volume in a region depended on the energy E the molecule would have there according to

$$ P \propto e^{-E / k_B T} \tag{11.10} $$

Then

$$ \frac{C_1}{C_2} = \frac{P_1}{P_2} = \frac{e^{-E_1 / k_B T}}{e^{-E_2 / k_B T}} = e^{-(E_1 - E_2) / k_B T} \tag{11.11} $$

correctly reproduces Eq (11.6). This description of the behavior of molecules in terms of probabilities given by Eq (11.10) is called equilibrium statistical mechanics, and Eq (11.10) is called the Maxwell-Boltzmann distribution.

In deriving Eq (11.10) we thought about a molecule that could be on one side or the other of a semipermeable membrane, but the result turns out to be much more general than that. Eq (11.10) describes the equilibrium probability distribution of molecules that have available to them any states at all characterized by energy E. For example, molecules in the air have available to them states at different heights, and these are characterized by gravitational potential energy mgh, where m is the mass of the molecule. If we take h = 0 to be ground level, and assume the atmosphere is in thermal equilibrium at temperature T, the probability of a molecule’s being at height h is less than the probability of being at h = 0 by the factor $e^{-mgh / k_B T}$. Thus the density of the atmosphere, and hence the pressure, is also less by this factor.
This prediction, that the pressure falls off exponentially with height, is called the “Law of Atmospheres.” It isn’t quite right, because the atmosphere is not really in thermal equilibrium, but it is a reasonable approximation. This is just one example of a quick insight from statistical mechanics. It tells us why the atmosphere doesn’t fall to the ground!
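A rough numerical sketch of the Law of Atmospheres follows, in Python. It assumes that the air can be represented by nitrogen molecules (28 atomic mass units) at a uniform temperature of 290 K; both values are illustrative assumptions, as are the function names and the sample heights. The height at which the density falls by a factor of e, namely kB T/mg, sets the scale.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
G = 9.8              # gravitational acceleration, m/s^2
AMU = 1.66054e-27    # atomic mass unit, kg


def relative_density(height, mass, temperature):
    """Boltzmann factor exp(-m g h / kB T): density relative to ground level."""
    return math.exp(-mass * G * height / (K_B * temperature))


def scale_height(mass, temperature):
    """Height kB T / (m g) over which the density falls by a factor of e."""
    return K_B * temperature / (mass * G)


m_n2 = 28.0 * AMU    # assumed: a nitrogen molecule stands in for "air"
T = 290.0            # assumed uniform temperature, K

H = scale_height(m_n2, T)
print(f"scale height: {H / 1000.0:.1f} km")
for h in (1000.0, 3000.0, 9000.0):   # sample heights in meters
    print(f"h = {h / 1000.0:.0f} km: relative density "
          f"{relative_density(h, m_n2, T):.2f}")
```

The scale height comes out at several kilometers, so thermal energy kB T is enough to keep molecules spread over a comparable range of heights.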

11.3 Randomness

A most intriguing thing about measurement uncertainty is that it often obeys a mathematical law, in spite of being uncertain. The law is statistical, a statement about average values. Suppose we measure something over and over, getting the values X1, X2, ..., XN. Even though we meant to be measuring the same thing again and again, the values are all different. In a situation like this, where we really don’t know the true value, because of measurement uncertainty, we usually find an average value for this quantity X, denoted with brackets < >,

$$ \langle X \rangle = \frac{X_1 + X_2 + \cdots + X_N}{N} \tag{11.12} $$

i.e., we add up all the values, and divide by the number of values, the usual sense of average, or mean. This is the sample mean of X, because it has been obtained by “sampling”, i.e., repeated measurements.

Now we try to model this uncertainty mathematically by assuming that for each trial the measured values were a little bit off from the “true” value XT, namely

Xi = XT + ei, where i = 1, 2, ..., N (11.13)

Here ei = Xi − XT is the little random error in the ith measurement. Then the sample mean of all the Xi is (more or less repeating ourselves)

$$ \langle X \rangle = \frac{(X_T + e_1) + (X_T + e_2) + \cdots + (X_T + e_N)}{N} \tag{11.14} $$

$$ \phantom{\langle X \rangle} = X_T + \frac{e_1 + e_2 + \cdots + e_N}{N} \tag{11.15} $$

Thus the sample mean is the “true” value XT , with a little bit of “wrapping” added, just as Galileo said. In what follows we will call the second term in (11.15) the “error”. Now why would we go to the trouble of making many measurements and averaging? The result < X > still has error, much the way each individual measurement Xi had its error ei. What has been gained? Averaging is only worth doing if the error in < X > is somehow less than it is in an individual measurement Xi, that is, if the error has gone down. This could happen if the individual ei’s were sometimes positive and sometimes negative, so that in adding them up there is considerable cancellation. The whole point of averaging is that we assume that this will happen, and that the sample mean < X > will be less uncertain than any individual Xi. This situation is complicated, and hard to think about. To simplify it we introduce a mathematical operation called expected value, denoted E. Expected value is like an average, but it is not found by sampling, adding, and then dividing by the number of samples. It is a purely theoretical operation, a kind of “ideal average”, simpler than averaging. It is a mathematical model of averaging, intended to model the unattainable case of sampling and averaging an infinite number of times. Instead of doing that, which would be impossible, we just use E. Here is what E gives for error terms involving e1 and e2:

$$ \begin{aligned} E[e_1] &= E[e_2] = 0 \\ E[e_1^2] &= E[e_2^2] = \sigma^2 \\ E[e_1 e_2] &= 0 \end{aligned} \tag{11.16} $$

In particular, the expected value of e1 is zero, because it is sometimes positive and sometimes negative, and the expected value of e1², which is always positive, and so can’t average to zero, is a number σ² which tells us the expected size of the error, just the quantity a good experimentalist is always keeping in mind. Thus σ² is a definite number that we know, at least approximately, if we are doing our job. The typical size of e1 itself would be the square root of σ², namely σ (‘sigma’), called the root mean square value of the error e1. The product e1e2 has expected value 0 because it is positive if e1 and e2 have the same sign, and negative if they have different sign, and neither possibility is favored, so that the product might well average, in an ideal sense, to 0. Finally, we imagine that all the uncertainties behave in the same way, because the measurement process is the same each time. Thus E[e5²] = σ², E[e7] = 0, E[e3e7] = 0, etc.

With the operation E we can assess the error term in the sample mean < X >. Applying it to the error in Eq (11.15) we find

$$ E\left[\frac{e_1 + e_2 + \cdots + e_N}{N}\right] = \frac{E[e_1] + E[e_2] + \cdots + E[e_N]}{N} = 0 \tag{11.17} $$

This does not mean that the error term is zero, or that the sample mean $\langle X \rangle$ will give us the true value $X_T$, but rather that the uncertainty term, or error term, is as often positive as negative, so that an ideal average of it would be zero. That is what E computes. (Note a property of E, coming from the idea of average: the average of a sum is the sum of the averages, and constants like 1/N can multiply outside the average.)
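The expected values in Eq (11.16) can be illustrated numerically by averaging over a large but finite number of trials. The sketch below is only an approximation to E, which models infinitely many samples. It assumes, purely for illustration, that the errors are independent Gaussian random numbers; the text requires only the three expected values, not any particular distribution. The function name `sample_averages` and its parameters are hypothetical.

```python
import random

def sample_averages(sigma=1.0, trials=200_000, seed=0):
    """Approximate E[e1], E[e1^2] and E[e1*e2] by finite averages.

    Assumption (not from the text): e1 and e2 are independent Gaussian
    draws with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)
    sum_e1 = 0.0
    sum_e1_sq = 0.0
    sum_e1_e2 = 0.0
    for _ in range(trials):
        e1 = rng.gauss(0.0, sigma)
        e2 = rng.gauss(0.0, sigma)
        sum_e1 += e1
        sum_e1_sq += e1 * e1
        sum_e1_e2 += e1 * e2
    return sum_e1 / trials, sum_e1_sq / trials, sum_e1_e2 / trials

mean_e1, mean_e1_sq, mean_e1_e2 = sample_averages(sigma=2.0)
# Expect values near 0, near sigma^2 = 4, and near 0 respectively.
print(mean_e1, mean_e1_sq, mean_e1_e2)
```

The finite averages fluctuate a little from run to run (or from seed to seed), which is exactly the difference between a sample average and the ideal average E.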

To find the typical size of the error, we should really look at the expected square error, averaging quantities that are positive, so that cancellation doesn’t conceal what is going on. Thus we compute
\[
E\left[\left(\frac{e_1 + e_2 + \cdots + e_N}{N}\right)^2\right] \tag{11.18}
\]

Squaring a sum like that is a bit of a mess, so we do it below for the case N = 3. It will be clear how to generalize to any N.
\[
\left(\frac{e_1 + e_2 + e_3}{3}\right)^2 = \frac{e_1^2 + e_1 e_2 + e_1 e_3 + e_2 e_1 + e_2^2 + e_2 e_3 + e_3 e_1 + e_3 e_2 + e_3^2}{9} \tag{11.19}
\]

Now apply E. Terms like $E[e_1^2]$ give $\sigma^2$, and terms like $E[e_1 e_2]$ give zero, according to Eq (11.16). There are only 3 nonzero terms, so
\[
E\left[\left(\frac{e_1 + e_2 + e_3}{3}\right)^2\right] = \frac{3\sigma^2}{9} = \frac{\sigma^2}{3} \tag{11.20}
\]
More generally, in a sample mean with N samples, there would be N nonzero terms, and the expectation value of the square of the error term would be
\[
E\left[\left(\frac{e_1 + e_2 + \cdots + e_N}{N}\right)^2\right] = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N} \tag{11.21}
\]
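For readers who want the counting spelled out, the general case can be written with summation signs. This is only a restatement of the argument above in a notation not used elsewhere in this section (the indices $i$ and $j$ are introduced here for illustration): the square of the sum has $N$ squared terms and $N(N-1)$ cross terms.

```latex
\[
\left(\sum_{i=1}^{N} e_i\right)^2
  = \sum_{i=1}^{N} e_i^2 + \sum_{i \neq j} e_i e_j
\]
\[
E\left[\left(\frac{1}{N}\sum_{i=1}^{N} e_i\right)^2\right]
  = \frac{1}{N^2}\left( \sum_{i=1}^{N} E[e_i^2] + \sum_{i \neq j} E[e_i e_j] \right)
  = \frac{1}{N^2}\left( N\sigma^2 + N(N-1)\cdot 0 \right)
  = \frac{\sigma^2}{N}
\]
```

For N = 3 this gives 3 squared terms and 6 cross terms, the 9 terms of the expansion for N = 3.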

Then the root mean square error in $\langle X \rangle$ is the square root of this, namely $\sigma/\sqrt{N}$, and as N gets large, this does indeed get smaller, because $\sqrt{N}$ is in the denominator. This is why it is worth it to average many samples to get a sample mean. The error does indeed go down.
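The $\sigma/\sqrt{N}$ behavior can be checked with a small simulation. The sketch below repeats an imaginary experiment many times: each repetition averages N simulated errors, and the root mean square of those averages is compared with $\sigma/\sqrt{N}$. The Gaussian error model, the number of repetitions, and the name `rms_error_of_mean` are assumptions made for this illustration, not part of the text.

```python
import math
import random

def rms_error_of_mean(n_samples, sigma=1.0, trials=20_000, seed=1):
    """Estimate the rms error of a sample mean built from n_samples errors.

    Assumption (not from the text): each error is an independent Gaussian
    draw with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)
    total_sq = 0.0
    for _ in range(trials):
        # Error term of one sample mean: (e_1 + ... + e_N) / N
        mean_error = sum(rng.gauss(0.0, sigma) for _ in range(n_samples)) / n_samples
        total_sq += mean_error ** 2
    return math.sqrt(total_sq / trials)

for n in (1, 4, 16, 64):
    # Columns: N, simulated rms error, predicted sigma / sqrt(N) with sigma = 1
    print(n, rms_error_of_mean(n), 1.0 / math.sqrt(n))
```

Each fourfold increase in N should roughly halve the simulated rms error, which is the practical meaning of having $\sqrt{N}$ in the denominator.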

11.4 Brownian Motion

Brownian motion is a very concrete example of adding random quantities together. Since it is described by exactly the mathematical model of the preceding section, it is possible to say a word about it here. Recall from Section 1.4 that Brownian motion is the random jittering motion of a small particle suspended in a liquid. We model the situation by a series of random steps. Let us just think about the motion projected along one direction, so that it is one-dimensional: for example, we think about how the particle is going left and right in the microscope field, ignoring how it may also be going up and down. The description would be the same in the other direction.

We call the random steps $e_1$, $e_2$, etc., and imagine that the expected values are those in (11.16). In particular, since $E[e_1] = 0$, the step is as likely to go left as right. If we call the position of the Brownian particle X, then starting at X = 0 and adding up steps, after N steps we have

\[
X = e_1 + e_2 + \cdots + e_N \tag{11.22}
\]
Applying the expected value E we find, by the same arguments as before,
\[
E[X] = 0 \tag{11.23}
\]
\[
E[X^2] = N\sigma^2 \tag{11.24}
\]

If each step takes some average time $\tau$ (‘tau’), then the total time to take N steps is $t = N\tau$, so that $N = t/\tau$. The sample mean should approach the expected value if we take enough samples, so we expect an experiment to find
\[
\langle X^2 \rangle = \frac{t}{\tau}\,\sigma^2 = 2Dt \tag{11.25}
\]
where $2D = \sigma^2/\tau$ is a constant. That is, the mean squared displacement is proportional to time. This peculiar proportionality law is actually obeyed by Brownian particles, making it clear that their motion really should be considered random. From a plot of $\langle X^2 \rangle$ vs t you could determine the unknown constant D (called the diffusion constant) from the slope.
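The proportionality in Eq (11.25) can be explored with a simulated one-dimensional random walk. The sketch below follows many independent walkers, averages $X^2$ over them after each step, and estimates D from the slope of a straight line through the origin. The Gaussian step model, the values of `sigma` and `tau`, the number of walkers, and the function names are all assumptions chosen for illustration.

```python
import random

def mean_squared_displacement(n_steps, n_walkers=10_000, sigma=1.0, seed=2):
    """Return a list msd with msd[n] = <X^2> after n steps, for n = 0..n_steps.

    Assumption (not from the text): each step is an independent Gaussian
    draw with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)
    totals = [0.0] * (n_steps + 1)
    for _ in range(n_walkers):
        x = 0.0  # every walker starts at X = 0
        for n in range(1, n_steps + 1):
            x += rng.gauss(0.0, sigma)
            totals[n] += x * x
    return [total / n_walkers for total in totals]

def slope_through_origin(ts, ys):
    """Least-squares slope of the line y = slope * t forced through the origin."""
    return sum(t * y for t, y in zip(ts, ys)) / sum(t * t for t in ts)

sigma = 1.0  # rms step size (arbitrary units, illustrative value)
tau = 0.5    # time per step (arbitrary units, illustrative value)
msd = mean_squared_displacement(n_steps=100, sigma=sigma)
times = [n * tau for n in range(len(msd))]

D_estimate = slope_through_origin(times, msd) / 2.0  # slope = 2D
D_expected = sigma ** 2 / (2.0 * tau)                # from 2D = sigma^2 / tau
print(D_estimate, D_expected)
```

With these illustrative values the estimate should land within a few percent of $\sigma^2/2\tau$; using fewer walkers makes the plot of $\langle X^2 \rangle$ vs t visibly noisier, as one would expect from the $1/\sqrt{N}$ argument of the preceding section.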

Fig. 11.1: A typical Brownian path, a series of random steps

Problems

11.1 (a) Use the Maxwell-Boltzmann distribution to find the scale height of the atmosphere, assuming thermal equilibrium at temperature T = 0 °C. This is the height at which the pressure is lower than $P_{\rm atm}$ by the factor 1/e. Is the value reasonable? (b) Say in words why the atmosphere doesn’t simply fall to the ground because of its weight.

11.2 Molecules of NaCl in water can dissociate into Na$^+$ and Cl$^-$. As a result, all three species will be present, with concentrations we denote by [NaCl], [Na$^+$], and [Cl$^-$]. The two atoms in the molecule have one energy, $E_1$, when they are bound together as NaCl in an aqueous environment, and a different energy $E_2$ when they are dissociated in the aqueous environment. In going from bound to dissociated, the energy change is $\Delta E = E_2 - E_1$. (a) Use Eq (11.8) to find the change in entropy in going from bound to dissociated. (b) In thermal equilibrium, the concentrations must be such that $\Delta F = \Delta E - T\Delta S$ is zero, since otherwise the system could lower its free energy. What does this imply about the equilibrium concentrations? (c) $\Delta E$ is quite negative in this situation, i.e., the molecule lowers its energy by dissociating. What does this imply about the equilibrium concentrations?

Useful Values

We collect here useful natural constants and values.

Symbol     Meaning                            Value                              First Appearance
AU         astronomical unit                  1.50 × 10¹¹ m                      4.1.4
B_Earth    Earth’s magnetic field             0.5 G = 5 × 10⁻⁵ T                 17.2
c          speed of light                     3.00 × 10⁸ m/s                     2.7
c_water    specific heat of water             1 cal/g·K = 1 Cal/kg·K             9.5
cal        calorie                            4.18 J                             9.5
e          elementary charge                  1.60 × 10⁻¹⁹ C                     14.3
eV         electron volt                      1.60 × 10⁻¹⁹ J                     14.3
ε₀         permittivity of vacuum             8.854 × 10⁻¹² F/m                  14.4.1
η_water    viscosity of water                 10⁻³ Pa·s                          8.12
G          Newton’s gravitational constant    6.67 × 10⁻¹¹ N·m²/kg²              6.6
g          acceleration due to gravity        9.8 m/s²                           1.1
h          Planck’s constant                  6.63 × 10⁻³⁴ J·s                   6.7
ħ          h-bar                              h/2π = 1.05 × 10⁻³⁴ J·s            6.7
in         inch                               2.54 cm
j_solar    solar energy current density       1 kW/m²                            12.10
k          Coulomb’s k                        9.0 × 10⁹ N·m²/C²                  14.5
k_B        Boltzmann’s constant               R/N_A = 1.38 × 10⁻²³ J/K           9.4
lb         pound weight                       4.45 N                             5.4
mi         mile                               1.6 km


Symbol     Meaning                            Value                              First Appearance
m_e        mass of electron                   0.000549 u = 9.11 × 10⁻³¹ kg       20.2
m_p        mass of proton                     1.00726 u = 1.67 × 10⁻²⁷ kg        20.2
m_n        mass of neutron                    1.008665 u = 1.67 × 10⁻²⁷ kg       20.2
μ₀         permeability of vacuum             4π × 10⁻⁷ H/m                      17.9
N_A        Avogadro’s number                  6.022 × 10²³                       9.4
P_atm      atmospheric pressure               10⁵ N/m² = 760 mm Hg               8.7
R          gas constant                       8.31 J/K·mole = 2 cal/K·mole       9.4
R_E        radius of the Earth                6.37 × 10⁶ m                       4.10
ρ_water    density of water                   1 g/cm³ = 10³ kg/m³                8.1
σ          Stefan-Boltzmann constant          5.67 × 10⁻⁸ W/m²K⁴                 18.9
u          atomic mass unit                   1.66 × 10⁻²⁷ kg = 931.44 MeV/c²    20.2