
Foundations of Physics II PHYS-2302/6 Fall Term Preliminary Notes Gabor Kunstatter University of Winnipeg September 2016

The following contains preliminary lecture notes for the first half of a two term second year course on the Foundations of Physics. They are based on the lectures given in 2015 and will undoubtedly be revised as things progress.

Contents

1 Introduction

2 Symmetry
2.1 Introduction: Role of Symmetry in Modern Physics
2.2 Formal Aspects of Symmetry
2.2.1 Definition of a Symmetry Operation
2.2.2 Rules obeyed by symmetry operations
2.3 Examples of groups of symmetry operations
2.4 Symmetries were made to be broken
2.5 Symmetries and Conservation Laws: Noether’s Theorem

3 Spatial Symmetries
3.1 Review of Vectors
3.1.1 Position vector
3.1.2 Vector Operations
3.2 Transformations of the plane
3.2.1 Translation
3.2.2 Rotations
3.2.3 Reflections
3.3 Linear Transformations and matrices
3.3.1 Rotations
3.3.2 Reflections
3.3.3 Matrix Representation of Permutations of Three Objects (Optional)
3.4 Pythagoras and Geometry

4 Special Relativity
4.1 Preliminaries
4.1.1 Paradigm Shifts
4.1.2 Reference Frames
4.1.3 Spacetime Diagrams
4.2 Derivation
4.2.1 Fundamental Postulate (Newtonian and Einstein)
4.2.2 Galilean Relativity
4.2.3 The problem with Galilean Relativity
4.2.4 Maxwell’s Equations
4.2.5 Michelson-Morley Experiment
4.3 Consequences
4.3.1 Relativity of Simultaneity
4.3.2 Time Dilation
4.3.3
4.3.4 Lorentz Transformations
4.3.5 Relativistic Energy and Momentum
4.3.6 Final Notes

5 General Relativity
5.1 Problems with Newtonian Gravity
5.2 Einstein’s Thinking: the Strong Principle of Equivalence
5.3 Geometry of Spacetime
5.4 Some Consequences of General Relativity
5.5 Black Holes

6 Introduction to the Quantum
6.1 Particle or Wave?
6.2 Light as particles: early experimental hints
6.2.1 Blackbody Radiation: the Ultraviolet Catastrophe
6.2.2 Photoelectric Effect: not covered this year
6.2.3 Compton Effect
6.3 Particles as waves
6.3.1 deBroglie Wavelength
6.3.2 Observational Evidence
6.4 The Heisenberg Uncertainty Principle
6.5 Non-Locality in Quantum Mechanics
6.5.1 The EPR Paradox
6.5.2 Bell’s Inequalities

7 The Wave Function
7.1 Classical description of the state of a particle
7.2 Quantum description of the state of a particle
7.3 Interpretation

8 Momentum
8.1 Momentum and wave number
8.2 The momentum operator

9 The Schrodinger Equation
9.1 Energy Operator
9.2 Stationary states
9.2.1 Example 1: Free Particle
9.2.2 Example 2: Particle in a Box
9.2.3 Example 3: Simple Harmonic Oscillator
9.3 The time dependent Schrodinger equation

10 Conclusions

11 Appendix: Mathematical Background
11.1 Complex Numbers
11.2 Probabilities and expectation values
11.2.1 Discrete Distributions
11.2.2 Continuous probability distributions
11.2.3 Dirac Delta Function
11.3 Fourier Series and Transforms
11.3.1 Fourier series
11.3.2 Fourier Transforms
11.3.3 The mathematical uncertainty principle
11.3.4 Dirac Delta Function Revisited
11.3.5 Parseval’s Theorem
11.4 Waves
11.4.1 Moving pure waves
11.4.2 Complex Waves
11.4.3 Group velocity and phase velocity
11.4.4 Wave packets

Chapter 1

Introduction

The goal of physics is synthesis. We wish to take diverse physical phenomena and understand them within the context of a single model or theory. In order to test a theory we then try to apply it to phenomena not previously observed, i.e. make predictions. It is this last step that makes it falsifiable, a necessary condition for a model to constitute an acceptable theory.

The rules by which physics is played are indistinguishable from the rules of mathematics, with one additional rule: the assumptions and logical conclusions must not only be internally consistent, but they must be testable.

Physics and mathematics are two sides of the same coin. Mathematics provides the logical structures that are at the foundations of physics. Physics provides mathematics with new ideas for creating new logical structures, mostly out of necessity: structures that Nature is implementing. It is impossible to disentangle physics from mathematics. This is a more profound statement than the usual “mathematics is the language of physics”.

Examples where math and physics are literally two sides of the same coin:

• Differential equations: Newtonian mechanics

• Fourier transform theory: complementarity in quantum mechanics

• Partial differential equations: general relativity

• Differential geometry: general relativity

• Group theory: symmetry

Symmetry turns out to be a fundamental concept in both special relativity and many applications in quantum mechanics.

Chapter 2

Symmetry

2.1 Introduction: Role of Symmetry in Modern Physics

Physics is ultimately a quest to understand nature in terms of a few underlying principles. The goal is synthesis: to bring together within a single, unified framework as many different physical phenomena as possible. Examples of such synthesis include:

• Newton’s Law of Universal Gravitation, which described terrestrial and celestial motion within a single theory.

• Maxwell’s theory of electromagnetism, which unified electricity and magnetism.

• Einstein’s Special Relativity, which reconciled Maxwell’s theory with Newtonian mechanics.

• Electroweak theory, which unified Maxwell’s theory and the weak interactions, with the help of the Higgs particle (the so-called “God particle”).

In their quest for synthesis physicists are guided by somewhat subjective notions, such as simplicity and elegance. One important concept is the notion of symmetry. Everyone has an intuitive notion of what symmetry means. Symmetry plays an important role in art and aesthetics: pictures and people’s faces are more attractive if they are symmetrical, but not too symmetrical. While closely related to aesthetics, it is a concept that allows for rigorous quantitative analysis, and leads directly to a field of mathematics called group theory.

Symmetry plays a foundational role in modern physics. It is used (sometimes overused) as a guiding principle to decide on the “attractiveness” of theories and provide guidelines for constructing new theories: the more symmetry the better, and certain symmetries (e.g. translational invariance, and invariance under “boosts”) are thought by some to be “sacred”. There are sometimes surprises, however. As we will discuss a bit later, it was shown experimentally that the laws of physics are not invariant under reflection through a mirror.

One of the most important features of symmetry is its connection to conserved quantities. Noether’s theorem (Emmy Noether, 1915) states that every continuous global symmetry of the action for a given theory has associated with it a conserved “charge”. This is of great physical significance and also provides a tool for solving complicated partial differential equations.

Finally, symmetry turns out to be an indispensable practical tool for simplifying calculations. For example, in my research on black holes, I assume that black holes are perfectly round and smooth (not a bad approximation in most cases). This significantly reduces the number of degrees of freedom I have to consider and hence makes the problem (such as quantization of black holes) more tractable, hopefully without removing the essential features.

2.2 Formal Aspects of Symmetry

2.2.1 Definition of a Symmetry Operation

Clearly the notion of symmetry is important in modern physics, but in order to be useful to science, one must be able to quantify it: what does it mean to say that one object, or one spacetime, or one theory is more symmetrical than another? The basis for such a quantification is the following definition due originally to Hermann Weyl:

A symmetry operation is an action on an object, set of objects or set of equations (laws of physics) that leaves the system invariant, i.e. looking the same as before.

Examples of operations on objects or systems of objects: translations, rotations, reflections, interchanging them (permutations).

Examples of operations on systems of equations (laws of physics): same as above, applied to coordinates; taking linear combinations (permuting) of independent degrees of freedom (fields) in the equations.

One says that one object, or system of equations, is more symmetric than another if there are more symmetry operations that leave it invariant.

2.2.2 Rules obeyed by symmetry operations

We will use upper case Latin letters, A, B, ..., to denote symmetry operations. When considering two or more operations, the order in which they are carried out is important. AC denotes the action of C first followed by A. The reason for this initially counter-intuitive convention will be explained a little later on.

Consider the set of all possible symmetry operations of some object, i.e. the set S := {A, B, C, ...}. This set of symmetry operations evidently obeys the following:

1. If one performs one operation, C ∈ S say, and follows it by another, B say, then the combined operation must also belong to S. That is, G := BC ∈ S. Note: The operation CB must also belong to S, but there is no reason that BC and CB must correspond to the same symmetry operation in general.

2. Somewhere in the set S is the operation I, say, that corresponds to doing nothing at all, since this definitely leaves the object unchanged.

3. Any given symmetry can be undone, i.e. reversed, by another operation, which must by definition also be a symmetry operation and therefore belong to the set S. More technically, for every A there exists an H such that HA = I. There must also be an operation K, say, that is undone by A, so that AK = I.

4. Suppose one performs three operations one after the other: ABC. This must produce an unambiguous outcome that corresponds to some other symmetry operation. However we know from the above that F := BC is a symmetry operation, and that G := AB is also a symmetry operation. For the combination of three symmetry operations in a row to be unambiguous, the result must be independent of how you choose to associate the operations. In particular:

$$ AF := A(BC) \equiv (AB)C =: GC \tag{2.1} $$

Note:

1. Mathematically speaking the very physical notion of a set of symmetry operations actually defines something called a “group”, so that the study of symmetries leads naturally to the mathematical discipline called group theory. This is one of many examples in which it is difficult, if not impossible, to disentangle physics from mathematics. They are two sides of the same coin.

2. There is no physical or mathematical reason for the operation A · B to be the same as B · A. If it is the same, we say that A commutes with B. If this is true for all elements of S then we say that the group of operations is commutative.

2.3 Examples of groups of symmetry operations

• Doing nothing constitutes a rather trivial “set” of symmetries for any system: S = {I}. The multiplication table that satisfies all of the above is I · I = I.

• The set of all possible permutations of two identical particles (a, b). One symmetry operation is I, i.e. doing nothing. The other operation we will call P (for permutation), takes (a, b) → (b, a). You can verify that the complete multiplication table is: I · I = I, I · P = P · I = P , P · P = I. This means that P is its own inverse. This group of operations is commutative: it doesn’t matter what order you do any pair of operations. Note that since a and b are identical, the labels are arbitrary, but we do have to find some way of distinguishing between them otherwise we cannot keep track of the operations. Try to figure out the multiplication table for the two identical objects that don’t have different names: (a, a) ;-).

• The set of all possible permutations of three objects (a, b, c):

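Exercise 1: Construct all distinct elements of this group:

$$ I: \begin{pmatrix} a \\ b \\ c \end{pmatrix} \to \begin{pmatrix} a \\ b \\ c \end{pmatrix}; \qquad A: \begin{pmatrix} a \\ b \\ c \end{pmatrix} \to \begin{pmatrix} b \\ a \\ c \end{pmatrix} \tag{2.2} $$

$$ B: \begin{pmatrix} a \\ b \\ c \end{pmatrix} \to \begin{pmatrix} a \\ c \\ b \end{pmatrix}; \qquad C: \begin{pmatrix} a \\ b \\ c \end{pmatrix} \to \begin{pmatrix} c \\ b \\ a \end{pmatrix} \tag{2.3} $$

$$ F: \begin{pmatrix} a \\ b \\ c \end{pmatrix} \to \begin{pmatrix} c \\ a \\ b \end{pmatrix}; \qquad G: \begin{pmatrix} a \\ b \\ c \end{pmatrix} \to \begin{pmatrix} b \\ c \\ a \end{pmatrix} \tag{2.4} $$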

Note that the operation is defined in terms of its action on the elements in specific locations in the vector (i.e. A interchanges top and middle) and not by the action on specific letter names (interchange a with b). Exercise 2: Construct the multiplication table for this group of symmetry operations (see assignment). Verify explicitly that it is associative. Is the multiplication table commutative?

• The set of rotations that leave a square invariant.

S = {R(0), R(90), R(180), R(270)}. (2.5)

Note that: R(360) = R(0).

• The set of rotations that leave a hexagon invariant:

S = {R(0), R(60), R(120), R(180), R(240), R(300)}. (2.6)

• The set of rotations that leave a circle invariant

S = {R(θ), 0 ≤ θ < 2π} (2.7)

• Note that rotations in a plane commute:

$$ R(\theta_2) \cdot R(\theta_1) = R(\theta_1 + \theta_2) = R(\theta_1) \cdot R(\theta_2) \tag{2.8} $$

• The set of rotations of a cube (non-commutative)...
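The permutations of three objects listed in Eqs. (2.2) to (2.4) can be checked against the four group rules of Section 2.2.2 with a short Python sketch. The encoding below (a tuple of slot indices) and the helper names `apply`, `compose` and `times` are illustrative assumptions, not notation used in these notes; the convention that XY means "do Y first, then X" follows Section 2.2.2.

```python
from itertools import product

# Each operation is stored as a tuple p acting on positions in the column:
# the new entry in slot i is the old entry that sat in slot p[i].
OPS = {
    "I": (0, 1, 2),  # (a, b, c) -> (a, b, c)
    "A": (1, 0, 2),  # (a, b, c) -> (b, a, c)
    "B": (0, 2, 1),  # (a, b, c) -> (a, c, b)
    "C": (2, 1, 0),  # (a, b, c) -> (c, b, a)
    "F": (2, 0, 1),  # (a, b, c) -> (c, a, b)
    "G": (1, 2, 0),  # (a, b, c) -> (b, c, a)
}
NAMES = {p: name for name, p in OPS.items()}


def apply(p, column):
    """Act with the operation p on a column of three objects."""
    return tuple(column[i] for i in p)


def compose(x, y):
    """Return the operation XY: perform y first, then x."""
    return tuple(y[i] for i in x)


def times(x_name, y_name):
    """Look up the name of the product XY."""
    return NAMES[compose(OPS[x_name], OPS[y_name])]


# Rule 1 (closure): every product is again one of the six operations.
closed = all(compose(x, y) in NAMES for x, y in product(OPS.values(), repeat=2))

# Rule 4 (associativity): A(BC) = (AB)C for every triple.
associative = all(
    compose(x, compose(y, z)) == compose(compose(x, y), z)
    for x, y, z in product(OPS.values(), repeat=3)
)

# The group is not commutative: AB and BA are different operations.
print(closed, associative, times("A", "B"), times("B", "A"))
```

Because the operations are defined by their action on slots rather than on letter names, the same tuples work for any three objects placed in the column.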

2.4 Symmetries were made to be broken

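• If a symmetry is perfect, the effect of its operations cannot be resolved: the system looks exactly the same before and after. One needs to break the symmetry by hand (for example, by labelling otherwise identical objects) in order to keep track of the operations.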

• The laws of physics are thought to be invariant under translations, rotations, time reversal. The universe itself, at small scales at least, does not possess that much symmetry, thank goodness, since it would be a very boring place indeed. The question is, where did this observed diversity come from? Partial answer: initial conditions at the beginning of the Universe, i.e. the Big Bang. But then you have to ask, who ordered these initial conditions?

• Spontaneously broken symmetries: This happens when a set of equations that are invariant under some symmetry operations have solutions (minimum energy solutions in fact) that do not respect all the symmetries. Symmetry breaking exists in the real world and plays an important role in particle physics. The main role of the famous Higgs particle is to break the symmetry associated with the equations describing the unification of electromagnetism and the weak interaction. Without the presence (non-zero expectation value) of the Higgs, many of the particles in the model would not have masses, and the world would look very different indeed.

• Reflections (e.g. y → −y) are a good symmetry of Newtonian mechanics. This is called invariance under “parity”. Surprisingly, the quantum world does not respect parity (“violates parity”). This possibility was proposed by T.D. Lee and C.N. Yang and confirmed experimentally by C-S Wu in 1956. Lee and Yang received the 1957 Nobel Prize for their prediction.

2.5 Symmetries and Conservation Laws: Noether’s Theorem

Noether’s theorem is one of the most important, if not the most important, theorem in modern physics. The theorem was discovered by Emmy Noether (1882-1935), who made important contributions to abstract algebra as well as theoretical physics.

The theorem states that there exists a conserved quantity (charge) corresponding to every continuous global symmetry of the equations describing a system. Applications of Noether’s theorem are ubiquitous in physics: symmetry under spatial translations yields linear momentum as the conserved charge, while time-translational invariance implies energy conservation.

You may wonder where electric charge conservation comes from, since we have (almost) run out of space-time symmetries. In fact charge conservation comes from an “internal symmetry” associated with the quantum description of charged particles interacting with an electromagnetic field. As we will discuss in the section on quantum mechanics, charged particles can be described by complex wavefunctions interacting with a four-vector potential. The symmetry in question is a global constant change of the phase of the electron wave functions.

The above uses a lot of jargon that you will come to understand over the course of this year, and possibly next.

Chapter 3

Spatial Symmetries

3.1 Review of Vectors

3.1.1 Position vector

The following should be review and serves mostly to establish notation:

• We start by considering the points on a two dimensional plane. Once a set of cartesian x and y coordinate axes are specified, including scale, one can locate any point by specifying two numbers (x, y), the coordinates, which tell you how far along each axis you need to move to locate the point.

• Each point can also be found by specifying its position vector r. When the tail of r is placed at the origin, its tip sits on the point.

• The relationship between the coordinates and the position vector is:

$$ \mathbf{r} = x\,\hat{i} + y\,\hat{j} \tag{3.1} $$

where $\hat{i}$ and $\hat{j}$ denote unit vectors located at the origin pointing along the x-axis and y-axis, respectively.

• It is common to denote the vector r by the “column vector” consisting of its components:

$$ \mathbf{r} = \begin{pmatrix} x \\ y \end{pmatrix} \tag{3.2} $$

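3.1.2 Vector Operations

• Addition/Subtraction: Add/subtract vectors by adding/subtracting their components:

$$ \mathbf{r}_1 \pm \mathbf{r}_2 = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \pm \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = \begin{pmatrix} x_1 \pm x_2 \\ y_1 \pm y_2 \end{pmatrix} \tag{3.3} $$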

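• Magnitude: The magnitude squared (length squared) of a position vector is given by Pythagoras’ theorem:

$$ |\mathbf{r}|^2 = \mathbf{r} \cdot \mathbf{r} = x^2 + y^2 + z^2 \tag{3.4} $$

The magnitude is therefore:

$$ r := \sqrt{\mathbf{r} \cdot \mathbf{r}} = \sqrt{x^2 + y^2 + z^2} \tag{3.5} $$

The notion of magnitude generalizes to other types of vectors.

Velocity vector:

$$ \begin{aligned} \mathbf{v} &:= \frac{d\mathbf{r}}{dt} \\ &= v_x(t)\,\hat{i} + v_y(t)\,\hat{j} + v_z(t)\,\hat{k} \\ &= \frac{dx}{dt}\,\hat{i} + \frac{dy}{dt}\,\hat{j} + \frac{dz}{dt}\,\hat{k} \end{aligned} \tag{3.6} $$

Note that in general the time derivative of a vector is obtained (in cartesian coordinates) by differentiating the components with respect to time. For example, the instantaneous acceleration is:

$$ \begin{aligned} \mathbf{a}(t) &= a_x(t)\,\hat{i} + a_y(t)\,\hat{j} + a_z(t)\,\hat{k} \\ &= \frac{dv_x}{dt}\,\hat{i} + \frac{dv_y}{dt}\,\hat{j} + \frac{dv_z}{dt}\,\hat{k} \\ &= \frac{d^2x}{dt^2}\,\hat{i} + \frac{d^2y}{dt^2}\,\hat{j} + \frac{d^2z}{dt^2}\,\hat{k} \end{aligned} \tag{3.7} $$

The magnitudes squared, and the magnitudes, of the velocity and acceleration are:

$$ \begin{aligned} |\mathbf{v}|^2 &= \mathbf{v} \cdot \mathbf{v} = v_x^2 + v_y^2 + v_z^2 \\ v &= \sqrt{\mathbf{v} \cdot \mathbf{v}} = \sqrt{v_x^2 + v_y^2 + v_z^2} \\ |\mathbf{a}|^2 &= \mathbf{a} \cdot \mathbf{a} = a_x^2 + a_y^2 + a_z^2 \\ a &= \sqrt{\mathbf{a} \cdot \mathbf{a}} = \sqrt{a_x^2 + a_y^2 + a_z^2} \end{aligned} \tag{3.8} $$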

• Dot Product: We have been defining lengths in terms of the dot product of a vector with itself, as the notation suggests. One can also take the dot product of two different vectors. For example:

$$ \mathbf{v} \cdot \mathbf{a} := v_x a_x + v_y a_y + v_z a_z \tag{3.9} $$

The dot product allows you to calculate the angle θ between two vectors. For example, the angle between the velocity and acceleration of a particle is given by:

$$ \cos(\theta) = \frac{\mathbf{v} \cdot \mathbf{a}}{a v} \tag{3.10} $$

Properties of the Dot Product:

– Symmetric: $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}$

– Linear: $\mathbf{a} \cdot (p\mathbf{b} + q\mathbf{c}) = p\,\mathbf{a} \cdot \mathbf{b} + q\,\mathbf{a} \cdot \mathbf{c}$

• Vector product: The vector product of two vectors lying in the x − y plane is

$$ \mathbf{a} \times \mathbf{b} := (a_x b_y - a_y b_x)\,\hat{k} \tag{3.11} $$

We won’t bother with the three dimensional generalization, except to say that the general definition of a × b produces a new vector c whose direction is perpendicular to both a and b, with “sense” given by the right hand rule, and magnitude |c| = ab sin(θ), where θ is the angle between them. This definition of course applies to (3.11). Note that if a and b are parallel, then a × b = 0.
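The vector operations above are easy to try numerically. The following Python sketch is only an illustration: the function names and the sample vectors `v` and `a` are assumptions chosen for the example, and vectors are stored as plain tuples of components.

```python
import math


def dot(a, b):
    """Dot product: sum of products of matching components, as in (3.9)."""
    return sum(ai * bi for ai, bi in zip(a, b))


def magnitude(a):
    """Length of a vector, the square root of its dot product with itself."""
    return math.sqrt(dot(a, a))


def angle_between(a, b):
    """Angle in radians between two non-zero vectors, from (3.10)."""
    cos_theta = dot(a, b) / (magnitude(a) * magnitude(b))
    # Guard against rounding pushing the cosine slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def cross_z(a, b):
    """z-component of a x b for two vectors in the x-y plane, as in (3.11)."""
    return a[0] * b[1] - a[1] * b[0]


# Hypothetical sample vectors: these two are perpendicular, each of length 5.
v = (3.0, 4.0)
a = (-4.0, 3.0)

print(magnitude(v), dot(v, a), angle_between(v, a), cross_z(v, a))
```

For these two vectors the dot product vanishes and the z-component of the vector product equals the product of the magnitudes, as expected for an angle of 90°.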

3.2 Transformations of the plane

3.2.1 Translation

One can take any set of points on the plane and move all of them over by some constant vector

$$ \mathbf{d} = \begin{pmatrix} d_x \\ d_y \end{pmatrix} \tag{3.12} $$

by simply adding this vector to the position vectors of the points. The translation D then has the following action on all points (vectors) on the plane:

$$ D: \mathbf{r} \to \tilde{\mathbf{r}} = \mathbf{r} + \mathbf{d} = \begin{pmatrix} x + d_x \\ y + d_y \end{pmatrix} \tag{3.13} $$

This is equivalent geometrically to just taking the origin and moving it by −d. It is assumed that the laws of physics are invariant under translations. This means that the laws of physics are the same everywhere. As long as your laboratory is isolated from its surroundings, it doesn’t matter where in the universe you do the experiment, you will get the same result.

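3.2.2 Rotations

A rotation is defined as a linear transformation that leaves the lengths of vectors unchanged:

$$ \mathbf{r} = \begin{pmatrix} x \\ y \end{pmatrix} \to \tilde{\mathbf{r}} = \begin{pmatrix} \tilde{x} \\ \tilde{y} \end{pmatrix} = \begin{pmatrix} \cos(\theta)\,x + \sin(\theta)\,y \\ -\sin(\theta)\,x + \cos(\theta)\,y \end{pmatrix} \tag{3.14} $$

$$ |\tilde{\mathbf{r}}|^2 := \tilde{\mathbf{r}} \cdot \tilde{\mathbf{r}} = \tilde{x}^2 + \tilde{y}^2 \tag{3.15} $$

$$ = \mathbf{r} \cdot \mathbf{r} = x^2 + y^2 \tag{3.16} $$

This linear transformation rotates the vector r by an angle θ in the clockwise direction about the origin (for positive θ). If θ is negative, then the rotation is in the counterclockwise direction. It should be obvious that the rotation by −θ, R(−θ), is the inverse of the rotation by θ, R(θ), but you can verify this explicitly by performing one transformation after the other.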
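The two claims just made, that (3.14) preserves lengths and that rotating by −θ undoes a rotation by θ, can be checked numerically. This is a minimal Python sketch; the function name `rotate_clockwise` and the sample point are assumptions made for the illustration.

```python
import math


def rotate_clockwise(r, theta):
    """Apply the transformation (3.14) to the point r = (x, y)."""
    x, y = r
    c, s = math.cos(theta), math.sin(theta)
    return (c * x + s * y, -s * x + c * y)


# Hypothetical sample point and angle.
r = (2.0, 1.0)
theta = math.radians(30)

r_rotated = rotate_clockwise(r, theta)

# Lengths before and after the rotation agree up to rounding error.
length_before = math.hypot(*r)
length_after = math.hypot(*r_rotated)

# Rotating back by -theta recovers the original point up to rounding error.
r_back = rotate_clockwise(r_rotated, -theta)

print(length_before, length_after, r_back)
```

Applying the function to the point (0, 1) with θ = 90° gives a point on the positive x-axis, which confirms that positive θ corresponds to a clockwise rotation.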

3.2.3 Reflections

A reflection through the x-axis is given by:

$$ \begin{pmatrix} x \\ y \end{pmatrix} \to \begin{pmatrix} x \\ -y \end{pmatrix} \tag{3.17} $$

This also preserves lengths. To reflect through any other axis, say one that is rotated counterclockwise by an angle θ relative to the x-axis, one simply needs to first rotate the vector through θ in the clockwise direction in order to line up the reflection axis with the x-axis, then reflect through the x-axis, then rotate back (counterclockwise) by θ to get the reflection axis back to where it was.

Exercise: Check that this works for reflections through the x = y axis, i.e. θ = 45°.

3.3 Linear Transformations and matrices

A linear transformation, T , of the points on a plane is a rule for taking any point and moving it to another point as follows:

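$$ T: \mathbf{r} \to \tilde{\mathbf{r}} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix} \tag{3.18} $$

It is characterized by four numbers (a, b, c, d), which define the transformation and act exactly as above on all points in the plane. It is useful to separate the transformation from the points/vectors on which it acts, so the four numbers are often collected into a matrix (a 2 × 2 array):

$$ T = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{3.19} $$

Using this notation the transformation above reads:

$$ \tilde{\mathbf{r}} = T \cdot \mathbf{r} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix} \tag{3.20} $$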

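The above, which physically defines the linear transformations of points on a plane, also serves to define mathematically the left multiplication of a vector by a matrix. Again, two sides of the same coin.

So what happens if you do two linear transformations one after the other? Define:

$$ T_1 = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad T_2 = \begin{pmatrix} e & f \\ g & h \end{pmatrix} \tag{3.21} $$

From the above rules we know that after the first transformation:

$$ \mathbf{r} \to \tilde{\mathbf{r}} = T_1 \cdot \mathbf{r} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix} \tag{3.22} $$

Now apply $T_2$ to $\tilde{\mathbf{r}}$:

$$ \tilde{\mathbf{r}} \to \tilde{\tilde{\mathbf{r}}} = \begin{pmatrix} e(ax + by) + f(cx + dy) \\ g(ax + by) + h(cx + dy) \end{pmatrix} = \begin{pmatrix} (ea + fc)x + (eb + fd)y \\ (ga + hc)x + (gb + hd)y \end{pmatrix} \tag{3.23} $$

The last line looks like a new linear transformation, $T_3$, with elements:

$$ T_3 = \begin{pmatrix} ea + fc & eb + fd \\ ga + hc & gb + hd \end{pmatrix} \tag{3.24} $$

The above follows automatically from the expected properties of linear transformations. It also defines the product of two matrices as follows:

$$ T_3 = T_2 \cdot T_1 = \begin{pmatrix} e & f \\ g & h \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} ea + fc & eb + fd \\ ga + hc & gb + hd \end{pmatrix} \tag{3.25} $$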
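The statement that applying T1 and then T2 is the same as applying the single matrix T3 = T2 · T1 can be checked with a few lines of Python. This is a sketch under stated assumptions: matrices are stored as nested tuples, the helper names `mat_vec` and `mat_mul` are invented for the example, and the sample numbers are arbitrary.

```python
def mat_vec(T, r):
    """Left-multiply the vector r = (x, y) by the 2x2 matrix T, as in (3.20)."""
    (a, b), (c, d) = T
    x, y = r
    return (a * x + b * y, c * x + d * y)


def mat_mul(T2, T1):
    """Matrix product T2 . T1 written out element by element, as in (3.25)."""
    (e, f), (g, h) = T2
    (a, b), (c, d) = T1
    return ((e * a + f * c, e * b + f * d),
            (g * a + h * c, g * b + h * d))


# Hypothetical sample transformations and point.
T1 = ((1, 2), (3, 4))
T2 = ((0, 1), (1, 0))
r = (5, 6)

# Route 1: transform twice in succession (T1 first, then T2).
step_by_step = mat_vec(T2, mat_vec(T1, r))

# Route 2: build T3 = T2 . T1 once and apply it.
T3 = mat_mul(T2, T1)
all_at_once = mat_vec(T3, r)

print(step_by_step, all_at_once, T3, mat_mul(T1, T2))
```

Comparing `mat_mul(T2, T1)` with `mat_mul(T1, T2)` for these sample matrices also shows that the order of the two transformations matters in general.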

From the notion of linear transformations we have derived the two basic relations of linear algebra: the multiplication of a vector by a matrix and the multiplication of a matrix by a matrix.

As you might imagine, we will now use linear transformations to provide a concrete representation of symmetry operations on two dimensional objects. In particular, we will consider two very special types of linear transformations.

3.3.1 Rotations

A rotation is defined as a linear transformation that leaves the lengths of vectors unchanged:

$$ |\tilde{\mathbf{r}}|^2 := \tilde{\mathbf{r}} \cdot \tilde{\mathbf{r}} = \tilde{x}^2 + \tilde{y}^2 \tag{3.26} $$

$$ = \mathbf{r} \cdot \mathbf{r} = x^2 + y^2 \tag{3.27} $$

You can verify that the following matrix rotates the vector r by an angle θ in the clockwise direction about the origin (for positive θ). If θ is negative, then the rotation is in the counterclockwise direction.

$$ \mathbf{r} \to \tilde{\mathbf{r}} = R(\theta) \cdot \mathbf{r} \tag{3.28} $$

with

$$ R(\theta) = \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix} \tag{3.29} $$

You can also check that R(−θ) is the matrix inverse of R(θ), where the matrix inverse $A^{-1}$ of A is defined as:

$$ A^{-1} \cdot A = A \cdot A^{-1} = I \tag{3.30} $$

where I is the identity matrix:

$$ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{3.31} $$

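3.3.2 Reflections

A reflection through the x-axis is given by:

$$ \begin{pmatrix} x \\ y \end{pmatrix} \to \begin{pmatrix} x \\ -y \end{pmatrix} \tag{3.32} $$

This also preserves lengths. Its matrix representation is:

$$ R := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{3.33} $$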

To reflect through any other axis, say one that is rotated counterclockwise by an angle θ relative to the x-axis, one simply needs to first rotate the vector through θ in the clockwise direction in order to line up the reflection axis with the x-axis, then reflect through the x-axis, then rotate back (counterclockwise) by θ to get the reflection axis back to where it was:

$$ R(\theta) = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix} = \begin{pmatrix} \cos^2(\theta) - \sin^2(\theta) & 2\cos(\theta)\sin(\theta) \\ 2\cos(\theta)\sin(\theta) & -\cos^2(\theta) + \sin^2(\theta) \end{pmatrix} \tag{3.34} $$

Exercise: Check that this works for reflections through the x = y axis, i.e. θ = 45°.
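One way to explore the rotate, reflect, rotate-back recipe numerically is sketched below in Python. The helper names (`mat_mul`, `rotation`, `reflection`, `REFLECT_X`) are assumptions made for this illustration; `rotation` builds the clockwise matrix of (3.29), and `reflection` multiplies the three matrices in the same order as in (3.34).

```python
import math


def mat_mul(P, Q):
    """Product P . Q of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def rotation(theta):
    """Clockwise rotation matrix R(theta) of (3.29)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, s), (-s, c))


# Reflection through the x-axis, (3.33).
REFLECT_X = ((1.0, 0.0), (0.0, -1.0))


def reflection(theta):
    """Reflection through the axis at angle theta (counterclockwise from x).

    Reading right to left: rotate clockwise by theta, reflect through the
    x-axis, then rotate back counterclockwise by theta.
    """
    return mat_mul(rotation(-theta), mat_mul(REFLECT_X, rotation(theta)))


# For theta = 45 degrees the result should interchange x and y.
M = reflection(math.radians(45))
print(M)

# Reflecting twice through the same axis should give back the identity.
twice = mat_mul(M, M)
print(twice)
```

Up to rounding error, the matrix for θ = 45° has zeros on the diagonal and ones off the diagonal, so it sends (x, y) to (y, x), which is what a reflection through the line x = y should do.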

3.3.3 Matrix Representation of Permutations of Three Objects (Optional)

Consider the group of permutations {I, A, B, C, G, F} of three identical objects as defined in class.

• Deduce the multiplication table for this group of symmetry operations.

• Represent each element by a 3x3 matrix.

• Verify explicitly that the matrices obey the multiplication table deduced above in the following cases: F · C, C · F and F · F.

3.4 Pythagoras and Geometry

We take Pythagoras’ Theorem for granted. For any right triangle, the square of the length of the hypotenuse is equal to the sum of the squares of the lengths of the other two sides. One way to interpret it is as a definition of length in terms of displacements along two orthogonal axes (x and y, for example). The distance ∆s between any two points A and B on the plane is:

$$ |\Delta s|^2 = |\Delta x|^2 + |\Delta y|^2 \tag{3.35} $$

where $\Delta x = x_B - x_A$ and $\Delta y = y_B - y_A$. In this sense it defines the geometry of the space in which we live to be Euclidean. From this one theorem you can derive all the other properties of Euclidean geometry that we all know and love: parallel lines never meet, the sum of the angles in a triangle is 180°, the cosine law, etc.

Other, non-Euclidean, geometries are possible, each with its own version of Pythagoras. For example, if you live on the surface of a sphere of radius R (wait a minute, we do: it’s called the Earth) you can specify your location using the usual two angles, θ and φ. θ tells you your location along a line of longitude relative to the north pole, hence specifies your line of latitude. φ tells you your location along the lines of latitude at that longitude, hence determines your line of longitude. The corresponding Pythagoras law for infinitesimal displacements is:

$$ |ds|^2 = R^2 \left( d\theta^2 + \sin^2(\theta)\, d\phi^2 \right) \tag{3.36} $$

Because the right hand side depends explicitly on θ, namely how far along a line of longitude you are, the full non-linear relationship (i.e. for finite displacements) is more difficult to calculate than the flat earth society would have you believe. We can nonetheless determine some properties of the geometry associated with the surface of a sphere: the sum of the angles in a triangle is greater than 180° and parallel lines do meet.

In order to fully appreciate the essence of special relativity we need to understand not just the (Euclidean) geometry of space, but in fact the (Lorentzian) geometry of the space-time continuum. This was the great paradigm shift set in motion by Einstein’s special theory of relativity in 1905 and completed with his general theory of relativity in 1915.
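To get a feel for how (3.36) is used for finite displacements, one can add up many small steps ds along a path on the sphere. The Python sketch below does this with a simple midpoint sum; the function `path_length`, its parametrisation of the path by t between 0 and 1, and the two sample paths are assumptions made for the illustration, not part of the notes.

```python
import math


def path_length(theta_of, phi_of, radius, steps=10000):
    """Approximate the length of a path on a sphere by summing small ds.

    theta_of and phi_of give the angles as functions of a parameter t
    running from 0 to 1. Each small step contributes
    ds = R * sqrt(dtheta^2 + sin(theta)^2 * dphi^2), as in (3.36).
    """
    total = 0.0
    for k in range(steps):
        t0, t1 = k / steps, (k + 1) / steps
        dtheta = theta_of(t1) - theta_of(t0)
        dphi = phi_of(t1) - phi_of(t0)
        theta_mid = 0.5 * (theta_of(t0) + theta_of(t1))
        total += radius * math.sqrt(dtheta ** 2 + (math.sin(theta_mid) * dphi) ** 2)
    return total


R = 1.0  # sphere radius in arbitrary units

# Once around the circle of latitude at theta = 60 degrees from the pole.
latitude_circle = path_length(lambda t: math.pi / 3, lambda t: 2 * math.pi * t, R)

# From the north pole down to the equator along a line of longitude.
pole_to_equator = path_length(lambda t: 0.5 * math.pi * t, lambda t: 0.0, R)

print(latitude_circle, 2 * math.pi * R * math.sin(math.pi / 3))
print(pole_to_equator, 0.5 * math.pi * R)
```

The circle of latitude comes out shorter than the equator by the factor sin(θ), which is exactly the θ dependence on the right hand side of (3.36).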

Chapter 4

Special Relativity

Resources: Chapter 39 of text, and online lecture notes: Modern Technology Lecture Notes

4.1 Preliminaries

4.1.1 Paradigm Shifts

Every now and then our understanding of how the universe works gets turned on its head. Such paradigm shifts (a phrase coined by Thomas Kuhn, 1962) have a deep and long term impact on scientific and to some extent sociological development. The following lists some of the most important paradigm shifts in physics:

1. 16th Century: Before Copernicus, it was thought that the earth was the center of the universe and that the heavenly bodies (stars and planets) moved around it on trajectories based on the most perfect (i.e. most symmetrical) geometrical shape, the circle. Copernicus argued that the motions of the planets could most easily be understood by assuming that they, and the earth, move around the Sun. This was a radical change of perspective: humans no longer occupied a special place at the center of the universe.

2. 17th Century: Newton showed that mechanistic laws governed the behaviour of both terrestrial and heavenly bodies. The resulting view of a “clockwork universe” played a big role in fueling the industrial revolution.

3. 1905: Special relativity implied that time is not absolute, but relative and must be united with space to form a single “space-time continuum”.

4. Early 1900s: Quantum mechanics implied, much to Einstein’s distaste, that “God rolled dice”, i.e. that at the microscopic level the outcome of experiments was determined by probabilities and cannot be uniquely predicted in general, even in principle. Moreover it was later shown theoretically (Bell, 1964) and experimentally (Aspect, 1981) that the quantum world contained a “spooky action at a distance” called quantum entanglement. So much for Newton’s “clockwork universe”.

5. 1915: General relativity implied that the “space-time continuum” is not a fixed arena in which events play out, but in fact it is a dynamical participant.

It is interesting to note that Einstein played a major role in the last three paradigm shifts. It seems to me at least that we are ready for another paradigm shift.

4.1.2 Reference Frames

A reference frame (or “frame” for short) is simply a meter stick (or set of three perpendicular meter sticks in three spatial dimensions) with a clock attached. One sometimes talks of an observer in a particular frame of reference. An Inertial Frame is a frame of reference that is moving at constant velocity, i.e. at constant speed in a fixed direction.

Inertial frames are special (preferred) in Newtonian physics (and in special relativity), because a body in motion tends to stay in motion. You can detect whether or not you are in an inertial frame of reference by whether or not there are any spurious accelerations in the absence of external forces.

4.1.3 Spacetime Diagrams

A spacetime diagram consists of snapshots of space at successive times stacked on top of each other. Each point is the location of an event or potential event, labelled by a spatial coordinate (location) and time coordinate (time). Trajectories mapped out by moving objects are called world-lines. It is useful (for reasons we will see later) to choose units for the axes such that light rays are described by world-lines at 45°. Although such units seem far fetched, the simplest is light-years for distance and years for time, so that the speed of light is 1 light-year per year. Note that a world-line whose slope is greater than 1 indicates an object moving slower than light, whereas if the slope is less than 1, the speed is greater than 1, i.e. greater than the speed of light.

Figure 4.1: Two inertial frames O and O′ moving at constant speed relative to each other see the same event (explosion) but assign it different spacetime coordinates (x, t) and (x′, t′) respectively.

Figure 4.2: Successive snapshots of the motion of a particle in a single reference frame stacked on top of each other to form a spacetime diagram.

Figure 4.3: Spacetime diagram showing the world-line of observer O′ and light rays coming from two different events A and C.

4.2 Derivation

4.2.1 Fundamental Postulate (Newtonian and Einstein)

The laws of mechanics are the same in all inertial frames.

Note that this is a symmetry principle, on par with the invariance of the laws of physics under time translations, spatial translations and rotations.

4.2.2 Galilean Relativity Galilean Relativity refers to the transformations that relate coordinates in one inertial frame to those in another in Newtonian mechanics. One fundamental, implicit assumption was that time is universal: it moves at the same rate for all observers. Then, for the two frames shown in (4.1) the coordinates (x0, t0) of the “explosion” as measured by O0 are related to the coordinates (x, t) of O by:

\[ \Delta t' = \Delta t \tag{4.1} \]
\[ \Delta x' = \Delta x - v\,\Delta t \tag{4.2} \]

Addition of velocities in Galilean Relativity:

\[ U' = \frac{\Delta x'}{\Delta t'} = \frac{\Delta x - v\,\Delta t}{\Delta t} = \frac{\Delta x}{\Delta t} - v = U - v \tag{4.3} \]

As expected velocities add linearly. Unfortunately, this nice intuitive picture has been proven experimentally to be incorrect when at least one of the velocities is large.

Figure 4.4: Addition of velocities in Galilean relativity

4.2.3 The problem with Galilean Relativity

1. Consider the following set-up: Alice is in a stationary (relative to Bob) spacecraft. She sends a light ray from a mirror at the bottom of her spacecraft to a mirror directly above it at the top. The mirrors are three light seconds apart (it is a big spacecraft). Both Alice and Bob measure 3 seconds for the light transit time between the mirrors.

2. Now put Alice in the same spacecraft, moving at speed v = 4c/5. (See Fig. 4.9; Alice is observer O′, Bob is O.) Bob sees the light ray move a horizontal distance of 4 lt-sec as it makes its way from bottom to top, so the path it follows has length √(3² + 4²) = 5 lt-sec. The bottom-to-top light transit time measured by Bob is therefore 5 seconds (since the speed of light is independent of the motion of the source).

3. Question: Does Alice still measure 3 seconds, or 5 seconds, the same as Bob, for the bottom-to-top light transit time? Answer(s): (a) Galilean physics requires 5 seconds, so that time is absolute. (b) The principle of relativity requires that she still measure 3 seconds; otherwise she would know she is moving.

4. Question: What does each of the above imply about the speed of light that Alice measures? Answer(s): (a) According to Galilean physics light is moving more slowly in her frame, namely at c′ = 3 lt-sec / 5 sec = 3c/5. (b) The principle of relativity implies that she measures the same speed as Bob, despite the fact that she is moving relative to him.

5. Question: Does (a) necessarily violate the principle of relativity? Answer: No! One can reconcile the fact that Alice knows she is moving with relativity by supposing the existence of a medium, the aether, through which the spaceship is moving and through which the light propagates. This medium implies the existence of a preferred frame, i.e. the frame at rest relative to the aether, which explains the apparent violation of the spirit of special relativity. In particular, if both Alice and the aether were moving, she would have no way of detecting this motion. The possible existence of such an aether was ruled out by experiment, i.e. the Michelson-Morley experiment.

Conclusion: In the absence of a (detectable) aether, the principle of relativity forces us to conclude that the speed of light is constant, independent of the motion of either the source or the observer. This is very different from usual waves propagating through a fixed medium, in which motion of the observer through the medium would result in a different observed wave propagation speed.

4.2.4 Maxwell’s Equations

In Maxwell’s theory of electromagnetism, the speed of light is given by:

\[ c = \frac{1}{\sqrt{\epsilon_0 \mu_0}} \tag{4.4} \]

where ε0 = 8.854 × 10⁻¹² Farads/meter is the permittivity of free space that appears in Coulomb’s law:

\[ F_C = \frac{1}{4\pi\epsilon_0} \frac{q_1 q_2}{r^2} \tag{4.5} \]

It is the electrostatic analogue of Newton’s constant. μ0 = 4π × 10⁻⁷ Newtons/Ampere² is the permeability of free space. It appears in Ampere’s force law, which gives the magnetic force per unit length between two very long, straight, parallel current-carrying wires:

\[ \frac{F_m}{L} = \frac{\mu_0}{2\pi} \frac{I^2}{r} \tag{4.6} \]

where I is the current in each wire and r is the perpendicular distance between them. It is therefore the magnetic analogue of Newton’s constant.

Both μ0 and ε0 are fundamental constants of nature in the theory. Thus according to (4.4), c also appears to be a universal constant. There are two possibilities. Either Maxwell’s equations look very different in different inertial frames, or the speed of light is the same in all inertial frames, independent of the motion of the emitter or the detector. It is the latter (independence of the motion of the detector) that was problematic for all physicists except Einstein. Physicists thought of waves as propagating through a medium, like water waves or sound waves. Since light propagates everywhere, it was natural to assume that all of space was filled with a medium (the ether) through which light propagates. The speed of such waves doesn’t change if the emitter is moving relative to the medium, but it does change if the detector moves relative to the medium.

4.2.5 Michelson-Morley Experiment

This was the first experiment that tried to detect the Earth’s motion through the ether. Heuristically, a light beam was split into two pieces that propagated in perpendicular directions before rejoining. See Figure 4.5. Suppose we are moving with a speed v to the right through the ether (or the ether is moving with a speed v to the left past us).

Figure 4.5: Michelson-Morley experiment (source: ScienceWorld.wolfram.com)

The portion of the beam travelling parallel to the ether flow has speed c − v relative to the lab frame when it is moving against the ether and speed c + v when it is moving with it. The time of flight to traverse L to the right and then L to the left is:

\[ \Delta t_2 = \frac{L}{c - v} + \frac{L}{c + v} = \frac{2Lc}{c^2 - v^2} \tag{4.7} \]

The component of velocity v_perp of the beam that is perpendicular to the flow of the ether is:

\[ v_{\rm perp} = \sqrt{c^2 - v^2} \tag{4.8} \]

To see this, note that during the time ∆t the beam takes to get from the splitter to the top mirror, the lab has travelled a distance v∆t in the horizontal direction. The total distance the light has travelled in getting from the splitter to the top is:

\[ D = c\,\Delta t = \sqrt{v_{\rm perp}^2 (\Delta t)^2 + v^2 (\Delta t)^2} \tag{4.9} \]

Solving for vperp gives the above result.

The time of flight along L and back is then:

\[ \Delta t_1 = \frac{2L}{\sqrt{c^2 - v^2}} \tag{4.10} \]

By doing a Taylor expansion in v²/c², and assuming v << c, the difference in time of flight is approximately:

\[ \Delta t_2 - \Delta t_1 = \frac{L}{c}\left(\frac{v^2}{c^2} + O\!\left(\frac{v^3}{c^3}\right)\right) \tag{4.11} \]

where L is the distance the separated light rays moved in perpendicular directions and v is the speed of the Earth through the ether. Given the speed of the Earth around the sun and the motion of the sun through the Milky Way, this speed should be roughly 30 km/s. This time difference should be detectable via the interference pattern produced when the two beams merge. The experiment produced a null result, as did all subsequent attempts to measure the motion of the Earth through the ether. Thus, there is no detectable aether relative to which Alice (O′) can measure her motion, and the principle of relativity requires her to measure a time of 3 sec, whether or not she is moving relative to Bob.

Conclusion: The speed of light must be the same for all observers, irrespective of the observers’ motion relative to the source.
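To get a feel for the size of the effect in (4.11), the following Python sketch compares the exact difference between (4.7) and (4.10) with the leading-order term. The arm length of 11 m is an assumption chosen purely for illustration (the notes do not quote a value), and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def time_parallel(L, v, c=C):
    """Round-trip time along the arm parallel to the ether flow, eq. (4.7)."""
    return L / (c - v) + L / (c + v)

def time_perpendicular(L, v, c=C):
    """Round-trip time along the arm perpendicular to the flow, eq. (4.10)."""
    return 2.0 * L / math.sqrt(c ** 2 - v ** 2)

def leading_order_difference(L, v, c=C):
    """Leading term of eq. (4.11): (L/c) * (v/c)^2."""
    return (L / c) * (v / c) ** 2

L_arm = 11.0     # ASSUMED arm length in metres (illustrative only)
v_earth = 3.0e4  # 30 km/s, the speed quoted in the text

exact = time_parallel(L_arm, v_earth) - time_perpendicular(L_arm, v_earth)
approx = leading_order_difference(L_arm, v_earth)
print(f"exact difference   : {exact:.3e} s")
print(f"leading-order term : {approx:.3e} s")
```

The two numbers agree closely because the neglected terms are suppressed by further powers of v/c, which is about 10⁻⁴ here.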

4.3 Consequences

Summary of Consequences

1. The Galilean transformations in Section 4.2.2 can’t be right, because they imply the linear addition of velocities (4.3). If a light ray was moving at a speed U = c relative to O, it necessarily would move at a speed U′ = c − v relative to O′. Einstein realized that if v was close to the speed of light, then the observer O′ would in principle be able to see an almost stationary light ray (actually he realized this at sixteen). This had never been observed and reinforced his notion that the speed of light must be the same in all frames.

2. The Relativity of Simultaneity

3. Time Dilation

4. Length Contraction

5. General Lorentz transformations → the inevitability of the space-time continuum (a huge paradigm shift) → space-time diagrams.

6. Momentum and energy as space-time “four-vectors” that transform relativistically → E = mc2 → nuclear bombs and nuclear energy.

We will now discuss these consequences one at a time.

4.3.1 Relativity of Simultaneity

A thought experiment indicates that observers in different frames can disagree on whether two events occurred at the same time, or at different times. Specifically, if events A and B occur at equal distances from some observer, O, and the light from both events reaches O simultaneously, then O concludes that A and B happened at the same time. As illustrated in the figures below, if another observer O′ is moving relative to O in the direction from A towards B, and happens to be at the same place as O at the time that O supposes the two events happened, then O′ will necessarily be somewhat closer to B when O observes the events. Thus, O′ will have already seen B, but not A, and must conclude that B happened before A. If O′ is moving fast enough relative to O, they can also disagree on the time ordering of two events. That is, even if O receives the signals from A before B, if O′ has travelled far enough she will pass the signals from B before those of A.

Figure 4.6: Relativity of simultaneity

Figure 4.7: Relativity of simultaneity illustrated on a spacetime diagram

Surface of Simultaneity

When we draw the x-axis on a spacetime diagram we take for granted that the x-axis corresponds to “space” at the time t = 0. Any horizontal line will correspond to a snapshot of “space” at the corresponding later time. In more technical terms, a horizontal line crossing the t-axis at some time, say t = 3 pm, corresponds to the location in spacetime of all possible events that are simultaneous with 3 pm as marked on a clock sitting at x = 0. It is a surface of simultaneity for observer O. To try to clarify this, suppose at 2 pm we send a laser beam out in space towards the asteroid belt surrounding the solar system. The laser beam hits an asteroid and causes it to explode. We see the explosion at 4 pm. Assuming that the speed of light is the same coming and going, we easily conclude that the asteroid exploded at 3 pm. This is the general procedure by which we can, in principle, map out all the events in spacetime that coincide with a particular time on our clock. This is what is meant by the surface of simultaneity at some time t for an observer O.

Figure 4.8: Surface of simultaneity for moving observer O′.

What about the surfaces of simultaneity for an observer O′ moving with respect to O at some speed v? We need to remember how one might establish the time of events that are happening far away. One way is to send out light rays that you can use to “look at” the events. Assuming the speed and time of flight are the same going and coming, if you send out a light ray at, say, ta, and it comes back to you at tb with a picture of the event, then you know for sure that the time of the event as measured on your wristwatch (cell-phone?) was (ta + tb)/2, i.e. right in the middle. By sending out a series of pulses and waiting for them to come back you can map out a surface of simultaneity for any specific time.

Note that this effect depends on the finite propagation speed of light. It demonstrates a deep meaning of special relativity once it is understood properly. The point is that because the speed of light is the same for O and O′, in fact for all observers, there are no criteria by which one can conclude that one or the other is correct with respect to time ordering. As long as the postulates of special relativity are correct, there can be no such criteria. Thus, in a very real sense, the time ordering of certain events has no absolute, frame independent meaning. The reason we restrict this statement to certain events will become clearer a bit further on.
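The light-pulse (“radar”) procedure for assigning a time to a distant event can be written as a two-line calculation. The Python sketch below is only an illustration: the function name `radar_coordinates` is hypothetical, and it assumes, as the text does, that the pulse travels at the same speed going out and coming back.

```python
def radar_coordinates(t_sent, t_received, c=1.0):
    """Time and distance an observer assigns to a distant event.

    t_sent     : time on the observer's clock when the pulse leaves
    t_received : time on the same clock when the reflected pulse returns
    c          : speed of light in the chosen units (1 light-hour per hour here)
    """
    t_event = (t_sent + t_received) / 2.0        # halfway between the two times
    x_event = c * (t_received - t_sent) / 2.0    # half the round-trip distance
    return t_event, x_event

# Asteroid example: pulse sent at 2 pm (14.0 h), explosion seen at 4 pm (16.0 h).
t_event, x_event = radar_coordinates(14.0, 16.0)
print(t_event, x_event)
```

With times in hours and c = 1 light-hour per hour, the explosion is assigned the time 3 pm and a distance of one light-hour, as in the text.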

Figure 4.9: Time dilation

4.3.2 Time Dilation Back to Alice and Bob....

Derivation: In general, consider a vertical box (spaceship) of height l in frame O′ as it moves at speed v relative to O. The clocks in O and O′ were originally synchronized to run at the same rate when they were both in the same inertial frame. A pulse of light drops from the top of the box to the bottom as shown in Fig. 4.9.

The time measured between these two events by O′ is

\[ dt' = l/c \tag{4.12} \]

Let’s say that the time elapsed between the same two events in O is dt. During this time the box has moved a distance v dt. From O’s point of view the light pulse has moved a distance

\[ D = \sqrt{l^2 + v^2 dt^2} \tag{4.13} \]

Since O and O′ both must measure the speed of light to be the same, namely c, it must be true that:

\[ D = c\,dt \tag{4.14} \]

Putting (4.12), (4.13) and (4.14) together yields a relationship between dt and dt′:

\[ c\,dt = \sqrt{c^2 dt'^2 + v^2 dt^2} \;\rightarrow\; dt = \frac{dt'}{\sqrt{1 - \frac{v^2}{c^2}}} = \gamma\,dt' \tag{4.15} \]

As long as 0 < v < c, γ > 1, which means that from O’s point of view, the clock in O′ is running slowly. For example, if v = (√3/2) c, then γ = 2, so that for every second that passes in O′, 2 seconds pass in O.
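As a quick numerical check of (4.15), the Python sketch below evaluates γ for the example just given and for the Alice and Bob set-up of Section 4.2.3 (v = 4c/5, 3 seconds on Alice’s clock). The function names are hypothetical, and speeds are passed as the fraction v/c.

```python
import math

def gamma(v_over_c):
    """Lorentz factor of eq. (4.15) for a speed given as a fraction of c."""
    if not 0.0 <= abs(v_over_c) < 1.0:
        raise ValueError("gamma is only real and finite for |v| < c")
    return 1.0 / math.sqrt(1.0 - v_over_c ** 2)

def dilated_time(dt_prime, v_over_c):
    """Time dt measured in O for an interval dt' on a clock at rest in O'."""
    return gamma(v_over_c) * dt_prime

print(gamma(math.sqrt(3) / 2))   # the example in the text, close to 2
print(dilated_time(3.0, 0.8))    # Alice's 3 s as measured by Bob, close to 5 s
```

The second line reproduces the 5 seconds that Bob measures for the light transit, which is just the statement that γ = 5/3 at v = 4c/5.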

Properties of Time Dilation • γ is infinite if v = c and γ is imaginary when v > c. This suggests that speeds greater than c are impossible in special relativity.

• If v/c << 1, then γ ≈ 1 + v²/(2c²).

• Despite the fact that the clocks are identical and were synchronized when they were in the same frame of reference, O sees the clock of O′ as running slowly. This is why it is called time dilation.

• The effect is completely symmetrical. Nothing distinguishes the frame O from O′ in the above argument. We can therefore also conclude that O′ sees the clock of O as running slowly, by the same factor.

• One must be cautious in applying this formula. It specifically compares the rate of a clock that is stationary in the O′ frame to the rate of a clock that is stationary in a different inertial frame, O. If the clock is also moving relative to O′, then the transformation equation needs to be modified. We will learn about this later.

Proper time: The time interval along a particular world line between two events as measured by a clock moving along that world line. As we will see later, it defines the geometrical, coordinate independent distance along that path in spacetime. Note ∆x′ = 0, so that ∆t′ is the proper time elapsed between the two ticks of the clock carried by O′. ∆t is not the proper time between those same two events because ∆x ≠ 0.

• Example 1: Trip to the grocery store. Calculate how much less you age as you travel 5 km in a car at 100 km/hr ≈ 30 m/s, relative to a clock that sits stationary. You can take the speed of light to be a billion km/hr (check this). Note that 5 km is the distance measured in the “Earth’s” frame, not in your moving frame. More about that later.

• Example 2: Trip to Mars. Suppose astronauts travel to Mars at 0.001c. The distance to Mars is about 55 million km. How long would the trip take from NASA’s point of view? How much would the astronauts age? (i.e. What is the proper time elapsed along the journey?)

• The Twin Paradox: The Olsen Twins (anyone remember them?) are turning 30 years old. Ashley decides to leave the business and spend her fortune on a space trip to Alpha Centauri, four light years away. The spaceship she rents travels at 0.8c. It turns out that Alpha Centauri is rather boring, so she only stays a day or two before returning.

– What is γ at 0.8c? Answer: γ = 5/3 ≈ 1.7

– How long does the return trip appear to take to Mary-Kate?

\[ t = \frac{2L}{v} = 2 \times \frac{4\ \text{light years}}{0.8\ \text{light years/year}} = 10\ \text{yrs} \tag{4.16} \]

– How long does the return trip appear to take for Ashley?

\[ t' = \frac{t}{\gamma} = \frac{10}{1.7}\ \text{years} \approx 6\ \text{yrs} \tag{4.17} \]

Question: This fits with our expectation that Mary-Kate sees Ashley’s (biological) clock running slowly, but during the trip Ashley should see Mary-Kate’s clock run slowly. Which twin is really older once they meet again on Earth?

Answer: There is no paradox here. The time dilation formula, and in fact all Lorentz transformation laws that we will learn, apply only to the relationship between coordinates of observers in inertial (non-accelerating) frames. As long as both remain in inertial frames, they cannot synchronize their clocks more than once. One, or both, of them have to accelerate/decelerate in order to meet up again. In this case it is Ashley who decelerates to stop at Alpha Centauri, and then accelerates to return to Earth. She is therefore not always in an inertial frame, and her path through spacetime (world-line) is very different from Mary-Kate’s. It is therefore no surprise that one ages more. As discussed above, the amount we age is determined by the proper time along our particular world-line. Ashley’s trip to Alpha Centauri and back traverses a world line for which the proper time is less than that of Mary-Kate’s (even though naively it doesn’t look that way on a space-time diagram), so she ages less, by the amount given by the time dilation formula.
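The numbers in the twin example can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in (4.16) and (4.17); the variable names are hypothetical and the day or two spent at Alpha Centauri is neglected, as in the text.

```python
import math

def gamma(v_over_c):
    """Lorentz factor for a speed given as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - v_over_c ** 2)

distance_ly = 4.0   # one-way distance in light years, as quoted in the text
v_over_c = 0.8      # speed of the ship as a fraction of c

# Earth-frame duration of the round trip, eq. (4.16): distance / speed, twice.
earth_years = 2.0 * distance_ly / v_over_c

# Proper time along Ashley's world-line, eq. (4.17).
ship_years = earth_years / gamma(v_over_c)

print(f"gamma       = {gamma(v_over_c):.3f}")
print(f"Mary-Kate   : {earth_years:.1f} years")
print(f"Ashley      : {ship_years:.1f} years")
```

Using the unrounded value γ = 5/3 gives exactly 6 years for Ashley, which is why the rounded division by 1.7 in (4.17) is quoted as about 6 years.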

Experimental Verification Muons are a type of elementary particle created by very energetic cosmic ray collisions in our upper atmosphere a few kilometers above the Earth’s surface. They are unstable, and only live about 2.2 microseconds before decaying into something else. This has been measured in laboratory experiments. The ones created in cosmic ray collisions have more kinetic energy, and travel at about 0.99c, relative to the Earth. Since they live only 2.2 microseconds they should only travel a distance:

\[ L' = 0.99c \times 2.2 \times 10^{-6}\,\mathrm{s} \approx 660\,\mathrm{m} \tag{4.18} \]

through our atmosphere before they decay. If this were true, we would never detect them on Earth. However, the time dilation expression tells us that in the Earth’s frame of reference, their lifetime should be ∆t = γ∆t′, where γ = 1/√(1 − (0.99)²) ∼ 7, so that they live 7 times as long in our frame and can therefore travel 7 × 660 m ∼ 4600 m. This is greater than the distance between the upper atmosphere where they are created and the Earth’s surface, so we do in fact detect them. This is a striking confirmation of time dilation. The lifetime of the muon appears longer as measured in our frame of reference. The distance, L, that the muon moves as measured in the Earth frame is different from the distance, L′, that it should be able to move in its own frame of reference. This is related to the next consequence of special relativity, namely length contraction.
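The muon estimate is easy to check numerically. The following Python sketch uses the rounded value c = 3 × 10⁸ m/s (an assumption made here for convenience) together with the lifetime and speed quoted in the text; the variable names are hypothetical.

```python
import math

C = 3.0e8              # speed of light in m/s (rounded; an assumption of this sketch)
MUON_LIFETIME = 2.2e-6 # proper lifetime in seconds, as quoted in the text
v_over_c = 0.99        # muon speed relative to the Earth, as quoted in the text

gamma = 1.0 / math.sqrt(1.0 - v_over_c ** 2)

# Distance covered in one lifetime if there were no time dilation, eq. (4.18).
distance_without_dilation = v_over_c * C * MUON_LIFETIME

# Distance covered in the Earth frame, where the lifetime is gamma times longer.
distance_with_dilation = gamma * distance_without_dilation

print(f"gamma                 = {gamma:.2f}")
print(f"without time dilation = {distance_without_dilation:.0f} m")
print(f"with time dilation    = {distance_with_dilation:.0f} m")
```

The unrounded results (γ slightly above 7, and distances of roughly 650 m and 4600 m) are consistent with the rounded figures used in the text.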

4.3.3 Length Contraction

Derivation

Observers O and O′ must agree on their relative speed so

\[ v = \frac{L}{\Delta t} = \frac{L'}{\Delta t'} \;\rightarrow\; L' = L\,\frac{\Delta t'}{\Delta t} = \frac{1}{\gamma}\,L \tag{4.19} \]

Properties:

• L′ < L, so meter sticks at rest w.r.t. O appear to be shortened as measured by O′.

• In the above transformation law L is the proper length of the meter stick which is stationary in O, whereas ∆t′ is the proper time of the observer O′ who is moving relative to O. Again the effect is completely symmetrical. Meter sticks at rest with respect to O′ appear shortened by precisely the same factor as measured by O.

• Spacetime Diagram: Time Dilation and Lorentz Contraction

– Time dilation: clock stays at x′ = 0 in O′.

– Lorentz contraction: lengths measured along surface of simultaneity (t′ = 0 in O′).

Examples:

• How long does the Earth’s atmosphere appear to the muon?

\[ L' = \frac{L}{\gamma} = \frac{4600\,\mathrm{m}}{7} \approx 660\,\mathrm{m} \tag{4.20} \]

which is of course how far the muon can be expected to go in its frame of reference.

• Global Positioning System (GPS) Navigation:

– Developed by the US Dept. of Defense. Network of 24 satellites orbiting about 20,000 km above the Earth.

– The satellites move at about 14,000 km per hour, so one orbit takes about 12 hours. By triangulating position using at least four satellites, one can locate a GPS receiver to an accuracy of about 10 meters.

– In order to achieve this kind of accuracy, the positions of the satellites must be known very accurately, and the clocks on the satellites must be synchronized to within about 30 nanoseconds. They use atomic clocks that are accurate to 1 nanosecond, but relativistic effects must be taken into account. In fact general relativity (i.e. Einstein’s theory of gravity) must also be incorporated, and gives a bigger effect than special relativity.

• Ladder in the Barn: At rest, a ladder 4 meters long just fits into a barn that is also 4 meters long, in its rest frame. Suppose you run towards the open door of the barn at 0.99c holding one end of the ladder. How long is the barn in your frame of reference? How long is the ladder in the barn’s frame of reference? Given the above, do you get to the door of the barn before, or after, the other end of the ladder hits the far end of the barn? This is easiest to answer clearly using spacetime diagrams. (See assignment)

• Death Star Betrayal (Alternative to Ladder in the Barn Paradox): Princess Leia, Han Solo and Chewbacca are planning to attack the Death Star in Solo’s spaceship. Solo knows from the last time he was captured that the spaceship is exactly the same length as a hangar on the Death Star that is open at both ends. His plan is to have Chewbacca fly through the hangar at 0.9c so that he and Princess Leia can jump onto the Death Star firing their laser weapons. He tells Leia to stand at the front of the spacecraft and fire the instant the front of the spaceship gets to the front end of the hangar. He will stand at the back of the spaceship and fire the instant he gets to the back of the hangar. Since the ship and the hangar are the same length, he tells Leia, they will fire precisely at the same instant and surprise the guards. However, Solo, who has taken special relativity in school, is actually trying to trick Leia. He knows that the hangar will appear shorter to them because of the speed at which they are moving, so Leia will actually fire first, drawing the guards’ fire and making things safer for him. Leia thinks about it for a minute and agrees. Was this a mistake? No. Leia, who also took relativity in school but passed it, knows that in the Death Star’s frame of reference, the spaceship appears shorter, so the guards will actually see Han Solo fire first and he will draw their fire. This illustrates that special relativity is not a purely academic endeavour, but can have dire consequences!

Figure 4.10: Death Star frame of reference. Red lines are the front and back of the spaceship. H denotes Han Solo reaching the back of the hangar, L denotes Leia reaching the front of the hangar. Solo’s lasers reach the guards at the center of the hangar first. This is true in any frame.

Note that Leia arrives first in Han Solo’s frame, but Solo’s lasers still hit the guards first. In the spaceship frame this is because the guards are moving away from the laser beams, which move at the same speed in both frames.

Figure 4.11: Spacetime Diagram in frame of reference of spaceship

4.3.4 Lorentz Transformations

Derivation of general form

So far we have derived time dilation, which deals with the proper time as measured by a clock that is stationary in one frame (∆x′ = 0), and length contraction, which pertains to the proper length of a meter stick at a single instant in time in one frame (∆t = 0). We would like to deal with the more general case, when both position and time change in both frames. Two observers O and O′ watch a clock (called C, for short) fly by. Let τp denote the proper time of the clock, i.e. the time elapsed as measured in the clock’s frame of reference. The speed of C relative to O is

\[ u = \frac{dx}{dt} \tag{4.21} \]

and the speed of C relative to O′ is

\[ u' = \frac{dx'}{dt'} \tag{4.22} \]

As derived in class earlier, for both O and O′:

\[ dt = \gamma(u)\,d\tau_p, \qquad dt' = \gamma(u')\,d\tau_p \tag{4.23} \]

Figure 4.12: Coordinates of a spacetime event P as measured by two different observers.

Since the change in proper time must be the same in both of the above transformations, we can, with a little algebra, conclude:

\[ c^2 d\tau_p^2 = (c^2 - u^2)\,dt^2 = (c^2 - u'^2)\,dt'^2 \;\rightarrow\; c^2 d\tau_p^2 = c^2 dt^2 - dx^2 = c^2 dt'^2 - dx'^2 \tag{4.24} \]

We now need to find a linear transformation from t, x to t′, x′ that satisfies the above. The answer is, for O′ moving with speed v to the right relative to O:

\[ c\,\Delta t' = \gamma(v)\,c\,\Delta t - \frac{v}{c}\,\gamma(v)\,\Delta x \tag{4.25} \]
\[ \Delta x' = -\frac{v}{c}\,\gamma(v)\,c\,\Delta t + \gamma(v)\,\Delta x \tag{4.26} \]

The relationship between the coordinate axes of O and O′ is given in Fig. 4.12. The parameter β is related to the relative velocity v between the two frames by:

\[ \tanh(\beta) = \frac{v}{c} \tag{4.27} \]

As we will see, it is the Lorentz transformation analogue of the angle θ of rotation relating two spatial axes.

Properties of Lorentz Transformation

• They are linear

• They are completely symmetric: to get the inverse transformation, i.e. expressing (x, t) in terms of (x′, t′), simply change v → −v.

• Non-relativistic limit: when v/c << 1, a Taylor expansion reveals that they agree with the Galilean transformations:

\[ ct' \rightarrow ct + O\!\left(\frac{v}{c}\right) \tag{4.28} \]
\[ x' \rightarrow x - vt + O\!\left(\frac{v}{c}\right)^2 \tag{4.29} \]

Assignment: do the Taylor expansion and calculate the leading order corrections.

• Time dilation. ∆t as measured by O for ∆t′, i.e. two ticks of a clock stationary in O′. According to (4.26)

\[ \Delta x' = 0 \;\rightarrow\; \Delta x = v\,\Delta t \tag{4.30} \]

and (4.25) then implies (after a bit of algebra):

\[ \Delta t' = \frac{1}{\gamma(v)}\,\Delta t \tag{4.31} \]

• Length contraction. ∆x′ = L′ is the measurement of the instantaneous length of a meter stick as measured by O′ moving past the meter stick at rest in O. Then (4.25) implies

\[ \Delta t' = 0 \;\rightarrow\; c\,\Delta t = \frac{v}{c}\,\Delta x \tag{4.32} \]

so that via (4.26):

\[ \Delta x' = \frac{1}{\gamma(v)}\,\Delta x \tag{4.33} \]

• Matrix form of Lorentz transformation: Think of the coordinates (ct, x) of an event as a vector X in spacetime (with tail at origin (0, 0)). Then in complete analogy with rotations of vectors in two-space, we can write (4.25) and (4.26) as the operation of a matrix Λ on the vector X:

\[ X' = \Lambda X \]
\[ \begin{pmatrix} ct' \\ x' \end{pmatrix} = \begin{pmatrix} \gamma(v) & -\frac{v}{c}\gamma(v) \\ -\frac{v}{c}\gamma(v) & \gamma(v) \end{pmatrix} \begin{pmatrix} ct \\ x \end{pmatrix} = \begin{pmatrix} \cosh(\beta) & \sinh(\beta) \\ \sinh(\beta) & \cosh(\beta) \end{pmatrix} \begin{pmatrix} ct \\ x \end{pmatrix} \tag{4.34} \]

where β = tanh⁻¹(v/c) as defined above. Note that a key property of the hyperbolic functions is that:

\[ \cosh^2(\beta) - \sinh^2(\beta) = 1 \tag{4.35} \]

in analogy with the ordinary trigonometric functions:

\[ \cos^2(\theta) + \sin^2(\theta) = 1 \tag{4.36} \]

Assignment: Prove that cosh(β) = ±γ(v) and sinh(β) = ±(v/c)γ(v). Define cosh(β) as positive; then the sign of sinh(β) is determined by the direction that O′ is moving relative to O. Show that the − sign corresponds to O′ moving to the right.
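The properties listed above can be checked numerically. The Python sketch below implements (4.25) and (4.26) directly, in units where c = 1, and verifies for one sample event that the combination c²t² − x² of (4.24) is unchanged and that changing v → −v undoes the transformation. The function names are hypothetical, and the sample event (a clock moving at 0.6c observed for c∆t = 5) is chosen only for illustration.

```python
import math

def gamma(v_over_c):
    """Lorentz factor for a speed given as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - v_over_c ** 2)

def boost(ct, x, v_over_c):
    """Eqs. (4.25)-(4.26): (ct', x') for O' moving to the right at speed v."""
    g = gamma(v_over_c)
    ct_prime = g * ct - v_over_c * g * x
    x_prime = -v_over_c * g * ct + g * x
    return ct_prime, x_prime

def interval_squared(ct, x):
    """The invariant combination c^2 t^2 - x^2 appearing in eq. (4.24)."""
    return ct ** 2 - x ** 2

# Two ticks of a clock at rest in O', described in O: the clock moves at 0.6c,
# so while c*dt = 5 elapses it covers dx = 3.
ct, x, v_over_c = 5.0, 3.0, 0.6

ct_p, x_p = boost(ct, x, v_over_c)
print("in O':", ct_p, x_p)   # x' is (numerically) zero; ct' is dt / gamma
print("interval:", interval_squared(ct, x), interval_squared(ct_p, x_p))

# Inverse transformation: same function with v -> -v.
ct_back, x_back = boost(ct_p, x_p, -v_over_c)
print("back in O:", ct_back, x_back)
```

Because x′ comes out as zero, this single example also reproduces the time dilation result (4.31): the elapsed time in O′ is the time in O divided by γ = 1.25.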

Spacetime Diagrams Revisited: Scales

When one rotates coordinate axes, the transformation preserves lengths, so as shown on the left hand side of Fig. 4.13, a distance a on the x-axis is equivalent to a distance a on the x′-axis. Another way to see this is to remember that rotations preserve the shape of circles: a circle of radius a looks exactly the same in both the (x, y) and (x′, y′) coordinates:

\[ x^2 + y^2 = (x')^2 + (y')^2 = a^2 \tag{4.37} \]

Lorentz transformations on the other hand preserve proper time or proper length:

\[ \text{Timelike vectors:}\quad c^2t^2 - x^2 = (ct')^2 - (x')^2 = c^2 T'^2 \]
\[ \text{Spacelike vectors:}\quad c^2t^2 - x^2 = (ct')^2 - (x')^2 = -(L')^2 \tag{4.38} \]

These are hyperbolae, as shown on the right hand side of Fig. 4.13. The shape of these hyperbolae is preserved in going from one set of coordinates to another. Thus, the length L′ as seen by O′, measured along her x′-axis, will also correspond to a length L′ along the x-axis where it intersects the corresponding hyperbola, as shown. The actual proper length L of the meter stick, whose two ends are moving along world-lines x = 0 and x = L in the diagram, is clearly greater than L′. This confirms what we know as length contraction: a moving observer O′ sees the meter stick as shorter, L′ < L. By the same token, the diagram also indicates that the observer O will measure a time T between two ticks of a clock that is greater than the proper time as measured by a clock carried by observer O′ moving relative to O.

Figure 4.13: Diagrams show how scales change as one goes to a rotated set of spatial axes (left) and boosted set of spacetime coordinate axes (right). On the left, lines of constant radial distance from the origin are circles, whereas on the right lines of constant proper distance/time from the origin are hyperbolae.

Addition of Velocities

Suppose a rocket is moving to the right relative to O′ at speed u′, and O′ is moving to the right with speed v relative to O. The Lorentz transformations (4.25) and (4.26) imply, after a bit of algebra, that:

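\[ u' := \frac{\Delta x'}{\Delta t'} = \frac{-v\,\gamma(v)\,\Delta t + \gamma(v)\,\Delta x}{\gamma(v)\,\Delta t - \frac{v}{c^2}\,\gamma(v)\,\Delta x} = \frac{u - v}{1 - \frac{uv}{c^2}} \tag{4.39} \]

where u := ∆x/∆t. Some properties: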

• If O′ is moving to the left instead of the right, just change v → −v.

• One can easily invert (4.39) to get:

\[ u = \frac{u' + v}{1 + \frac{u'v}{c^2}} \tag{4.40} \]

• Whenever u/c or v/c is much less than one, the above reduces to the usual Galilean (intuitive) addition of velocities.

• Example 1: Suppose v = c/2, and O′ fires a rocket at speed u′ = c/2 relative to her frame, in the direction of motion. The speed of the rocket observed by O is:

\[ u = \frac{u' + v}{1 + \frac{u'v}{c^2}} = \frac{c/2 + c/2}{1 + c^2/4c^2} = \frac{4}{5}\,c \tag{4.41} \]

• Now suppose the rocket fired by O′ also fires a smaller rocket moving at c/2 relative to the first rocket. The speed of the smaller rocket relative to O is then

\[ u_2 = \frac{\frac{4c}{5} + \frac{c}{2}}{1 + \frac{4}{5\cdot 2}} = \frac{13}{14}\,c \tag{4.42} \]

• If you keep repeating this process (Jet propulsion!) the speed relative to O gets closer and closer to the speed of light, but can never reach it. Thus special relativity, specifically the Lorentz transformation law, prevents a massive object from being accelerated to the speed of light. Another way to say this is that as an object moves faster relative to a given frame O, it gets more difficult to accelerate. Recall that inertia, or inertial mass, is the resistance of an object to a change in motion. This means that as an object moves faster, its inertial mass increases.

• If O′ fires a laser beam, moving at the speed of light, c, relative to her frame, then the speed of the laser beam with respect to O is:

\[ u = \frac{u' + v}{1 + \frac{u'v}{c^2}} = \frac{c + c/2}{1 + c^2/2c^2} = c \tag{4.43} \]

Thus, as expected, the speed of light is the same for O as it is for O′.
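The rocket examples above can be reproduced exactly with rational arithmetic. The Python sketch below implements (4.40) in units where c = 1, using the standard-library `fractions.Fraction` type so that results such as 4/5 and 13/14 appear without rounding; the function name `add_velocities` is hypothetical.

```python
from fractions import Fraction

def add_velocities(u_prime, v):
    """Relativistic addition of velocities, eq. (4.40), in units where c = 1."""
    return (u_prime + v) / (1 + u_prime * v)

half = Fraction(1, 2)

u1 = add_velocities(half, half)   # rocket fired at c/2 from a frame moving at c/2
u2 = add_velocities(half, u1)     # smaller rocket fired at c/2 from the first rocket
print(u1, u2)

# Repeating the process: the speed creeps towards 1 (i.e. c) but never reaches it.
u = half
for stage in range(10):
    u = add_velocities(half, u)
print(float(u), u < 1)

# A light beam (u' = c) still moves at c as seen from O.
print(add_velocities(Fraction(1), half))
```

Each additional stage brings the speed closer to c while keeping it strictly below c, which is the statement made in the text about repeated acceleration.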

Relativistic Doppler Shift

Consider a source S′ of light waves with frequency f′ moving towards an observer O. The time between emitted crests as measured by S′ is ∆t′ = 1/f′. The wavelength, i.e. the distance between crests as the waves move outwards, as measured by S′ is λ′ = c∆t′. However, in the frame O the distance between crests is

\[ \lambda = c\,\Delta t - v\,\Delta t \tag{4.44} \]

because S′ is moving with the crests at speed v before emitting the next crest. The frequency with which the crests hit O is therefore:

\[ f = \frac{c}{\lambda} = \frac{c}{(c - v)\,\Delta t} = \frac{c}{(c - v)\,\gamma(v)\,\Delta t'} = \frac{\sqrt{1 - v^2/c^2}}{1 - v/c}\,\frac{1}{\Delta t'} = \sqrt{\frac{1 + v/c}{1 - v/c}}\;f' \tag{4.45} \]

• Comparison with non-relativistic expression: Note that the effect is completely symmetric with respect to whether the source or the observer is moving, because in special relativity there is no preferred inertial frame. This is not the case for the usual non-relativistic Doppler shift. When considering sound waves, an observer O′ who moves towards a source O of sound with speed v measures the speed of the sound to be c_s + v, where c_s is the speed of sound relative to the air. In this case the wavelength doesn’t change, so the frequency measured by O′ is:

\[ f'_1 = \frac{c_s + v}{\lambda} = \left(1 + \frac{v}{c_s}\right) f \tag{4.46} \]

On the other hand, if the source is moving towards O′ then the wavelength is shortened to

\[ \lambda' = (c_s - v)\,\Delta t \tag{4.47} \]

so the frequency measured by O′ is:

\[ f'_2 = \frac{c_s}{\lambda'} = f\,\frac{c_s}{c_s - v} = \frac{f}{1 - v/c_s} \tag{4.48} \]

Note that non-relativistically, f′₁ ≠ f′₂, so that one can distinguish via the Doppler shift of sound waves who is moving. This is not the case for the relativistic Doppler effect. However, if the speed v is much less than the wave speed, all three expressions, (4.45), (4.46) and (4.48), are approximately the same (Exercise: prove this).

• Example: the most distant quasar has a red-shift of z = 6.43, where the red-shift is defined as:

\[ z := \frac{\lambda' - \lambda}{\lambda} \tag{4.49} \]

If this red-shift were due to a proper motion of the quasar (it is not: it is really due to the expansion of the universe), it would imply

\[ z := \frac{\lambda' - \lambda}{\lambda} = \frac{f}{f'} - 1 = \sqrt{\frac{1 + v/c}{1 - v/c}} - 1 \tag{4.50} \]

After a bit of algebra, the speed of the quasar away from Earth would be:

\[ \frac{v}{c} = \frac{(1 + z)^2 - 1}{(1 + z)^2 + 1} = 0.964 \tag{4.51} \]
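The three Doppler expressions and the quasar estimate can be compared numerically. In the Python sketch below the function names are hypothetical, and each speed is passed as a ratio to the relevant wave speed (v/c for light, v/c_s for sound); the sample ratios 0.001 and 0.5 are chosen only for illustration.

```python
import math

def doppler_relativistic(f_source, v_over_c):
    """Eq. (4.45): source and observer approaching at relative speed v."""
    return f_source * math.sqrt((1.0 + v_over_c) / (1.0 - v_over_c))

def doppler_moving_observer(f_source, v_over_cs):
    """Eq. (4.46): observer moves towards a source at rest in the medium."""
    return f_source * (1.0 + v_over_cs)

def doppler_moving_source(f_source, v_over_cs):
    """Eq. (4.48): source moves towards an observer at rest in the medium."""
    return f_source / (1.0 - v_over_cs)

def speed_from_redshift(z):
    """Eq. (4.51): v/c that would give redshift z if it were due to motion."""
    s = (1.0 + z) ** 2
    return (s - 1.0) / (s + 1.0)

for ratio in (0.001, 0.5):
    print(ratio,
          doppler_moving_observer(1.0, ratio),
          doppler_relativistic(1.0, ratio),
          doppler_moving_source(1.0, ratio))

print(f"v/c for z = 6.43: {speed_from_redshift(6.43):.3f}")
```

At a ratio of 0.001 the three results differ only at the level of one part in a million, whereas at 0.5 they are clearly distinct, with the relativistic value lying between the two non-relativistic ones.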


The Recall: Proper time: The time interval along a particular world line between two events as measured by a clock moving along that world line. As we will see later, it defines the geometrical, coordinate independent distance along that path in spacetime.

• Two events A and B are said to be timelike separated if the spacetime vector (c∆t, ∆x) joining them has positive proper time squared:

\[ c^2\Delta\tau_p^2 = c^2\Delta t^2 - \Delta x^2 > 0 \tag{4.55} \]
In this case it is possible to get from one event to the other travelling slower than the speed of light. The relative time ordering of two timelike separated events is invariant under Lorentz transformations, i.e. the same for all inertial observers.

• Two events are said to be spacelike separated if the spacetime vector (c∆t, ∆x) has positive proper length squared:

\[ \Delta s_p^2 = \Delta x^2 - c^2\Delta t^2 > 0 \tag{4.56} \]
In this case one would have to travel faster than the speed of light to get from one event to the other. The time ordering of two spacelike separated events can be different for different inertial observers.

• Two events A and B are said to be null separated if the proper time between them is zero:

\[ c^2\Delta\tau_p^2 = c^2\Delta t^2 - \Delta x^2 = 0 \tag{4.57} \]
In this case you would have to travel at the speed of light to get from one to the other.

• One can calculate the proper time (length) along any timelike (spacelike) trajectory by adding up (integrating) the infinitesimal proper time (length) at each point along the trajectory. For example, for a timelike trajectory/world-line x(t) connecting the events A and B, the total proper time elapsed is:
\[ c\,\Delta\tau_p = \int_A^B c\, d\tau_p = \int_A^B \sqrt{c^2 dt^2 - dx^2} = c\int_A^B dt\, \sqrt{1 - \frac{v^2(t)}{c^2}} = c\int_A^B \frac{dt}{\gamma(t)} \tag{4.58} \]
Recall that this equals the time as measured by a clock carried by an observer moving along that world-line. Different observers, travelling along different world-lines between the same two events, can measure different proper times (cf. twin paradox).

Definition: The light cone at a point, or event, P in spacetime is the boundary that separates events that are reachable from P in the past and future from those that are not. See Figure 4.14 below. The light cone is the surface of all possible light rays going through that point in spacetime. The future light cone of P contains all events that P could in principle influence. The past light cone of P contains all events that could have had any influence on P.

Note: Two events that are timelike separated are by definition within each other's light cones. Two events that are spacelike separated are outside each other's light cones. The relativity of simultaneity applies only to two events that are spacelike separated. By extension, the time ordering of two spacelike separated events depends on the inertial frame of the observer, whereas the time ordering of two timelike separated events is the same for all inertial observers.
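The integral (4.58) is easy to evaluate numerically. The following sketch assumes units in which c = 1 and uses a hypothetical out-and-back trip at 0.8c as the travelling world-line; it is meant only to illustrate how two world-lines between the same pair of events accumulate different proper times.

```python
import math

def proper_time(velocity, t_start, t_end, steps=10_000):
    """Midpoint-rule estimate of the integral of sqrt(1 - v(t)^2) dt, with v in units of c."""
    dt = (t_end - t_start) / steps
    total = 0.0
    for i in range(steps):
        t = t_start + (i + 0.5) * dt
        total += math.sqrt(1.0 - velocity(t) ** 2) * dt
    return total

# World-line 1: an observer who stays at rest for 10 years of coordinate time.
stay_home = proper_time(lambda t: 0.0, 0.0, 10.0)

# World-line 2 (hypothetical trip): out at 0.8c for 5 years, back at 0.8c for 5 years.
traveller = proper_time(lambda t: 0.8 if t < 5.0 else -0.8, 0.0, 10.0)

print(stay_home, traveller)
```

With these numbers 1/γ = 0.6 on both legs of the trip, so the travelling clock records 6 years against 10 years for the clock that stays at rest.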

Figure 4.14: Spacetime diagram showing the future and past light cones of event P, assuming two spatial dimensions.

4.3.5 Relativistic Energy and Momentum

Kinetic energy and invariant mass

The addition law for velocities implies that the faster an object is moving, the more difficult it is to speed it up even more. This suggests that the usual expressions for kinetic energy and momentum must be modified. The Newtonian expressions are useful because in the absence of external forces or changes in internal energy due to deformations (i.e. elastic collisions), the total kinetic energy and total momentum are conserved when two particles collide. Moreover, if this is true in one inertial frame, then it will also be true in any other inertial frame, provided that the coordinates are related by Galilean transformations, for which the addition of velocities is LINEAR. It is perhaps not obvious, but it should be plausible, that if the velocity addition law is non-linear, then the conservation laws could get messed up in going from one frame to another. The relativistic formulas for kinetic energy and momentum that are conserved in all inertial frames are:

Kinetic energy:
\[ T(u) = mc^2\gamma(u) - mc^2 \tag{4.59} \]

By doing the usual expansion of γ(u) you can verify that for $u/c \ll 1$:

\[ T(u) \sim mc^2\left(1 + \frac{1}{2}\frac{u^2}{c^2}\right) - mc^2 = \frac{1}{2}mu^2 \tag{4.60} \]
which is the usual expression for kinetic energy.

Momentum (one spatial dimension):

\[ p = m\gamma(u)u \tag{4.61} \]

Three-Momentum (three spatial dimensions):

\[ \mathbf{p} = m\gamma(u)\mathbf{u} \tag{4.62} \]

Total Energy:
\[ E = T(u) + mc^2 = m\gamma(u)c^2 \tag{4.63} \]
The total energy is the sum of the kinetic energy T(u) and the rest energy $E_0 := mc^2$. Since $p^2 = \mathbf{p}\cdot\mathbf{p} = m^2\gamma^2u^2$, you can verify that (4.63) and (4.62) imply the following energy-momentum relation:

\[ m^2c^4 = E^2 - p^2c^2 \;\rightarrow\; E = \sqrt{m^2c^4 + p^2c^2} \tag{4.64} \]
The quantity $m := \sqrt{E^2/c^2 - p^2}\,/c$ is referred to as the invariant mass. As the name suggests, it is invariant under Lorentz transformations, i.e. the same in every inertial frame of reference. The energy E and momentum p do change under Lorentz transformations. The above formulas imply that a particle at rest has rest energy:

\[ E_0 = mc^2 \tag{4.65} \]

This is the famous formula that revealed the equivalence between mass and energy. It subsequently led to the development of nuclear power and nuclear weapons. Because c is so large, a small amount of mass corresponds to a large amount of energy.
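The formulas (4.59) to (4.64) can be checked against each other numerically. This is a minimal sketch with illustrative function names; the mass of 1 kg and the speeds 0.6c and 0.01c are arbitrary choices made for the demonstration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(u):
    """Lorentz factor for speed u in m/s."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)

def kinetic_energy(m, u):
    """Relativistic kinetic energy, eq. (4.59)."""
    return m * C ** 2 * (gamma(u) - 1.0)

def momentum(m, u):
    """Relativistic momentum in one dimension, eq. (4.61)."""
    return m * gamma(u) * u

def total_energy(m, u):
    """Total energy, eq. (4.63)."""
    return m * gamma(u) * C ** 2

def invariant_mass(E, p):
    """Invariant mass recovered from E and p via eq. (4.64)."""
    return math.sqrt(E ** 2 / C ** 2 - p ** 2) / C

m = 1.0  # kg, arbitrary
fast = 0.6 * C
print(invariant_mass(total_energy(m, fast), momentum(m, fast)))  # recovers m

# At low speed the relativistic kinetic energy approaches (1/2) m u^2, eq. (4.60).
slow = 0.01 * C
ratio = kinetic_energy(m, slow) / (0.5 * m * slow ** 2)
print(ratio)
```

The ratio printed at the end exceeds 1 only by a correction of order u²/c², which is why the Newtonian expression works so well at everyday speeds.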

Massless particles

The formula for the total energy of a particle moving at speed u implies that in order to move at the speed of light, one would have to give it an infinite amount of energy, since

\[ \gamma(u) \rightarrow \infty \quad \text{as} \quad u \rightarrow c \tag{4.66} \]

Thus, no massive particle can be accelerated to the speed of light, because $E \rightarrow \infty$ and this would require infinite energy. However, it is possible for a particle (such as light) to move with speed c, in which case the Lorentz transformations and addition of velocities imply that such a particle will move at the speed of light with respect to all frames of reference. Such a particle necessarily has zero rest mass, since otherwise its total energy would be infinite. The energy-momentum relation (4.64) then implies that:
\[ E = pc \tag{4.67} \]

Four-vectors

• Position Four-vector: We have seen that Lorentz transformations are linear transformations of the space-time coordinates (c∆t, ∆x) that preserve the proper time ∆τ. It is therefore natural to think of the space-time coordinates of an event relative to the origin as a vector in space-time:
\[ X = \begin{pmatrix} ct \\ x \end{pmatrix} \tag{4.68} \]
and the proper time as the “length” of the vector using a strange form of the Pythagorean theorem:

\[ X^2 = X \cdot X = c^2t^2 - x^2 \tag{4.69} \]

Lorentz transformations can then be written in matrix form:
\[ L(u) = \begin{pmatrix} \gamma(u) & -\gamma(u)u/c \\ -\gamma(u)u/c & \gamma(u) \end{pmatrix} \tag{4.70} \]

so that Lorentz transformations are “length”-preserving linear transformations
\[ X \rightarrow X' = \begin{pmatrix} ct' \\ x' \end{pmatrix} = L(u)X = \begin{pmatrix} \gamma(u) & -\gamma(u)u/c \\ -\gamma(u)u/c & \gamma(u) \end{pmatrix} \begin{pmatrix} ct \\ x \end{pmatrix} \tag{4.71} \]
such that
\[ X' \cdot X' = X \cdot X \tag{4.72} \]

In four space-time dimensions:

\[ X = \begin{pmatrix} ct \\ \mathbf{x} \end{pmatrix} = \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} \tag{4.73} \]

Given the above interpretation it is natural to think that there are other four-vectors that are of physical relevance. One of the most important is the following.

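• Momentum four-vector:

\[ P := \begin{pmatrix} E/c \\ p \end{pmatrix} = \begin{pmatrix} m\gamma(u)c \\ m\gamma(u)u \end{pmatrix} \tag{4.74} \]

You can verify that pc has dimensions of energy. If we are in the same frame of reference as the particle:

\[ P = \begin{pmatrix} mc \\ 0 \end{pmatrix} \tag{4.75} \]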

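You can verify that under a Lorentz transformation, i.e. a change of frame to one that is moving to the right with speed u:

\[ P \rightarrow P' = \begin{pmatrix} E'/c \\ p' \end{pmatrix} = L(u)\begin{pmatrix} mc \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma(u) & -\gamma(u)u/c \\ -\gamma(u)u/c & \gamma(u) \end{pmatrix} \begin{pmatrix} mc \\ 0 \end{pmatrix} = \begin{pmatrix} m\gamma(u)c \\ -m\gamma(u)u \end{pmatrix} \tag{4.76} \]

as expected, since in this frame the particle is moving to the left with speed u (velocity $-u\hat{i}$). In four dimensions:

\[ P := \begin{pmatrix} E/c \\ \mathbf{p} \end{pmatrix} = \begin{pmatrix} E/c \\ p_x \\ p_y \\ p_z \end{pmatrix} \tag{4.77} \]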

As well, you should verify that

\[ P \cdot P = P' \cdot P' = m^2c^2 \tag{4.78} \]

So one can think of the invariant mass (times c) as the invariant length of the four-vector P. In Galilean collisions, the kinetic energy and three-momentum are separately conserved. In relativistic collisions, one must conserve the total momentum four-vector. We can now in principle apply the above to a particularly relevant relativistic scattering process.

• Light: Light travels at speed c, i.e. along null world-lines, so that:

\[ c^2\Delta t^2 - \Delta x^2 = 0 \tag{4.79} \]

in all frames. Thus the four-vector that describes the propagation of a null ray has zero magnitude. By the same token light rays cannot be slowed down, or stopped. They have zero rest mass, which means their four-momentum is also a null vector. In particular

\[ P \cdot P = (E/c)^2 - \mathbf{p} \cdot \mathbf{p} = 0 \;\rightarrow\; E = pc \tag{4.80} \]
where $p = \sqrt{\mathbf{p} \cdot \mathbf{p}}$ is the magnitude of the photon's three-momentum. This is equivalent to saying that photons have zero rest mass.
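The statements about the momentum four-vector can be illustrated with a short numerical sketch. Everything here is an assumption made for the example: the helper names are hypothetical, the electron value of mc is rounded to 0.511 MeV/c, and the boost speed of 0.8c is arbitrary.

```python
import math

def boost_x(P, beta):
    """Boost a four-vector (E/c, px, py, pz) to a frame moving at beta*c along +x."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    e_over_c, px, py, pz = P
    return (g * (e_over_c - beta * px), g * (px - beta * e_over_c), py, pz)

def minkowski_dot(A, B):
    """Four-vector product with signature (+, -, -, -), generalizing (4.69)."""
    return A[0] * B[0] - A[1] * B[1] - A[2] * B[2] - A[3] * B[3]

# Electron at rest: P = (mc, 0, 0, 0), with mc in MeV/c (rounded value).
mc = 0.511
P_rest = (mc, 0.0, 0.0, 0.0)
P_moving = boost_x(P_rest, 0.8)
print(P_moving)                                # (gamma*mc, -gamma*mc*beta, 0, 0)
print(minkowski_dot(P_moving, P_moving))       # equals (mc)^2

# Photon travelling along +x: E/c equals |p|, so the four-momentum is null.
photon = (1.0, 1.0, 0.0, 0.0)
photon_boosted = boost_x(photon, 0.8)
print(minkowski_dot(photon_boosted, photon_boosted))  # stays zero
```

The boosted photon has a smaller energy (it is red-shifted in the receding frame) but its four-momentum remains null, which is the four-vector statement of E = pc holding in every frame.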

Examples

1. Suppose a particle of rest mass m and three-momentum $\mathbf{p} = (0, p_y, 0)$ is acted on by a Lorentz transformation with speed v in the y-direction. What is its new four-momentum vector P′? Show that $P \cdot P = P' \cdot P' = m^2c^2$. (Assignment)

2. We would like to send a large spaceship to Mars in order to colonize it. The propulsion method involves ejecting a large quantity of radiation. Consider an approximation in which the spaceship of mass $M_i$ is initially at rest, and then after the propulsion is finished, one has a quantity of radiation of energy $E_r$ moving to the left while the spaceship of remaining mass $M_f$ is moving to the right at speed v = 0.8c. What fraction of the initial mass $M_i$ remains as “payload” $M_f$? Given that the spaceship must come to a stop once it reaches Mars, what is the final payload, as a fraction of the initial mass, once it arrives? (Assignment)

3. We now have all the machinery to consider the effects of Lorentz transformations in more spatial dimensions. Starting with a particle with four-momentum:
\[ P = \begin{pmatrix} E/c \\ p_x \\ p_y \\ 0 \end{pmatrix} \tag{4.81} \]

First perform a boost to a frame moving at speed $v_x$ in the x-direction, then follow this by another boost of speed $v_y$ in the y-direction. Now show that doing the same boosts in the opposite order yields a different final four-momentum. (Assignment)

New Units

Particle physicists find it useful to use a smaller unit of energy than a joule. An electron volt is the amount of kinetic energy gained by an electron that is accelerated across a potential difference of one volt. This is actually too small a unit for modern accelerators, so the following are used:

Unit               Joules           Mass equivalent    Typical scale
1 eV               1.6 × 10^-19 J   1.8 × 10^-36 kg    neutrino mass
1 keV = 10^3 eV    1.6 × 10^-16 J   1.8 × 10^-33 kg    x-ray energy
1 MeV = 10^6 eV    1.6 × 10^-13 J   1.8 × 10^-30 kg    electron mass (1/2 MeV)
1 GeV = 10^9 eV    1.6 × 10^-10 J   1.8 × 10^-27 kg    proton/neutron mass
1 TeV = 10^12 eV   1.6 × 10^-7 J    1.8 × 10^-24 kg    Higgs mass (0.1 TeV)

Examples

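1. The LHC (Large Hadron Collider) has produced protons with total energy of E = 7 TeV. What is the corresponding speed of the protons?

\[ E = m\gamma(v)c^2 = 7\ \mathrm{TeV} \;\rightarrow\; \gamma(v) = \frac{7\ \mathrm{TeV}}{mc^2} \approx \frac{7\ \mathrm{TeV}}{1\ \mathrm{GeV}} = 7000 \]
\[ \rightarrow\; \frac{v}{c} \sim 1 - \frac{1}{2}\left(\frac{1}{7000}\right)^2 \sim 0.99999999 \tag{4.82} \]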

2. Fusion: in a typical fusion reaction a deuterium atom (²H or D, one proton and one neutron) and a tritium atom (³H, one proton, two neutrons) combine at rest to form helium-4 (two protons and two neutrons) plus a single neutron. How much energy is released per reaction? The mass of ²H is 2.01410178 u, where 1 u = 1.661 × 10⁻²⁷ kg = 931.5 MeV/c² is an atomic mass unit, defined to be precisely one twelfth of the mass of a ¹²C atom. Because of the binding energy of the carbon atom, 1 u is less than the mass of a proton or neutron. The mass of ³H is 3.0160293 u, the mass of ⁴He is 4.002602 u, and the mass of the emitted neutron is 1.008665 u. Solution: Net change in mass = (3.016 + 2.014 − 1.009 − 4.003) u ≈ 0.02 u ∼ 0.02 GeV/c², so the energy released is ∼ 0.02 GeV ∼ 3 × 10⁻¹² joules.

Note: This is about 0.4% of the initial mass of roughly 5 u. So 1000 kg of nuclear fusion matter would yield about 0.004 × 1000 kg × c² ≈ 3.6 × 10¹⁷ joules. This is about 2% of Canada's total 2013 energy production of 17,912 petajoules ≈ 18 × 10¹⁸ joules. (StatsCanEnergy)
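Both examples above reduce to a few lines of arithmetic. The sketch below is illustrative: the variable names are hypothetical, the proton rest energy is rounded to 1 GeV as in the text, and the neutron mass of 1.008665 u is a standard value supplied here as an assumption.

```python
import math

def speed_deficit(gamma):
    """Return 1 - v/c for a given Lorentz factor, written to avoid round-off."""
    inv_sq = 1.0 / gamma ** 2
    # 1 - sqrt(1 - x) = x / (1 + sqrt(1 - x))
    return inv_sq / (1.0 + math.sqrt(1.0 - inv_sq))

# Example 1: 7 TeV protons, rest energy rounded to 1 GeV.
gamma_lhc = 7000.0 / 1.0
deficit = speed_deficit(gamma_lhc)
print(f"gamma = {gamma_lhc:.0f}, 1 - v/c = {deficit:.3e}")

# Example 2: D + T -> He-4 + n, masses in atomic mass units.
U_TO_MEV = 931.5          # 1 u in MeV/c^2, as quoted in the text
MEV_TO_JOULE = 1.602e-13
M_D = 2.01410178
M_T = 3.0160293
M_HE4 = 4.002602
M_N = 1.008665            # assumed neutron mass, not given in the original text

mass_defect_u = M_D + M_T - M_HE4 - M_N
energy_mev = mass_defect_u * U_TO_MEV
energy_joule = energy_mev * MEV_TO_JOULE
mass_fraction = mass_defect_u / (M_D + M_T)
print(f"mass defect = {mass_defect_u:.5f} u, energy = {energy_mev:.1f} MeV")
print(f"energy = {energy_joule:.2e} J, fraction of initial mass = {mass_fraction:.4f}")
```

Writing 1 − v/c in the rearranged form matters here: for γ in the thousands, subtracting a square root very close to 1 from 1 directly would lose most of the significant digits.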

The Compton Effect

Einstein explained the photo-electric effect by claiming that the energy carried by light comes in lumps, or quanta, whose energy is related to the frequency of light by:
\[ E = pc = hf = \frac{hc}{\lambda} \tag{4.83} \]
where f is the frequency of the light and

\[ h = 6.626 \times 10^{-34}\ \mathrm{J \cdot s} \tag{4.84} \]
is Planck's constant. Einstein received a Nobel prize in 1921 for this work. Prior to 1921 it was known that the scattering of x-rays (high frequency electromagnetic waves) off electrons produced results that were not consistent with classical Maxwell + Newtonian theory. In particular, experiments showed that the wavelength of the scattered light $\lambda_f$ was related to the wavelength of the incident light $\lambda_0$ by:
\[ \lambda_f - \lambda_0 = \frac{h}{m_e c}(1 - \cos(\theta)) \tag{4.85} \]
where θ is the angle that the scattered light makes relative to the incoming light. The numerical value of the coefficient was measured to be:
\[ \lambda_c = \frac{h}{m_e c} = 0.00243\ \mathrm{nm} \tag{4.86} \]
We will now show that the relation (4.85) follows directly from special relativity and the assumption that the light incident on the electron is in the form of particles (photons) with zero rest mass and energy given by (4.83).

Derivation of Compton Formula

Consider a collision between a photon and an initially stationary electron, as illustrated in Fig. (4.15) below:

Figure 4.15: Compton scattering of a photon and electron.

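The initial four-momenta for the photon and electron are, respectively:
\[ Q_0 = \begin{pmatrix} E_0/c \\ \mathbf{q}_0 \end{pmatrix} \tag{4.87} \]
\[ P_0 = \begin{pmatrix} m_e c \\ 0 \end{pmatrix} \tag{4.88} \]
where $E_0 = hf_0 = hc/\lambda_0$. After the collision the respective four-momenta are:
\[ Q_f = \begin{pmatrix} E_f/c \\ \mathbf{q}_f \end{pmatrix} \tag{4.89} \]
\[ P_f = \begin{pmatrix} \sqrt{m_e^2c^4 + p_f^2c^2}\,/c \\ \mathbf{p}_f \end{pmatrix} \tag{4.90} \]
where $q_f = E_f/c$ and $q_0 = E_0/c$ are the magnitudes of the photon three-momenta.

Energy conservation:
\[ E_0 + m_e c^2 = E_f + \sqrt{m_e^2c^4 + p_f^2c^2} \tag{4.91} \]

Solving for $p_f^2$:
\[ p_f^2 = \frac{E_f^2}{c^2} + \frac{E_0^2}{c^2} - 2\frac{E_f E_0}{c^2} + 2m_e c\,\frac{(E_0 - E_f)}{c} \tag{4.92} \]
Momentum conservation, $\mathbf{p}_f = \mathbf{q}_0 - \mathbf{q}_f$, using the law of cosines:

\[ p_f^2 = q_f^2 + q_0^2 - 2q_f q_0\cos(\theta) = \frac{E_f^2}{c^2} + \frac{E_0^2}{c^2} - 2\frac{E_f}{c}\frac{E_0}{c}\cos(\theta) \tag{4.93} \]

where θ is the angle between $\mathbf{q}_0$ and $\mathbf{q}_f$. Eliminating $p_f^2$ from the above two equations yields:
\[ E_0 - E_f = \frac{E_0 E_f}{m_e c^2}(1 - \cos(\theta)) \tag{4.94} \]
Using the relationship (4.83) to express the photon energy in terms of the wavelength gives the desired formula (4.85).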
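As a consistency check on the derivation, one can take the scattered wavelength from (4.85) and confirm numerically that energy conservation (4.91) holds when the electron momentum is taken from (4.93). The sketch below works in eV and nm; the constants are rounded standard values and the incident wavelength of 0.071 nm is an arbitrary x-ray wavelength chosen for illustration.

```python
import math

HC = 1239.84        # h*c in eV*nm (rounded standard value)
ME_C2 = 510998.95   # electron rest energy in eV

def scattered_wavelength(lam0_nm, theta):
    """Compton formula (4.85): wavelength after scattering through angle theta."""
    return lam0_nm + (HC / ME_C2) * (1.0 - math.cos(theta))

def energy_imbalance(lam0_nm, theta):
    """Left side minus right side of (4.91), with p_f^2 c^2 taken from (4.93)."""
    e0 = HC / lam0_nm
    ef = HC / scattered_wavelength(lam0_nm, theta)
    pf2_c2 = ef ** 2 + e0 ** 2 - 2.0 * ef * e0 * math.cos(theta)
    return (e0 + ME_C2) - (ef + math.sqrt(ME_C2 ** 2 + pf2_c2))

lam0 = 0.071  # nm, illustrative incident wavelength
for degrees in (30, 90, 180):
    theta = math.radians(degrees)
    shift = scattered_wavelength(lam0, theta) - lam0
    print(degrees, shift, energy_imbalance(lam0, theta))
```

The imbalance vanishes to within floating-point round-off at every angle, and the shift at 90 degrees equals the Compton wavelength (4.86).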

4.3.6 Final Notes

We relied on the form of Maxwell's theory of electromagnetism in conjunction with the fundamental postulate to conclude that the speed of light must be the same in all inertial frames. This in turn led us directly to the Lorentz transformations. It turns out that you do not need to consider the theory of light in order to derive the Lorentz transformations. A wonderful article by David Mermin called “Relativity without light” shows that one can derive the relativistic addition law for velocities in one dimension purely from the fundamental postulate and simple symmetry assumptions. This article is available on the course web-site in the Resources/SpecialRelativity folder. Another paper, also on the web site, “Deriving relativistic momentum and energy” by S. Sonego and M. Pin, then derives the expressions for relativistic momentum and energy just from knowing the addition law for velocities. Thus special relativity is in fact a direct consequence of the fundamental postulate and little else. It is amazing how far one can go starting from a basic symmetry assumption! The above (this subsection only ;-) is purely for interest's sake, and will not be on the final.

Chapter 5

General Relativity

5.1 Problems with Newtonian Gravity

• Gives wrong perihelion shift for orbit of Mercury by 43 seconds of arc per century

• Gravitational force corresponds to action at a distance and is inconsistent with special relativity

• Doesn't explain why the gravitational mass of an object equals its inertial mass, i.e. why gravitational acceleration is the same for all objects

5.2 Einstein's Thinking: the Strong Principle of Equivalence

Fundamental Postulate: the laws of physics should have the same form locally in all frames of reference, including non-inertial ones.

Einstein realized that experiments done in a lab that was in free fall in a constant gravitational field would give results that were indistinguishable from what would be obtained in an inertial frame (i.e. at constant velocity with no gravity). This suggested via the fundamental postulate that if the two sets of experiments give equivalent results, it is because the two frames of reference are equivalent, i.e. free fall in a constant gravitational field is inertial. Thus, the gravitational force which causes objects to accelerate at the same rate towards the ground is really a fictitious force. Correspondingly, a frame that is stationary in a constant gravitational field is accelerating (due to the electromagnetic forces between the floor and the soles of your feet, for example).

Question: how can acceleration relative to the earth and straight line motion in the absence of gravity both be described as inertial motion using equations that have the same form for all observers?
Answer: both types of motion are in the straightest line possible, but in space-times with different geometry.

Question: What causes the geometry of space-time to change?
Answer: The presence of matter/energy.

The General Theory of Relativity: Two Line Summary (Due to John Wheeler, Princeton physicist who coined the phrase )

• Matter/energy tells space-time how to curve (Einstein's equations)

• Curved space-time tells matter how to move (Geodesic equation: curved geometry generalization of a straight line).

5.3 Geometry of Spacetime

Euclidean geometry is defined by the expression for the length of any infinitesimal line segment expressed in Cartesian coordinates:

\[ ds^2 = dx^2 + dy^2 + dz^2 \tag{5.1} \]

(5.1) determines the geometry of three-space. You know many aspects of the geometry already: the sum of the angles in a triangle is 180°; parallel lines never meet; the triangle inequality, etc.

We saw that special relativity unifies space and time into a single space-time continuum, that defines the proper time dτ along any segment of a world-line to be:
\[ c^2 d\tau^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 \tag{5.2} \]
Eq. (5.2) defines the geometry of space-time in exactly the same way as (5.1) defines the geometry of three-space. In special relativity this geometry is rigid, unchanging: it is the same at all places and for all time.

More interesting geometries exist than Euclidean geometry. For example, on the surface of a sphere, we know that parallel lines do meet, and that the sum of the angles in a triangle is greater than 180°. Consider lines of longitude. They are all at 90° to the equator and hence parallel. Moreover, they are straight, as straight as they can be and still stay on the surface of the sphere. And yet they meet at the north pole. Thus any two lines of longitude are parallel lines that meet at the north pole. They also form a triangle with the equator. The sum of the angles of this triangle is 90° + 90° + θ, where θ is the angle with which they meet at the north pole. These properties are connected to the fact that the surface of a sphere is curved, not flat. No matter how you cut up an orange peel you cannot lay it flat on a sheet of paper without deforming the pieces.

The geometry of the sphere is encoded in how the length of any line segment is related to the change in coordinates needed to get from one end to the other. Suppose you use the usual angles θ, φ to navigate a sphere of radius r. The length of an infinitesimal line segment between two nearby points is:

\[ ds^2 = r^2\left(d\theta^2 + \sin^2(\theta)\,d\phi^2\right) \tag{5.3} \]

There is no way to choose coordinates on the sphere to make (5.3) look like (5.1), no matter how short the line segment.

You can imagine that the surface of a balloon, if it is not too tightly stretched, can deform, oscillate and in general change its shape quite drastically without bursting. The technical description of this is that the local geometry that defines lengths on the surface of the balloon is changing. The surface becomes dynamical, and has a drastic effect on anything (e.g. an ant) trying to move on its surface.

General relativity, as described by the two line summary, says that in the presence of matter, the geometry of spacetime is no longer described exactly by (5.2). Instead, the presence of matter causes the geometry of space-time to deform, just as the presence of a very heavy ant would cause the surface of a balloon to deform. This deformation would then affect how other ants move on the balloon. Space-time is then transformed from a rigid arena in which the events of the universe play out to an active participant in the dynamics.
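To see how the line element (5.3) encodes curvature, one can add up ds along curves on the sphere. The sketch below is a hypothetical illustration (the function names and the choice θ = 60° are not from the notes): it measures the circumference of a circle of constant θ and compares it with 2π times the radius of that circle as measured along the surface from the north pole.

```python
import math

def path_length(r, theta_of, phi_of, steps=10_000):
    """Sum ds from (5.3) along the curve (theta(s), phi(s)) for s in [0, 1]."""
    total = 0.0
    for i in range(steps):
        s0 = i / steps
        s1 = (i + 1) / steps
        s_mid = 0.5 * (s0 + s1)
        dtheta = theta_of(s1) - theta_of(s0)
        dphi = phi_of(s1) - phi_of(s0)
        total += r * math.sqrt(dtheta ** 2 + math.sin(theta_of(s_mid)) ** 2 * dphi ** 2)
    return total

r = 1.0
theta0 = math.pi / 3  # circle 60 degrees away from the north pole

# Circumference of the circle theta = theta0 (phi runs once around).
circumference = path_length(r, lambda s: theta0, lambda s: 2.0 * math.pi * s)

# Radius of that circle measured along the surface (theta runs from 0 to theta0).
surface_radius = path_length(r, lambda s: theta0 * s, lambda s: 0.0)

flat_expectation = 2.0 * math.pi * surface_radius
print(circumference, flat_expectation, circumference / flat_expectation)
```

On a flat sheet the ratio printed at the end would be exactly 1. On the sphere it is sin(θ)/θ, which is less than 1, and an ant confined to the surface could use this measurement to discover that its world is curved.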

5.4 Some Consequences of General Relativity:

• The general relativity equations that describe the motion of Mercury produce the correct perihelion shift.

• GR also predicted correctly the amount that starlight bends as it passes near the sun on its way to the earth. This prediction was a factor of two greater than what you get from Newton's theory. It was verified by Eddington in 1919 during a solar eclipse, and made Einstein a household name overnight.

• Space-time is curved; distances between events in space-time are described by a generalization of Pythagoras; straightest possible lines look curved and can meet even if they start off parallel.

• Time slows down in strong gravitational fields.

• The equations of general relativity are able to describe (almost!) the observed Universe as homogeneous (same everywhere) and isotropic (same in every direction) and expanding.

• The “almost” in the above is due to the fact that recent observations have detected an anomalous acceleration in the expansion that requires the presence of an otherwise invisible dark energy in order to make it consistent with Einstein's equations.

• Similarly, the rotations of galaxies are only well described by Einstein's theory if one postulates the existence of a halo of invisible, dark matter that envelops the galaxies.

• One of the more bizarre predictions of the General Theory of Relativity is the existence of black holes.

5.5 Black Holes

See .pdf file Blackholes.pdf in Lecture Notes folder on course website and a more recent talk about the black hole information loss problem at: BH Info Loss. You can find movies of the proper motion of the stars at the center of our galaxy at: Galactic Center 1 and a fancier version at: Galactic Center 2.

Chapter 6

Introduction to the Quantum

6.1 Particle or Wave?

• Electromagnetic waves vs photons (particles of light): We are very used to seeing light exhibit interference phenomena. We are also used to seeing light exhibit particle behaviour, c.f. dots on a photographic plate, or digital imaging (photoelectric detector + CCD).

• Electron waves: We are less used to thinking of electrons as waves, but there is substantial experimental evidence that this is the case:

– Fig. (6.1) shows clearly the wavelike properties of electrons trapped in a “corral” of iron atoms.

Figure 6.1: Quantum Corral

– Electron Double Slit Experiment: A very elementary, but potentially useful cartoon description: Doctor Quantum. The results of an actual experiment, showing electrons going through two slits, one electron at a time, and creating an interference pattern: Real experiment.

• We will see that all things (light, electrons, billiard balls) are completely described in quantum mechanics as probability waves, with very bizarre properties that are most naturally described using imaginary numbers.

6.2 Light as particles: early experimental hints

6.2.1 Blackbody Radiation: the Ultraviolet Catastrophe

• Definition of Blackbody: it is an idealized object that absorbs all the radiation incident on it and is in thermodynamic equilibrium with its surroundings at some fixed temperature. The radiation that it emits is thermal radiation, which depends only on the temperature of the blackbody.

• Model of Blackbody: an enclosed cavity containing radiation in equilibrium with the walls of the cavity. One can sample the radiation inside by looking through a tiny hole in the surface.

• Thermal or Blackbody radiation

– As mentioned, thermal radiation is completely determined by the temperature of the blackbody, and does not carry any information about its composition.

– Intensity of emitted radiation: I(λ)dλ is the intensity of radiation (in joules per square meter per second) emitted from the surface with wavelength in the interval from λ to λ + dλ. This tells us how the energy of the electromagnetic waves in the cavity is distributed among all possible wavelengths.

– Classical expression for intensity: classically, the energy of the radiation is independent of its wavelength, but there are infinitely many more short wavelength modes than long wavelength modes. The classical prediction is that the intensity goes to zero for large wavelengths, and goes to infinity for short wavelengths, according to the formula:
\[ I_{classical} = 2\pi c\,\frac{k_b T}{\lambda^4} \tag{6.1} \]

where $k_b$ is Boltzmann's constant [Exercise: Check units]:

\[ k_b = 1.38064852 \times 10^{-23}\ \mathrm{J \cdot K^{-1}} \tag{6.2} \]

where K denotes kelvins:

\[ \mathrm{K} = {}^{\circ}\mathrm{C} + 273.15 \tag{6.3} \]

The Kelvin temperature scale has physical significance as a thermodynamic quantity: 0 K is absolute zero, the lowest temperature achievable by any physical system. It signifies the complete absence of thermal fluctuations: the system is in a known quantum state.

• Experimental result (the UV catastrophe): There is a huge mismatch between the real world (i.e. experiment) and the classical theory. If the classical result were correct, an infinite amount of energy would be emitted at small wavelengths, so that blackbodies would radiate away all their energy almost instantaneously. This doesn't happen. The experimental data for the intensity vs. wavelength for various temperatures are represented by Fig. (6.2). The shape of the curves in Fig. (6.2) depends only on temperature T and carries no information about what the blackbody is made of. The maximum intensity occurs at a wavelength that is inversely proportional to the temperature:
\[ \lambda_{max} = \frac{2.9 \times 10^{-3}}{T}\ \mathrm{m \cdot K} \tag{6.4} \]
The above is called Wien's law. For objects at room temperature, T = 20°C = 293 K, $\lambda_{max} \sim 10^{-5}$ m = 10 µm, which is well into the infrared. On the other hand, the temperature of the sun is about 6000 K, so the corresponding wavelength is 5 × 10⁻⁷ m, or about 500 nm, which lies in the middle of the visible spectrum (green light).

Figure 6.2: Observed blackbody intensity vs. wavelength

• Quantum Resolution: The energy of the radiation inside the cavity can only come in lumps that depend on the frequency of the radiation: E = hf = hc/λ, where h is a new constant of nature, called Planck's constant. In thermodynamic equilibrium at a fixed temperature, atomic states with high energy are suppressed according to the Boltzmann distribution:

\[ p_E \propto e^{-\frac{E}{k_B T}} = e^{-\frac{hf}{k_B T}} = e^{-\frac{hc}{\lambda k_B T}} \tag{6.5} \]

Thus the production of high energy (short wavelength) radiation becomes highly unlikely. A rigorous calculation with the assumption about lumps of energy yields:

\[ I(\lambda) = \frac{2\pi hc^2}{\lambda^5\left(e^{\frac{hc}{\lambda k_B T}} - 1\right)} \tag{6.6} \]

This formula contains a new constant h that has units of joule-seconds. By fitting the measured intensity curves one obtains an experimental value of:
\[ h = 6.6 \times 10^{-34}\ \mathrm{J \cdot s} \tag{6.7} \]

Note that Wien's law (6.4) follows directly from (6.6) (see Assignment 8). By fitting only a single parameter h, one achieves remarkable consistency with all blackbodies at all temperatures.
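The relationship between (6.1), (6.6) and Wien's law (6.4) can be explored numerically. This is a rough sketch under stated assumptions: the constants are standard SI values, the function names are illustrative, and the peak is located with a simple ternary search that relies on the Planck curve having a single maximum.

```python
import math

H = 6.62607015e-34   # Planck's constant, J*s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann's constant, J/K

def planck_intensity(lam, T):
    """Planck's formula (6.6) for wavelength lam (m) and temperature T (K)."""
    x = H * C / (lam * KB * T)
    if x > 700.0:
        return 0.0  # exponential would overflow; intensity is negligible here
    return 2.0 * math.pi * H * C ** 2 / (lam ** 5 * math.expm1(x))

def classical_intensity(lam, T):
    """Classical formula (6.1), which diverges as lam -> 0."""
    return 2.0 * math.pi * C * KB * T / lam ** 4

def peak_wavelength(T, lo=1e-8, hi=1e-3, iterations=200):
    """Ternary search for the maximum of the Planck curve between lo and hi (m)."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if planck_intensity(m1, T) < planck_intensity(m2, T):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

for T in (293.0, 6000.0):
    lam_max = peak_wavelength(T)
    print(T, lam_max, lam_max * T)   # the product is the Wien constant

# At long wavelengths the two formulas agree; at short wavelengths they do not.
print(planck_intensity(1e-3, 5000.0) / classical_intensity(1e-3, 5000.0))
print(planck_intensity(1e-7, 5000.0) / classical_intensity(1e-7, 5000.0))
```

The product λ_max T comes out the same at both temperatures, as Wien's law requires, while the last two lines show the quantum formula matching the classical one at 1 mm and suppressing it by many orders of magnitude at 100 nm.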

6.2.2 Photoelectric Effect: not covered this year

• The Experiment: Radiation is absorbed by electrons trapped in solids. Classically, electrons heat up gradually as they absorb the radiation, until they pop out. This takes a while, but should not depend on the frequency of the radiation.

• The problem: Experiments revealed that electrons were ejected from the metals as soon as radiation hit them. Moreover, the maximum kinetic energy of the electrons that emerged depended on the frequency of the radiation, as represented by the graph below.

Figure 6.3: Photoelectric Effect

• The graphs were explained by Einstein in a Nobel prize winning paper. He said that the radiation was behaving like a stream of particles, with:
\[ \text{Energy:} \quad E = hf = \frac{hc}{\lambda} \tag{6.8} \]
\[ \text{Momentum:} \quad p = \frac{E}{c} = \frac{hf}{c} = \frac{h}{\lambda} \tag{6.9} \]
where h is again a universal constant. Each photon imparted a quantum of energy to the electron which absorbed it. Since it was not absorbed gradually, there was no time delay. If this quantum wasn't large enough to overcome the binding potential φ, which was a property of the metal, then no electrons were ejected. The maximum kinetic energy with which electrons could emerge was the difference between the photon energy and the binding energy: $K_{max} = hf - \phi$. This is precisely the relationship exhibited in the graph above, with the slope given by h and hence the same for all metals, but the y-intercept given by the binding potential, or work function, which is different for different metals. Experimental determination of the constant h from the slopes of the $K_{max}$ vs f graphs yields the same value as the blackbody radiation spectra.
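The photoelectric relation can be turned into a small calculation. In the sketch below the work function of 2.0 eV is a hypothetical illustrative value, not a measured one for any particular metal, and hc = 1241 eV·nm is the rounded value quoted later in these notes.

```python
HC_EV_NM = 1241.0  # h*c in eV*nm, rounded value used in these notes

def photon_energy(wavelength_nm):
    """Photon energy E = hc / lambda, in eV, from (6.8)."""
    return HC_EV_NM / wavelength_nm

def max_kinetic_energy(wavelength_nm, work_function_ev):
    """Maximum electron energy (photon energy minus work function), or None if no electrons are ejected."""
    surplus = photon_energy(wavelength_nm) - work_function_ev
    return surplus if surplus > 0.0 else None

PHI = 2.0  # eV, hypothetical work function chosen for illustration

for wavelength in (400.0, 500.0, 700.0):
    print(wavelength, photon_energy(wavelength), max_kinetic_energy(wavelength, PHI))
```

With this choice of work function, violet light at 400 nm ejects electrons with up to about 1.1 eV, while red light at 700 nm ejects none regardless of how intense the beam is, since each photon individually falls short of the binding energy.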

6.2.3 Compton Effect

• The experiment: scatter x-rays off electrons.

• Classical description: the electrons absorb the radiation, and hence accelerate. They then re-emit the radiation, thereby slowing down. The emitted and absorbed radiation have different wavelengths/frequencies, due to the Doppler shift: the electrons “see” a different frequency before absorption than after acceleration, since the acceleration puts them in a different inertial frame.

• The problem: The experiments gave a different relationship than expected between incident and emitted radiation:
\[ \lambda_f - \lambda_0 = \frac{h}{m_e c}(1 - \cos(\theta)) \tag{6.10} \]
where θ is the angle between the incident and emitted radiation. The coefficient on the right hand side occurs often in quantum physics. It is called the Compton wavelength of the electron (check that it has units of length):

\[ \lambda_c := \frac{h}{m_e c} = \frac{hc}{m_e c^2} = 2.4 \times 10^{-3}\ \mathrm{nm} \tag{6.11} \]

Note: The right hand side of the first equality above is useful because it allows you to work with manageable numbers. Specifically, one has:

\[ hc = 1241\ \mathrm{eV \cdot nm} \tag{6.12} \]
\[ m_e c^2 = 511\ \mathrm{keV} \tag{6.13} \]

• The explanation: As in the photoelectric effect, the explanation is very simple. Consider the radiation to consist of a stream of particles (photons) that undergo elastic collisions with the electrons. We have already derived (6.10) using conservation of relativistic energy and momentum in the previous section (4.3.5).

6.3 Particles as waves

6.3.1 deBroglie Wavelength

The ultraviolet catastrophe, photo-electric effect and Compton scattering all point to the fact that electromagnetic waves carry energy in quanta, or lumps. They therefore exhibit wavelike and particle-like attributes. The relationship between wavelength and energy involves a new constant of nature, h, that is universal in the sense that all three distinct phenomena yield the same value. DeBroglie extended the above particle-wave duality to massive particles, such as electrons, by postulating that any particle with momentum p has associated with it a characteristic wavelength:
\[ \lambda = \frac{h}{p} \tag{6.14} \]
The above is the same for light and for massive particles, but if a particle has non-zero mass m (and is non-relativistic), then p = mv, so that:
\[ \lambda = \frac{h}{mv} \tag{6.15} \]
In the case of light p = E/c, so (6.14) immediately implies a relationship between wavelength and energy, E = hc/λ = hf. DeBroglie further postulated that for massive particles as well this relationship is satisfied, namely:
\[ E = hf \tag{6.16} \]
But what is the frequency in this case? We assume that it must be related to the velocity of the moving wave by $f = v_q/\lambda$. In the case of a massive particle $v_q$ can't be the actual velocity that appears in (6.15), so we have labelled it with a subscript q to denote that it is some sort of quantum velocity distinct from v. We can figure out what $v_q$ is for a non-relativistic particle of mass m moving at speed v. The relationship between kinetic energy (remember we are non-relativistic) and momentum is:
\[ KE = \frac{1}{2}mv^2 = \frac{p^2}{2m} \tag{6.17} \]
Using (6.16) and (6.14) to write the above purely in terms of frequency and wavelength, we get:
\[ KE = hf = \frac{hv_q}{\lambda} \;\rightarrow\; \frac{m}{2}v^2 = \frac{hv_q}{\lambda} = hv_q\frac{mv}{h} \;\rightarrow\; v_q = \frac{1}{2}v \tag{6.18} \]

This suggests that the “quantum wave velocity” relating the frequency to the wavelength of a massive particle is one half its Newtonian velocity. We will see how this comes about when we study moving waves.
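The deBroglie relations are simple to evaluate for an electron. The sketch below uses standard SI constants; the speed of 10⁶ m/s and the kinetic energy of 150 eV are arbitrary non-relativistic values chosen for illustration, and the function names are hypothetical.

```python
import math

H = 6.62607015e-34      # Planck's constant, J*s
M_E = 9.1093837e-31     # electron mass, kg
EV = 1.602176634e-19    # one electron volt in joules

def de_broglie_wavelength(mass, speed):
    """Non-relativistic deBroglie wavelength, eq. (6.15)."""
    return H / (mass * speed)

def matter_wave_frequency(mass, speed):
    """Frequency from E = hf, eq. (6.16), with E the kinetic energy (6.17)."""
    return 0.5 * mass * speed ** 2 / H

def wavelength_from_kinetic_energy(mass, kinetic_energy):
    """Wavelength from (6.14) with p = sqrt(2 m KE), using (6.17)."""
    return H / math.sqrt(2.0 * mass * kinetic_energy)

v = 1.0e6  # m/s, well below c
lam = de_broglie_wavelength(M_E, v)
f = matter_wave_frequency(M_E, v)
v_q = f * lam
print(lam, f, v_q / v)   # the last number is 1/2, as in (6.18)

# An electron with 150 eV of kinetic energy has a wavelength of about 0.1 nm.
print(wavelength_from_kinetic_energy(M_E, 150.0 * EV))
```

The second result gives a sense of scale: an electron needs only about 150 eV of kinetic energy to have a wavelength comparable to atomic dimensions.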

6.3.2 Observational Evidence 1. Electron Double Slit Experiment We have already seen the video of the interference pattern produced by a beam of electrons as they pass through a double slit. Miraculously, even if the electrons pass through the slits one at a time, the interference pattern is produced, suggesting that each individual electron behaves like an extended wave as it propagates from the source to the screen. Recall how the double slit interference pattern is produced by a wave. See Fig.(6.4). In the limit that d << L << λ the interference fringes

Figure 6.4: Double Slit experiment

are separated by:
$$\Delta y = \frac{\lambda L}{d} \tag{6.20}$$
where $L$ is the distance from slits to screen and $d$ is the distance between the slits.

2. Electron Microscope The resolving power of optical instruments is diffraction limited, making it impossible to see anything smaller than the wavelength of the light being used. The small deBroglie wavelength associated with fast electrons allows one to see much smaller details than you can with light. You may wonder why you can’t just use higher frequency and shorter wavelength light. The reason is that you don’t want the energy of the observing wave to be high enough to destroy what you are looking at. For light the relationship between resolving power and energy is given by
$$E = \frac{hc}{\lambda} \tag{6.21}$$
Visible light has a wavelength between 400 nm and 700 nm. X-rays can resolve to 0.1 nm, but the associated energy is:
$$E = \frac{1241\,\text{eV}\cdot\text{nm}}{0.1\,\text{nm}} \approx 12\,\text{keV} \tag{6.22}$$
which is high enough that it can damage whatever microscopic object you are measuring. For example the ionizing energy for hydrogen (the amount of energy it takes to knock the electron out of orbit) is just 13.6 eV. Electrons with the same wavelength have roughly 100 times less energy. From (6.14) and (6.17) we get

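$$\begin{aligned} E_e &= \frac{1}{2} \cdot \frac{1}{m_e c^2} \cdot \left(\frac{hc}{\lambda}\right)^2 \\ &= \frac{1}{2} \cdot \frac{1}{511\,\text{keV}} \cdot \left(\frac{1241\,\text{eV}\cdot\text{nm}}{0.1\,\text{nm}}\right)^2 \\ &= \frac{12.41}{1022} \cdot 12.41\,\text{keV} \approx 150\,\text{eV} \end{aligned} \tag{6.23}$$

We have been using non-relativistic formulae for the electron. We can check that at this kinetic energy $\gamma \sim 1$ by comparing this kinetic energy to the invariant mass of the electron:

$$\frac{E_e}{m_e c^2} \sim \frac{0.15\,\text{keV}}{500\,\text{keV}} \ll 1 \tag{6.24}$$

6.4 The Heisenberg Uncertainty Principle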

The particle-wave duality exhibited by electrons (and all other particles) has another consequence. It is often stated as follows: One cannot know with arbitrary accuracy at any instant in time both the position and momentum of a given particle. This is embodied in the equation:
$$\Delta x\, \Delta p \geq \frac{\hbar}{2} = \frac{h}{4\pi} \tag{6.25}$$
where $\hbar := h/(2\pi)$.

Heuristic derivation: Suppose you wish to use light of some wavelength $\lambda$ to measure the position of a particle. The accuracy is diffraction limited:
$$\delta x \sim \lambda \tag{6.26}$$
You can decrease the uncertainty by decreasing the wavelength. However, the light you are using has momentum $p = hf/c = h/\lambda$, and part or all of this momentum can be transferred to the particle during the observation process. So you introduce an uncertainty in the particle’s momentum due to the measurement process of:
$$\Delta p \sim \frac{h}{\lambda} \tag{6.27}$$
Clearly the uncertainties in position and momentum have inverse dependence on the wavelength. When you multiply them you get:
$$\Delta p\, \Delta x \sim h \tag{6.28}$$
Apart from the factor of $4\pi$, this is consistent with the exact quantum mechanical bound (6.25).

Caution: As discussed below, the above description is misleading, since it focuses on what we are able to “know” or measure. In fact, the uncertainty principle is telling us something fundamental about the nature of reality, not just about our ability to perceive it.
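To get a feeling for the size of the bound (6.25), the following minimal sketch evaluates the smallest momentum uncertainty allowed for an electron localized to within 0.1 nm, roughly the size of an atom. The localization length and the rounded constants are illustrative assumptions, not values taken from the notes.

```python
import math

# Rounded SI constants (assumed for this sketch).
HBAR = 1.054571817e-34  # reduced Planck constant, J*s
M_E = 9.1093837e-31     # electron mass, kg


def min_momentum_uncertainty(delta_x):
    """Smallest delta_p allowed by delta_x * delta_p >= hbar / 2, eq. (6.25)."""
    return HBAR / (2.0 * delta_x)


delta_x = 1.0e-10                            # 0.1 nm, illustrative choice
delta_p = min_momentum_uncertainty(delta_x)  # kg*m/s
delta_v = delta_p / M_E                      # corresponding spread in velocity, m/s
print(f"delta_p >= {delta_p:.3e} kg m/s, delta_v >= {delta_v:.3e} m/s")
```

For these numbers the velocity spread is several hundred kilometres per second, which is why the uncertainty principle matters on atomic scales but is invisible for everyday objects.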

6.5 Non-Locality in Quantum Mechanics

6.5.1 The EPR Paradox

The above description and derivation suggest that the uncertainty comes from the measurement process and limits what is “knowable”, as opposed to placing a constraint on the actual (“real”) attributes of the particle. In fact, quantum mechanics says that one cannot create a particle in a state (to be explained further on) in which the position and momentum both have arbitrarily precise values. The limitation comes not from our ability to measure, but from the fundamental quantum mechanical description of the particle.

As mentioned previously, Einstein didn’t like quantum mechanics. He particularly objected to the idea that the position and momentum of a particle are inherently uncertain. Together with Podolsky and Rosen (EPR) he used a thought experiment to try to disprove this view of quantum mechanics. The thought experiment involved two particles created with zero total momentum flying far apart in opposite directions. A measurement of the position or momentum of particle 1 would immediately yield the value of the position or momentum of particle 2. EPR used the locality of physics to argue that the state of particle 2 must be independent of what is done to particle 1. Hence if particle 2 had a well defined momentum after the measurement done on particle 1, it must have had a well defined momentum before the measurement. The same holds for position. Thus particle 2 must have a well defined position and momentum even if nothing is measured on particle 1. Since the uncertainty principle does not allow a particle to simultaneously have a definite position and momentum, EPR concluded that quantum mechanics must be incomplete.

I have put the original EPR paper into the /Resources/QuantumMechanics folder on the course website. A useful link on this can be found at StanfordEPR

6.5.2 Bell’s Inequalities

In the early 1980s experiments were done that proved EPR were wrong. They were based on inequalities proven by John Bell. These inequalities made predictions for experiments that could test conclusively whether the EPR view of locality applied to the microscopic world. These experiments proved that under circumstances similar to those in the EPR thought experiment, the state of particle 2 is instantaneously affected by the measurement on

particle 1, irrespective of their distance. For a beautiful discussion of this see the 1981 paper by Mermin in the /Resources/QuantumMechanics folder on the website.

Chapter 7

The Wave Function

Our next task is to understand this quantum mechanical description and how it leads to particle-wave duality, the uncertainty principle and many other interesting consequences.

To make things simple, we will consider a single, pointlike (classically) particle moving in one dimension, possibly under the influence of a conservative force (i.e. in a potential well of some sort). We consider only non-relativistic mechanics, i.e. speeds much less than the speed of light. A relativistic formulation of quantum mechanics exists, but it is significantly more complicated. We will now contrast the classical description of the particle with the quantum description.

7.1 Classical description of the state of a particle

In Newtonian mechanics, the instantaneous state of a particle is determined by two functions of time: the position $x(t)$ and the velocity $v(t)$. It is convenient for the subsequent discussion to replace the velocity by the momentum $p(t) = mv(t)$. So if you know the position and momentum of a particle at any given time, you know everything there is to know about the state of the particle. For example you can calculate its energy, both kinetic and potential (assuming you know the potential energy function). It is possible to know (i.e. measure) both the position and momentum with arbitrary accuracy at any given time. This seems like a trivial statement, but we will see that it

becomes highly non-trivial in the quantum case.

7.2 Quantum description of the state of a particle

• The state of a particle at each instant in time can be specified by a function ψ(x, t) that assigns a complex number to each point, x in space at each time t. ψ(x, t) is called the wave function of the particle. It is a function of the spatial coordinate, x, and is in general complex valued (it has a real and imaginary part)

• The wave function is an extended object: you need to know its value everywhere in space to completely specify the state of a system. It is in this sense that a particle is actually a wave in quantum mechanics, albeit one that can exhibit particle-like properties on occasion.

• Physical interpretation of the wave function:

$$\psi^*(x,t)\,\psi(x,t)\,dx =: P(x,t)\,dx \tag{7.1}$$

gives the probability that a measurement of the position of the particle at time $t$ will yield a value between $x$ and $x + dx$.

• $P(x,t)$ is called the probability density. It is the probability per unit length of finding the particle at a position $x$.

• Normalization condition Suppose you are told that a particle at time $t$ is in a state described by a particular wave function $\psi(x,t)$. As stated above,
$$\psi^*(x,t)\,\psi(x,t)\,dx =: P(x,t)\,dx \tag{7.2}$$
gives the probability that a measurement of the position of the particle at time $t$ will yield a value between $x$ and $x + dx$. This leads to the following: Since the probability of finding the particle anywhere must be one (assuming that the particle exists), it must be true that:

$$\int_{-\infty}^{\infty} \psi^*(x,t)\,\psi(x,t)\,dx = 1 \tag{7.3}$$
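The normalization condition (7.3) can be imposed numerically on any trial wave function. The sketch below is only an illustration: the trial function (a Gaussian envelope times a complex phase), the integration range and the helper names are our own assumptions.

```python
import cmath
import math


def integrate(f, a, b, n=20000):
    """Trapezoidal rule for the integral of f over [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def unnormalized_psi(x):
    """Illustrative complex trial wave function at a fixed time."""
    return math.exp(-x ** 2 / 2.0) * cmath.exp(1j * 3.0 * x)


def probability_density(psi, x):
    """P(x) = psi*(x) psi(x), which is always real and non-negative."""
    value = psi(x)
    return (value.conjugate() * value).real


# Integral of psi* psi before normalization (the exact value is sqrt(pi) here).
norm_sq = integrate(lambda x: probability_density(unnormalized_psi, x), -10.0, 10.0)
A = 1.0 / math.sqrt(norm_sq)  # normalization constant, fixed only up to a phase


def psi(x):
    """Normalized wave function."""
    return A * unnormalized_psi(x)


total_probability = integrate(lambda x: probability_density(psi, x), -10.0, 10.0)
print(norm_sq, total_probability)
```

Note that the complex phase drops out of the probability density, so only the envelope matters for normalization.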

7.3 Interpretation

• It is in general not possible to measure the entire wave function of a particle at a given instant in time. Moreover, if you do try to make a measurement, in most cases it changes the wave function and hence the state of the particle.

• Note that quantum mechanics predicts only probabilities, although in some special cases it is possible to predict the outcome of a measurement with certainty. For example, if you make two measurements in rapid succession of the position of a single particle, the outcome of the second measurement will be the same as the first.

• The fact that quantum mechanics predicts only probabilities caused Einstein to dislike the theory intensely. His reason for this conviction was summarized by him in the famous statement “God doesn’t throw dice”.

• The classical physical attributes of the particle, such as its position, momentum and energy, are determined indirectly from the quantum description of the particle state, i.e. from the wave function. They exist only in a statistical sense until they are measured.

• Wave function collapse It is nonetheless possible for a particle to have a well defined position, say, but this requires a very special form for the wave function, as we will discuss a bit later. In particular, directly after the position of a particle is measured to be, say, $x_0$, it must be true that an immediately subsequent measurement of the position of the same particle should also yield $x_0$, unless the particle is disturbed first. Directly after the first measurement, the particle can be said to have a position. However, the uncertainty principle then tells us that the momentum of the particle at that instant is completely uncertain. It is crucial to note that this implies that the measurement process can make a drastic and immediate change to the wave function, i.e. it “collapses” the wave function so that after the measurement the probability of finding the particle at $x_0$ is one, and zero everywhere else. The measurement process and the collapse of the wave function cannot as yet be understood within the framework of quantum mechanics, and so it is sometimes said that quantum mechanics is incomplete.

There is an alternative to postulating that measurement collapses the wave function. It is called the “many-worlds interpretation” of quantum mechanics. The basic idea is that measurement doesn’t pick out one outcome from all the possibilities entailed in the wave function. Instead all possible outcomes are realized simultaneously in parallel universes. This interpretation is experimentally indistinguishable from the collapse postulate. The currently popular view is to not worry so much about the interpretation, but to “shut up and calculate”.

Chapter 8

Momentum

8.1 Momentum and wave number

We have now seen that the Fourier transform allows us to write any quantum wave function $\psi(x)$ as a superposition of pure waves with wave number $k = 2\pi/\lambda$. Moreover, the Fourier transform $\phi(k)$ is a complementary mathematical description of $\psi(x)$: they provide precisely the same mathematical information, but in different ways. DeBroglie told us that a particle with momentum $p$ must be associated with a pure wave with wavelength $\lambda = h/p$, or expressed in terms of wave number we have the relationship:
$$p = \frac{h}{\lambda} = \frac{hk}{2\pi} = \hbar k \tag{8.1}$$
where we have defined a new form of Planck’s constant that is often useful:
$$\hbar := \frac{h}{2\pi} = 1.0545718 \times 10^{-34}\,\text{J}\cdot\text{s} \tag{8.2}$$
Thus, in quantum mechanics, a state with fixed momentum must be represented by a pure wave. In terms of physics, then, the Fourier transform allows us to write any

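wave function as a linear superposition of states with fixed momentum:
$$\begin{aligned} \psi(x) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, \phi(k) \exp(ikx) \\ &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0} dk\, \phi(k) \exp(ikx) + \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} dk\, \phi(k) \exp(ikx) \\ &= \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} dk \left( \phi(-k) \exp(-ikx) + \phi(k) \exp(+ikx) \right) \\ &= \frac{1}{\sqrt{2\pi\hbar}} \int_{0}^{\infty} dp \left( \phi(-p) \exp(-ipx/\hbar) + \phi(p) \exp(ipx/\hbar) \right) \end{aligned} \tag{8.3}$$
where we have simply done a change of variables $k \to p = \hbar k$ in the integral to get the last line. $\phi(p)$ also has a physical interpretation as providing a probability density: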

$$\Phi(p)\,dp := \phi^*(p)\,\phi(p)\,dp \tag{8.4}$$
gives the probability of measuring the momentum of a particle in state $\psi(x)$ with Fourier transform $\phi(p)$ to be in the range $p \to p + dp$.

Parseval’s theorem guarantees that this probability distribution is normalized: the probability of measuring some momentum, any momentum, is also one.

Thus, the average of very many measurements of the momentum of distinct particles in this state will be:

$$\begin{aligned} \langle p \rangle &= \langle \hbar k \rangle \\ &= \int_{-\infty}^{\infty} dk\, \hbar k\, \phi^*(k)\, \phi(k) \\ &= \hbar \langle k \rangle \end{aligned} \tag{8.5}$$
The rms deviation of such measurements will be given by:

$$\Delta p_{rms} = \hbar\, \Delta k_{rms} \tag{8.6}$$
Thus one has from (11.53):

$$\Delta x_{rms}\, \Delta p_{rms} \geq \frac{\hbar}{2} \tag{8.7}$$
So the source of the physical uncertainty principle is the fact that there are two complementary descriptions of the state of a particle: $\psi(x)$, which assigns probabilities to position, and $\phi(k)$, its Fourier transform, which assigns probabilities to momentum. The theory of Fourier transforms then says that if the position of a particle is well known, its momentum must be uncertain and vice versa.

8.2 The momentum operator

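The expectation value of momentum can also be calculated directly from the wave function $\psi(x)$ in the following way:
$$\begin{aligned} \int dx\, \psi^*(x) (-i\hbar) \frac{\partial \psi(x)}{\partial x} &= \int dx \int \frac{dk}{\sqrt{2\pi}} \exp(-ikx)\, \phi^*(k)\, (-i\hbar) \frac{\partial}{\partial x} \int \frac{d\tilde{k}}{\sqrt{2\pi}}\, \phi(\tilde{k}) \exp(i\tilde{k}x) \\ &= \int dx \int \frac{dk}{\sqrt{2\pi}} \exp(-ikx)\, \phi^*(k) \int \frac{d\tilde{k}}{\sqrt{2\pi}}\, (-i\hbar)(i\tilde{k})\, \phi(\tilde{k}) \exp(i\tilde{k}x) \\ &= \int dk\, \phi^*(k) \int d\tilde{k}\, (\hbar\tilde{k})\, \phi(\tilde{k})\, \frac{1}{2\pi} \int dx \exp(i(\tilde{k} - k)x) \\ &= \int dk\, \phi^*(k) \int d\tilde{k}\, (\hbar\tilde{k})\, \phi(\tilde{k})\, \delta(k - \tilde{k}) \\ &= \int dk\, \phi^*(k)\, (\hbar k)\, \phi(k) \\ &= \langle p \rangle \end{aligned} \tag{8.8}$$
In the third line we exchanged the order of integration, and in the fourth we used the fact that the $x$ integral is a representation of the Dirac delta function, which sets $\tilde{k} = k$.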

So there is an association between momentum and a certain differential operator which acts on the wave function by differentiation,
$$\hat{p} := -i\hbar \frac{\partial}{\partial x} \tag{8.9}$$
Note that operators which are polynomial functions of $\hat{p}$ are built simply by acting with $\hat{p}$ the required number of times. For example:

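$$(\hat{p})^2 \psi(x) = (-i\hbar)^2 \frac{\partial^2 \psi(x)}{\partial x^2} \tag{8.10}$$
etc.

The above discussion leads to a general notion in quantum mechanics: the role of all physical observables is played by suitable differential operators, such as the above. Although this statement will not make a lot of sense at this stage, it does lead us to the definition of energy in quantum mechanics and also to the equation that determines wave functions in quantum mechanics, i.e. the Schrodinger equation, which is the quantum analogue of Newton’s equations.

Example: For a particle in a state described by a Gaussian (11.17) calculate $\Delta p_{rms}$ by calculating the expectation values of the operators $\hat{p}$ and $(\hat{p})^2$ using (8.9).

With the normalized wave function (11.21), whose prefactor contributes the overall factor $1/(b\sqrt{\pi})$:
$$\begin{aligned} \langle \hat{p} \rangle &= \int dx\, \psi^*(x) (-i\hbar) \frac{\partial \psi(x)}{\partial x} \\ &= \frac{1}{b\sqrt{\pi}} \int dx \exp(-(x - x_0)^2/(2b^2))\, (-i\hbar)\, \frac{-2(x - x_0)}{2b^2} \exp(-(x - x_0)^2/(2b^2)) \\ &= \frac{1}{b\sqrt{\pi}} \frac{i\hbar}{b^2} \int dx\, (x - x_0) \exp(-(x - x_0)^2/b^2) \\ &= 0 \end{aligned} \tag{8.11}$$
by symmetry, as expected because there is no time dependence.

$$\begin{aligned} \langle \hat{p}^2 \rangle &= \int dx\, \psi^*(x) (-i\hbar)^2 \frac{\partial^2 \psi(x)}{\partial x^2} \\ &= \frac{1}{b\sqrt{\pi}} \int dx \exp(-(x - x_0)^2/(2b^2))\, (-i\hbar)^2\, \frac{\partial}{\partial x} \left[ \frac{-(x - x_0)}{b^2}\, e^{-(x - x_0)^2/(2b^2)} \right] \\ &= \frac{\hbar^2}{b\sqrt{\pi}} \int dx \left[ \frac{1}{b^2} - \frac{(x - x_0)^2}{b^4} \right] \exp(-(x - x_0)^2/b^2) \\ &= \frac{\hbar^2}{b^4} \left( b^2 - \langle (x - x_0)^2 \rangle \right) \\ &= \frac{\hbar^2}{b^4} \left( b^2 - \frac{b^2}{2} \right) \\ &= \frac{\hbar^2}{2b^2} \end{aligned} \tag{8.12}$$
The root mean square deviation is:

$$\Delta p_{rms} = \frac{\hbar}{\sqrt{2}\, b} \tag{8.13}$$
in agreement with (11.51) given that $\Delta p_{rms} = \hbar\, \Delta k_{rms}$.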
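The Gaussian results (8.11) to (8.13) can be cross-checked by applying the momentum operator numerically. The sketch below replaces the derivatives in (8.9) and (8.10) by central finite differences; the choice of units with $\hbar = 1$, the values $b = 1/2$ and $x_0 = 2$, and the grid parameters are all illustrative assumptions.

```python
import math

HBAR = 1.0  # work in units where hbar = 1 (assumption made for convenience)
B = 0.5     # Gaussian width parameter b (illustrative value)
X0 = 2.0    # centre of the Gaussian (illustrative value)


def psi(x):
    """Normalized real Gaussian wave function, eq. (11.21)."""
    return math.exp(-(x - X0) ** 2 / (2 * B ** 2)) / math.sqrt(B * math.sqrt(math.pi))


def momentum_moments(steps=4000, half_width=4.0):
    """Return (integral of psi * dpsi/dx, <p^2>) using central finite differences."""
    dx = 2 * half_width / steps
    overlap_first = 0.0  # <p> equals -i * hbar * overlap_first
    p_squared = 0.0
    for i in range(steps + 1):
        x = X0 - half_width + i * dx
        first = (psi(x + dx) - psi(x - dx)) / (2 * dx)
        second = (psi(x + dx) - 2 * psi(x) + psi(x - dx)) / dx ** 2
        overlap_first += psi(x) * first * dx
        p_squared += psi(x) * (-HBAR ** 2) * second * dx
    return overlap_first, p_squared


overlap_first, p_squared = momentum_moments()
# <p> vanishes for this state, so the rms deviation is just sqrt(<p^2>).
delta_p = math.sqrt(p_squared)
print(overlap_first, p_squared, delta_p)
```

With these values the exact answers are $\langle \hat{p}^2 \rangle = \hbar^2/(2b^2) = 2$ and $\Delta p_{rms} = \hbar/(\sqrt{2}\,b) \approx 1.414$; the finite-difference estimate agrees up to a discretization error of order $dx^2$.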

Chapter 9

The Schrodinger Equation

9.1 Energy Operator

The total energy for a particle moving in a potential V (x) is:

$$E = \frac{p^2}{2m} + V(x) \tag{9.1}$$
We would like to be able to calculate the average energy of a particle of mass $m$ in some given state $\psi(x)$ at some time $t$ (we will worry about the time dependence in the next section). In the above we learned how to construct a differential operator $\hat{p}$ that allowed us to calculate the average momentum of a particle and in fact the average value of any polynomial function of $p$. This suggests the following operator for energy (it is given the letter $H$ because it is actually the quantum version of the “Hamiltonian”, which you will learn about in an advanced mechanics class):

$$\begin{aligned} \hat{H} &= \frac{\hat{p}^2}{2m} + V(x) \\ &= \frac{1}{2m} (-i\hbar)^2 \frac{\partial^2}{\partial x^2} + V(x) \end{aligned} \tag{9.2}$$
Thus the average energy is:

$$\langle \hat{H} \rangle = \int dx \left( \psi^*(x)\, \frac{-\hbar^2}{2m} \frac{\partial^2 \psi(x)}{\partial x^2} + \psi^*(x) V(x) \psi(x) \right) \tag{9.3}$$

9.2 Stationary states

So far we have dealt with wave functions at some instant in time, without worrying about time dependence. In quantum mechanics, as in classical physics, waves don’t generally stand still. The one exception in both cases is the standing wave, which can be thought of as a linear superposition of two waves of the same amplitude and wavelength moving in opposite directions with the same velocity. The result is a wave whose nodes stand still, but each point along the wave oscillates “up and down”. In terms of formulas, we take $k_1 = k_2 = k$ and $\omega_1 = \omega_2 = \omega$ in Eq.(11.66). The resultant wave is:

$$y_{tot} = 2A \cos(kx) \cos(\omega t) \tag{9.4}$$

This is a standing wave. If it is a wave on a string, each point on the string at location $x$ oscillates up and down with amplitude $2A\cos(kx)$ and angular frequency $\omega = 2\pi f$. At fixed $t$ it is a pure wave with wavelength $\lambda = 2\pi/k$ and amplitude $2A\cos(\omega t)$. (see maple file)

The quantum analogue of the above is a stationary state:

Definition: stationary state To summarize, a stationary state in quantum mechanics is defined to have the following properties:

1. The wave function is that of a standing wave.

2. It has definite energy, which in quantum terms requires $\Delta E_{rms} = 0$.

3. The average values of all measurable quantities should be independent of time.

The above three conditions are satisfied by a time dependent wave function of the form:
$$\Psi(x,t) = A\, \psi(x)\, e^{-iEt/\hbar} \tag{9.5}$$
where $\psi(x)$ obeys the time independent Schrodinger equation (9.6).

$$-\frac{\hbar^2}{2m} \frac{\partial^2 \psi_E(x)}{\partial x^2} + V(x)\, \psi_E(x) = E\, \psi_E(x) \tag{9.6}$$
You should verify that:

• The time dependence cancels out of the probability density:

$$\Psi^*(x,t)\, \Psi(x,t) = \psi^*(x)\, \psi(x) \tag{9.7}$$

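• The time dependent complex wave satisfies the deBroglie postulate: $E = hf = \hbar\omega$

• The particle has well defined energy: $\Delta E_{rms} = 0$. In particular
$$\begin{aligned} \langle E \rangle = \langle \hat{H} \rangle &= \int dx \left( \psi^*(x)\, \frac{-\hbar^2}{2m} \frac{\partial^2 \psi(x)}{\partial x^2} + \psi^*(x) V(x) \psi(x) \right) = E \\ \langle E^2 \rangle &= E^2 \end{aligned} \tag{9.8}$$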

The time independent Schrodinger equation is a second order differential equation. The solution is subject to boundary conditions since the wave function must be normalizable. In general (except for the free particle which does not have a normalized wave function) this gives rise to a discrete set of solutions, with a corresponding discrete set of allowed energies. This discrete energy spectrum is where the “quantum” in quantum mechanics comes from.
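The first two properties of a stationary state listed above are easy to check with complex arithmetic. This is a minimal sketch in units where $\hbar = 1$; the value of the spatial wave function at the chosen point and the energy are arbitrary illustrative numbers, and the constant $A$ of (9.5) is taken to be absorbed into $\psi(x)$.

```python
import cmath

HBAR = 1.0  # units with hbar = 1 (assumption made for convenience)


def stationary_state(psi_x, energy, t):
    """Psi(x, t) = psi(x) * exp(-i E t / hbar) at one fixed point x, eq. (9.5)."""
    return psi_x * cmath.exp(-1j * energy * t / HBAR)


psi_x = 0.3 + 0.4j  # illustrative value of psi at some point x
energy = 2.5        # illustrative energy

# The probability density |Psi|^2 at this point, evaluated at several times.
densities = [abs(stationary_state(psi_x, energy, t)) ** 2 for t in (0.0, 0.7, 3.1)]

# The phase of the time-dependent factor advances as -omega * t with omega = E / hbar.
phase_at_small_t = cmath.phase(stationary_state(1.0, energy, 0.1))
print(densities, phase_at_small_t)
```

All three densities are equal to $|\psi(x)|^2 = 0.25$, while the phase rotates at angular frequency $\omega = E/\hbar$, as required by $E = \hbar\omega$.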

9.2.1 Example 1: Free Particle

The potential $V(x) = 0$ in this case, so the energy is
$$E = \frac{p^2}{2m} \tag{9.9}$$
The time independent Schrodinger equation is:
$$-\frac{\hbar^2}{2m} \frac{\partial^2 \psi(x)}{\partial x^2} = E\, \psi(x) \tag{9.10}$$
This reduces to the simple harmonic oscillator equation, so the solutions are sin and cos functions, which we can write in complex form:

$$\begin{aligned} \psi(x) &= A e^{ikx} + B e^{-ikx} \\ \hbar^2 k^2 &= 2mE = p^2 \end{aligned} \tag{9.11}$$
The full time-dependent solution for this stationary state is:

$$\begin{aligned} \Psi(x,t) &= \left( A e^{ikx} + B e^{-ikx} \right) e^{-i\omega t} \\ &= A e^{i(kx - \omega t)} + B e^{-i(kx + \omega t)} \end{aligned} \tag{9.12}$$
$$\text{where} \quad \omega = \frac{E}{\hbar} = \frac{\hbar k^2}{2m} \tag{9.13}$$
So the solution is an arbitrary linear combination of pure waves with wave number $k$ and angular frequency $\omega = \hbar k^2/2m$ moving to the right (the $A$ term) and to the left (the $B$ term). These wave functions represent states of fixed momentum magnitude $p$ and energy $E(p) = p^2/(2m)$. Note that such pure waves can NOT be normalized. They are completely de-localized (i.e. the probability of finding the particle anywhere is the same). This is why we need to either construct a wave packet, such as in Eq.(11.79), or confine them to a box, as described below.

9.2.2 Example 2: Particle in a Box

Consider a one-dimensional free particle (i.e. no forces acting on it), of mass $m$ confined to a box of length $L$ with impenetrable walls. In practical terms this means that the probability of finding the particle outside the box is zero. In this case we are looking for solutions $\psi_E(x)$ to (9.6) with $V(x) = 0$ subject to the boundary conditions:

$$\psi_E(0) = \psi_E(L) = 0 \tag{9.14}$$

Thus we have:

$$\frac{\partial^2 \psi_E(x)}{\partial x^2} = -\frac{2mE}{\hbar^2}\, \psi_E(x) \tag{9.15}$$
This is just the simple harmonic equation with $t$ replaced by $x$. We know that the solution is a linear combination of cos and sin functions. However, in order to satisfy the boundary conditions (9.14) the solutions take the form:
$$\psi_n(x) = A \sin\left( \frac{\pi n x}{L} \right), \quad n = 1, 2, 3, \ldots \tag{9.16}$$
As usual the constant $A$ is determined, up to an arbitrary phase, by the normalization condition (see Assignment). The energy is:

$$E_n = \hbar^2 \frac{\pi^2 n^2}{2mL^2} \tag{9.17}$$
Exercise: verify this by putting (9.16) into (9.6). The normalization condition on the wave function gives:
$$|A|^2 = \frac{2}{L} \tag{9.18}$$

Figure 9.1: First three energy eigenstates of a particle in a box. The wave function is plotted on the left, while the probability density is on the right.

The wave functions and probability densities are illustrated in Fig.(9.1). The full time dependent state (??) is given by:

$$\begin{aligned} \Psi_n(x,t) &= \psi_n(x)\, e^{-iE_n t/\hbar} \\ &= \sqrt{\frac{2}{L}} \sin\left( \frac{\pi n x}{L} \right) e^{-iE_n t/\hbar} \end{aligned} \tag{9.19}$$
where we have used
$$\hbar\omega_n = E_n \tag{9.20}$$
As indicated earlier, each allowed state represents a standing (complex) wave that vanishes at the sides of the box.
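To attach numbers to (9.17) and (9.18), the following minimal sketch evaluates the first three energy levels of an electron in a box and checks the normalization of the corresponding wave functions. The box length of 1 nm and the rounded constants are illustrative assumptions, not values from the notes.

```python
import math

# Rounded SI constants (assumed for this sketch).
HBAR = 1.054571817e-34  # reduced Planck constant, J*s
M_E = 9.1093837e-31     # electron mass, kg
EV = 1.602176634e-19    # one electron volt in joules


def box_energy(n, mass, length):
    """Energy E_n = hbar^2 pi^2 n^2 / (2 m L^2) of level n, eq. (9.17)."""
    return (math.pi * n * HBAR) ** 2 / (2.0 * mass * length ** 2)


def box_wavefunction(n, x, length):
    """psi_n(x) = sqrt(2/L) sin(pi n x / L), eqs. (9.16) and (9.18), with real A."""
    return math.sqrt(2.0 / length) * math.sin(math.pi * n * x / length)


def box_norm(n, length, steps=2000):
    """Midpoint-rule estimate of the integral of psi_n^2 over the box."""
    dx = length / steps
    return sum(box_wavefunction(n, (i + 0.5) * dx, length) ** 2 for i in range(steps)) * dx


L_BOX = 1.0e-9  # box length of 1 nm (illustrative choice)
energies_ev = [box_energy(n, M_E, L_BOX) / EV for n in (1, 2, 3)]
norms = [box_norm(n, L_BOX) for n in (1, 2, 3)]
print(energies_ev, norms)
```

The ground state energy comes out a little below 0.4 eV, and the levels grow as $n^2$, so the spacing between neighbouring levels increases with $n$.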

9.2.3 Example 3: Simple Harmonic Oscillator

The potential energy function for a simple harmonic oscillator is:
$$V(x) = \frac{1}{2} K x^2 \tag{9.21}$$
where $K$ is the spring constant. Recall that the angular frequency of the oscillator is:
$$\Omega = \sqrt{\frac{K}{m}} \tag{9.22}$$

The Schrodinger equation is
$$-\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2} + \frac{1}{2} K x^2 \psi = E \psi \tag{9.23}$$
Since the wave function $\psi$ must be normalizable, it must go to zero sufficiently fast as $x \to \pm\infty$. The simplest solution, and the one with the lowest energy, is:

$$\psi_0(x) = A e^{-Cx^2} \tag{9.24}$$
You can verify by plugging it in to (9.23) that it satisfies the Schrodinger equation provided that:
$$C = \frac{m\Omega}{2\hbar} \tag{9.25}$$
which implies that the energy is:
$$E_0 = \frac{1}{2} \hbar\Omega \tag{9.26}$$
Note that this means that a simple harmonic oscillator cannot have precisely zero energy as in classical mechanics. It has a “zero point energy” that plays an important role in physics. To normalize the wave function:
$$\begin{aligned} \int_{-\infty}^{\infty} dx\, \psi_0^*(x)\, \psi_0(x) &= |A|^2 \int_{-\infty}^{\infty} dx\, e^{-2Cx^2} \\ &= |A|^2 \sqrt{\frac{\pi}{2C}} \end{aligned} \tag{9.27}$$
so that
$$|A| = \left( \frac{2C}{\pi} \right)^{1/4} \tag{9.28}$$
In your assignment you will work with this ground state as well as the first excited state, that is the solution with the next lowest energy above $E_0$. There are an infinite number of states, labelled by an integer $n$, with energies given by:
$$E_n = \left( n + \frac{1}{2} \right) \hbar\Omega \tag{9.29}$$
Fig.(9.2) shows the first few states, i.e. wave functions, probability densities and energies for a simple harmonic oscillator.

Figure 9.2: First few energy levels of the simple harmonic oscillator. Credit: http://hyperphysics.phy-astr.gsu.edu/hbase/quantum/hosc5.html

Thus, the energy is quantized. As $n$ gets large, the difference between the energies becomes relatively small, so that they appear to be able to take on any real value. Note that they are still discrete, but the “spacing” between them is so small compared to energies we are used to dealing with that they seem continuous.

Example: Consider a simple harmonic oscillator of mass 1 kg, with initial amplitude 1 meter and spring constant 2 Newtons per meter. If its energy is quantized according to (9.29) above, what is the value of the integer $n$?
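One way to work the example is to equate the classical energy of the oscillator, $E = \frac{1}{2} K a^2$ for amplitude $a$, to (9.29) and solve for $n$. The sketch below does this; the rounded value of $\hbar$ is an assumption and the variable names are ours.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J*s (rounded value)

mass = 1.0             # kg
amplitude = 1.0        # m
spring_constant = 2.0  # N/m

# Classical energy stored in the oscillator at maximum displacement.
energy = 0.5 * spring_constant * amplitude ** 2  # joules

# Angular frequency of the oscillator, eq. (9.22).
omega = math.sqrt(spring_constant / mass)        # rad/s

# Solve E = (n + 1/2) * hbar * omega for n, eq. (9.29).
n = energy / (HBAR * omega) - 0.5

# Spacing between neighbouring levels relative to the total energy.
relative_spacing = HBAR * omega / energy
print(f"n = {n:.3e}, relative level spacing = {relative_spacing:.3e}")
```

The quantum number is of order $10^{33}$ and the relative spacing between levels is of order $10^{-34}$, which is why the energy of a macroscopic oscillator looks continuous.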

9.3 The time dependent Schrodinger equation

We can provide a heuristic argument for the form of the time dependent Schrodinger equation. Recall that we wish the time dependence of the wave function in the stationary case to be of the form:

$$\Psi(x,t) = \psi(x)\, e^{-iEt/\hbar} \tag{9.30}$$

which implies:
$$\frac{\partial \Psi}{\partial t} = -\frac{iE}{\hbar}\, \Psi(x,t) \tag{9.31}$$
If we assume that the wave function is of the form of a standing wave, i.e. that it is separable, we get the time independent Schrodinger equation. To get the correct equation in the more general case we replace $E$ by its operator form, just as we did for momentum in constructing the Hamiltonian. This gives:

$$\begin{aligned} i\hbar \frac{\partial \Psi(x,t)}{\partial t} &= \hat{H}\, \Psi(x,t) \\ &= -\frac{\hbar^2}{2m} \frac{\partial^2 \Psi(x,t)}{\partial x^2} + V(x)\, \Psi(x,t) \end{aligned} \tag{9.32}$$
In order to solve this equation one needs appropriate boundary conditions, and one needs to know the state $\Psi(x, t_0)$ at some initial time $t_0$. If the particle is not confined to a finite region of space, then the boundary conditions are provided by the requirement that the wave function be normalized:
$$\int_{-\infty}^{\infty} dx\, \Psi^*(x,t)\, \Psi(x,t) = 1 \tag{9.33}$$
This forces the wave function, or at least its norm, to vanish sufficiently fast at $\pm\infty$, and makes the mathematical problem well defined and solvable as long as the potential $V(x)$ is well behaved everywhere. You may have noticed that the left hand side of (9.33) depends on time, whereas the right hand side does not. The beauty of the Schrodinger equation is that if you normalize the wave function at any one time, say $t_0$ (and you have constructed the Hamiltonian correctly, boundary conditions included), then the wave function stays normalized.

Chapter 10

Conclusions

There is clearly a great deal more to be said about relativistic mechanics and quantum mechanics. It is hoped that the above provides a useful starting point. I have tried to

• emphasize the importance of symmetry throughout.

• not throw any equations at you without some justification, sometimes heuristic sometimes rigorous.

• convey some understanding of the beautiful structure underlying quantum mechanics.

• give you both the interest and the mathematical groundwork to delve deeper.

Chapter 11

Appendix: Mathematical Background

11.1 Complex Numbers

As mentioned above, the quantum description of the state of a particle at any instant in time is given by a wavefunction $\psi(x,t)$. Recall that the wave function is complex: it assigns a complex number to each point in space, at any time $t$. It is therefore useful to review complex numbers a bit:

1. In general a complex number can be written in terms of a real and imaginary part:
$$z = a + ib \tag{11.1}$$
where
$$i^2 = -1 \tag{11.2}$$
and $a$, $b$ are real numbers. $a$ is the real part of $z$, $b$ is its imaginary part.

2. Recall that complex numbers can be represented as a point or vector on the complex plane, with the y-axis representing the imaginary “component” of the complex number, as shown in Fig.(11.1). Thus we can split the wave function into a real and imaginary part:

$$\psi(x,t) = \psi_R + i\psi_I \tag{11.3}$$

3. Euler Formula As Fig.(11.1) also suggests, there is a way to represent a complex number in terms of polar coordinates $(\rho, \theta)$.

Figure 11.1: Complex Plane

Just by looking at the diagram, we can see that
$$\begin{aligned} a &= \rho \cos(\theta) \\ b &= \rho \sin(\theta) \\ \rightarrow \quad z &= \rho \left( \cos(\theta) + i \sin(\theta) \right) \end{aligned} \tag{11.4}$$
where
$$\begin{aligned} \rho &= \sqrt{a^2 + b^2} \\ \theta &= \cos^{-1}\left( \frac{a}{\sqrt{a^2 + b^2}} \right) \end{aligned} \tag{11.5}$$
What makes this interesting is the existence of Euler’s formula, which states that:
$$e^{i\theta} = \cos(\theta) + i \sin(\theta) \tag{11.6}$$
Proof: Taylor expand the exponential, cosine and sine functions in the above and see that it works term by term. So we can write any complex number in terms of a magnitude $\rho$ and phase $\theta$:
$$z = \rho\, e^{i\theta} \tag{11.7}$$

4. Complex conjugation: To complex conjugate any expression containing complex numbers, simply replace $i$ by $-i$ wherever it appears.

$$z^* = a - ib = \rho\, e^{-i\theta} \tag{11.8}$$

5. Norm (magnitude) of a complex number: The magnitude of a complex number is defined as:
$$|z| = \sqrt{z^* z} = \sqrt{a^2 + b^2} = \rho \tag{11.9}$$
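The properties listed above can be tried out directly with Python's built-in complex numbers. This is a minimal sketch; the number $z = 3 + 4i$ is an arbitrary illustrative choice. Note that `cmath.polar` determines the phase from both $a$ and $b$, whereas the inverse cosine in (11.5) on its own only covers $0 \leq \theta \leq \pi$, i.e. $b \geq 0$.

```python
import cmath
import math

z = 3.0 + 4.0j  # illustrative complex number a + ib with a = 3, b = 4

# Polar coordinates (rho, theta) of z, eqs. (11.4) and (11.5).
rho, theta = cmath.polar(z)

# Euler's formula, eq. (11.6): exp(i theta) = cos(theta) + i sin(theta).
euler_lhs = cmath.exp(1j * theta)
euler_rhs = complex(math.cos(theta), math.sin(theta))

# Rebuild z from its magnitude and phase, eq. (11.7).
z_from_polar = rho * cmath.exp(1j * theta)

# Complex conjugate, eq. (11.8), and norm, eq. (11.9).
z_conj = z.conjugate()
norm = math.sqrt((z_conj * z).real)

print(rho, theta, z_from_polar, z_conj, norm)
```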

11.2 Probabilities and expectation values

11.2.1 Discrete Distributions

• You know how to calculate the average mark on a test:

$$\langle M \rangle = \frac{\sum_{i=1}^{N} M_i}{N} \tag{11.10}$$

where $M_i$ is the mark of the $i$th student and $N$ is the total number of students. Note that here you are summing over all students.

• Another way to calculate the average is to ask how many times $N_a$ a particular mark $M_a$ occurred in the set of all the marks. This provides an alternative formula for the average:

$$\begin{aligned} \langle M \rangle &= \frac{\sum_a N_a M_a}{N} \\ &= \sum_a M_a \frac{N_a}{N} \\ &= \sum_a M_a P_a \end{aligned} \tag{11.11}$$
Note that in the above you are summing over all possible marks, not over all the students. We have defined:
$$P_a := \frac{N_a}{N} \tag{11.12}$$
which gives the probability, in the given distribution of marks, of finding the mark $M_a$. Actually, it only gives the probability in the limit that the number of students $N$ is very large, but you get the idea.

• This gives us a general formula for calculating the average value of any quantity $x$ whose possible values are given by the set $\{x_a,\ a = 1..N\}$, assuming that we know the probability distribution $P_a$, i.e. the probability that the particular value $x_a$ shows up:

$$\langle x \rangle = \sum_{a=1}^{N} x_a P_a \tag{11.13}$$
We can in fact calculate the average value of any function $f(x)$ of $x$ as well:
$$\langle f(x) \rangle = \sum_{a=1}^{N} f(x_a) P_a \tag{11.14}$$

• The root mean square deviation of a given probability distribution gives an idea of the spread of the values, i.e. the width of the probability distribution. It is defined as:

$$\Delta x_{rms} := \sqrt{\left| \langle x^2 \rangle - |\langle x \rangle|^2 \right|} \tag{11.15}$$
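The two ways of averaging, (11.10) and (11.11), and the root mean square deviation (11.15) can be compared on a small data set. The marks below are made up purely for illustration.

```python
import math
from collections import Counter

# Hypothetical marks of N = 8 students (made-up data for illustration).
marks = [70, 80, 80, 90, 60, 80, 70, 90]
N = len(marks)

# Eq. (11.10): sum over all students.
mean_over_students = sum(marks) / N

# Eqs. (11.11) and (11.12): sum over the distinct marks, weighted by P_a = N_a / N.
counts = Counter(marks)
probabilities = {mark: count / N for mark, count in counts.items()}
mean_over_marks = sum(mark * p for mark, p in probabilities.items())


def expectation(f):
    """Average of f(M) over the distribution P_a, eq. (11.14)."""
    return sum(f(mark) * p for mark, p in probabilities.items())


# Eq. (11.15): root mean square deviation.
rms_deviation = math.sqrt(abs(expectation(lambda m: m ** 2) - expectation(lambda m: m) ** 2))
print(mean_over_students, mean_over_marks, rms_deviation)
```

Both averages give 77.5, and the probabilities $P_a$ add up to one, as they must.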

11.2.2 Continuous probability distributions

When a quantity takes its values on the real line, the sums above must be replaced by integrals; nothing else changes. In particular, we can now state that quantum mechanics makes the following predictions. If one prepares a large number of particles, each in a state described by the same wave function $\psi(x)$ at the time of measurement, and measures the position, the position squared, or in fact any function $f(x)$ of the position, then the averages of the corresponding collection of measured values will be given by:
$$\begin{aligned} \langle x \rangle &= \int dx\, x\, \psi^*(x)\, \psi(x) \\ \langle x^2 \rangle &= \int dx\, x^2\, \psi^*(x)\, \psi(x) \\ \langle f(x) \rangle &= \int dx\, f(x)\, \psi^*(x)\, \psi(x) \end{aligned} \tag{11.16}$$

Example: Gaussian wavefunction

Suppose a large number of particles are prepared in a state with wavefunction:
$$\psi(x) = A \exp(-(x - x_0)^2/(2b^2)) \tag{11.17}$$
where $A$ is a complex number.

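1. What is the value of the constant $A$? We need to normalize the wave function:
$$\begin{aligned} 1 &= \int_{-\infty}^{\infty} dx\, \psi^*(x)\, \psi(x) \\ &= A^* A \int_{-\infty}^{\infty} dx \exp(-(x - x_0)^2/(2b^2)) \exp(-(x - x_0)^2/(2b^2)) \\ &= |A|^2 \int_{-\infty}^{\infty} dx \exp(-(x - x_0)^2/b^2) \\ &= |A|^2 \int_{-\infty}^{\infty} b\, dy \exp(-y^2) \quad \text{where } y := (x - x_0)/b \\ &= |A|^2\, b \sqrt{\pi} \end{aligned} \tag{11.18}$$

where we have used the basic integral:

$$\int_{-\infty}^{\infty} dy \exp\left( -\frac{y^2}{2} \right) = \sqrt{2\pi} \tag{11.19}$$
which after the rescaling $y \to \sqrt{2}\, y$ gives $\int_{-\infty}^{\infty} dy \exp(-y^2) = \sqrt{\pi}$. The normalization condition therefore requires:
$$|A|^2 = \frac{1}{b\sqrt{\pi}} \tag{11.20}$$

The normalized wave function is therefore:

$$\psi(x) = \frac{1}{\sqrt{b\sqrt{\pi}}} \exp(-(x - x_0)^2/(2b^2)) \tag{11.21}$$

Note that the normalization condition does not determine the phase of the complex number $A$, only its magnitude.

2. What is the most probable value of the outcome of the measurement of the particle’s position? The most probable value occurs at the value of $x$ where:
$$\begin{aligned} 0 &= \frac{dP(x)}{dx} \\ &= \frac{d(\psi^*(x)\, \psi(x))}{dx} \\ &= \frac{d\left( |A|^2 \exp(-(x - x_0)^2/b^2) \right)}{dx} \\ &= -\frac{2|A|^2}{b^2} (x - x_0) \exp(-(x - x_0)^2/b^2) \end{aligned} \tag{11.22}$$

So the most probable value is $x = x_0$.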

3. Plot the probability distribution $P(x)$ for $b = 1/2$ and $x_0 = 2$.

Figure 11.2: Gaussian probability distribution

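4. What will the average value of the particle’s measured position be?
$$\begin{aligned} \langle x \rangle &= \int_{-\infty}^{\infty} dx\, x\, \psi^*(x)\, \psi(x) \\ &= \frac{1}{b\sqrt{\pi}} \int_{-\infty}^{\infty} dx\, x \exp(-(x - x_0)^2/b^2) \\ &= \frac{1}{b\sqrt{\pi}} \int_{-\infty}^{\infty} dx\, (x - x_0) \exp(-(x - x_0)^2/b^2) + x_0 \frac{1}{b\sqrt{\pi}} \int_{-\infty}^{\infty} dx \exp(-(x - x_0)^2/b^2) \\ &= 0 + x_0 \end{aligned} \tag{11.23}$$

The first term is zero by symmetry of the integral; the second term gives $x_0$ because the wave function is normalized.

5. What is $\langle x^2 \rangle$ for this state?

$$\begin{aligned} \langle x^2 \rangle &= \int_{-\infty}^{\infty} dx\, x^2\, \psi^*(x)\, \psi(x) \\ &= \frac{1}{b\sqrt{\pi}} \int_{-\infty}^{\infty} dx\, (x - x_0)^2 \exp(-(x - x_0)^2/b^2) + \frac{1}{b\sqrt{\pi}} \int_{-\infty}^{\infty} dx\, (2x x_0 - x_0^2) \exp(-(x - x_0)^2/b^2) \\ &= \frac{1}{b\sqrt{\pi}} \int_{-\infty}^{\infty} dy\, y^2 \exp(-y^2/b^2) + x_0^2 \\ &= \frac{b^2}{2} + x_0^2 \end{aligned} \tag{11.24}$$

6. Root Mean Square Deviation One can get an idea of the “width” of a probability distribution, or equivalently the uncertainty in the resulting measurements of $x$, by calculating the root mean square deviation, which is defined as

$$\Delta x_{rms} = \sqrt{\left| \langle x^2 \rangle - (\langle x \rangle)^2 \right|} \tag{11.25}$$

Clearly for the Gaussian probability distribution above:

$$\Delta x_{rms} = \sqrt{\frac{b^2}{2} + x_0^2 - x_0^2} = \frac{b}{\sqrt{2}} \tag{11.26}$$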
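The results of the Gaussian example can be confirmed by doing the integrals in (11.16) numerically. The sketch below uses a simple midpoint rule with the values $b = 1/2$ and $x_0 = 2$ from item 3; the integration range and step count are illustrative assumptions.

```python
import math

B = 0.5   # width parameter b, as in item 3 of the example
X0 = 2.0  # centre x0, as in item 3 of the example


def probability_density(x):
    """P(x) = psi*(x) psi(x) for the normalized Gaussian (11.21)."""
    return math.exp(-(x - X0) ** 2 / B ** 2) / (B * math.sqrt(math.pi))


def expectation(f, lo=X0 - 6.0, hi=X0 + 6.0, steps=20000):
    """Midpoint-rule estimate of the integral of f(x) P(x), eq. (11.16)."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += f(x) * probability_density(x)
    return total * dx


total_probability = expectation(lambda x: 1.0)  # should be 1
mean_x = expectation(lambda x: x)               # eq. (11.23): x0
mean_x2 = expectation(lambda x: x ** 2)         # eq. (11.24): b^2/2 + x0^2
delta_x = math.sqrt(abs(mean_x2 - mean_x ** 2)) # eq. (11.26): b / sqrt(2)
print(total_probability, mean_x, mean_x2, delta_x)
```

For these parameters the analytic values are $\langle x \rangle = 2$, $\langle x^2 \rangle = 4.125$ and $\Delta x_{rms} = 1/(2\sqrt{2}) \approx 0.354$.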

11.2.3 Dirac Delta Function If you construct a particle state with the above wave function, for example, such that b is arbitrarily small, then the width of the distribution goes to zero, the peak value goes to infinity so that it can stay normalized. In the limit that b → 0, a measurement of the position of the particle will yield x0 with probability one. Note: the limit described above yields something that is not quite a function, but useful nonetheless, namely the Dirac delta function: 1 δ(x − x ) := lim √ exp(−(x − x )2/(b2)) (11.27) 0 b→0 b π 0

This “function” is zero everywhere, except at x = x0, where it is infinite. The other important feature is that the integral

$$
\int dx\, \delta(x - x_0) = 1 \tag{11.28}
$$

as long as the integration region covers x0. Fig. (11.3) illustrates the limiting procedure defined above.

Figure 11.3: Dirac delta function as the infinitely thin limit of a Gaussian

We are into a new area of mathematics here, one that was again motivated by physics and initially developed by a physicist (P.A.M. Dirac). δ(x − x0) is not a function because it is not well defined at x0. It is called a “distribution”, because its integral is well defined. It is extremely useful in physics, and mathematicians have (mostly) also now acknowledged that the theory of distributions is a rigorous branch of mathematics.

Note that (11.27) provides one of many explicit realizations of the Dirac delta function as a limit of an ordinary function. While useful, such realizations are not required. The Dirac delta function is defined by the two properties that

• It is non-vanishing only at one point:

$$
\delta(x - x_0) = 0, \quad x \neq x_0 \tag{11.29}
$$

• Its integral across that point is one.

$$
\int_{x_0 - b}^{x_0 + a} dx\, \delta(x - x_0) = 1, \quad \text{for all } a, b > 0 \tag{11.30}
$$

It can also be shown that the above conditions can more or less be replaced by the single condition:

$$
\int_{x_0 - b}^{x_0 + a} dx\, \delta(x - x_0) f(x) = f(x_0), \quad \text{for any function } f(x) \tag{11.31}
$$

The easiest way to prove this is to do a Taylor series expansion of f(x) around x0:

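$$
\begin{aligned}
\int_{x_0 - b}^{x_0 + a} dx\, \delta(x - x_0) f(x)
&= \int_{x_0 - b}^{x_0 + a} dx\, \delta(x - x_0) \left[ f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + \dots \right] \\
&= f(x_0) \int_{x_0 - b}^{x_0 + a} dx\, \delta(x - x_0) \\
&= f(x_0)
\end{aligned}
\tag{11.32}
$$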

To get the second line we have used the fact that δ(x − x0) vanishes when x ≠ x0 (and none of the derivatives of f(x) blow up at x0). Every higher order term in the Taylor expansion contains a power of (x − x0), which is zero at the only point where the delta function is non-vanishing, so these terms drop out, leaving only the first term, namely f(x0). To get the last line we used (11.30).
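The property (11.31) can be illustrated numerically with the Gaussian realization (11.27) at small but finite b. This is a sketch under stated assumptions, not part of the original notes: the test function cos(x), the point x0 = 0.3, the sample widths, and the helper names `delta_b` and `smeared_value` are all illustrative choices.

```python
import math

def delta_b(x, x0, b):
    """Gaussian of width b from (11.27), before the limit b -> 0 is taken."""
    return math.exp(-((x - x0) ** 2) / b**2) / (b * math.sqrt(math.pi))

def smeared_value(f, x0, b, half_width=10.0, steps=4000):
    """Midpoint-rule estimate of the integral of delta_b(x - x0) * f(x).

    The range is truncated to x0 +/- half_width*b, where the Gaussian
    tails are negligible (an assumption of this sketch).
    """
    lo = x0 - half_width * b
    dx = 2 * half_width * b / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += delta_b(x, x0, b) * f(x)
    return total * dx

x0 = 0.3
widths = (0.5, 0.1, 0.02)
# Deviation of the smeared integral from f(x0) for each width b.
errors = [abs(smeared_value(math.cos, x0, b) - math.cos(x0)) for b in widths]

for b, err in zip(widths, errors):
    print(b, err)
```

The deviation from f(x0) shrinks as b decreases, consistent with the Taylor series argument: the leading correction comes from the f''(x0) term and scales like b².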

11.3 Fourier Series and Transforms

11.3.1 Fourier series

Any function y(x) of position that is periodic with period L can be approximated by an infinite sum of pure waves:

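$$
\begin{aligned}
y(x) &= \frac{1}{2} a_0 + \sum_{n=1}^{\infty} \left[ a_n \cos\left(\frac{2\pi n x}{L}\right) + b_n \sin\left(\frac{2\pi n x}{L}\right) \right] \\
&= \frac{1}{2} a_0 + \sum_{n=1}^{\infty} \left( a_n \cos(k_n x) + b_n \sin(k_n x) \right)
\end{aligned}
\tag{11.33}
$$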

where we have defined:

$$
k_n := n k = \frac{2\pi n}{L} \tag{11.34}
$$

with k = 2π/L the wave number of the fundamental mode. The coefficients can be obtained as follows:

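$$
a_0 = \frac{2}{L} \int_{x_0}^{x_0 + L} d\tilde{x}\, y(\tilde{x}) \tag{11.35}
$$

$$
a_n = \frac{2}{L} \int_{x_0}^{x_0 + L} d\tilde{x}\, y(\tilde{x}) \cos\left(\frac{2\pi n}{L} \tilde{x}\right), \quad n = 1, 2, 3, \dots \tag{11.36}
$$

$$
b_n = \frac{2}{L} \int_{x_0}^{x_0 + L} d\tilde{x}\, y(\tilde{x}) \sin\left(\frac{2\pi n}{L} \tilde{x}\right), \quad n = 1, 2, 3, \dots \tag{11.37}
$$

Note: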

• To calculate the coefficients you can integrate over any complete period.

• If the function is symmetric about x = 0, i.e. if f(x) = f(−x), then the coefficients of the sin terms in the series will vanish.

• If the function is anti-symmetric about x = 0, i.e. if f(x) = −f(−x) then the coefficients of the cos terms in the series will vanish.

• If you wish to approximate a non-periodic function in a finite interval, say [−L/2, L/2], then just pretend it is periodic, with that interval corresponding to one full period, and carry on from there.

• Often you can be clever in how you choose to extend the function and make the resulting function either even or odd, thereby having to calculate only the cos or sin coefficients, respectively.

• Constructive and Destructive Interference: Fourier series basically works via constructive and destructive interference. By adding periodic functions with different wavelengths you can arrange for the peaks and valleys to cancel and add at the right places so that the desired function is produced.

Examples: See Maple File and PDF of Maple File for calculations of Fourier series of various functions. Fig. 11.4 shows the fifth order approximation to $f(x) = e^{-8x^2}$, $-2 \leq x \leq 2$, given by

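$$
\begin{aligned}
f(x) \approx{}& 0.157 + 0.2900768785 \cos\left(\tfrac{1}{2}\pi x\right) + 0.230 \cos(\pi x) + 0.157 \cos\left(\tfrac{3}{2}\pi x\right) \\
&+ 0.091 \cos(2\pi x) + 0.0456 \cos\left(\tfrac{5}{2}\pi x\right)
\end{aligned}
\tag{11.38}
$$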

Note that because the function is symmetric under x → −x, the Fourier decomposition only contains cos functions.

Figure 11.4: Fifth order Fourier decomposition of $f(x) = e^{-8x^2}$.
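The coefficients quoted in (11.38) can be reproduced, to the number of digits shown, by evaluating (11.35)–(11.37) numerically over the period L = 4. The sketch below is illustrative only: the notes use Maple for this calculation, and the helper names `integrate`, `a` and `b` together with the midpoint rule are assumptions of this example.

```python
import math

L = 4.0   # one full period, -2 <= x <= 2

def f(x):
    """The function being expanded, f(x) = exp(-8 x^2)."""
    return math.exp(-8 * x**2)

def integrate(g, lo, hi, steps=4000):
    """Midpoint-rule estimate of the integral of g over [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * dx) for i in range(steps)) * dx

def a(n):
    """Cosine coefficient, (11.35) for n = 0 and (11.36) for n >= 1."""
    return (2 / L) * integrate(
        lambda x: f(x) * math.cos(2 * math.pi * n * x / L), -L / 2, L / 2
    )

def b(n):
    """Sine coefficient (11.37); expected to vanish because f is even."""
    return (2 / L) * integrate(
        lambda x: f(x) * math.sin(2 * math.pi * n * x / L), -L / 2, L / 2
    )

# Constant term a0/2 followed by a1..a5, in the order they appear in (11.38).
coeffs = [a(0) / 2] + [a(n) for n in range(1, 6)]
sine_coeffs = [b(n) for n in range(1, 6)]

print(coeffs)
print(sine_coeffs)
```

The sine coefficients come out at the level of rounding error, in line with the remark that an even function has a pure cosine series.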

Complex Form of Fourier Series

Exercise: Show using Euler’s formula that the Fourier series above can be written in complex form:

$$
f(x) = \sum_{n=-\infty}^{\infty} c_n e^{i k_n x} \tag{11.39}
$$

where

$$
k_n = \frac{2\pi n}{L} \tag{11.40}
$$

$$
c_n = \frac{a_n - i b_n}{2} \tag{11.41}
$$

$$
c_{-n} = c_n^* = \frac{a_n + i b_n}{2} \tag{11.42}
$$

11.3.2 Fourier Transforms

It is possible to extend the Fourier series to approximate, over the entire real line, functions that are not periodic, simply by taking L → ∞ (carefully). The derivation can be found in Chapter 12 of the book Mathematical Methods in Physics, by Riley et al. You can find it on the course website. The only requirement on the function f(x) is that the following integral:

$$
\int_{-\infty}^{\infty} |f(x)|\, dx \tag{11.43}
$$

be finite. If that is the case, then one has the following expression:

$$
f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, F(k) \exp(ikx) \tag{11.44}
$$

where

$$
F(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, f(x) \exp(-ikx) \tag{11.45}
$$

• The basic difference is that instead of a discrete (but infinite) set of fundamental modes with wave numbers kn = 2πn/L, there is an uncountable infinity of modes, so you have to integrate over them instead of doing a sum. The bottom line as far as quantum mechanics goes is that you can write any normalizable wave function as an infinite sum (strictly, an integral) of pure waves.

• Note that f(x) and its Fourier transform F(k) can be thought of as two different (complementary) descriptions of the same function.

Examples:

• Fourier transform of f(x) such that

$$
f(x) =
\begin{cases}
A e^{-\lambda x}, & x \geq 0 \\
0, & x < 0
\end{cases}
\tag{11.46}
$$

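$$
\begin{aligned}
F(k) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0} dx\, 0 \cdot e^{-ikx} + \frac{A}{\sqrt{2\pi}} \int_{0}^{\infty} dx \exp(-\lambda x) \exp(-ikx) \\
&= 0 + \frac{A}{\sqrt{2\pi}} \left[ -\frac{\exp(-(\lambda + ik)x)}{\lambda + ik} \right]_{0}^{\infty} \\
&= \frac{A}{\sqrt{2\pi}\,(\lambda + ik)}
\end{aligned}
\tag{11.47}
$$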

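• Gaussian wave function: Use the normalized Gaussian wave function from (11.21), centered on the origin x0 = 0:

$$
\psi(x) = \frac{1}{\sqrt{b\sqrt{\pi}}} \exp\left(\frac{-x^2}{2b^2}\right) \tag{11.48}
$$

Its Fourier transform is:

$$
\begin{aligned}
\phi(k) &= \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{b\sqrt{\pi}}} \int_{-\infty}^{\infty} dx \exp\left(-\frac{x^2}{2b^2}\right) \exp(-ikx) \\
&= \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{b\sqrt{\pi}}} \int_{-\infty}^{\infty} dx \exp\left(\frac{-(x + ikb^2)^2 - k^2 b^4}{2b^2}\right) \\
&= \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{b\sqrt{\pi}}} \exp(-k^2 b^2/2) \int_{-\infty}^{\infty} dx \exp\left(-\frac{x^2}{2b^2}\right) \\
&= \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{b\sqrt{\pi}}} \exp(-k^2 b^2/2)\, b\sqrt{2\pi} \\
&= \sqrt{\frac{b}{\sqrt{\pi}}} \exp(-k^2 b^2/2)
\end{aligned}
\tag{11.49}
$$

The second line follows by completing the square in the exponent, and the third by shifting the integration variable.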

Note that φ(k) is exactly the same as ψ(x) providing one substitutes k for x and 1/b for b everywhere. It is therefore clearly also normalized. This is not a coincidence.
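The closed form (11.49) can be compared against a direct numerical evaluation of the transform (11.45). The following is a minimal sketch, assuming the normalized Gaussian with prefactor 1/√(b√π); the names `psi`, `phi_exact` and `phi_numeric`, the value b = 0.5, and the sample wave numbers are illustrative and do not come from the notes.

```python
import cmath
import math

def psi(x, b):
    """Normalized Gaussian wave function centered on the origin."""
    return math.exp(-(x**2) / (2 * b**2)) / math.sqrt(b * math.sqrt(math.pi))

def phi_exact(k, b):
    """Closed-form Fourier transform from (11.49)."""
    return math.sqrt(b / math.sqrt(math.pi)) * math.exp(-(k**2) * b**2 / 2)

def phi_numeric(k, b, half_width=12.0, steps=4000):
    """Midpoint-rule estimate of (1/sqrt(2 pi)) * integral of psi(x) exp(-ikx).

    The range is truncated to +/- half_width*b, where psi is negligible
    (an assumption of this sketch).
    """
    lo = -half_width * b
    dx = 2 * half_width * b / steps
    total = 0j
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += psi(x, b) * cmath.exp(-1j * k * x)
    return total * dx / math.sqrt(2 * math.pi)

b = 0.5
sample_k = (0.0, 1.0, 2.5, 4.0)
max_error = max(abs(phi_numeric(k, b) - phi_exact(k, b)) for k in sample_k)
print(max_error)
```

The imaginary part of the numerical result vanishes up to rounding, as expected for the transform of a real, even function.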

11.3.3 The mathematical uncertainty principle

The Fourier transform, like the Fourier series, works via constructive and destructive interference of pure waves to enhance or suppress the parts of the resultant function where required. This suggests that if f(x) is very narrow, a lot of destructive interference is required to make the function zero on most of the real axis. Thus F(k) will be “wide”. If f(x) is wide, on the other hand, you will need fewer Fourier modes, so F(k) will be narrow. This results in a theorem that says the product of the widths is minimized when f(x) and consequently F(k) are Gaussians. The width of a distribution, as mentioned above, is defined by the root mean square deviation. For the normalized Gaussian wave function we have already found that:

$$
\Delta x_{rms} = \frac{b}{\sqrt{2}} \tag{11.50}
$$

Let’s now calculate the corresponding width associated with its Fourier transform:

$$
\Delta k_{rms} := \sqrt{\langle k^2 \rangle - \langle k \rangle^2} = \frac{1}{b\sqrt{2}} \tag{11.51}
$$

This follows from the fact that the integrals are identical to those for x, with the substitution x → k and b → 1/b. The optimized situation therefore is:

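$$
\Delta x_{rms} \Delta k_{rms} = \frac{b}{\sqrt{2}} \frac{1}{b\sqrt{2}} = \frac{1}{2} \tag{11.52}
$$

Thus, the theorem states that:

$$
\Delta x_{rms} \Delta k_{rms} \geq \frac{1}{2} \tag{11.53}
$$

11.3.4 Dirac Delta Function Revisited

We can now show that there exists another very useful representation of the Dirac delta function:

$$
\delta(x - x_0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk \exp(-ik(x - x_0)) \tag{11.54}
$$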

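Proof: Using the above representation,

$$
\begin{aligned}
\int dx\, \delta(x - x_0) f(x) &= \frac{1}{2\pi} \int dx \int dk\, f(x) \exp(-ikx) \exp(ikx_0) \\
&= \frac{1}{\sqrt{2\pi}} \int dk \exp(ikx_0)\, \frac{1}{\sqrt{2\pi}} \int dx\, f(x) \exp(-ikx) \\
&= \frac{1}{\sqrt{2\pi}} \int dk \exp(ikx_0) F(k) \\
&= f(x_0)
\end{aligned}
\tag{11.55}
$$

The last two lines use the definitions (11.45) and (11.44). Since in the above k and x are arbitrary variables, it is also true that:

$$
\delta(k - \tilde{k}) = \frac{1}{2\pi} \int dx \exp(i(k - \tilde{k})x) \tag{11.56}
$$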

11.3.5 Parseval’s Theorem

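$$
\begin{aligned}
\int dx\, \psi^*(x)\psi(x) &= \int dx \left( \frac{1}{\sqrt{2\pi}} \int dk\, \phi^*(k) \exp(-ikx) \right) \left( \frac{1}{\sqrt{2\pi}} \int d\tilde{k}\, \phi(\tilde{k}) \exp(i\tilde{k}x) \right) \\
&= \int dk \int d\tilde{k}\, \phi^*(k)\phi(\tilde{k})\, \frac{1}{2\pi} \int dx \exp(-i(k - \tilde{k})x) \\
&= \int dk\, \phi^*(k) \int d\tilde{k}\, \phi(\tilde{k})\, \delta(k - \tilde{k}) \\
&= \int dk\, \phi^*(k)\phi(k)
\end{aligned}
\tag{11.57}
$$

The third line uses (11.56), together with the fact that the delta function is even, δ(k − k̃) = δ(k̃ − k).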

This shows that normalizing the wave function will automatically normalize its Fourier transform.
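As a concrete check of (11.57), one can take the one-sided exponential (11.46) with its transform (11.47) and evaluate both sides by hand. The short worked example below is a sketch that assumes A is real and λ > 0; it uses only the standard integral of 1/(λ² + k²) over the real line.

```latex
% Position side: f(x) = A e^{-\lambda x} for x >= 0, zero otherwise.
\int_{-\infty}^{\infty} dx\, |f(x)|^2
  = A^2 \int_{0}^{\infty} dx\, e^{-2\lambda x}
  = \frac{A^2}{2\lambda}

% Wave-number side: F(k) = A / (\sqrt{2\pi}(\lambda + ik)).
\int_{-\infty}^{\infty} dk\, |F(k)|^2
  = \frac{A^2}{2\pi} \int_{-\infty}^{\infty} \frac{dk}{\lambda^2 + k^2}
  = \frac{A^2}{2\pi} \cdot \frac{\pi}{\lambda}
  = \frac{A^2}{2\lambda}
```

Both sides agree, so choosing A = √(2λ) normalizes f(x) and F(k) simultaneously.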

11.4 Waves

11.4.1 Moving pure waves

Consider the time dependent pure wave:

y(x, t) = A cos(kx − ωt) (11.58)

We will assume that k and ω are both positive.

• k is called the wavenumber. The wavelength at fixed t is related to the wave number by:

$$
\lambda = \frac{2\pi}{k} \tag{11.59}
$$

• ω is called the angular frequency. It gives the frequency of oscillation of the wave at a fixed point x in radians per second. The frequency of vibrations at fixed x in cycles per second is

$$
f = \frac{\omega}{2\pi} \tag{11.60}
$$

• Phase Velocity: The function reaches its maximum value whenever kx − ωt = 2πn, n = 0, ±1, ±2, .... As t increases, if ω is positive, this peak value moves in the positive x direction, with speed

$$
v = \frac{\omega}{k} \tag{11.61}
$$

v is called the phase velocity.

• A wave moving in the negative x direction takes the form:

y(x, t) = A cos(kx + ωt) (11.62)

• In general one can describe a wave moving at fixed speed by a function f(x ∓ vt). This function does not have to be periodic. However, since it is a function only of the combination x ∓ vt, as t increases to t + ∆t, its shape will not change, but merely get shifted to x + ∆x, where

∆x = ±v∆t (11.63)

because with these changes, the argument of the function remains the same

(x + ∆x) ∓ v(t + ∆t) = (x ± v∆t) ∓ v(t + ∆t) = x ∓ vt (11.64)

See Maple File

11.4.2 Complex Waves

Quantum mechanics requires us to consider waves that are complex valued. It is therefore useful to consider complex exponentials:

$$
\psi(x, t) = A e^{i(kx - \omega t)} = A \left( \cos(kx - \omega t) + i \sin(kx - \omega t) \right) \tag{11.65}
$$

which describes a complex wave with fixed wave number k and angular frequency ω, moving at fixed speed v = ω/k in the positive x direction.

11.4.3 Group velocity and phase velocity

We have seen in the context of constant, or standing, waves that by taking a linear superposition of a large number of pure waves of different wavelengths and amplitudes you can generate pretty well any shape you want. The same can of course be done for moving waves to produce a moving “packet” of waves, or wave packet. The simplest example is to consider just the sum of two waves (see assignment) of equal amplitude.

Example:

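$$
\begin{aligned}
y_1 &= A \cos(k_1 x \mp \omega_1 t) \\
y_2 &= A \cos(k_2 x \mp \omega_2 t)
\end{aligned}
\tag{11.66}
$$

We will consider the k’s and ω’s to be positive. The direction of the wave will then be determined by the relative sign between the first and second term in the argument of the cosine function. If the sign is negative, the wave moves to the right; if it is positive, it moves to the left. The resultant wave can be put in a useful form by considering the following trig identity:

$$
\cos(a) + \cos(b) = 2 \cos\left(\frac{a - b}{2}\right) \cos\left(\frac{a + b}{2}\right) \tag{11.67}
$$

Applying this to the sum of the two waves given in (11.66):

$$
\begin{aligned}
y_{tot} &= y_1 + y_2 \\
&= A \cos(k_1 x \mp \omega_1 t) + A \cos(k_2 x \mp \omega_2 t) \\
&= 2A \cos\left(\frac{(k_1 - k_2)}{2} x \mp \frac{(\omega_1 - \omega_2)}{2} t\right) \cos\left(\frac{(k_1 + k_2)}{2} x \mp \frac{(\omega_1 + \omega_2)}{2} t\right) \\
&= 2A \cos(\Delta k\, x \mp \Delta\omega\, t) \cos(\bar{k} x \mp \bar{\omega} t)
\end{aligned}
\tag{11.68}
$$

The last line uses the fact that cosine is an even function, and assumes that ω1 − ω2 has the same sign as k1 − k2.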

where

$$
\Delta k := \frac{|k_1 - k_2|}{2} \tag{11.69}
$$

$$
\Delta\omega := \frac{|\omega_1 - \omega_2|}{2} \tag{11.70}
$$

are half the magnitudes of the differences in wave number and angular frequency, respectively, of the two waves, while

$$
\bar{k} := \frac{k_1 + k_2}{2} \tag{11.71}
$$

$$
\bar{\omega} := \frac{\omega_1 + \omega_2}{2} \tag{11.72}
$$

are the average wave number and angular frequency, respectively. Since we are assuming that both k’s are positive, the wavelength of the first factor

$$
\lambda_g := \frac{2\pi}{\Delta k} \tag{11.73}
$$

is larger than that of the second factor

$$
\lambda_p := \frac{2\pi}{\bar{k}} \tag{11.74}
$$

The first factor, with its longer wavelength, can be thought of as modulating the amplitude of the shorter wavelength wave. It describes the shape and motion of the envelope, as seen in the movie in the Maple file. The speed of the longer wavelength component, i.e. the speed of the envelope, is called the group velocity

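$$
v_g = \Delta\omega / \Delta k \tag{11.75}
$$

The velocity of the shorter wavelength component is called the phase velocity

$$
v_p = \frac{\bar{\omega}}{\bar{k}} = \frac{(\omega_1 + \omega_2)}{(k_1 + k_2)} \tag{11.76}
$$

where

$$
\bar{k} := \frac{k_1 + k_2}{2} \tag{11.77}
$$

$$
\bar{\omega} := \frac{\omega_1 + \omega_2}{2} \tag{11.78}
$$

are the average wave number and angular frequency, respectively.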
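The factorization (11.68) and the two velocities can be checked with a few lines of code. This is a sketch under stated assumptions, not taken from the notes or the Maple file: the wave numbers and frequencies are hypothetical values chosen so that k1 > k2 and ω1 > ω2, both waves move to the right, and the names `y_sum` and `y_product` are illustrative.

```python
import math

A = 1.0
k1, w1 = 10.0, 20.0    # hypothetical wave number and angular frequency of wave 1
k2, w2 = 9.0, 17.0     # hypothetical values for wave 2 (k1 > k2 and w1 > w2)

dk, dw = abs(k1 - k2) / 2, abs(w1 - w2) / 2    # (11.69), (11.70)
k_avg, w_avg = (k1 + k2) / 2, (w1 + w2) / 2    # (11.71), (11.72)

def y_sum(x, t):
    """Direct sum of the two right-moving waves in (11.66)."""
    return A * math.cos(k1 * x - w1 * t) + A * math.cos(k2 * x - w2 * t)

def y_product(x, t):
    """Envelope times carrier, the last line of (11.68)."""
    return 2 * A * math.cos(dk * x - dw * t) * math.cos(k_avg * x - w_avg * t)

# Compare the two forms on a grid of positions and times.
samples = [(0.1 * i, 0.07 * j) for i in range(50) for j in range(50)]
max_diff = max(abs(y_sum(x, t) - y_product(x, t)) for x, t in samples)

v_group = dw / dk          # speed of the envelope, (11.75)
v_phase = w_avg / k_avg    # speed of the carrier, (11.76)

print(max_diff, v_group, v_phase)
```

With these sample values the envelope moves at vg = 3 while the carrier moves at vp = 18.5/9.5 ≈ 1.95, so the two velocities are in general different.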

11.4.4 Wave packets

Fourier transform theory tells us that a single moving wave packet is made up of many modes, each with its own phase velocity. If the wave packet is not too spread out, there is an overall central envelope moving at a specific group velocity. Consider now a wave function that describes a particle with some relatively small spread in wave number (i.e. momentum):

$$
\psi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, \phi(k) \exp\left(i(kx - \omega(k)t)\right) \tag{11.79}
$$

• Note that the angular frequency depends on the wave number. Each component of this wave is a pure wave with fixed wave number, angular frequency and velocity.

• By allowing the wave number to be negative (and assuming that the angular frequency is positive), (11.79) sums over waves moving in the positive and negative x directions.

• We assume that φ(k) is sharply peaked, so that one wave number dominates the wave packet. The most probable wave number kc is determined by:

$$
\left. \frac{d\left(\phi^*(k)\phi(k)\right)}{dk} \right|_{k_c} = 0 \tag{11.80}
$$

In this case, one can roughly think of the wave as simply a linear combination of two waves with wave numbers on “either side” of the most probable wave number kc. In that case we extract a phase velocity and group velocity as follows:

• The phase velocity of the most probable pure wave component is then:

$$
v_p = \frac{\omega(k_c)}{k_c} \tag{11.81}
$$

• The group velocity of the envelope is:

$$
v_g = \left. \frac{d\omega}{dk} \right|_{k_c} \tag{11.82}
$$

• We can figure out what ω(k) must be for a free particle as follows: If this wave is a quantum wave function that obeys de Broglie’s postulates, then it must be true that:

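$$
p = m v_g = \hbar k_c \quad \text{and} \quad E = \frac{1}{2} m v_g^2 = \hbar\omega(k_c) \tag{11.83}
$$

This implies that

$$
\omega(k_c) = \frac{\hbar k_c^2}{2m} \tag{11.84}
$$

so:

$$
v_g = \left. \frac{d\omega}{dk} \right|_{k_c} = \frac{\hbar k_c}{m} \tag{11.85}
$$

and

$$
v_p = \left. \frac{\omega}{k} \right|_{k_c} = \frac{1}{2} \frac{\hbar k_c}{m} = \frac{1}{2} v_g \tag{11.86}
$$

as we discovered before. We will now see how to define momentum in quantum mechanics in a more general context and calculate its expectation value.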
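The free-particle relation vp = vg/2 obtained above can be illustrated numerically. The sketch below assumes the dispersion relation ω(k) = ħk²/(2m) that follows from (11.83); the choice of an electron, the sample wave number `k_c`, and the function names are hypothetical and serve only as an example.

```python
HBAR = 1.054571817e-34    # reduced Planck constant in J s
M_E = 9.1093837015e-31    # electron mass in kg (illustrative choice of particle)

def omega(k, m=M_E):
    """Free-particle dispersion relation, omega = hbar k^2 / (2 m)."""
    return HBAR * k**2 / (2 * m)

def group_velocity(k, m=M_E, rel_step=1e-6):
    """Central-difference estimate of d(omega)/dk at wave number k."""
    h = k * rel_step
    return (omega(k + h, m) - omega(k - h, m)) / (2 * h)

k_c = 1.0e10              # hypothetical central wave number in 1/m

v_g = group_velocity(k_c)         # should match hbar k_c / m
v_p = omega(k_c) / k_c            # phase velocity of the central mode

print(v_g, v_p, v_p / v_g)
```

The group velocity agrees with the classical particle velocity p/m = ħkc/m, while the phase velocity is half of it.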
