14 Linear Algebra

that is nonnegative when the matrices are the same
$$
(A, A) = \mathrm{Tr}\, A^\dagger A = \sum_{i=1}^{N} \sum_{j=1}^{L} A^*_{ij} A_{ij} = \sum_{i=1}^{N} \sum_{j=1}^{L} |A_{ij}|^2 \ge 0 \tag{1.87}
$$
which is zero only when $A = 0$. So this inner product is positive definite.

A vector space with a positive-definite inner product (1.73–1.76) is called an inner-product space, a metric space, or a pre-Hilbert space.

A sequence of vectors $f_n$ is a Cauchy sequence if for every $\epsilon > 0$ there is an integer $N(\epsilon)$ such that $\|f_n - f_m\| < \epsilon$ whenever both $n$ and $m$ exceed $N(\epsilon)$. A sequence of vectors $f_n$ converges to a vector $f$ if for every $\epsilon > 0$ there is an integer $N(\epsilon)$ such that $\|f - f_n\| < \epsilon$ whenever $n$ exceeds $N(\epsilon)$. An inner-product space with a norm defined as in (1.80) is complete if each of its Cauchy sequences converges to a vector in that space. A Hilbert space is a complete inner-product space. Every finite-dimensional inner-product space is complete and so is a Hilbert space. But the term Hilbert space more often is used to describe infinite-dimensional complete inner-product spaces, such as the space of all square-integrable functions (David Hilbert, 1862–1943).

Example 1.17 (The Hilbert Space of Square-Integrable Functions) For the vector space of functions (1.55), a natural inner product is
$$
(f, g) = \int_a^b dx\, f^*(x)\, g(x). \tag{1.88}
$$
The squared norm $\|f\|^2$ of a function $f(x)$ is
$$
\|f\|^2 = \int_a^b dx\, |f(x)|^2. \tag{1.89}
$$
A function is square integrable if its norm is finite. The space of all square-integrable functions is an inner-product space; it also is complete and so is a Hilbert space.

Example 1.18 (Minkowski Inner Product) The Minkowski or Lorentz inner product $(p, x)$ of two 4-vectors $p = (E/c, p_1, p_2, p_3)$ and $x = (ct, x_1, x_2, x_3)$ is $\mathbf{p}\cdot\mathbf{x} - Et$. It is indefinite, nondegenerate (1.79), and invariant under Lorentz transformations, and often is written as $p \cdot x$ or as $px$. If $p$ is the 4-momentum of a freely moving physical particle of mass $m$, then
$$
p \cdot p = \mathbf{p}\cdot\mathbf{p} - E^2/c^2 = -c^2 m^2 \le 0. \tag{1.90}
$$
The Minkowski inner product satisfies the rules (1.73, 1.74, and 1.79), but

and the formula (1.118) for the $k$th orthonormal linear combination of the vectors $|V_\ell\rangle$ is
$$
|U_k\rangle = \frac{|u_k\rangle}{\sqrt{\langle u_k|u_k\rangle}}. \tag{1.139}
$$

The vectors |Uk⟩ are not unique; they vary with the order of the |Vk⟩.
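The dependence on ordering can be seen in a small numerical experiment. The following is a minimal sketch of Gram-Schmidt orthonormalization ending in the normalization step (1.139); the helper names `inner` and `gram_schmidt` and the two sample vectors are illustrative choices, not taken from the text, and the input vectors are assumed to be linearly independent.

```python
import math

def inner(u, v):
    """Inner product <u|v> = sum over i of conj(u_i) * v_i."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize `vectors` in the order given (assumed linearly independent)."""
    basis = []
    for v in vectors:
        u = list(v)
        # Subtract the components of v along the vectors already accepted.
        for e in basis:
            c = inner(e, v)
            u = [ui - c * ei for ui, ei in zip(u, e)]
        norm = math.sqrt(inner(u, u).real)
        # Normalization step: |U_k> = |u_k> / sqrt(<u_k|u_k>)
        basis.append([ui / norm for ui in u])
    return basis

V1, V2 = [1.0, 0.0], [1.0, 1.0]
first = gram_schmidt([V1, V2])   # starts from V1
second = gram_schmidt([V2, V1])  # starts from V2
```

Both `first` and `second` are orthonormal sets spanning the same plane, but they differ: the first begins with (1, 0), while the second begins with (1, 1)/√2.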

Vectors and linear operators are abstract. The numbers we compute with are inner products like $\langle g|f\rangle$ and $\langle g|A|f\rangle$. In terms of $N$ orthonormal vectors $|n\rangle$ with $f_n = \langle n|f\rangle$ and $g_n^* = \langle g|n\rangle$, we can use the expansion (1.131) to write these inner products as
$$
\begin{aligned}
\langle g|f\rangle &= \langle g|I|f\rangle = \sum_{n=1}^{N} \langle g|n\rangle \langle n|f\rangle = \sum_{n=1}^{N} g_n^* f_n \\
\langle g|A|f\rangle &= \langle g|IAI|f\rangle = \sum_{n,\ell=1}^{N} \langle g|n\rangle \langle n|A|\ell\rangle \langle \ell|f\rangle = \sum_{n,\ell=1}^{N} g_n^* A_{n\ell} f_\ell
\end{aligned}
\tag{1.140}
$$
in which $A_{n\ell} = \langle n|A|\ell\rangle$. We often gather the inner products $f_\ell = \langle \ell|f\rangle$ into a column vector $f$ with components $f_\ell = \langle \ell|f\rangle$
$$
f = \begin{pmatrix} \langle 1|f\rangle \\ \langle 2|f\rangle \\ \vdots \\ \langle N|f\rangle \end{pmatrix}
= \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_N \end{pmatrix} \tag{1.141}
$$
and the $\langle n|A|\ell\rangle$ into a matrix $A$ with matrix elements $A_{n\ell} = \langle n|A|\ell\rangle$. If we also line up the inner products $\langle g|n\rangle = \langle n|g\rangle^*$ in a row vector that is the transpose of the complex conjugate of the column vector $g$
$$
g^\dagger = (\langle 1|g\rangle^*, \langle 2|g\rangle^*, \dots, \langle N|g\rangle^*) = (g_1^*, g_2^*, \dots, g_N^*) \tag{1.142}
$$
then we can write inner products in matrix notation as $\langle g|f\rangle = g^\dagger f$ and as $\langle g|A|f\rangle = g^\dagger A f$.

If we switch to a different basis, say from $|n\rangle$'s to $|\alpha_n\rangle$'s, then the components of the column vectors change from $f_n = \langle n|f\rangle$ to $f'_n = \langle \alpha_n|f\rangle$, and similarly those of the row vectors $g^\dagger$ and of the matrix $A$ change, but the bras, the kets, the linear operators, and the inner products $\langle g|f\rangle$ and $\langle g|A|f\rangle$

Remarkably, this translation operator is an exponential of the momentum operator $U(a) = \exp(-ipa/\hbar)$ in which $\hbar = h/2\pi = 1.054 \times 10^{-34}$ J s is Planck's constant divided by $2\pi$.

In two dimensions, with basis states $|x, y\rangle$ that are orthonormal in Dirac's sense, $\langle x, y|x', y'\rangle = \delta(x - x')\,\delta(y - y')$, the unitary operator
$$
U(\theta) = \int |x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta\rangle \langle x, y|\; dx\, dy \tag{1.171}
$$
rotates a system in space by the angle $\theta$. This rotation operator is the exponential $U(\theta) = \exp(-i\theta L_z/\hbar)$ in which the $z$ component of the angular momentum is $L_z = x\,p_y - y\,p_x$.

We may carry most of our intuition about matrices over to these unitary transformations that change from one infinite basis to another. But we must use common sense and keep in mind that infinite sums and integrals do not always converge.

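1.18 Antiunitary, Antilinear Operators

Certain maps on states $|\psi\rangle \to |\psi'\rangle$, such as those involving time reversal, are implemented by operators $K$ that are antilinear
$$
K(z\psi + w\phi) = K(z|\psi\rangle + w|\phi\rangle) = z^* K|\psi\rangle + w^* K|\phi\rangle = z^* K\psi + w^* K\phi \tag{1.172}
$$
and antiunitary
$$
(K\phi, K\psi) = \langle K\phi|K\psi\rangle = (\phi, \psi)^* = \langle \phi|\psi\rangle^* = \langle \psi|\phi\rangle = (\psi, \phi). \tag{1.173}
$$

In Dirac notation, these rules are $K(z|\psi\rangle) = z^*\langle\psi|$ and $K(w\langle\phi|) = w^*|\phi\rangle$.

1.19 Symmetry in Quantum Mechanics

In quantum mechanics, a symmetry is a one-to-one map of states $|\psi\rangle \leftrightarrow |\psi'\rangle$ and $|\phi\rangle \leftrightarrow |\phi'\rangle$ that preserves probabilities
$$
|\langle \phi'|\psi'\rangle|^2 = |\langle \phi|\psi\rangle|^2. \tag{1.174}
$$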

Eugene Wigner (1902–1995) showed that every symmetry in quantum mechanics can be represented either by an operator $U$ that is linear and unitary or by an operator $K$ that is antilinear and antiunitary. The antilinear, antiunitary case seems to occur only when the symmetry involves time reversal. Most symmetries are represented by operators that are linear and unitary. Unitary operators are of great importance in quantum mechanics.

shows that the operations (1.189) on columns that don't change the value of the determinant can be written as matrix multiplication from the right by a matrix that has unity on its main diagonal and zeros below. Now consider the matrix product
$$
\begin{pmatrix} A & 0 \\ -I & B \end{pmatrix}
\begin{pmatrix} I & B \\ 0 & I \end{pmatrix}
= \begin{pmatrix} A & AB \\ -I & 0 \end{pmatrix} \tag{1.202}
$$
in which $A$ and $B$ are $N \times N$ matrices, $I$ is the $N \times N$ identity matrix, and $0$ is the $N \times N$ matrix of all zeros. The second matrix on the left-hand side has unity on its main diagonal and zeros below, and so it does not change the value of the determinant of the matrix to its left, which then must equal that of the matrix on the right-hand side:
$$
\det \begin{pmatrix} A & 0 \\ -I & B \end{pmatrix} = \det \begin{pmatrix} A & AB \\ -I & 0 \end{pmatrix}. \tag{1.203}
$$

By using Laplace's expansion (1.183) along the first column to evaluate the determinant on the left-hand side and his expansion along the last row to compute the determinant on the right-hand side, one finds that the determinant of the product of two matrices is the product of the determinants
$$
\det A \, \det B = \det AB. \tag{1.204}
$$
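The block identity (1.203) and the product rule (1.204) are easy to test numerically. The sketch below is one way to do so in plain Python; the function `det` (a recursive Laplace expansion along the first row rather than a column), the helper `matmul`, and the sample matrices `A` and `B` are illustrative assumptions, not part of the text.

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
AB = matmul(A, B)

# The two 4 x 4 block matrices of equation (1.203), built row by row.
left_block = [A[0] + [0, 0], A[1] + [0, 0], [-1, 0] + B[0], [0, -1] + B[1]]
right_block = [A[0] + AB[0], A[1] + AB[1], [-1, 0, 0, 0], [0, -1, 0, 0]]

product_of_dets = det(A) * det(B)
det_of_product = det(AB)
```

For these integer matrices all four quantities `det(left_block)`, `det(right_block)`, `product_of_dets`, and `det_of_product` agree exactly.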

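Example 1.27 (Two 2 × 2 Matrices) When the matrices $A$ and $B$ are both $2 \times 2$, the two sides of (1.203) are
$$
\det \begin{pmatrix} A & 0 \\ -I & B \end{pmatrix}
= \det \begin{pmatrix}
a_{11} & a_{12} & 0 & 0 \\
a_{21} & a_{22} & 0 & 0 \\
-1 & 0 & b_{11} & b_{12} \\
0 & -1 & b_{21} & b_{22}
\end{pmatrix}
= a_{11} a_{22} \det B - a_{21} a_{12} \det B = \det A \, \det B \tag{1.205}
$$
and
$$
\det \begin{pmatrix} A & AB \\ -I & 0 \end{pmatrix}
= \det \begin{pmatrix}
a_{11} & a_{12} & (ab)_{11} & (ab)_{12} \\
a_{21} & a_{22} & (ab)_{21} & (ab)_{22} \\
-1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0
\end{pmatrix}
= (-1)\, C_{42} = (-1)(-1) \det AB = \det AB \tag{1.206}
$$
and so they give the product rule $\det A \, \det B = \det AB$.

remains true when the matrix $A$ replaces the variable $\lambda$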

$$
P(A, A) = \sum_{k=0}^{N} p_k A^k = 0. \tag{1.258}
$$
To see why, we use the formula (1.196)
$$
\delta_{k\ell} \det A = \sum_{i=1}^{N} A_{ik} C_{i\ell} \tag{1.259}
$$
to write the determinant $|A - \lambda I| = P(\lambda, A)$ as the product of the matrix $A - \lambda I$ and the transpose of its matrix of cofactors
$$
(A - \lambda I)\, C(\lambda, A)^{\mathsf T} = |A - \lambda I|\, I = P(\lambda, A)\, I. \tag{1.260}
$$

The transpose of the matrix of cofactors of the matrix $A - \lambda I$ is a polynomial in $\lambda$ with matrix coefficients
$$
C(\lambda, A)^{\mathsf T} = C_0 + C_1 \lambda + \cdots + C_{N-1} \lambda^{N-1}. \tag{1.261}
$$
The left-hand side of equation (1.260) is then
$$
(A - \lambda I)\, C(\lambda, A)^{\mathsf T} = A C_0 + (A C_1 - C_0)\lambda + (A C_2 - C_1)\lambda^2 + \cdots + (A C_{N-1} - C_{N-2})\lambda^{N-1} - C_{N-1}\lambda^{N}. \tag{1.262}
$$
Equating equal powers of $\lambda$ on both sides of (1.260), we have using (1.257) and (1.262)

$$
\begin{aligned}
A C_0 &= p_0 I \\
A C_1 - C_0 &= p_1 I \\
A C_2 - C_1 &= p_2 I \\
\cdots &= \cdots \\
A C_{N-1} - C_{N-2} &= p_{N-1} I \\
- C_{N-1} &= p_N I.
\end{aligned}
\tag{1.263}
$$
We now multiply from the left the first of these equations by $I$, the second by $A$, the third by $A^2$, . . . , and the last by $A^N$ and then add the resulting equations. All the terms on the left-hand sides cancel, while the sum of those on the right gives $P(A, A)$. Thus a square matrix $A$ obeys its characteristic equation $0 = P(A, A)$ or

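$$
0 = \sum_{k=0}^{N} p_k A^k = |A|\, I + p_1 A + \cdots + (-1)^{N-1} (\mathrm{Tr} A)\, A^{N-1} + (-1)^{N} A^{N} \tag{1.264}
$$
a result known as the Cayley-Hamilton theorem (Arthur Cayley, 1821–1895, and William Hamilton, 1805–1865). This derivation is due to Israel Gelfand (1913–2009) (Gelfand, 1961, pp. 89–90).

Because every $N \times N$ matrix $A$ obeys its characteristic equation, its $N$th power $A^N$ can be expressed as a linear combination of its lesser powers
$$
A^N = (-1)^{N-1} \left[ |A|\, I + p_1 A + p_2 A^2 + \cdots + (-1)^{N-1} (\mathrm{Tr} A)\, A^{N-1} \right]. \tag{1.265}
$$
For instance, the square $A^2$ of every $2 \times 2$ matrix is given by
$$
A^2 = -|A|\, I + (\mathrm{Tr} A)\, A. \tag{1.266}
$$

Example 1.35 (Spin-one-half rotation matrix) If $\boldsymbol\theta$ is a real 3-vector and $\boldsymbol\sigma$ is the 3-vector of Pauli matrices (1.32), then the square of the traceless $2 \times 2$ matrix $A = \boldsymbol\theta \cdot \boldsymbol\sigma$ is
$$
(\boldsymbol\theta \cdot \boldsymbol\sigma)^2 = -|\boldsymbol\theta \cdot \boldsymbol\sigma|\, I
= - \begin{vmatrix} \theta_3 & \theta_1 - i\theta_2 \\ \theta_1 + i\theta_2 & -\theta_3 \end{vmatrix} I = \theta^2 I \tag{1.267}
$$
in which $\theta^2 = \boldsymbol\theta \cdot \boldsymbol\theta$. One may use this identity to show (exercise (1.28)) that
$$
\exp(-i \boldsymbol\theta \cdot \boldsymbol\sigma / 2) = \cos(\theta/2)\, I - i\, \hat{\boldsymbol\theta} \cdot \boldsymbol\sigma \, \sin(\theta/2) \tag{1.268}
$$
in which $\hat{\boldsymbol\theta}$ is a unit 3-vector. For a spin-one-half object, this matrix represents a right-handed rotation of $\theta$ radians about the axis $\hat{\boldsymbol\theta}$.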
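Both the 2 × 2 Cayley-Hamilton relation (1.266) and the rotation identity (1.268) can be checked numerically. The following is a minimal sketch in plain Python under stated assumptions: the helper names (`matmul`, `combine`, `exp_series`), the sample vector `theta`, and the choice of a 40-term truncated power series for the matrix exponential are all illustrative and not taken from the text.

```python
import math

I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(c1, a, c2, b):
    """Linear combination c1*a + c2*b of two 2 x 2 matrices."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(2)] for i in range(2)]

def max_diff(a, b):
    """Largest absolute difference between corresponding entries."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def exp_series(m, terms=40):
    """Truncated power series for exp(m); adequate for small matrices."""
    result, term = I2, I2
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, m)]
        result = combine(1, result, 1, term)
    return result

theta = (0.3, -1.2, 0.7)  # arbitrary sample rotation vector
A = [[sum(t * s[i][j] for t, s in zip(theta, PAULI)) for j in range(2)]
     for i in range(2)]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
tr_A = A[0][0] + A[1][1]

# Cayley-Hamilton for N = 2: A^2 = -|A| I + (Tr A) A
err_ch = max_diff(matmul(A, A), combine(-det_A, I2, tr_A, A))

# exp(-i theta.sigma/2) = cos(theta/2) I - i (theta_hat.sigma) sin(theta/2)
norm = math.sqrt(sum(t * t for t in theta))
lhs = exp_series([[-0.5j * x for x in row] for row in A])
rhs = combine(math.cos(norm / 2), I2, -1j * math.sin(norm / 2) / norm, A)
err_rot = max_diff(lhs, rhs)
```

Both `err_ch` and `err_rot` come out at the level of floating-point rounding, and `det_A` equals minus the squared length of `theta`, as (1.267) requires.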

1.27 Functions of Matrices

What sense can we make of a function $f$ of an $N \times N$ matrix $A$? And how would we compute it? One way is to use the characteristic equation (1.265) to express every power of $A$ in terms of $I, A, \dots, A^{N-1}$ and the coefficients $p_0 = |A|, p_1, p_2, \dots, p_{N-2}$, and $p_{N-1} = (-1)^{N-1} \mathrm{Tr} A$. Then if $f(x)$ is a polynomial or a function with a convergent power series
$$
f(x) = \sum_{k=0}^{\infty} c_k x^k \tag{1.269}
$$
in principle we may express $f(A)$ in terms of $N$ functions $f_k(p)$ of the coefficients $p \equiv (p_0, \dots, p_{N-1})$ as
$$
f(A) = \sum_{k=0}^{N-1} f_k(p)\, A^k. \tag{1.270}
$$
The identity (1.268) for $\exp(-i \boldsymbol\theta \cdot \boldsymbol\sigma/2)$ is an $N = 2$ example of this technique, which can become challenging when $N > 3$.

Example 1.46 Suppose $A$ is the $3 \times 2$ matrix
$$
A = \begin{pmatrix} r_1 & p_1 \\ r_2 & p_2 \\ r_3 & p_3 \end{pmatrix} \tag{1.385}
$$
and the vector $|y\rangle$ is the cross-product $|y\rangle = \mathbf{L} = \mathbf{r} \times \mathbf{p}$. Then no solution $|x\rangle$ exists to the equation $A|x\rangle = |y\rangle$ (unless $\mathbf{r}$ and $\mathbf{p}$ are parallel) because $A|x\rangle$ is a linear combination of the vectors $\mathbf{r}$ and $\mathbf{p}$ while $|y\rangle = \mathbf{L}$ is perpendicular to both $\mathbf{r}$ and $\mathbf{p}$.
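A short numerical illustration of this example follows. It is only a sketch: the sample vectors `r` and `p` and the helper names are assumptions made for illustration. Because $\mathbf{L}$ is orthogonal to every combination $x_1 \mathbf{r} + x_2 \mathbf{p}$, the squared residual $|A x - L|^2$ can never drop below $|L|^2$.

```python
def cross(r, p):
    """Cross product of two 3-vectors."""
    return [r[1] * p[2] - r[2] * p[1],
            r[2] * p[0] - r[0] * p[2],
            r[0] * p[1] - r[1] * p[0]]

def dot(u, v):
    """Real dot product."""
    return sum(a * b for a, b in zip(u, v))

r = [1, 2, 3]    # sample vectors, not parallel
p = [-1, 0, 2]
L = cross(r, p)  # the right-hand side |y> = r x p

def residual_sq(x1, x2):
    """Squared length of A x - L, where the columns of A are r and p."""
    diff = [x1 * ri + x2 * pi - Li for ri, pi, Li in zip(r, p, L)]
    return dot(diff, diff)

floor = dot(L, L)  # the residual can never fall below |L|^2
```

Here `dot(L, r)` and `dot(L, p)` both vanish, and `residual_sq(x1, x2)` stays at or above `floor` for every trial pair, so $A|x\rangle = |y\rangle$ has no solution.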

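Even when the matrix $A$ is square, the equation (1.381) sometimes has no solutions. For instance, if $A$ is a square matrix that vanishes, $A = 0$, then (1.381) has no solutions whenever $|y\rangle \ne 0$. And when $N > M$, as in for instance
$$
\begin{pmatrix} a & b & c \\ d & e & f \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \tag{1.386}
$$
the solution (1.384) is never unique, for we may add to it any linear combination of the vectors $|n\rangle$ that $A$ annihilates for $M < n \le N$
$$
|x\rangle = \sum_{n=1}^{\min(M,N)} \frac{\langle m_n|y\rangle}{S_n}\, |n\rangle + \sum_{n=M+1}^{N} x_n |n\rangle. \tag{1.387}
$$

These are the vectors $|n\rangle$ for $M < n \le N$.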

Example 1.47 (The CKM Matrix) In the standard model, the mass matrices of the $u, c, t$ and $d, s, b$ quarks are $3 \times 3$ complex matrices $M_u$ and $M_d$ with singular-value decompositions $M_u = U_u \Sigma_u V_u^\dagger$ and $M_d = U_d \Sigma_d V_d^\dagger$ whose singular values are the quark masses. The unitary CKM matrix $U_u^\dagger U_d$ (Cabibbo, Kobayashi, Maskawa) describes transitions among the quarks mediated by the $W^\pm$ gauge bosons. By redefining the quark fields, one may make the CKM matrix real, apart from a phase that violates charge-conjugation-parity (CP) symmetry.

The adjoint of a complex symmetric matrix $M$ is its complex conjugate, $M^\dagger = M^*$. So by (1.351), its right singular vectors $|n\rangle$ are the eigenstates of $M^* M$
$$
M^* M |n\rangle = S_n^2 |n\rangle \tag{1.388}
$$
and by (1.366) its left singular vectors $|m_n\rangle$ are the eigenstates of $M M^*$
$$
M M^* |m_n\rangle = (M^* M)^* |m_n\rangle = S_n^2 |m_n\rangle. \tag{1.389}
$$
Thus its left singular vectors are the complex conjugates of its right singular vectors, $|m_n\rangle = |n\rangle^*$. So the unitary matrix $V$ is the complex conjugate of the unitary matrix $U$, and the SVD of $M$ is (Autonne, 1915)
$$
M = U \Sigma U^{\mathsf T}. \tag{1.390}
$$

1.32 The Moore-Penrose Pseudoinverse

Although a matrix $A$ has an inverse $A^{-1}$ if and only if it is square and has a nonzero determinant, one may use the singular-value decomposition to make a pseudoinverse $A^+$ for an arbitrary $M \times N$ matrix $A$. If the singular-value decomposition of the matrix $A$ is
$$
A = U \Sigma V^\dagger \tag{1.391}
$$
then the Moore-Penrose pseudoinverse (Eliakim H. Moore 1862–1932, Roger Penrose 1931–) is
$$
A^+ = V \Sigma^+ U^\dagger \tag{1.392}
$$
in which $\Sigma^+$ is the transpose of the matrix $\Sigma$ with every nonzero entry replaced by its inverse (and the zeros left as they are). One may show that the pseudoinverse $A^+$ satisfies the four relations
$$
\begin{aligned}
A A^+ A &= A & \text{and} \quad A^+ A A^+ &= A^+ \\
(A A^+)^\dagger &= A A^+ & \text{and} \quad (A^+ A)^\dagger &= A^+ A
\end{aligned}
\tag{1.393}
$$
and that it is the only matrix that does so.

Suppose that all the singular values of the $M \times N$ matrix $A$ are positive. In this case, if $A$ has more rows than columns, so that $M > N$, then the product $A^+ A$ is the $N \times N$ identity matrix $I_N$
$$
A^+ A = V \Sigma^+ \Sigma V^\dagger = V I_N V^\dagger = I_N \tag{1.394}
$$
and $A A^+$ is an $M \times M$ matrix that is not the identity matrix $I_M$. If instead $A$ has more columns than rows, so that $N > M$, then $A A^+$ is the $M \times M$ identity matrix $I_M$
$$
A A^+ = U \Sigma \Sigma^+ U^\dagger = U I_M U^\dagger = I_M \tag{1.395}
$$
but $A^+ A$ is an $N \times N$ matrix that is not the identity matrix $I_N$. If the

by setting its derivatives with respect to $\rho_n$, $\lambda_1$, and $\lambda_2$ equal to zero
$$
\frac{\partial L}{\partial \rho_n} = -k (\ln \rho_n + 1) - \lambda_1 - \lambda_2 E_n = 0 \tag{1.339}
$$
$$
\frac{\partial L}{\partial \lambda_1} = \sum_n \rho_n - 1 = 0 \tag{1.340}
$$
$$
\frac{\partial L}{\partial \lambda_2} = \sum_n \rho_n E_n - E = 0. \tag{1.341}
$$
The first (1.339) of these conditions implies that

$$
\rho_n = \exp\left[-(\lambda_1 + \lambda_2 E_n + k)/k\right]. \tag{1.342}
$$

We satisfy the second condition (1.340) by choosing $\lambda_1$ so that
$$
\rho_n = \frac{\exp(-\lambda_2 E_n/k)}{\sum_n \exp(-\lambda_2 E_n/k)}. \tag{1.343}
$$

Setting $\lambda_2 = 1/T$, we define the temperature $T$ so that $\rho$ satisfies the third condition (1.341). Its eigenvalue $\rho_n$ then is
$$
\rho_n = \frac{\exp(-E_n/kT)}{\sum_n \exp(-E_n/kT)}. \tag{1.344}
$$
In terms of the inverse temperature $\beta \equiv 1/(kT)$, the density operator is
$$
\rho = \frac{e^{-\beta H}}{\mathrm{Tr}\,(e^{-\beta H})} \tag{1.345}
$$
which is the Boltzmann distribution.
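As a concrete illustration of (1.344), the sketch below evaluates the Boltzmann probabilities for a made-up three-level system. The energy levels, the temperature, and the function name `boltzmann` are hypothetical choices for illustration; the value of Boltzmann's constant is the exact SI value rather than the older figure quoted elsewhere in the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)

def boltzmann(energies, temperature):
    """Probabilities rho_n = exp(-E_n/kT) / sum_n exp(-E_n/kT)."""
    beta = 1.0 / (K_B * temperature)
    e0 = min(energies)
    # Subtracting the lowest energy leaves every ratio unchanged
    # and avoids overflow or underflow in the exponentials.
    weights = [math.exp(-beta * (e - e0)) for e in energies]
    partition = sum(weights)
    return [w / partition for w in weights]

levels = [0.0, 1.0e-21, 2.0e-21]  # hypothetical energy levels in joules
rho = boltzmann(levels, 300.0)    # room temperature
mean_energy = sum(r * e for r, e in zip(rho, levels))
```

The probabilities sum to one, as the constraint (1.340) demands, and they fall off with energy by the factor `exp(-(E_n - E_m)/kT)` between any two levels.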

1.31 The Singular-Value Decomposition

Every complex $M \times N$ rectangular matrix $A$ is the product of an $M \times M$ unitary matrix $U$, an $M \times N$ rectangular matrix $\Sigma$ that is zero except on its main diagonal which consists of $A$'s nonnegative singular values $S_k$, and an $N \times N$ unitary matrix $V^\dagger$
$$
A = U \Sigma V^\dagger. \tag{1.346}
$$
This singular-value decomposition is a key theorem of matrix algebra.

Suppose $A$ is a linear operator that maps vectors in an $N$-dimensional vector space $V_N$ into vectors in an $M$-dimensional vector space $V_M$. The spaces $V_N$ and $V_M$ will have infinitely many orthonormal bases $\{|n, a\rangle \in V_N\}$ and $\{|m, b\rangle \in V_M\}$ labeled by continuous parameters $a$ and $b$. Each pair of

$V_N$ and $I_M$ for the $M$-dimensional space $V_M$ the sums
$$
I_N = \sum_{n=1}^{N} |n\rangle\langle n| \quad \text{and} \quad I_M = \sum_{n'=1}^{M} |m_{n'}\rangle\langle m_{n'}|. \tag{1.358}
$$
The singular-value decomposition of $A$ then is
$$
A = I_M A I_N = \sum_{n'=1}^{M} \sum_{n=1}^{N} |m_{n'}\rangle \langle m_{n'}|A|n\rangle \langle n|. \tag{1.359}
$$

There are $\min(M, N)$ singular values $S_n$, all nonnegative. For the positive singular values, equations (1.352 & 1.354) show that the matrix element $\langle m_{n'}|A|n\rangle$ vanishes unless $n' = n$
$$
\langle m_{n'}|A|n\rangle = \frac{1}{S_{n'}} \langle A n'|A n\rangle = S_{n'} \delta_{n'n}. \tag{1.360}
$$
For the $Z$ vanishing singular values, equation (1.353) shows that $A|n\rangle = 0$ and so
$$
\langle m_{n'}|A|n\rangle = 0. \tag{1.361}
$$
Thus only the $\min(M, N) - Z$ singular values that are positive contribute to the singular-value decomposition (1.359). If $N > M$, then there can be at most $M$ nonzero eigenvalues $e_n$. If $N \le M$, there can be at most $N$ nonzero $e_n$'s. The final form of the singular-value decomposition then is a sum of dyadics weighted by the positive singular values
$$
A = \sum_{n=1}^{\min(M,N)} |m_n\rangle S_n \langle n| = \sum_{n=1}^{\min(M,N)-Z} |m_n\rangle S_n \langle n|. \tag{1.362}
$$
The vectors $|m_n\rangle$ and $|n\rangle$ respectively are the left and right singular vectors. The nonnegative numbers $S_n$ are the singular values.

The linear operator $A$ maps the $\min(M, N)$ right singular vectors $|n\rangle$ into the $\min(M, N)$ left singular vectors $S_n |m_n\rangle$ scaled by their singular values
$$
A|n\rangle = S_n |m_n\rangle \tag{1.363}
$$
and its adjoint $A^\dagger$ maps the $\min(M, N)$ left singular vectors $|m_n\rangle$ into the $\min(M, N)$ right singular vectors $|n\rangle$ scaled by their singular values
$$
A^\dagger |m_n\rangle = S_n |n\rangle. \tag{1.364}
$$
The $N$-dimensional vector space $V_N$ is the domain of the linear operator $A$. If $N > M$, then $A$ annihilates $N - M + Z$ of the basis vectors $|n\rangle$. The null space or kernel of $A$ is the space spanned by the basis vectors $|n\rangle$ that $A$ annihilates. The vector space spanned by the left singular vectors $|m_n\rangle$ with nonzero singular values $S_n > 0$ is the range or image of $A$. It follows from the singular-value decomposition (1.362) that the dimension $N$ of the domain is equal to the dimension of the kernel $N - M + Z$ plus that of the range $M - Z$, a result called the rank-nullity theorem.

Incidentally, the vectors $|m_n\rangle$ are the eigenstates of the hermitian matrix $A A^\dagger$ as one may see from the explicit product of the expansion (1.362) with its adjoint

$$
\begin{aligned}
A A^\dagger &= \sum_{n=1}^{\min(M,N)} |m_n\rangle S_n \langle n| \sum_{n'=1}^{\min(M,N)} |n'\rangle S_{n'} \langle m_{n'}| \\
&= \sum_{n=1}^{\min(M,N)} \sum_{n'=1}^{\min(M,N)} |m_n\rangle S_n \delta_{nn'} S_{n'} \langle m_{n'}| \\
&= \sum_{n=1}^{\min(M,N)} |m_n\rangle S_n^2 \langle m_n|
\end{aligned}
\tag{1.365}
$$
which shows that $|m_n\rangle$ is an eigenvector of $A A^\dagger$ with eigenvalue $e_n = S_n^2$
$$
A A^\dagger |m_n\rangle = S_n^2 |m_n\rangle. \tag{1.366}
$$
The SVD expansion (1.362) usually is written as a product of three explicit matrices, $A = U \Sigma V^\dagger$. The middle matrix $\Sigma$ is an $M \times N$ matrix with the $\min(M, N)$ singular values $S_n = \sqrt{e_n}$ on its main diagonal and zeros elsewhere. By convention, one writes the $S_n$ in decreasing order with the biggest $S_n$ as entry $\Sigma_{11}$. The first matrix $U$ and the third matrix $V^\dagger$ depend upon the bases one uses to represent the linear operator $A$. If these basis vectors are $|\alpha_k\rangle$ and $|\beta_\ell\rangle$, then
$$
A_{k\ell} = \langle \alpha_k|A|\beta_\ell\rangle = \sum_{n=1}^{\min(M,N)} \langle \alpha_k|m_n\rangle\, S_n\, \langle n|\beta_\ell\rangle \tag{1.367}
$$
so that the $k, n$th entry in the matrix $U$ is $U_{kn} = \langle \alpha_k|m_n\rangle$. The columns of the matrix $U$ are the left singular vectors of the matrix $A$:
$$
\begin{pmatrix} U_{1n} \\ U_{2n} \\ \vdots \\ U_{Mn} \end{pmatrix}
= \begin{pmatrix} \langle \alpha_1|m_n\rangle \\ \langle \alpha_2|m_n\rangle \\ \vdots \\ \langle \alpha_M|m_n\rangle \end{pmatrix}. \tag{1.368}
$$
Similarly, the $n, \ell$th entry of the matrix $V^\dagger$ is $(V^\dagger)_{n,\ell} = \langle n|\beta_\ell\rangle$. Thus $V_{\ell,n} =$

Exercises

The eight states of the system $|t, u, v\rangle \equiv (a_1^\dagger)^t (a_2^\dagger)^u (a_3^\dagger)^v |0, 0, 0\rangle$. We can represent them by eight 8-vectors each of which has seven 0's with a 1 in position $4t + 2u + v + 1$. How big should the matrices that represent the creation and annihilation operators be? Write down the three matrices that represent the three creation operators.

1.38 Show that the Schwarz inner product (1.430) is degenerate because it can violate (1.79) for certain density operators and certain pairs of states.

1.39 Show that the Schwarz inner product (1.431) is degenerate because it can violate (1.79) for certain density operators and certain pairs of operators.

1.40 The coherent state $|\{\alpha_k\}\rangle$ is an eigenstate of the annihilation operator $a_k$ with eigenvalue $\alpha_k$ for each mode $k$ of the electromagnetic field, $a_k |\{\alpha_k\}\rangle = \alpha_k |\{\alpha_k\}\rangle$. The positive-frequency part $E_i^{(+)}(x)$ of the electric field is a linear combination of the annihilation operators
$$
E_i^{(+)}(x) = \sum_k a_k\, E_i^{(+)}(k)\, e^{i(kx - \omega t)}. \tag{1.453}
$$
Show that $|\{\alpha_k\}\rangle$ is an eigenstate of $E_i^{(+)}(x)$ as in (1.442) and find its eigenvalue $E_i(x)$.

Fourier Series

Moreover if $f^{(k+1)}$ is piecewise continuous, then
$$
\begin{aligned}
f_n &= \int_{-\pi}^{\pi} \left\{ \frac{d}{dx} \left[ f^{(k)}(x)\, \frac{e^{-inx}}{-(in)^{k+1}} \right] - f^{(k+1)}(x)\, \frac{e^{-inx}}{-(in)^{k+1}} \right\} dx \\
&= \int_{-\pi}^{\pi} f^{(k+1)}(x)\, \frac{e^{-inx}}{(in)^{k+1}}\, dx.
\end{aligned}
\tag{2.55}
$$
Since $f^{(k+1)}(x)$ is piecewise continuous on the closed interval $[-\pi, \pi]$, it is bounded there in absolute value by, let us say, $M$. So the Fourier coefficients of a $C^k$ periodic function with $f^{(k+1)}$ piecewise continuous are bounded by
$$
|f_n| \le \frac{1}{n^{k+1}} \int_{-\pi}^{\pi} |f^{(k+1)}(x)|\, dx \le \frac{2\pi M}{n^{k+1}}. \tag{2.56}
$$
We often can carry this derivation one step further. In most simple examples, the piecewise continuous periodic function $f^{(k+1)}(x)$ actually is piecewise continuously differentiable between its successive jumps at $x_j$. In this case, the derivative $f^{(k+2)}(x)$ is a piecewise continuous function plus a sum of a finite number of delta functions with finite coefficients. Thus we can integrate once more by parts.

If for instance the function $f^{(k+1)}(x)$ jumps $J$ times between $-\pi$ and $\pi$ by $\Delta f_j^{(k+1)}$, then its Fourier coefficients are
$$
\begin{aligned}
f_n &= \int_{-\pi}^{\pi} f^{(k+2)}(x)\, \frac{e^{-inx}}{(in)^{k+2}}\, dx \\
&= \sum_{j=1}^{J} \int_{x_j}^{x_{j+1}} f_s^{(k+2)}(x)\, \frac{e^{-inx}}{(in)^{k+2}}\, dx + \sum_{j=1}^{J} \Delta f_j^{(k+1)}\, \frac{e^{-inx_j}}{(in)^{k+2}}
\end{aligned}
\tag{2.57}
$$
in which the subscript $s$ means that we've separated out the delta functions. The Fourier coefficients then are bounded by
$$
|f_n| \le \frac{2\pi M}{n^{k+2}} \tag{2.58}
$$
in which $M$ is related to the maximum absolute values of $f_s^{(k+2)}(x)$ and of the $\Delta f_j^{(k+1)}$. The Fourier series of periodic $C^k$ functions converge very rapidly if $k$ is big.

Example 2.6 (Fourier Series of a $C^0$ Function) The function defined by
$$
f(x) = \begin{cases}
0 & -\pi \le x < 0 \\
x & 0 \le x < \pi/2 \\
\pi - x & \pi/2 \le x \le \pi
\end{cases}
\tag{2.59}
$$
is continuous on the interval $[-\pi, \pi]$ and its first derivative is piecewise

with coefficients
$$
f_n = \frac{2}{L} \int_0^L \sin\frac{n\pi x}{L}\, f(x)\, dx \tag{2.145}
$$
and the representation
$$
\sum_{m=-\infty}^{\infty} \delta(x - z - 2mL) = \frac{2}{L} \sum_{n=1}^{\infty} \sin\frac{n\pi x}{L}\, \sin\frac{n\pi z}{L} \tag{2.146}
$$
for the Dirac comb on $S_L$.

2.13 Periodic Boundary Conditions

Periodic boundary conditions are often convenient. For instance, rather than study an infinitely long one-dimensional system, we might study the same system, but of length $L$. The ends cause effects not present in the infinite system. To avoid them, we imagine that the system forms a circle and impose the periodic boundary condition
$$
\psi(x \pm L, t) = \psi(x, t). \tag{2.147}
$$
In three dimensions, the analogous conditions are
$$
\begin{aligned}
\psi(x \pm L, y, z, t) &= \psi(x, y, z, t) \\
\psi(x, y \pm L, z, t) &= \psi(x, y, z, t) \\
\psi(x, y, z \pm L, t) &= \psi(x, y, z, t).
\end{aligned}
\tag{2.148}
$$
The eigenstates $|\mathbf{p}\rangle$ of the free hamiltonian $H = \mathbf{p}^2/2m$ have wave functions
$$
\psi_{\mathbf{p}}(\mathbf{x}) = \langle \mathbf{x}|\mathbf{p}\rangle = e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}/(2\pi\hbar)^{3/2}. \tag{2.149}
$$
The periodic boundary conditions (2.148) require that each component $p_i$ of momentum satisfy $L p_i/\hbar = 2\pi n_i$ or
$$
\mathbf{p} = \frac{2\pi\hbar\, \mathbf{n}}{L} = \frac{h\, \mathbf{n}}{L} \tag{2.150}
$$
where $\mathbf{n}$ is a vector of integers, which may be positive or negative or zero.

Periodic boundary conditions arise naturally in the study of solids. The atoms of a perfect crystal are at the vertices of a Bravais lattice
$$
\mathbf{x}_i = \mathbf{x}_0 + \sum_{i=1}^{3} n_i \mathbf{a}_i \tag{2.151}
$$
in which the three vectors $\mathbf{a}_i$ are the primitive vectors of the lattice and the $n_i$ are three integers. The hamiltonian of such an infinite crystal is invariant under translations in space by
$$
\sum_{i=1}^{3} n_i \mathbf{a}_i. \tag{2.152}
$$
To keep the notation simple, let's restrict ourselves to a cubic lattice with lattice spacing $a$. Then since the momentum operator $\mathbf{p}$ generates translations in space, the invariance of $H$ under translations by $a\,\mathbf{n}$
$$
\exp(i a\, \mathbf{n}\cdot\mathbf{p})\, H \exp(-i a\, \mathbf{n}\cdot\mathbf{p}) = H \tag{2.153}
$$
implies that $\exp(i a\, \mathbf{n}\cdot\mathbf{p})$ and $H$ are compatible observables, $[\exp(i a\, \mathbf{n}\cdot\mathbf{p}), H] = 0$. As explained in section 1.30, it follows that we may choose the eigenstates of $H$ also to be eigenstates of $\mathbf{p}$
$$
e^{i a\, \mathbf{p}\cdot\mathbf{n}/\hbar}\, |\psi\rangle = e^{i a\, \mathbf{k}\cdot\mathbf{n}}\, |\psi\rangle \tag{2.154}
$$
which implies that
$$
\psi(\mathbf{x} + a\mathbf{n}, t) = e^{i a\, \mathbf{k}\cdot\mathbf{n}}\, \psi(\mathbf{x}, t). \tag{2.155}
$$
Setting
$$
\psi(\mathbf{x}) = e^{i \mathbf{k}\cdot\mathbf{x}}\, u(\mathbf{x}) \tag{2.156}
$$
we see that condition (2.155) implies that $u(\mathbf{x})$ is periodic
$$
u(\mathbf{x} + a\mathbf{n}) = u(\mathbf{x}). \tag{2.157}
$$
For a general Bravais lattice, this Born–von Karman periodic boundary condition is
$$
u\left(\mathbf{x} + \sum_{i=1}^{3} n_i \mathbf{a}_i, t\right) = u(\mathbf{x}, t). \tag{2.158}
$$
Equations (2.155) and (2.157) are known as Bloch's theorem.
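The relation between the periodicity (2.157) of $u$ and the phase rule (2.155) for $\psi$ can be illustrated in one dimension. The sketch below is only an example under stated assumptions: the lattice spacing `a`, the wave number `k`, and the particular periodic function `u` are arbitrary choices made for illustration.

```python
import cmath
import math

a = 1.0  # lattice spacing (arbitrary units)
k = 0.7  # wave number labeling the state (arbitrary)

def u(x):
    """A sample function with the period of the lattice, u(x + a) = u(x)."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * x / a)

def psi(x):
    """Wave function of Bloch form, psi(x) = exp(i k x) u(x)."""
    return cmath.exp(1j * k * x) * u(x)

def bloch_defect(x, n):
    """|psi(x + n a) - exp(i k n a) psi(x)|, which should vanish."""
    return abs(psi(x + n * a) - cmath.exp(1j * k * n * a) * psi(x))

defects = [bloch_defect(x, n) for x in (0.0, 0.37, 2.5) for n in (-2, 1, 3)]
```

Every entry of `defects` is zero up to rounding error: translating by `n` lattice spacings only multiplies $\psi$ by the phase $e^{ikna}$.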

Exercises

2.1 Show that $\sin \omega_1 x + \sin \omega_2 x$ is the same as (2.9).

2.2 Find the Fourier series for the function $\exp(ax)$ on the interval $-\pi < x \le \pi$.

2.3 Find the Fourier series for the function $(x^2 - \pi^2)^2$ on the same interval $(-\pi, \pi]$.

2.4 Find the Fourier series for the function $(1 + \cos x) \sin ax$ on the interval $(-\pi, \pi]$.

Fourier and Laplace Transforms

One often needs to relate a function's Fourier series to its Fourier transform. So let's compare the Fourier series (3.1) for the function $f(x)$ on the interval $[-L/2, L/2]$ with its Fourier transform (3.9) in the limit of large $L$
$$
f(x) = \sum_{n=-\infty}^{\infty} f_n\, \frac{e^{i 2\pi n x/L}}{\sqrt{L}} = \sum_{n=-\infty}^{\infty} f_n\, \frac{e^{i k_n x}}{\sqrt{L}} = \int_{-\infty}^{\infty} \tilde f(k)\, e^{ikx}\, \frac{dk}{\sqrt{2\pi}} \tag{3.12}
$$
in which $k_n = 2\pi n/L = 2\pi y/L$. Now $f_n = \hat f(y)$, and so by the definition (3.6) of $\tilde f(k)$, we have $f_n = \hat f(Lk/2\pi) = \sqrt{2\pi/L}\, \tilde f(k)$. Thus, to get the Fourier series from the Fourier transform, we multiply the series by $2\pi/L$ and use the Fourier transform at $k_n$ divided by $\sqrt{2\pi}$
$$
f(x) = \frac{1}{\sqrt{L}} \sum_{n=-\infty}^{\infty} f_n\, e^{i k_n x} = \frac{2\pi}{L} \sum_{n=-\infty}^{\infty} \frac{\tilde f(k_n)}{\sqrt{2\pi}}\, e^{i k_n x}. \tag{3.13}
$$
Going the other way, we have
$$
f(x) = \int_{-\infty}^{\infty} \tilde f(k)\, e^{ikx}\, \frac{dk}{\sqrt{2\pi}} = \frac{L}{2\pi} \int_{-\infty}^{\infty} \frac{f[Lk/2\pi]}{\sqrt{L}}\, e^{ikx}\, dk. \tag{3.14}
$$

Example 3.1 (The Fourier Transform of a Gaussian Is a Gaussian) The Fourier transform of the gaussian $f(x) = \exp(-m^2 x^2)$ is
$$
\tilde f(k) = \int_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi}}\, e^{-ikx}\, e^{-m^2 x^2}. \tag{3.15}
$$
We complete the square in the exponent:
$$
\tilde f(k) = e^{-k^2/4m^2} \int_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi}}\, e^{-m^2 (x + ik/2m^2)^2}. \tag{3.16}
$$
As we shall see in Sec. 5.14 when we study analytic functions, we may shift $x$ to $x - ik/2m^2$, so the term $ik/2m^2$ in the exponential has no effect on the value of the $x$-integral.
$$
\tilde f(k) = e^{-k^2/4m^2} \int_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi}}\, e^{-m^2 x^2} = \frac{1}{\sqrt{2}\, m}\, e^{-k^2/4m^2}. \tag{3.17}
$$
Thus, the Fourier transform of a gaussian is another gaussian
$$
\tilde f(k) = \int_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi}}\, e^{-ikx}\, e^{-m^2 x^2} = \frac{1}{\sqrt{2}\, m}\, e^{-k^2/4m^2}. \tag{3.18}
$$
But the two gaussians are very different: if the gaussian $f(x) = \exp(-m^2 x^2)$ decreases slowly as $x \to \infty$ because $m$ is small (or quickly because $m$ is big), then its gaussian Fourier transform $\tilde f(k) = \exp(-k^2/4m^2)/m\sqrt{2}$ decreases quickly as $k \to \infty$ because $m$ is small (or slowly because $m$ is big).

If we generalize the relations (3.12–3.14) between Fourier series and transforms from one to $n$ dimensions, then we find that the Fourier series corresponding to the Fourier transform (3.94) is
$$
f(x_1, \dots, x_n) = \sum_{j_1=-\infty}^{\infty} \cdots \sum_{j_n=-\infty}^{\infty} \left(\frac{2\pi}{L}\right)^{n} e^{i(k_{j_1} x_1 + \cdots + k_{j_n} x_n)}\, \frac{\tilde f(k_{j_1}, \dots, k_{j_n})}{(2\pi)^{n/2}} \tag{3.95}
$$
in which $k_{j_\ell} = 2\pi j_\ell/L$. Thus, for $n = 3$ we have
$$
f(\mathbf{x}) = \sum_{j_1=-\infty}^{\infty} \sum_{j_2=-\infty}^{\infty} \sum_{j_3=-\infty}^{\infty} \frac{(2\pi)^3}{V}\, e^{i \mathbf{k}_j \cdot \mathbf{x}}\, \frac{\tilde f(\mathbf{k}_j)}{(2\pi)^{3/2}} \tag{3.96}
$$
in which $\mathbf{k}_j = (k_{j_1}, k_{j_2}, k_{j_3})$ and $V = L^3$ is the volume of the box.

Example 3.8 (The Feynman Propagator) For a spinless quantum field of mass $m$, Feynman's propagator is the four-dimensional Fourier transform
$$
\Delta_F(x) = \int \frac{\exp(ik \cdot x)}{k^2 + m^2 - i\epsilon}\, \frac{d^4k}{(2\pi)^4} \tag{3.97}
$$
where $k \cdot x = \mathbf{k}\cdot\mathbf{x} - k^0 x^0$, all physical quantities are in natural units ($c = \hbar = 1$), and $x^0 = ct = t$. The tiny imaginary term $-i\epsilon$ makes $\Delta_F(x - y)$ proportional to the mean value in the vacuum state $|0\rangle$ of the time-ordered product of the fields $\phi(x)$ and $\phi(y)$ (section 5.34)
$$
\begin{aligned}
-i \Delta_F(x - y) &= \langle 0|T[\phi(x)\phi(y)]|0\rangle \\
&\equiv \theta(x^0 - y^0)\, \langle 0|\phi(x)\phi(y)|0\rangle + \theta(y^0 - x^0)\, \langle 0|\phi(y)\phi(x)|0\rangle
\end{aligned}
\tag{3.98}
$$
in which $\theta(a) = (a + |a|)/2|a|$ is the Heaviside function (2.166).

3.6 Convolutions

The convolution of $f(x)$ with $g(x)$ is the integral
$$
f * g(x) = \int_{-\infty}^{\infty} \frac{dy}{\sqrt{2\pi}}\, f(x - y)\, g(y). \tag{3.99}
$$

The convolution product is symmetric
$$
f * g(x) = g * f(x) \tag{3.100}
$$
because setting $z = x - y$, we have
$$
\begin{aligned}
f * g(x) &= \int_{-\infty}^{\infty} \frac{dy}{\sqrt{2\pi}}\, f(x - y)\, g(y) = -\int_{\infty}^{-\infty} \frac{dz}{\sqrt{2\pi}}\, f(z)\, g(x - z) \\
&= \int_{-\infty}^{\infty} \frac{dz}{\sqrt{2\pi}}\, g(x - z)\, f(z) = g * f(x).
\end{aligned}
\tag{3.101}
$$
Convolutions may look strange at first, but they often occur in physics in the three-dimensional form

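$$
F(\mathbf{x}) = \int G(\mathbf{x} - \mathbf{x}')\, S(\mathbf{x}')\, d^3x' \tag{3.102}
$$
in which $G$ is a Green's function and $S$ is a source.

Example 3.9 (Gauss's Law) The divergence of the electric field $\mathbf{E}$ is the microscopic charge density $\rho$ divided by the electric permittivity of the vacuum $\epsilon_0 = 8.854 \times 10^{-12}$ F/m, that is, $\nabla \cdot \mathbf{E} = \rho/\epsilon_0$. This constraint is known as Gauss's law. If the charges and fields are independent of time, then the electric field $\mathbf{E}$ is the gradient of a scalar potential $\mathbf{E} = -\nabla\phi$. These last two equations imply that $\phi$ obeys Poisson's equation
$$
-\nabla^2 \phi = \frac{\rho}{\epsilon_0}. \tag{3.103}
$$
We may solve this equation by using Fourier transforms as described in Sec. 3.13. If $\tilde\phi(\mathbf{k})$ and $\tilde\rho(\mathbf{k})$ respectively are the Fourier transforms of $\phi(\mathbf{x})$ and $\rho(\mathbf{x})$, then Poisson's differential equation (3.103) gives
$$
\begin{aligned}
-\nabla^2 \phi(\mathbf{x}) &= -\nabla^2 \int e^{i\mathbf{k}\cdot\mathbf{x}}\, \tilde\phi(\mathbf{k})\, d^3k = \int k^2\, e^{i\mathbf{k}\cdot\mathbf{x}}\, \tilde\phi(\mathbf{k})\, d^3k \\
&= \frac{\rho(\mathbf{x})}{\epsilon_0} = \int e^{i\mathbf{k}\cdot\mathbf{x}}\, \frac{\tilde\rho(\mathbf{k})}{\epsilon_0}\, d^3k
\end{aligned}
\tag{3.104}
$$
which implies the algebraic equation $\tilde\phi(\mathbf{k}) = \tilde\rho(\mathbf{k})/\epsilon_0 k^2$ which is an instance of (3.163). Performing the inverse Fourier transformation, we find for the scalar potential
$$
\begin{aligned}
\phi(\mathbf{x}) &= \int e^{i\mathbf{k}\cdot\mathbf{x}}\, \tilde\phi(\mathbf{k})\, d^3k = \int e^{i\mathbf{k}\cdot\mathbf{x}}\, \frac{\tilde\rho(\mathbf{k})}{\epsilon_0 k^2}\, d^3k \\
&= \int e^{i\mathbf{k}\cdot\mathbf{x}}\, \frac{1}{k^2} \int e^{-i\mathbf{k}\cdot\mathbf{x}'}\, \frac{\rho(\mathbf{x}')}{\epsilon_0}\, \frac{d^3x'\, d^3k}{(2\pi)^3} = \int G(\mathbf{x} - \mathbf{x}')\, \frac{\rho(\mathbf{x}')}{\epsilon_0}\, d^3x'
\end{aligned}
\tag{3.105}
$$
in which
$$
G(\mathbf{x} - \mathbf{x}') = \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{k^2}\, e^{i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}')}. \tag{3.106}
$$
This function $G(\mathbf{x} - \mathbf{x}')$ is the Green's function for the differential operator $-\nabla^2$ in the sense that
$$
-\nabla^2 G(\mathbf{x} - \mathbf{x}') = \int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}')} = \delta^{(3)}(\mathbf{x} - \mathbf{x}'). \tag{3.107}
$$
This Green's function ensures that expression (3.105) for $\phi(\mathbf{x})$ satisfies Poisson's equation (3.103). To integrate (3.106) and compute $G(\mathbf{x} - \mathbf{x}')$, we use spherical coordinates with the $z$-axis parallel to the vector $\mathbf{x} - \mathbf{x}'$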

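$$
\begin{aligned}
G(\mathbf{x} - \mathbf{x}') &= \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{k^2}\, e^{i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}')} = \int_0^\infty \frac{dk}{(2\pi)^2} \int_{-1}^{1} d\cos\theta\; e^{ik|\mathbf{x} - \mathbf{x}'|\cos\theta} \\
&= \int_0^\infty \frac{dk}{(2\pi)^2}\, \frac{e^{ik|\mathbf{x} - \mathbf{x}'|} - e^{-ik|\mathbf{x} - \mathbf{x}'|}}{ik|\mathbf{x} - \mathbf{x}'|} \\
&= \frac{1}{2\pi^2 |\mathbf{x} - \mathbf{x}'|} \int_0^\infty \frac{\sin k|\mathbf{x} - \mathbf{x}'|}{k}\, dk = \frac{1}{2\pi^2 |\mathbf{x} - \mathbf{x}'|} \int_0^\infty \frac{\sin k}{k}\, dk.
\end{aligned}
\tag{3.108}
$$
In example 5.35 of section 5.34 on Cauchy's principal value, we'll show that
$$
\int_0^\infty \frac{\sin k}{k}\, dk = \frac{\pi}{2}. \tag{3.109}
$$
Using this result, we have
$$
\int \frac{d^3k}{(2\pi)^3}\, \frac{1}{k^2}\, e^{i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}')} = G(\mathbf{x} - \mathbf{x}') = \frac{1}{4\pi |\mathbf{x} - \mathbf{x}'|}. \tag{3.110}
$$
Finally by substituting this formula for $G(\mathbf{x} - \mathbf{x}')$ into Eq. (3.105), we find that the Fourier transform $\phi(\mathbf{x})$ of the product $\tilde\rho(\mathbf{k})/k^2$ of the functions $\tilde\rho(\mathbf{k})$ and $1/k^2$ is the convolution
$$
\phi(\mathbf{x}) = \frac{1}{4\pi\epsilon_0} \int \frac{\rho(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\, d^3x' \tag{3.111}
$$
of their Fourier transforms $1/|\mathbf{x} - \mathbf{x}'|$ and $\rho(\mathbf{x}')$. The Fourier transform of the product of any two functions is the convolution of their Fourier transforms, as we'll see in the next section. (George Green 1793–1841)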

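Example 3.10 (The Magnetic Vector Potential) The magnetic induction $\mathbf{B}$ has zero divergence (as long as there are no magnetic monopoles) and so may be written as the curl $\mathbf{B} = \nabla \times \mathbf{A}$ of a vector potential $\mathbf{A}$. For time-independent currents, Ampère's law is $\nabla \times \mathbf{B} = \mu_0 \mathbf{J}$ in which $\mu_0 = 1/(\epsilon_0 c^2) = 4\pi \times 10^{-7}$ N A$^{-2}$ is the permeability of the vacuum. It follows that in the Coulomb gauge $\nabla \cdot \mathbf{A} = 0$, the magnetostatic vector potential $\mathbf{A}$ satisfies the equation
$$
\nabla \times \mathbf{B} = \nabla \times (\nabla \times \mathbf{A}) = \nabla(\nabla \cdot \mathbf{A}) - \nabla^2 \mathbf{A} = -\nabla^2 \mathbf{A} = \mu_0 \mathbf{J}. \tag{3.112}
$$

Applying the Fourier-transform technique (3.103–3.111), we find that the Fourier transforms of $\mathbf{A}$ and $\mathbf{J}$ satisfy the algebraic equation
$$
\tilde{\mathbf{A}}(\mathbf{k}) = \mu_0\, \frac{\tilde{\mathbf{J}}(\mathbf{k})}{k^2} \tag{3.113}
$$
which is an instance of (3.163). Performing the inverse Fourier transform, we see that $\mathbf{A}$ is the convolution
$$
\mathbf{A}(\mathbf{x}) = \frac{\mu_0}{4\pi} \int d^3x'\, \frac{\mathbf{J}(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}. \tag{3.114}
$$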

If in the solution (3.111) of Poisson's equation, $\rho(\mathbf{x})$ is translated by $\mathbf{a}$, then so is $\phi(\mathbf{x})$. That is, if $\rho'(\mathbf{x}) = \rho(\mathbf{x} + \mathbf{a})$ then $\phi'(\mathbf{x}) = \phi(\mathbf{x} + \mathbf{a})$. Similarly, if the current $\mathbf{J}(\mathbf{x})$ in (3.114) is translated by $\mathbf{a}$, then so is the potential $\mathbf{A}(\mathbf{x})$. Convolutions respect translational invariance. That's one reason why they occur so often in the formulas of physics.

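3.7 The Fourier Transform of a Convolution

The Fourier transform of the convolution $f * g$ is the product of the Fourier transforms $\tilde f$ and $\tilde g$:
$$
\widetilde{f * g}(k) = \tilde f(k)\, \tilde g(k). \tag{3.115}
$$
To see why, we form the Fourier transform $\widetilde{f * g}(k)$ of the convolution $f * g(x)$
$$
\begin{aligned}
\widetilde{f * g}(k) &= \int_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi}}\, e^{-ikx}\, f * g(x) \\
&= \int_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi}}\, e^{-ikx} \int_{-\infty}^{\infty} \frac{dy}{\sqrt{2\pi}}\, f(x - y)\, g(y).
\end{aligned}
\tag{3.116}
$$

Now we write $f(x - y)$ and $g(y)$ in terms of their Fourier transforms $\tilde f(p)$ and $\tilde g(q)$
$$
\widetilde{f * g}(k) = \int_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi}}\, e^{-ikx} \int_{-\infty}^{\infty} \frac{dy}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{dp}{\sqrt{2\pi}}\, \tilde f(p)\, e^{ip(x - y)} \int_{-\infty}^{\infty} \frac{dq}{\sqrt{2\pi}}\, \tilde g(q)\, e^{iqy} \tag{3.117}
$$
and use the representation (3.36) of Dirac's delta function twice to get
$$
\begin{aligned}
\widetilde{f * g}(k) &= \int_{-\infty}^{\infty} \frac{dy}{2\pi} \int_{-\infty}^{\infty} dp \int_{-\infty}^{\infty} dq\; \delta(p - k)\, \tilde f(p)\, \tilde g(q)\, e^{i(q - p)y} \\
&= \int_{-\infty}^{\infty} dp \int_{-\infty}^{\infty} dq\; \delta(p - k)\, \delta(q - p)\, \tilde f(p)\, \tilde g(q) \\
&= \int_{-\infty}^{\infty} dp\; \delta(p - k)\, \tilde f(p)\, \tilde g(p) = \tilde f(k)\, \tilde g(k)
\end{aligned}
\tag{3.118}
$$
which is (3.115). Examples 3.9 and 3.10 were illustrations of this result.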
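The convolution theorem (3.115) can be tested numerically with two gaussians, whose transforms are known in closed form from (3.18). The sketch below approximates the integrals by Riemann sums on a uniform grid; the grid spacing, the cutoff at |x| = 8, the gaussian widths, and all function names are illustrative assumptions rather than anything prescribed by the text.

```python
import cmath
import math

SQRT_2PI = math.sqrt(2 * math.pi)

def gaussian(m):
    """Return the function f(x) = exp(-m^2 x^2)."""
    return lambda x: math.exp(-(m * x) ** 2)

def gaussian_ft(m, k):
    """Closed-form transform of exp(-m^2 x^2), as in equation (3.18)."""
    return math.exp(-k * k / (4 * m * m)) / (math.sqrt(2) * m)

def convolve(f, g, x, grid, h):
    """Riemann-sum estimate of f*g(x) with the 1/sqrt(2 pi) convention."""
    return sum(f(x - y) * g(y) for y in grid) * h / SQRT_2PI

def fourier(values, grid, h, k):
    """Riemann-sum estimate of the Fourier transform at wave number k."""
    return sum(v * cmath.exp(-1j * k * x)
               for v, x in zip(values, grid)) * h / SQRT_2PI

h = 0.05
grid = [i * h for i in range(-160, 161)]  # covers [-8, 8]
f, g = gaussian(1.0), gaussian(1.5)

conv = [convolve(f, g, x, grid, h) for x in grid]
k = 0.8
lhs = fourier(conv, grid, h, k)                     # transform of f*g
rhs = gaussian_ft(1.0, k) * gaussian_ft(1.5, k)     # product of transforms
```

For these rapidly decaying functions the Riemann sums are very accurate, so `lhs` and `rhs` agree to many digits, and swapping `f` and `g` in `convolve` changes nothing, in line with the symmetry (3.100).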

3.8 Fourier Transforms and Green's Functions

A Green's function $G(x)$ for a differential operator $P$ turns into a delta function when acted upon by $P$, that is, $P G(x) = \delta(x)$. If the differential operator is a polynomial $P(\partial) \equiv P(\partial_1, \dots, \partial_n)$ in the derivatives $\partial_1, \dots, \partial_n$ with constant coefficients, then a suitable Green's function $G(x) \equiv G(x_1, \dots, x_n)$ will satisfy
$$
P(\partial)\, G(x) = \delta^{(n)}(x). \tag{3.119}
$$
Expressing both $G(x)$ and $\delta^{(n)}(x)$ as Fourier transforms, we get
$$
P(\partial)\, G(x) = \int d^nk\; P(ik)\, e^{ik\cdot x}\, \tilde G(k) = \delta^{(n)}(x) = \int \frac{d^nk}{(2\pi)^n}\, e^{ik\cdot x} \tag{3.120}
$$
which gives us the algebraic equation
$$
\tilde G(k) = \frac{1}{(2\pi)^n P(ik)}. \tag{3.121}
$$

Thus the Green's function $G_P$ for the differential operator $P(\partial)$ is
$$
G_P(x) = \int \frac{d^nk}{(2\pi)^n}\, \frac{e^{ik\cdot x}}{P(ik)}. \tag{3.122}
$$

Example 3.11 (Green and Yukawa) In 1935, Hideki Yukawa (1907–1981) proposed the partial differential equation
$$
P_Y(\partial)\, G_Y(x) \equiv (-\Delta + m^2)\, G_Y(x) = (-\nabla^2 + m^2)\, G_Y(x) = \delta(x). \tag{3.123}
$$
Our formula (3.122) gives as the Green's function for $P_Y(\partial)$ the Yukawa potential
$$
G_Y(x) = \int \frac{d^3k}{(2\pi)^3}\, \frac{e^{i\mathbf{k}\cdot\mathbf{x}}}{P_Y(ik)} = \int \frac{d^3k}{(2\pi)^3}\, \frac{e^{i\mathbf{k}\cdot\mathbf{x}}}{k^2 + m^2} = \frac{e^{-mr}}{4\pi r} \tag{3.124}
$$
an integration done in example 5.21.

Infinite Series

One may extend the definition (4.36) of $n$-factorial from positive integers to complex numbers by means of the integral formula
$$
z! \equiv \int_0^\infty e^{-t}\, t^z\, dt \tag{4.53}
$$
for $\mathrm{Re}\, z > -1$. In particular
$$
0! = \int_0^\infty e^{-t}\, dt = 1 \tag{4.54}
$$
which explains the definition (4.37). The factorial function $(z - 1)!$ in turn defines the gamma function for $\mathrm{Re}\, z > 0$ as
$$
\Gamma(z) = \int_0^\infty e^{-t}\, t^{z-1}\, dt = (z - 1)! \tag{4.55}
$$
as may be seen from (4.53). By differentiating this formula and integrating it by parts, we see that the gamma function satisfies the key identity
$$
\begin{aligned}
\Gamma(z + 1) &= \int_0^\infty e^{-t}\, t^z\, dt = \int_0^\infty \left( -\frac{d}{dt} e^{-t} \right) t^z\, dt = \int_0^\infty e^{-t} \left( \frac{d}{dt} t^z \right) dt = \int_0^\infty e^{-t}\, z\, t^{z-1}\, dt \\
&= z\, \Gamma(z).
\end{aligned}
\tag{4.56}
$$

Since $\Gamma(1) = 0! = 1$, we may use this identity (4.56) to extend the definition (5.102) of the gamma function in unit steps into the left half-plane

$$\Gamma(z) = \frac{1}{z}\, \Gamma(z+1) = \frac{1}{z}\, \frac{1}{z+1}\, \Gamma(z+2) = \frac{1}{z}\, \frac{1}{z+1}\, \frac{1}{z+2}\, \Gamma(z+3) = \dots \tag{4.57}$$

as long as we avoid the negative integers and zero. This extension leads to Euler's definition

$$\Gamma(z) = \lim_{n\to\infty} \frac{1 \cdot 2 \cdot 3 \cdots n}{z(z+1)(z+2) \cdots (z+n)}\, n^z \tag{4.58}$$

and to Weierstrass's (exercise 4.6)

$$\Gamma(z) = \frac{1}{z}\, e^{-\gamma z} \left[ \prod_{n=1}^{\infty} \left(1 + \frac{z}{n}\right) e^{-z/n} \right]^{-1} \tag{4.59}$$

(Karl Theodor Wilhelm Weierstrass, 1815–1897), and is an example of analytic continuation (section 5.12). One may show (exercise 4.8) that another formula for $\Gamma(z)$ is

$$\Gamma(z) = 2 \int_0^\infty e^{-t^2}\, t^{2z-1}\, dt \tag{4.60}$$

for $\mathrm{Re}\, z > 0$ and that

$$\Gamma\!\left(n + \tfrac{1}{2}\right) = \frac{(2n)!}{n!\, 2^{2n}}\, \sqrt{\pi} \tag{4.61}$$

which implies (exercise 4.11) that

$$\Gamma\!\left(n + \tfrac{1}{2}\right) = \frac{(2n-1)!!}{2^n}\, \sqrt{\pi}. \tag{4.62}$$
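The formulas above are easy to test against a library gamma function. The sketch below, which uses Python's standard `math.gamma` as the reference, evaluates Euler's product (4.58) at a finite n and the closed form (4.61); the function names and the choice of test points are illustrative assumptions.

```python
import math

def gamma_euler(z, n):
    """Euler's product (4.58) truncated at a finite n (converges like 1/n)."""
    value = n ** z / z
    for j in range(1, n + 1):
        value *= j / (z + j)
    return value

def gamma_half_integer(n):
    """Gamma(n + 1/2) from the closed form (4.61)."""
    return (math.factorial(2 * n)
            / (math.factorial(n) * 2 ** (2 * n)) * math.sqrt(math.pi))

if __name__ == "__main__":
    # Key identity (4.56): Gamma(z + 1) = z Gamma(z)
    z = 2.3
    print(math.gamma(z + 1), z * math.gamma(z))
    # Euler's limit also works in the left half-plane, away from 0, -1, -2, ...
    print(gamma_euler(-1.5, 200000), math.gamma(-1.5))
    # Half-integer values
    print(gamma_half_integer(3), math.gamma(3.5))
```

Because the truncated product approaches its limit only like 1/n, a large n is needed for a few digits of agreement.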

Example 4.7 (Bessel Function of nonintegral index) We can use the gamma-function formula (4.55) for $n!$ to extend the definition (4.49) of the Bessel function of the first kind $J_n(\rho)$ to nonintegral values $\nu$ of the index $n$. Replacing $n$ by $\nu$ and $(m+n)!$ by $\Gamma(m+\nu+1)$, we get

$$J_\nu(\rho) = \left(\frac{\rho}{2}\right)^{\nu} \sum_{m=0}^{\infty} \frac{(-1)^m}{m!\, \Gamma(m+\nu+1)} \left(\frac{\rho}{2}\right)^{2m} \tag{4.63}$$

which makes sense even for complex values of $\nu$.

Example 4.8 (Spherical Bessel Function) The spherical Bessel function is defined as

$$j_\ell(\rho) \equiv \sqrt{\frac{\pi}{2\rho}}\; J_{\ell+1/2}(\rho). \tag{4.64}$$

For small values of its argument $|\rho| \ll 1$, the first term in the series (4.63) dominates and so (exercise 4.7)

$$j_\ell(\rho) \approx \frac{\sqrt{\pi}}{2} \left(\frac{\rho}{2}\right)^{\ell} \frac{1}{\Gamma(\ell + 3/2)} = \frac{\ell!\, (2\rho)^\ell}{(2\ell+1)!} = \frac{\rho^\ell}{(2\ell+1)!!} \tag{4.65}$$

as one may show by repeatedly using the key identity $\Gamma(z+1) = z\, \Gamma(z)$.
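As a check on (4.63) to (4.65), one can sum the series directly. This sketch truncates the series at a fixed number of terms (an arbitrary choice that is ample for moderate arguments) and compares the spherical Bessel function with its small-argument estimate; the function names are illustrative.

```python
import math

def bessel_j(nu, rho, terms=40):
    """Series (4.63) for J_nu(rho), truncated after `terms` terms."""
    half = rho / 2
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m
                  / (math.factorial(m) * math.gamma(m + nu + 1))
                  * half ** (2 * m))
    return half ** nu * total

def spherical_j(ell, rho):
    """Spherical Bessel function from the definition (4.64)."""
    return math.sqrt(math.pi / (2 * rho)) * bessel_j(ell + 0.5, rho)

def double_factorial(n):
    """n!! = n (n - 2) (n - 4) ... down to 1 or 2."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def small_rho_estimate(ell, rho):
    """Leading behavior (4.65): rho^ell / (2 ell + 1)!!"""
    return rho ** ell / double_factorial(2 * ell + 1)

if __name__ == "__main__":
    for ell in range(4):
        rho = 0.01
        print(ell, spherical_j(ell, rho), small_rho_estimate(ell, rho))
```

The two columns should agree to a relative accuracy of order $\rho^2$, the size of the first neglected term.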

4.6 Taylor Series

If the function $f(x)$ is a real-valued function of a real variable $x$ with a continuous $N$th derivative, then Taylor's expansion for it is

$$f(x+a) = f(x) + a f'(x) + \frac{a^2}{2} f''(x) + \dots + \frac{a^{N-1}}{(N-1)!}\, f^{(N-1)}(x) + E_N = \sum_{n=0}^{N-1} \frac{a^n}{n!}\, f^{(n)}(x) + E_N \tag{4.66}$$

Example 4.12 (Planck's Distribution) Max Planck (1858–1947) showed that the electromagnetic energy in a closed cavity of volume $V$ at a temperature $T$ in the frequency interval $d\nu$ about $\nu$ is

$$dU(\beta, \nu, V) = \frac{8\pi h V}{c^3}\, \frac{\nu^3}{e^{\beta h \nu} - 1}\, d\nu \tag{4.94}$$

in which $\beta = 1/(kT)$, $k = 1.3806503 \times 10^{-23}$ J/K is Boltzmann's constant, and $h = 6.626068 \times 10^{-34}$ J s is Planck's constant. The total energy then is the integral

$$U(\beta, V) = \frac{8\pi h V}{c^3} \int_0^\infty \frac{\nu^3}{e^{\beta h \nu} - 1}\, d\nu \tag{4.95}$$

which we may do by letting $x = \beta h \nu$ and using the geometric series (4.31)

$$U(\beta, V) = \frac{8\pi (kT)^4 V}{(hc)^3} \int_0^\infty \frac{x^3}{e^x - 1}\, dx = \frac{8\pi (kT)^4 V}{(hc)^3} \int_0^\infty \frac{x^3 e^{-x}}{1 - e^{-x}}\, dx = \frac{8\pi (kT)^4 V}{(hc)^3} \int_0^\infty x^3 e^{-x} \sum_{n=0}^{\infty} e^{-nx}\, dx. \tag{4.96}$$

The geometric series is absolutely and uniformly convergent for $x > 0$, and we may interchange the limits of summation and integration. After another change of variables, the Gamma-function formula (5.102) gives

$$U(\beta, V) = \frac{8\pi (kT)^4 V}{(hc)^3} \sum_{n=0}^{\infty} \int_0^\infty x^3 e^{-(n+1)x}\, dx = \frac{8\pi (kT)^4 V}{(hc)^3} \sum_{n=0}^{\infty} \frac{1}{(n+1)^4} \int_0^\infty y^3 e^{-y}\, dy = \frac{8\pi (kT)^4 V}{(hc)^3}\, 3!\, \zeta(4) = \frac{8\pi^5 (kT)^4 V}{15 (hc)^3}. \tag{4.97}$$

It follows that the power radiated by a "black body" is proportional to the fourth power of its temperature and to its area $A$

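$$P = \sigma A T^4 \tag{4.98}$$

in which

$$\sigma = \frac{2\pi^5 k^4}{15\, h^3 c^2} = 5.670400(40) \times 10^{-8}\ \mathrm{W\, m^{-2}\, K^{-4}} \tag{4.99}$$

is Stefan's constant.

The number of photons in the black-body distribution (4.94) at inverse temperature $\beta$ in the volume $V$ is

$$N(\beta, V) = \frac{8\pi V}{c^3} \int_0^\infty \frac{\nu^2}{e^{\beta h \nu} - 1}\, d\nu = \frac{8\pi V}{(c\beta h)^3} \int_0^\infty \frac{x^2}{e^x - 1}\, dx = \frac{8\pi V}{(c\beta h)^3} \int_0^\infty \frac{x^2 e^{-x}}{1 - e^{-x}}\, dx = \frac{8\pi V}{(c\beta h)^3} \int_0^\infty x^2 e^{-x} \sum_{n=0}^{\infty} e^{-nx}\, dx$$
$$= \frac{8\pi V}{(c\beta h)^3} \sum_{n=0}^{\infty} \int_0^\infty x^2 e^{-(n+1)x}\, dx = \frac{8\pi V}{(c\beta h)^3} \sum_{n=0}^{\infty} \frac{1}{(n+1)^3} \int_0^\infty y^2 e^{-y}\, dy = \frac{8\pi V}{(c\beta h)^3}\, \zeta(3)\, 2! = \frac{8\pi (kT)^3 V}{(ch)^3}\, \zeta(3)\, 2!. \tag{4.100}$$

The mean energy $\langle E \rangle$ of a photon in the black-body distribution (4.94) is the energy $U(\beta, V)$ divided by the number of photons $N(\beta, V)$

$$\langle E \rangle = \langle h\nu \rangle = \frac{3!\, \zeta(4)}{2!\, \zeta(3)}\, kT = \frac{\pi^4}{30\, \zeta(3)}\, kT \tag{4.101}$$

or $\langle E \rangle \approx 2.70118\, kT$ since Apéry's constant $\zeta(3)$ is 1.2020569032 . . . (Roger Apéry, 1916–1994).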
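The numbers quoted in (4.99) and (4.101) can be reproduced in a few lines. In this sketch the zeta function is summed directly with a simple Euler-Maclaurin tail estimate (an implementation choice, not the text's method), the values of k and h are those given in example 4.12, and the speed of light c = 299792458 m/s is an added assumption, since the text does not state it.

```python
import math

def zeta(s, terms=10000):
    """Riemann zeta for s > 1: direct sum plus an Euler-Maclaurin tail estimate."""
    partial = math.fsum(1.0 / n ** s for n in range(1, terms + 1))
    tail = terms ** (1 - s) / (s - 1) - 0.5 * terms ** (-s)
    return partial + tail

def mean_photon_energy_over_kT():
    """<E>/kT = 3! zeta(4) / (2! zeta(3)), equation (4.101)."""
    return math.factorial(3) * zeta(4) / (math.factorial(2) * zeta(3))

def stefan_constant(k, h, c):
    """sigma = 2 pi^5 k^4 / (15 h^3 c^2), equation (4.99)."""
    return 2 * math.pi ** 5 * k ** 4 / (15 * h ** 3 * c ** 2)

if __name__ == "__main__":
    k = 1.3806503e-23   # J/K, from example 4.12
    h = 6.626068e-34    # J s, from example 4.12
    c = 299792458.0     # m/s, assumed (not given in the text)
    print(zeta(4), math.pi ** 4 / 90)
    print(mean_photon_energy_over_kT())
    print(stefan_constant(k, h, c))
```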

Example 4.13 (The Lerch Transcendent) The Lerch transcendent is the series

$$\Phi(z, s, \alpha) = \sum_{n=0}^{\infty} \frac{z^n}{(n+\alpha)^s}. \tag{4.102}$$

It converges when $|z| < 1$ and $\mathrm{Re}\, s > 0$ and $\mathrm{Re}\, \alpha > 0$.

4.11 Bernoulli Numbers and Polynomials

The Bernoulli numbers $B_n$ are defined by the infinite series

$$\frac{x}{e^x - 1} = \sum_{n=0}^{\infty} \frac{x^n}{n!} \left[ \frac{d^n}{dx^n}\, \frac{x}{e^x - 1} \right]_{x=0} = \sum_{n=0}^{\infty} B_n\, \frac{x^n}{n!} \tag{4.103}$$

for the generating function $x/(e^x - 1)$. They are the successive derivatives

$$B_n = \left. \frac{d^n}{dx^n}\, \frac{x}{e^x - 1} \right|_{x=0}. \tag{4.104}$$

So B0 = 1 and B1 = −1/2. The remaining odd Bernoulli numbers vanish

$$B_{2n+1} = 0 \quad \text{for } n > 0 \tag{4.105}$$

and the remaining even ones are given by Euler's zeta function (4.92) formula

$$B_{2n} = \frac{(-1)^{n-1}\, 2\, (2n)!}{(2\pi)^{2n}}\, \zeta(2n) \quad \text{for } n > 0. \tag{4.106}$$

The Bernoulli numbers occur in the power series for many transcendental functions, for instance

$$\coth x = \frac{1}{x} + \sum_{k=1}^{\infty} \frac{2^{2k} B_{2k}}{(2k)!}\, x^{2k-1} \quad \text{for } x^2 < \pi^2. \tag{4.107}$$
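A short computation can confirm (4.105) to (4.107). The sketch below generates exact Bernoulli numbers from the standard recurrence $\sum_{k=0}^{n} \binom{n+1}{k} B_k = 0$ for $n \ge 1$, which follows from multiplying the generating function (4.103) by $e^x - 1$; the recurrence and the helper names are not from the text. The zeta function is summed directly with a tail estimate.

```python
import math
from fractions import Fraction

def bernoulli_numbers(n_max):
    """Exact B_0..B_n_max from sum_{k=0}^{n} C(n+1, k) B_k = 0 (n >= 1), B_0 = 1."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        s = sum(math.comb(n + 1, k) * B[k] for k in range(n))
        B.append(-s / (n + 1))
    return B

def zeta(s, terms=10000):
    """Riemann zeta for s > 1: direct sum plus an Euler-Maclaurin tail estimate."""
    partial = math.fsum(1.0 / n ** s for n in range(1, terms + 1))
    return partial + terms ** (1 - s) / (s - 1) - 0.5 * terms ** (-s)

def bernoulli_from_zeta(n):
    """B_{2n} from Euler's formula (4.106), valid for n > 0."""
    return ((-1) ** (n - 1) * 2 * math.factorial(2 * n)
            / (2 * math.pi) ** (2 * n) * zeta(2 * n))

def coth_series(x, k_max=30):
    """Partial sum of the series (4.107) for coth x, valid for x^2 < pi^2."""
    B = bernoulli_numbers(2 * k_max)
    return 1 / x + sum(2 ** (2 * k) * float(B[2 * k]) / math.factorial(2 * k)
                       * x ** (2 * k - 1) for k in range(1, k_max + 1))

if __name__ == "__main__":
    B = bernoulli_numbers(8)
    print([str(b) for b in B])
    print(float(B[6]), bernoulli_from_zeta(3))
    print(coth_series(1.0), 1 / math.tanh(1.0))
```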

Bernoulli's polynomials $B_n(y)$ are defined by the series

$$\frac{x\, e^{xy}}{e^x - 1} = \sum_{n=0}^{\infty} B_n(y)\, \frac{x^n}{n!} \tag{4.108}$$

for the generating function $x e^{xy}/(e^x - 1)$. Some authors (Whittaker and Watson, 1927, p. 125–127) define Bernoulli's numbers instead by

$$B_n = \frac{2\, (2n)!}{(2\pi)^{2n}}\, \zeta(2n) = 4n \int_0^\infty \frac{t^{2n-1}\, dt}{e^{2\pi t} - 1} \tag{4.109}$$

a result due to Carda.

4.12 Asymptotic Series

A series

$$s_n(x) = \sum_{k=0}^{n} \frac{a_k}{x^k} \tag{4.110}$$

is an asymptotic expansion for a real function $f(x)$ if the remainder $R_n$

$$R_n(x) = f(x) - s_n(x) \tag{4.111}$$

satisfies the condition

$$\lim_{x\to\infty} x^n R_n(x) = 0 \tag{4.112}$$

for fixed $n$. In this case, one writes

$$f(x) \approx \sum_{k=0}^{\infty} \frac{a_k}{x^k} \tag{4.113}$$

where the wavy equal sign indicates equality in the sense of (4.112). Some authors add the condition:

$$\lim_{n\to\infty} x^n R_n(x) = \infty \tag{4.114}$$

for fixed $x$.

Example 4.14 (The Asymptotic Series for $E_1$) Let's develop an asymptotic expansion for the function

$$E_1(x) = \int_x^\infty e^{-y}\, \frac{dy}{y} \tag{4.115}$$

which is related to the exponential-integral function

$$\mathrm{Ei}(x) = \int_{-\infty}^{x} e^{y}\, \frac{dy}{y} \tag{4.116}$$

by the tricky formula $E_1(x) = -\mathrm{Ei}(-x)$. Since

$$\frac{e^{-y}}{y} = -\frac{d}{dy} \left( \frac{e^{-y}}{y} \right) - \frac{e^{-y}}{y^2} \tag{4.117}$$

we may integrate by parts, getting

$$E_1(x) = \frac{e^{-x}}{x} - \int_x^\infty e^{-y}\, \frac{dy}{y^2}. \tag{4.118}$$

Integrating by parts again, we find

$$E_1(x) = \frac{e^{-x}}{x} - \frac{e^{-x}}{x^2} + 2 \int_x^\infty e^{-y}\, \frac{dy}{y^3}. \tag{4.119}$$

Eventually, we develop the series

$$E_1(x) = e^{-x} \left( \frac{0!}{x} - \frac{1!}{x^2} + \frac{2!}{x^3} - \frac{3!}{x^4} + \frac{4!}{x^5} - \dots \right) \tag{4.120}$$

with remainder

$$R_n(x) = (-1)^n\, n! \int_x^\infty e^{-y}\, \frac{dy}{y^{n+1}}. \tag{4.121}$$

Setting $y = u + x$, we have

$$R_n(x) = (-1)^n\, \frac{n!\, e^{-x}}{x^{n+1}} \int_0^\infty \frac{e^{-u}\, du}{\left(1 + \frac{u}{x}\right)^{n+1}} \tag{4.122}$$

which satisfies the condition (4.112) that defines an asymptotic series

$$\lim_{x\to\infty} x^n R_n(x) = \lim_{x\to\infty} (-1)^n\, \frac{n!\, e^{-x}}{x} \int_0^\infty \frac{e^{-u}\, du}{\left(1 + \frac{u}{x}\right)^{n+1}} = \lim_{x\to\infty} (-1)^n\, \frac{n!\, e^{-x}}{x} \int_0^\infty e^{-u}\, du = \lim_{x\to\infty} (-1)^n\, \frac{n!\, e^{-x}}{x} = 0 \tag{4.123}$$

for fixed $n$.

Asymptotic series often occur in physics. In such physical problems, a small parameter $\lambda$ usually plays the role of $1/x$. A perturbative series

$$S_n(\lambda) = \sum_{k=0}^{n} a_k\, \lambda^k \tag{4.124}$$

is an asymptotic expansion of the physical quantity $S(\lambda)$ if the remainder

$$R_n(\lambda) = S(\lambda) - S_n(\lambda) \tag{4.125}$$

satisfies for fixed $n$

$$\lim_{\lambda\to 0} \lambda^{-n} R_n(\lambda) = 0. \tag{4.126}$$

The WKB approximation and the Dyson series are asymptotic expansions in this sense.
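The behavior of the series (4.120) for $E_1$ is easy to see numerically. The sketch below compares its partial sums with a reference value computed from the standard convergent series $E_1(x) = -\gamma - \ln x - \sum_{k\ge 1} (-x)^k/(k\, k!)$, which is not derived in the text and is used here only as an independent check. The bound $|R_n(x)| \le n!\, e^{-x}/x^{n+1}$ follows from (4.122) because the $u$-integral there is at most 1. Function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def e1_reference(x, terms=80):
    """E1(x) from its convergent series (standard result, assumed; fine for moderate x)."""
    s = math.fsum((-x) ** k / (k * math.factorial(k)) for k in range(1, terms + 1))
    return -EULER_GAMMA - math.log(x) - s

def e1_asymptotic(x, n):
    """First n terms of (4.120): e^{-x} * sum_{k=0}^{n-1} (-1)^k k! / x^{k+1}."""
    return math.exp(-x) * sum((-1) ** k * math.factorial(k) / x ** (k + 1)
                              for k in range(n))

def remainder_bound(x, n):
    """|R_n(x)| <= n! e^{-x} / x^{n+1}, from (4.122)."""
    return math.factorial(n) * math.exp(-x) / x ** (n + 1)

if __name__ == "__main__":
    x = 10.0
    exact = e1_reference(x)
    for n in (1, 3, 5, 10, 20, 30):
        err = abs(exact - e1_asymptotic(x, n))
        print(n, err, remainder_bound(x, n))
```

At x = 10 the error shrinks until n is about 10 and then grows again, which is the hallmark of an asymptotic series that does not converge for fixed x.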

4.13 Some Electrostatic Problems

Gauss's law $\nabla \cdot \mathbf{D} = \rho$ equates the divergence of the electric displacement $\mathbf{D}$ to the density $\rho$ of free charges (charges that are free to move in or out of the dielectric medium, as opposed to those that are part of the medium and bound to it by molecular forces). In electrostatic problems, Maxwell's equations reduce to Gauss's law and the static form $\nabla \times \mathbf{E} = 0$ of Faraday's law, which implies that the electric field $\mathbf{E}$ is the gradient of an electrostatic potential, $\mathbf{E} = -\nabla V$ (James Maxwell 1831–1879, Michael Faraday 1791–1867). Across an interface with normal vector $\hat{\mathbf{n}}$ between two dielectrics, the tangential electric field is continuous while the normal electric displacement jumps by the surface density of free charge $\sigma$

$$\hat{\mathbf{n}} \times (\mathbf{E}_2 - \mathbf{E}_1) = 0 \quad \text{and} \quad \sigma = \hat{\mathbf{n}} \cdot (\mathbf{D}_2 - \mathbf{D}_1). \tag{4.127}$$

In a linear dielectric, the electric displacement D is proportional to the

of permittivity $\epsilon_2$, and the lower region $z < -t/2$ is a uniform linear dielectric of permittivity $\epsilon_3$. Suppose the lower infinite x-y-plane $z = -t/2$ has a uniform surface charge density $-\sigma$, while the upper plane $z = t/2$ has a uniform surface charge density $\sigma$. What is the energy per unit area of this system? What is the pressure on the second dielectric? What is the capacitance per unit area of the stack?

of analyticity. Since $I_M = -I$, the integral of $f(z)$ along this closed contour vanishes:

$$\oint f(z)\, dz = I + I_M = I - I = 0 \tag{5.25}$$

and we have again derived Cauchy's integral theorem. Since every polynomial $P(z) = c_0 + c_1 z + \dots + c_n z^n$ is entire (everywhere analytic), it follows that its integral along any closed contour must vanish

$$\oint P(z)\, dz = 0. \tag{5.26}$$

Example 5.3 (A pole) The derivative of the function $f(z) = 1/(z - z_0)$

$$f'(z) = \lim_{dz\to 0} \left( \frac{1}{z + dz - z_0} - \frac{1}{z - z_0} \right) \frac{1}{dz} = -\frac{1}{(z - z_0)^2} \tag{5.27}$$

exists everywhere except at $z = z_0$, so $f(z)$ is analytic in the plane with the point $z_0$ removed, a region that is not simply connected.
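The vanishing integral (5.26) and the effect of a pole can both be seen numerically. This sketch integrates around a circle with the periodic trapezoid rule; the contour, the test polynomial, and the function names are illustrative assumptions. The value $2\pi i$ for the integral of $1/(z - z_0)$ around a circle enclosing $z_0$ is the standard result.

```python
import cmath
import math

def contour_integral(f, center, radius, n=2000):
    """Integrate f counterclockwise around a circle (periodic trapezoid rule)."""
    total = 0j
    for j in range(n):
        theta = 2 * math.pi * j / n
        phase = cmath.exp(1j * theta)
        z = center + radius * phase
        dz = 1j * radius * phase * (2 * math.pi / n)
        total += f(z) * dz
    return total

def polynomial(z):
    """An arbitrary entire test function, 1 + 2 z + 3 z^2."""
    return 1 + 2 * z + 3 * z * z

if __name__ == "__main__":
    z0 = 0.3 + 0.2j
    # Entire function: the closed-contour integral vanishes, as in (5.26).
    print(contour_integral(polynomial, 0, 1.0))
    # Pole inside the contour: the integral is 2 pi i.
    print(contour_integral(lambda z: 1 / (z - z0), 0, 1.0))
    # Pole outside the contour: the integrand is analytic inside, so it vanishes.
    print(contour_integral(lambda z: 1 / (z - 2.0), 0, 1.0))
```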

5.3 Cauchy’s Integral Formula

Let $f(z)$ be analytic in a simply connected region $R$ and $z_0$ a point inside this region. We first will integrate the function $f(z)/(z - z_0)$ along a tiny closed counterclockwise contour around the point $z_0$. The contour is a circle of radius $\epsilon$ with center at $z_0$ with points $z = z_0 + \epsilon\, e^{i\theta}$ for $0 \le \theta \le 2\pi$, and $dz = i\epsilon\, e^{i\theta} d\theta$. Since $z - z_0 = \epsilon\, e^{i\theta}$, the contour integral in the limit $\epsilon \to 0$ is

$$\oint_\epsilon \frac{f(z)}{z - z_0}\, dz = \int_0^{2\pi} \frac{f(z_0) + f'(z_0)(z - z_0)}{z - z_0}\; i\epsilon\, e^{i\theta}\, d\theta = \int_0^{2\pi} \frac{f(z_0) + f'(z_0)\, \epsilon\, e^{i\theta}}{\epsilon\, e^{i\theta}}\; i\epsilon\, e^{i\theta}\, d\theta = \int_0^{2\pi} \left[ f(z_0) + f'(z_0)\, \epsilon\, e^{i\theta} \right] i\, d\theta. \tag{5.28}$$

The $\theta$-integral involving $f'(z_0)$ vanishes, and so we have

$$f(z_0) = \frac{1}{2\pi i} \oint_\epsilon \frac{f(z)}{z - z_0}\, dz \tag{5.29}$$

which is a miniature version of Cauchy's integral formula. Now consider the counterclockwise contour $C'$ in Fig. 5.3 which is a big counterclockwise circle, a small clockwise circle, and two parallel straight lines, all within a simply connected region $R$ in which $f(z)$ is analytic. The

[Figure 5.6: Third-Harmonic Microscopy. The axis runs from −L to L, with regions labeled unseen, visible, unseen.]

Figure 5.6 In the limit in which the distance L is much larger than the wavelength λ, the integral (5.128) is non-zero when an edge (solid line) lies where the beam is focused but not when a feature (dots) lies where the beam is not focused. Only features within the focused region are visible.

which is in the UHP since the length $b > 0$, but no singularity in the LHP $y < 0$. So the integral of $f(z)$ along the closed contour from $z = -R$ to $z = R$ and then along the ghost contour vanishes. But since the integral along the ghost contour vanishes, so does the integral from $-R$ to $R$. Thus when the dispersion is normal, the third-harmonic signal vanishes, $E_3 = 0$, as long as the medium with constant $\chi^{(3)}(z)$ effectively extends from $-\infty$ to $\infty$ so that its edges are in the unfocused region like the dotted lines of Fig. 5.6. But an edge with varying $\chi^{(3)}(z)$ in the focused region like the solid line of the figure does make a third-harmonic signal $E_3$. Third-harmonic microscopy lets us see features instead of background.

We let $\epsilon \to 0$ and find

$$I = -2i \int_{-\infty}^{\infty} dx \int_0^{\infty} dy\, \frac{1}{(1 + x^2 + y^2)^2}. \tag{5.181}$$

Changing variables to $\rho^2 = x^2 + y^2$, we have

$$I = -4\pi i \int_0^\infty \frac{\rho\, d\rho}{(1 + \rho^2)^2} = 2\pi i \int_0^\infty d\rho\, \frac{d}{d\rho}\, \frac{1}{1 + \rho^2} = -2\pi i \tag{5.182}$$

which is simpler than evaluating the integral (5.178) directly.

5.15 Logarithms and Cuts

By definition, a function $f$ is single valued; it maps every number $z$ in its domain into a unique image $f(z)$. A function that maps only one number $z$ in its domain into each $f(z)$ in its range is said to be one to one. A one-to-one function $f(z)$ has a well-defined inverse function $f^{-1}(z)$. The exponential function is one to one when restricted to the real numbers. It maps every real number $x$ into a positive number $\exp(x)$. It has an inverse function $\ln(x)$ that maps every positive number $\exp(x)$ back into $x$. But the exponential function is not one to one on the complex numbers because $\exp(z + 2\pi n i) = \exp(z)$ for every integer $n$. The exponential function is many to one. Thus on the complex numbers, the exponential function has no inverse function. Its would-be inverse function ln maps $\exp(z)$ to $\ln(\exp(z))$ or $z + 2\pi n i$, which is not unique. It has in it an arbitrary integer $n$. In other words, when exponentiated, the logarithm of a complex number $z$ returns $\exp(\ln z) = z$. So if $z = r \exp(i\theta)$, then a suitable logarithm is $\ln z = \ln r + i\theta$. But what is $\theta$? In the polar representation of $z$, the argument $\theta$ can just as well be $\theta + 2\pi n$ because both give $z = r \exp(i\theta) = r \exp(i\theta + i 2\pi n)$. So $\ln r + i\theta + i 2\pi n$ is a correct value for $\ln[r \exp(i\theta)]$ for every integer $n$.

People usually want one of the correct values of a logarithm, rather than all of them. Two conventions are common. In the first convention, the angle $\theta$ is zero along the positive real axis and increases continuously as the point $z$ moves counterclockwise around the origin, until at points just below the positive real axis, $\theta = 2\pi - \epsilon$ is slightly less than $2\pi$. In this convention, the value of $\theta$ drops by $2\pi$ as one crosses the positive real axis moving counterclockwise. This discontinuity on the positive real axis is called a cut. The second common convention puts the cut on the negative real axis. Here the value of $\theta$ is the same as in the first convention when the point $z$ is in the upper half-plane.
But in the lower half-plane, $\theta$ decreases from 0 to $-\pi$ as the point $z$ moves clockwise from the positive real axis to just below the negative real axis, where $\theta = -\pi + \epsilon$. As one crosses the negative real axis moving clockwise or up, $\theta$ jumps by $2\pi$ while crossing the cut. The two conventions agree in the upper half-plane but differ by $2\pi$ in the lower half-plane. Sometimes it is convenient to place the cut on the positive or negative imaginary axis, or along a line that makes an arbitrary angle with the real axis. In any particular calculation, we are at liberty to define the polar angle $\theta$ by placing the cut anywhere we like, but we must not change from one convention to another in the same computation.
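The two conventions can be compared directly in code. In this sketch the second convention (cut on the negative real axis, $-\pi < \theta \le \pi$) is the one Python's standard `cmath.log` and `cmath.phase` implement; the first convention (cut on the positive real axis, $0 \le \theta < 2\pi$) is built from it. The function names are illustrative.

```python
import cmath
import math

def log_cut_negative_axis(z):
    """Second convention: theta in (-pi, pi]; this is what cmath.log returns."""
    return cmath.log(z)

def log_cut_positive_axis(z):
    """First convention: theta in [0, 2 pi), so the cut lies on the positive real axis."""
    theta = cmath.phase(z)          # cmath.phase returns a value in [-pi, pi]
    if theta < 0:
        theta += 2 * math.pi
    return complex(math.log(abs(z)), theta)

if __name__ == "__main__":
    for z in (1j, -1 + 1j, -1j, 1 - 1j):
        a = log_cut_positive_axis(z)
        b = log_cut_negative_axis(z)
        # Both are valid logarithms: exponentiating either one returns z.
        print(z, a, b, (a - b) / (2j * math.pi))
```

The last column is 0 in the upper half-plane and 1 in the lower half-plane, which is the difference of $2\pi$ in $\theta$ described above.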

5.16 Powers and Roots

The logarithm is the key to many other functions to which it passes its arbitrariness. For instance, any power $a$ of $z = r \exp(i\theta)$ is defined as

$$z^a = \exp(a \ln z) = \exp[a(\ln r + i\theta + i 2\pi n)] = r^a\, e^{ia\theta}\, e^{i 2\pi n a}. \tag{5.183}$$

So $z^a$ is not unique unless $a$ is an integer. The square-root, for example,

$$\sqrt{z} = \exp\left[\tfrac{1}{2}(\ln r + i\theta + i 2\pi n)\right] = \sqrt{r}\, e^{i\theta/2}\, e^{i n \pi} = (-1)^n \sqrt{r}\, e^{i\theta/2} \tag{5.184}$$

changes sign when we change $\theta$ by $2\pi$ as we cross a cut. The $m$-th root

$$\sqrt[m]{z} = z^{1/m} = \exp\left(\frac{\ln z}{m}\right) \tag{5.185}$$

changes by $\exp(\pm 2\pi i/m)$ when we cross a cut and change $\theta$ by $2\pi$. And when $a = u + iv$ is a complex number, $z^a$ is

$$z^a = e^{a \ln z} = e^{(u+iv)(\ln r + i\theta + i 2\pi n)} = r^{u+iv}\, e^{(-v+iu)(\theta + 2\pi n)} \tag{5.186}$$

which changes by $\exp[2\pi(-v + iu)]$ as we cross a cut.

Example 5.27 ($i^i$) The number $i = \exp(i\pi/2 + i 2\pi n)$ for any integer $n$. So the general value of $i^i$ is $i^i = \exp[i(i\pi/2 + i 2\pi n)] = \exp(-\pi/2 - 2\pi n)$.

One can define a sequence of $m$th-root functions

$$\left(z^{1/m}\right)_n = \exp\left(\frac{\ln r + i(\theta + 2\pi n)}{m}\right) \tag{5.187}$$

one for each integer $n$. These functions are the branches of the mth-root

interval $[-1, 1]$. Let's promote $x$ to a complex variable $z$ and write the square root as $\sqrt{1 - x^2} = -i\sqrt{x^2 - 1} = -i\sqrt{(z-1)(z+1)}$. As in the last example (5.29), if in both of the square roots we put the cut on the negative (or the positive) real axis, then the function $f(z) = 1/[(z-k)(-i)\sqrt{(z-1)(z+1)}]$ will be analytic everywhere except along a cut on the interval $[-1, 1]$ and at $z = k$. The circle $z = R e^{i\theta}$ for $0 \le \theta \le 2\pi$ is a ghost contour as $R \to \infty$. If we shrink-wrap this ccw contour around the pole at $z = k$ and the interval $[-1, 1]$, then we get $0 = -2I + 2\pi i/[(-i)\sqrt{k-1}\sqrt{k+1}]$ or

$$I = -\frac{\pi}{\sqrt{k-1}\, \sqrt{k+1}}. \tag{5.193}$$

So if $k = -2$, then $I = \pi/\sqrt{3}$, while if $k = 2$, then $I = -\pi/\sqrt{3}$.

Example 5.31 (Contour Integral with a Cut) Let's compute the integral

$$I = \int_0^\infty \frac{x^a}{(x+1)^2}\, dx \tag{5.194}$$

for 1

Example 5.36 (The Feynman Propagator) Adding $\pm i\epsilon$ to the denominator of a pole term of an integral formula for a function $f(x)$ can slightly shift the pole into the upper or lower half plane, causing the pole to contribute if a ghost contour goes around the upper half-plane or the lower half-plane. Such an $i\epsilon$ can impose a boundary condition on a Green's function. The Feynman propagator $\Delta_F(x)$ is a Green's function for the Klein-Gordon differential operator (Weinberg, 1995, pp. 274–280)

$$(m^2 - \Box)\, \Delta_F(x) = \delta^4(x) \tag{5.230}$$

in which $x = (x^0, \mathbf{x})$ and

$$\Box = \triangle - \frac{\partial^2}{\partial t^2} = \triangle - \frac{\partial^2}{\partial (x^0)^2} \tag{5.231}$$

is the four-dimensional version of the laplacian $\triangle \equiv \nabla \cdot \nabla$. Here $\delta^4(x)$ is the four-dimensional Dirac delta function (3.36)

$$\delta^4(x) = \int \frac{d^4q}{(2\pi)^4}\, \exp[i(\mathbf{q} \cdot \mathbf{x} - q^0 x^0)] = \int \frac{d^4q}{(2\pi)^4}\, e^{iqx} \tag{5.232}$$

in which $qx = \mathbf{q} \cdot \mathbf{x} - q^0 x^0$ is the Lorentz-invariant inner product of the 4-vectors $q$ and $x$. There are many Green's functions that satisfy Eq. (5.230). Feynman's propagator $\Delta_F(x)$ is the one that satisfies boundary conditions that will become evident when we analyze the effect of its $i\epsilon$

$$\Delta_F(x) = \int \frac{d^4q}{(2\pi)^4}\, \frac{\exp(iqx)}{q^2 + m^2 - i\epsilon} = \int \frac{d^3q}{(2\pi)^3} \int_{-\infty}^{\infty} \frac{dq^0}{2\pi}\, \frac{e^{i\mathbf{q} \cdot \mathbf{x} - i q^0 x^0}}{q^2 + m^2 - i\epsilon}. \tag{5.233}$$

The quantity $E_q = \sqrt{\mathbf{q}^2 + m^2}$ is the energy of a particle of mass $m$ and momentum $\mathbf{q}$ in natural units with the speed of light $c = 1$. Using this abbreviation and setting $\epsilon' = \epsilon/2E_q$, we may write the denominator as

$$q^2 + m^2 - i\epsilon = \mathbf{q} \cdot \mathbf{q} - (q^0)^2 + m^2 - i\epsilon = \left(E_q - i\epsilon' - q^0\right) \left(E_q - i\epsilon' + q^0\right) + \epsilon'^2 \tag{5.234}$$

5.20 Kramers-Kronig Relations

If we use $\sigma \mathbf{E}$ for the current density $\mathbf{J}$ and $\mathbf{E}(t) = e^{-i\omega t} \mathbf{E}$ for the electric field, then Maxwell's equation $\nabla \times \mathbf{B} = \mu \mathbf{J} + \epsilon\mu \dot{\mathbf{E}}$ becomes

$$\nabla \times \mathbf{B} = -i\omega\, \epsilon\mu \left(1 + i\, \frac{\sigma}{\epsilon\omega}\right) \mathbf{E} \equiv -i\omega\, n^2\, \epsilon_0 \mu_0\, \mathbf{E} \tag{5.268}$$

and reveals the squared index of refraction as

$$n^2(\omega) = \frac{\epsilon\mu}{\epsilon_0\mu_0} \left(1 + i\, \frac{\sigma}{\epsilon\omega}\right). \tag{5.269}$$

The imaginary part of $n^2$ represents the scattering of light mainly by electrons. At high frequencies in nonmagnetic materials $n^2(\omega) \to 1$, and so Kramers and Kronig applied the Hilbert-transform relations (5.267) to the function $n^2(\omega) - 1$ in order to satisfy condition (5.255). Their relations are

$$\mathrm{Re}(n^2(\omega_0)) = 1 + \frac{2}{\pi}\, P \int_0^\infty \frac{\omega\, \mathrm{Im}(n^2(\omega))}{\omega^2 - \omega_0^2}\, d\omega \tag{5.270}$$

and

$$\mathrm{Im}(n^2(\omega_0)) = -\frac{2\omega_0}{\pi}\, P \int_0^\infty \frac{\mathrm{Re}(n^2(\omega)) - 1}{\omega^2 - \omega_0^2}\, d\omega. \tag{5.271}$$

What Kramers and Kronig actually wrote was slightly different from these dispersion relations (5.270 & 5.271). H. A. Lorentz had shown that the index of refraction $n(\omega)$ is related to the forward scattering amplitude $f(\omega)$ for the scattering of light by a density $N$ of scatterers (Sakurai, 1982)

$$n(\omega) = 1 + \frac{2\pi c^2}{\omega^2}\, N f(\omega). \tag{5.272}$$

They used this formula to infer that the real part of the index of refraction approached unity in the limit of infinite frequency and applied the Hilbert transform (5.267)

$$\mathrm{Re}[n(\omega)] = 1 + \frac{2}{\pi}\, P \int_0^\infty \frac{\omega'\, \mathrm{Im}[n(\omega')]}{\omega'^2 - \omega^2}\, d\omega'. \tag{5.273}$$

The Lorentz relation (5.272) expresses the imaginary part $\mathrm{Im}[n(\omega)]$ of the index of refraction in terms of the imaginary part of the forward scattering amplitude $f(\omega)$

$$\mathrm{Im}[n(\omega)] = 2\pi (c/\omega)^2\, N\, \mathrm{Im}[f(\omega)]. \tag{5.274}$$

And the optical theorem relates $\mathrm{Im}[f(\omega)]$ to the total cross-section

$$\sigma_{\rm tot} = \frac{4\pi}{|k|}\, \mathrm{Im}[f(\omega)] = \frac{4\pi c}{\omega}\, \mathrm{Im}[f(\omega)]. \tag{5.275}$$

and that the real part of the forward scattering amplitude is given by the Kramers-Kronig integral (5.276) of the total cross-section

$$\mathrm{Re}(f(\omega)) = \frac{\omega^2}{2\pi^2 c}\, P \int_0^\infty \frac{\sigma_{\rm tot}(\omega')\, d\omega'}{\omega'^2 - \omega^2}. \tag{5.286}$$

So the real part of the index of refraction is

$$n_r(\omega) = 1 + \frac{cN}{\pi}\, P \int_0^\infty \frac{\sigma_{\rm tot}(\omega')\, d\omega'}{\omega'^2 - \omega^2}. \tag{5.287}$$

If the amplitude for forward scattering is of the Breit-Wigner form

$$f(\omega) = f_0\, \frac{\Gamma/2}{\omega_0 - \omega - i\Gamma/2} \tag{5.288}$$

then by (5.285) the real part of the index of refraction is

$$n_r(\omega) = 1 + \frac{\pi c^2 N f_0 \Gamma\, (\omega_0 - \omega)}{\omega^2 \left[(\omega - \omega_0)^2 + \Gamma^2/4\right]} \tag{5.289}$$

and by (5.283) the group velocity is

$$v_g = c \left[ 1 + \frac{\pi c^2 N f_0 \Gamma}{\omega^2}\, \frac{\omega_0 \left[(\omega - \omega_0)^2 - \Gamma^2/4\right]}{\left[(\omega - \omega_0)^2 + \Gamma^2/4\right]^2} \right]^{-1}. \tag{5.290}$$

This group velocity $v_g$ is less than $c$ whenever $(\omega - \omega_0)^2 > \Gamma^2/4$. But we get fast light $v_g > c$, if $(\omega - \omega_0)^2 < \Gamma^2/4$, and even backwards light, $v_g < 0$, if $\omega \approx \omega_0$ with $4\pi c^2 N f_0/\Gamma\omega_0 \gg 1$. Robert W. Boyd's papers explain how to make slow and fast light (Bigelow et al., 2003) and backwards light (Gehring et al., 2006).

We can use the principal-part identity (5.224) to subtract

$$0 = \frac{cN}{\pi}\, P \int_0^\infty \frac{\sigma_{\rm tot}(\omega)}{\omega'^2 - \omega^2}\, d\omega' \tag{5.291}$$

from the Kramers-Kronig integral (5.287) so as to write the index of refraction in the regularized form

$$n_r(\omega) = 1 + \frac{cN}{\pi}\, P \int_0^\infty \frac{\sigma_{\rm tot}(\omega') - \sigma_{\rm tot}(\omega)}{\omega'^2 - \omega^2}\, d\omega' \tag{5.292}$$

which we can differentiate and use in the group-velocity formula (5.283)

$$v_g(\omega) = c \left[ 1 + \frac{cN}{\pi}\, P \int_0^\infty \frac{\left[\sigma_{\rm tot}(\omega') - \sigma_{\rm tot}(\omega)\right] (\omega'^2 + \omega^2)}{(\omega'^2 - \omega^2)^2}\, d\omega' \right]^{-1}. \tag{5.293}$$

T (z) are defined by its Laurent series

$$T(z) = \sum_{n=-\infty}^{\infty} \frac{L_n}{z^{n+2}} \tag{5.336}$$

and the inverse relation

$$L_n = \frac{1}{2\pi i} \oint z^{n+1}\, T(z)\, dz. \tag{5.337}$$

Thus the commutator of two modes involves two loop integrals

$$[L_m, L_n] = \left[ \frac{1}{2\pi i} \oint z^{m+1}\, T(z)\, dz,\; \frac{1}{2\pi i} \oint w^{n+1}\, T(w)\, dw \right] \tag{5.338}$$

which we may deform as long as we cross no poles. Let's hold $w$ fixed and deform the $z$ loop so as to keep the $T$'s radially ordered when $z$ is near $w$ as in Fig. 5.10. The operator-product expansion of the radially ordered product $R\{T(z)T(w)\}$ is

$$R\{T(z)T(w)\} = \frac{c/2}{(z-w)^4} + \frac{2}{(z-w)^2}\, T(w) + \frac{1}{z-w}\, T'(w) + \dots \tag{5.339}$$

in which the prime means derivative, $c$ is a constant, and the dots denote terms that are analytic in $z$ and $w$. The commutator introduces a minus sign that cancels most of the two contour integrals and converts what remains into an integral along a tiny circle $C_w$ about the point $w$ as in Fig. 5.10

$$[L_m, L_n] = \oint \frac{dw}{2\pi i}\, w^{n+1} \oint_{C_w} \frac{dz}{2\pi i}\, z^{m+1} \left[ \frac{c/2}{(z-w)^4} + \frac{2\, T(w)}{(z-w)^2} + \frac{T'(w)}{z-w} \right]. \tag{5.340}$$

After doing the $z$-integral, which is left as a homework exercise (5.43), one may use the Laurent series (5.336) for $T(w)$ to do the $w$-integral, which one may choose to be along a tiny circle about $w = 0$, and so find the commutator

$$[L_m, L_n] = (m - n)\, L_{m+n} + \frac{c}{12}\, m(m^2 - 1)\, \delta_{m+n,0} \tag{5.341}$$

of the Virasoro algebra.

Exercises

5.1 Compute the two limits (5.6) and (5.7) of example 5.2 but for the function $f(x, y) = x^2 - y^2 + 2ixy$. Do the limits now agree? Explain.

5.2 Show that if $f(z)$ is analytic in a disk, then the integral of $f(z)$ around a tiny (isosceles) triangle of side $\epsilon \ll 1$ inside the disk is zero to order $\epsilon^2$.

5.12 Evaluate the contour integral of the function $f(z) = \sin wz/(z - 5)^3$ along the curve $z = 6 + 4(\cos t + i \sin t)$ for $0 \le t \le 2\pi$.

5.13 Evaluate the contour integral of the function $f(z) = \sin wz/(z - 5)^3$ along the curve $z = -6 + 4(\cos t + i \sin t)$ for $0 \le t \le 2\pi$.

5.14 Is the function $f(x, y) = x^2 + iy^2$ analytic?

5.15 Is the function $f(x, y) = x^3 - 3xy^2 + 3ix^2y - iy^3$ analytic? Is the function $x^3 - 3xy^2$ harmonic? Does it have a minimum or a maximum? If so, what are they?

5.16 Is the function $f(x, y) = x^2 + y^2 + i(x^2 + y^2)$ analytic? Is $x^2 + y^2$ a harmonic function? What is its minimum, if it has one?

5.17 Derive the first three nonzero terms of the Laurent series for $f(z) = 1/(e^z - 1)$ about $z = 0$.

5.18 Assume that a function $g(z)$ is meromorphic in $R$ and has a Laurent series (5.97) about a point $w \in R$. Show that as $z \to w$, the ratio $g'(z)/g(z)$ becomes (5.95).

5.19 Find the poles and residues of the functions $1/\sin z$ and $1/\cos z$.

5.20 Derive the integral formula (5.122) from (5.121).

5.21 Show that if $\mathrm{Re}\, w < 0$, then for arbitrary complex $z$

$$\int_{-\infty}^{\infty} e^{w(x+z)^2}\, dx = \sqrt{\frac{\pi}{-w}}. \tag{5.347}$$

5.22 Use a ghost contour to evaluate the integral

$$\int_{-\infty}^{\infty} \frac{x \sin x}{x^2 + a^2}\, dx.$$

Show your work; do not just quote the result of a commercial math program.

5.23 For $a > 0$ and $b^2 - 4ac < 0$, use a ghost contour to do the integral

$$\int_{-\infty}^{\infty} \frac{dx}{ax^2 + bx + c}. \tag{5.348}$$

5.24 Show that

$$\int_0^\infty \cos ax\; e^{-x^2}\, dx = \tfrac{1}{2} \sqrt{\pi}\; e^{-a^2/4}. \tag{5.349}$$

5.25 Show that

$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^4} = \frac{\pi}{\sqrt{2}}. \tag{5.350}$$

5.26 Evaluate the integral

$$\int_0^\infty \frac{\cos x}{1 + x^4}\, dx. \tag{5.351}$$

5.27 Show that the Yukawa Green's function (5.151) reproduces the Yukawa potential (5.141) when $n = 3$. Use $K_{1/2}(x) = \sqrt{\pi/2x}\; e^{-x}$ (9.99).

5.28 Derive the two explicit formulas (5.188) and (5.189) for the square-root of a complex number.

5.29 What is $(-i)^i$? What is the most general value of this expression?

5.30 Use the indefinite integral (5.223) to derive the principal-part formula (5.224).

5.31 The Bessel function $J_n(x)$ is given by the integral

$$J_n(x) = \frac{1}{2\pi i} \oint_C e^{(x/2)(z - 1/z)}\, \frac{dz}{z^{n+1}} \tag{5.352}$$

along a counter-clockwise contour about the origin. Find the generating function for these Bessel functions, that is, the function $G(x, z)$ whose Laurent series has the $J_n(x)$'s as coefficients

$$G(x, z) = \sum_{n=-\infty}^{\infty} J_n(x)\, z^n. \tag{5.353}$$

5.32 Show that the Heaviside function $\theta(y) = (y + |y|)/(2|y|)$ is given by the integral

$$\theta(y) = \frac{1}{2\pi i} \int_{-\infty}^{\infty} e^{iyx}\, \frac{dx}{x - i\epsilon} \tag{5.354}$$

in which $\epsilon$ is an infinitesimal positive number.

5.33 Show that the integral of $\exp(ik)/k$ along the contour from $k = -L$ to $k = -L + iH$ and then to $k = L + iH$ and then down to $k = L$ vanishes in the double limit $L \to \infty$ and $H \to \infty$.

5.34 Use a ghost contour and a cut to evaluate the integral

$$I = \int_{-1}^{1} \frac{dx}{(x^2 + 1)\sqrt{1 - x^2}} \tag{5.355}$$

by imitating example 5.30. Be careful when picking up the poles at $z = \pm i$. If necessary, use the explicit square-root formulas (5.188) and (5.189).

5.35 Redo the previous exercise (5.34) by defining the square roots so that the cuts run from $-\infty$ to $-1$ and from $1$ to $\infty$. Take advantage of the evenness of the integrand and integrate on a contour that is slightly above the whole real axis. Then add a ghost contour around the upper half plane.

5.36 Show that if $u$ is even and $v$ is odd, then the Hilbert transforms (5.265) imply (5.267).

Example 6.6 (The Helmholtz Equation in Three Dimensions) In three dimensions and in rectangular coordinates $\mathbf{r} = (x, y, z)$, the function $f(x, y, z) = X(x) Y(y) Z(z)$ is a solution of the PDE $-\triangle f = k^2 f$ as long as $X$, $Y$, and $Z$ satisfy $-X_a'' = a^2 X_a$, $-Y_b'' = b^2 Y_b$, and $-Z_c'' = c^2 Z_c$ with $a^2 + b^2 + c^2 = k^2$. We set $X_a(x) = \alpha \sin ax + \beta \cos ax$ and so forth. Arbitrary linear combinations of the products $X_a Y_b Z_c$ also are solutions of Helmholtz's equation $-\triangle f = k^2 f$ as long as $a^2 + b^2 + c^2 = k^2$.

In cylindrical coordinates $(\rho, \phi, z)$, the laplacian (6.34) is

$$\nabla \cdot \nabla f = \triangle f = \frac{1}{\rho} \left[ (\rho f_{,\rho})_{,\rho} + \frac{1}{\rho}\, f_{,\phi\phi} + \rho\, f_{,zz} \right] \tag{6.49}$$
and so if we substitute $f(\rho, \phi, z) = P(\rho)\, \Phi(\phi)\, Z(z)$ into Helmholtz's equation $-\triangle f = \alpha^2 f$ and multiply both sides by $-\rho^2/P\, \Phi\, Z$, then we get

$$\frac{\rho^2}{f}\, \triangle f = \frac{\rho^2 P'' + \rho P'}{P} + \frac{\Phi''}{\Phi} + \rho^2\, \frac{Z''}{Z} = -\alpha^2 \rho^2. \tag{6.50}$$

If we set $Z_k(z) = e^{kz}$, then this equation becomes (6.46) with $k^2$ replaced by $\alpha^2 + k^2$. Its solution then is

$$f(\rho, \phi, z) = J_n\!\left(\sqrt{\alpha^2 + k^2}\; \rho\right) e^{in\phi}\, e^{kz} \tag{6.51}$$

in which $n$ must be an integer if the solution is to apply to the full range of $\phi$ from 0 to $2\pi$. The case in which $\alpha = 0$ corresponds to Laplace's equation with solution $f(\rho, \phi, z) = J_n(k\rho)\, e^{in\phi}\, e^{kz}$. We could have required $Z$ to satisfy $Z'' = -k^2 Z$. The solution (6.51) then would be

$$f(\rho, \phi, z) = J_n\!\left(\sqrt{\alpha^2 - k^2}\; \rho\right) e^{in\phi}\, e^{ikz}. \tag{6.52}$$

But if $\alpha^2 - k^2 < 0$, we write this solution in terms of the modified Bessel function $I_n(x) = i^{-n} J_n(ix)$ (section 9.3) as

$$f(\rho, \phi, z) = I_n\!\left(\sqrt{k^2 - \alpha^2}\; \rho\right) e^{in\phi}\, e^{ikz}. \tag{6.53}$$

In spherical coordinates, the laplacian (6.35) is

$$\triangle f = \frac{1}{r^2}\, \frac{\partial}{\partial r} \left( r^2\, \frac{\partial f}{\partial r} \right) + \frac{1}{r^2 \sin\theta}\, \frac{\partial}{\partial\theta} \left( \sin\theta\, \frac{\partial f}{\partial\theta} \right) + \frac{1}{r^2 \sin^2\theta}\, \frac{\partial^2 f}{\partial\phi^2}. \tag{6.54}$$

If we set $f(r, \theta, \phi) = R(r)\, \Theta(\theta)\, \Phi_m(\phi)$ where $\Phi_m = e^{im\phi}$ and multiply both sides of the Helmholtz equation $-\triangle f = k^2 f$ by $-r^2/R\Theta\Phi$, then we get

$$\frac{(r^2 R')'}{R} + \frac{(\sin\theta\; \Theta')'}{\sin\theta\; \Theta} - \frac{m^2}{\sin^2\theta} = -k^2 r^2. \tag{6.55}$$

The first term and the right-hand side are functions of $r$, and the second and third terms are functions of $\theta$. So we set the $r$-dependent terms $(r^2 R')'/R + k^2 r^2$ equal to a constant $\ell(\ell + 1)$ and the $\theta$-dependent terms equal to $-\ell(\ell + 1)$, and we require the associated Legendre function $\Theta_{\ell,m}(\theta)$ to satisfy (8.91)

$$\left( \sin\theta\; \Theta_{\ell,m}' \right)' / \sin\theta + \left[ \ell(\ell + 1) - m^2/\sin^2\theta \right] \Theta_{\ell,m} = 0. \tag{6.56}$$

If $\Phi(\phi) = e^{im\phi}$ is to be single valued for $0 \le \phi \le 2\pi$, then the parameter $m$ must be an integer. The constant $\ell$ also must be an integer with $-\ell \le m \le \ell$ (example 6.29, section 8.12) if $\Theta_{\ell,m}(\theta)$ is to be single valued and finite for $0 \le \theta \le \pi$. The product $f = R\, \Theta\, \Phi$ then will obey Helmholtz's equation $-\triangle f = k^2 f$ if the radial function $R_{k,\ell}(r) = j_\ell(kr)$ satisfies

$$\left( r^2 R_{k,\ell}' \right)' + \left[ k^2 r^2 - \ell(\ell + 1) \right] R_{k,\ell} = 0 \tag{6.57}$$

which it does because the spherical Bessel function $j_\ell(x)$ obeys Bessel's equation (9.63)

$$\left( x^2 j_\ell' \right)' + \left[ x^2 - \ell(\ell + 1) \right] j_\ell = 0. \tag{6.58}$$

In three dimensions, Helmholtz's equation separates in 11 standard coordinate systems (Morse and Feshbach, 1953, pp. 655–664).
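The radial equation (6.58) can be verified for the two lowest spherical Bessel functions, whose closed forms $j_0(x) = \sin x/x$ and $j_1(x) = \sin x/x^2 - \cos x/x$ are standard results that are assumed here rather than taken from this section. The sketch evaluates the left-hand side of (6.58) by central finite differences; the step size and function names are illustrative choices.

```python
import math

def j0(x):
    """Spherical Bessel function j_0(x) = sin x / x (standard closed form)."""
    return math.sin(x) / x

def j1(x):
    """Spherical Bessel function j_1(x) = sin x / x^2 - cos x / x (standard closed form)."""
    return math.sin(x) / x ** 2 - math.cos(x) / x

def bessel_residual(j, ell, x, h=1e-3):
    """Finite-difference estimate of (x^2 j')' + [x^2 - ell (ell + 1)] j at x."""
    def flux(t):
        # t^2 j'(t) with a central difference of width h
        return t * t * (j(t + h / 2) - j(t - h / 2)) / h
    derivative_of_flux = (flux(x + h / 2) - flux(x - h / 2)) / h
    return derivative_of_flux + (x * x - ell * (ell + 1)) * j(x)

if __name__ == "__main__":
    for x in (0.5, 2.0, 7.0):
        print(x, bessel_residual(j0, 0, x), bessel_residual(j1, 1, x))
    # With the wrong ell the residual is not small:
    print(bessel_residual(j0, 1, 2.0))
```

The residuals for the matching $\ell$ are of the order of the discretization error, while the mismatched case returns $-\ell(\ell+1)\, j_0(x)$, which is far from zero.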

6.6 Wave Equations

You can easily solve some of the linear homogeneous partial differential equations of electrodynamics (exercise 6.6) and quantum field theory.

Example 6.7 (The Klein-Gordon Equation) In Minkowski space, the analog of the laplacian in natural units ($\hbar = c = 1$) is (summing over $a$ from 0 to 3)

$$\Box = \partial_a \partial^a = \triangle - \frac{\partial^2}{\partial (x^0)^2} = \triangle - \frac{\partial^2}{\partial t^2} \tag{6.59}$$

and the Klein-Gordon wave equation is

$$\left( \Box - m^2 \right) A(x) = \left( \triangle - \frac{\partial^2}{\partial t^2} - m^2 \right) A(x) = 0. \tag{6.60}$$

If we set $A(x) = B(px)$ where $px = p_a x^a = \mathbf{p} \cdot \mathbf{x} - p^0 x^0$, then the $k$th partial derivative of $A$ is $p_k$ times the first derivative of $B$

$$\frac{\partial}{\partial x^k}\, A(x) = \frac{\partial}{\partial x^k}\, B(px) = p_k\, B'(px) \tag{6.61}$$

and so the Klein-Gordon equation (6.60) becomes

$$\left( \Box - m^2 \right) A = \left( \mathbf{p}^2 - (p^0)^2 \right) B'' - m^2 B = p^2 B'' - m^2 B = 0 \tag{6.62}$$

in which $p^2 = \mathbf{p}^2 - (p^0)^2$. Thus if $B(px) = \exp(ipx)$ so that $B'' = -B$, and if the energy-momentum 4-vector $(p^0, \mathbf{p})$ satisfies $p^2 + m^2 = 0$, then $A(x)$ will satisfy the Klein-Gordon equation. The condition $p^2 + m^2 = 0$ relates the energy $p^0 = \sqrt{\mathbf{p}^2 + m^2}$ to the momentum $\mathbf{p}$ for a particle of mass $m$.

Example 6.8 (Field of a Spinless Boson) The quantum field

$$\phi(x) = \int \frac{d^3p}{\sqrt{2p^0 (2\pi)^3}} \left[ a(\mathbf{p})\, e^{ipx} + a^\dagger(\mathbf{p})\, e^{-ipx} \right] \tag{6.63}$$

describes spinless bosons of mass $m$. It satisfies the Klein-Gordon equation $\left( \Box - m^2 \right) \phi(x) = 0$ because $p^0 = \sqrt{\mathbf{p}^2 + m^2}$. The operators $a(\mathbf{p})$ and $a^\dagger(\mathbf{p})$ respectively represent the annihilation and creation of the bosons and obey the commutation relations

$$[a(\mathbf{p}), a^\dagger(\mathbf{p}')] = \delta^3(\mathbf{p} - \mathbf{p}') \quad \text{and} \quad [a(\mathbf{p}), a(\mathbf{p}')] = [a^\dagger(\mathbf{p}), a^\dagger(\mathbf{p}')] = 0 \tag{6.64}$$

in units with $\hbar = c = 1$. These relations make the field $\phi(x)$ and its time derivative $\dot\phi(y)$ satisfy the canonical equal-time commutation relations

$$[\phi(\mathbf{x}, t), \dot\phi(\mathbf{y}, t)] = i\, \delta^3(\mathbf{x} - \mathbf{y}) \quad \text{and} \quad [\phi(\mathbf{x}, t), \phi(\mathbf{y}, t)] = [\dot\phi(\mathbf{x}, t), \dot\phi(\mathbf{y}, t)] = 0 \tag{6.65}$$

in which the dot means time derivative.

Example 6.9 (Field of the Photon) The electromagnetic field has four components, but in the Coulomb or radiation gauge $\nabla \cdot \mathbf{A}(x) = 0$, the component $A^0$ is a function of the charge density, and the vector potential $\mathbf{A}$ in the absence of charges and currents satisfies the wave equation $\Box \mathbf{A}(x) = 0$ for a spin-one massless particle. We write it as

$$\mathbf{A}(x) = \sum_{s=1}^{2} \int \frac{d^3p}{\sqrt{2p^0 (2\pi)^3}} \left[ \mathbf{e}(\mathbf{p}, s)\, a(\mathbf{p}, s)\, e^{ipx} + \mathbf{e}^*(\mathbf{p}, s)\, a^\dagger(\mathbf{p}, s)\, e^{-ipx} \right] \tag{6.66}$$

in which the sum is over the two possible polarizations $s$. The energy $p^0$ is equal to the modulus $|\mathbf{p}|$ of the momentum because the photon is massless, $p^2 = 0$. The dot-product of the polarization vectors $\mathbf{e}(\mathbf{p}, s)$ with the momentum vanishes, $\mathbf{p} \cdot \mathbf{e}(\mathbf{p}, s) = 0$, so as to respect the gauge condition

(6.150) was linear in $y$. So we could set $P = \alpha(ry - s)$ and $Q = \alpha$. When $P$ and $Q$ are more complicated, integrating factors are harder to find or nonexistent.

Example 6.23 (Bodies Falling in Air) The downward speed $v$ of a mass $m$ in a gravitational field of constant acceleration $g$ is described by the inhomogeneous first-order ODE $m\, v_t = mg - bv$ in which $b$ represents air resistance. This equation is like (6.150) but with $t$ instead of $x$ as the independent variable, $r = b/m$, and $s = g$. Thus by (6.157), its solution is

$$v(t) = \frac{mg}{b} + \left( v(0) - \frac{mg}{b} \right) e^{-bt/m}. \tag{6.158}$$

The terminal speed $mg/b$ is nearly 200 km/h for a falling man. A diving Peregrine falcon can exceed 320 km/h; so can a falling bullet. But mice can fall down mine shafts and run off unhurt, and insects and birds can fly. If the falling bodies are microscopic, a statistical model is appropriate. The potential energy of a mass $m$ at height $h$ is $V = mgh$. The heights of particles at temperature $T$ K follow Boltzmann's distribution (1.345)

$$P(h) = P(0)\, e^{-mgh/kT} \tag{6.159}$$

in which $k = 1.380\,6504 \times 10^{-23}$ J/K $= 8.617\,343 \times 10^{-5}$ eV/K is his constant. The probability depends exponentially upon the mass $m$ and drops by a factor of $e$ with the scale height $S = kT/mg$, which can be a few kilometers for a small molecule.
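The closed-form speed (6.158) of the falling body can be checked against a direct numerical integration of $m\, v_t = mg - bv$. The sketch below uses a classical fourth-order Runge-Kutta step; the mass, drag coefficient, and step count are hypothetical values chosen only so that the terminal speed $mg/b$ comes out near 200 km/h.

```python
import math

def v_exact(t, v0, m, g, b):
    """Closed-form solution (6.158)."""
    v_terminal = m * g / b
    return v_terminal + (v0 - v_terminal) * math.exp(-b * t / m)

def v_rk4(t_end, v0, m, g, b, steps=1000):
    """Integrate m dv/dt = m g - b v with classical 4th-order Runge-Kutta."""
    def accel(v):
        return g - b * v / m
    h = t_end / steps
    v = v0
    for _ in range(steps):
        k1 = accel(v)
        k2 = accel(v + 0.5 * h * k1)
        k3 = accel(v + 0.5 * h * k2)
        k4 = accel(v + h * k3)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return v

if __name__ == "__main__":
    m, g, b = 80.0, 9.81, 14.0   # kg, m/s^2, kg/s (hypothetical values)
    print("terminal speed in km/h:", 3.6 * m * g / b)
    for t in (1.0, 5.0, 20.0):
        print(t, v_exact(t, 0.0, m, g, b), v_rk4(t, 0.0, m, g, b))
```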

Example 6.24 (R-C Circuit) The capacitance C of a capacitor is the charge Q it holds (on each plate) divided by the applied voltage V, that is, C = Q/V. The current I through the capacitor is the time derivative of the charge, $I = \dot Q = C \dot V$. The voltage across a resistor of R Ω (Ohms) through which a current I flows is V = IR by Ohm’s law. So if a time-dependent voltage V(t) is applied to a capacitor in series with a resistor, then V(t) = Q/C + IR. The current I therefore obeys the first-order differential equation

$$\dot I + \frac{I}{RC} = \frac{\dot V}{R} \tag{6.160}$$

or (6.150) with x → t, y → I, r → 1/RC, and s → $\dot V/R$. Since r is a constant, the integrating factor α(x) → α(t) is

$$\alpha(t) = \alpha(t_0)\, e^{(t - t_0)/RC}. \tag{6.161}$$
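A small numerical check of (6.160) may help. The sketch below assumes a ramp voltage V(t) = slope · t, for which $\dot V$ is constant and the current with I(0) = 0 is I(t) = slope · C (1 − e^{−t/RC}). It integrates (6.160) with a classical fourth-order Runge-Kutta step and compares the result with that closed form. The function names and component values are illustrative assumptions.

```python
import math

def rc_current_rk4(R, C, dVdt, t_end, steps=2000, I0=0.0):
    """Integrate dI/dt = dVdt(t)/R - I/(R*C), equation (6.160), with classical RK4."""
    def rhs(t, I):
        return dVdt(t) / R - I / (R * C)
    h = t_end / steps
    t, I = 0.0, I0
    for _ in range(steps):
        k1 = rhs(t, I)
        k2 = rhs(t + h / 2, I + h * k1 / 2)
        k3 = rhs(t + h / 2, I + h * k2 / 2)
        k4 = rhs(t + h, I + h * k3)
        I += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return I

def ramp_current_exact(R, C, slope, t):
    """Closed form for V(t) = slope*t and I(0) = 0."""
    return slope * C * (1.0 - math.exp(-t / (R * C)))

if __name__ == "__main__":
    R, C, slope = 1.0e3, 1.0e-6, 5.0   # hypothetical values: ohms, farads, volts/second
    t_end = 3 * R * C
    print(rc_current_rk4(R, C, lambda t: slope, t_end))
    print(ramp_current_exact(R, C, slope, t_end))
```

The two printed numbers should agree closely, which is a quick way to confirm the sign conventions in (6.160).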

Our general solution (6.157) of linear first-order ODEs gives us the expres-

in which $a_0 \neq 0$ is the coefficient of the lowest power of x in y(x). Differentiating, we have

$$y'(x) = \sum_{n=0}^{\infty} (r + n)\, a_n\, x^{r+n-1} \tag{6.183}$$

and

$$y''(x) = \sum_{n=0}^{\infty} (r + n)(r + n - 1)\, a_n\, x^{r+n-2}. \tag{6.184}$$

When we substitute the three series (6.182–6.184) into our differential equation $x^2 y'' + x p(x) y' + q(x) y = 0$, we find

$$\sum_{n=0}^{\infty} \left[ (n + r)(n + r - 1) + (n + r)\, p(x) + q(x) \right] a_n\, x^{n+r} = 0. \tag{6.185}$$

If this equation is to be satisfied for all x, then the coefficient of every power of x must vanish. The lowest power of x is $x^r$, and it occurs when n = 0 with coefficient $[r(r - 1 + p(0)) + q(0)]\, a_0$. Thus since $a_0 \neq 0$, we have

$$r(r - 1 + p(0)) + q(0) = 0. \tag{6.186}$$

This quadratic indicial equation has two roots $r_1$ and $r_2$. To analyze higher powers of x, we introduce the notation

$$p(x) = \sum_{j=0}^{\infty} p_j\, x^j \quad\text{and}\quad q(x) = \sum_{j=0}^{\infty} q_j\, x^j \tag{6.187}$$

in which $p_0 = p(0)$ and $q_0 = q(0)$. The requirement (exercise 6.16) that the coefficient of $x^{r+k}$ vanish gives us a recurrence relation

$$a_k = - \frac{1}{(r + k)(r + k - 1 + p_0) + q_0} \sum_{j=0}^{k-1} \left[ (j + r)\, p_{k-j} + q_{k-j} \right] a_j \tag{6.188}$$

that expresses $a_k$ in terms of $a_0, a_1, \ldots, a_{k-1}$. When p(x) and q(x) are polynomials of low degree, these equations become much simpler.

Example 6.28 (Sines and Cosines) To apply Frobenius’s method to the ODE $y'' + \omega^2 y = 0$, we first write it in the form $x^2 y'' + x p(x) y' + q(x) y = 0$ in which p(x) = 0 and $q(x) = \omega^2 x^2$. So both $p(0) = p_0 = 0$ and $q(0) = q_0 = 0$, and the indicial equation (6.186) is r(r − 1) = 0 with roots $r_1 = 0$ and $r_2 = 1$. We first set $r = r_1 = 0$. Since the p’s and q’s vanish except for $q_2 = \omega^2$, the recurrence relation (6.188) is $a_k = - q_2\, a_{k-2}/k(k-1) = - \omega^2 a_{k-2}/k(k-1)$. Thus $a_2 = - \omega^2 a_0/2$, and $a_{2n} = (-1)^n \omega^{2n} a_0/(2n)!$. The recurrence relation (6.188) gives no information about $a_1$, so to find the simplest solution, we set $a_1 = 0$. The recurrence relation $a_k = - \omega^2 a_{k-2}/k(k-1)$ then makes all the terms $a_{2n+1}$ of odd index vanish. Our solution for the first root $r_1 = 0$ then is

$$y(x) = \sum_{n=0}^{\infty} a_n\, x^n = a_0 \sum_{n=0}^{\infty} (-1)^n \frac{(\omega x)^{2n}}{(2n)!} = a_0 \cos \omega x. \tag{6.189}$$

Similarly, the recurrence relation (6.188) for the second root $r_2 = 1$ is $a_k = - \omega^2 a_{k-2}/k(k+1)$, so that $a_{2n} = (-1)^n \omega^{2n} a_0/(2n+1)!$, and we again set all the terms of odd index equal to zero. Thus we have

$$y(x) = x \sum_{n=0}^{\infty} a_n\, x^n = \frac{a_0}{\omega} \sum_{n=0}^{\infty} (-1)^n \frac{(\omega x)^{2n+1}}{(2n+1)!} = \frac{a_0}{\omega} \sin \omega x \tag{6.190}$$

as our solution for the second root $r_2 = 1$.

Frobenius’s method sometimes shows that solutions exist only when a parameter in the ODE assumes a special value called an eigenvalue.
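The recurrence relation (6.188) is easy to iterate by machine. The following Python sketch is one possible transcription, assuming that p(x) and q(x) are supplied as lists of their power-series coefficients $p_j$ and $q_j$; the function names are hypothetical. It is applied to the second root r = 1 of example 6.28, for which the denominator in (6.188) never vanishes. For the smaller root r = 0 the denominator is zero at k = 1, which is the numerical face of the remark that (6.188) gives no information about $a_1$.

```python
import math

def frobenius_coeffs(r, p, q, kmax, a0=1.0):
    """Iterate the recurrence (6.188) for a_1 ... a_kmax.

    p and q are lists [p_0, p_1, ...] and [q_0, q_1, ...] of series coefficients;
    entries beyond the end of a list are treated as zero.
    """
    def c(series, j):
        return series[j] if j < len(series) else 0.0

    a = [a0]
    for k in range(1, kmax + 1):
        denom = (r + k) * (r + k - 1 + c(p, 0)) + c(q, 0)
        if denom == 0:
            raise ZeroDivisionError("recurrence (6.188) is singular at k = %d" % k)
        total = sum(((j + r) * c(p, k - j) + c(q, k - j)) * a[j] for j in range(k))
        a.append(-total / denom)
    return a

def frobenius_series(x, r, a):
    """Evaluate y(x) = x^r * sum_n a_n x^n for a truncated list of coefficients."""
    return x**r * sum(an * x**n for n, an in enumerate(a))

if __name__ == "__main__":
    omega = 2.0
    a = frobenius_coeffs(1, [0.0], [0.0, 0.0, omega**2], 30)
    x = 0.7
    print(frobenius_series(x, 1, a), math.sin(omega * x) / omega)
```

The two printed values should agree to rounding error, reproducing (6.190) with $a_0 = 1$.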

Example 6.29 (Legendre’s Equation) If one rewrites Legendre’s equation $(1 - x^2) y'' - 2x y' + \lambda y = 0$ as $x^2 y'' + x p y' + q y = 0$, then one finds $p(x) = -2x^2/(1 - x^2)$ and $q(x) = x^2 \lambda/(1 - x^2)$, which are analytic but not polynomials. In this case, it is simpler to substitute the expansions (6.182–6.184) directly into Legendre’s equation $(1 - x^2) y'' - 2x y' + \lambda y = 0$. We then find

$$\sum_{n=0}^{\infty} \left[ (n + r)(n + r - 1)(1 - x^2)\, x^{n+r-2} - 2(n + r)\, x^{n+r} + \lambda\, x^{n+r} \right] a_n = 0.$$

The coefficient of the lowest power of x is $r(r - 1)\, a_0$, and so the indicial equation is r(r − 1) = 0. For r = 0, we shift the index n on the term $n(n - 1)\, x^{n-2} a_n$ to n = j + 2 and replace n by j in the other terms:

$$\sum_{j=0}^{\infty} \left\{ (j + 2)(j + 1)\, a_{j+2} - \left[ j(j - 1) + 2j - \lambda \right] a_j \right\} x^j = 0. \tag{6.191}$$

Since the coefficient of $x^j$ must vanish, we get the recursion relation

$$a_{j+2} = \frac{j(j + 1) - \lambda}{(j + 2)(j + 1)}\, a_j \tag{6.192}$$

which for big j says that $a_{j+2} \approx a_j$. Thus the series (6.182) does not converge for |x| ≥ 1 unless λ = j(j + 1) for some integer j in which case the series (6.182) is a Legendre polynomial (chapter 8).

(Carl Neumann, 1832–1925).
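To see the termination of the series concretely, one can iterate the recursion (6.192) in exact rational arithmetic. The sketch below is illustrative only: the function name and the `parity` argument (which selects whether the series starts from $a_0 = 1$ or from $a_1 = 1$, assuming $a_1$ may be chosen freely when r = 0) are assumptions of this example.

```python
from fractions import Fraction

def legendre_series_coeffs(lam, parity, jmax):
    """Coefficients a_0 ... a_jmax of the r = 0 series from recursion (6.192).

    parity = 0 starts from a_0 = 1 (even series);
    parity = 1 starts from a_1 = 1 (odd series).
    """
    a = [Fraction(0)] * (jmax + 1)
    a[parity] = Fraction(1)
    for j in range(parity, jmax - 1, 2):
        a[j + 2] = Fraction(j * (j + 1) - lam, (j + 2) * (j + 1)) * a[j]
    return a

if __name__ == "__main__":
    # lambda = 2*3 = 6: the even series stops after the x^2 term.
    print(legendre_series_coeffs(6, 0, 8))
    # lambda = 5 is not of the form j(j+1): the coefficients never vanish.
    print(legendre_series_coeffs(5, 0, 8))
```

For λ = 6 the series is $1 - 3x^2$, which is proportional to the Legendre polynomial $P_2(x) = (3x^2 - 1)/2$; for λ = 5 the series does not terminate.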

6.27 Self-Adjoint Differential Operators

If p(x) and q(x) are real, then the differential operator

$$L = - \frac{d}{dx} \left( p(x) \frac{d}{dx} \right) + q(x) \tag{6.233}$$

is formally self adjoint. Such operators are interesting because if we take any two functions u and v that are twice differentiable on an interval [a, b] and integrate v L u twice by parts over the interval, we get

$$\begin{aligned}
(v, L\,u) = \int_a^b v\, L u \, dx &= \int_a^b v \left[ - (p u')' + q u \right] dx \\
&= \int_a^b \left[ p u' v' + u q v \right] dx - \left[ v p u' \right]_a^b \\
&= \int_a^b \left[ - (p v')' + q v \right] u \, dx + \left[ p u v' - v p u' \right]_a^b \\
&= \int_a^b (L v)\, u \, dx + \left[ p (u v' - v u') \right]_a^b
\end{aligned} \tag{6.234}$$

which is Green’s formula

$$\int_a^b (v\, L u - u\, L v)\, dx = \left[ p (u v' - v u') \right]_a^b = \left[ p\, W(u, v) \right]_a^b \tag{6.235}$$

(George Green, 1793–1841). Its differential form is Lagrange’s identity

$$v\, L u - u\, L v = \left[ p\, W(u, v) \right]' \tag{6.236}$$

(Joseph-Louis Lagrange, 1736–1813). Thus if the twice-differentiable functions u and v satisfy boundary conditions at x = a and x = b that make the boundary term (6.235) vanish

$$\left[ p (u v' - v u') \right]_a^b = \left[ p\, W(u, v) \right]_a^b = 0 \tag{6.237}$$

then the real differential operator L is symmetric

$$(v, L\,u) = \int_a^b v\, L u \, dx = \int_a^b u\, L v \, dx = (u, L\,v). \tag{6.238}$$

A real linear operator A that acts in a real vector space and satisfies the analogous relation (1.161)

$$(g, A f) = (f, A g) \tag{6.239}$$

equations, we see that the real part of $\psi_i$ satisfies them, and by subtracting them, we see that the imaginary part of $\psi_i$ also satisfies them. So it might seem that $\psi_i = u_i + i v_i$ is made of two real eigenfunctions with the same eigenvalue. But each eigenfunction $u_i$ in the domain D satisfies two homogeneous boundary conditions as well as its second-order differential equation

$$- (p\, u_i')' + q\, u_i = \lambda_i\, \rho\, u_i \tag{6.325}$$

and so $u_i$ is the unique solution in D to this equation. There can be no other eigenfunction in D with the same eigenvalue. In a regular Sturm-Liouville system, there is no degeneracy. All the eigenfunctions $u_i$ are orthogonal and can be normalized on the interval [a, b] with weight function ρ(x)

$$\int_a^b u_j^*\, \rho\, u_i \, dx = \delta_{ij}. \tag{6.326}$$

They may be taken to be real.

It is true that the eigenfunctions of a second-order differential equation come in pairs because one can use Wronski’s formula (6.268)

$$y_2(x) = y_1(x) \int^x \frac{dx'}{p(x')\, y_1^2(x')} \tag{6.327}$$

to find a linearly independent second solution with the same eigenvalue. But the second solutions don’t obey the boundary conditions of the domain. Bessel functions of the second kind, for example, are infinite at the origin.

A set of eigenfunctions $u_i$ is complete in the mean in a space S of functions if every function f ∈ S can be represented as a series

$$f(x) = \sum_{i=1}^{\infty} a_i\, u_i(x) \tag{6.328}$$

(called a Fourier series) that converges in the mean, that is

$$\lim_{N \to \infty} \int_a^b \left| f(x) - \sum_{i=1}^{N} a_i\, u_i(x) \right|^2 \rho(x)\, dx = 0. \tag{6.329}$$

The natural space S is the space $L_2(a, b)$ of all functions f that are square-integrable on the interval [a, b]

$$\int_a^b |f(x)|^2\, \rho(x)\, dx < \infty. \tag{6.330}$$

The orthonormal eigenfunctions of every regular Sturm-Liouville system on an interval [a, b] are complete in the mean in $L_2(a, b)$. The completeness of

we write G(x − y) in terms of the complete set of eigenfunctions $u_k$ as

$$G(x - y) = \sum_{k=1}^{\infty} \frac{u_k(x)\, u_k(y)}{\lambda_k} \tag{6.410}$$

so that the action $L u_k = \lambda_k \rho\, u_k$ turns G into

$$L\, G(x - y) = \sum_{k=1}^{\infty} \frac{L u_k(x)\, u_k(y)}{\lambda_k} = \sum_{k=1}^{\infty} \rho(x)\, u_k(x)\, u_k(y) = \delta(x - y) \tag{6.411}$$

our α = 1 series expansion (6.374) of the delta function.

6.39 Green’s Functions in One Dimension

In one dimension, we can explicitly solve the inhomogeneous ordinary differential equation L f(x) = g(x) in which

$$L = - \frac{d}{dx} \left( p(x) \frac{d}{dx} \right) + q(x) \tag{6.412}$$

is formally self adjoint. We’ll build a Green’s function from two solutions u and v of the homogeneous equation L u(x) = L v(x) = 0 as

$$G(x, y) = \frac{1}{A} \left[ \theta(x - y)\, u(y)\, v(x) + \theta(y - x)\, u(x)\, v(y) \right] \tag{6.413}$$

in which θ(x) = (x + |x|)/(2|x|) is the Heaviside step function (Oliver Heaviside 1850–1925), and A is a constant which we’ll presently identify. We’ll show that the expression

$$f(x) = \int_a^b G(x, y)\, g(y)\, dy = \frac{v(x)}{A} \int_a^x u(y)\, g(y)\, dy + \frac{u(x)}{A} \int_x^b v(y)\, g(y)\, dy$$

solves our inhomogeneous equation. Differentiating, we find after a cancellation

$$f'(x) = \frac{v'(x)}{A} \int_a^x u(y)\, g(y)\, dy + \frac{u'(x)}{A} \int_x^b v(y)\, g(y)\, dy. \tag{6.414}$$

Differentiating again, we have

$$\begin{aligned}
f''(x) &= \frac{v''(x)}{A} \int_a^x u(y)\, g(y)\, dy + \frac{u''(x)}{A} \int_x^b v(y)\, g(y)\, dy + \frac{v'(x)\, u(x)\, g(x)}{A} - \frac{u'(x)\, v(x)\, g(x)}{A} \\
&= \frac{v''(x)}{A} \int_a^x u(y)\, g(y)\, dy + \frac{u''(x)}{A} \int_x^b v(y)\, g(y)\, dy + \frac{W(x)}{A}\, g(x)
\end{aligned} \tag{6.415}$$

in which W(x) is the wronskian W(x) = u(x) v′(x) − u′(x) v(x). The result (6.266) for the wronskian of two linearly independent solutions of a self-adjoint homogeneous ODE gives us $W(x) = W(x_0)\, p(x_0)/p(x)$. We set the constant $A = - W(x_0)\, p(x_0)$ so that the last term in (6.415) is − g(x)/p(x). It follows that

$$L f(x) = \frac{[L v(x)]}{A} \int_a^x u(y)\, g(y)\, dy + \frac{[L u(x)]}{A} \int_x^b v(y)\, g(y)\, dy + g(x) = g(x). \tag{6.416}$$

But L u(x) = L v(x) = 0, so we see that f satisfies our inhomogeneous equation L f(x) = g(x).
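A concrete instance of the construction (6.413) may be useful. The sketch below assumes the simplest case p = 1 and q = 0, so that L = −d²/dx², on the interval [0, 1] with the homogeneous solutions u(x) = x and v(x) = 1 − x. These choices (and the function names) are assumptions of the example rather than anything fixed by the text. Here W = u v′ − u′ v = −1, and so A = −W p = 1. The integral of G against g is done with a plain midpoint rule.

```python
import math

def green(x, y):
    """G(x, y) of (6.413) for L = -d^2/dx^2 on [0, 1], with u(x) = x and v(x) = 1 - x.

    With p = 1 the wronskian is W = u v' - u' v = -1, so A = -W p = 1.
    """
    u = lambda s: s
    v = lambda s: 1.0 - s
    return u(y) * v(x) if y < x else u(x) * v(y)

def solve(g, x, n=2000):
    """Approximate f(x) = integral over [0, 1] of G(x, y) g(y) dy by the midpoint rule."""
    h = 1.0 / n
    return h * sum(green(x, (i + 0.5) * h) * g((i + 0.5) * h) for i in range(n))

if __name__ == "__main__":
    # For g = 1 the solution of -f'' = 1 with f(0) = f(1) = 0 is x(1 - x)/2.
    print(solve(lambda y: 1.0, 0.3), 0.3 * 0.7 / 2)
    # For g = sin(pi y) it is sin(pi x)/pi^2.
    print(solve(lambda y: math.sin(math.pi * y), 0.3), math.sin(0.3 * math.pi) / math.pi**2)
```

Because u vanishes at 0 and v at 1, the f built this way also satisfies f(0) = f(1) = 0, which is why the comparison functions above carry those boundary values.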

6.40 Nonlinear Differential Equations

The field of nonlinear differential equations is too vast to cover here, but we may hint at some of its features by considering some examples from cosmology and particle physics. The Friedmann equations of general relativity (11.410 & 11.412) for the scale factor a(t) of a homogeneous, isotropic universe are

$$\frac{\ddot a}{a} = - \frac{4\pi G}{3} (\rho + 3p) \quad\text{and}\quad \left( \frac{\dot a}{a} \right)^2 = \frac{8\pi G}{3}\, \rho - \frac{k}{a^2} \tag{6.417}$$

in which k respectively is 1, 0, and −1 for closed, flat, and open geometries. (The scale factor a(t) tells how much space has expanded or contracted by the time t.) These equations become more tractable when the energy density ρ is due to a single constituent whose pressure p is related to it by an equation of state p = wρ. Conservation of energy $\dot\rho = - 3(\rho + p)\, \dot a/a$ (11.426–11.431) then ensures (exercise 6.30) that the product $\rho\, a^{3(1+w)}$ is

[Figure: plot titled “Soliton” of φ(x) = tanh x against x for −10 ≤ x ≤ 10.]

Figure 6.4 The field φ(x) of the soliton (6.435) at rest (v = 0) at position $x_0 = 0$ for $\lambda = 1 = \phi_0$. The energy density of the field vanishes when $\phi = \pm\phi_0 = \pm 1$. The energy of this soliton is concentrated at x = 0.

in which C is a constant of integration.

The equations of particle physics are nonlinear. Physicists usually use perturbation theory to cope with the nonlinearities. But occasionally they focus on the nonlinearities and treat the fields classically or semi-classically. To keep things relatively simple, we’ll work in a space-time of only two dimensions and consider a model field theory described by the action density

$$\mathcal{L} = \tfrac{1}{2} \left( \dot\phi^2 - \phi'^2 \right) - V(\phi) \tag{6.427}$$

in which V is a simple function of the field φ. Lagrange’s equation for this theory is

$$\ddot\phi - \phi'' = - \frac{dV}{d\phi}. \tag{6.428}$$

We can convert this partial differential equation to an ordinary one by making the field φ depend only upon the combination u = x − vt rather than upon both x and t. We then have $\dot\phi = - v\, \phi_u$. With this restriction to traveling-wave solutions, Lagrange’s equation reduces to

$$(1 - v^2)\, \phi_{uu} = \frac{dV}{d\phi}. \tag{6.429}$$
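A quick numerical check of the traveling-wave equation (6.429) is sketched below. It assumes the quartic potential $V(\phi) = (\lambda^2/2)(\phi^2 - \phi_0^2)^2$ and the tanh profile that Fig. 6.4 shows for the soliton; both are taken here as working assumptions, and the function names and step size are illustrative. The second derivative is approximated by a central difference, so the residual is small but not exactly zero.

```python
import math

def phi(u, v=0.5, lam=1.0, phi0=1.0):
    """Trial profile phi0 * tanh(lam * phi0 * u / sqrt(1 - v^2))."""
    return phi0 * math.tanh(lam * phi0 * u / math.sqrt(1.0 - v * v))

def residual(u, v=0.5, lam=1.0, phi0=1.0, h=1e-4):
    """(1 - v^2) phi_uu - dV/dphi for V = (lam^2 / 2) (phi^2 - phi0^2)^2."""
    f = phi(u, v, lam, phi0)
    # Central-difference estimate of the second derivative with respect to u.
    phi_uu = (phi(u + h, v, lam, phi0) - 2.0 * f + phi(u - h, v, lam, phi0)) / h**2
    dV = 2.0 * lam**2 * f * (f * f - phi0 * phi0)
    return (1.0 - v * v) * phi_uu - dV

if __name__ == "__main__":
    for u in (-2.0, -0.5, 0.0, 0.3, 1.5):
        print(u, residual(u))
```

The residuals should be at the level of the finite-difference error for any speed |v| < 1, which suggests that the tanh profile does solve (6.429) for this potential.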

We multiply both sides of this equation by $\phi_u$

$$(1 - v^2)\, \phi_u\, \phi_{uu} = \frac{dV}{d\phi}\, \phi_u \tag{6.430}$$

and integrate both sides to get $(1 - v^2)\, \tfrac{1}{2} \phi_u^2 = V + E$ in which E is a constant of integration

$$E = \tfrac{1}{2} (1 - v^2)\, \phi_u^2 - V(\phi). \tag{6.431}$$

We can convert (exercise 6.37) this equation into a problem of integration

$$u - u_0 = \int \frac{\sqrt{1 - v^2}}{\sqrt{2 (E + V(\phi))}}\, d\phi. \tag{6.432}$$

By inverting the resulting equation relating u to φ, we may find the soliton solution $\phi(u - u_0)$, which is a lump of energy traveling with speed v.

Example 6.48 (Soliton of the φ⁴ Theory) To simplify the integration (6.432), we take as the action density

$$\mathcal{L} = \tfrac{1}{2} \left( \dot\phi^2 - \phi'^2 \right) - \left[ \frac{\lambda^2}{2} \left( \phi^2 - \phi_0^2 \right)^2 - E \right]. \tag{6.433}$$

Our formal solution (6.432) gives

$$u - u_0 = \pm \int \frac{\sqrt{1 - v^2}}{\lambda \left( \phi^2 - \phi_0^2 \right)}\, d\phi = \mp \frac{\sqrt{1 - v^2}}{\lambda \phi_0} \tanh^{-1}(\phi/\phi_0) \tag{6.434}$$

or

$$\phi(x - vt) = \mp \phi_0 \tanh \left[ \lambda \phi_0\, \frac{x - x_0 - v (t - t_0)}{\sqrt{1 - v^2}} \right] \tag{6.435}$$

which is a soliton (or an antisoliton) at $x_0 + v(t - t_0)$. A unit soliton at rest is plotted in Fig. 6.4. Its energy is concentrated at x = 0 where $|\phi^2 - \phi_0^2|$ is maximal.

Exercises

6.1 In rectangular coordinates, the curl of a curl is by definition (6.40)

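$$(\nabla \times (\nabla \times E))_i = \sum_{j,k=1}^{3} \epsilon_{ijk}\, \partial_j (\nabla \times E)_k = \sum_{j,k,\ell,m=1}^{3} \epsilon_{ijk}\, \partial_j\, \epsilon_{k\ell m}\, \partial_\ell E_m. \tag{6.436}$$

Use Levi-Civita’s identity (1.449) to show that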

∇×(∇×E) = ∇(∇·E) −△E. (6.437)

This formula defines △E in any system of orthogonal coordinates.

6.2 Show that since the Bessel function Jn(x) satisfies Bessel’s equation (6.48), the function Pn(ρ)=Jn(kρ) satisfies (6.47). 6.3 Show that (6.58) implies that Rk,ℓ(r)=jℓ(kr) satisfies (6.57). Φ′′ 2Φ 6.4 Use (6.56, 6.57), and m = m m to show in detail that the product − 2 f(r, θ, φ)=Rk,ℓ(r) Θℓ,m(θ) Φm(φ) satisfies −△f = k f. 6.5 Replacing Helmholtz’s k2 by 2m(E − V (r))/!2, we get Schr¨odinger’s equation

−(!2/2m)△ψ(r, θ, φ)+V (r)ψ(r, θ, φ)=Eψ(r, θ, φ). (6.438)

imφ Let ψ(r, θ, φ)=Rn,ℓ(r)Θℓ,m(θ)e in which Θℓ,m satisfies (6.56) and show that the radial function Rn,ℓ must obey − 2 ′ ′ 2 2 !2 !2 r Rn,ℓ /r + ℓ(ℓ + 1)/r +2mV/ Rn,ℓ =2mEn,ℓ Rn,ℓ/ . % & ! " (6.439) 6.6 Use the empty-space Maxwell’s equations ∇·B = 0, ∇×E + B˙ = 0, ∇·E = 0, and ∇×B − E/c˙ 2 = 0 and the formula (6.437) to show that in vacuum △E = E¨/c2 and △B = B¨/c2. a b 6.7 Argue from symmetry and anti-symmetry that [γ , γ ]∂a∂b =0inwhich the sums over a and b run from 0 to 3. 6.8 Suppose a voltage V (t)=V sin(ωt) is applied to a resistor of R (Ω)in series with a capacitor of capacitance C (F ). If the current through the circuit at time t = 0 is zero, what is the current at time t? 6.9 (a) Is 1+x2 + y2 −3/2 (1 + y2)ydx+(1+x2)xdy = 0 exact? (b) Find its% general integral& ! and solution y(x). Use section" 6.11. 6.10 (a) Separate the variables of the ODE (1 + y2)ydx+(1+x2)xdy = 0. (b) Find its general integral and solution y(x). 6.11 Find the general solution to the differential equation y′ + y/x = c/x. 2 6.12 Find the general solution to the differential equation y′ +xy = ce−x /2. Exercises 315 6.13 James Bernoulli studied ODEs of the form y′ + py = qyn in which p and q are functions of x. Division by yn and the substitution v = y1−n gives us the equation v′ +(1 n)pv =(1 n) q which is soluble as shown − − in section (6.16). Use this method to solve the ODE y′ y/2x =5x2y5. − 6.14 Integrate the ODE (xy + 1) dx +2x2(2xy 1) dy = 0. Hint: Use the − variable v(x)=xy(x) instead of y(x). 6.15 Show that the points x = 1 and are regular singular points of ± ∞ Legendre’s equation (6.181). 6.16 Use the vanishing of the coefficient of every power of x in (6.185) and the notation (6.187) to derive the recurrence relation (6.188). 6.17 In example 6.29, derive the recursion relation for r = 1 and discuss the resulting eigenvalue equation. 6.18 In example 6.29, show that the solutions associated with the roots r = 0 and r = 1 are the same. 
6.19 For a hydrogen atom, we set V (r)= e2/4πϵ r q2/r in (6.439) − 0 ≡− and get (r2R′ )′ + (2m/!2) E + Zq2/r r2 ℓ(ℓ + 1) R = 0. So n,ℓ n,ℓ − + n,ℓ at big r, R′′ !2mE R% /!2 and R & exp( 2mE r/!). n,ℓ ≈− n,ℓ n,ℓ n,ℓ ∼ − − n,ℓ At tiny r,(r2R′ )′ ℓ(ℓ + 1)R and R (r) rℓ!.SetR (r)= n,ℓ ≈ n,ℓ n,ℓ ∼ n,ℓ rℓ exp( 2mE r/!)P (r) and apply the method of Frobenius to − − n,ℓ n,ℓ find the! values of En,ℓ for which Rn,ℓ is suitably normalizable. 6.20 Show that as long as the matrix = y(ℓj )(x ) is nonsingular, the n Ykj k j boundary conditions n (ℓj ) (ℓj ) bj = y (xj)= ck yk (xj) (6.440) )k=1

determine the n coefficients ck of the expansion (6.222) to be n CT = BT −1 or C = b −1. (6.441) Y k jYjk )j=1

6.21 Show that if the real and imaginary parts u1, u2, v1, and v2 of ψ and χ satisfy boundary conditions at x = a and x = b that make the boundary term (6.235) vanish, then its complex analog (6.242) also vanishes. 6.22 Show that if the real and imaginary parts u1, u2, v1, and v2 of ψ and χ satisfy boundary conditions at x = a and x = b that make the boundary term (6.235) vanish, and if the differential operator L is real and self adjoint, then (6.238) implies (6.243). 6.23 Show that if D is the set of all twice-differentiable functions u(x) on [a, b] that satisfy Dirichlet’s boundary conditions (6.245) and if the function p(x) is continuous and positive on [a, b], then the adjoint set 8.7 Schlaefli’s Integral 335 The generating function g(t, x) is even under the reflection of both inde- pendent variables, so ∞ ∞ g(t, x)= tn P (x)= ( t)n P ( x)=g( t, x) (8.51) ) n ) − n − − − n=0 n=0 which implies that P ( x)=( 1)n P (x)whenceP (0) = 0. (8.52) n − − n 2n+1 With more effort, one can show that (2n 1)!! P (0) = ( 1)n − and that P (x) 1. (8.53) 2n − (2n)!! | n | ≤

8.7 Schlaefli’s Integral

Schlaefli used Cauchy’s integral formula (5.36) and Rodrigues’s formula

$$P_n(x) = \frac{1}{2^n\, n!} \left( \frac{d}{dx} \right)^n (x^2 - 1)^n \tag{8.54}$$

to express $P_n(z)$ as a counterclockwise contour integral around the point z

$$P_n(z) = \frac{1}{2^n\, 2\pi i} \oint \frac{(z'^2 - 1)^n}{(z' - z)^{n+1}}\, dz'. \tag{8.55}$$
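Rodrigues’s formula (8.54) can be carried out mechanically on polynomial coefficients. The Python sketch below expands $(x^2 - 1)^n$ with the binomial theorem, differentiates the coefficient list n times, and divides by $2^n n!$, all in exact rational arithmetic. The function name and the list-of-coefficients representation are choices made for this illustration.

```python
from fractions import Fraction
from math import comb, factorial

def legendre_rodrigues(n):
    """Coefficients [c_0, c_1, ..., c_n] of P_n(x) from Rodrigues's formula (8.54)."""
    # Binomial expansion: (x^2 - 1)^n = sum_k C(n, k) (-1)^(n-k) x^(2k).
    coeffs = [Fraction(0)] * (2 * n + 1)
    for k in range(n + 1):
        coeffs[2 * k] = Fraction(comb(n, k) * (-1) ** (n - k))
    # Differentiate n times: d/dx of sum c_i x^i is sum i c_i x^(i-1).
    for _ in range(n):
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]
    norm = 2**n * factorial(n)
    return [c / norm for c in coeffs]

if __name__ == "__main__":
    for n in range(5):
        print(n, legendre_rodrigues(n))
```

For n = 2 this returns the coefficients of $(3x^2 - 1)/2$, and the coefficients of each $P_n$ sum to $P_n(1) = 1$.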

8.8 Orthogonal Polynomials

Rodrigues’s formula (8.8) generates other families of orthogonal polynomials. The n-th order polynomials $R_n$ in which the $e_n$ are constants

$$R_n(x) = \frac{1}{e_n\, w(x)} \frac{d^n}{dx^n} \left[ w(x)\, Q^n(x) \right] \tag{8.56}$$

are orthogonal on the interval from a to b with weight function w(x)

$$\int_a^b R_n(x)\, R_k(x)\, w(x)\, dx = N_n\, \delta_{nk} \tag{8.57}$$

as long as Q(x) vanishes at a and b (exercise 8.8)

$$Q(a) = Q(b) = 0. \tag{8.58}$$

Instead, we define $P_{\ell,m}$ in terms of the mth derivative $P_\ell^{(m)}$ as

$$P_{\ell,m}(x) \equiv (1 - x^2)^{m/2}\, P_\ell^{(m)}(x) \tag{8.97}$$

and compute the derivatives

$$\begin{aligned}
P_\ell^{(m+1)} &= \left( P_{\ell,m}' + \frac{m x\, P_{\ell,m}}{1 - x^2} \right) (1 - x^2)^{-m/2} \\
P_\ell^{(m+2)} &= \left[ P_{\ell,m}'' + \frac{2 m x\, P_{\ell,m}'}{1 - x^2} + \frac{m\, P_{\ell,m}}{1 - x^2} + \frac{m (m + 2)\, x^2\, P_{\ell,m}}{(1 - x^2)^2} \right] (1 - x^2)^{-m/2}.
\end{aligned} \tag{8.98}$$

When we put these three expressions in equation (8.94), we get the desired ODE (8.92). Thus the associated Legendre functions are

$$P_{\ell,m}(x) = (1 - x^2)^{m/2}\, P_\ell^{(m)}(x) = (1 - x^2)^{m/2}\, \frac{d^m}{dx^m} P_\ell(x). \tag{8.99}$$

They are simple polynomials in x = cos θ and $\sqrt{1 - x^2} = \sin\theta$

$$P_{\ell,m}(\cos\theta) = \sin^m\theta\, \frac{d^m}{d \cos^m\theta} P_\ell(\cos\theta). \tag{8.100}$$

It follows from Rodrigues’s formula (8.8) for the Legendre polynomial $P_\ell(x)$ that $P_{\ell,m}(x)$ is given by the similar formula

$$P_{\ell,m}(x) = \frac{(1 - x^2)^{m/2}}{2^\ell\, \ell!} \frac{d^{\ell+m}}{dx^{\ell+m}} (x^2 - 1)^\ell \tag{8.101}$$

which tells us that under parity $P_{\ell,m}(x)$ changes by $(-1)^{\ell+m}$

$$P_{\ell,m}(-x) = (-1)^{\ell+m}\, P_{\ell,m}(x). \tag{8.102}$$

Rodrigues’s formula (8.101) for the associated Legendre function makes sense as long as ℓ + m ≥ 0. This last condition is the requirement in quantum mechanics that m not be less than −ℓ. And if m exceeds ℓ, then $P_{\ell,m}(x)$ is given by more than 2ℓ derivatives of a polynomial of degree 2ℓ; so $P_{\ell,m}(x) = 0$ if m > ℓ. This last condition is the requirement in quantum mechanics that m not be greater than ℓ. So we have

$$- \ell \le m \le \ell. \tag{8.103}$$

One may show that

$$P_{\ell,-m}(x) = (-1)^m\, \frac{(\ell - m)!}{(\ell + m)!}\, P_{\ell,m}(x). \tag{8.104}$$

Figure 8.2 CMB temperature fluctuations over the celestial sphere as measured by the Planck satellite. The average temperature is 2.7255 K. White regions are warmer and black ones colder by about 0.0005 degrees. © ESA and the Planck Collaboration.

$P_\ell(\hat n \cdot \hat n')$ of the cosine $\hat n \cdot \hat n'$ in which the polar angles of the unit vectors respectively are θ, φ and θ′, φ′ is the addition theorem

$$\begin{aligned}
P_\ell(\hat n \cdot \hat n') &= \frac{4\pi}{2\ell + 1} \sum_{m=-\ell}^{\ell} Y_{\ell,m}(\theta, \phi)\, Y_{\ell,m}^*(\theta', \phi') \\
&= \frac{4\pi}{2\ell + 1} \sum_{m=-\ell}^{\ell} Y_{\ell,m}^*(\theta, \phi)\, Y_{\ell,m}(\theta', \phi').
\end{aligned} \tag{8.121}$$

Example 8.8 (CMB Radiation) Instruments on the Wilkinson Microwave Anisotropy Probe (WMAP) and Planck satellites in orbit at the Lagrange point L2 (in the Earth’s shadow, $1.5 \times 10^6$ km farther from the Sun) have measured the temperature T(θ, φ) of the cosmic microwave background (CMB) radiation as a function of the polar angles θ and φ in the sky as shown in Fig. 8.2. This radiation is photons last scattered when the visible universe became transparent at an age of 380,000 years and a temperature (3,000 K) cool enough for hydrogen atoms to be stable. This initial transparency is usually (and inexplicably) called recombination.

Figure 8.3 The power spectrum $D_\ell = \ell(\ell + 1) C_\ell/2\pi$ of the CMB temperature fluctuations in µK² as measured by the Planck Collaboration (arXiv:1303.5062) is plotted against the angular size and the multipole moment ℓ. The solid curve is the ΛCDM prediction.

Since the spherical harmonics $Y_{\ell,m}(\theta, \phi)$ are complete on the sphere, we can expand the temperature as

$$T(\theta, \phi) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} a_{\ell,m}\, Y_{\ell,m}(\theta, \phi) \tag{8.122}$$

in which the coefficients are by (8.117)

$$a_{\ell,m} = \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\, d\theta\; Y_{\ell,m}^*(\theta, \phi)\, T(\theta, \phi). \tag{8.123}$$

The average temperature $\bar T$ contributes only to $a_{0,0} = \bar T = 2.7255$ K. The other coefficients describe the difference $\Delta T(\theta, \phi) = T(\theta, \phi) - \bar T$. The angular power spectrum is

$$C_\ell = \frac{1}{2\ell + 1} \sum_{m=-\ell}^{\ell} |a_{\ell,m}|^2. \tag{8.124}$$

If we let the unit vector $\hat n$ point in the direction θ, φ and use the addition theorem (8.121), then we can write the angular power spectrum as

$$C_\ell = \frac{1}{4\pi} \int d^2\hat n \int d^2\hat n'\; P_\ell(\hat n \cdot \hat n')\, T(\hat n)\, T(\hat n'). \tag{8.125}$$

In Fig. 8.3, the measured values (arXiv:1303.5062) of the power spectrum $D_\ell = \ell(\ell + 1)\, C_\ell/2\pi$ are plotted against ℓ for 1 < ℓ < 1300 with the angles and distances decreasing with ℓ. The power spectrum is a snapshot at the moment of initial transparency of the temperature distribution of the plasma of photons, electrons, and nuclei undergoing acoustic oscillations. In these oscillations, gravity opposes radiation pressure, and |∆T(θ, φ)| is maximal both when the oscillations are most compressed and when they are most rarefied. Regions that gravity has squeezed to maximum compression at transparency form the first and highest peak. Regions that have bounced off their first maximal compression and that radiation pressure has expanded to minimum density at transparency form the second peak. Those at their second maximum compression at transparency form the third peak, and so forth.

The solid curve is the prediction of a model with inflation, cold dark matter, and a cosmological constant Λ. In this model, the age of the visible universe is 13.817 Gyr; the Hubble constant is $H_0 = 67.3$ km/(s Mpc); the energy density of the universe is enough to make the universe flat; and the fractions of the energy density due to baryons, dark matter, and dark energy are 4.9%, 26.6%, and 68.5% (Edwin Hubble 1889–1953).
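The definition (8.124) of the angular power spectrum and the rescaled quantity $D_\ell = \ell(\ell+1) C_\ell/2\pi$ plotted in Fig. 8.3 translate directly into code. The sketch below assumes that the coefficients $a_{\ell,m}$ are already available in a dictionary keyed by (ℓ, m); the function names and the sample coefficients are made up for illustration and are not measured values.

```python
import math

def angular_power_spectrum(alm):
    """C_l = sum_m |a_{l,m}|^2 / (2l + 1), equation (8.124).

    alm maps (l, m) to a complex coefficient; missing entries count as zero.
    Returns the list [C_0, C_1, ..., C_lmax].
    """
    lmax = max(l for l, _ in alm)
    return [
        sum(abs(alm.get((l, m), 0.0)) ** 2 for m in range(-l, l + 1)) / (2 * l + 1)
        for l in range(lmax + 1)
    ]

def rescaled_spectrum(cl):
    """D_l = l (l + 1) C_l / (2 pi) for each multipole l."""
    return [l * (l + 1) * c / (2.0 * math.pi) for l, c in enumerate(cl)]

if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    alm = {(1, -1): 1 + 1j, (1, 0): 2.0, (1, 1): -1 + 1j, (2, 0): 0.5}
    cl = angular_power_spectrum(alm)
    print(cl)
    print(rescaled_spectrum(cl))
```

The factor 1/(2ℓ + 1) averages over the 2ℓ + 1 values of m, so $C_\ell$ does not depend on the orientation of the coordinate axes.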

Much is known about Legendre functions. The books A Course of Modern Analysis (Whittaker and Watson, 1927, chap. XV) and Methods of Mathematical Physics (Courant and Hilbert, 1955) are outstanding.

Exercises

8.1 Use conditions (8.6) and (8.7) to find P0(x) and P1(x). 8.2 Using the Gram-Schmidt method (section 1.10) to turn the functions n x into a set of functions Ln(x) that are orthonormal on the interval [−1, 1] with inner product (8.2), find Ln(x) for n = 0, 1, 2, and 3. Isn’t Rodrigues’s formula (8.8) easier to use?

8.3 Derive the conditions (8.6–8.7) on the coefficients ak of the Legendre polynomial P (x)=a + a x + + a xn. n 0 1 ··· n 8.4 Use equations (8.6–8.7) to find P3(x) and P4(x). 8.5 In superscript notation (6.19), Leibniz’s rule (4.46) for derivatives of products uv of functions is n n (uv)(n) = u(n−k) v(k). (8.126) ) #k$ k=0 352 Bessel Functions These integrals (exercise 9.8) give J (0) = 0 for n = 0, and J (0) = 1. n ̸ 0 By differentiating the generating function (9.5) with respect to u and identifying the coefficients of powers of u, one finds the recursion relation 2n J − (z)+J (z)= J (z). (9.8) n 1 n+1 z n Similar reasoning after taking the z derivative gives (exercise 9.10) ′ J − (z) J (z)=2J (z). (9.9) n 1 − n+1 n By using the gamma function (section 5.12), one may extend Bessel’s equation (9.4) and its solutions Jn(z) to non-integral values of n ∞ z ν ( 1)m z 2m Jν(z)= − . (9.10) 2 m! Γ(m + ν + 1) 2 - # m#=0 - # Letting z = ax in (9.4), we arrive (exercise 9.11) at the self-adjoint form (6.307) of Bessel’s equation d d n2 x J (ax) + J (ax)=a2xJ (ax). (9.11) − dx # dx n $ x n n In the notation of equation (6.287), p(x)=x, a2 is an eigenvalue, and ρ(x)=x is a weight function. To have a self-adjoint system (section 6.28) on an interval [0,b], we need the boundary condition (6.247)

0= p(J v′ J ′ v) b = x(J v′ J ′ v) b (9.12) n − n 0 n − n 0 $ % $ % for all functions v(x) in the domain D of the system. Since p(x)=x, J0(0) = 1, and Jn(0) = 0 for integers n>0, the terms in this boundary condition vanish at x = 0 as long as the domain consists of functions v(x) that are twice differentiable on the interval [0,b]. To make these terms vanish at x = b, we require that Jn(ab) = 0 and that v(b) = 0. So ab must be a zero zn,m of Jn(z), that is Jn(ab)=Jn(zn,m) = 0. With a = zn,m/b,Bessel’s equation (9.11) is d d n2 z2 x J (z x/b) + J (z x/b)= n,m xJ (z x/b) . (9.13) − dx # dx n n,m $ x n n,m b2 n n,m 2 2 2 For fixed n, the eigenvalue a = zn,m/b is different for each positive integer m. Moreover as m , the zeros z of J (x) rise as mπ as one might →∞ n,m n expect since the leading term of the asymptotic form (9.3) of Jn(x) is pro- portional to cos(x nπ/2 π/4) which has zeros at mπ+(n+1)π/2+π/4. It − − follows that the eigenvalues a2 ≈ (mπ)2/b2 increase without limit as m →∞ in accordance with the general result of section 6.34. It follows then from 9.1 Bessel Functions of the First Kind 357 When α = 0, the Helmholtz equation reduces to the Laplace equation △V = 0 of electrostatics which the simpler functions

$$V_{k,n}(\rho, \phi, z) = J_n(k\rho)\, e^{\pm i n \phi}\, e^{\pm k z} \quad\text{and}\quad V_{k,n}(\rho, \phi, z) = J_n(i k \rho)\, e^{\pm i n \phi}\, e^{\pm i k z} \tag{9.34}$$

satisfy.

The product $i^{-\nu} J_\nu(i k \rho)$ is real and is known as the modified Bessel function

$$I_\nu(k\rho) \equiv i^{-\nu} J_\nu(i k \rho). \tag{9.35}$$

It occurs in various solutions of the diffusion equation $D \triangle \phi = \dot\phi$. The function V(ρ, φ, z) = B(ρ) Φ(φ) Z(z) satisfies

$$\triangle V = \frac{1}{\rho} \left[ (\rho\, V_{,\rho})_{,\rho} + \frac{1}{\rho} V_{,\phi\phi} + \rho\, V_{,zz} \right] = \alpha^2 V \tag{9.36}$$

if B(ρ) obeys Bessel’s equation

$$\rho \frac{d}{d\rho} \left( \rho \frac{dB}{d\rho} \right) - \left[ (\alpha^2 - k^2) \rho^2 + n^2 \right] B = 0 \tag{9.37}$$

and Φ and Z respectively satisfy

$$- \frac{d^2\Phi}{d\phi^2} = n^2\, \Phi(\phi) \quad\text{and}\quad \frac{d^2 Z}{dz^2} = k^2\, Z(z) \tag{9.38}$$

or if B(ρ) obeys the Bessel equation

$$\rho \frac{d}{d\rho} \left( \rho \frac{dB}{d\rho} \right) - \left[ (\alpha^2 + k^2) \rho^2 + n^2 \right] B = 0 \tag{9.39}$$

and Φ and Z satisfy

$$- \frac{d^2\Phi}{d\phi^2} = n^2\, \Phi(\phi) \quad\text{and}\quad \frac{d^2 Z}{dz^2} = - k^2\, Z(z). \tag{9.40}$$

In the first case (9.37 & 9.38), the solution V is

$$V_{k,n}(\rho, \phi, z) = I_n\!\left( \sqrt{\alpha^2 - k^2}\, \rho \right) e^{\pm i n \phi}\, e^{\pm k z} \tag{9.41}$$

while in the second case (9.39 & 9.40), it is

$$V_{k,n}(\rho, \phi, z) = I_n\!\left( \sqrt{\alpha^2 + k^2}\, \rho \right) e^{\pm i n \phi}\, e^{\pm i k z}. \tag{9.42}$$

In both cases, n must be an integer if the solution is to be single valued on the full range of φ from 0 to 2π.

Since $p = (\epsilon_w - \epsilon_\ell)/(\epsilon_w + \epsilon_\ell) > 0$, the principal image charge pq at (0, 0, −h) has the same sign as the charge q and so contributes a positive term proportional to $p q^2$ to the energy. So a lipid slab repels a nearby charge in water no matter what the sign of the charge.

A cell membrane is a phospholipid bilayer. The lipids avoid water and form a 4-nm-thick layer that lies between two 0.5-nm layers of phosphate groups which are electric dipoles. These electric dipoles cause the cell membrane to weakly attract ions that are within 0.5 nm of the membrane.

Example 9.3 (Cylindrical Wave Guides) An electromagnetic wave trav- eling in the z-direction down a cylindrical wave guide looks like − − E einφ ei(kz ωt) and B einφ ei(kz ωt) (9.52) in which E and B depend upon ρ

E = Eρρˆ + Eφφˆ + Ezzˆ and B = Bρρˆ + Bφφˆ + Bzzˆ (9.53) in cylindrical coordinates (section 6.4). If the wave guide is an evacuated, perfectly conducting cylinder of radius r, then on the surface of the wave guide the parallel components of E and the normal component of B must vanish which leads to the boundary conditions

Ez(r)=0,Eφ(r)=0, and Bρ(r)=0. (9.54) Since the E and B fields have subscripts, we will use commas to denote derivatives as in ∂(ρEφ)/∂ρ ≡ (ρEφ),ρ and so forth. In this notation, the vacuum forms ∇ × E = − B˙ and ∇ × B = E˙ /c2 of the Faraday and Maxwell-Amp`ere laws give us (exercise 9.14) the field equations 2 Ez,φ/ρ − ikEφ = iωBρ inBz/ρ − ikBφ = −iωEρ/c 2 ikEρ − Ez,ρ = iωBφ ikBρ − Bz,ρ = −iωEφ/c (9.55) 2 [(ρEφ),ρ − inEρ] /ρ = iωBz [(ρBφ),ρ − inBρ] /ρ = −iωEz/c . Solving them for the ρ and φ components of E and B in terms of their z components (exercise 9.15), we find −ikE + nωB /ρ nkE /ρ + iωB E = z,ρ z E = z z,ρ ρ k2 − ω2/c2 φ k2 − ω2/c2 (9.56) −ikB − nωE /c2ρ nkB /ρ − iωE /c2 B = z,ρ z B = z z,ρ . ρ k2 − ω2/c2 φ k2 − ω2/c2

The fields Ez and Bz obey the wave equations (11.91, exercise 6.6) 2 2 2 2 2 2 −△Ez = −E¨z/c = ω Ez/c and −△Bz = −B¨z/c = ω Bz/c . (9.57) 9.2 Spherical Bessel functions of the first kind 359 at ρ = r as well as the separable wave equations (9.57). The frequencies of ′2 2 2 2 2 the resonant TE modes then are ωn,m,ℓ = c zn,m/r + π ℓ /h . The TM modes are Bz = 0 and " inφ −iωt Ez = Jn(zn,m ρ/r) e sin(πℓz/h) e (9.62)

2 2 2 2 2 with resonant frequencies ωn,m,ℓ = c zn,m/r + π ℓ /h . "

9.2 Spherical Bessel functions of the first kind

If in Bessel’s equation (9.4), one sets n = ℓ + 1/2 and $j_\ell = \sqrt{\pi/2x}\, J_{\ell+1/2}$, then one may show (exercise 9.21) that

$$x^2 j_\ell''(x) + 2x\, j_\ell'(x) + \left[ x^2 - \ell(\ell + 1) \right] j_\ell(x) = 0 \tag{9.63}$$

which is the equation for the spherical Bessel function $j_\ell$.

We saw in example 6.6 that by setting $V(r, \theta, \phi) = R_{k,\ell}(r)\, \Theta_{\ell,m}(\theta)\, \Phi_m(\phi)$ we could separate the variables of Helmholtz’s equation $- \triangle V = k^2 V$ in spherical coordinates

$$\frac{r^2 \triangle V}{V} = \frac{(r^2 R_{k,\ell}')'}{R_{k,\ell}} + \frac{(\sin\theta\, \Theta_{\ell,m}')'}{\sin\theta\, \Theta_{\ell,m}} + \frac{\Phi''}{\sin^2\theta\, \Phi} = - k^2 r^2. \tag{9.64}$$

Thus if $\Phi_m(\phi) = e^{i m \phi}$ so that $\Phi_m'' = - m^2 \Phi_m$, and if $\Theta_{\ell,m}$ satisfies the associated Legendre equation (8.91)

$$\sin\theta \left( \sin\theta\, \Theta_{\ell,m}' \right)' + \left[ \ell(\ell + 1) \sin^2\theta - m^2 \right] \Theta_{\ell,m} = 0 \tag{9.65}$$

then the product $V(r, \theta, \phi) = R_{k,\ell}(r)\, \Theta_{\ell,m}(\theta)\, \Phi_m(\phi)$ will obey (9.64) because in view of (9.63) the radial function $R_{k,\ell}(r) = j_\ell(kr)$ satisfies

$$(r^2 R_{k,\ell}')' + \left[ k^2 r^2 - \ell(\ell + 1) \right] R_{k,\ell} = 0. \tag{9.66}$$

In terms of the spherical harmonic $Y_{\ell,m}(\theta, \phi) = \Theta_{\ell,m}(\theta)\, \Phi_m(\phi)$, the solution is $V(r, \theta, \phi) = j_\ell(kr)\, Y_{\ell,m}(\theta, \phi)$.

Rayleigh’s formula gives the spherical Bessel function

$$j_\ell(x) \equiv \sqrt{\frac{\pi}{2x}}\, J_{\ell+1/2}(x) \tag{9.67}$$

as the ℓth derivative of sin x/x

$$j_\ell(x) = (-1)^\ell\, x^\ell \left( \frac{1}{x} \frac{d}{dx} \right)^\ell \left( \frac{\sin x}{x} \right) \tag{9.68}$$

sin x/x and $j_1(x) = \sin x/x^2 - \cos x/x$. Rayleigh’s formula leads to the recursion relation (exercise 9.22)

$$j_{\ell+1}(x) = \frac{\ell}{x}\, j_\ell(x) - j_\ell'(x) \tag{9.69}$$

with which one can show (exercise 9.23) that the spherical Bessel functions as defined by Rayleigh’s formula do satisfy their differential equation (9.66) with x = kr.

The spherical Bessel functions $j_\ell(kr)$ satisfy the self-adjoint Sturm-Liouville (6.333) equation (9.66)

$$- r^2 j_\ell'' - 2r\, j_\ell' + \ell(\ell + 1)\, j_\ell = k^2 r^2 j_\ell \tag{9.70}$$

with eigenvalue $k^2$ and weight function $\rho = r^2$. If $j_\ell(z_{\ell,n}) = 0$, then the functions $j_\ell(kr) = j_\ell(z_{\ell,n} r/a)$ vanish at r = a and form an orthogonal basis

$$\int_0^a j_\ell(z_{\ell,n} r/a)\, j_\ell(z_{\ell,m} r/a)\, r^2\, dr = \frac{a^3}{2}\, j_{\ell+1}^2(z_{\ell,n})\, \delta_{n,m} \tag{9.71}$$

for a self-adjoint system on the interval [0, a]. Moreover, since as n → ∞ the eigenvalues $k_{\ell,n}^2 = z_{\ell,n}^2/a^2 \approx [(n + \ell/2)\pi]^2/a^2 \to \infty$, the eigenfunctions $j_\ell(z_{\ell,n} r/a)$ also are complete in the mean (section 6.35).

On an infinite interval, the analogous relation is

$$\int_0^\infty j_\ell(kr)\, j_\ell(k'r)\, r^2\, dr = \frac{\pi}{2k^2}\, \delta(k - k'). \tag{9.72}$$
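The orthogonality relation (9.71) can be tested numerically in the simplest case ℓ = 0, where $j_0(x) = \sin x/x$ has its zeros at $z_{0,n} = n\pi$ and $j_1(x) = \sin x/x^2 - \cos x/x$. The sketch below uses a composite Simpson rule; the function names, the radius a = 1.5, and the number of steps are arbitrary choices for the illustration.

```python
import math

def j0(x):
    """Spherical Bessel function j_0(x) = sin(x)/x, with its limit 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def j1(x):
    """Spherical Bessel function j_1(x) = sin(x)/x^2 - cos(x)/x for x != 0."""
    return math.sin(x) / x**2 - math.cos(x) / x

def overlap(n, m, a=1.0, steps=4000):
    """Left side of (9.71) for l = 0 by Simpson's rule; steps must be even."""
    h = a / steps
    def f(r):
        return j0(n * math.pi * r / a) * j0(m * math.pi * r / a) * r * r
    total = f(0.0) + f(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    a = 1.5
    print(overlap(2, 2, a), a**3 / 2 * j1(2 * math.pi) ** 2)  # equal norms
    print(overlap(1, 3, a))                                    # close to zero
```

For ℓ = 0 the integrand reduces to a product of sines, so the check amounts to the familiar orthogonality of sin(nπr/a) on [0, a].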

If we write the spherical Bessel function j0(x) as the integral 1 sin z 1 izx j0(z)= = e dx (9.73) z 2 #−1 and use Rayleigh’s formula (9.68), we may find an integral for jℓ(z) ℓ ℓ 1 ℓ ℓ 1 d sin z ℓ ℓ 1 d 1 izx jℓ(z)=(−1) z =(−1) z e dx #z dz $ # z $ #z dz $ 2 %−1 ℓ 1 − 2 ℓ − ℓ 1 − 2 ℓ ℓ z (1 x ) izx ( i) (1 x ) d izx = ℓ e dx = ℓ ℓ e dx 2 %−1 2 ℓ! 2 %−1 2 ℓ! dx − ℓ 1 ℓ 2 − ℓ − ℓ 1 ( i) izx d (x 1) ( i) izx = e ℓ ℓ dx = Pℓ(x) e dx (9.74) 2 %−1 dx 2 ℓ! 2 %−1 (exercise 9.24) that contains Rodrigues’s formula (8.8) for the Legendre poly- nomial Pℓ(x). With z = kr and x = cos θ, this formula 1 ℓ 1 ikr cos θ i jℓ(kr)= Pℓ(cos θ)e d cos θ (9.75) 2 %−1 Exercises 369

The boundary condition $u_\ell(kr_0) = 0$ fixes the ratio $v_\ell = d_\ell/c_\ell$ of the constants $c_\ell$ and $d_\ell$. Thus for $\ell = 0$, Rayleigh’s formulas (9.68 & 9.104) and the boundary condition say that $kr_0\, u_0(r_0) = c_0 \sin(kr_0) - d_0 \cos(kr_0) = 0$ or $d_0/c_0 = \tan kr_0$. The s-wave then is $u_0(kr) = c_0 \sin(kr - kr_0)/(kr \cos kr_0)$, which tells us that the tangent of the phase shift is $\tan\delta_0(k) = -\tan kr_0$, so that $\delta_0(k) = -kr_0$. By (9.90), the cross-section at low energy is $\sigma \approx 4\pi r_0^2$ or four times the classical value. Similarly, one finds (exercise 9.28) that the tangent of the p-wave phase shift is
$$\tan\delta_1(k) = \frac{kr_0 \cos kr_0 - \sin kr_0}{\cos kr_0 + kr_0 \sin kr_0}. \tag{9.108}$$
For $kr_0 \ll 1$, we have $\delta_1(k) \approx -(kr_0)^3/3$; more generally, the $\ell$th phase shift is $\delta_\ell(k) \approx -(kr_0)^{2\ell+1} / \left\{ (2\ell+1)\, [(2\ell-1)!!]^2 \right\}$ for a potential of range $r_0$ at low energy $k \ll 1/r_0$.
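The low-energy formula for $\delta_\ell(k)$ can be compared with the phase shifts of a hard sphere. The sketch below assumes the standard hard-sphere result $\tan\delta_\ell = j_\ell(kr_0)/n_\ell(kr_0)$ with the convention $n_0(x) = -\cos x/x$, and it builds $j_\ell$ and $n_\ell$ from the three-term recursion $f_{\ell+1} = (2\ell+1) f_\ell/x - f_{\ell-1}$. Both are textbook facts rather than results of this section, and the function names are illustrative.

```python
import math

def double_factorial(n):
    """n!! for odd n >= -1, with (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def bessel_pair(ell, x):
    """Return (j_ell(x), n_ell(x)) by upward recursion from ell = 0 and 1."""
    j_prev, j = math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x
    n_prev, n = -math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x
    if ell == 0:
        return j_prev, n_prev
    for l in range(1, ell):
        j_prev, j = j, (2 * l + 1) / x * j - j_prev
        n_prev, n = n, (2 * l + 1) / x * n - n_prev
    return j, n

def hard_sphere_phase_shift(ell, x):
    """delta_ell for a hard sphere, with x = k * r0 (assumed formula, see lead-in)."""
    j, n = bessel_pair(ell, x)
    return math.atan(j / n)

def low_energy_phase_shift(ell, x):
    """Leading low-energy behavior -(k r0)^(2 ell + 1) / ((2 ell + 1) [(2 ell - 1)!!]^2)."""
    return -x ** (2 * ell + 1) / ((2 * ell + 1) * double_factorial(2 * ell - 1) ** 2)

for ell in range(3):
    x = 0.05
    print(ell, hard_sphere_phase_shift(ell, x), low_energy_phase_shift(ell, x))
```

At $kr_0 = 0.05$ the two columns should differ only by corrections of relative order $(kr_0)^2$.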

Further Reading

A great deal is known about Bessel functions. Students may find Mathematical Methods for Physics and Engineering (Riley et al., 2006) as well as the classics A Treatise on the Theory of Bessel Functions (Watson, 1995), A Course of Modern Analysis (Whittaker and Watson, 1927, chap. XVII), and Methods of Mathematical Physics (Courant and Hilbert, 1955) of special interest.

Exercises

9.1 Show that the series (9.1) for $J_n(\rho)$ satisfies Bessel’s equation (9.4).
9.2 Show that the generating function $\exp(z(u - 1/u)/2)$ for Bessel functions is invariant under the substitution $u \to -1/u$.
9.3 Use the invariance of $\exp(z(u - 1/u)/2)$ under $u \to -1/u$ to show that $J_{-n}(z) = (-1)^n J_n(z)$.
9.4 By writing the generating function (9.5) as the product of the exponentials $\exp(zu/2)$ and $\exp(-z/2u)$, derive the expansion
$$\exp\left[\frac{z}{2}\left(u - u^{-1}\right)\right] = \sum_{n=0}^{\infty} \sum_{m=-n}^{\infty} \left(\frac{z}{2}\right)^{m+n} \frac{u^{m+n}}{(m+n)!} \left(-\frac{z}{2}\right)^{n} \frac{u^{-n}}{n!}. \tag{9.109}$$
9.5 From this expansion (9.109) of the generating function (9.5), derive the power-series expansion (9.1) for $J_n(z)$.
9.6 In the formula (9.5) for the generating function $\exp(z(u - 1/u)/2)$, replace $u$ by $\exp i\theta$ and then derive the integral representation (9.6) for $J_n(z)$. Start with the interval $[-\pi, \pi]$.
9.7 From the general integral representation (9.6) for $J_n(z)$, derive the two integral formulas (9.7) for $J_0(z)$.
9.8 Show that the integral representations (9.6 & 9.7) imply that for any integer $n \neq 0$, $J_n(0) = 0$, while $J_0(0) = 1$.
9.9 By differentiating the generating function (9.5) with respect to $u$ and identifying the coefficients of powers of $u$, derive the recursion relation

$$J_{n-1}(z) + J_{n+1}(z) = \frac{2n}{z}\, J_n(z). \tag{9.110}$$
9.10 By differentiating the generating function (9.5) with respect to $z$ and identifying the coefficients of powers of $u$, derive the recursion relation

$$J_{n-1}(z) - J_{n+1}(z) = 2\, J_n'(z). \tag{9.111}$$
9.11 Change variables to $z = ax$ and turn Bessel’s equation (9.4) into the self-adjoint form (9.11).
9.12 If $y = J_n(ax)$, then equation (9.11) is $(xy')' + (xa^2 - n^2/x)\, y = 0$. Multiply this equation by $xy'$, integrate from 0 to $b$, and so show that if $ab = z_{n,m}$ and $J_n(z_{n,m}) = 0$, then

$$2 \int_0^b x\, J_n^2(ax)\, dx = b^2\, J_n'^{\,2}(z_{n,m}) \tag{9.112}$$
which is the normalization condition (9.14).
9.13 Show that with $\lambda \equiv z^2/r_d^2$, the change of variables $\rho = zr/r_d$ and $u(r) = J_n(\rho)$ turns $-(ru')' + n^2 u/r = \lambda\, r u$ into (9.25).
9.14 Use the formula (6.42) for the curl in cylindrical coordinates and the vacuum forms $\nabla \times E = -\dot B$ and $\nabla \times B = \dot E/c^2$ of the laws of Faraday and Maxwell-Ampère to derive the field equations (9.55).
9.15 Derive equations (9.56) from (9.55).
9.16 Show that $J_n(\sqrt{\omega^2/c^2 - k^2}\, \rho)\, e^{in\phi} e^{i(kz - \omega t)}$ is a traveling-wave solution (9.52) of the wave equations (9.57).
9.17 Find expressions for the nonzero TM fields in terms of the formula (9.59) for $E_z$.
9.18 Show that the TE field $E_z = 0$ and $B_z = J_n(\sqrt{\omega^2/c^2 - k^2}\, \rho)\, e^{in\phi} e^{i(kz - \omega t)}$ will satisfy the boundary conditions (9.54) if $\sqrt{\omega^2/c^2 - k^2}\, r$ is a zero $z'_{n,m}$ of $J_n'$.

9.19 Show that if $\ell$ is an integer and if $\sqrt{\omega^2/c^2 - \pi^2\ell^2/h^2}\, r$ is a zero $z'_{n,m}$ of $J_n'$, then the fields $E_z = 0$ and $B_z = J_n(z'_{n,m}\rho/r)\, e^{in\phi} \sin(\ell\pi z/h)\, e^{-i\omega t}$ satisfy both the boundary conditions (9.54) at $\rho = r$ and those (9.60) at $z = 0$ and $h$ as well as the wave equations (9.57). Hint: Use Maxwell’s equations $\nabla \times E = -\dot B$ and $\nabla \times B = \dot E/c^2$ as in (9.55).
9.20 Show that the resonant frequencies of the TM modes of the cavity of example 9.4 are $\omega_{n,m,\ell} = c \sqrt{z_{n,m}^2/r^2 + \pi^2\ell^2/h^2}$.
9.21 By setting $n = \ell + 1/2$ and $j_\ell = \sqrt{\pi/2x}\, J_{\ell+1/2}$, show that Bessel’s equation (9.4) implies that the spherical Bessel function $j_\ell$ satisfies (9.63).
9.22 Show that Rayleigh’s formula (9.68) implies the recursion relation (9.69).
9.23 Use the recursion relation (9.69) to show by induction that the spherical Bessel functions $j_\ell(x)$ as given by Rayleigh’s formula (9.68) satisfy their differential equation (9.66) which with $x = kr$ is
$$-x^2 j_\ell'' - 2x\, j_\ell' + \ell(\ell+1)\, j_\ell = x^2 j_\ell. \tag{9.113}$$
Hint: start by showing that $j_0(x) = \sin(x)/x$ satisfies this equation. This problem involves some tedium.
9.24 Iterate the trick
$$\begin{aligned}
\frac{d}{z\,dz} \int_{-1}^{1} e^{izx}\, dx &= \frac{i}{z} \int_{-1}^{1} x\, e^{izx}\, dx = \frac{i}{2z} \int_{-1}^{1} e^{izx}\, d(x^2 - 1) \\
&= -\frac{i}{2z} \int_{-1}^{1} (x^2 - 1)\, d e^{izx} = \frac{1}{2} \int_{-1}^{1} (x^2 - 1)\, e^{izx}\, dx
\end{aligned} \tag{9.114}$$
to show that (Schwinger et al., 1998, p. 227)

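$$\left(\frac{d}{z\,dz}\right)^{\ell} \int_{-1}^{1} e^{izx}\, dx = \int_{-1}^{1} \frac{(x^2 - 1)^\ell}{2^\ell\, \ell!}\, e^{izx}\, dx. \tag{9.115}$$
9.25 Use the expansions (9.76 & 9.77) to show that the inner product of the ket $|r\rangle$ that represents a particle at $r$ with polar angles $\theta$ and $\phi$ and the one $|k\rangle$ that represents a particle with momentum $p = \hbar k$ with polar angles $\theta'$ and $\phi'$ is with $k \cdot r = kr\cos\theta$
$$\begin{aligned}
\langle r | k \rangle &= \frac{1}{(2\pi)^{3/2}}\, e^{ikr\cos\theta} = \frac{1}{(2\pi)^{3/2}} \sum_{\ell=0}^{\infty} (2\ell+1)\, P_\ell(\cos\theta)\, i^\ell\, j_\ell(kr) \\
&= \frac{1}{(2\pi)^{3/2}}\, e^{ik\cdot r} = \sqrt{\frac{2}{\pi}} \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} i^\ell\, j_\ell(kr)\, Y_{\ell,m}(\theta,\phi)\, Y^*_{\ell,m}(\theta',\phi').
\end{aligned} \tag{9.116}$$

378 Group Theory

Thus $U(g)$ cannot change $E$ or $j$, and so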

$$\langle E', j', m' | U(g) | E, j, m \rangle = \delta_{E'E}\, \delta_{j'j}\, \langle j, m' | U(g) | j, m \rangle = \delta_{E'E}\, \delta_{j'j}\, D^{(j)}_{m'm}(g). \tag{10.13}$$
The matrix element (10.11) is a single sum over $E$ and $j$ in which the irreducible representations $D^{(j)}_{m'm}(g)$ of the rotation group SU(2) appear
$$\langle \phi | U(g) | \psi \rangle = \sum_{E,j,m',m} \langle \phi | E, j, m' \rangle\, D^{(j)}_{m'm}(g)\, \langle E, j, m | \psi \rangle. \tag{10.14}$$
This is how the block-diagonal form (10.7) usually appears in calculations. The matrices $D^{(j)}_{m'm}(g)$ inherit the unitarity of the operator $U(g)$.

10.4 Subgroups

If all the elements of a group $S$ also are elements of a group $G$, then $S$ is a subgroup of $G$. Every group $G$ has two trivial subgroups: the identity element $e$ and the whole group $G$ itself. Many groups have more interesting subgroups. For example, the rotations about a fixed axis form an abelian subgroup of the group of all rotations in 3-dimensional space.

A subgroup $S \subset G$ is an invariant subgroup if every element $s$ of the subgroup $S$ is left inside the subgroup under the action of every element $g$ of the whole group $G$, that is, if
$$g^{-1} s\, g = s' \in S \quad \text{for all } g \in G. \tag{10.15}$$
This condition often is written as $g^{-1} S g = S$ for all $g \in G$ or as
$$S g = g S \quad \text{for all } g \in G. \tag{10.16}$$
Invariant subgroups also are called normal subgroups.

A set $C \subset G$ is called a conjugacy class if it’s invariant under the action of the whole group $G$, that is, if $Cg = gC$ or
$$g^{-1} C g = C \quad \text{for all } g \in G. \tag{10.17}$$
A subgroup that is the union of a set of conjugacy classes is invariant.

The center $C$ of a group $G$ is the set of all elements $c \in G$ that commute with every element $g$ of the group, that is, their commutators
$$[c, g] \equiv cg - gc = 0 \tag{10.18}$$
vanish for all $g \in G$.

Example 10.8 (Centers Are Abelian Subgroups) Does the center $C$ always form an abelian subgroup of its group $G$? The product $c_1 c_2$ of any two elements $c_1$ and $c_2$ of the center commutes with every element $g$ of $G$ since $c_1 c_2 g = c_1 g c_2 = g c_1 c_2$. So the center is closed under multiplication. The identity element $e$ commutes with every $g \in G$, so $e \in C$. If $c' \in C$, then $c'g = gc'$ for all $g \in G$, and so multiplication of this equation from the left and the right by $c'^{-1}$ gives $g c'^{-1} = c'^{-1} g$, which shows that $c'^{-1} \in C$. The subgroup $C$ is abelian because each of its elements commutes with all the elements of $G$ including those of $C$ itself.

So the center of any group always is one of its abelian invariant subgroups. The center may be trivial, however, consisting either of the identity or of the whole group. But a group with a nontrivial center cannot be simple or semisimple (section 10.23).
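The definitions of invariant subgroup and center are easy to test on a small finite group. The sketch below uses the permutation group $S_3$ as an assumed example (it is not discussed in the text above) and represents each permutation as a tuple. The helper names `compose`, `inverse`, `center`, and `is_invariant` are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Product p*q of two permutations, acting as (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def center(group):
    """All elements c with c*g == g*c for every g in the group, as in (10.18)."""
    return {c for c in group if all(compose(c, g) == compose(g, c) for g in group)}

def is_invariant(subgroup, group):
    """Check the condition (10.15): g^-1 s g stays in the subgroup for all g."""
    return all(
        compose(compose(inverse(g), s), g) in subgroup
        for g in group
        for s in subgroup
    )

S3 = set(permutations(range(3)))       # all 6 permutations of three objects
e = (0, 1, 2)                          # identity
A3 = {e, (1, 2, 0), (2, 0, 1)}         # cyclic permutations
Z2 = {e, (1, 0, 2)}                    # identity and one transposition

print(center(S3))            # only the identity: the center of S3 is trivial
print(is_invariant(A3, S3))  # the cyclic subgroup is invariant
print(is_invariant(Z2, S3))  # a single transposition does not generate an invariant subgroup
print(center(A3) == A3)      # an abelian group is its own center
```

The same functions work unchanged for any finite group given as a set of permutation tuples.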

10.5 Cosets

If $H$ is a subgroup of a group $G$, then for every element $g \in G$ the set of elements $Hg \equiv \{ hg \mid h \in H \}$ is a right coset of the subgroup $H \subset G$. (Here $\subset$ means is a subset of or equivalently is contained in.)

If $H$ is a subgroup of a group $G$, then for every element $g \in G$ the set of elements $gH$ is a left coset of the subgroup $H \subset G$.

The number of elements in a coset is the same as the number of elements of $H$, which is the order of $H$.

An element $g$ of a group $G$ is in one and only one right coset (and in one and only one left coset) of the subgroup $H \subset G$. For suppose instead that $g$ were in two right cosets $g \in Hg_1$ and $g \in Hg_2$, so that $g = h_1 g_1 = h_2 g_2$ for suitable $h_1, h_2 \in H$ and $g_1, g_2 \in G$. Then since $H$ is a (sub)group, we have $g_2 = h_2^{-1} h_1 g_1 = h_3 g_1$, which says that $g_2 \in Hg_1$. But this means that every element $hg_2 \in Hg_2$ is of the form $hg_2 = h h_3 g_1 = h_4 g_1 \in Hg_1$. So every element $hg_2 \in Hg_2$ is in $Hg_1$: the two right cosets are identical, $Hg_1 = Hg_2$.

The right (or left) cosets are the points of the quotient coset space $G/H$.

If $H$ is an invariant subgroup of $G$, then by definition (10.16) $Hg = gH$ for all $g \in G$, and so the left cosets are the same sets as the right cosets. In this case, the coset space $G/H$ is itself a group with multiplication defined

and generate the elements of the group SU(2)
$$\exp\left(i\, \boldsymbol{\theta} \cdot \frac{\boldsymbol{\sigma}}{2}\right) = I \cos\frac{\theta}{2} + i\, \hat{\boldsymbol{\theta}} \cdot \boldsymbol{\sigma}\, \sin\frac{\theta}{2} \tag{10.118}$$
in which $I$ is the $2 \times 2$ identity matrix, $\theta = \sqrt{\boldsymbol{\theta}^2}$ and $\hat{\boldsymbol{\theta}} = \boldsymbol{\theta}/\theta$.

It follows from (10.117) that the spin operators satisfy

$$[S_a, S_b] = i\, \hbar\, \epsilon_{abc}\, S_c. \tag{10.119}$$
The raising and lowering operators

$$S_\pm = S_1 \pm i S_2 \tag{10.120}$$
have simple commutators with $S_3$
$$[S_3, S_\pm] = \pm \hbar\, S_\pm. \tag{10.121}$$
This relation implies that if the state $|j, m\rangle$ is an eigenstate of $S_3$ with eigenvalue $\hbar m$, then the states $S_\pm |j, m\rangle$ either vanish or are eigenstates of $S_3$ with eigenvalues $\hbar(m \pm 1)$
$$S_3 S_\pm |j, m\rangle = S_\pm S_3 |j, m\rangle \pm \hbar S_\pm |j, m\rangle = \hbar(m \pm 1)\, S_\pm |j, m\rangle. \tag{10.122}$$
Thus the raising and lowering operators raise and lower the eigenvalues of $S_3$. When $j = 1/2$, the possible values of $m$ are $m = \pm 1/2$, and so with the usual sign and normalization conventions

$$S_+ |-\rangle = \hbar\, |+\rangle \quad \text{and} \quad S_- |+\rangle = \hbar\, |-\rangle \tag{10.123}$$
while

$$S_+ |+\rangle = 0 \quad \text{and} \quad S_- |-\rangle = 0. \tag{10.124}$$
The square of the total spin operator is simply related to the raising and lowering operators and to $S_3$

$$S^2 = S_1^2 + S_2^2 + S_3^2 = \tfrac{1}{2} S_+ S_- + \tfrac{1}{2} S_- S_+ + S_3^2. \tag{10.125}$$
But the squares of the Pauli matrices are unity, and so $S_a^2 = (\hbar/2)^2$ for all three values of $a$. Thus
$$S^2 = \frac{3}{4}\, \hbar^2 \tag{10.126}$$
is a Casimir operator (10.105) for a spin one-half system.
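The relations (10.121) and (10.123–10.126) can be verified with explicit $2 \times 2$ matrices. The sketch below assumes the usual Pauli-matrix representation $S_a = \hbar\sigma_a/2$ with $\hbar = 1$ and the basis $|+\rangle = (1, 0)$, $|-\rangle = (0, 1)$. The small matrix helpers are written out by hand so that the block needs no external library.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def commutator(A, B):
    return add(matmul(A, B), scale(-1, matmul(B, A)))

def close(A, B, tol=1e-12):
    """True if all entries of A and B agree to within tol."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

IDENTITY = [[1, 0], [0, 1]]
S1 = scale(HBAR / 2, [[0, 1], [1, 0]])
S2 = scale(HBAR / 2, [[0, -1j], [1j, 0]])
S3 = scale(HBAR / 2, [[1, 0], [0, -1]])

S_plus = add(S1, scale(1j, S2))     # raising operator (10.120)
S_minus = add(S1, scale(-1j, S2))   # lowering operator (10.120)

S_squared = add(add(matmul(S1, S1), matmul(S2, S2)), matmul(S3, S3))
ladder_form = add(
    scale(0.5, add(matmul(S_plus, S_minus), matmul(S_minus, S_plus))),
    matmul(S3, S3),
)

print(close(commutator(S3, S_plus), scale(HBAR, S_plus)))      # (10.121), upper sign
print(close(commutator(S3, S_minus), scale(-HBAR, S_minus)))   # (10.121), lower sign
print(close(S_squared, scale(0.75 * HBAR**2, IDENTITY)))       # (10.126)
print(close(S_squared, ladder_form))                           # (10.125)
```

All four lines should print `True`.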

Example 10.20 (Two Spin 1/2’s) Consider two spin operators S(1) and

Example 10.24 (Lack of Analyticity) One may define a function $f(q)$ of a quaternionic variable and then ask what functions are analytic in the sense that the (one-sided) derivative

$$f'(q) = \lim_{q' \to 0} \left[ f(q + q') - f(q) \right] q'^{-1} \tag{10.184}$$
exists and is independent of the direction through which $q' \to 0$. This space of functions is extremely limited and does not even include the function $f(q) = q^2$ (exercise 10.22).
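The direction dependence of the one-sided derivative (10.184) for $f(q) = q^2$ shows up already in a short numerical experiment. The sketch below represents a quaternion as a 4-tuple $(a, b, c, d) = a + bi + cj + dk$ with Hamilton's multiplication rule. This is an assumed but equivalent alternative to the $2 \times 2$ matrix form used in the text, and the function names are illustrative.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def qsub(p, q):
    return tuple(x - y for x, y in zip(p, q))

def qinv(q):
    """Inverse: the conjugate divided by the squared norm."""
    a, b, c, d = q
    n2 = a * a + b * b + c * c + d * d
    return (a / n2, -b / n2, -c / n2, -d / n2)

def square(q):
    return qmul(q, q)

def one_sided_derivative(f, q, dq):
    """The difference quotient [f(q + dq) - f(q)] dq^{-1} of (10.184)."""
    return qmul(qsub(f(qadd(q, dq)), f(q)), qinv(dq))

q = (0.0, 0.0, 1.0, 0.0)     # the quaternion j
eps = 1e-6
along_real = one_sided_derivative(square, q, (eps, 0.0, 0.0, 0.0))
along_i = one_sided_derivative(square, q, (0.0, eps, 0.0, 0.0))

print(along_real)   # close to 2j = (0, 0, 2, 0)
print(along_i)      # close to 0, a different limit
```

Approaching along the real axis gives $2q$, while approaching along $i$ gives $q + i\,q\,i^{-1} = 0$ for $q = j$, so the limit depends on the direction.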

10.28 The Symplectic Group Sp(2n)

The symplectic group Sp(2n) consists of $2n \times 2n$ matrices $W$ that map n-tuples $q$ of quaternions into n-tuples $q' = Wq$ of quaternions with the same value of the quadratic quaternionic form

$$\|q'\|^2 = \|q'_1\|^2 + \|q'_2\|^2 + \cdots + \|q'_n\|^2 = \|q_1\|^2 + \|q_2\|^2 + \cdots + \|q_n\|^2 = \|q\|^2. \tag{10.185}$$
By (10.177), the quadratic form $\|q'\|^2$ times the $2 \times 2$ identity matrix $I$ is equal to the hermitian form $q'^\dagger q'$

$$\|q'\|^2\, I = q'^\dagger q' = q_1'^\dagger q'_1 + \cdots + q_n'^\dagger q'_n = q^\dagger W^\dagger W q \tag{10.186}$$
and so any matrix $W$ that is both a $2n \times 2n$ unitary matrix and an $n \times n$ matrix of quaternions keeps $\|q'\|^2 = \|q\|^2$

$$\|q'\|^2\, I = q^\dagger W^\dagger W q = q^\dagger q = \|q\|^2\, I. \tag{10.187}$$

The group Sp(2n) thus consists of all $2n \times 2n$ unitary matrices that also are $n \times n$ matrices of quaternions. (This last requirement is needed so that $q' = Wq$ is an n-tuple of quaternions.)

The generators $t_a$ of the symplectic group Sp(2n) are $2n \times 2n$ direct-product matrices of the form

I ⊗ A, σ1 ⊗ S1, σ2 ⊗ S2, and σ3 ⊗ S3 (10.188) in which I is the 2 × 2 identity matrix, the three σi’s are the Pauli matrices, A is an imaginary n × n anti-symmetric matrix, and the Si are n × n real symmetric matrices. These generators ta close under commutation

[ta,tb]=ifabctc. (10.189)

Any imaginary linear combination $i\alpha_a t_a$ of these generators is not only a

is called a right-handed Weyl spinor. One may show (exercise 10.38) that the action density

$$L_r(x) = i\, \zeta^\dagger(x)\, (\partial_0 I + \nabla \cdot \boldsymbol{\sigma})\, \zeta(x) \tag{10.274}$$
is Lorentz covariant

$$U(L)\, L_r(x)\, U^{-1}(L) = L_r(Lx). \tag{10.275}$$
Example 10.33 (Why ζ Is Right Handed) An argument like that of example (10.31) shows that the field $\zeta(x)$ satisfies the wave equation

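$$(\partial_0 I + \nabla \cdot \boldsymbol{\sigma})\, \zeta(x) = 0 \tag{10.276}$$
or in momentum space
$$(E - \boldsymbol{p} \cdot \boldsymbol{\sigma})\, \zeta(p) = 0. \tag{10.277}$$
Thus, $E = |\boldsymbol{p}|$, and $\zeta(p)$ is an eigenvector of $\hat{\boldsymbol{p}} \cdot \boldsymbol{J}$
$$\hat{\boldsymbol{p}} \cdot \boldsymbol{J}\, \zeta(p) = \frac{1}{2}\, \zeta(p) \tag{10.278}$$
with eigenvalue 1/2. A particle whose spin is parallel to its momentum is said to have positive helicity or to be right handed. Nearly massless antineutrinos are nearly right handed.

The Majorana mass term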

$$L_M(x) = \frac{1}{2} \left[ i m\, \zeta^{T}(x)\, \sigma_2\, \zeta(x) + \left( i m\, \zeta^{T}(x)\, \sigma_2\, \zeta(x) \right)^\dagger \right] \tag{10.279}$$
like (10.266) is Lorentz covariant.

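10.33 The Dirac Representation of the Lorentz Group

Dirac’s representation of SO(3, 1) is the direct sum $D^{(1/2,0)} \oplus D^{(0,1/2)}$ of $D^{(1/2,0)}$ and $D^{(0,1/2)}$. Its generators are the $4 \times 4$ matrices
$$\boldsymbol{J} = \frac{1}{2} \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix} \quad \text{and} \quad \boldsymbol{K} = \frac{i}{2} \begin{pmatrix} -\boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix}. \tag{10.280}$$
Dirac’s representation uses the Clifford algebra of the gamma matrices $\gamma^a$ which satisfy the anticommutation relation
$$\{\gamma^a, \gamma^b\} \equiv \gamma^a \gamma^b + \gamma^b \gamma^a = 2\eta^{ab} \tag{10.281}$$
in which $\eta$ is the $4 \times 4$ diagonal matrix (10.222) with $\eta^{00} = -1$ and $\eta^{jj} = 1$ for $j = 1$, 2, and 3.

Suppose $T(y)$ is a translation that takes a 4-vector $x$ to $x + y$ and $T(z)$ is a translation that takes a 4-vector $x$ to $x + z$. Then $T(z)T(y)$ and $T(y)T(z)$ both take $x$ to $x + y + z$. So if a translation $T(y) = T(t, \boldsymbol{y})$ is represented by a unitary operator $U(t, \boldsymbol{y}) = \exp(iHt - i\boldsymbol{P} \cdot \boldsymbol{y})$, then the hamiltonian $H$ and the momentum operator $\boldsymbol{P}$ commute with each other
$$[H, P^j] = 0 \quad \text{and} \quad [P^i, P^j] = 0. \tag{10.299}$$
We can figure out the commutation relations of $H$ and $\boldsymbol{P}$ with the angular-momentum $\boldsymbol{J}$ and boost $\boldsymbol{K}$ operators by realizing that $P^a = (H, \boldsymbol{P})$ is a 4-vector. Let
$$U(\boldsymbol{\theta}, \boldsymbol{\lambda}) = e^{-i\boldsymbol{\theta} \cdot \boldsymbol{J} - i\boldsymbol{\lambda} \cdot \boldsymbol{K}} \tag{10.300}$$
be the (infinite-dimensional) unitary operator that represents (in Hilbert space) the infinitesimal Lorentz transformation
$$L = I + \boldsymbol{\theta} \cdot \boldsymbol{R} + \boldsymbol{\lambda} \cdot \boldsymbol{B} \tag{10.301}$$
where $\boldsymbol{R}$ and $\boldsymbol{B}$ are the six $4 \times 4$ matrices (10.231 & 10.232). Then because $P$ is a 4-vector under Lorentz transformations, we have
$$U^{-1}(\boldsymbol{\theta}, \boldsymbol{\lambda})\, P\, U(\boldsymbol{\theta}, \boldsymbol{\lambda}) = e^{+i\boldsymbol{\theta} \cdot \boldsymbol{J} + i\boldsymbol{\lambda} \cdot \boldsymbol{K}}\, P\, e^{-i\boldsymbol{\theta} \cdot \boldsymbol{J} - i\boldsymbol{\lambda} \cdot \boldsymbol{K}} = (I + \boldsymbol{\theta} \cdot \boldsymbol{R} + \boldsymbol{\lambda} \cdot \boldsymbol{B})\, P \tag{10.302}$$
or using (10.272)
$$\begin{aligned}
(I + i\boldsymbol{\theta} \cdot \boldsymbol{J} + i\boldsymbol{\lambda} \cdot \boldsymbol{K})\, H\, (I - i\boldsymbol{\theta} \cdot \boldsymbol{J} - i\boldsymbol{\lambda} \cdot \boldsymbol{K}) &= H + \boldsymbol{\lambda} \cdot \boldsymbol{P} \\
(I + i\boldsymbol{\theta} \cdot \boldsymbol{J} + i\boldsymbol{\lambda} \cdot \boldsymbol{K})\, \boldsymbol{P}\, (I - i\boldsymbol{\theta} \cdot \boldsymbol{J} - i\boldsymbol{\lambda} \cdot \boldsymbol{K}) &= \boldsymbol{P} + H\boldsymbol{\lambda} + \boldsymbol{\theta} \wedge \boldsymbol{P}.
\end{aligned} \tag{10.303}$$
Thus, one finds (exercise 10.42) that $H$ is invariant under rotations, while $\boldsymbol{P}$ transforms as a 3-vector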

[Ji,H] = 0 and [Ji,Pj]=iϵijkPk (10.304) and that

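$$[K_i, H] = -i P_i \quad \text{and} \quad [K_i, P_j] = -i \delta_{ij} H. \tag{10.305}$$
By combining these equations with (10.285), one may write (exercise 10.44) the Lie algebra of the Poincaré group as
$$\begin{aligned}
i[J^{ab}, J^{cd}] &= \eta^{bc} J^{ad} - \eta^{ac} J^{bd} - \eta^{da} J^{cb} + \eta^{db} J^{ca} \\
i[P^a, J^{bc}] &= \eta^{ab} P^c - \eta^{ac} P^b \\
[P^a, P^b] &= 0.
\end{aligned} \tag{10.306}$$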

Further Reading

The classic Lie Algebras in Particle Physics (Georgi, 1999), which inspired much of this chapter, is outstanding.

Exercises

10.1 Show that all $n \times n$ (real) orthogonal matrices $O$ leave invariant the quadratic form $x_1^2 + x_2^2 + \cdots + x_n^2$, that is, that if $x' = Ox$, then $x'^2 = x^2$.
10.2 Show that the set of all $n \times n$ orthogonal matrices forms a group.
10.3 Show that all $n \times n$ unitary matrices $U$ leave invariant the quadratic form $|x_1|^2 + |x_2|^2 + \cdots + |x_n|^2$, that is, that if $x' = Ux$, then $|x'|^2 = |x|^2$.
10.4 Show that the set of all $n \times n$ unitary matrices forms a group.
10.5 Show that the set of all $n \times n$ unitary matrices with unit determinant forms a group.
10.6 Show that the matrix $D^{(j)}_{m'm}(g) = \langle j, m' | U(g) | j, m \rangle$ is unitary because the rotation operator $U(g)$ is unitary $\langle j, m' | U^\dagger(g) U(g) | j, m \rangle = \delta_{m'm}$.
10.7 Invent a group of order 3 and compute its multiplication table. For extra credit, prove that the group is unique.
10.8 Show that the relation (10.20) between two equivalent representations is an isomorphism.
10.9 Suppose that $D_1$ and $D_2$ are equivalent, finite-dimensional, irreducible representations of a group $G$ so that $D_2(g) = S D_1(g) S^{-1}$ for all $g \in G$. What can you say about a matrix $A$ that satisfies $D_2(g)\, A = A\, D_1(g)$ for all $g \in G$?
10.10 Find all components of the matrix $\exp(i\alpha A)$ in which
$$A = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}. \tag{10.307}$$
10.11 If $[A, B] = B$, find $e^{i\alpha A} B e^{-i\alpha A}$. Hint: what are the $\alpha$-derivatives of this expression?
10.12 Show that the tensor-product matrix (10.31) of two representations $D_1$ and $D_2$ is a representation.
10.13 Find a $4 \times 4$ matrix $S$ that relates the tensor-product representation $D_{\frac{1}{2} \otimes \frac{1}{2}}$ to the direct sum $D_1 \oplus D_0$.
10.14 Find the generators in the adjoint representation of the group with structure constants $f_{abc} = \epsilon_{abc}$ where $a, b, c$ run from 1 to 3. Hint: The answer is three $3 \times 3$ matrices $t_a$, often written as $L_a$.
10.15 Show that the generators (10.90) satisfy the commutation relations (10.93).
10.16 Show that the demonstrated equation (10.98) implies the commutation relation (10.99).
10.17 Use the Cayley-Hamilton theorem (1.264) to show that the $3 \times 3$ matrix (10.96) that represents a right-handed rotation of $\theta$ radians about the axis $\boldsymbol{\theta}$ is given by (10.97).
10.18 Verify the mixed Jacobi identity (10.142).
10.19 For the group SU(3), find the structure constants $f_{123}$ and $f_{231}$.
10.20 Show that every $2 \times 2$ unitary matrix of unit determinant is a quaternion of unit norm.
10.21 Show that the quaternions as defined by (10.175) are closed under addition and multiplication and that the product $xq$ is a quaternion if $x$ is real and $q$ is a quaternion.
10.22 Show that the one-sided derivative $f'(q)$ (10.184) of the quaternionic function $f(q) = q^2$ depends upon the direction along which $q' \to 0$.
10.23 Show that the generators (10.188) of Sp(2n) obey commutation relations of the form (10.189) for some real structure constants $f_{abc}$ and a suitably extended set of matrices $A, A', \ldots$ and $S_k, S'_k, \ldots$.
10.24 Show that for $0 < \epsilon \ll 1$, the real $2n \times 2n$ matrix $T = \exp(\epsilon J S)$ in which $S$ is symmetric satisfies $T^{T} J T = J$ (at least up to terms of order $\epsilon^2$) and so is in Sp(2n, R).
10.25 Show that the matrix $T$ of (10.197) is in Sp(2, R).
10.26 Using the parametrization (10.217) of the group SU(2), show that the parameters $a(c, b)$ that describe the product $g(a(c, b)) = g(c)\, g(b)$ are those of (10.219).
10.27 Use formulas (10.219) and (10.212) to show that the left-invariant measure for SU(2) is given by (10.220).
10.28 In tensor notation, which is explained in chapter 11, the condition (10.229) that $I + \omega$ be an infinitesimal Lorentz transformation reads $(\omega^{T})_b{}^{a} = \omega^{a}{}_{b} = -\eta_{bc}\, \omega^{c}{}_{d}\, \eta^{da}$ in which sums over $c$ and $d$ from 0 to 3 are understood. In this notation, the matrix $\eta_{ef}$ lowers indices and $\eta^{gh}$ raises them, so that $\omega_b{}^{a} = -\omega_{bd}\, \eta^{da}$. (Both $\eta_{ef}$ and $\eta^{gh}$ are numerically equal to the matrix $\eta$ displayed in equation (10.222).) Multiply both sides of the condition (10.229) by $\eta_{ae} = \eta_{ea}$ and use the relation $\eta^{da}\, \eta_{ae} = \eta^{d}{}_{e} \equiv \delta^{d}{}_{e}$ to show that the matrix $\omega_{ab}$ with both indices lowered (or raised) is antisymmetric, that is,

$$\omega_{ba} = -\omega_{ab} \quad \text{and} \quad \omega^{ba} = -\omega^{ab}. \tag{10.308}$$

10.29 Show that the six matrices (10.231) and (10.232) satisfy the SO(3, 1) condition (10.229).
10.30 Show that the six generators $\boldsymbol{J}$ and $\boldsymbol{K}$ obey the commutation relations (10.234–10.236).
10.31 Show that if $\boldsymbol{J}$ and $\boldsymbol{K}$ satisfy the commutation relations (10.234–10.236) of the Lie algebra of the Lorentz group, then so do $\boldsymbol{J}$ and $-\boldsymbol{K}$.
10.32 Show that if the six generators $\boldsymbol{J}$ and $\boldsymbol{K}$ obey the commutation relations (10.234–10.236), then the six generators $\boldsymbol{J}^+$ and $\boldsymbol{J}^-$ obey the commutation relations (10.243).
10.33 Relate the parameter $\alpha$ in the definition (10.253) of the standard boost $B(p)$ to the 4-vector $p$ and the mass $m$.
10.34 Derive the formulas for $D^{(1/2,0)}(0, \alpha\, \hat{\boldsymbol{p}})$ given in equation (10.254).
10.35 Derive the formulas for $D^{(0,1/2)}(0, \alpha\, \hat{\boldsymbol{p}})$ given in equation (10.271).
10.36 For infinitesimal complex $z$, derive the 4-vector properties (10.255 & 10.272) of $(-I, \boldsymbol{\sigma})$ under $D^{(1/2,0)}$ and of $(I, \boldsymbol{\sigma})$ under $D^{(0,1/2)}$.
10.37 Show that under the unitary Lorentz transformation (10.257), the action density (10.258) is Lorentz covariant (10.259).
10.38 Show that under the unitary Lorentz transformation (10.273), the action density (10.274) is Lorentz covariant (10.275).
10.39 Show that under the unitary Lorentz transformations (10.257 & 10.273), the Majorana mass terms (10.266 & 10.279) are Lorentz covariant.
10.40 Show that the definitions of the gamma matrices (10.281) and of the generators (10.283) imply that the gamma matrices transform as a 4-vector under Lorentz transformations (10.284).
10.41 Show that (10.283) and (10.284) imply that the generators $J^{ab}$ satisfy the commutation relations (10.285) of the Lorentz group.
10.42 Show that the spinor $\zeta = \sigma_2\, \xi^*$ defined by (10.295) is right handed (10.273) if $\xi$ is left handed (10.257).
10.43 Use (10.303) to get (10.304 & 10.305).
10.44 Derive (10.306) from (10.285, 10.299, 10.304, & 10.305).

432 Tensors and Local Symmetries

The coefficients $e'_i \cdot e_j$ form an orthogonal matrix, and the linear operator

$$\sum_{i=1}^{n} e_i\, e_i'^{\,T} = \sum_{i=1}^{n} |e_i\rangle \langle e'_i| \tag{11.21}$$
is an orthogonal (real, unitary) transformation. The change $x \to x'$ is a rotation plus a possible reflection (exercise 11.2).

Example 11.2 (A of Two Dimensions) In two-dimensional euclidean space, one can describe the same point by euclidean (x, y) and polar (r, θ) coordinates. The derivatives

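$$\frac{\partial r}{\partial x} = \frac{x}{r} = \frac{\partial x}{\partial r} \quad \text{and} \quad \frac{\partial r}{\partial y} = \frac{y}{r} = \frac{\partial y}{\partial r} \tag{11.22}$$
respect the symmetry (11.18), but (exercise 11.1) these derivatives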

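$$\frac{\partial \theta}{\partial x} = -\frac{y}{r^2} \neq \frac{\partial x}{\partial \theta} = -y \quad \text{and} \quad \frac{\partial \theta}{\partial y} = \frac{x}{r^2} \neq \frac{\partial y}{\partial \theta} = x \tag{11.23}$$
do not.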
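The four pairs of derivatives in (11.22) and (11.23) can be compared by finite differences at any point away from the origin. The sketch below is illustrative: the sample point $(x, y) = (1, 2)$ and the helper names `to_polar`, `to_cartesian`, and `partial` are arbitrary choices made here.

```python
import math

def to_polar(x, y):
    """Return (r, theta) for the point (x, y)."""
    return math.hypot(x, y), math.atan2(y, x)

def to_cartesian(r, theta):
    """Return (x, y) for the point (r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)

def partial(f, point, i, k, h=1e-6):
    """Central-difference estimate of d f_k / d point_i."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (f(*up)[k] - f(*down)[k]) / (2 * h)

x, y = 1.0, 2.0
r, theta = to_polar(x, y)

dr_dx = partial(to_polar, (x, y), 0, 0)              # equals x / r
dx_dr = partial(to_cartesian, (r, theta), 0, 0)      # equals x / r as well
dtheta_dx = partial(to_polar, (x, y), 0, 1)          # equals -y / r^2
dx_dtheta = partial(to_cartesian, (r, theta), 1, 0)  # equals -y

print(dr_dx, dx_dr)            # the two agree, as in (11.22)
print(dtheta_dx, dx_dtheta)    # the two differ, as in (11.23)
```

At this point the first pair is $0.447$ and $0.447$, while the second pair is $-0.4$ and $-2$.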

11.6 Summation Conventions

When a given index is repeated in a product, that index usually is being summed over. So to avoid distracting summation symbols, one writes

$$A_i B_i \equiv \sum_{i=1}^{n} A_i B_i. \tag{11.24}$$
The sum is understood to be over the relevant range of indices, usually from 0 or 1 to 3 or $n$. Where the distinction between covariant and contravariant indices matters, an index that appears twice in the same monomial, once as a subscript and once as a superscript, is a dummy index that is summed over as in
$$A_i B^i \equiv \sum_{i=1}^{n} A_i B^i. \tag{11.25}$$
These summation conventions make tensor notation almost as compact as matrix notation. They make equations easier to read and write.

How weak are the static gravitational fields we know about? The dimensionless ratio $\phi/c^2$ is $10^{-39}$ on the surface of a proton, $10^{-9}$ on the Earth, $10^{-6}$ on the surface of the sun, and $10^{-4}$ on the surface of a white dwarf.
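The orders of magnitude quoted for $\phi/c^2$ follow from $|\phi|/c^2 = GM/(rc^2)$. The sketch below evaluates this ratio with rounded masses and radii that are assumptions of this example rather than values from the text. In particular, the white dwarf is taken to have one solar mass and a radius of about 6000 km, and the proton a radius of about 0.84 fm.

```python
import math

G = 6.674e-11   # Newton's constant in m^3 kg^-1 s^-2
C = 2.998e8     # speed of light in m/s

# (mass in kg, radius in m); rounded, illustrative values
BODIES = {
    "proton": (1.673e-27, 0.84e-15),
    "Earth": (5.972e24, 6.371e6),
    "sun": (1.989e30, 6.957e8),
    "white dwarf": (1.989e30, 6.0e6),
}

def potential_ratio(mass, radius):
    """Magnitude of the Newtonian potential over c^2 at the surface, G M / (r c^2)."""
    return G * mass / (radius * C**2)

def order_of_magnitude(value):
    """Nearest integer power of ten."""
    return round(math.log10(value))

for name, (mass, radius) in BODIES.items():
    ratio = potential_ratio(mass, radius)
    print(f"{name:12s} {ratio:.2e}  ~ 10^{order_of_magnitude(ratio)}")
```

With these inputs the nearest powers of ten are $-39$, $-9$, $-6$, and $-4$.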

11.41 Gravitational Time Dilation

Suppose we have a system of coordinates $x^i$ with a metric $g_{ik}$ and a clock at rest in this system. Then the proper time $d\tau$ between ticks of the clock is

$$d\tau = (1/c) \sqrt{-g_{ij}\, dx^i dx^j} = \sqrt{-g_{00}}\; dt \tag{11.340}$$
where $dt$ is the time between ticks in the $x^i$ coordinates, which is the laboratory frame in the gravitational field $g_{00}$. By the principle of equivalence (section 11.39), the proper time $d\tau$ between ticks is the same as the time between ticks when the same clock is at rest deep in empty space.

If the clock is in a weak static gravitational field due to a mass $M$ at a distance $r$, then
$$-g_{00} = 1 + 2\phi/c^2 = 1 - 2GM/c^2 r \tag{11.341}$$
is a little less than unity, and the interval of proper time between ticks
$$d\tau = \sqrt{-g_{00}}\; dt = \sqrt{1 - 2GM/c^2 r}\; dt \tag{11.342}$$
is slightly less than the interval $dt$ between ticks in the coordinate system of an observer at $x$ in the rest frame of the clock and the mass, and in its gravitational field. Since $dt > d\tau$, the laboratory time $dt$ between ticks is greater than the proper or intrinsic time $d\tau$ between ticks of the clock unaffected by any gravitational field. Clocks near big masses run slow.

Now suppose we have two identical clocks at different heights above sea level. The time $T_\ell$ for the lower clock to make $N$ ticks will be longer than the time $T_u$ for the upper clock to make $N$ ticks. The ratio of the clock times will be
$$\frac{T_\ell}{T_u} = \frac{\sqrt{1 - 2GM/c^2(r + h)}}{\sqrt{1 - 2GM/c^2 r}} \approx 1 + \frac{gh}{c^2}. \tag{11.343}$$
Now imagine that a photon going down passes the upper clock which measures its frequency as $\nu_u$ and then passes the lower clock which measures its frequency as $\nu_\ell$. The slower clock will measure a higher frequency. The ratio of the two frequencies will be the same as the ratio of the clock times

$$\frac{\nu_\ell}{\nu_u} = 1 + \frac{gh}{c^2}. \tag{11.344}$$
As measured by the lower, slower clock, the photon is blue shifted.

Black holes are not really black. Stephen Hawking (1942–) has shown that the intense gravitational field of a black hole of mass $M$ radiates at temperature
$$T = \frac{\hbar\, c^3}{8\pi\, k\, G\, M} \tag{11.389}$$
in which $k = 8.617343 \times 10^{-5}$ eV K$^{-1}$ is Boltzmann’s constant, and $\hbar$ is Planck’s constant $h = 6.6260693 \times 10^{-34}$ J s divided by $2\pi$, $\hbar = h/(2\pi)$. The black hole is entirely converted into radiation after a time
$$t = \frac{5120\, \pi\, G^2}{\hbar\, c^4}\, M^3 \tag{11.390}$$
proportional to the cube of its mass.
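To get a feel for the sizes involved, the fractional blue shift $gh/c^2$ of (11.344) and the Hawking formulas (11.389) and (11.390) can be evaluated in SI units. The sketch below uses rounded values of the constants, a solar mass of $1.989 \times 10^{30}$ kg, and $g = 9.81$ m/s²; these inputs are assumptions of the example, and the function names are illustrative.

```python
import math

HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 kg^-1 s^-2
K_B = 1.380649e-23       # J / K
M_SUN = 1.989e30         # kg (rounded)
SECONDS_PER_YEAR = 3.15576e7

def fractional_blue_shift(height, g=9.81):
    """g h / c^2, the fractional frequency shift of (11.344) for a height in meters."""
    return g * height / C**2

def hawking_temperature(mass):
    """Temperature (11.389) in kelvin of a black hole of the given mass in kg."""
    return HBAR * C**3 / (8 * math.pi * K_B * G * mass)

def evaporation_time(mass):
    """Lifetime (11.390) in seconds of a black hole of the given mass in kg."""
    return 5120 * math.pi * G**2 * mass**3 / (HBAR * C**4)

print(fractional_blue_shift(1.0))                   # about 1.1e-16 per meter of height
print(hawking_temperature(M_SUN))                   # about 6e-8 K
print(evaporation_time(M_SUN) / SECONDS_PER_YEAR)   # about 2e67 years
```

A solar-mass black hole is thus far colder than the cosmic microwave background, and doubling the mass multiplies the lifetime by eight.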

11.48 Cosmology

Astrophysical observations tell us that on the largest observable scales, space is flat or very nearly flat; that the visible universe contains at least $10^{90}$ particles; and that the cosmic microwave background radiation is isotropic to one part in $10^5$ apart from a Doppler shift due to the motion of the Earth. These and other observations suggest that potential energy expanded our universe by $\exp(60) = 10^{26}$ during an era of inflation that could have been as brief as $10^{-35}$ s. The potential energy that powered inflation became the radiation of the Big Bang.

During the first three minutes, some of that radiation became hydrogen, helium, neutrinos, and dark matter. But for 65,000 years after the Big Bang, most of the energy of the visible universe was radiation. Because the momentum of a particle but not its mass falls with the expansion of the universe, this era of radiation gradually gave way to an era of matter. This transition happened when the temperature $kT$ of the universe fell to 1.28 eV.

The era of matter lasted for 8.8 billion years. After 380,000 years, the universe had cooled to $kT = 0.26$ eV, and less than 1% of the atoms were ionized. Photons no longer scattered off a plasma of electrons and ions. The universe became transparent. The photons that last scattered just before this initial transparency became the cosmic microwave background radiation or CMBR that now surrounds us, red-shifted to $2.7255 \pm 0.0006$ K.

The era of matter has been followed by the current era of dark energy during which the energy of the visible universe is mostly a potential energy called dark energy (something like a cosmological constant). Dark energy has been accelerating the expansion of the universe for the past 5 billion years and may continue to do so forever.
It is now $13.817 \pm 0.048$ billion years after the Big Bang, and the dark-energy density is $\rho_{de} = 5.827 \times 10^{-30}\, c^2$ g cm$^{-3}$ or 68.5 percent ($\pm 1.8\%$) of the critical energy density $\rho_c = 3H_0^2/8\pi G = 1.87837\, h^2 \times 10^{-29}\, c^2$ g cm$^{-3}$ needed to make the universe flat. Here $H_0 = 100\, h$ km s$^{-1}$ Mpc$^{-1}$ is the Hubble constant, one parsec is 3.262 light-years, the Hubble time is $1/H_0 = 9.778\, h^{-1} \times 10^9$ years, and $h = 0.673 \pm 0.012$ is not to be confused with Planck’s constant.

Matter makes up $31.5 \pm 1.8$ % of the critical density, and baryons only $4.9 \pm 0.06$ % of it. Baryons are 15 % of the total matter in the visible universe. The other 85 % does not interact with light and is called dark matter.

Einstein’s equations (11.373) are second-order, non-linear partial differential equations for 10 unknown functions $g_{ij}(x)$ in terms of the energy-momentum tensor $T_{ij}(x)$ throughout the universe, which of course we don’t know. The problem is not quite hopeless, however. The ability to choose arbitrary coordinates, the appeal to symmetry, and the choice of a reasonable form for $T_{ij}$ all help.

Hubble showed us that the universe is expanding. The cosmic microwave background radiation looks the same in all spatial directions (apart from a Doppler shift due to the motion of the Earth relative to the local supercluster of galaxies). Observations of clusters of galaxies reveal a universe that is homogeneous on suitably large scales of distance. So it is plausible that the universe is homogeneous and isotropic in space, but not in time. One may show (Carroll, 2003) that for a universe of such symmetry, the line element in comoving coordinates is
$$ds^2 = -dt^2 + a^2 \left[ \frac{dr^2}{1 - kr^2} + r^2 \left( d\theta^2 + \sin^2\theta\, d\phi^2 \right) \right]. \tag{11.391}$$
Whitney’s embedding theorem tells us that any smooth four-dimensional manifold can be embedded in a flat space of eight dimensions with a suitable signature. We need only four or five dimensions to embed the space-time described by the line element (11.391).
If the universe is closed, then the signature is $(-1, 1, 1, 1, 1)$, and our three-dimensional space is the 3-sphere which is the surface of a four-dimensional sphere in four space dimensions. The points of the universe then are
$$p = (t,\; a \sin\chi \sin\theta \cos\phi,\; a \sin\chi \sin\theta \sin\phi,\; a \sin\chi \cos\theta,\; a \cos\chi) \tag{11.392}$$

The other nonzero Γ’s are
$$\Gamma^1_{22} = -r\,(1 - kr^2) \qquad \Gamma^1_{33} = -r\,(1 - kr^2) \sin^2\theta \tag{11.399}$$
$$\Gamma^2_{12} = \Gamma^3_{13} = \frac{1}{r} = \Gamma^2_{21} = \Gamma^3_{31} \tag{11.400}$$
$$\Gamma^2_{33} = -\sin\theta \cos\theta \qquad \Gamma^3_{23} = \cot\theta = \Gamma^3_{32}. \tag{11.401}$$
Our formulas (11.350 & 11.348) for the Ricci and curvature tensors give

$$R_{00} = R^{n}{}_{0n0} = [\partial_0 + \Gamma_0,\; \partial_n + \Gamma_n]^{n}{}_{0}. \tag{11.402}$$

Clearly the commutator of $\Gamma_0$ with itself vanishes, and one may use the formulas (11.397–11.401) for the other connections to check that
$$[\Gamma_0, \Gamma_n]^{n}{}_{0} = \Gamma^n_{0k}\, \Gamma^k_{n0} - \Gamma^n_{nk}\, \Gamma^k_{00} = 3 \left( \frac{\dot a}{a} \right)^2 \tag{11.403}$$
and that
$$\partial_0 \Gamma^n_{n0} = 3\, \partial_0 \left( \frac{\dot a}{a} \right) = 3\, \frac{\ddot a}{a} - 3 \left( \frac{\dot a}{a} \right)^2 \tag{11.404}$$
while $\partial_n \Gamma^n_{00} = 0$. So the 00-component of the Ricci tensor is
$$R_{00} = 3\, \frac{\ddot a}{a}. \tag{11.405}$$
Similarly, one may show that the other non-zero components of Ricci’s tensor are
$$R_{11} = -\frac{A}{1 - kr^2}, \qquad R_{22} = -r^2 A, \quad \text{and} \quad R_{33} = -r^2 A \sin^2\theta \tag{11.406}$$
in which $A = a\ddot a + 2\dot a^2 + 2k$. The scalar curvature (11.351) is
$$R = g^{ab} R_{ba} = -\frac{6}{a^2} \left( a\ddot a + \dot a^2 + k \right). \tag{11.407}$$
In co-moving coordinates such as those of the Robertson-Walker metric (11.395) $u^i = (1, 0, 0, 0)$, and so the energy-momentum tensor (11.370) is
$$T_{ij} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & p\, g_{11} & 0 & 0 \\ 0 & 0 & p\, g_{22} & 0 \\ 0 & 0 & 0 & p\, g_{33} \end{pmatrix}. \tag{11.408}$$
Its trace is
$$T = g^{ij}\, T_{ij} = -\rho + 3p. \tag{11.409}$$
Thus using our formula (11.395) for $g_{00} = -1$, (11.405) for $R_{00}$, (11.408)

density
$$\rho_c = \frac{3H^2}{8\pi G}. \tag{11.415}$$
The ratio of the energy density $\rho$ to the critical energy density is called $\Omega$

$$ \Omega = \frac{\rho}{\rho_c} = \frac{8\pi G}{3H^2}\,\rho. \tag{11.416} $$

From (11.414), we see that $\Omega$ is

$$ \Omega = 1 + \frac{k}{(aH)^2} = 1 + \frac{k}{\dot a^2}. \tag{11.417} $$

Thus $\Omega = 1$ both in a flat universe ($k = 0$) and as $aH \to \infty$. One use of inflation is to expand $a$ by $10^{26}$ so as to force $\Omega$ almost exactly to unity.

Something like inflation is needed because in a universe in which the energy density is due to matter and/or radiation, the present value of $\Omega$

$$ \Omega_0 = 1.000 \pm 0.036 \tag{11.418} $$

is unlikely. To see why, we note that conservation of energy ensures that $a^3$ times the matter density $\rho_m$ is constant. Radiation red-shifts by $a$, so energy conservation implies that $a^4$ times the radiation density $\rho_r$ is constant. So with $n = 3$ for matter and 4 for radiation, $\rho a^n \equiv 3F^2/8\pi G$ is a constant. In terms of $F$ and $n$, Friedmann's first-order equation (11.412) is

$$ \dot a^2 = \frac{8\pi G}{3}\,\rho a^2 - k = \frac{F^2}{a^{n-2}} - k. \tag{11.419} $$

In the small-$a$ limit of the early universe, we have

$$ \dot a = F/a^{(n-2)/2} \quad\text{or}\quad a^{(n-2)/2}\,da = F\,dt \tag{11.420} $$

which we integrate to $a \sim t^{2/n}$ so that $\dot a \sim t^{2/n-1}$. Now (11.417) says that

$$ |\Omega - 1| = \frac{1}{\dot a^2} \propto t^{2-4/n} = \begin{cases} t & \text{radiation} \\ t^{2/3} & \text{matter.} \end{cases} \tag{11.421} $$

Thus, $\Omega$ deviated from unity at least as fast as $t^{2/3}$ during the eras of radiation and matter. At this rate, the inequality $|\Omega_0 - 1| < 0.036$ could last 13.8 billion years only if $\Omega$ at $t = 1$ second had been unity to within six parts in $10^{14}$. The only known explanation for such early flatness is inflation.

Manipulating our relation (11.417) between $\Omega$ and $aH$, we see that

$$ (aH)^2 = \frac{k}{\Omega - 1}. \tag{11.422} $$

So $\Omega > 1$ implies $k = 1$, and $\Omega < 1$ implies $k = -1$, and as $\Omega \to 1$ the product $aH \to \infty$, which is the essence of flatness since curvature vanishes as the scale factor $a \to \infty$. Imagine blowing up a balloon.

Staying for the moment with a universe without inflation and with an energy density composed of radiation and/or matter, we note that the first-order equation (11.419) in the form $\dot a^2 = F^2/a^{n-2} - k$ tells us that for a closed ($k = 1$) universe, in the limit $a \to \infty$ we'd have $\dot a^2 \to -1$ which is impossible. Thus a closed universe eventually collapses, which is incompatible with the flatness (11.422) implied by the present value $\Omega_0 = 1.000 \pm 0.036$.

Friedmann's first-order equation (11.412) tells us that $\rho a^2 \ge 3k/8\pi G$. So in a closed universe ($k = 1$), the energy density $\rho$ is positive and increases without limit as $a \to 0$ as in a collapse. In open ($k < 0$) and flat ($k = 0$) universes, the same Friedmann equation (11.412) in the form $\dot a^2 = 8\pi G\rho a^2/3 - k$ tells us that if $\rho$ is positive, then $\dot a^2 > 0$, which means that $\dot a$ never vanishes. Hubble told us that $\dot a > 0$ now. So if our universe is open or flat, then it always expands.

Due to the expansion of the universe, the wave-length of radiation grows with the scale factor $a(t)$. A photon emitted at time $t$ and scale factor $a(t)$ with wave-length $\lambda(t)$ will be seen now at time $t_0$ and scale factor $a(t_0)$ to have a longer wave-length $\lambda(t_0)$

$$ \frac{\lambda(t_0)}{\lambda(t)} = \frac{a(t_0)}{a(t)} = z + 1 \tag{11.423} $$

in which the redshift $z$ is the ratio

$$ z = \frac{\lambda(t_0) - \lambda(t)}{\lambda(t)} = \frac{\Delta\lambda}{\lambda}. \tag{11.424} $$
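The fine-tuning estimate quoted above (six parts in $10^{14}$ at $t = 1$ s) can be reproduced with a few lines of arithmetic. The sketch below simply scales $|\Omega - 1|$ back in time with the matter-era power law $t^{2/3}$ of (11.421); the function name and the rounded number of seconds per year are illustrative choices, not part of the text.

```python
SECONDS_PER_YEAR = 3.156e7  # rounded value, an assumption of this sketch


def omega_deviation_then(dev_now, t_then, t_now, exponent=2.0 / 3.0):
    """Scale |Omega - 1| back in time, assuming it grows as t**exponent."""
    return dev_now * (t_then / t_now) ** exponent


t_now = 13.8e9 * SECONDS_PER_YEAR          # present age in seconds
dev_1s = omega_deviation_then(0.036, 1.0, t_now)
print(f"|Omega - 1| at t = 1 s would be about {dev_1s:.1e}")
```

Using the radiation-era exponent 1 for the earliest times would make the required tuning even more severe, so the $t^{2/3}$ scaling gives a conservative bound.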

Now $H = \dot a/a = da/(a\,dt)$ implies $dt = da/(aH)$, and $z = a_0/a - 1$ implies $dz = -a_0\,da/a^2$, so we find

$$ dt = -\frac{dz}{(1 + z)\,H(z)} \tag{11.425} $$

which relates time intervals to redshift intervals. An on-line calculator is available for macroscopic intervals (Wright, 2006).
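As a rough illustration of (11.425), one can integrate $dz/((1+z)H(z))$ numerically to get the look-back time to a given redshift. The sketch below assumes a flat universe containing only matter and a constant vacuum energy, so that $H(z) = H_0\sqrt{\Omega_m(1+z)^3 + \Omega_v}$; that form of $H(z)$, the neglect of radiation, and all function names are assumptions of this example. The Hubble time of 14.5 billion years and the 68.5% dark-energy fraction are the values quoted in this chapter.

```python
import math

H0_INV_GYR = 14.5         # Hubble time 1/H0 in billions of years (from the text)
OMEGA_V = 0.685           # dark-energy fraction (from the text)
OMEGA_M = 1.0 - OMEGA_V   # assumption: flat universe, radiation neglected


def hubble(z):
    """H(z) in units of 1/Gyr for the assumed matter-plus-vacuum model."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_V) / H0_INV_GYR


def lookback_time(z, steps=2000):
    """Integrate dt = dz / ((1 + z) H(z)) from 0 to z with Simpson's rule."""
    if steps % 2:
        steps += 1
    h = z / steps

    def f(s):
        return 1.0 / ((1.0 + s) * hubble(s))

    total = f(0.0) + f(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


def age_at_scale_factor(a):
    """Closed-form age for the same model, with a = 1 today."""
    pref = 2.0 * H0_INV_GYR / (3.0 * math.sqrt(OMEGA_V))
    return pref * math.asinh(math.sqrt(OMEGA_V / OMEGA_M) * a ** 1.5)


def lookback_time_exact(z):
    """Look-back time as a difference of closed-form ages."""
    return age_at_scale_factor(1.0) - age_at_scale_factor(1.0 / (1.0 + z))


print(lookback_time(0.30), lookback_time_exact(0.30))
```

The closed-form age serves only as a cross-check on the quadrature; for a more realistic $H(z)$ one would keep the numerical integral and replace `hubble`.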

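11.49 Model Cosmologies

The 0-component of the energy-momentum conservation law (11.372) is

$$ \begin{aligned} 0 = (T^a{}_0)_{;a} &= \partial_a T^a{}_0 + \Gamma^a_{ac}\,T^c{}_0 - T^a{}_c\,\Gamma^c_{0a} \\ &= -\partial_0 T_{00} - \Gamma^a_{a0}\,T_{00} - g^{cc}\,T_{cc}\,\Gamma^c_{0c} \\ &= -\dot\rho - 3\,\frac{\dot a}{a}\,\rho - 3p\,\frac{\dot a}{a} = -\dot\rho - 3\,\frac{\dot a}{a}\,(\rho + p) \end{aligned} \tag{11.426} $$

or

$$ \frac{d\rho}{da} = -\frac{3}{a}\,(\rho + p). \tag{11.427} $$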

The energy density ρ is composed of fractions ρk each contributing its own partial pressure pk according to its own equation of state

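$$ p_k = w_k\,\rho_k \tag{11.428} $$

in which $w_k$ is a constant. In terms of these components, the energy-momentum conservation law (11.427) is

$$ \sum_k \frac{d\rho_k}{da} = -\frac{3}{a} \sum_k (1 + w_k)\,\rho_k \tag{11.429} $$

with solution

$$ \rho = \sum_k \bar\rho_k \left(\frac{\bar a}{a}\right)^{3(1+w_k)} = \sum_k \bar\rho_k \left(\frac{\bar a}{a}\right)^{3(1+p_k/\rho_k)} \tag{11.430} $$

in which $\bar\rho_k$ is the value of $\rho_k$ when the scale factor is $\bar a$. Simple cosmological models take the energy density and pressure each to have a single component with $p = w\rho$, and in this case

$$ \rho = \bar\rho \left(\frac{\bar a}{a}\right)^{3(1+w)} = \bar\rho \left(\frac{\bar a}{a}\right)^{3(1+p/\rho)}. \tag{11.431} $$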
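A quick numerical check of the single-component solution (11.431) is to integrate the conservation law $d\rho/da = -(3/a)(1+w)\rho$ directly and compare with the power law. The sketch below uses a hand-written fourth-order Runge-Kutta step; the function names, step count, and starting values are arbitrary choices made for illustration.

```python
def integrate_density(w, a_start, a_end, rho_start, steps=10000):
    """RK4 integration of d(rho)/da = -(3/a)(1 + w) rho from a_start to a_end."""

    def slope(a, rho):
        return -3.0 * (1.0 + w) * rho / a

    h = (a_end - a_start) / steps
    a, rho = a_start, rho_start
    for _ in range(steps):
        k1 = slope(a, rho)
        k2 = slope(a + h / 2, rho + h * k1 / 2)
        k3 = slope(a + h / 2, rho + h * k2 / 2)
        k4 = slope(a + h, rho + h * k3)
        rho += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        a += h
    return rho


def power_law(w, a_start, a_end, rho_start):
    """Closed-form density rho_start * (a_start / a_end)**(3 (1 + w))."""
    return rho_start * (a_start / a_end) ** (3.0 * (1.0 + w))


for w in (-1.0, -1.0 / 3.0, 0.0, 1.0 / 3.0):
    print(w, integrate_density(w, 1.0, 2.0, 1.0), power_law(w, 1.0, 2.0, 1.0))
```

The four values of $w$ are the ones used in the examples of this section: vacuum energy, no acceleration, matter, and radiation.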

Example 11.25 ($w = -1/3$, No Acceleration) If $w = -1/3$, then $p = w\rho = -\rho/3$ and $\rho + 3p = 0$. The second-order Friedmann equation (11.410) then tells us that $\ddot a = 0$. The scale factor does not accelerate. To find its constant speed, we use its equation of state (11.431)

$$ \rho = \bar\rho \left(\frac{\bar a}{a}\right)^{3(1+w)} = \bar\rho \left(\frac{\bar a}{a}\right)^{2}. \tag{11.432} $$

Now all the terms in Friedmann's first-order equation (11.412) have a common factor of $1/a^2$ which cancels leaving us with the square of the constant speed

$$ \dot a^2 = \frac{8\pi G}{3}\,\bar\rho\,\bar a^2 - k. \tag{11.433} $$

Incidentally, $\bar\rho\,\bar a^2$ must exceed $3k/8\pi G$. The scale factor grows linearly with time as

$$ a(t) = \left(\frac{8\pi G}{3}\,\bar\rho\,\bar a^2 - k\right)^{1/2} (t - t_0) + a(t_0). \tag{11.434} $$

Setting $t_0 = 0$ and $a(0) = 0$, we use the definition of the Hubble parameter $H = \dot a/a$ to write the constant linear growth $\dot a$ as $aH$ and the time as

$$ t = \int_0^a \frac{da'}{a'H'} = \frac{1}{aH} \int_0^a da' = \frac{1}{H} \tag{11.435} $$

in which $a'H' = \dot a = aH$ is constant. So in a universe without acceleration, the age of the universe is the inverse of the Hubble rate. For our universe, the present Hubble time is $1/H_0 = 14.5$ billion years, which isn't far from the actual age of $13.817 \pm 0.048$ billion years. Presumably, a slower Hubble rate during the era of matter compensates for the higher rate during the era of dark energy.

Example 11.26 ($w = -1$, Inflation) Inflation occurs when the ground state of the theory has a positive and constant energy density $\rho > 0$ that dwarfs the energy densities of the matter and radiation. The internal energy of the universe then is proportional to its volume $U = \rho V$, and the pressure $p$ as given by the thermodynamic relation

$$ p = -\frac{\partial U}{\partial V} = -\rho \tag{11.436} $$

is negative. The equation of state (11.428) tells us that in this case $w = -1$. The second-order Friedmann equation (11.410) becomes

$$ \frac{\ddot a}{a} = -\frac{4\pi G}{3}\,(\rho + 3p) = \frac{8\pi G\rho}{3} \equiv g^2. \tag{11.437} $$

By it and the first-order Friedmann equation (11.412) and by choosing $t = 0$ as the time at which the scale factor $a$ is minimal, one may show (exercise 11.37) that in a closed ($k = 1$) universe

$$ a(t) = \frac{\cosh gt}{g}. \tag{11.438} $$

Similarly in an open ($k = -1$) universe with $a(0) = 0$, we have

$$ a(t) = \frac{\sinh gt}{g}. \tag{11.439} $$

Finally in a flat ($k = 0$) expanding universe, the scale factor is

$$ a(t) = a(0)\,\exp(gt). \tag{11.440} $$

Studies of the cosmic microwave background radiation suggest that inflation did occur in the very early universe, possibly on a time scale as short as $10^{-35}$ s. What is the origin of the vacuum energy density $\rho$ that drove inflation? Current theories attribute it to the assumption by at least one scalar field $\phi$ of a mean value $\langle\phi\rangle$ different from the one $\langle 0|\phi|0\rangle$ that minimizes the energy density of the vacuum. When $\langle\phi\rangle$ settled to $\langle 0|\phi|0\rangle$, the vacuum energy was released as radiation and matter in a Big Bang.
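The three inflationary scale factors (11.438–11.440) can be checked against the first-order Friedmann equation, which for constant $\rho$ reads $\dot a^2 = g^2 a^2 - k$. The sketch below does this with a central finite difference; the function names, the test values of $g$ and $t$, and the step size are illustrative assumptions.

```python
import math


def scale_factor(k, g, t, a0=1.0):
    """Inflationary scale factor for k = 1, -1, or 0 as in (11.438)-(11.440)."""
    if k == 1:
        return math.cosh(g * t) / g
    if k == -1:
        return math.sinh(g * t) / g
    return a0 * math.exp(g * t)


def friedmann_residual(k, g, t, h=1e-5):
    """Return adot**2 - (g**2 a**2 - k), which should vanish for a solution."""
    a = scale_factor(k, g, t)
    adot = (scale_factor(k, g, t + h) - scale_factor(k, g, t - h)) / (2 * h)
    return adot ** 2 - (g * g * a * a - k)


for k in (1, -1, 0):
    print(k, friedmann_residual(k, g=0.7, t=1.3))
```

The residuals are not exactly zero because the derivative is approximated numerically; they should be tiny compared with $\dot a^2$ itself.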

Example 11.27 ($w = 1/3$, The Era of Radiation) Until a redshift of $z = 3400$ or 50,000 years after inflation, our universe was dominated by radiation (Frieman et al., 2008). During The First Three Minutes (Weinberg, 1988) of the era of radiation, the quarks and gluons formed hadrons, which decayed into protons and neutrons. As the neutrons decayed ($\tau = 885.7$ s), they and the protons formed the light elements, principally hydrogen, deuterium, and helium, in a process called big-bang nucleosynthesis.

We can guess the value of $w$ for radiation by noticing that the energy-momentum tensor of the electromagnetic field (in suitable units)

$$ T^{ab} = F^a{}_c\,F^{bc} - \frac{1}{4}\,g^{ab}\,F_{cd}F^{cd} \tag{11.441} $$

is traceless

$$ T = T^a{}_a = F^a{}_c\,F_a{}^c - \frac{1}{4}\,\delta^a_a\,F_{cd}F^{cd} = 0. \tag{11.442} $$

But by (11.409) its trace must be $T = 3p - \rho$. So for radiation $p = \rho/3$ and $w = 1/3$. The relation (11.431) between the energy density and the scale factor then is

$$ \rho = \bar\rho \left(\frac{\bar a}{a}\right)^{4}. \tag{11.443} $$

The energy density drops both with the volume $a^3$ and with the scale factor $a$ due to a redshift; so it drops as $1/a^4$. Thus the quantity

$$ f^2 \equiv \frac{8\pi G\rho a^4}{3} \tag{11.444} $$

is a constant. The Friedmann equations (11.410 & 11.411) now are

$$ \frac{\ddot a}{a} = -\frac{4\pi G}{3}\,(\rho + 3p) = -\frac{8\pi G\rho}{3} \quad\text{or}\quad \ddot a = -\frac{f^2}{a^3} \tag{11.445} $$

and

$$ \dot a^2 + k = \frac{f^2}{a^2}. \tag{11.446} $$

With calendars chosen so that $a(0) = 0$, this last equation (11.446) tells us that for a flat universe ($k = 0$)

$$ a(t) = (2ft)^{1/2} \tag{11.447} $$

while for a closed universe ($k = 1$)

$$ a(t) = \sqrt{f^2 - (t - f)^2} \tag{11.448} $$

and for an open universe ($k = -1$)

$$ a(t) = \sqrt{(t + f)^2 - f^2} \tag{11.449} $$

as we saw in (6.422). The scale factor (11.448) of a closed universe of radiation has a maximum $a = f$ at $t = f$ and falls back to zero at $t = 2f$.
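The radiation-era solutions (11.447–11.449) can be tested against the first-order equation (11.446), $\dot a^2 + k = f^2/a^2$, with a numerical derivative. In the sketch below the function names and the sample values of $f$ and $t$ are arbitrary choices for illustration.

```python
import math


def a_radiation(k, f, t):
    """Radiation-era scale factor for k = 0, 1, or -1 as in (11.447)-(11.449)."""
    if k == 0:
        return math.sqrt(2.0 * f * t)
    if k == 1:
        return math.sqrt(f * f - (t - f) ** 2)
    return math.sqrt((t + f) ** 2 - f * f)


def residual(k, f, t, h=1e-6):
    """Return adot**2 + k - f**2 / a**2, which should vanish for a solution."""
    a = a_radiation(k, f, t)
    adot = (a_radiation(k, f, t + h) - a_radiation(k, f, t - h)) / (2.0 * h)
    return adot ** 2 + k - f * f / (a * a)


for k in (0, 1, -1):
    print(k, residual(k, f=2.0, t=1.0))

# The closed universe peaks at a = f when t = f and recollapses at t = 2f.
print(a_radiation(1, 2.0, 2.0), a_radiation(1, 2.0, 4.0))
```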

Example 11.28 ($w = 0$, The Era of Matter) A universe composed only of dust or non-relativistic collisionless matter has no pressure. Thus $p = w\rho = 0$ with $\rho \ne 0$, and so $w = 0$. Conservation of energy (11.430), or equivalently (11.431), implies that the energy density falls with the volume

$$ \rho = \bar\rho \left(\frac{\bar a}{a}\right)^{3}. \tag{11.450} $$

As the scale factor $a(t)$ increases, the matter energy density, which falls as $1/a^3$, eventually dominates the radiation energy density, which falls as $1/a^4$. This happened in our universe about 50,000 years after inflation at a temperature of $T = 9{,}400$ K or $kT = 0.81$ eV. Were baryons most of the matter, the era of radiation dominance would have lasted for a few hundred thousand years. But the kind of matter that we know about, which interacts with photons, is only about 17% of the total; the rest, an unknown substance called dark matter, shortened the era of radiation dominance by nearly 2 million years.

Since $\rho \propto 1/a^3$, the quantity

$$ m^2 = \frac{4\pi G\rho a^3}{3} \tag{11.451} $$

is a constant. For a matter-dominated universe, the Friedmann equations (11.410 & 11.411) then are

$$ \frac{\ddot a}{a} = -\frac{4\pi G}{3}\,(\rho + 3p) = -\frac{4\pi G\rho}{3} \quad\text{or}\quad \ddot a = -\frac{m^2}{a^2} \tag{11.452} $$

and

$$ \dot a^2 + k = 2m^2/a. \tag{11.453} $$

For a flat universe, $k = 0$, we get

$$ a(t) = \left(\frac{3m}{\sqrt{2}}\,t\right)^{2/3}. \tag{11.454} $$

For a closed universe, $k = 1$, we use example 6.46 to integrate

$$ \dot a = \sqrt{2m^2/a - 1} \tag{11.455} $$

to

$$ t - t_0 = -\sqrt{a(2m^2 - a)} - m^2 \arcsin(1 - a/m^2). \tag{11.456} $$

With a suitable calendar and choice of $t_0$, one may parametrize this solution in terms of the development angle $\phi(t)$ as

$$ \begin{aligned} a(t) &= m^2\,[1 - \cos\phi(t)] \\ t &= m^2\,[\phi(t) - \sin\phi(t)]. \end{aligned} \tag{11.457} $$

For an open universe, $k = -1$, we use example 6.47 to integrate

$$ \dot a = \sqrt{2m^2/a + 1} \tag{11.458} $$

to

$$ t - t_0 = \left[a(2m^2 + a)\right]^{1/2} - m^2 \ln\left\{ 2\left[a(2m^2 + a)\right]^{1/2} + 2a + 2m^2 \right\}. \tag{11.459} $$

The conventional parametrization is

$$ \begin{aligned} a(t) &= m^2\,[\cosh\phi(t) - 1] \\ t &= m^2\,[\sinh\phi(t) - \phi(t)]. \end{aligned} \tag{11.460} $$
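The parametric solutions (11.457) and (11.460), together with the flat solution (11.454), can be checked against the first-order equation (11.453), $\dot a^2 + k = 2m^2/a$. For the parametric forms the sketch below uses $\dot a = (da/d\phi)/(dt/d\phi)$, so no numerical differentiation is needed; the function names and sample values are illustrative assumptions.

```python
import math


def closed_universe(m, phi):
    """Return (a, t, adot) for k = 1 from the development angle phi."""
    m2 = m * m
    a = m2 * (1.0 - math.cos(phi))
    t = m2 * (phi - math.sin(phi))
    adot = math.sin(phi) / (1.0 - math.cos(phi))     # (da/dphi) / (dt/dphi)
    return a, t, adot


def open_universe(m, phi):
    """Return (a, t, adot) for k = -1 from the parameter phi."""
    m2 = m * m
    a = m2 * (math.cosh(phi) - 1.0)
    t = m2 * (math.sinh(phi) - phi)
    adot = math.sinh(phi) / (math.cosh(phi) - 1.0)   # (da/dphi) / (dt/dphi)
    return a, t, adot


def flat_universe(m, t):
    """Return (a, t, adot) for k = 0 from the power law a ~ t**(2/3)."""
    a = (3.0 * m * t / math.sqrt(2.0)) ** (2.0 / 3.0)
    adot = (2.0 / 3.0) * a / t
    return a, t, adot


def residual(k, m, a, adot):
    """Return adot**2 + k - 2 m**2 / a, which should vanish for a solution."""
    return adot ** 2 + k - 2.0 * m * m / a


m = 1.3
a, _, adot = closed_universe(m, 1.1)
print(residual(1, m, a, adot))
a, _, adot = open_universe(m, 1.1)
print(residual(-1, m, a, adot))
a, _, adot = flat_universe(m, 0.8)
print(residual(0, m, a, adot))
```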

Transparency: Some 380,000 years after inflation at a redshift of $z = 1090$, the universe had cooled to about $T = 3000$ K or $kT = 0.26$ eV, a temperature at which less than 1% of the hydrogen is ionized. Ordinary matter became a gas of neutral atoms rather than a plasma of ions and electrons, and the universe suddenly became transparent to light. Some scientists call this moment of last scattering or first transparency recombination.

Example 11.29 ($w = -1$, The Era of Dark Energy) About 10.3 billion years after inflation at a redshift of $z = 0.30$, the matter density falling as $1/a^3$ dropped below the very small but positive value of the energy density $\rho_v = (2.23\ \text{meV})^4$ of the vacuum. The present time is 13.817 billion years after inflation. So for the past 3 billion years, this constant energy density, called dark energy, has accelerated the expansion of the universe approximately as (11.440)

$$ a(t) = a(t_m)\,\exp\left( (t - t_m)\sqrt{8\pi G\rho_v/3} \right) \tag{11.461} $$

in which $t_m = 10.3 \times 10^9$ years.

Observations and measurements on the largest scales indicate that the universe is flat: k = 0. So the evolution of the scale factor a(t) is given by the k = 0 equations (11.440, 11.447, 11.454, & 11.461) for a flat universe. During the brief era of inflation, the scale factor a(t) grew as (11.440)

$$ a(t) = a(0)\,\exp\left( t\sqrt{8\pi G\rho_i/3} \right) \tag{11.462} $$

in which $\rho_i$ is the positive energy density that drove inflation. During the 50,000-year era of radiation, $a(t)$ grew as $\sqrt{t}$ as in (11.447)

$$ a(t) = \left( 2(t - t_i)\sqrt{8\pi G\rho(t'_r)\,a^4(t'_r)/3} \right)^{1/2} + a(t_i) \tag{11.463} $$

where $t_i$ is the time at the end of inflation, and $t'_r$ is any time during the era of radiation. During this era, the energy of highly relativistic particles dominated the energy density, and $\rho a^4 \propto T^4 a^4$ was approximately constant, so that $T(t) \propto 1/a(t) \propto 1/\sqrt{t}$. When the temperature was in the range $10^{12} > T > 10^{10}$ K or $m_\mu c^2 > kT > m_e c^2$, where $m_\mu$ is the mass of the muon and $m_e$ that of the electron, the radiation was mostly electrons, positrons, photons, and neutrinos, and the relation between the time $t$ and the temperature $T$ was (Weinberg, 2010, ch. 3)

$$ t = 0.994\ \text{sec} \times \left(\frac{10^{10}\ \text{K}}{T}\right)^2 + \text{constant}. \tag{11.464} $$

By $10^9$ K, the positrons had annihilated with electrons, and the neutrinos fallen out of equilibrium. Between $10^9$ K and $10^6$ K, when the energy density of nonrelativistic particles became relevant, the time-temperature relation was (Weinberg, 2010, ch. 3)

$$ t = 1.78\ \text{sec} \times \left(\frac{10^{10}\ \text{K}}{T}\right)^2 + \text{constant}'. \tag{11.465} $$

During the 10.3 billion years of the matter era, $a(t)$ grew as (11.454)

2/3 ′ ′ 3/2 a(t)= (t tr) 3πGρ(tm)a(tm)+a (tr) + a(tr) (11.466) * − ) 0 ′ where tr is the time at the end of the radiation era, and tm is any time in the matter era. By 380,000 years, the temperature had dropped to 3000 K, the universe had become transparent, and the CMBR had begun to travel freely. Over the past 3 billion years of the era of vacuum dominance, a(t) has been growing exponentially (11.461)

$$ a(t) = a(t_m)\,\exp\left( (t - t_m)\sqrt{8\pi G\rho_v/3} \right) \tag{11.467} $$

in which $t_m$ is the time at the end of the matter era, and $\rho_v$ is the density of dark energy, which while vastly less than the energy density $\rho_i$ that drove inflation, currently amounts to 68.5% of the total energy density.

11.50 Yang-Mills Theory

The gauge transformation of an abelian gauge theory like electrodynamics multiplies a single charged field by a space-time-dependent phase factor $\phi'(x) = \exp(iq\theta(x))\,\phi(x)$. Yang and Mills generalized this gauge transformation to one that multiplies a vector $\phi$ of matter fields by a space-time dependent unitary matrix $U(x)$

$$ \phi'_a(x) = \sum_{b=1}^n U_{ab}(x)\,\phi_b(x) \quad\text{or}\quad \phi'(x) = U(x)\,\phi(x) \tag{11.468} $$

and showed how to make the action of the theory invariant under such non-abelian gauge transformations. (The fields $\phi$ are scalars for simplicity.) Since the matrix $U$ is unitary, inner products like $\phi^\dagger(x)\,\phi(x)$ are automatically invariant

$$ \left( \phi^\dagger(x)\,\phi(x) \right)' = \phi^\dagger(x)\,U^\dagger(x)\,U(x)\,\phi(x) = \phi^\dagger(x)\,\phi(x). \tag{11.469} $$

But inner products of derivatives $\partial^i\phi^\dagger\,\partial_i\phi$ are not invariant because the derivative acts on the matrix $U(x)$ as well as on the field $\phi(x)$. Yang and Mills made derivatives $D_i\phi$ that transform like the fields $\phi$

$$ (D_i\phi)' = U D_i\phi. \tag{11.470} $$
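The invariance (11.469) of the inner product under a unitary matrix is easy to see numerically. The sketch below builds one particular $2 \times 2$ unitary matrix from an angle and a phase (an illustrative choice, not the general element of any gauge group) and applies it to a two-component complex field at a single point; all names are hypothetical.

```python
import cmath
import math


def unitary(theta, alpha):
    """A 2x2 unitary matrix built from a rotation angle and a phase."""
    c, s = math.cos(theta), math.sin(theta)
    phase = cmath.exp(1j * alpha)
    return [[c, -s * phase.conjugate()],
            [s * phase, c]]


def apply(U, phi):
    """Matrix-vector product phi' = U phi."""
    return [sum(U[a][b] * phi[b] for b in range(len(phi))) for a in range(len(U))]


def inner(phi, chi):
    """Inner product phi^dagger chi."""
    return sum(p.conjugate() * c for p, c in zip(phi, chi))


phi = [1.0 + 2.0j, -0.5 + 0.3j]
U = unitary(0.4, 1.1)
phi_prime = apply(U, phi)
print(inner(phi, phi), inner(phi_prime, phi_prime))
```

Both printed values agree up to rounding, which is the content of (11.469) at one point $x$.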

To do so, they introduced gauge-field matrices $A_i$ that play the role of

with $\phi_0^\dagger \phi_0 = m^2/\lambda$ so as to minimize their potential energy density $V(\phi)$. Their kinetic action $(D^i\phi)^\dagger D_i\phi = (\partial^i\phi + A^i\phi)^\dagger (\partial_i\phi + A_i\phi)$ then is in effect $\phi_0^\dagger A^i A_i \phi_0$. The gauge-field matrix $A^i_{ab} = i\,t^\alpha_{ab}\,A^i_\alpha$ is a linear combination of the generators $t^\alpha$ of the gauge group. So the action of the scalar fields contains the term $\phi_0^\dagger A^i A_i \phi_0 = -M^2_{\alpha\beta}\,A^i_\alpha A_{i\beta}$ in which the mass-squared matrix for the gauge fields is $M^2_{\alpha\beta} = \phi_0^{*a}\,t^\alpha_{ab}\,t^\beta_{bc}\,\phi_0^c$. This Higgs mechanism gives masses to those linear combinations $b_{\beta i} A_\beta$ of the gauge fields for which $M^2_{\alpha\beta}\,b_{\beta i} = m_i^2\,b_{\alpha i} \ne 0$.

The Higgs mechanism also gives masses to the fermions. The mass term $m$ in the Yang-Mills-Dirac action is replaced by something like $c\,|\phi|$ in which $c$ is a constant, different for each fermion. In the vacuum and at low temperatures, each fermion acquires as its mass $c\,|\phi_0|$. On 4 July 2012, physicists at CERN's Large Hadron Collider announced the discovery of a Higgs-like particle with a mass near 126 GeV/$c^2$ (Peter Higgs 1929–).

11.51 Gauge Theory and Vectors

This section is optional on a first reading.

We can formulate Yang-Mills theory in terms of vectors as we did relativity. To accommodate noncompact groups, we will generalize the unitary matrices $U(x)$ of the Yang-Mills gauge group to nonsingular matrices $V(x)$ that act on $n$ matter fields $\psi^a(x)$ as

$$ \psi'^a(x) = \sum_{b=1}^n V^a{}_b(x)\,\psi^b(x). \tag{11.480} $$

The field

$$ \Psi(x) = \sum_{a=1}^n e_a(x)\,\psi^a(x) \tag{11.481} $$

will be gauge invariant $\Psi'(x) = \Psi(x)$ if the vectors $e_a(x)$ transform as

$$ e'_a(x) = \sum_{b=1}^n e_b(x)\,V^{-1b}{}_a(x). \tag{11.482} $$

In what follows, we will sum over repeated indices from 1 to $n$ and often will suppress explicit mention of the space-time coordinates. In this compressed notation, the field $\Psi$ is gauge invariant because

$$ \Psi' = e'_a\,\psi'^a = e_b\,V^{-1b}{}_a\,V^a{}_c\,\psi^c = e_b\,\delta^b{}_c\,\psi^c = e_b\,\psi^b = \Psi \tag{11.483} $$

which is $e'^{\mathsf T}\psi' = e^{\mathsf T} V^{-1} V \psi = e^{\mathsf T}\psi$ in matrix notation.

$d\rho$, $d\phi$, and $dz$, and so derive the expressions (11.169) for the orthonormal basis vectors $\hat\rho$, $\hat\phi$, and $\hat z$.

11.14 Similarly, derive (11.175) from (11.174).

11.15 Use the definition (11.191) to show that in flat 3-space, the dual of the Hodge dual is the identity: $**dx^i = dx^i$ and $**(dx^i \wedge dx^k) = dx^i \wedge dx^k$.

11.16 Use the definition of the Hodge star (11.202) to derive (a) two of the four identities (11.203) and (b) the other two.

11.17 Show that Levi-Civita's 4-symbol obeys the identity (11.207).

11.18 Show that $\epsilon_{\ell mn}\,\epsilon^{pmn} = 2\,\delta^p_\ell$.

11.19 Show that $\epsilon_{k\ell mn}\,\epsilon^{p\ell mn} = 3!\,\delta^p_k$.

11.20 (a) Using the formulas (11.175) for the basis vectors of spherical coordinates in terms of those of rectangular coordinates, compute the derivatives of the unit vectors $\hat r$, $\hat\theta$, and $\hat\phi$ with respect to the variables $r$, $\theta$, and $\phi$. Your formulas should express these derivatives in terms of the basis vectors $\hat r$, $\hat\theta$, and $\hat\phi$. (b) Using the formulas of (a) and our expression (6.28) for the gradient in spherical coordinates, derive the formula (11.297) for the laplacian $\nabla \cdot \nabla$.

11.21 Consider the torus with coordinates $\theta, \phi$ labeling the arbitrary point

$$ p = (\cos\phi\,(R + r\sin\theta),\ \sin\phi\,(R + r\sin\theta),\ r\cos\theta) \tag{11.505} $$

in which $R > r$. Both $\theta$ and $\phi$ run from 0 to $2\pi$. (a) Find the basis vectors $e_\theta$ and $e_\phi$. (b) Find the metric tensor and its inverse.

11.22 For the same torus, (a) find the dual vectors $e^\theta$ and $e^\phi$ and (b) find the nonzero connections $\Gamma^i_{jk}$ where $i$, $j$, & $k$ take the values $\theta$ & $\phi$.

11.23 For the same torus, (a) find the two Christoffel matrices $\Gamma_\theta$ and $\Gamma_\phi$, (b) find their commutator $[\Gamma_\theta, \Gamma_\phi]$, and (c) find the elements $R^\theta{}_{\theta\theta\theta}$, $R^\phi{}_{\theta\phi\theta}$, $R^\theta{}_{\phi\theta\phi}$, and $R^\phi{}_{\phi\phi\phi}$ of the curvature tensor.

11.24 Find the curvature scalar $R$ of the torus with points (11.505). Hint: In these four problems, you may imitate the corresponding calculation for the sphere in Sec. 11.42.
11.25 By differentiating the identity $g^{ik}\,g_{k\ell} = \delta^i_\ell$, show that $\delta g^{ik} = -g^{is} g^{kt}\,\delta g_{st}$ or equivalently that $dg^{ik} = -g^{is} g^{kt}\,dg_{st}$.

11.26 Just to get an idea of the sizes involved in black holes, imagine an isolated sphere of matter of uniform density $\rho$ that as an initial condition is all at rest within a radius $r_b$. Its radius will be less than its Schwarzschild radius if

$$ r_b < \frac{2MG}{c^2} = 2\left(\frac{4}{3}\pi r_b^3\,\rho\right)\frac{G}{c^2}. \tag{11.506} $$

If the density $\rho$ is that of water under standard conditions (1 gram per cc), for what range of radii $r_b$ might the sphere be or become a black hole? Same question if $\rho$ is the density of dark energy.

11.27 For the points (11.392), derive the metric (11.395) with $k = 1$. Don't forget to relate $d\chi$ to $dr$.

11.28 For the points (11.393), derive the metric (11.395) with $k = 0$.

11.29 For the points (11.394), derive the metric (11.395) with $k = -1$. Don't forget to relate $d\chi$ to $dr$.

11.30 Suppose the constant $k$ in the Robertson-Walker metric (11.391 or 11.395) is some number other than 0 or $\pm 1$. Find a coordinate transformation such that in the new coordinates, the Robertson-Walker metric has $k = k/|k| = \pm 1$. Hint: You also can change the scale factor $a$.

11.31 Derive the affine connections in Eq. (11.399).

11.32 Derive the affine connections in Eq. (11.400).

11.33 Derive the affine connections in Eq. (11.401).

11.34 Derive the spatial Einstein equation (11.411) from (11.375, 11.395, 11.406, 11.408, & 11.409).

11.35 Assume there had been no inflation, no era of radiation, and no dark energy. In this case, the magnitude of the difference $|\Omega - 1|$ would have increased as $t^{2/3}$ over the past 13.8 billion years. Show explicitly how close to unity $\Omega$ would have had to have been at $t = 1$ s so as to satisfy the observational constraint $|\Omega_0 - 1| < 0.036$ on the present value of $\Omega$.

11.36 Derive the relation (11.431) between the energy density $\rho$ and the Robertson-Walker scale factor $a(t)$ from the conservation law (11.427) and the equation of state $p = w\rho$.

11.37 Use the Friedmann equations (11.410 & 11.412) for constant $\rho = -p$ and $k = 1$ to derive (11.438) subject to the boundary condition that $a(t)$ has its minimum at $t = 0$.

11.38 Use the Friedmann equations (11.410 & 11.412) with $w = -1$, $\rho$ constant, and $k = -1$ to derive (11.439) subject to the boundary condition that $a(0) = 0$.

11.39 Use the Friedmann equations (11.410 & 11.412) with $w = -1$, $\rho$ constant, and $k = 0$ to derive (11.440). Show why a linear combination of the two solutions (11.440) does not work.

11.40 Use the conservation equation (11.444) and the Friedmann equations (11.410 & 11.412) with $w = 1/3$, $k = 0$, and $a(0) = 0$ to derive (11.447).

11.41 Show that if the matrix $U(x)$ is nonsingular, then

$$ (\partial_i U)\,U^{-1} = -\,U\,\partial_i U^{-1}. \tag{11.507} $$

11.42 The gauge-field matrix is a linear combination $A_k = -ig\,t^b A^b_k$ of the

There are two ways of thinking about differential forms. The Russian literature views a manifold as embedded in $\mathbb{R}^n$ and so is somewhat more straightforward. We will discuss it first.

The Russian Way: Suppose $x(t)$ is a curve with $x(0) = x$ on some manifold $M$, and $f(x(t))$ is a smooth function $f: \mathbb{R}^n \to \mathbb{R}$ that maps points $x(t)$ into numbers. Then the differential $df(\dot x(t))$ maps $\dot x(t)$ at $x$ into

$$ df\left(\frac{d}{dt}x(t)\right) \equiv \frac{d}{dt} f(x(t)) = \sum_{j=1}^n \dot x_j(t)\,\frac{\partial f(x(t))}{\partial x_j} = \dot x(t) \cdot \nabla f(x(t)) \tag{12.18} $$

all at $t = 0$. As physicists, we think of $df$ as a number: the change in the function $f(x)$ when its argument $x$ is changed by $dx$. Russian mathematicians think of $df$ as a linear map of tangent vectors $\dot x$ at $x$ into numbers. Since this map is linear, we may multiply the definition (12.18) by $dt$ and arrive at the more familiar formula

$$ dt\,df\left(\frac{d}{dt}x(t)\right) = df\left(dt\,\frac{d}{dt}x(t)\right) = df(dx(t)) = dx(t) \cdot \nabla f(x(t)) \tag{12.19} $$

all at $t = 0$. So

$$ df(dx) = dx \cdot \nabla f \tag{12.20} $$

is the physicist's $df$.

Since the differential $df$ is a linear map of vectors $\dot x(0)$ into numbers, it is a 1-form; since it is defined on vectors like $\dot x(0)$, it is a differential 1-form. The term differential 1-form underscores the fact that the actual value of the differential $df$ depends upon the vector $\dot x(0)$ and the point $x = x(0)$. Mathematicians call the space of vectors $\dot x(0)$ at the point $x = x(0)$ the tangent space $TM_x$. They say $df$ is a smooth map of the tangent bundle $TM$, which is the union of the tangent spaces for all points $x$ in the manifold $M$, to the real line, so $df: TM \to \mathbb{R}$.

In the special case in which $f(x) = x_i(x) = x_i$, the differential $dx_i(\dot x(t))$ by (12.18) is

$$ dx_i(\dot x(t)) = \sum_{j=1}^n \dot x_j(t)\,\frac{\partial x_i(x)}{\partial x_j} = \sum_{j=1}^n \dot x_j(t)\,\frac{\partial x_i}{\partial x_j} = \sum_{j=1}^n \dot x_j(t)\,\delta_{ij} = \dot x_i(t). \tag{12.21} $$

These $dx_i$'s are the basic differentials. Using $A$ for the vector $\dot x(t)$, we find from our definition (12.18) that

$$ dx_i(A) = \sum_{j=1}^n A_j\,\frac{\partial x_i}{\partial x_j} = \sum_{j=1}^n A_j\,\delta_{ij} = A_i \tag{12.22} $$

sets of basic differentials. Then by applying the formula (12.24) to the function $y_k(x)$, we get

$$ dy_k = \sum_{j=1}^n \frac{\partial y_k(x)}{\partial x_j}\,dx_j \tag{12.32} $$

which is the familiar rule for changing variables.

The most general differential 1-form $\omega$ on the space $\mathbb{R}^n$ with coordinates $x_1 \dots x_n$ is a linear combination of the basic differentials $dx_i$ with coefficients $a_i(x)$ that are smooth functions of $x = (x_1, \dots, x_n)$

$$ \omega = a_1(x)\,dx_1 + \dots + a_n(x)\,dx_n. \tag{12.33} $$

The basic differential 2-forms are $dx_i \wedge dx_k$ defined as

$$ dx_i \wedge dx_k (A, B) = \begin{vmatrix} dx_i(A) & dx_k(A) \\ dx_i(B) & dx_k(B) \end{vmatrix} = \begin{vmatrix} A_i & A_k \\ B_i & B_k \end{vmatrix} = A_i B_k - A_k B_i. \tag{12.34} $$

So in particular

$$ dx_i \wedge dx_i = 0. \tag{12.35} $$

The basic differential $k$-forms $dx_1 \wedge \dots \wedge dx_k$ are defined as

$$ dx_1 \wedge \dots \wedge dx_k (A_1, \dots, A_k) = \begin{vmatrix} dx_1(A_1) & \dots & dx_k(A_1) \\ \vdots & \ddots & \vdots \\ dx_1(A_k) & \dots & dx_k(A_k) \end{vmatrix} = \begin{vmatrix} A_{11} & \dots & A_{1k} \\ \vdots & \ddots & \vdots \\ A_{k1} & \dots & A_{kk} \end{vmatrix}. \tag{12.36} $$

Example 12.4 ($dx_3 \wedge dr^2$) If $r^2 = x_1^2 + x_2^2 + x_3^2$, then $dr^2$ is

$$ dr^2 = 2\,(x_1\,dx_1 + x_2\,dx_2 + x_3\,dx_3) \tag{12.37} $$

and the differential 2-form $\omega = dx_3 \wedge dr^2$ is

$$ \omega = dx_3 \wedge 2\,(x_1\,dx_1 + x_2\,dx_2 + x_3\,dx_3) = 2x_1\,dx_3 \wedge dx_1 + 2x_2\,dx_3 \wedge dx_2 \tag{12.38} $$

since in view of (12.35) $dx_3 \wedge dx_3 = 0$. So the value of the 2-form $\omega$ on the vectors $A = (1, 2, 3)$ and $B = (2, 1, 1)$ at the point $x = (3, 0, 3)$ is

$$ \omega(A, B) = 2x_1\,dx_3 \wedge dx_1 (A, B) = 6 \begin{vmatrix} dx_3(A) & dx_1(A) \\ dx_3(B) & dx_1(B) \end{vmatrix} = 6 \begin{vmatrix} 3 & 1 \\ 1 & 2 \end{vmatrix} = 30. \tag{12.39} $$

On the vectors $C = (1, 0, 0)$ and $D = (0, 0, 1)$ at $x = (2, 3, 4)$, this 2-form has the value $\omega(C, D) = -4$.
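The two values quoted in Example 12.4 follow directly from the definition (12.34) of the basic 2-forms. The sketch below encodes $dx_i \wedge dx_k$ as a small function of two vectors and evaluates $\omega = dx_3 \wedge dr^2$ at the two points; the function names and the 1-based index convention are choices made for this illustration.

```python
def wedge(i, k):
    """Return dx_i ^ dx_k as a function of two vectors (1-based indices)."""
    def form(A, B):
        return A[i - 1] * B[k - 1] - A[k - 1] * B[i - 1]
    return form


def omega(x, A, B):
    """Evaluate omega = 2 x1 dx3^dx1 + 2 x2 dx3^dx2 at the point x."""
    return 2 * x[0] * wedge(3, 1)(A, B) + 2 * x[1] * wedge(3, 2)(A, B)


print(omega((3, 0, 3), (1, 2, 3), (2, 1, 1)))   # value on A and B
print(omega((2, 3, 4), (1, 0, 0), (0, 0, 1)))   # value on C and D
```

Swapping the two vector arguments flips the sign of each result, as the determinant in (12.34) requires.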

∆qi, one may show (exercise 12.10) that this sum of areas remains constant

d dω1(δp, δq; ∆p, ∆q) = 0 (12.77) dt along the trajectories in phase space (Gutzwiller, 1990, chap. 7).

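Example 12.12 (The Curl) We saw in example 12.7 that the 1-form (12.50) of a vector field $A$ is $\omega_A = A_1 h_1\,dx_1 + A_2 h_2\,dx_2 + A_3 h_3\,dx_3$ in which the $h_k$'s are those that determine (12.44) the squared length $ds^2 = h_k^2\,dx_k^2$ of the triply orthogonal coordinate system with unit vectors $\hat e_1, \hat e_2, \hat e_3$. So the exterior derivative of the 1-form $\omega_A$ is

$$ d\omega_A = \sum_{i,k=1}^3 \partial_k (A_i h_i)\,dx_k \wedge dx_i $$

$$ = \left( \frac{\partial (A_3 h_3)}{\partial x_2} - \frac{\partial (A_2 h_2)}{\partial x_3} \right) dx_2 \wedge dx_3 + \left( \frac{\partial (A_2 h_2)}{\partial x_1} - \frac{\partial (A_1 h_1)}{\partial x_2} \right) dx_1 \wedge dx_2 \tag{12.78} $$

$$ + \left( \frac{\partial (A_1 h_1)}{\partial x_3} - \frac{\partial (A_3 h_3)}{\partial x_1} \right) dx_3 \wedge dx_1 \equiv \omega_{\nabla \times A}. \tag{12.79} $$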

Comparison with Eq. (12.52) shows that the curl of A is

1 ∂A3 h3 ∂A2 h2 ∇×A = − dx2 ∧ dx3 eˆ1 + ... h2 h3 # ∂x2 ∂3 $ h eˆ h eˆ h eˆ 1 1 1 2 2 3 3 = " ∂ ∂ ∂ " " 1 2 3 " h1 h2 h3 " " " A1h1 A2h2 A3h3 " " 3 " 1 " ∂(Akhk) " = ϵijkhi eˆi (12.80) h1h2h3 % ∂xj i,j,k=1 as we saw in (11.240). This formula gives our earlier expressions for the curl in cylindrical and spherical coordinates (11.241 & 11.242).

Example 12.13 (The Divergence) We have seen in equations (12.48, 12.49, & 12.52) that the 2-form ωA(U, V )=A · (U × V ) of the vector field A = A1eˆ1 + A2eˆ2 + A3eˆ3 is

ω2 ∧ ∧ ∧ A = A1 h2 h3 dx2 dx3 + A2 h3 h1 dx3 dx1 + A3 h1 h2 dx1 dx2. (12.81) 544 Probability and Statistics

One way of computing $P_B(n, p, N)$ for large $N$ is to use Srinivasa Ramanujan's correction (4.39) to Stirling's formula $N! \approx \sqrt{2\pi N}\,(N/e)^N$

$$ N! \approx \sqrt{2\pi N} \left(\frac{N}{e}\right)^N \left(1 + \frac{1}{2N} + \frac{1}{8N^2}\right)^{1/6}. \tag{13.51} $$

When $N$ and $N - n$, but not $n$, are big, one may use (13.51) for $N!$ and $(N - n)!$ in the formula (13.43) for $P_B(n, p, N)$ and so may show (exercise 13.11) that

$$ P_B(n, p, N) \approx \frac{(pN)^n}{n!}\,q^{N-n}\,R_2(n, N) \tag{13.52} $$

in which

$$ R_2(n, N) = \left(1 - \frac{n}{N}\right)^{n-1/2} \left(1 + \frac{1}{2N} + \frac{1}{8N^2}\right)^{1/6} \left[1 + \frac{1}{2(N-n)} + \frac{1}{8(N-n)^2}\right]^{-1/6} \tag{13.53} $$

tends to unity as $N \to \infty$ for any fixed $n$.

When all three factorials in $P_B(n, p, N)$ are huge, one may use Ramanujan's approximation (13.51) to show (exercise 13.12) that

$$ P_B(n, p, N) \approx \sqrt{\frac{N}{2\pi n (N - n)}} \left(\frac{pN}{n}\right)^n \left(\frac{qN}{N - n}\right)^{N-n} R_3(n, N) \tag{13.54} $$

where

$$ R_3(n, N) = \left(1 + \frac{1}{2n} + \frac{1}{8n^2}\right)^{-1/6} \left(1 + \frac{1}{2N} + \frac{1}{8N^2}\right)^{1/6} \left[1 + \frac{1}{2(N-n)} + \frac{1}{8(N-n)^2}\right]^{-1/6} \tag{13.55} $$

tends to unity as $N \to \infty$, $N - n \to \infty$, and $n \to \infty$.

Another way of coping with the unwieldy factorials in the binomial formula $P_B(n, p, N)$ is to use limiting forms of (13.43) due to Poisson and to Gauss.
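To get a feeling for how much Ramanujan's factor in (13.51) improves on plain Stirling, one can compare both with the exact factorial. The sketch below uses only Python's standard `math` module; the helper names are illustrative.

```python
import math


def stirling(n):
    """Stirling's approximation sqrt(2 pi n) (n/e)**n."""
    return math.sqrt(2.0 * math.pi * n) * (n / math.e) ** n


def ramanujan(n):
    """Stirling times Ramanujan's factor (1 + 1/(2n) + 1/(8n^2))**(1/6)."""
    return stirling(n) * (1.0 + 1.0 / (2 * n) + 1.0 / (8 * n * n)) ** (1.0 / 6.0)


def relative_error(approx, n):
    """Relative error of an approximation to n!."""
    exact = math.factorial(n)
    return abs(approx(n) - exact) / exact


for n in (5, 10, 50, 100):
    print(n, relative_error(stirling, n), relative_error(ramanujan, n))
```

For moderate $n$ the corrected formula is accurate to several more digits than Stirling's, which is why it is convenient for the binomial estimates (13.52) and (13.54).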

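13.4 The Poisson Distribution

Poisson took the two limits $N \to \infty$ and $p = \langle n \rangle/N \to 0$. So we let $N$ and $N - n$, but not $n$, tend to infinity, and use (13.52) for the binomial distribution (13.43). Since $R_2(n, N) \to 1$ as $N \to \infty$, we get

$$ P_B(n, p, N) \approx \frac{(pN)^n}{n!}\,q^{N-n} = \frac{\langle n \rangle^n}{n!}\,q^{N-n}. \tag{13.56} $$

13.10 The Einstein-Nernst relation

If a particle of mass $m$ carries an electric charge $q$ and is exposed to an electric field $\mathbf{E}$, then in addition to viscosity $-\mathbf{v}/B$ and random buffeting $\mathbf{f}$, the constant force $q\mathbf{E}$ acts on it

$$ m\,\frac{d\mathbf{v}}{dt} = -\frac{\mathbf{v}}{B} + q\mathbf{E} + \mathbf{f}. \tag{13.138} $$

The mean value of its velocity will then satisfy the differential equation

$$ \left\langle \frac{d\mathbf{v}}{dt} \right\rangle = -\frac{\langle \mathbf{v} \rangle}{\tau} + \frac{q\mathbf{E}}{m} \tag{13.139} $$

where $\tau = mB$. A particular solution of this inhomogeneous equation is

$$ \langle \mathbf{v}(t) \rangle_{pi} = \frac{q\tau\mathbf{E}}{m} = qB\mathbf{E}. \tag{13.140} $$

The general solution of its homogeneous version is $\langle \mathbf{v}(t) \rangle_{gh} = \mathbf{A}\exp(-t/\tau)$ in which the constant $\mathbf{A}$ is chosen to give $\langle \mathbf{v}(0) \rangle$ at $t = 0$. So by (6.13), the general solution $\langle \mathbf{v}(t) \rangle$ to equation (13.139) is (exercise 13.19) the sum of $\langle \mathbf{v}(t) \rangle_{pi}$ and $\langle \mathbf{v}(t) \rangle_{gh}$

$$ \langle \mathbf{v}(t) \rangle = qB\mathbf{E} + \left[ \langle \mathbf{v}(0) \rangle - qB\mathbf{E} \right] e^{-t/\tau}. \tag{13.141} $$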

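By applying the tricks of the previous section (13.9), one may show (exercise 13.20) that the variance of the position $\mathbf{r}$ about its mean $\langle \mathbf{r}(t) \rangle$ is

$$ \left\langle (\mathbf{r} - \langle \mathbf{r}(t) \rangle)^2 \right\rangle = \frac{6kT\tau^2}{m} \left( \frac{t}{\tau} - 1 + e^{-t/\tau} \right) \tag{13.142} $$

where $\langle \mathbf{r}(t) \rangle = (q\tau^2\mathbf{E}/m)\left( t/\tau - 1 + e^{-t/\tau} \right)$ if $\langle \mathbf{r}(0) \rangle = \langle \mathbf{v}(0) \rangle = 0$. So for times $t \gg \tau$, this variance is

$$ \left\langle (\mathbf{r} - \langle \mathbf{r}(t) \rangle)^2 \right\rangle = \frac{6kT\tau t}{m}. \tag{13.143} $$

Since the diffusion constant $D$ is defined by (13.134) as

$$ \left\langle (\mathbf{r} - \langle \mathbf{r}(t) \rangle)^2 \right\rangle = 6Dt \tag{13.144} $$

we arrive at the Einstein-Nernst relation

$$ D = BkT = \frac{qB}{q}\,kT = \frac{\mu}{q}\,kT \tag{13.145} $$

in which the electric mobility is $\mu = qB$.

If we substitute our formula (13.169) for $\langle \mathbf{v}^2(t) \rangle$ into the expression (13.123) for the acceleration of $\langle \mathbf{r}^2 \rangle$, then we get

$$ \frac{d^2 \langle \mathbf{r}^2(t) \rangle}{dt^2} = -\frac{1}{\tau}\,\frac{d \langle \mathbf{r}^2(t) \rangle}{dt} + 2e^{-2t/\tau} \langle \mathbf{v}^2(0) \rangle + \frac{6kT}{m} \left( 1 - e^{-2t/\tau} \right). \tag{13.171} $$

The solution with both $\langle \mathbf{r}^2(0) \rangle = 0$ and $d\langle \mathbf{r}^2(0) \rangle/dt = 0$ is (exercise 13.21)

$$ \langle \mathbf{r}^2(t) \rangle = \langle \mathbf{v}^2(0) \rangle\,\tau^2 \left( 1 - e^{-t/\tau} \right)^2 - \frac{3kT}{m}\,\tau^2 \left( 1 - e^{-t/\tau} \right)\left( 3 - e^{-t/\tau} \right) + \frac{6kT\tau}{m}\,t. \tag{13.172} $$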

13.12 Characteristic and Moment-Generating Functions

The Fourier transform (3.9) of a probability distribution $P(x)$ is its characteristic function $\tilde P(k)$ sometimes written as $\chi(k)$

$$ \tilde P(k) \equiv \chi(k) \equiv E[e^{ikx}] = \int e^{ikx}\,P(x)\,dx. \tag{13.173} $$

The probability distribution $P(x)$ is the inverse Fourier transform (3.9)

$$ P(x) = \int e^{-ikx}\,\tilde P(k)\,\frac{dk}{2\pi}. \tag{13.174} $$

Example 13.10 (Gauss) The characteristic function of the gaussian

$$ P_G(x, \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right) \tag{13.175} $$

is by (3.18)

$$ \begin{aligned} \tilde P_G(k, \mu, \sigma) &= \frac{1}{\sigma\sqrt{2\pi}} \int \exp\left( ikx - \frac{(x - \mu)^2}{2\sigma^2} \right) dx \\ &= \frac{e^{ik\mu}}{\sigma\sqrt{2\pi}} \int \exp\left( ikx - \frac{x^2}{2\sigma^2} \right) dx = \exp\left( i\mu k - \frac{1}{2}\sigma^2 k^2 \right). \end{aligned} \tag{13.176} $$

For a discrete probability distribution $P_n$ the characteristic function is

$$ \chi(k) \equiv E[e^{ikx}] = \sum_n e^{ikn}\,P_n. \tag{13.177} $$

The normalization of both continuous and discrete probability distributions implies that their characteristic functions satisfy $\tilde P(0) = \chi(0) = 1$.

Example 13.11 (Poisson) The Poisson distribution (13.58)

$$ P_P(n, \langle n \rangle) = \frac{\langle n \rangle^n}{n!}\,e^{-\langle n \rangle} \tag{13.178} $$

has the characteristic function

$$ \chi(k) = \sum_{n=0}^\infty e^{ikn}\,\frac{\langle n \rangle^n}{n!}\,e^{-\langle n \rangle} = e^{-\langle n \rangle} \sum_{n=0}^\infty \frac{(\langle n \rangle e^{ik})^n}{n!} = \exp\left[ \langle n \rangle \left( e^{ik} - 1 \right) \right]. \tag{13.179} $$

The moment-generating function is the characteristic function evaluated at an imaginary argument

$$ M(k) \equiv E[e^{kx}] = \tilde P(-ik) = \chi(-ik). \tag{13.180} $$

For a continuous probability distribution $P(x)$, it is

$$ M(k) = E[e^{kx}] = \int e^{kx}\,P(x)\,dx \tag{13.181} $$

and for a discrete probability distribution $P_n$, it is

$$ M(k) = E[e^{kx}] = \sum_n e^{kx_n}\,P_n. \tag{13.182} $$

In both cases, the normalization of the probability distribution implies that $M(0) = 1$.

Derivatives of the moment-generating function and of the characteristic function give the moments

$$ E[x^n] = \mu_n = \left. \frac{d^n M(k)}{dk^n} \right|_{k=0} = (-i)^n \left. \frac{d^n \tilde P(k)}{dk^n} \right|_{k=0}. \tag{13.183} $$

Example 13.12 (Gauss and Poisson) The moment-generating functions for the distributions of Gauss (13.175) and Poisson (13.178) are

$$ M_G(k, \mu, \sigma) = \exp\left( \mu k + \frac{1}{2}\sigma^2 k^2 \right) \quad\text{and}\quad M_P(k, \langle n \rangle) = \exp\left[ \langle n \rangle \left( e^k - 1 \right) \right]. \tag{13.184} $$

They give as the first three moments of these distributions

$$ \mu_{G0} = 1, \quad \mu_{G1} = \mu, \quad \mu_{G2} = \mu^2 + \sigma^2 \tag{13.185} $$

$$ \mu_{P0} = 1, \quad \mu_{P1} = \langle n \rangle, \quad \mu_{P2} = \langle n \rangle + \langle n \rangle^2 \tag{13.186} $$

(exercise 13.22).

Since the characteristic and moment-generating functions have derivatives (13.183) proportional to the moments $\mu_n$, their Taylor series are

$$ \tilde P(k) = E[e^{ikx}] = \sum_{n=0}^\infty \frac{(ik)^n}{n!}\,E[x^n] = \sum_{n=0}^\infty \frac{(ik)^n}{n!}\,\mu_n \tag{13.187} $$

and

$$ M(k) = E[e^{kx}] = \sum_{n=0}^\infty \frac{k^n}{n!}\,E[x^n] = \sum_{n=0}^\infty \frac{k^n}{n!}\,\mu_n. \tag{13.188} $$
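The Poisson moments (13.186) can be confirmed by summing $n^j P_P(n, \langle n \rangle)$ directly. In the sketch below the cutoff `n_max` and the sample mean are arbitrary choices; the terms beyond the cutoff are negligible for the mean used here.

```python
import math


def poisson(n, mean):
    """Poisson probability <n>**n exp(-<n>) / n!."""
    return mean ** n * math.exp(-mean) / math.factorial(n)


def moment(order, mean, n_max=100):
    """Approximate E[n**order] by a truncated sum over n = 0..n_max."""
    return sum(n ** order * poisson(n, mean) for n in range(n_max + 1))


mean = 3.5
print(moment(0, mean), moment(1, mean), moment(2, mean))
print(1.0, mean, mean + mean ** 2)
```

The two printed lines should agree to many digits, the first being the numerical sums and the second the closed forms from (13.186).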

The cumulants $c_n$ of a probability distribution are the derivatives of the logarithm of its moment-generating function

$$ c_n = \left. \frac{d^n \ln M(k)}{dk^n} \right|_{k=0} = (-i)^n \left. \frac{d^n \ln \tilde P(k)}{dk^n} \right|_{k=0}. \tag{13.189} $$

One may show (exercise 13.24) that the first five cumulants of an arbitrary probability distribution are

$$ c_0 = 0, \quad c_1 = \mu, \quad c_2 = \sigma^2, \quad c_3 = \nu_3, \quad\text{and}\quad c_4 = \nu_4 - 3\sigma^4 \tag{13.190} $$

where the $\nu$'s are its central moments (13.27). The third and fourth normalized cumulants are the skewness $\zeta = c_3/\sigma^3 = \nu_3/\sigma^3$ and the kurtosis $\kappa = c_4/\sigma^4 = \nu_4/\sigma^4 - 3$.

Example 13.13 (Gaussian Cumulants) The logarithm of the moment-generating function (13.184) of Gauss's distribution is $\mu k + \sigma^2 k^2/2$. Thus by (13.189), $P_G(x, \mu, \sigma)$ has no skewness or kurtosis, its cumulants vanish $c_{Gn} = 0$ for $n > 2$, and its fourth central moment is $\nu_4 = 3\sigma^4$.

13.13 Fat Tails

The gaussian probability distribution $P_G(x, \mu, \sigma)$ falls off for $|x - \mu| \gg \sigma$ very fast, as $\exp\left( -(x - \mu)^2/2\sigma^2 \right)$. Many other probability distributions fall off more slowly; they have fat tails. Rare "black-swan" events (wild fluctuations, market bubbles, and crashes) lurk in their fat tails.

Gosset's distribution, which is known as Student's t-distribution with $\nu$ degrees of freedom

$$ P_S(x, \nu, a) = \frac{1}{\sqrt{\pi}}\,\frac{\Gamma((1 + \nu)/2)}{\Gamma(\nu/2)}\,\frac{a^\nu}{(a^2 + x^2)^{(1+\nu)/2}} \tag{13.191} $$

has power-law tails. Its even moments are

$$ \mu_{2n} = (2n - 1)!!\,\frac{\Gamma(\nu/2 - n)}{\Gamma(\nu/2)} \left( \frac{a^2}{2} \right)^n \tag{13.192} $$

for $2n < \nu$ and infinite otherwise. For $\nu = 1$, it coincides with the Breit-Wigner or Cauchy distribution

$$ P_S(x, 1, a) = \frac{1}{\pi}\,\frac{a}{a^2 + x^2} \tag{13.193} $$

in which $x = E - E_0$ and $a = \Gamma/2$ is the half-width at half-maximum.

Two representative cumulative probabilities are (Bouchaud and Potters, 2003, p. 15–16)

$$ \Pr(x, \infty) = \int_x^\infty P_S(x', 3, 1)\,dx' = \frac{1}{2} - \frac{1}{\pi} \left( \arctan x + \frac{x}{1 + x^2} \right) \tag{13.194} $$

$$ \Pr(x, \infty) = \int_x^\infty P_S(x', 4, \sqrt{2})\,dx' = \frac{1}{2} - \frac{3}{4}\,u + \frac{1}{4}\,u^3 \tag{13.195} $$

where $u = x/\sqrt{2 + x^2}$ and $a$ is picked so $\sigma^2 = 1$. William Gosset (1876–1937), who worked for Guinness, wrote as Student because Guinness didn't let its employees publish.

The log-normal probability distribution on $(0, \infty)$

$$ P_{\ln}(x) = \frac{1}{\sigma x \sqrt{2\pi}} \exp\left( -\frac{\ln^2 (x/x_0)}{2\sigma^2} \right) \tag{13.196} $$

describes distributions of rates of return (Bouchaud and Potters, 2003, p. 9). Its moments are (exercise 13.27)

$$ \mu_n = x_0^n\,e^{n^2\sigma^2/2}. \tag{13.197} $$
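The log-normal moments (13.197) can be checked numerically. With the substitution $y = \ln(x/x_0)$, the $n$th moment becomes $x_0^n$ times the gaussian average of $e^{ny}$, which the sketch below integrates by Simpson's rule. The function names, the integration window of 12 standard deviations, and the sample parameters are assumptions of this example.

```python
import math


def lognormal_moment(n, x0, sigma, steps=4000, width=12.0):
    """Simpson estimate of the n-th log-normal moment via y = ln(x/x0)."""
    center = n * sigma * sigma            # where exp(n y) * gaussian(y) peaks
    lo, hi = center - width * sigma, center + width * sigma
    h = (hi - lo) / steps

    def integrand(y):
        gauss = math.exp(-y * y / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))
        return math.exp(n * y) * gauss

    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return x0 ** n * total * h / 3.0


def lognormal_moment_formula(n, x0, sigma):
    """Closed form x0**n exp(n**2 sigma**2 / 2)."""
    return x0 ** n * math.exp(n * n * sigma * sigma / 2.0)


for n in range(4):
    print(n, lognormal_moment(n, 2.0, 0.5), lognormal_moment_formula(n, 2.0, 0.5))
```

The rapid growth of the moments with $n^2$ in the exponent is one way to see how heavy the upper tail of this distribution is.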

The exponential distribution on $[0,\infty)$
$$P_e(x) = \alpha\, e^{-\alpha x} \tag{13.198}$$
has (exercise 13.28) mean $\mu = 1/\alpha$ and variance $\sigma^2 = 1/\alpha^2$. The sum of $n$ independent exponentially and identically distributed random variables $x = x_1 + \cdots + x_n$ is distributed on $[0,\infty)$ as (Feller, 1966, p.10)
$$P_{n,e}(x) = \alpha\,\frac{(\alpha x)^{n-1}}{(n-1)!}\, e^{-\alpha x}. \tag{13.199}$$

The sum of the squares $x^2 = x_1^2 + \cdots + x_n^2$ of $n$ independent normally and identically distributed random variables of zero mean and variance $\sigma^2$ gives rise to Pearson's chi-squared distribution on $(0,\infty)$
$$P_{n,G}(x,\sigma)\,dx = \frac{\sqrt{2}}{\sigma}\,\frac{1}{\Gamma(n/2)}\left(\frac{x}{\sigma\sqrt{2}}\right)^{n-1} e^{-x^2/(2\sigma^2)}\,dx \tag{13.200}$$
which for $x = v$, $n = 3$, and $\sigma^2 = kT/m$ is (exercise 13.29) the Maxwell-Boltzmann distribution (13.100). In terms of $\chi = x/\sigma$, it is

$$P_n(\chi^2/2)\,d\chi^2 = \frac{1}{\Gamma(n/2)}\left(\frac{\chi^2}{2}\right)^{n/2-1} e^{-\chi^2/2}\,d\!\left(\chi^2/2\right). \tag{13.201}$$
It has mean and variance
$$\mu = n \quad\text{and}\quad \sigma^2 = 2n \tag{13.202}$$
and is used in the chi-squared test (Pearson, 1900).

Personal income, the amplitudes of catastrophes, the price changes of financial assets, and many other phenomena occur on both small and large scales. Lévy distributions describe such multi-scale phenomena. The characteristic function for a symmetric Lévy distribution is for $\nu \le 2$
$$\tilde L_\nu(k, a_\nu) = \exp\left(-a_\nu |k|^\nu\right). \tag{13.203}$$
Its inverse Fourier transform (13.174) is for $\nu = 1$ (exercise 13.30) the Cauchy or Lorentz distribution

$$L_1(x,a_1) = \frac{a_1}{\pi\,(x^2+a_1^2)} \tag{13.204}$$
and for $\nu = 2$ the gaussian
$$L_2(x,a_2) = P_G(x,0,\sqrt{2a_2}) = \frac{1}{2\sqrt{\pi a_2}}\,\exp\left(-\frac{x^2}{4a_2}\right) \tag{13.205}$$
but for other values of $\nu$ no simple expression for $L_\nu(x,a_\nu)$ is available. For $0 < \nu < 2$ and as $x \to \pm\infty$, it falls off as $|x|^{-(1+\nu)}$, and for $\nu > 2$ it assumes negative values, ceasing to be a probability distribution (Bouchaud and Potters, 2003, pp. 10–13).
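A numerical inversion of the characteristic function (13.203) reproduces both closed forms. The sketch below assumes the Fourier convention $L_\nu(x) = \int e^{ikx}\,\tilde L_\nu(k)\,dk/2\pi$ (the text's (13.174) is not shown here, so this convention is an assumption), which for a symmetric $\tilde L_\nu$ reduces to a cosine integral over $k \ge 0$. The function names and the cutoff at $a_\nu k^\nu = 40$ are illustrative choices.

```python
import math

def levy_pdf(x, a, nu, steps=40000):
    """Symmetric Levy density: (1/pi) * integral_0^inf cos(k x) exp(-a k^nu) dk.

    Midpoint rule, truncated where exp(-a k^nu) has dropped to exp(-40).
    """
    kmax = (40.0 / a) ** (1.0 / nu)
    h = kmax / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * h
        total += math.cos(k * x) * math.exp(-a * k ** nu)
    return total * h / math.pi

def cauchy_pdf(x, a1):
    """Closed form (13.204)."""
    return a1 / (math.pi * (x * x + a1 * a1))

def gaussian_levy_pdf(x, a2):
    """Closed form (13.205)."""
    return math.exp(-x * x / (4 * a2)) / (2 * math.sqrt(math.pi * a2))

if __name__ == "__main__":
    for x in (0.0, 0.7, 3.0):
        print(x, levy_pdf(x, 1.3, 1), cauchy_pdf(x, 1.3))
        print(x, levy_pdf(x, 0.5, 2), gaussian_levy_pdf(x, 0.5))
```

The same `levy_pdf` can be evaluated at intermediate values such as $\nu = 1.5$, for which no simple closed form is available.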

13.14 The Central Limit Theorem and Jarl Lindeberg

We have seen in sections (13.7 & 13.8) that unbiased fluctuations tend to distribute the position and velocity of molecules according to Gauss's distribution (13.75). Gaussian distributions occur very frequently. The central limit theorem suggests why they occur so often.

13.21 Kolmogorov's Test

and the number of points $10^m$ rises further, the probability distributions $\mathrm{Pr}^{(m)}_{e,G,G}(-\infty,u)$ converge to the universal cumulative probability distribution $K(u)$ and provide a numerical verification of Kolmogorov's theorem. Such curves make poor figures, however, because they hide beneath $K(u)$. The curves labeled $\mathrm{Pr}^{(m)}_{e,S,G}(-\infty,u)$ for $m = 2$ and 3 are made from 100 sets of $N = 10^m$ points taken from $P_S(x,3,1)$ and tested as to whether they instead come from $P_G(x,0,1)$. Note that as $N = 10^m$ increases from 100 to 1000, the cumulative probability distribution $\mathrm{Pr}^{(m)}_{e,S,G}(-\infty,u)$ moves farther from Kolmogorov's universal cumulative probability distribution $K(u)$. In fact, the curve $\mathrm{Pr}^{(4)}_{e,S,G}(-\infty,u)$ made from 100 sets of $10^4$ points lies beyond $u > 8$, too far to the right to fit in the figure. Kolmogorov's test gets more conclusive as the number of points $N \to \infty$.

Warning, mathematical hazard: While binned data are ideal for chi-squared fits, they ruin Kolmogorov tests. The reason is that if the data are in bins of width $w$, then the empirical cumulative probability distribution $\mathrm{Pr}_e^{(N)}(-\infty,x)$ is a staircase function with steps as wide as the bin-width $w$ even in the limit $N \to \infty$. Thus even if the data come from the theoretical distribution, the limiting value $D_\infty$ of the Kolmogorov distance will be positive. In fact, one may show (exercise 13.21) that when the data do come from the theoretical probability distribution $P_t(x)$ assumed to be continuous, then the value of $D_\infty$ is

wPt(x) D∞ sup . (13.324) ≈ −∞

Thus in this case, the quantity $\sqrt{N}\,D_N$ could diverge as $\sqrt{N}\,D_\infty$ and lead one to believe that the data had not come from $P_t(x)$.

Suppose we have made some changes in our experimental apparatus and our software, and we want to see whether the new data $x'_1, x'_2, \ldots, x'_{N'}$ we took after the changes are consistent with the old data $x_1, x_2, \ldots, x_N$ we took before the changes. Then following equations (13.310–13.312), we can make two empirical cumulative probability distributions: one $\mathrm{Pr}_e^{(N)}(-\infty,x)$ made from the $N$ old points $x_j$ and the other $\mathrm{Pr}_e^{(N')}(-\infty,x)$ made from the $N'$ new points $x'_j$. Next, we compute the distances

+ (N) (N ′) DN,N′ =sup Pre ( ,x) Pre ( ,x) −∞

// probability of no events
p[0] = exp(-AN);
prob = p[0];

// p[k] is the probability of fewer than k+1 events per day
for (k=1; k<=LOOP_ITR; k++) {
    fact = factorial(k);
    // Poisson probability of exactly k events: AN^k exp(-AN) / k!
    tmpProb = pow(AN, k) * exp(-AN) / fact;
    prob += tmpProb;
    p[k] = prob;
}

// Random seed
srand ( time(NULL) );

// Goes through all the histories for (k=0; k(rand()) / RAND_MAX;

        // Finds the smallest m with x < p[m]
        for (m=100; m>=0; m--) {
            if (x < p[m]) {
                num = m;
            }
        }

        histories[k][day] = num;
    }
    // Calculates max and sum
    numMuons = histories[k].max();
    totMuons = histories[k].sum();

    // Updates our records
    maxEvents[numMuons]++;
    totEvents[totMuons]++;
}

// Opens a data file
ofstream fhMaxEvents, fhSumEvents;
fhMaxEvents.open ("maxEvents.txt");
fhSumEvents.open ("totEvents.txt");

// Sets fixed-point output with 7 digits after the decimal point
fhMaxEvents.setf(ios::fixed,ios::floatfield);
fhMaxEvents.precision(7);
fhSumEvents.setf(ios::fixed,ios::floatfield);
fhSumEvents.precision(7);

// Writes the data to a file for (k=0; k

14.4 Statistical Mechanics

The Metropolis algorithm can generate a sequence of states or configurations of a system distributed according to the Boltzmann probability distribution (1.345). Suppose the state of the system is described by a vector $x$ of many components. For instance, if the system is a protein, the vector $x$ might be the $3N$ spatial coordinates of the $N$ atoms of the protein. A protein composed of 200 amino acids has about 4000 atoms, and so the vector $x$ would have some 12,000 components. Suppose $E(x)$ is the energy

fall within the region $R$
$$V_R = \frac{N_R}{N}\,L^n. \tag{14.2}$$
The integral formula (14.1) then becomes

$$\int_R f(x)\,d^n x \approx \frac{L^n}{N}\sum_{k=1}^{N_R} f(x_k). \tag{14.3}$$
The utility of the Monte Carlo method of numerical integration rises sharply with the dimension $n$ of the hypervolume.

Example 14.1 (Numerical Integration) Suppose one wants to integrate the function
$$f(x,y) = \frac{e^{-2x-3y}}{\sqrt{x^2+y^2+1}} \tag{14.4}$$
over the quarter of the unit disk in which $x$ and $y$ are positive. In this case, $V_R$ is the area $\pi/4$ of the quarter disk. To generate fresh random numbers, one must set the seed for the code that computes them. The following program sets the seed by using the subroutine init_random_seed defined in a Fortran 95 program in section 13.16. With some compilers, one can just write "call random_seed()."

program integrate
  implicit none ! catches typos
  integer :: k, N
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x, y, sum = 0.0d0, f
  real(dp), dimension(2) :: rdn
  real(dp), parameter :: area = atan(1.0d0) ! pi/4
  f(x,y) = exp(-2*x - 3*y)/sqrt(x**2 + y**2 + 1.0d0)
  write(6,*) 'How many points?'
  read(5,*) N
  call init_random_seed() ! set new seed
  do k = 1, N
10   call random_number(rdn); x = rdn(1); y = rdn(2)
     if (x**2+y**2 > 1.0d0) then
        go to 10
     end if
     sum = sum + f(x,y)
  end do
  ! integral = area times mean value < f > of f

Once the system is thermalized, one can start measuring properties of the system. One computes a physical quantity every hundred or every thousand sweeps and takes the average of these measurements. That average is the mean value of the physical quantity at temperature $T$.

Why does this work? Consider two configurations $x$ and $x'$ which respectively have energies $E = E(x)$ and $E' = E(x')$ and are occupied with probabilities $P_t(x)$ and $P_t(x')$ as the system is thermalizing. If $E' > E$, then the rate $R(x' \to x)$ of going from $x'$ to $x$ is the rate $v$ of choosing to test $x$ when one is at $x'$ times the probability $P_t(x')$ of being at $x'$, that is, $R(x' \to x) = v\,P_t(x')$. The reverse rate is $R(x \to x') = v\,P_t(x)\,e^{-(E'-E)/kT}$ with the same $v$ since the random walk is symmetric.
The net rate from x′ → x then is

$$R(x' \to x) - R(x \to x') = v\left[P_t(x') - P_t(x)\,e^{-(E'-E)/kT}\right]. \tag{14.8}$$
This net flow of probability from $x' \to x$ is positive if and only if

$$P_t(x')/P_t(x) > e^{-(E'-E)/kT}. \tag{14.9}$$

The probability distribution $P_t(x)$ therefore flows with each sweep toward the Boltzmann distribution $\exp(-E(x)/kT)$. The flow slows and stops when the two rates are equal, $R(x' \to x) = R(x \to x')$, a condition called detailed balance. At this equilibrium, the distribution $P_t(x)$ satisfies $P_t(x) = P_t(x')\,e^{-(E-E')/kT}$ in which $P_t(x')\,e^{E'/kT}$ is independent of $x$. So the thermalizing distribution $P_t(x)$ approaches the distribution $P(x) = c\,e^{-E/kT}$ in which $c$ is independent of $x$. Since the sum of these probabilities must be unity, we have
$$\sum_x P(x) = c\sum_x e^{-E(x)/kT} = 1 \tag{14.10}$$
which means that the constant $c$ is the inverse of the partition function

$$Z(T) = \sum_x e^{-E(x)/kT}. \tag{14.11}$$
The thermalizing distribution approaches Boltzmann's distribution (1.345)

$$P_t(x) \to P_B(x) = e^{-E(x)/kT}/Z(T). \tag{14.12}$$
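The flow of probability toward the Boltzmann distribution can be watched directly on a small system. The sketch below is an illustration under stated assumptions, not the text's program: the states sit on a ring, each step proposes the left or right neighbor with probability 1/2 (a symmetric walk), downhill moves are always accepted, and uphill moves are accepted with probability $e^{-(E'-E)/kT}$. Instead of sampling, it evolves the whole distribution $P_t$ with the resulting transition matrix. The energies and all function names are hypothetical.

```python
import math

def metropolis_matrix(energies, kT):
    """T[i][j] = probability of going from state i to state j in one step."""
    n = len(energies)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in ((i - 1) % n, (i + 1) % n):
            dE = energies[j] - energies[i]
            accept = 1.0 if dE <= 0 else math.exp(-dE / kT)
            T[i][j] += 0.5 * accept          # symmetric proposal rate v = 1/2
        T[i][i] = 1.0 - sum(T[i])            # rejected moves stay put
    return T

def evolve(p, T, steps):
    """Apply p -> p T repeatedly; p is the thermalizing distribution P_t."""
    n = len(p)
    for _ in range(steps):
        p = [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]
    return p

def boltzmann(energies, kT):
    """Boltzmann distribution (14.12) with partition function (14.11)."""
    weights = [math.exp(-E / kT) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

if __name__ == "__main__":
    E = [0.0, 1.0, 0.3, 2.0, 0.5]            # arbitrary illustrative energies
    T = metropolis_matrix(E, kT=1.0)
    p = evolve([1.0, 0.0, 0.0, 0.0, 0.0], T, 2000)
    print(p)
    print(boltzmann(E, 1.0))
```

Starting from a distribution concentrated on one state, the evolved $P_t$ agrees with $e^{-E/kT}/Z$ to rounding error, and each pair of neighbors satisfies detailed balance.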

Example 14.2 ($Z_2$) First, one replaces space-time with a lattice of points in $d$ dimensions. Two nearest neighbor points are separated by the lattice spacing $a$ and joined by a link. Next, one puts an element $U$ of the gauge group on each link. For the $Z_2$ gauge group (example 10.4), one assigns an action $S_\square$ to each elementary square or plaquette of the lattice with vertices 1, 2, 3, and 4

$$S_\square = 1 - U_{1,2}\,U_{2,3}\,U_{3,4}\,U_{4,1}. \tag{14.13}$$
Then, one replaces $E(x)/kT$ with $\beta S$ in which the action $S$ is a sum of all the plaquette actions $S_p$. More details are available at Michael Creutz's website (latticeguy.net/lattice.html).

Although the generation of configurations distributed according to the Boltzmann probability distribution (1.345) is one of its most useful applications, the Monte Carlo method is much more general. It can generate configurations $x$ distributed according to any probability distribution $P(x)$. To generate configurations distributed according to $P(x)$, we accept any new configuration $x'$ if $P(x') \ge P(x)$ and also accept $x'$ with probability
$$P(x \to x') = P(x')/P(x) \tag{14.14}$$
if $P(x) > P(x')$.

This works for the same reason that the Boltzmann version works. Consider two configurations $x$ and $x'$. If the system is thermalized, then the probabilities $P_t(x)$ and $P_t(x')$ have reached equilibrium, and so the rate $R(x \to x')$ from $x \to x'$ must equal that $R(x' \to x)$ from $x' \to x$. If $P(x') < P(x)$, then $R(x' \to x)$ is

$$R(x' \to x) = v\,P_t(x') \tag{14.15}$$
in which $v$ is the rate of choosing $\delta x = x' - x$, while the rate $R(x \to x')$ is

$$R(x \to x') = v\,P_t(x)\,P(x')/P(x) \tag{14.16}$$
with the same $v$ since the random walk is symmetric. Equating the two rates
$$R(x' \to x) = R(x \to x') \tag{14.17}$$
we find that the flow of probability stops when

$$P_t(x) = P(x)\,P_t(x')/P(x') = c\,P(x) \tag{14.18}$$

where $c$ is independent of $x'$. Thus $P_t(x) \to P(x)$.

So far we have assumed that the rate of choosing $x \to x'$ is the same as the rate of choosing $x' \to x$. In Smart Monte Carlo schemes, physicists arrange the rates $v_{x\to x'}$ and $v_{x'\to x}$ so as to steer the flow and speed up thermalization. To compensate for this asymmetry, they change the second part of the Metropolis step from $x \to x'$ when $E' = E(x') > E = E(x)$ to accept conditionally with probability

$$P(x \to x') = P(x')\,v_{x'\to x}/\left[P(x)\,v_{x\to x'}\right]. \tag{14.19}$$

Now if $P(x') < P(x)$, then $R(x' \to x)$ is

$$R(x' \to x) = v_{x'\to x}\,P_t(x') \tag{14.20}$$
while the rate $R(x \to x')$ is

$$R(x \to x') = v_{x\to x'}\,P_t(x)\,P(x')\,v_{x'\to x}/\left[P(x)\,v_{x\to x'}\right]. \tag{14.21}$$

Equating the two rates $R(x' \to x) = R(x \to x')$, we find

$$P_t(x') = P_t(x)\,P(x')/P(x). \tag{14.22}$$

That is, $P_t(x) = P(x)\,P_t(x')/P(x')$, which gives

$$P_t(x) = N\,P(x) \tag{14.23}$$
where $N$ is a constant of normalization.
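The compensation for asymmetric proposal rates can be tested on a toy model. The sketch below assumes a ring of states on which a step to the right is proposed with probability 0.7 and a step to the left with probability 0.3, and it accepts a proposed move with probability $\min\{1,\ P(x')\,v_{x'\to x}/[P(x)\,v_{x\to x'}]\}$. Taking the minimum with 1 is the usual Metropolis-Hastings form and is an assumption here, since the ratio (14.19) can exceed unity when the rates differ. The target weights and function names are hypothetical.

```python
def biased_metropolis_matrix(target, right=0.7):
    """Transition matrix for a ring walk with asymmetric proposals.

    target : unnormalized weights P(x) on a ring of n >= 3 states
    right  : proposal rate for x -> x+1; the rate for x -> x-1 is 1 - right
    """
    n = len(target)
    left = 1.0 - right
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # (neighbor j, forward rate v_{i->j}, reverse rate v_{j->i})
        moves = (((i + 1) % n, right, left), ((i - 1) % n, left, right))
        for j, v_ij, v_ji in moves:
            ratio = target[j] * v_ji / (target[i] * v_ij)   # as in (14.19)
            T[i][j] += v_ij * min(1.0, ratio)
        T[i][i] = 1.0 - sum(T[i])                            # rejected moves
    return T

def evolve(p, T, steps):
    """Apply p -> p T repeatedly."""
    n = len(p)
    for _ in range(steps):
        p = [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]
    return p

if __name__ == "__main__":
    weights = [1.0, 4.0, 2.0, 0.5, 3.0]      # arbitrary illustrative target
    T = biased_metropolis_matrix(weights)
    p = evolve([0.2] * 5, T, 3000)
    norm = sum(weights)
    print(p)
    print([w / norm for w in weights])
```

Despite the built-in drift to the right, the stationary distribution is the normalized target, which is the content of (14.23).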

14.5 Solving Arbitrary Problems

If you know how to generate a suitably large space of trial solutions to a problem, and you also know how to compare the quality of any two of your solutions, then you can use a Monte Carlo method to solve it. The hard parts of this seemingly magical method are characterizing a big enough space of solutions $s$ and constructing a quality function or functional that assigns a number $Q(s)$ to every solution in such a way that if $s$ is a better solution than $s'$, then
$$Q(s) > Q(s'). \tag{14.24}$$

But once one has characterized the space of possible solutions $s$ and has constructed the quality function $Q(s)$, then one simply generates zillions of random solutions and selects the one that maximizes the function $Q(s)$ over the space of all solutions.

If one can characterize the solutions as vectors of a certain dimension, $s = (x_1, \ldots, x_n)$, then one may use the Monte Carlo method of the previous section (14.4) by replacing $-E(s)$ with $Q(s)$ and $kT$ with a parameter of the same dimension as $Q(s)$, nominally dimensionless.

evolution more ergodic, which is why most complex modern organisms use sexual reproduction. Other genomic changes occur when a virus inserts its DNA into that of a cell and when transposable elements (transposons) of DNA move to different sites in a genome.

In evolution, the rest of the Metropolis step is done by the new individual: if he or she survives and multiplies, then the change is accepted; if he or she dies without progeny, then the change is rejected. Evolution is slow, but it has succeeded in turning a soup of simple molecules into humans with brains of 100 billion neurons, each with 1000 connections to other neurons.

John Holland and others have incorporated analogs of these Metropolis steps into Monte Carlo techniques called genetic algorithms for solving wide classes of problems (Holland, 1975; Vose, 1999; Schmitt, 2001).

Evolution also occurs at the cellular level when a cell mutates enough to escape the control imposed on its proliferation by its neighbors and transforms into a cancer cell.

Further Reading

The classic Quarks, Gluons, and Lattices (Creutz, 1983) is a marvelous introduction to the subject; his website (latticeguy.net/lattice.html) is an extraordinary resource, as is Rubinstein's Simulation and the Monte Carlo Method (Rubinstein and Kroese, 2007).

Exercises

14.1 Go to Michael Creutz's website (latticeguy.net/lattice.html) and get his C code for $Z_2$ lattice gauge theory. Compile and run it, and make a graph that exhibits strong hysteresis as you raise and lower $\beta = 1/kT$.

14.2 Modify his code and produce a graph showing the coexistence of two phases at the critical coupling $\beta_t = 0.5\ln(1+\sqrt{2})$. Hint: Do a cold start and then 100 updates at $\beta_t$, then do a random start and do 100 updates at $\beta_t$. Plot the values of the action against the update number 1, 2, 3, . . . 100.

14.3 Modify Creutz's C code for $Z_2$ lattice gauge theory so as to be able to vary the dimension $d$ of space-time. Show that for $d = 2$, there's no hysteresis loop (there's no phase transition). For $d = 3$, show that any hysteresis loop is minimal (there's a second-order phase transition).

14.4 What happens when $d = 5$?

Path Integrals

Linking three of these matrix elements together and using subscripts instead of primes, we have

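$$\langle q_3|e^{-3\epsilon H}|q_0\rangle = \iint_{-\infty}^{\infty} \langle q_3|e^{-\epsilon H}|q_2\rangle\langle q_2|e^{-\epsilon H}|q_1\rangle\langle q_1|e^{-\epsilon H}|q_0\rangle\,dq_1\,dq_2 \tag{16.23}$$
$$= \left(\frac{m}{2\pi\epsilon}\right)^{3/2}\iint_{-\infty}^{\infty}\exp\left\{-\epsilon\sum_{j=0}^{2}\left[\tfrac{1}{2}m\,\dot q_j^2 + V(q_j)\right]\right\}dq_1\,dq_2.$$
Boldly passing from 3 to $n$ and suppressing some integral signs, we get

$$\langle q_n|e^{-n\epsilon H}|q_0\rangle = \iiint_{-\infty}^{\infty} \langle q_n|e^{-\epsilon H}|q_{n-1}\rangle \cdots \langle q_1|e^{-\epsilon H}|q_0\rangle\,dq_1\cdots dq_{n-1} \tag{16.24}$$
$$= \left(\frac{m}{2\pi\epsilon}\right)^{n/2}\iiint_{-\infty}^{\infty}\exp\left\{-\epsilon\sum_{j=0}^{n-1}\left[\tfrac{1}{2}m\,\dot q_j^2 + V(q_j)\right]\right\}dq_1\cdots dq_{n-1}.$$
Writing $dt$ for $\epsilon$ and taking the limits $\epsilon \to 0$ and $n \equiv \beta/\epsilon \to \infty$, we find that the matrix element $\langle q_\beta|e^{-\beta H}|q_0\rangle$ is a path integral of the exponential of the average energy multiplied by $-\beta$

$$\langle q_\beta|e^{-\beta H}|q_0\rangle = \int \exp\left\{-\int_0^\beta\left[\tfrac{1}{2}m\,\dot q^2(t) + V(q(t))\right]dt\right\}Dq \tag{16.25}$$

in which $Dq \equiv (n\,m/2\pi\beta)^{n/2}\,dq_1\,dq_2\cdots dq_{n-1}$ as $n \to \infty$. We sum over all paths $q(t)$ that go from $q(0) = q_0$ at inverse temperature $\beta = 0$ to $q(\beta) = q_\beta$ at inverse temperature $\beta$. In the limit $\beta \to \infty$, the operator $\exp(-\beta H)$ becomes proportional to a projection operator (16.1) on the ground state of the theory.

In three-dimensional space, $\mathbf{q}(t)$ replaces $q(t)$ in equation (16.25)

$$\langle \mathbf{q}_\beta|e^{-\beta H}|\mathbf{q}_0\rangle = \int \exp\left\{-\int_0^\beta\left[\tfrac{1}{2}m\,\dot{\mathbf{q}}^2 + V(\mathbf{q})\right]dt\right\}D\mathbf{q}. \tag{16.26}$$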
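For a free particle the three-step formula (16.23) can be verified numerically before any limit is taken. The sketch below makes several assumptions that are not spelled out in the lines above: one space dimension, units with $\hbar = 1$, $V = 0$, and the discrete velocity $\dot q_j = (q_{j+1} - q_j)/\epsilon$. It does the double integral over $q_1$ and $q_2$ with a midpoint rule and compares it with the single gaussian kernel for the total imaginary time $3\epsilon$. The function names and grid parameters are illustrative.

```python
import math

def three_step_amplitude(q0, q3, m, eps, half_width=8.0, steps=400):
    """Right-hand side of (16.23) for V = 0, integrated over q1 and q2.

    With qdot_j = (q_{j+1} - q_j)/eps the exponent is
    -eps * sum_j (1/2) m qdot_j^2 = -(m / 2 eps) * sum_j (q_{j+1} - q_j)^2.
    """
    h = 2 * half_width / steps
    grid = [-half_width + (i + 0.5) * h for i in range(steps)]
    prefactor = (m / (2 * math.pi * eps)) ** 1.5
    total = 0.0
    for q1 in grid:
        first = (q1 - q0) ** 2
        for q2 in grid:
            s = first + (q2 - q1) ** 2 + (q3 - q2) ** 2
            total += math.exp(-0.5 * m * s / eps)
    return prefactor * total * h * h

def free_kernel(q0, q, m, beta):
    """Free-particle matrix element <q| exp(-beta p^2/2m) |q0> with hbar = 1."""
    return math.sqrt(m / (2 * math.pi * beta)) * math.exp(-m * (q - q0) ** 2 / (2 * beta))

if __name__ == "__main__":
    eps = 0.5
    print(three_step_amplitude(0.0, 1.0, 1.0, eps))
    print(free_kernel(0.0, 1.0, 1.0, 3 * eps))
```

The agreement reflects the fact that gaussian kernels compose exactly, so for $V = 0$ the discretized expression is already exact at finite $\epsilon$.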

Path integrals in imaginary time are called euclidean mainly to distinguish them from Minkowski path integrals, which represent matrix elements of the time-evolution operator $\exp(-itH)$ in real time.

a complete set of momentum dyadics $|p\rangle\langle p|$ and doing the resulting Fourier transform.

Example 16.1 (The Bohm-Aharonov Effect) From our formula (11.311) for the action of a relativistic particle of mass $m$ and charge $q$, we infer (exercise 16.7) that the action of a nonrelativistic particle in an electromagnetic field with no scalar potential is

$$S = \int_{x_1}^{x_2}\left(\tfrac{1}{2}\,m\,\mathbf{v} + q\,\mathbf{A}\right)\cdot d\mathbf{x}. \tag{16.55}$$
Now imagine that we shoot a beam of such particles past but not through a narrow cylinder in which a magnetic field is confined. The particles can go either way around the cylinder of area $S$ but cannot enter the region of the magnetic field. The difference in the phases of the amplitudes is the loop integral from the source to the detector and back to the source
$$\frac{\Delta S}{\hbar} = \oint\left(\frac{m\mathbf{v}}{2} + q\,\mathbf{A}\right)\cdot\frac{d\mathbf{x}}{\hbar} = \oint\frac{m\mathbf{v}\cdot d\mathbf{x}}{2\hbar} + \frac{q}{\hbar}\int_S \mathbf{B}\cdot d\mathbf{S} = \oint\frac{m\mathbf{v}\cdot d\mathbf{x}}{2\hbar} + \frac{q\Phi}{\hbar} \tag{16.56}$$
in which $\Phi$ is the magnetic flux through the cylinder.

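16.6 Free Particle in Imaginary Time

If we mimic the steps of the preceding section (16.5) in which the hamiltonian is $H = \mathbf{p}^2/2m$, set $\beta = it/\hbar = 1/kT$, and use Dirac's delta function

$$\delta^3(\mathbf{q}) = \lim_{t\to 0}\left(\frac{m}{2\pi\hbar t}\right)^{3/2} e^{-m\mathbf{q}^2/2\hbar t} \tag{16.57}$$
then we get

$$\langle \mathbf{q}|e^{-\beta H}|0\rangle = \left(\frac{m}{2\pi\hbar^2\beta}\right)^{3/2}\exp\left(-\frac{m\,\mathbf{q}^2}{2\hbar^2\beta}\right) = \left(\frac{mkT}{2\pi\hbar^2}\right)^{3/2} e^{-mkT\,\mathbf{q}^2/2\hbar^2}. \tag{16.58}$$
To study the ground state of the system, we set $\beta = t/\hbar$ and let $t \to \infty$ in
$$\langle \mathbf{q}|e^{-tH/\hbar}|0\rangle = \left(\frac{m}{2\pi\hbar t}\right)^{3/2}\exp\left(-\frac{m\,\mathbf{q}^2}{2\hbar t}\right) \tag{16.59}$$
which for $D = \hbar/(2m)$ is the solution (3.200 & 13.107) of the diffusion equation.

16.13 Application to Quantum Electrodynamics

and the functional delta function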

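$$\delta[\nabla\cdot\mathbf{A}] = \prod_x \delta(\nabla\cdot\mathbf{A}(x)) \tag{16.163}$$
enforces the Coulomb-gauge condition. The term $L_m$ is the action density of the matter field $\psi$.

Tricks are available. We introduce a new field $A^0(x)$ and consider the factor

$$F = \int \exp\left[\,i\int \tfrac{1}{2}\left(\nabla A^0 + \nabla\triangle^{-1}j^0\right)^2 d^4x\right] DA^0 \tag{16.164}$$
which is just a number independent of the charge density $j^0$ since we can cancel the $j^0$ term by shifting $A^0$. By $\triangle^{-1}$, we mean $-1/4\pi|\mathbf{x}-\mathbf{y}|$. By integrating by parts, we can write the number $F$ as (exercise 16.21)

$$F = \int \exp\left[\,i\int\left(\tfrac{1}{2}\left(\nabla A^0\right)^2 - A^0 j^0 - \tfrac{1}{2}\,j^0\triangle^{-1}j^0\right) d^4x\right] DA^0$$
$$= \int \exp\left[\,i\int\left(\tfrac{1}{2}\left(\nabla A^0\right)^2 - A^0 j^0\right) d^4x + i\int V_C\,dt\right] DA^0. \tag{16.165}$$
So when we multiply the numerator and denominator of the amplitude (16.161) by $F$, the awkward Coulomb term cancels, and we get

$$\langle\Omega|T\left[O_1\ldots O_n\right]|\Omega\rangle = \frac{\displaystyle\int O_1\ldots O_n\, e^{iS'}\,\delta[\nabla\cdot\mathbf{A}]\,DA\,D\psi}{\displaystyle\int e^{iS'}\,\delta[\nabla\cdot\mathbf{A}]\,DA\,D\psi} \tag{16.166}$$
where now $DA$ includes all four components $A^\mu$ and

$$S' = \int\left[\tfrac{1}{2}\dot{\mathbf{A}}^2 - \tfrac{1}{2}(\nabla\times\mathbf{A})^2 + \tfrac{1}{2}\left(\nabla A^0\right)^2 + \mathbf{A}\cdot\mathbf{j} - A^0 j^0 + L_m\right] d^4x. \tag{16.167}$$
Since the delta-function $\delta[\nabla\cdot\mathbf{A}]$ enforces the Coulomb-gauge condition, we can add to the action $S'$ the term $(\nabla\cdot\dot{\mathbf{A}})\,A^0$ which is $-\dot{\mathbf{A}}\cdot\nabla A^0$ after we integrate by parts and drop the surface term. This extra term makes the action gauge invariant

$$S = \int\left[\tfrac{1}{2}\left(\dot{\mathbf{A}} - \nabla A^0\right)^2 - \tfrac{1}{2}(\nabla\times\mathbf{A})^2 + \mathbf{A}\cdot\mathbf{j} - A^0 j^0 + L_m\right] d^4x$$
$$= \int\left[-\tfrac{1}{4}F_{ab}F^{ab} + A^b j_b + L_m\right] d^4x. \tag{16.168}$$

which contributes the terms $-\partial_\mu\omega_a^*\,\partial^\mu\omega_a$ and
$$-\partial_\mu\omega_a^*\,g f_{abc}\,A_b^\mu\,\omega_c = \partial_\mu\omega_a^*\,g f_{abc}\,A_c^\mu\,\omega_b \tag{16.254}$$
to the action density. Thus we can do perturbation theory by using the modified action density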

$$L' = -\tfrac{1}{4}F_{a\mu\nu}F_a^{\mu\nu} - \tfrac{1}{2}\left(\partial_\mu A_a^\mu\right)^2 - \partial_\mu\omega_a^*\,\partial^\mu\omega_a + \partial_\mu\omega_a^*\,g f_{abc}\,A_c^\mu\,\omega_b - \bar\psi\left(\not{D} + m\right)\psi \tag{16.255}$$
in which $\not{D} \equiv \gamma^\mu D_\mu = \gamma^\mu(\partial_\mu - i g\,t^a A_{a\mu})$. The ghost field $\omega$ is a mathematical device, not a physical field describing real particles, which would be spinless fermions violating the spin-statistics theorem (example 10.19).

Further Reading

(Srednicki, 2007), The Quantum Theory of Fields I, II, & III (Weinberg, 1995, 1996, 2005), and Quantum Field Theory in a Nutshell (Zee, 2010) all provide excellent treatments of path integrals.

Exercises

16.1 Derive the multiple gaussian integral (16.8) from (5.167).

16.2 Derive the multiple gaussian integral (16.12) from (5.166).

16.3 Show that the vector $Y$ that makes the argument of the multiple gaussian integral (16.12) stationary is given by (16.13), and that the multiple gaussian integral (16.12) is equal to its exponential evaluated at its stationary point $Y$ apart from a prefactor involving $\det iS$.

16.4 Repeat the previous exercise for the multiple gaussian integral (16.11).

16.5 Compute the double integral (16.23) for the case $V(q_j) = 0$.

16.6 Insert into the LHS of (16.54) a complete set of momentum dyadics $|p\rangle\langle p|$, use the inner product $\langle q|p\rangle = \exp(iqp/\hbar)/\sqrt{2\pi\hbar}$, do the resulting Fourier transform, and so verify the free-particle path integral (16.54).

16.7 By taking the nonrelativistic limit of the formula (11.311) for the action of a relativistic particle of mass $m$ and charge $q$, derive the expression (16.55) for the action of a nonrelativistic particle in an electromagnetic field with no scalar potential.

16.8 Show that for the hamiltonian (16.60) of the simple harmonic oscillator the action $S[q_c]$ of the classical path is (16.67).

16.9 Show that the harmonic-oscillator action of the loop (16.68) is (16.69).

16.10 Show that the harmonic-oscillator amplitude (16.72) for $q' = 0$ and $q'' = q$ reduces as $t \to 0$ to the one-dimensional version of the free-particle amplitude (16.54).

16.11 Work out the path-integral formula for the amplitude for a mass $m$ initially at rest to fall to the ground from height $h$ in a gravitational field of local acceleration $g$ to lowest order and then including loops up to an overall constant. Hint: use the technique of section 16.7.

16.12 Show that the action (16.74) of the stationary solution (16.77) is (16.79).

16.13 Derive formula (16.132) for the action $S_0[\phi]$ from (16.130 & 16.131).

16.14 Derive identity (16.136). Split the time integral at $t = 0$ into two halves, use
$$\epsilon\, e^{\pm\epsilon t} = \pm\frac{d}{dt}\,e^{\pm\epsilon t} \tag{16.256}$$
and then integrate each half by parts.

16.15 Derive the third term in equation (16.138) from the second term.

16.16 Use (16.143) and the Fourier transform (16.144) of the external current $j$ to derive the formula (16.145) for the modified action $S_0[\phi,\epsilon,j]$.

16.17 Derive equation (16.147) from equations (16.145) and (16.146).

16.18 Derive the formula (16.148) for $Z_0[j]$ from the expression (16.147) for $S_0[\phi,\epsilon,j]$.

16.19 Derive equations (16.149 & 16.150) from formula (16.148).

16.20 Derive equation (16.154) from the formula (16.149) for $Z_0[j]$.

16.21 Show that the time integral of the Coulomb term (16.159) is the term that is quadratic in $j^0$ in the number $F$ defined by (16.164).

16.22 By following steps analogous to those that led to (16.150), derive the formula (16.177) for the photon propagator in Feynman's gauge.

16.23 Derive expression (16.192) for the inner product $\langle\zeta|\theta\rangle$.

16.24 Derive the representation (16.195) of the identity operator $I$ for a single fermionic degree of freedom from the rules (16.182 & 16.185) for Grassmann integration and the anticommutation relations (16.178 & 16.184).

16.25 Derive the eigenvalue equation (16.200) from the definition (16.198 & 16.199) of the eigenstate $|\theta\rangle$ and the anticommutation relations (16.196 & 16.197).

16.26 Derive the eigenvalue relation (16.213) for the Fermi field $\psi_m(\mathbf{x},t)$ from the anticommutation relations (16.209 & 16.210) and the definitions (16.211 & 16.212).

16.27 Derive the formula (16.214) for the inner product $\langle\chi'|\chi\rangle$ from the definition (16.212) of the ket $|\chi\rangle$.

17 The Renormalization Group

17.1 The Renormalization Group in Quantum Field Theory

Most quantum field theories are non-linear with infinitely many degrees of freedom, and because they describe point particles, they are rife with infinities. But short-distance effects, probably the finite sizes of the fundamental constituents of matter, mitigate these infinities so that we can cope with them consistently without knowing what happens at very short distances and very high energies. This procedure is called renormalization.

For instance, in the theory described by the Lagrange density

$$L = -\tfrac{1}{2}\,\partial_\nu\phi\,\partial^\nu\phi - \tfrac{1}{2}\,m^2\phi^2 - \frac{g}{24}\,\phi^4 \tag{17.1}$$
we can cut off divergent integrals at some high energy $\Lambda$. The amplitude for the elastic scattering of two bosons of initial four-momenta $p_1$ and $p_2$ into two of final momenta $p'_1$ and $p'_2$ to one-loop order (Weinberg, 1996, chap. 18) then is proportional to (Zee, 2010, chaps. III & VI)
$$A = g - \frac{g^2}{32\pi^2}\left[\ln\left(\frac{\Lambda^6}{stu}\right) - i\pi + 3\right] \tag{17.2}$$
as long as the absolute values of the Mandelstam variables $s = -(p_1+p_2)^2$, $t = -(p_1-p'_1)^2$, and $u = -(p_1-p'_2)^2$, which satisfy $stu > 0$ and $s + t + u = 4m^2$, are all much larger than $m^2$ (Stanley Mandelstam, 1928–).

We define the physical coupling constant $g_\mu$, as opposed to the bare one $g$ that comes with $L$, to be the real part of the amplitude $A$ at $s = -t = -u = \mu^2$
$$g_\mu = g - \frac{3g^2}{32\pi^2}\left[\ln\left(\frac{\Lambda^2}{\mu^2}\right) + 1\right]. \tag{17.3}$$
Thus the bare coupling constant is $g = g_\mu + 3g^2\left[\ln(\Lambda^2/\mu^2) + 1\right]/32\pi^2$, and using this formula, we can write our expression (17.2) for the amplitude $A$ in a form in which the cutoff $\Lambda$ no longer appears
$$A = g_\mu - \frac{g^2}{32\pi^2}\left[\ln\left(\frac{\mu^6}{stu}\right) - i\pi\right]. \tag{17.4}$$
This is the magic of renormalization.

The physical coupling "constant" $g_\mu$ is the right coupling at energy $\mu$ because when all the Mandelstam variables are near the renormalization point $stu = \mu^6$, the one-loop correction is tiny, and $A \approx g_\mu$.

How does the physical coupling $g_\mu$ depend upon the energy $\mu$? The amplitude $A$ must be independent of the renormalization energy $\mu$, and so
$$\frac{dA}{d\mu} = \frac{dg_\mu}{d\mu} - \frac{g^2}{32\pi^2}\,\frac{6}{\mu} = 0 \tag{17.5}$$
which is a version of the Callan-Symanzik equation.

We assume that when the cutoff $\Lambda$ is big but finite, the bare and running coupling constants $g$ and $g_\mu$ are so tiny that they differ by terms of order $g^2$ or $g_\mu^2$. Then to lowest order in $g$ and $g_\mu$, we can replace $g^2$ by $g_\mu^2$ in (17.5) and arrive at the simple differential equation

$$\mu\,\frac{dg_\mu}{d\mu} \equiv \beta(g_\mu) = \frac{3\,g_\mu^2}{16\pi^2} \tag{17.6}$$
which we can integrate

$$\ln\frac{E}{M} = \int_M^E\frac{d\mu}{\mu} = \int_{g_M}^{g_E}\frac{dg_\mu}{\beta(g_\mu)} = \frac{16\pi^2}{3}\int_{g_M}^{g_E}\frac{dg_\mu}{g_\mu^2} = \frac{16\pi^2}{3}\left(\frac{1}{g_M} - \frac{1}{g_E}\right) \tag{17.7}$$
to find the running physical coupling constant $g_\mu$ at energy $\mu = E$
$$g_E = \frac{g_M}{1 - 3\,g_M\ln(E/M)/16\pi^2}. \tag{17.8}$$
As the energy $E = \sqrt{s}$ rises above $M$, while staying below the singular value $E = M\exp(16\pi^2/3g_M)$, the running coupling $g_E$ slowly increases. And so does the scattering amplitude, $A \approx g_E$.

Example 17.1 (Quantum Electrodynamics) Vacuum polarization makes the amplitude for the scattering of two electrons proportional to (Weinberg, 1995, chap. 11)
$$A(q^2) = e^2\left[1 + \pi(q^2)\right] \tag{17.9}$$
rather than to $e^2$. Here $e$ is the renormalized charge, $q = p'_1 - p_1$ is the four-momentum transferred to the first electron, and
$$\pi(q^2) = \frac{e^2}{2\pi^2}\int_0^1 x(1-x)\,\ln\left[1 + \frac{q^2 x(1-x)}{m^2}\right]dx \tag{17.10}$$
represents the polarization of the vacuum. We define the square of the running coupling constant $e_\mu^2$ to be the amplitude (17.9) at $q^2 = \mu^2$
$$e_\mu^2 = A(\mu^2) = e^2\left[1 + \pi(\mu^2)\right]. \tag{17.11}$$
For $\mu^2 \gg m^2$, the vacuum polarization term $\pi(\mu^2)$ is (exercise 17.1)
$$\pi(\mu^2) \approx \frac{e^2}{6\pi^2}\left[\ln\frac{\mu}{m} - \frac{5}{6}\right]. \tag{17.12}$$
The amplitude (17.9) then is
$$A(q^2) = e_\mu^2\,\frac{1 + \pi(q^2)}{1 + \pi(\mu^2)} \tag{17.13}$$
and since it must be independent of $\mu$, we have
$$0 = \frac{d}{d\mu}\,\frac{A(q^2)}{1 + \pi(q^2)} = \frac{d}{d\mu}\,\frac{e_\mu^2}{1 + \pi(\mu^2)} \approx \frac{d}{d\mu}\left\{e_\mu^2\left[1 - \pi(\mu^2)\right]\right\}. \tag{17.14}$$
So we find
$$0 = 2e_\mu\frac{de_\mu}{d\mu}\left[1 - \pi(\mu^2)\right] - e_\mu^2\,\frac{d\pi(\mu^2)}{d\mu} = 2e_\mu\frac{de_\mu}{d\mu}\left[1 - \pi(\mu^2)\right] - e_\mu^2\,\frac{e^2}{6\pi^2\mu}. \tag{17.15}$$
Thus since by (17.10 & 17.11) $\pi(\mu^2) = O(e^2)$ and $e_\mu^2 = e^2 + O(e^4)$, we find to lowest order in $e_\mu$
$$\mu\,\frac{de_\mu}{d\mu} \equiv \beta(e_\mu) = \frac{e_\mu^3}{12\pi^2}. \tag{17.16}$$
We can integrate this differential equation
$$\ln\frac{E}{M} = \int_M^E\frac{d\mu}{\mu} = \int_{e_M}^{e_E}\frac{de_\mu}{\beta(e_\mu)} = 12\pi^2\int_{e_M}^{e_E}\frac{de_\mu}{e_\mu^3} = 6\pi^2\left(\frac{1}{e_M^2} - \frac{1}{e_E^2}\right) \tag{17.17}$$
and so get for the running coupling constant the formula
$$e_E^2 = \frac{e_M^2}{1 - e_M^2\ln(E/M)/6\pi^2} \tag{17.18}$$
which shows that it slowly increases with the energy $E$. Thus, the fine-structure constant $e_\mu^2/4\pi$ rises from $\alpha = 1/137.036$ at $m_e$ to
$$\frac{e^2(45.5\ \mathrm{GeV})}{4\pi} = \frac{\alpha}{1 - 2\alpha\ln(45.5/0.00051)/3\pi} = \frac{1}{134.6}. \tag{17.19}$$

where for $n_f$ flavors of light quarks
$$\beta_0 = \frac{1}{(4\pi)^2}\left(\frac{11}{3}N - \frac{2}{3}n_f\right)$$
$$\beta_1 = \frac{1}{(4\pi)^4}\left(\frac{34}{3}N^2 - \frac{10}{3}N n_f - \frac{N^2-1}{N}\,n_f\right). \tag{17.33}$$
In QCD, $N = 3$. Combining the definition (17.31) of the $\beta$-function with its expansion (17.32) for small $g$, one arrives at the differential equation
$$\frac{dg}{d\ln a} = \beta_0\,g^3 + \beta_1\,g^5 \tag{17.34}$$
which one may integrate
$$\int d\ln a = \ln a - \ln c = \int\frac{dg}{\beta_0 g^3 + \beta_1 g^5} = -\frac{1}{2\beta_0 g^2} + \frac{\beta_1}{2\beta_0^2}\ln\left(\frac{\beta_0 + \beta_1 g^2}{g^2}\right) \tag{17.35}$$
to find
$$a(g) = c\left(\frac{\beta_0 + \beta_1 g^2}{g^2}\right)^{\beta_1/2\beta_0^2} e^{-1/2\beta_0 g^2} \tag{17.36}$$
in which $c$ is a constant of integration. The term $\beta_1 g^2$ is of higher order in $g$, and if one drops it and absorbs a factor of $\beta_0^2$ into a new constant of integration $\Lambda$, then one gets

$$a(g) = \frac{1}{\Lambda}\left(\beta_0\,g^2\right)^{-\beta_1/2\beta_0^2} e^{-1/2\beta_0 g^2}. \tag{17.37}$$
As $g \to 0$, the lattice spacing $a(g)$ goes to zero very fast (as long as $n_f < 17$ for $N = 3$). The inverse of this relation (17.37) is

$$g(a) \approx \left[\beta_0\ln\left(a^{-2}\Lambda^{-2}\right) + (\beta_1/\beta_0)\ln\left(\ln\left(a^{-2}\Lambda^{-2}\right)\right)\right]^{-1/2}. \tag{17.38}$$
It shows that the coupling constant slowly goes to zero with $a$, which is a lattice version of asymptotic freedom.
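The integration leading from (17.34) to (17.36) can be confirmed by differentiating the result numerically. The sketch below is a check under stated choices rather than anything from the text: it takes $N = 3$ with $n_f = 3$ light flavors and $c = 1$ (both arbitrary), builds $\ln a(g)$ from (17.35), and compares a centered finite difference of $d\ln a/dg$ with $1/(\beta_0 g^3 + \beta_1 g^5)$. The function names are hypothetical.

```python
import math

def beta_coefficients(N=3, nf=3):
    """One- and two-loop coefficients beta_0 and beta_1 of (17.33)."""
    b0 = (11.0 * N / 3 - 2.0 * nf / 3) / (4 * math.pi) ** 2
    b1 = (34.0 * N ** 2 / 3 - 10.0 * N * nf / 3
          - (N ** 2 - 1) * nf / N) / (4 * math.pi) ** 4
    return b0, b1

def log_lattice_spacing(g, N=3, nf=3, c=1.0):
    """ln a(g) from (17.35)-(17.36)."""
    b0, b1 = beta_coefficients(N, nf)
    return (math.log(c) - 1.0 / (2 * b0 * g ** 2)
            + b1 / (2 * b0 ** 2) * math.log((b0 + b1 * g ** 2) / g ** 2))

def inverse_beta(g, N=3, nf=3):
    """d ln a / dg implied by (17.34)."""
    b0, b1 = beta_coefficients(N, nf)
    return 1.0 / (b0 * g ** 3 + b1 * g ** 5)

def numerical_slope(g, h=1e-5):
    """Centered finite difference of ln a(g)."""
    return (log_lattice_spacing(g + h) - log_lattice_spacing(g - h)) / (2 * h)

if __name__ == "__main__":
    for g in (0.5, 1.0, 2.0):
        print(g, numerical_slope(g), inverse_beta(g))
```

The sign of $\beta_0$ also shows where the bound on $n_f$ comes from: for $N = 3$ it is positive for $n_f = 16$ and negative for $n_f = 17$.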

17.3 The Renormalization Group in Condensed-Matter Physics

The study of condensed matter is concerned mainly with properties that emerge in the bulk, such as the melting point, the boiling point, or the conductivity. So we want to see what happens to the physics when we increase the distance scale many orders of magnitude beyond the size $a$ of an individual molecule or the distance between nearest neighbors.

The full action of a stretched field is
$$S(\phi_L) = \int d^dx\left[\tfrac{1}{2}(\partial\phi)^2 + \sum_n g_{d,n}(L)\,\phi^n\right] \tag{17.48}$$
in which
$$g_{d,n}(L) = L^d A^n(L)\,g_n = L^{d-n(d-2)/2}\,g_{d,n}. \tag{17.49}$$
The beta-function

$$\beta(g_{d,n}) \equiv \frac{L}{g_{d,n}(L)}\,\frac{dg_{d,n}(L)}{dL} = d - n(d-2)/2 \tag{17.50}$$
is just the exponent of the coupling "constant" $g_{d,n}(L)$. If it is positive, then the coupling constant $g_{d,n}(L)$ gets stronger as $L \to \infty$; such couplings are called relevant. Couplings with vanishing exponents are insensitive to changes in $L$ and are marginal. Those with negative exponents shrink with increasing $L$; they are irrelevant.

The coupling constant $g_{d,n,p}$ of a term with $p$ derivatives and $n$ powers of the field $\phi$ in a space of $d$ dimensions varies as

$$g_{d,n,p}(L) = L^d A^n(L)\,L^{-p}\,g_{n,p} = L^{d-n(d-2)/2-p}\,g_{d,n,p}. \tag{17.51}$$

Example 17.3 (QCD) In quantum chromodynamics, there is a cubic term $g f_{abc}\,A_0^a A_i^b\,\partial_0 A_i^c$ which in effect looks like $g f_{abc}\,\phi_a\phi_b\dot\phi_c$. Is it relevant? Well, if we stretch space but not time, then the time derivative has no effect, and $d = 3$. So the cubic, $n = 3$, grows as $L^{3/2}$

$$g_{3,3,0}(L) = L^{d-n(d-2)/2}\,g_{3,3,0} = L^{3/2}\,g_{3,3,0}. \tag{17.52}$$
Since this cubic term drives asymptotic freedom, its strengthening as space is stretched by the dimensionless factor $L$ may point to a qualitative explanation of confinement. For if $g_{3,3,0}(L)$ grows with distance as $L^{3/2}$, then $\alpha_s(L) = g_{3,3,0}^2(L)/4\pi$ grows as $L^3$, and so the strength $\alpha_s(Lr)/(Lr)^2$ of the force between two quarks separated by a distance $Lr$ grows linearly with $L$
$$F(Lr) = \frac{\alpha_s(Lr)}{(Lr)^2} = \frac{L^3\,\alpha_s(r)}{(Lr)^2} = L\,\frac{\alpha_s(r)}{r^2} \tag{17.53}$$
which may be enough for quark confinement. On the other hand, if we stretch both space and time, then the cubic $g_{4,3,1}(L)$ and quartic $g_{4,4,0}(L)$ couplings are marginal.
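The power counting of (17.51) is easy to tabulate. The short sketch below computes the exponent $d - n(d-2)/2 - p$ exactly with rational arithmetic and labels a coupling relevant, marginal, or irrelevant according to its sign; the function names are illustrative and not from the text.

```python
from fractions import Fraction

def scaling_exponent(d, n, p=0):
    """Exponent of L in g_{d,n,p}(L), equation (17.51): d - n(d-2)/2 - p."""
    return Fraction(d) - Fraction(n * (d - 2), 2) - p

def classify(d, n, p=0):
    """Relevant if the coupling grows with L, marginal if it does not change,
    irrelevant if it shrinks."""
    exponent = scaling_exponent(d, n, p)
    if exponent > 0:
        return "relevant"
    if exponent == 0:
        return "marginal"
    return "irrelevant"

if __name__ == "__main__":
    print(scaling_exponent(3, 3, 0), classify(3, 3, 0))  # cubic term, space only
    print(scaling_exponent(4, 3, 1), classify(4, 3, 1))  # cubic term, space and time
    print(scaling_exponent(4, 4, 0), classify(4, 4, 0))  # quartic term
    print(scaling_exponent(4, 6, 0), classify(4, 6, 0))  # a phi^6 term in d = 4
```

The first three cases reproduce the statements of example 17.3; the last shows that a $\phi^6$ coupling in four dimensions shrinks as $L^{-2}$.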