
Contents

1 Things I forget to show the students
  1.1 What orthogonal matrices are not
  1.2 Why you should be careful with your complex number manipulations
  1.3 Why the difference between conditional- and absolute-convergence is important
  1.4 Why you should be careful when changing the order of integration
  1.5 Second derivatives need not commute (trivial example)
  1.6 Second derivatives need not commute (second sort of example)
  1.7 Second derivatives need not commute (totally different sort of example)
  1.8 Why you should not extrapolate based on the first few terms of a sequence
  1.9 Why you should be careful with Lagrange Multipliers
  1.10 Cauchy Riemann Equations and Wirtinger Derivatives
    1.10.1 Wirtinger derivatives
    1.10.2 Extension to fields
    1.10.3 Cauchy Riemann
  1.11 L'Hopital's rule
  1.12 SU(3) multiplet multiplicities
  1.13 Playing the Cello (in equal temperament)
  1.14 Another way of doing one of the sums from the Fourier part of the course

1 Things I forget to show the students

Dr C. G. Lester, Peterhouse 2016

1.1 What orthogonal matrices are not

The matrix
\[
M = \begin{pmatrix} \cosh 7 & i \sinh 7 \\ -i \sinh 7 & \cosh 7 \end{pmatrix} \tag{1}
\]
is not orthogonal, even though it satisfies M M^T = M^T M = 1. It is, however, a Lester Matrix. Lester Matrices are those matrices M which satisfy M M^T = M^T M = 1. Orthogonal Matrices are required to be real, whereas Lester Matrices are not. Orthogonal Matrices are therefore the real subset of the Lester Matrices.
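For the sceptical, a quick numpy check (a sketch) that this complex M really does satisfy M M^T = 1; the final line is an extra observation that M is not unitary either:

    import numpy as np

    # The Lester Matrix of equation (1).
    M = np.array([[np.cosh(7), 1j * np.sinh(7)],
                  [-1j * np.sinh(7), np.cosh(7)]])

    print(np.allclose(M @ M.T, np.eye(2)))          # True: M M^T = 1 despite M being complex
    print(np.allclose(M @ M.conj().T, np.eye(2)))   # False: M is not unitary either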

1.2 Why you should be careful with your complex number manipulations

Complex numbers are well founded, and can always be manipulated safely. However, it is easy to be sloppy when manipulating them. For example, there is an

error in the following argument, but it is hard to spot:
\[
\begin{aligned}
1 &= \sqrt{1} && \text{hopefully this is obvious} \\
  &= \sqrt{(-1)(-1)} && \text{since } 1 = (-1)(-1) \\
  &= \sqrt{-1}\,\sqrt{-1} && \text{since } (ab)^c = a^c b^c \\
  &= i \cdot i && \text{by definition of } i \\
  &= i^2 && \text{since } a\,a = a^2 \\
  &= -1 && \text{since } i \text{ is the square root of } -1
\end{aligned} \tag{2}
\]

Here is a variation on the same bad argument which avoids use of any explicit "i"s:
\[
\begin{aligned}
1 &= \sqrt{1} && \text{hopefully this is obvious} \\
  &= \sqrt{(-1)(-1)} && \text{since } 1 = (-1)(-1) \\
  &= \sqrt{-1}\,\sqrt{-1} && \text{since } (ab)^c = a^c b^c \\
  &= (\sqrt{-1})^2 && \text{since } a\,a = a^2 \\
  &= ((-1)^{\frac{1}{2}})^2 && \text{since } \sqrt{a} = a^{\frac{1}{2}} \\
  &= (-1)^1 && \text{since } (a^b)^c = a^{bc} \\
  &= -1 && \text{since } a^1 = a \text{ for all } a.
\end{aligned} \tag{3}
\]
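A two-line check in Python's cmath of the annotated rule that is to blame (a sketch; look away if you would rather hunt for the culprit yourself):

    import cmath

    print(cmath.sqrt((-1) * (-1)))           # (1+0j)
    print(cmath.sqrt(-1) * cmath.sqrt(-1))   # (-1+0j): (ab)^c = a^c b^c fails on the principal branch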

1.3 Why the difference between conditional- and absolute-convergence is important

Add up all the numbers in the table below and what do you get? It appears to be "0" or "-2" depending on whether you add up the column totals or the row totals! Why?

\[
\begin{array}{rrrrrr|l}
-1 & \tfrac{1}{2} & \tfrac{1}{4} & \tfrac{1}{8} & \tfrac{1}{16} & \cdots & \text{row total } 0 \\
0 & -1 & \tfrac{1}{2} & \tfrac{1}{4} & \tfrac{1}{8} & \cdots & \text{row total } 0 \\
0 & 0 & -1 & \tfrac{1}{2} & \tfrac{1}{4} & \cdots & \text{row total } 0 \\
0 & 0 & 0 & -1 & \tfrac{1}{2} & \cdots & \text{row total } 0 \\
0 & 0 & 0 & 0 & -1 & \cdots & \text{row total } 0 \\
\vdots & & & & & \ddots & \vdots \\
\hline
-1 & -\tfrac{1}{2} & -\tfrac{1}{4} & -\tfrac{1}{8} & -\tfrac{1}{16} & \cdots &
\end{array} \tag{4}
\]
The column totals -1, -1/2, -1/4, -1/8, -1/16, ... sum to "-2", while the row totals sum to "0".
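A Python sketch of the two orders of summation (the truncation depths are my choices; each full row sums to 0 exactly, while column j, counting from 0, sums to -2^(-j)):

    def entry(i, j):                  # row i, column j, both 0-indexed
        if j < i:  return 0.0
        if j == i: return -1.0
        return 0.5 ** (j - i)

    N, TAIL = 30, 200                 # rows/columns kept, and tail length per row
    row_total = lambda i: sum(entry(i, j) for j in range(i, i + TAIL))
    col_total = lambda j: sum(entry(i, j) for i in range(0, j + 1))

    print(sum(row_total(i) for i in range(N)))   # ~ 0
    print(sum(col_total(j) for j in range(N)))   # ~ -2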

1.4 Why you should be careful when changing the order of integration

Define f(x, y) by
\[
f(x, y) = \frac{2(x - y)}{((x - y)^2 + 1)^2} \tag{5}
\]
and then compare
\[
\int_{y=0}^{\infty} \int_{x=0}^{\infty} f(x, y)\,dx\,dy = +\frac{\pi}{2} \tag{6}
\]
with
\[
\int_{x=0}^{\infty} \int_{y=0}^{\infty} f(x, y)\,dy\,dx = -\frac{\pi}{2} \tag{7}
\]

noting that the integrand is well behaved for all values of x and y. For a more interesting example that proves that it's not just the sign that might change when you reverse integration limits, try
\[
h(x, y) = \begin{cases} e^{-(y-x)} & \text{if } y > x \\ -2e^{2(y-x)} & \text{otherwise,} \end{cases} \tag{8}
\]
and compare
\[
\begin{aligned}
\int_{y=0}^{\infty} \int_{x=0}^{\infty} h(x, y)\,dx\,dy
&= \int_{y=0}^{\infty} \left( \int_{x=0}^{y} h(x, y)\,dx + \int_{x=y}^{\infty} h(x, y)\,dx \right) dy && (9) \\
&= \int_{y=0}^{\infty} \left( \int_{x=0}^{y} e^{-(y-x)}\,dx + \int_{x=y}^{\infty} -2e^{2(y-x)}\,dx \right) dy \\
&= \int_{y=0}^{\infty} \left( \Big[ e^{-(y-x)} \Big]_{x=0}^{y} + \Big[ e^{2(y-x)} \Big]_{x=y}^{\infty} \right) dy && (10) \\
&= \int_{y=0}^{\infty} \left( 1 - e^{-y} + (0 - 1) \right) dy && (11) \\
&= \int_{y=0}^{\infty} -e^{-y}\,dy && (12) \\
&= \Big[ e^{-y} \Big]_{y=0}^{\infty} && (13) \\
&= 0 - 1 && (14) \\
&= -1 && (15)
\end{aligned}
\]
with
\[
\begin{aligned}
\int_{x=0}^{\infty} \int_{y=0}^{\infty} h(x, y)\,dy\,dx
&= \int_{x=0}^{\infty} \left( \int_{y=0}^{x} h(x, y)\,dy + \int_{y=x}^{\infty} h(x, y)\,dy \right) dx && (16) \\
&= \int_{x=0}^{\infty} \left( \int_{y=0}^{x} -2e^{2(y-x)}\,dy + \int_{y=x}^{\infty} e^{-(y-x)}\,dy \right) dx \\
&= \int_{x=0}^{\infty} \left( \Big[ -e^{2(y-x)} \Big]_{y=0}^{x} + \Big[ -e^{-(y-x)} \Big]_{y=x}^{\infty} \right) dx && (17) \\
&= \int_{x=0}^{\infty} \left( -1 - (-e^{-2x}) + (0 - (-1)) \right) dx && (18) \\
&= \int_{x=0}^{\infty} e^{-2x}\,dx && (19) \\
&= \left[ -\frac{1}{2} e^{-2x} \right]_{x=0}^{\infty} && (20) \\
&= 0 - \left( -\frac{1}{2} \right) && (21) \\
&= \frac{1}{2}. && (22)
\end{aligned}
\]
Determining the reason for the discrepancy is left as an exercise for the reader. [Hint: consider convergence versus absolute convergence for series.]
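A numerical cross-check of both orders (a sketch using scipy; the quadrature will grumble about the kink along y = x and the infinite domain, but lands near the exact answers):

    from math import exp, inf
    from scipy.integrate import dblquad

    def h(x, y):
        return exp(-(y - x)) if y > x else -2.0 * exp(2.0 * (y - x))

    # dblquad integrates func(inner, outer); the limits given are outer then inner.
    x_first = dblquad(lambda x, y: h(x, y), 0, inf, 0, inf)[0]   # x inner: ~ -1
    y_first = dblquad(lambda y, x: h(x, y), 0, inf, 0, inf)[0]   # y inner: ~ +0.5
    print(x_first, y_first)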

1.5 Second derivatives need not commute (trivial example)

If x and y are expressed in terms of r and θ as follows

\[
\begin{aligned}
x &= r \cos\theta && (23) \\
y &= r \sin\theta && (24)
\end{aligned}
\]
then note that
\[
\frac{\partial^2 y}{\partial x|_y \, \partial\theta|_r} \neq \frac{\partial^2 y}{\partial\theta|_r \, \partial x|_y} \tag{25}
\]

(i.e. note that second derivatives need not commute) since

\[
\begin{aligned}
\frac{\partial^2 y}{\partial x|_y \, \partial\theta|_r}
&\equiv \frac{\partial}{\partial x}\bigg|_y \left( \frac{\partial}{\partial\theta}\bigg|_r (y) \right) && (26) \\
&= \frac{\partial}{\partial x}\bigg|_y \left( \frac{\partial}{\partial\theta}\bigg|_r (r \sin\theta) \right) && (27) \\
&= \frac{\partial}{\partial x}\bigg|_y (r \cos\theta) && (28) \\
&= \frac{\partial}{\partial x}\bigg|_y (x) && (29) \\
&= 1 && (30)
\end{aligned}
\]
while
\[
\begin{aligned}
\frac{\partial^2 y}{\partial\theta|_r \, \partial x|_y}
&\equiv \frac{\partial}{\partial\theta}\bigg|_r \left( \frac{\partial}{\partial x}\bigg|_y (y) \right) && (31) \\
&= \frac{\partial}{\partial\theta}\bigg|_r (0) && (32) \\
&= 0. && (33)
\end{aligned}
\]
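The same bookkeeping transcribed into SymPy (a sketch; the point is that each derivative is taken in whichever coordinate chart matches "the thing being held constant"):

    import sympy as sp

    r, theta, x, y = sp.symbols('r theta x y')

    # dy/dtheta at constant r, computed in the (r, theta) chart:
    step1 = sp.diff(r * sp.sin(theta), theta)             # r*cos(theta), i.e. x
    # Re-express in the (x, y) chart, then differentiate at constant y:
    print(sp.diff(step1.subs(r * sp.cos(theta), x), x))   # 1

    # Other order: dy/dx at constant y is 0, and d(0)/dtheta is still 0:
    print(sp.diff(sp.diff(y, x), theta))                  # 0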

1.6 Second derivatives need not commute (second sort of example)

In this example, we illustrate non-commutation that comes only from 'the thing being held constant'. This time we show that

\[
\frac{\partial^2 x}{\partial r|_\theta \, \partial r|_y} \neq \frac{\partial^2 x}{\partial r|_y \, \partial r|_\theta}.
\]

We prove the above by noting that:

\[
\begin{aligned}
\frac{\partial^2 x}{\partial r|_\theta \, \partial r|_y}
&= \frac{\partial}{\partial r}\bigg|_\theta \left( \frac{\partial}{\partial r}\bigg|_y (x) \right) && (34) \\
&= \frac{\partial}{\partial r}\bigg|_\theta \left( \frac{\partial}{\partial r}\bigg|_y \sqrt{r^2 - y^2} \right) && (35) \\
&= \frac{\partial}{\partial r}\bigg|_\theta \left( \frac{r}{\sqrt{r^2 - y^2}} \right) && (36) \\
&= \frac{\partial}{\partial r}\bigg|_\theta \left( \frac{r}{x} \right) && (37) \\
&= \frac{\partial}{\partial r}\bigg|_\theta \left( \frac{r}{r \cos\theta} \right) && (38) \\
&= \frac{\partial}{\partial r}\bigg|_\theta \left( \frac{1}{\cos\theta} \right) && (39) \\
&= 0 && (40)
\end{aligned}
\]
while
\[
\begin{aligned}
\frac{\partial^2 x}{\partial r|_y \, \partial r|_\theta}
&= \frac{\partial}{\partial r}\bigg|_y \left( \frac{\partial}{\partial r}\bigg|_\theta (x) \right) && (41) \\
&= \frac{\partial}{\partial r}\bigg|_y \left( \frac{\partial}{\partial r}\bigg|_\theta (r \cos\theta) \right) && (42) \\
&= \frac{\partial}{\partial r}\bigg|_y (\cos\theta) && (43) \\
&= \frac{\partial}{\partial r}\bigg|_y \left( \frac{r \cos\theta}{r} \right) && (44) \\
&= \frac{\partial}{\partial r}\bigg|_y \left( \frac{x}{r} \right) && (45) \\
&= \frac{\partial}{\partial r}\bigg|_y \left( \frac{\sqrt{r^2 - y^2}}{r} \right) && (46) \\
&= \frac{\frac{r}{\sqrt{r^2 - y^2}}\, r - \sqrt{r^2 - y^2}}{r^2} && (47) \\
&= \frac{1}{r^2} \cdot \frac{r^2 - (r^2 - y^2)}{\sqrt{r^2 - y^2}} && (48) \\
&= \frac{y^2}{r^2 \sqrt{r^2 - y^2}} && (49) \\
&= \frac{y^2}{r^2 x} && (50) \\
&= \frac{\sin^2\theta}{r \cos\theta}. && (51)
\end{aligned}
\]
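A SymPy transcription of both computations (a sketch; it assumes 0 < y < r and 0 < θ < π/2 so that the square roots are unambiguous):

    import sympy as sp

    r, y, theta = sp.symbols('r y theta', positive=True)

    # d/dr|_theta ( d/dr|_y x ), with x = sqrt(r^2 - y^2) in the (r, y) chart:
    inner1 = sp.diff(sp.sqrt(r**2 - y**2), r)             # r/sqrt(r^2 - y^2) = r/x
    inner1_rtheta = inner1.subs(y, r * sp.sin(theta))     # move to the (r, theta) chart
    print(sp.simplify(sp.diff(inner1_rtheta, r)))         # 0

    # d/dr|_y ( d/dr|_theta x ), with x = r*cos(theta) in the (r, theta) chart:
    inner2 = sp.diff(r * sp.cos(theta), r)                # cos(theta) = x/r
    inner2_ry = sp.sqrt(r**2 - y**2) / r                  # move to the (r, y) chart
    print(sp.simplify(sp.diff(inner2_ry, r)))             # y**2/(r**2*sqrt(r**2 - y**2))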

1.7 Second derivatives need not commute (totally different sort of example)

The everywhere-continuous f(x, y) defined by

\[
f(x, y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0) \\ 0 & \text{otherwise,} \end{cases}
\]
has the following everywhere-continuous first derivatives:

\[
\frac{\partial f}{\partial x}\bigg|_y = \begin{cases} \dfrac{y(x^4 + 4x^2y^2 - y^4)}{(x^2 + y^2)^2} & \text{if } (x, y) \neq (0, 0) \\ 0 & \text{otherwise,} \end{cases} \tag{52}
\]
\[
\frac{\partial f}{\partial y}\bigg|_x = \begin{cases} \dfrac{x(x^4 - 4x^2y^2 - y^4)}{(x^2 + y^2)^2} & \text{if } (x, y) \neq (0, 0) \\ 0 & \text{otherwise,} \end{cases} \tag{53}
\]
and has the following second derivatives valid away from the origin:

\[
\frac{\partial^2 f}{\partial x|_y \, \partial y|_x} = \frac{\partial^2 f}{\partial y|_x \, \partial x|_y} = A(x, y) \equiv \frac{(x^2 - y^2)(x^4 + 10x^2y^2 + y^4)}{(x^2 + y^2)^3}.
\]

But A(x, y) is itself not continuous at the origin. Indeed lim_{x→0} A(x, y) = -1 while lim_{y→0} A(x, y) = +1. This means that when attempting to take the second derivative from first principles you could find that

\[
\frac{\partial^2 f}{\partial x \, \partial y}\bigg|_{(0,0)} = +1
\quad\text{and}\quad
\frac{\partial^2 f}{\partial y \, \partial x}\bigg|_{(0,0)} = -1.
\]
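A SymPy corroboration of both halves of the claim (a sketch):

    import sympy as sp

    x, y, h = sp.symbols('x y h', real=True)
    f = x * y * (x**2 - y**2) / (x**2 + y**2)

    fx, fy = sp.diff(f, x), sp.diff(f, y)
    # Away from the origin the mixed partials agree:
    print(sp.simplify(sp.diff(fx, y) - sp.diff(fy, x)))   # 0
    # From first principles at the origin (using f_x(0,0) = f_y(0,0) = 0):
    print(sp.limit(fx.subs({x: 0, y: h}) / h, h, 0))      # -1
    print(sp.limit(fy.subs({x: h, y: 0}) / h, h, 0))      # +1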

1.8 Why you should not extrapolate based on the first few terms of a sequence

\[
\begin{aligned}
\int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx &= \pi \\
\int_{-\infty}^{\infty} \frac{\sin x}{x} \cdot \frac{\sin\frac{x}{3}}{\frac{x}{3}}\,dx &= \pi \\
\int_{-\infty}^{\infty} \frac{\sin x}{x} \cdot \frac{\sin\frac{x}{3}}{\frac{x}{3}} \cdot \frac{\sin\frac{x}{5}}{\frac{x}{5}}\,dx &= \pi \\
\int_{-\infty}^{\infty} \frac{\sin x}{x} \cdot \frac{\sin\frac{x}{3}}{\frac{x}{3}} \cdot \frac{\sin\frac{x}{5}}{\frac{x}{5}} \cdot \frac{\sin\frac{x}{7}}{\frac{x}{7}}\,dx &= \pi \\
\int_{-\infty}^{\infty} \frac{\sin x}{x} \cdot \frac{\sin\frac{x}{3}}{\frac{x}{3}} \cdots \frac{\sin\frac{x}{9}}{\frac{x}{9}}\,dx &= \pi \\
\int_{-\infty}^{\infty} \frac{\sin x}{x} \cdot \frac{\sin\frac{x}{3}}{\frac{x}{3}} \cdots \frac{\sin\frac{x}{11}}{\frac{x}{11}}\,dx &= \pi \\
\int_{-\infty}^{\infty} \frac{\sin x}{x} \cdot \frac{\sin\frac{x}{3}}{\frac{x}{3}} \cdots \frac{\sin\frac{x}{13}}{\frac{x}{13}}\,dx &= \pi \\
\int_{-\infty}^{\infty} \frac{\sin x}{x} \cdot \frac{\sin\frac{x}{3}}{\frac{x}{3}} \cdots \frac{\sin\frac{x}{15}}{\frac{x}{15}}\,dx &= \frac{467807924713440738696537864469}{467807924720320453655260875000}\,\pi
\end{aligned}
\]
All equalities above are exact. Note that the last fraction is approximately

0.9999999999852937186π.
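For what it is worth, these are the well-known "Borwein integrals": the value stays exactly π for as long as 1/3 + 1/5 + ... + 1/(2n+1) < 1, and first dips below π once that sum exceeds 1. A quick exact check (a sketch) of where the crossing happens:

    from fractions import Fraction

    total = Fraction(0)
    for k in range(3, 17, 2):
        total += Fraction(1, k)
        print(k, total, total > 1)   # the sum first exceeds 1 at k = 15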

1.9 Why you should be careful with Lagrange Multipliers

By inspection, we can see that for any finite value of the fixed parameter m, the extremum of the function

\[
f(x, y, z) = x^2 + y^2 + z^2 + mx
\]
subject to the constraint
\[
x^2 + y^2 = 0
\]
is just zero, since the constraint simply implies that x = y = 0 and so f reduces to f_c = z^2, which has a global minimum of zero at x = y = z = 0. Note that an "unthinking" application of the method of Lagrange multipliers fails: if you define
\[
L = x^2 + y^2 + z^2 + mx - \lambda(x^2 + y^2)
\]
then the Euler-Lagrange equation corresponding to x is:

\[
2x + m - 2\lambda x = 0
\]
and it is clear that this equation is NOT satisfied by the desired solution x = y = z = 0 (unless we happen to be in the special case where m = 0).¹ This is due to the gradient of the constraint function being a null vector at the place where the constraint is satisfied. It is not clear to me how in general one avoids getting caught in this trap when the maths is more obscure (as it frequently is in real problems). Is it sufficient to simply take grad of any constraint and check for inequality with the null vector at all places satisfying the constraint? Is it acceptable to perturb the constraint away from null - e.g. by replacing the constraint with x² + y² = ε²?

¹Note, however, that this argument presupposes λ is well defined. If the constraint is perturbed by a small amount ε (see later) then λ may be seen to grow as an inverse power of ε - in effect saying that λ needs to go to infinity as the constraint becomes un-perturbed.
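A short SymPy illustration of the trap (a sketch; the symbol names are mine):

    import sympy as sp

    x, y, z, lam, m = sp.symbols('x y z lambda m', real=True)
    L = x**2 + y**2 + z**2 + m*x - lam*(x**2 + y**2)

    # Stationarity w.r.t. x at the true constrained minimum (0, 0, 0):
    print(sp.diff(L, x).subs({x: 0, y: 0, z: 0}))   # m: no finite lambda helps unless m = 0
    # The constraint gradient there is the null vector:
    print([sp.diff(x**2 + y**2, v).subs({x: 0, y: 0}) for v in (x, y)])   # [0, 0]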

1.10 Cauchy Riemann Equations and Wirtinger Derivatives

1.10.1 Wirtinger derivatives

Physicists use ∂/∂z and ∂/∂z̄ quite a lot, but often don't say how these seemingly innocent derivatives are defined. Often people seem to assume that they can be interpreted as ordinary partial derivatives in which "the other" is kept constant: viz. ∂/∂z|_z̄ and ∂/∂z̄|_z. However, after some thought one notices that it is not possible to vary z while keeping z̄ constant, or to vary z̄ while keeping z constant. If one changes, then so does the other. Is this a problem? It need not be ... after all, if f = x + y is considered to define f as a function of two variables x and y, and if we later discover that y itself is a function of x, such as y = x², this later knowledge (which tells us that x can't really be varied without

varying y) doesn't actually prevent us still considering things like ∂f/∂x|_y. Such derivatives intentionally blind themselves to any functional dependence that y might have - asking us to look at f explicitly as a function of just x and y, regardless of the internal workings of either x or y.


Nonetheless, even though one can view ∂/∂z as being a sloppy notation for ∂/∂z|_z̄, many people still find such an approach disturbing. Persons in that camp can sometimes find it helpful to see an alternative approach (Wirtinger), which is to define these derivatives as follows:
\[
\frac{\partial}{\partial z} \equiv \frac{1}{2}\left( \frac{\partial}{\partial x}\bigg|_y - i \frac{\partial}{\partial y}\bigg|_x \right), \tag{54}
\]
\[
\frac{\partial}{\partial \bar z} \equiv \frac{1}{2}\left( \frac{\partial}{\partial x}\bigg|_y + i \frac{\partial}{\partial y}\bigg|_x \right) \tag{55}
\]
acting on the space of functions on the set {z | z = x + iy}. Among the useful consequences of this definition are:

• These operators really are allowed to be called "derivatives" (or "derivations" by mathematicians) as they satisfy the product (Leibniz) rule: ∂(fg)/∂z = f ∂g/∂z + (∂f/∂z)g and ∂(fg)/∂z̄ = f ∂g/∂z̄ + (∂f/∂z̄)g,

• $\frac{\partial}{\partial \bar z}(z^n) = \frac{\partial}{\partial \bar z}\big((x+iy)^n\big) = \frac{1}{2} n (x+iy)^{n-1} (1 + i^2) = 0$, and thus $\frac{\partial f(z)}{\partial \bar z} = 0$ for functions that have a Taylor or Laurent series in powers of z,

• $\frac{\partial}{\partial z}(z^n) = \frac{\partial}{\partial z}\big((x+iy)^n\big) = \frac{1}{2} n (x+iy)^{n-1} (1 - i^2) = n z^{n-1}$.

Taking the above properties together, one sees that one really can 'treat' the derivatives "∂/∂z and ∂/∂z̄" almost 'as if' they really did mean "∂/∂z|_z̄ and ∂/∂z̄|_z", even though strictly this notation is not self consistent!
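A short SymPy check (a sketch; the helper names are mine) that definitions (54) and (55) reproduce the bulleted properties:

    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    z, n = x + sp.I * y, 5                           # any fixed power n will do

    d_dz    = lambda f: (sp.diff(f, x) - sp.I * sp.diff(f, y)) / 2
    d_dzbar = lambda f: (sp.diff(f, x) + sp.I * sp.diff(f, y)) / 2

    print(sp.simplify(d_dzbar(z**n)))                # 0
    print(sp.simplify(d_dz(z**n) - n * z**(n-1)))    # 0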

1.10.2 Extension to fields

In text books on Quantum Field Theory one often sees similar methods being applied to allow a complex scalar field φ, its complex conjugate φ̄, its time derivative φ̇ and the time derivative of its complex conjugate φ̄̇ to appear to be treated as independent quantities in Lagrangians of the form L(φ, φ̄, φ̇, φ̄̇). As before, that view is not wrong, but if you wish to do so, you can alternatively take the view that the "actual" independent degrees of freedom are the fields u(x, t), u̇(x, t), v(x, t) and v̇(x, t) where φ = u + iv, and where the Wirtinger derivatives are being defined as:
\[
\frac{\partial}{\partial \phi} \equiv \frac{1}{2}\left( \frac{\partial}{\partial u}\bigg|_{v,\dot u,\dot v} - i \frac{\partial}{\partial v}\bigg|_{u,\dot u,\dot v} \right), \tag{56}
\]
\[
\frac{\partial}{\partial \bar\phi} \equiv \frac{1}{2}\left( \frac{\partial}{\partial u}\bigg|_{v,\dot u,\dot v} + i \frac{\partial}{\partial v}\bigg|_{u,\dot u,\dot v} \right), \tag{57}
\]
\[
\frac{\partial}{\partial \dot\phi} \equiv \frac{1}{2}\left( \frac{\partial}{\partial \dot u}\bigg|_{u,v,\dot v} - i \frac{\partial}{\partial \dot v}\bigg|_{u,v,\dot u} \right), \tag{58}
\]
\[
\frac{\partial}{\partial \dot{\bar\phi}} \equiv \frac{1}{2}\left( \frac{\partial}{\partial \dot u}\bigg|_{u,v,\dot v} + i \frac{\partial}{\partial \dot v}\bigg|_{u,v,\dot u} \right). \tag{59}
\]
As before, these definitions may safely be used "as if" they were equivalent to derivatives at "constant" values of the other "independent" field components,


viz. ∂/∂φ ≡ ∂/∂φ|_{φ̄,φ̇,φ̄̇}, etc. Note that none of these definitions would be much use if things like the E-L equations did not retain a simple form in the new derivatives. Fortunately they do. Inverting the relations above, we see that
\[
\frac{\partial L}{\partial u}\bigg|_{v,\dot u,\dot v} = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot u}\bigg|_{u,v,\dot v} \right)
\]
becomes
\[
\frac{\partial L}{\partial \phi} + \frac{\partial L}{\partial \bar\phi} = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot\phi} + \frac{\partial L}{\partial \dot{\bar\phi}} \right) \tag{60}
\]
and
\[
\frac{\partial L}{\partial v}\bigg|_{u,\dot u,\dot v} = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot v}\bigg|_{u,v,\dot u} \right)
\]
becomes
\[
i\left( \frac{\partial L}{\partial \phi} - \frac{\partial L}{\partial \bar\phi} \right) = \frac{d}{dt}\, i\left( \frac{\partial L}{\partial \dot\phi} - \frac{\partial L}{\partial \dot{\bar\phi}} \right) \tag{61}
\]
and so taking (60) and (61) together we recover:
\[
\frac{\partial L}{\partial \phi} = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot\phi} \right) \quad\text{and} \tag{62}
\]
\[
\frac{\partial L}{\partial \bar\phi} = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\bar\phi}} \right). \tag{63}
\]
Note that if L is real, then these two equations are not independent (one being the complex conjugate of the other) and so in this case it is sufficient to use just one of them.

1.10.3 Cauchy Riemann

We are now able to derive the Cauchy Riemann equations for functions meeting the conditions above:
\[
\begin{aligned}
0 &= \frac{\partial f(z)}{\partial \bar z} && \text{(provided } f(z) \text{ is of the appropriate type!)} \quad (64) \\
&= \frac{\partial \big(u(x, y) + i v(x, y)\big)}{\partial \bar z} && (65) \\
&= \frac{1}{2}\left( \frac{\partial}{\partial x}\bigg|_y + i \frac{\partial}{\partial y}\bigg|_x \right) \big(u(x, y) + i v(x, y)\big) && (66) \\
&= \frac{1}{2}\left( \frac{\partial u}{\partial x}\bigg|_y + i \frac{\partial u}{\partial y}\bigg|_x + i \frac{\partial v}{\partial x}\bigg|_y + i^2 \frac{\partial v}{\partial y}\bigg|_x \right) && (67) \\
&= \frac{1}{2}\left( \frac{\partial u}{\partial x}\bigg|_y - \frac{\partial v}{\partial y}\bigg|_x \right) + \frac{i}{2}\left( \frac{\partial u}{\partial y}\bigg|_x + \frac{\partial v}{\partial x}\bigg|_y \right) && (68)
\end{aligned}
\]
and therefore, since u, v, x and y are all real, we must have

\[
\frac{\partial u}{\partial x}\bigg|_y = +\frac{\partial v}{\partial y}\bigg|_x
\]

and

\[
\frac{\partial u}{\partial y}\bigg|_x = -\frac{\partial v}{\partial x}\bigg|_y.
\]
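And a direct check of the two equations just derived, for the concrete analytic function f(z) = z³ (a sketch):

    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    w = sp.expand((x + sp.I * y)**3)
    u, v = sp.re(w), sp.im(w)                 # u = x^3 - 3xy^2, v = 3x^2y - y^3

    print(sp.simplify(sp.diff(u, x) - sp.diff(v, y)))   # 0
    print(sp.simplify(sp.diff(u, y) + sp.diff(v, x)))   # 0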

1.11 L’Hopital’s rule

If $\lim_{x\to c} f(x) = \lim_{x\to c} g(x) = a$ and [ a = 0 or a = ±∞ ] and $\lim_{x\to c} \frac{f'(x)}{g'(x)}$ exists, then
\[
\lim_{x\to c} \frac{f(x)}{g(x)} = \lim_{x\to c} \frac{f'(x)}{g'(x)}.
\]
Your attention is drawn to the last requirement above. It is important, as the example f(x) = x + sin x and g(x) = x with x → ∞ illustrates: f(x)/g(x) → 1, yet f'(x)/g'(x) = 1 + cos x has no limit at all.
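A numerical illustration of that example (a sketch):

    import math

    for x in [10.0, 100.0, 1000.0, 10000.0]:
        # f/g settles down to 1, while f'/g' = 1 + cos(x) keeps oscillating
        print(x, (x + math.sin(x)) / x, 1 + math.cos(x))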

1.12 SU(3) multiplet multiplicities

If p and q are the number of gaps across two adjacent edges of the multiplet weight diagram, then it is called the (p, q)-multiplet and contains ½(p + 1)(q + 1)(p + q + 2) states within it. N.B., this number of states can also be written as ½ n_a n_b (n_a + n_b) where n_a and n_b are the number of states on each of two adjacent edges. Multiplicity is not a good way of labelling multiplets, as different multiplets can have the same multiplicity. For example: there are 15 states in both the (2, 1) and the (4, 0) multiplets; there are 105 states in both the (13, 0) and the (6, 2) multiplets; and there are 120 states in each of the (9, 1), (14, 0) and (5, 3) multiplets. 840 and 960 can also be realised in three different ways (find them!) and there are many other ways of realising multiplicities in two different ways. I know of no multiplicity that can be realised in four or more ways.
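Should you wish to hunt for the 840s and 960s, a brute-force scan is quick (a sketch; the range of 60 and the p ≥ q convention, which identifies a multiplet with its conjugate, are my choices):

    from collections import defaultdict

    dims = defaultdict(list)
    for p in range(60):
        for q in range(p + 1):        # p >= q: treat (p, q) and (q, p) as one shape
            dims[(p + 1) * (q + 1) * (p + q + 2) // 2].append((p, q))

    for d, multiplets in sorted(dims.items()):
        if len(multiplets) >= 3:      # e.g. 120 -> [(5, 3), (9, 1), (14, 0)]
            print(d, multiplets)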

1.13 Playing the Cello (in equal temperament)

\[
\begin{aligned}
N &= \frac{8x}{a} + \frac{1}{2} - 3 && \text{(3 for the key of C major)} \quad (69) \\
n &= N \bmod 7 && \text{(0 = tonic, 6 = sub-tonic)} \quad (70) \\
\sigma &= \lfloor N/7 \rfloor && \text{(which octave we are in)} \quad (71) \\
f &= (130.8\ \text{Hz} \times 2^{\sigma}) \, \{1, 2^{\frac{2}{12}}, 2^{\frac{4}{12}}, 2^{\frac{5}{12}}, 2^{\frac{7}{12}}, 2^{\frac{9}{12}}, 2^{\frac{11}{12}}\}[n] && \text{(note frequency)} \quad (72) \\
L &= \frac{1}{2f} \sqrt{\frac{T}{\rho}} && (T = \text{tension, } \rho = \text{mass per unit length}) \quad (73)
\end{aligned}
\]
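Equations (70)-(73) in code (a sketch; it assumes N has already been rounded to an integer, and T and ρ are as defined above):

    import math

    def note_frequency(N):
        n = N % 7                                    # 0 = tonic ... 6 = sub-tonic
        sigma = N // 7                               # which octave we are in
        semitone_offsets = [0, 2, 4, 5, 7, 9, 11]    # major scale, in semitones
        return 130.8 * 2**sigma * 2**(semitone_offsets[n] / 12)

    def string_length(f, T, rho):
        # L = (1/2f) sqrt(T/rho): the sounding length that gives frequency f
        return math.sqrt(T / rho) / (2 * f)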

1.14 Another way of doing one of the sums from the Fourier part of the course

To perform the sum
\[
S = \sum_{n=1}^{\infty} \frac{1}{1 + n^2}
\]
one can try to create a function of a complex number with poles at the integers whose residues are equal to the terms being summed, so that the sum can be

converted to a contour integral. [ Thanks to Herschel Chawdhry for suggesting this approach. ] We could start by noting that the function
\[
f(z) = \frac{\pi \cot \pi z}{1 + z^2}
\]
has simple poles at z = ±i and z = n for integer n. We will need to know the residues r_z of f(z) at each of those poles. From the periodicity of cot, the residues of cot πz at each of the positions z = n will be identical. From the Taylor series:
\[
\pi \cot \pi z = \pi \, \frac{1 + O((\pi z)^2)}{\pi z + O((\pi z)^3)} = \frac{1}{z} + O(1)
\]
we see that the residue of π cot πz at the origin, and thus also at each of the positions z = n, is 1. The residues of f(z) at these integer positions are therefore:
\[
r_n = 1 \cdot \frac{1}{1 + n^2} = \frac{1}{1 + n^2}
\]
which sum over all n to 2S + 1. The residues at z = ±i are best seen by factorising f(z):
\[
f(z) = (\pi \cot \pi z) \, \frac{1}{z - i} \, \frac{1}{z + i}
\]
and thereby noting that
\[
r_{\pm i} = \frac{\pi \cot \pi i}{2i} = -\frac{\pi \coth \pi}{2}.
\]
We are now in a position to state by Cauchy's Theorem that:
\[
\oint_\Gamma f(z)\,dz = 2\pi i \left\{ (2S + 1) - (\pi \coth \pi) \right\}
\]
where the first quantity in round brackets is the sum of the residues at the poles on the real axis, the second is the sum of the residues of the other two poles, and Γ is an infinitely big version of the curve shown in the plot below.²

²Technically, we ought to use a sequence of ever larger but nonetheless finite curves, and therefore should work with finite sums associated with finitely many of the poles on the real line. We will leave those details as an exercise for the reader.

[Figure: the rectangular contour Γ in the complex plane (axes ℜ(z) and ℑ(z)), enclosing the poles at z = 0, ±1, ±2, ±3, ±4, ... on the real axis and the poles at z = ±i.]

The curve Γ has two vertical sections and two horizontal sections. The integrals along the two horizontal parts of Γ go to zero as the box height H goes to infinity, since
\[
\cot(x + iy) = \frac{\sin 2x - i \sinh 2y}{\cosh 2y - \cos 2x}
\]
and so
\[
\lim_{y \to \pm\infty} \cot(x + iy) = \mp i.
\]
As this is constant and bounded, the 1/(1 + z²) part of f(z) will ensure that the integral along the top and bottom of Γ will go to zero as the box size increases, if the aspect ratio of the box remains approximately unity. On the vertical parts of Γ the complex number z takes values where its real part is half-integer. Taking z = n + ½ + iy yields cot πz = -i tanh πy, whose magnitude is bounded by 1. As this also is bounded, the remaining 1/(1 + z²) part of f(z) will ensure that the integral along the vertical parts of Γ will go to zero as the box size increases, if the aspect ratio of the box remains approximately one. Accordingly, we can see that as the curve Γ grows in size, the integral around it will tend to zero. In other words, we may state:

\[
0 = 2\pi i \left\{ (2S + 1) - (\pi \coth \pi) \right\}
\]
which if solved for S gives

\[
S = \sum_{n=1}^{\infty} \frac{1}{1 + n^2} = \frac{1}{2} \left( \pi \coth \pi - 1 \right).
\]
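A quick numerical check of the closed form (a sketch; the truncation point is arbitrary):

    import math

    S_numeric = sum(1.0 / (1 + n * n) for n in range(1, 200_000))
    S_closed = (math.pi / math.tanh(math.pi) - 1) / 2
    print(S_numeric, S_closed)   # both ~ 1.07667, agreeing to about 5 decimal places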
