
Quantum Mechanics

Li-Chung Wang

February 8, 2014

Abstract

The Dirac delta generalized function should not be treated as a function; we provide an easy way to bridge the gap between a function and a generalized function. Because the proofs given in Reif [93, §15·14] fail to get to the heart of the matter, we shall prove the equality between the ensemble average and the time average using Chung [15, p.133, Theorem 5.4.2(8)] instead. Without the guidance of the theory of Lie groups, the discussion about angular momentum given in Cohen-Tannoudji–Diu–Laloë [17, vol. 1, chap. VI, §B, §D.1.a, Complement BVI] looks disorganized. In contrast, with the theory of Lie groups as a guide, Vvedensky [119, §7.4.1 & §7.4.2] clarifies the inner structure of angular momentum. Compared with Vvedensky [119, §7.3.1], the discussion in Marion–Thornton [77, p.35, l. 22–p.36, l. 1] is marginal. It is unnecessary to represent δθ by a vector [Marion–Thornton [77, p.262, l. −2–l. −1; p.263, Fig. 7-8]]; see Fomin–Gelfand [35, p.87, l. −18–l. −5]. Representing a matrix by a vector is neither natural, simple, nor easy to understand.

Keywords. Calculus of variations, variation of a functional, Taylor’s polynomial formula, admissible functions, variational derivative, isoperimetric problems, holonomic problems, non-holonomic problems, inhomogeneous wave equations, extremal, natural frame field, Legendre transformation, duality, Hadamard global inverse function theorem, convex function, proper map, principle of least action, conservation of momentum and that of angular momentum, Lagrangian method, symmetry, canonical Euler equations, characteristic system associated with the Hamilton–Jacobi equation, Dirac delta function, solid angle, autonomous system, initial condition, Lagrangian, total derivative, infinitesimal displacement, infinitesimal transformations, finite transformations, Lie groups, infinitesimal rotation, complete integral, cyclic coordinates, Hamilton’s characteristic function, canonical transformations, generating functions, Hamiltonian, equipartition theorem, Lissajous curves, conic sections, Galilean transformations, Lorentz transformations, Michelson–Morley experiment, Maxwell’s equations, muon decay, special relativity, proper time, relativistic kinetic energy, relativistic Lagrangian, refraction index, phase velocity, group velocity, signal velocity, inertial frame, timelike interval, spacelike interval, scattering angle, final velocity, total cross section, center-of-mass coordinate system, out of phase, coupled harmonic oscillators, normal modes, generalized coordinates, complete orthonormal set [orthonormal basis], node, closure relation, state space, Bessel functions, Leibniz rule for higher derivatives, indicial equation, generators of a one-sheet hyperboloid, ruled surface, developable surface, directrix, director sphere, orthogonal, augmented matrix, conjugate diameters, polar lines, polar planes, diametral planes of a paraboloid, umbilics of an ellipsoid, principal planes, conicoids of revolution, double tangent plane, osculating plane, osculating circle, osculating sphere, reciprocal cone, consecutive points, contact of second or third order, envelope, edge of regression, inflectional tangent, ensemble, measurable, Laplace’s equation, confocal conicoids, equipotential surfaces, strain tensor, bounded variation, random variable, Markov or stopping time, expectation, variance, moment of inertia, center of mass, uniformly integrable, Slutsky’s theorem, Laplace transform, characteristic functions, distribution, density, indicator, inversion formula, distribution determining class, pushforward measure, convergence in probability, almost sure convergence, strong law of large numbers, central limit theorem, weak convergence, pointwise convergence, spin, Stern–Gerlach experiment, ensemble, ergodicity, Liouville’s theorem, chain rule, reversible processes, Stirling’s formula, dielectrics, postulate of equal a priori probabilities, microstates, equation of state, Boltzmann’s constant, specific heats, entropy, dipole layer, Green’s theorem, point dipole, Jacobian, Wronskian, normal derivative, Lommel’s formula, Ritz method, Sturm–Liouville problems, direct methods, method of finite differences, Euler expression, Gaussian curvature, positive definite, extremal, equation of the vibrating membrane, vector triple product, method of Lagrange multipliers, uncertainty principle, De Broglie wavelength, bound state, specific heat, zero-point energy, ground state, magnetic moment, normal operators, observables, conjugacy, wave function, probability amplitude, probability distribution, testing functions, surface harmonics, principal normal, Möbius strip, vector bundle, trivial bundle, n-plane bundle, Jordan-measurable, simply-connected regions, Stokes theorem, multiply-connected regions, flow, Kelvin’s inversion theorem, Whipple’s formula, contour integrals, dummy variable, branch points, Riemann’s differential equation, Euler transforms, Fuchsian type, Borel measures, hypergeometric, confluent, Chu–Vandermonde identity, Kummer’s first formula, regular singularity, Riemann’s P-equation, Lebesgue dominated convergence theorem, methodical solutions, Weber–Schafheitlin integral, Hankel inversion theorem, Riemann–Lebesgue theorem, Riemann surfaces, Lommel’s theorem, Sturm’s theorem, meromorphic functions, essential singularities, Cauchy–Kowalewski theorem, microstates, flux density, rate of flow, Plancherel’s theorem, transient, steady state, inductance, dimensions, polarization, solid angle, inertia tensor, harmonic function, polarization, magnetization, potential, vector potential, divergence, curl, solenoid, toroid, polarizability, Clausius–Mossotti relation, Larmor frequency, diamagnetic, Debye equation, Curie law for paramagnetic material, ferroelectric, ferromagnetic, Curie temperature, Maxwell velocity distribution function, Lorenz–Lorentz law, cophasal surface, continuity equation

1 Prerequisites

Rome wasn’t built in a day. To lay a solid foundation, there are no shortcuts except studying the classics. Because of its fragmentary nature, an encyclopedia is useful only in the following cases: after reading a classical textbook, it may help review and organize the material; after a long while, it may help recover a clear image. It is unnecessary to rewrite an entire classical textbook. We need only point out its shortcomings and propose methods of improvement. For example, one may update its references or quote more direct or detailed proofs. Sometimes one rewrites an entire classical book, only to lose its originality or damage its structural delicacy.

1. Geometry (2-dimensional analytic geometry: Fine–Thompson [33]; 3-dimensional analytic geometry: Bell [2]; differential geometry: Carmo [12]; Courant–John [22, vol. 2, chap. 5]; Deo [25]; Hicks [55]; Kreyszig [66]; Lee [73] [For its solutions to exercises and problems, see https://wj32.org/wp/wp-content/uploads/2012/12/Introduction-to-Smooth-Manifolds.pdf]; Munkres [84]; Munkres [85]; O’Neill [86]; Spivak [108]; Weatherburn [124]; Willmore [126])

Remark 1. (Point dividing a chord) Fine–Thompson [33, p.46, Example 2]

Proof. We may assume the center of the circle is (0,0). Rotate the coordinate axes so that P = (x′, y′), Q = (x′, y1), and R = (x′, −y1).

Remark 2. Parallel lines project into parallel lines [Fine–Thompson [33, p.90, l. −14]].

Proof. Let l1, l2 be parallel lines that cut plane γ at points P1, P2. Draw the normal Ni of γ at Pi. Plane Nili cuts γ in line li′. By the definition of cross product, plane N1l1 and plane N2l2 are parallel. Consequently, l1′, l2′ are parallel.

Remark 3. By Fine–Thompson [33, p.34, (3)], (ae1,0), (a/e1,b/e2), (0,be2) are on a straight line [Fine–Thompson [33, p.105, l. −2–l. −1]].

Remark 4. (The equation of a hyperbola referred to its asymptotes) If the lines ax + by + c = 0 and a′x + b′y + c′ = 0 are neither identical nor parallel, then, by Fine–Thompson [33, p.131, l. −15–l. −12], (ax + by + c)(a′x + b′y + c′) = d (where d is a constant) is a hyperbola [Fine–Thompson [33, p.114, l.13–l.15]].

Remark 5. Fine–Thompson [33, p.121, (3)]

Proof. For (3)(i), consider the perpendiculars to the y-axis of x,x1,y1. For (3)(ii), consider the perpendiculars to the x-axis of y,x1,y1.

Remark 6. $\frac{x - 2y - 14}{\sqrt{5}} + 8\sqrt{5} = 0$ [Fine–Thompson [33, p.133, l. −13]].

Proof. Let (x,y) be an arbitrary point on the directrix. (x,y) and O are on the same side of Cx1. The distance of O from Cx1 is negative, and so is the distance of (x,y) from Cx1 [Fine–Thompson [33, p.32, (3)]]. Consequently, $\frac{x - 2y - 14}{\sqrt{5}} = -|CD| = -8\sqrt{5}$.

Remark 7. Let U = 0 be a conic and l = 0 a line. If U = 0 and l = 0 intersect at two points P and Q, then U + λl² = 0 (where λ is a constant) touches U = 0 at P and Q [Fine–Thompson [33, p.142, l. −10–l. −1]].

Proof. Suppose U = 0 and (l = 0) ∪ (l + ε = 0) intersect at P, P′, Q, Q′, where P′ → P and Q′ → Q as ε → 0. Then PP′ approaches a tangent line to U = 0 at P as ε → 0. The same discussion applies to U + λl(l + ε) = 0 [Fine–Thompson [33, p.142, l.1–l.4]].

Remark 8. The equation of the tangent to y² = 4ax in terms of its slope is y = mx + a/m [Fine–Thompson [33, p.181, l. −10–l. −9]].

Proof. Let P(x′,y′) be a point on y² = 4ax. Then the tangent at P is yy′ = 2a(x + x′) [Fine–Thompson [33, p.61, (6)]]. Namely, $y = \frac{2a}{y'}x + \frac{2ax'}{y'}$.
$m = \frac{2a}{y'} \Rightarrow \frac{2ax'}{y'} = \frac{y'^2/2}{y'} = y'/2 = a/m$.

Remark 9. Every plane section of a conicoid is a conic [Fine–Thompson [33, p.236, l. −9]].

Proof. Bell [2, p.74, Ex. 2] gives one proof. Another proof is as follows: Take a point (x0,y0,z0) on the plane as the new origin and then take a basis {A,B} of the plane. By substituting (x,y,z) = (x0,y0,z0) + uA + vB into the equation F(x,y,z) = 0 of the conicoid, we obtain an equation of the second degree in u,v.
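The slope form of the tangent in Remark 8 can be checked with exact arithmetic. The following is a minimal sketch, not part of the cited texts: the function names `tangent_discriminant` and `contact_point` are hypothetical, and the check simply substitutes y = mx + a/m into y² = 4ax and confirms that the resulting quadratic in x has a double root.

```python
from fractions import Fraction


def tangent_discriminant(a, m):
    """Discriminant of the quadratic in x obtained by substituting
    y = m*x + a/m into y**2 = 4*a*x (hypothetical helper)."""
    a, m = Fraction(a), Fraction(m)
    c = a / m  # intercept of the candidate tangent line
    # (m*x + c)**2 - 4*a*x = m^2 x^2 + (2*m*c - 4*a) x + c^2
    A, B, C = m * m, 2 * m * c - 4 * a, c * c
    return B * B - 4 * A * C


def contact_point(a, m):
    """Point of y**2 = 4*a*x at which the tangent has slope m."""
    a, m = Fraction(a), Fraction(m)
    return (a / (m * m), 2 * a / m)


if __name__ == "__main__":
    # A zero discriminant means the line meets the parabola in a double point.
    print(tangent_discriminant(3, Fraction(2, 5)))
    print(contact_point(1, 2))
```

A vanishing discriminant for every nonzero m is exactly the tangency condition; the contact point (a/m², 2a/m) follows from m = 2a/y′ in the proof.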

Remark 10. The proof of the statement given in Fine–Thompson [33, p.250, l.3–l.6] is similar to the proof of the statement given in Fine–Thompson [33, p.150, l.2–l.4].

Remark 11. QP cannot meet the one-sheet hyperboloid in more than two points [Fine–Thompson [33, p.252, l. −5–l. −4]].

Proof. If QP meets the one-sheet hyperboloid in more than two points, then QP is a generating line [Bell [2, p.152, l.7–l.8]]. Then the plane determined by the two lines from different systems contains three generating lines of the one-sheet hyperboloid, which contradicts the fact that every plane section of a conicoid is a conic [Bell [2, p.74, l.1–l.2]].

Remark 12. (A line of centers at infinity) Fine–Thompson [33, p.268, Example 3]
The line of centers of x² = 4ay is (∞,∞,z), where −∞ < z < ∞.
x² = 4ay means that the square of the distance between (x,y,z) and the yz-plane equals 4a times the distance between (x,y,z) and the zx-plane.
The equation given in Fine–Thompson [33, p.268, l. −7] means that the square of the distance between (x,y,z) and x − z + 1/4 = 0 equals 4a times the distance between (x,y,z) and x + 8y + z − 1/8 = 0.
The relationship between (x,y,z) and two perpendicular planes shows that the conicoid represented by x² = 4ay is congruent to that represented by the equation given in Fine–Thompson [33, p.268, l. −7]. We can use a translation and a rotation to make the two conicoids coincide. See Fine–Thompson [33, p.134, (5) & (6)].

Remark 13. The necessary and sufficient condition that φ(x,y,z) shall split up into linear factors is Fine–Thompson [33, p.124, (3’)] [Fine–Thompson [33, p.278, l.10–l.11]].

Proof. For the necessary condition, see Fine–Thompson [33, p.123, l. −6–p.124, l.12]. For the sufficient condition, see Bell [2, pp.209–210, §145].

Remark 14. The coefficient of $k^4$ in $\Delta_4$ is
\[
\begin{vmatrix} 1 & 0 & 0 & x_1 \\ 0 & 1 & 0 & y_1 \\ 0 & 0 & 1 & z_1 \\ x_1 & y_1 & z_1 & x_1^2 + y_1^2 + z_1^2 - 1 \end{vmatrix}
\]
[Fine–Thompson [33, p.280, l. −19–l. −17]].

Proof. $\Delta_4$ and
\[
\begin{vmatrix} -k & 0 & 0 & -kx_1 \\ 0 & -k & 0 & -ky_1 \\ 0 & 0 & -k & -kz_1 \\ -kx_1 & -ky_1 & -kz_1 & -k(x_1^2 + y_1^2 + z_1^2 - 1) \end{vmatrix}
\]
have the same coefficient of $k^4$, and
\[
\begin{vmatrix} -k & 0 & 0 & -kx_1 \\ 0 & -k & 0 & -ky_1 \\ 0 & 0 & -k & -kz_1 \\ -kx_1 & -ky_1 & -kz_1 & -k(x_1^2 + y_1^2 + z_1^2 - 1) \end{vmatrix}
= k^4 \begin{vmatrix} 1 & 0 & 0 & x_1 \\ 0 & 1 & 0 & y_1 \\ 0 & 0 & 1 & z_1 \\ x_1 & y_1 & z_1 & x_1^2 + y_1^2 + z_1^2 - 1 \end{vmatrix}.
\]

Remark 15. k1x² + k2y² + k3z² + d′ = 0 [Fine–Thompson [33, p.282, (10)]].

Proof. Fine–Thompson [33, p.282, (11)] [i.e., Fine–Thompson [33, p.271, (6)]] gives the direction cosines of the axes of the conicoid [Fine–Thompson [33, p.282, l.14–l.16]]. Using the axes of the conicoid as the coordinate axes, we obtain the standard form of the equation of the conicoid. Since a rotation [Fine–Thompson [33, p.261, l.14–l.18]] will not affect the absolute term of the equation of the conicoid, this standard form is k1x² + k2y² + k3z² + d′ = 0 [Fine–Thompson [33, p.282, l.6; p.277, l.4–l.11; p.280, l. −1]].

Remark 16. (Cartesian coordinates lack the ability to distinguish infinities of all directions) The center is (∞,∞,∞) [Fine–Thompson [33, p.285, l. −7]].

Proof. The algebraic proof follows from Fine–Thompson [33, p.266, (8)]. A geometric proof proceeds as follows: Let the center of the conicoid be (x0,y0,z0). Since (x0,y0,z0) is the common midpoint of chords of the parabola on the plane y = 0 [Fine–Thompson [33, p.243, Figure]], y0 = 0, x0 = ∞ = z0. The line through (∞,0,∞) with direction cosines (λ,0,ν) is $\frac{x-\infty}{\lambda} = \frac{y-0}{0} = \frac{z-\infty}{\nu}$, which can also be interpreted as $\frac{x-\infty}{\lambda} = \frac{y-\infty}{0} = \frac{z-\infty}{\nu}$, the line through (∞,∞,∞) with direction cosines (λ,0,ν). This is why we find from the algebraic proof that the distance between (∞,∞,∞) and any point on the paraboloid is the same. Consequently, from the algebraic viewpoint, (∞,0,∞) = (∞,∞,∞). Since (x0,y0,z0) is the common midpoint of chords of the parabola on the plane x = 0 [Fine–Thompson [33, p.243, Figure]], x0 = 0, y0 = ∞ = z0.

Remark. The paraboloid has the center (∞,∞,∞) because we use an improper tool. Cartesian coordinates lack the ability to distinguish infinities of all directions. If we use spherical coordinates instead, then the points (θ,φ,r = ∞) are different whenever the directions (θ,φ) are different. In this case, there is no common midpoint for the chords of the paraboloid through the origin.

Remark 17. (Projected area) Bell [2, p.18, l.6–l.15]
The signed area of a triangle can be expressed in terms of cross product and unit normal. For example, $\overrightarrow{AB} \times \overrightarrow{AC} = (\Delta ABC)N$. If $N \cdot N' = \cos\theta < 0$, then $\Delta ABC$ and $\Delta A'B'C'$ have opposite signs.

Remark 18. (Direction-cosines vs. direction-ratios) Bell [2, §20 & §28]

Remark. The angle between any two coordinate axes may be oblique. The direction-ratios of a ray are the coordinates of L, where $\overrightarrow{OL}$ is the direction of the ray. The x-coordinate of L is OA, where the plane through L and parallel to plane YOZ cuts the x-axis at A.

Remark 19. (The angle between two lines)
\[
\cos\theta = \pm\frac{\sum(aa'\sin^2\lambda) - \sum(bc'+b'c)(\cos\lambda - \cos\mu\cos\nu)}{\{\sum a^2\sin^2\lambda - 2\sum bc(\cos\lambda - \cos\mu\cos\nu)\}^{1/2}\,\{\sum a'^2\sin^2\lambda - 2\sum b'c'(\cos\lambda - \cos\mu\cos\nu)\}^{1/2}}
\]
[Bell [2, p.27, l. −6–l. −5]].

Proof.
\[
\frac{\cos\alpha}{a}\{\sum a^2\sin^2\lambda - 2\sum bc(\cos\lambda - \cos\mu\cos\nu)\}^{1/2}
= \pm\{1 - \cos^2\lambda - \cos^2\mu - \cos^2\nu + 2\cos\lambda\cos\mu\cos\nu\}^{1/2}
= \frac{\cos\alpha'}{a'}\{\sum a'^2\sin^2\lambda - 2\sum b'c'(\cos\lambda - \cos\mu\cos\nu)\}^{1/2}
\]
[Bell [2, p.27, §26]]. The desired result follows from the equality given in Bell [2, p.27, l. −12–l. −10].

Remark 20. (Intersection of three planes) Bell [2, §45]
Consider the system of equations given in Bell [2, p.49, (1), (2), & (3)]. Let
\[
r = \text{rank of the coefficient matrix } \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}
\quad\text{and}\quad
r' = \text{rank of the augmented matrix } \begin{pmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \end{pmatrix}.
\]

Table 1: The intersection of three planes

Systems      | Case number | Algebraic classification | Geometric classification
Consistent   | 1 | r = 3 | Three planes intersect at one point.
Consistent   | 2 | r = r′ = 2; no two rows of the augmented matrix are proportional. | Three planes intersect in one line.
Consistent   | 3 | r = r′ = 2; two rows of the augmented matrix are proportional. | Two planes are coincident, and the third cuts the others.
Consistent   | 4 | r = r′ = 1 | All three planes are coincident.
Inconsistent | 5 | r = 2, r′ = 3; no two rows of the coefficient matrix are proportional. | Normals are coplanar, planes intersect in pairs, and the intersecting lines form a triangular prism.
Inconsistent | 6 | r = 2, r′ = 3; two rows of the coefficient matrix are proportional, but the same two rows of the augmented matrix are not proportional. | Two parallel planes intersect a third plane.
Inconsistent | 7 | r = 1, r′ = 2; no two rows of the augmented matrix are proportional. | All planes are parallel and distinct.
Inconsistent | 8 | r = 1, r′ = 2; two rows of the augmented matrix are proportional. | Two planes are coincident, and the third is parallel.
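The rank criteria of Table 1 can be applied mechanically. The sketch below is only an illustration under stated assumptions: each plane is given as a tuple (a, b, c, d) for ax + by + cz + d = 0 with a nonzero normal, and the helper names `rank`, `proportional`, and `classify_three_planes` are hypothetical. Ranks are computed by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def proportional(u, v):
    """Two nonzero rows are proportional exactly when they span rank 1."""
    return rank([u, v]) == 1


def classify_three_planes(planes):
    """Return the case number (1-8) of Table 1 for three planes (a, b, c, d)."""
    coeff = [tuple(p[:3]) for p in planes]
    aug = [tuple(p) for p in planes]
    r, r_aug = rank(coeff), rank(aug)
    pair_aug = any(proportional(u, v) for u, v in combinations(aug, 2))
    pair_coeff = any(proportional(u, v) for u, v in combinations(coeff, 2))
    if r == 3:
        return 1
    if (r, r_aug) == (2, 2):
        return 3 if pair_aug else 2
    if (r, r_aug) == (1, 1):
        return 4
    if (r, r_aug) == (2, 3):
        return 6 if pair_coeff else 5
    if (r, r_aug) == (1, 2):
        return 8 if pair_aug else 7
    raise ValueError("each plane needs a nonzero normal vector")


if __name__ == "__main__":
    # y = 1, y = x, y = 2x: the triangular-prism configuration (case 5).
    print(classify_three_planes([(0, 1, 0, -1), (1, -1, 0, 0), (2, -1, 0, 0)]))
```

The example in the `__main__` block is the simple representative y = a, y = αx, y = βx with a = 1, α = 1, β = 2.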

Proof. Based on geometric considerations, there are no cases other than the above eight cases. In order to prove that the two corresponding classifications are equivalent, all we have to do is find the coordinate system that puts a case in its simplest equation form and then determine the ranks. Because ranks are invariant under translations and rotations and the general case can be obtained from a simple case by a finite number of translations and rotations, it is unnecessary to consider the general case. For example, for case 5, all we have to do is consider y = a, y = αx, y = βx, where a ≠ 0 and α ≠ β.

Remark. Bell [2, §45] attempts to prove the same thing, but it chooses the hard way. In this context, the emphasis should be on geometry rather than matrix theory. Bell might know some matrix theory, but he failed to master it or make good use of it.

Remark 21. The equalities given in Bell [2, p.72, l. −1] follow from the equalities given in Bell [2, p.70, l.2–l.3].

Remark 22. (Angle between lines in which a plane cuts a cone) Bell [2, p.91, (4)]

Proof. Let $N = (cu^2 + aw^2 - 2gwu,\; hw^2 + cuv - fuw - gvw,\; cv^2 + bw^2 - 2fvw)$.
Then $(l_1^2, 2l_1m_1, m_1^2) \cdot N = 0 = (l_2^2, 2l_2m_2, m_2^2) \cdot N$. Consequently,
\[
(l_1^2, 2l_1m_1, m_1^2) \times (l_2^2, 2l_2m_2, m_2^2) = 2(l_1m_2 - l_2m_1)\left(m_1m_2,\; -\frac{l_1m_2 + l_2m_1}{2},\; l_1l_2\right)
\]
should be proportional to N.
The first two equalities follow. The third equality follows from the identity
$(l_1m_2 + l_2m_1)^2 - 4(l_1l_2)(m_1m_2) = (l_1m_2 - l_2m_1)^2$.

Remark 23. All parallel plane sections of a conicoid are similar and similarly situated conics [Bell [2, p.74, Ex. 3]].

Proof. By Fine–Thompson [33, p.137, l. −3–l. −1], the centers of the resulting conics are collinear and vary with a,h,g,b,f. By Fine–Thompson [33, p.137, l.11], the axes of a resulting conic are determined by λ, which is, in turn, determined by a,b,h. Thus, the axes of every resulting conic make the same angles with the plane coordinate axes, so the conics are similarly situated. By Fine–Thompson [33, p.138, l. −17], the conics are similar.

Remark 24. (Director sphere) [Bell [2, p.103, l.16, Ex.1]]

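Solution. Let $Q = \begin{pmatrix} l_1 & m_1 & n_1 \\ l_2 & m_2 & n_2 \\ l_3 & m_3 & n_3 \end{pmatrix}$. Then $QQ^T = I$. Hence Q is orthogonal.
$\sum_{i=1}^{3}(l_ix + m_iy + n_iz)^2 = \frac{1}{a}\left(\sum_{i=1}^{3} l_i^2\right) + \frac{1}{b}\left(\sum_{i=1}^{3} m_i^2\right) + \frac{1}{c}\left(\sum_{i=1}^{3} n_i^2\right)$ [Bell [2, p.103, l. −12]].
Since Q is orthogonal, $x^2 + y^2 + z^2 = \frac{1}{a} + \frac{1}{b} + \frac{1}{c}$.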
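The two consequences of orthogonality used in the solution (the column sums of squares equal 1, and the sum of the three squared linear forms equals x² + y² + z²) can be checked on a concrete matrix. This is a minimal sketch; the particular rational orthogonal matrix and the function names are illustrative choices, not taken from Bell.

```python
from fractions import Fraction

# An illustrative orthogonal matrix with rational entries; row i is (l_i, m_i, n_i).
Q = [[Fraction(n, 3) for n in row] for row in [(1, 2, 2), (2, 1, -2), (2, -2, 1)]]


def gram(q):
    """Q times its transpose, entry by entry."""
    return [[sum(q[i][k] * q[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


def column_square_sums(q):
    """(sum of l_i^2, sum of m_i^2, sum of n_i^2)."""
    return [sum(q[i][k] ** 2 for i in range(3)) for k in range(3)]


def sum_of_squared_forms(q, point):
    """Sum over i of (l_i x + m_i y + n_i z)^2."""
    x, y, z = point
    return sum((l * x + m * y + n * z) ** 2 for l, m, n in q)


if __name__ == "__main__":
    print(gram(Q))
    print(column_square_sums(Q))
    print(sum_of_squared_forms(Q, (1, 2, 3)))
```

With both facts in hand, the left side of the displayed identity reduces to x² + y² + z² and the right side to 1/a + 1/b + 1/c.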

Remark 25. $\rho = \frac{2r_1r_2}{r_1 + r_2}$ [Bell [2, p.105, l.6]].

Proof. R is the harmonic conjugate of A with respect to P and Q [Bell [2, p.104, l. −7–l. −6]].
$\frac{AP}{AQ} = -\frac{RP}{RQ}$ [http://users.math.uoc.gr/~pamfilos/eGallery/problems/Harmonic.html]
$\Rightarrow \frac{AP}{AQ} = -\frac{AP - AR}{AQ - AR}$. Namely,
$\frac{r_1}{r_2} = -\frac{r_1 - \rho}{r_2 - \rho}$.

Remark. Since $\frac{1}{\rho} - \frac{1}{r_1} = \frac{1}{r_2} - \frac{1}{\rho}$, AP, AR, AQ are in harmonic progression [Bell [2, p.105, l.5–l.6]].

Remark 26. Bell [2, p.116, (E’)]

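Proof. Bell [2, p.116, (E’)] may follow from Cramer’s rule as said in Bell [2, p.115, l. −4]. However, since the book title of Bell [2] is Coordinate Geometry, it should also include the following geometric proof.
\[
\begin{vmatrix} x_1/a & y_1/b & z_1/c \\ x_2/a & y_2/b & z_2/c \\ x_3/a & y_3/b & z_3/c \end{vmatrix} = \pm 1
\;\Rightarrow\;
\left(\frac{x_1}{a}, \frac{y_1}{b}, \frac{z_1}{c}\right) \cdot \left[\left(\frac{x_2}{a}, \frac{y_2}{b}, \frac{z_2}{c}\right) \times \left(\frac{x_3}{a}, \frac{y_3}{b}, \frac{z_3}{c}\right)\right] = \pm 1.
\]
By Bell [2, p.115, (A’)], $\left(\frac{x_1}{a}, \frac{y_1}{b}, \frac{z_1}{c}\right) = \pm\left(\frac{x_2}{a}, \frac{y_2}{b}, \frac{z_2}{c}\right) \times \left(\frac{x_3}{a}, \frac{y_3}{b}, \frac{z_3}{c}\right)$.

Remark 27. Bell [2, p.118, Ex. 16]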

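Proof. $OP^2 = a^2 + b^2 + c^2 - \alpha^2 - \beta^2$ follows from the equality given in Bell [2, p.116, l. −15].
\[
\begin{vmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{vmatrix} = \pm abc \quad \text{[Bell [2, p.116, l.8]]}
\]
$\Rightarrow \overrightarrow{OP} \cdot (\overrightarrow{OQ} \times \overrightarrow{OR}) = \pm abc$ [Kreyszig [66, p.17, (5.14)]].
$\overrightarrow{OP} \cdot (\overrightarrow{OQ} \times \overrightarrow{OR}) = \alpha\beta(OP)\cos\varphi$, where φ is the angle between $\overrightarrow{OP}$ and the vector $\left(\frac{x_1}{a^2}, \frac{y_1}{b^2}, \frac{z_1}{c^2}\right)$ [Bell [2, p.115, l.5]].
$OP \cdot |\cos\varphi| = p$ [Bell [2, p.118, l. −10]].

Remark. If we replace (l,m,n) with $\left(\frac{x_1}{a^2}, \frac{y_1}{b^2}, \frac{z_1}{c^2}\right)$, then Bell [2, p.118, Ex. 17] follows from Bell [2, p.118, Ex. 16].

Remark 28. The locus of the tangents drawn from P, (α,β,γ) is the pair of tangent planes whose line of intersection is OP.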

Proof. Suppose the plane z = γ cuts the cone in an ellipse. Draw tangents from P to the ellipse. Let Q,R be the two tangent points. By Bell [2, p.120, l.2], OPQ is a tangent plane passing through P.

Remark 29. (Construction of three conjugate diameters of a cone) [Bell [2, p.120, l. −8–p.121, l.7]]
The conjugacy of three diameters OP, OQ, and OR of a cone is defined by the properties given in Bell [2, p.120, l. −4–l. −2]. We may construct three conjugate diameters as follows: Let a given cone be cut by a plane in a conic, say an ellipse. Let Q be a point outside of the ellipse. Draw two tangents from Q to the ellipse, touching it at points A and B. Then $\overleftrightarrow{AB}$ is the polar line of Q and OAB is the polar plane of Q. Take P = tA + (1 − t)B, where t ∈ (0,1). Let $\overleftrightarrow{RQ}$ be P’s polar line, which intersects $\overleftrightarrow{AB}$ at R. Then $\overleftrightarrow{PQ}$ is the polar line of R.

Remark 30. Bell [2, p.123, Fig. 39] is incorrect. The x-axis in the figure should have been the y-axis and vice versa. See Fine–Thompson [33, p.243, Figure].

Remark 31. By Fine–Thompson [33, p.119, (1)], Bell [2, p.125, (1)] is the condition that the lines alx+bmy = 0, alx + bmy = 0, in the plane z = k, should be conjugate diameters of the conic ax² + by² = 2k [Bell [2, p.126, l.3–l.5]].

Remark 32. (Diametral planes of a paraboloid) Any plane meets a pair of conjugate diametral planes of a paraboloid in lines which are parallel to conjugate diameters of the conic in which the plane meets the surface [Bell [2, p.126, l.5–l.8]].

Proof. Assume the paraboloid is a hyperbolic paraboloid. If the plane is perpendicular to the z-axis, the situation can be described as in Bell [2, p.123, l.5–l.9]. Otherwise, we should rotate about an asymptote of the hyperbola on the cutting plane according to the method given in Fine–Thompson [33, p.112, l.9–l.12] so that the projected lines of intersection with the conjugate diametral planes become perpendicular. Then the situation reduces to the previous special case.

Remark 33. (The delicate part of a theory should not be described roughly and carelessly) If r is the length of either semi-axis of the conic in which the plane lx + my + nz = 0 cuts the conicoid, the plane touches the cone [Bell [2, p.132, l.10–l.13]].

Proof. Suppose the conicoid ax2 + by2 + cz2 = 1 and the plane lx + my + nz = 0 intersect in an ellipse. If a semi-diameter is not a semi-axis, then the semi-diameter with length r can be directed along two intersecting lines. Consequently, the plane cuts the cone in two intersecting lines. However, if the semi-diameter is a semi-axis (the longest or shortest semi-diameter), then the semi-diameter with length r can be directed along only one line. Consequently, the plane touches the cone.

Remark 34. Bell [2, p.132, (2)] follows from the equality given in Bell [2, p.120, l.4].
The equality given in Bell [2, p.132, l. −10] follows from the equality given in Bell [2, p.120, l.2].
The equation given in Bell [2, p.134, l.17] is the same as that given in Bell [2, p.107, l.7].
The last equality given in Bell [2, p.134, (1)] follows from lα + mβ + nγ = p.
$a\alpha^2 + b\beta^2 + c\gamma^2 = \frac{p^2}{l^2/a + m^2/b + n^2/c}$ [Bell [2, p.134, l. −13]] follows from
\[
\frac{a\alpha^2 + b\beta^2 + c\gamma^2}{l^2/a + m^2/b + n^2/c} = \frac{a\alpha^2}{l^2/a} = \frac{b\beta^2}{m^2/b} = \frac{c\gamma^2}{n^2/c} = \frac{a^2\alpha^2}{l^2} = \frac{b^2\beta^2}{m^2} = \frac{c^2\gamma^2}{n^2} = \frac{(a\alpha^2 + b\beta^2 + c\gamma^2)^2}{p^2}.
\]
Bell [2, p.135, (2)] follows from the equality given in Bell [2, p.120, l.4].
The proof of Bell [2, p.135, (3)] is similar to that of Bell [2, p.132, (3)].
The expression given in Bell [2, p.135, l. −1] follows from the equality given in Bell [2, p.118, l. −6].
The equalities given in Bell [2, p.137, l.15] follow from Bell [2, p.124, (1)].
$a\alpha^2 + b\beta^2 - 2\gamma = -\frac{l^2/a + m^2/b + 2np}{n^2}$ [Bell [2, p.137, l.17]] follows from
\[
\frac{-a\alpha^2 - b\beta^2 + 2\gamma}{l^2/a + m^2/b + 2np} = \frac{a\alpha^2}{l^2/a} = \frac{b\beta^2}{m^2/b} = \frac{1}{n^2}.
\]

Remark 35. f(x,y,z) = ax² + by² + cz² + 2fyz + 2gzx + 2hxy = 1 represents a central conicoid [Bell [2, p.140, l. −1–p.141, l.1]].

Proof. By Fine–Thompson [33, p.266, (5)], (l,m,n) = (0,0,0) ⇒ (the center is (0,0,0)).

Remark 36. (Umbilics) The centres of a series of parallel plane sections of a conicoid lie upon a diameter of the conicoid, and the tangent plane at an extremity of the diameter is parallel to the plane sections [Bell [2, p.143, l.1–l.4]].

Proof. If we replace (α,β,γ) with (tα,tβ,tγ), then the equation given in Bell [2, p.107, l.8] becomes (x − tα)atα + (y − tβ)btβ + (z − tγ)ctγ = 0.
If (α,β,γ) is on the conicoid, then (x − α)aα + (y − β)bβ + (z − γ)cγ = 0 [Bell [2, p.107, l.8]] is the tangent plane at (x,y,z) = (α,β,γ).

Remark 37. (Umbilics of an ellipsoid) If P, (ξ,η,ζ) is an umbilic, the diametral plane of OP is a central circular section. Therefore the equations $\frac{x\xi}{a^2} + \frac{y\eta}{b^2} + \frac{z\zeta}{c^2} = 0$, $\frac{x}{a}\sqrt{a^2 - b^2} \pm \frac{z}{c}\sqrt{b^2 - c^2} = 0$ represent the same plane [Bell [2, p.143, l. −9–l. −6]].

Proof. By Fine–Thompson [33, p.271, (4′′)], the diametral plane of OP is $\frac{x\xi}{a^2} + \frac{y\eta}{b^2} + \frac{z\zeta}{c^2} = 0$, which is parallel to the tangent plane at (x,y,z) = (ξ,η,ζ). Since (ξ,η,ζ) is an umbilic, the plane section of the diametral plane of OP is a circle [Bell [2, p.143, l.4–l.10]].
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} - \frac{1}{b^2}(x^2 + y^2 + z^2)
= \left(\frac{1}{a^2} - \frac{1}{b^2}\right)x^2 + \left(\frac{1}{c^2} - \frac{1}{b^2}\right)z^2
= -\frac{a^2 - b^2}{a^2b^2}x^2 + \frac{b^2 - c^2}{b^2c^2}z^2.
\]

Remark 38. (Generators of a one-sheet hyperboloid) It is impossible to assign values to λ and µ so that the equations [Bell [2, p.148, (1)]] become identical with the equations [Bell [2, p.148, (2)]]. See Bell [2, p.148, l. −11–l. −9].

Proof 1. See Bell [2, p.155, l.13–l.18]. Since the intersection of the two lines consists of a single point, the two lines cannot be the same.

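Proof 2. The direction of the line given in Bell [2, p.148, (1)] is proportional to
\[
\left(\frac{1}{a}, -\frac{\lambda}{b}, \frac{1}{c}\right) \times \left(\frac{1}{a}, \frac{1}{\lambda b}, -\frac{1}{c}\right) = \left(\frac{1}{bc}\left(\lambda - \frac{1}{\lambda}\right),\; \frac{2}{ac},\; \frac{1}{ab}\left(\frac{1}{\lambda} + \lambda\right)\right).
\]
The direction of the line given in Bell [2, p.148, (2)] is proportional to
\[
\left(\frac{1}{a}, -\frac{\mu}{b}, -\frac{1}{c}\right) \times \left(\frac{1}{a}, \frac{1}{\mu b}, \frac{1}{c}\right) = \left(\frac{1}{bc}\left(\frac{1}{\mu} - \mu\right),\; -\frac{2}{ac},\; \frac{1}{ab}\left(\frac{1}{\mu} + \mu\right)\right).
\]
These two directions can be the same only when $\lambda - \frac{1}{\lambda} = \mu - \frac{1}{\mu}$ and $\lambda + \frac{1}{\lambda} = -\frac{1}{\mu} - \mu$. Namely, $\lambda = -\frac{1}{\mu}$.
Then $\frac{x}{a} + \frac{z}{c} = 0$ and $\frac{x}{a} - \frac{z}{c} = 0$. Consequently, x = z = 0, y = ±b.
If the intersection of two lines consists of no more than two points, the two lines cannot be the same.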
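The cross products in Proof 2 can be verified with exact rational arithmetic. The following is a minimal sketch under an explicit assumption about sign conventions: the λ-generator is taken as x/a + z/c = λ(1 + y/b), x/a − z/c = (1/λ)(1 − y/b), and the µ-generator as x/a − z/c = µ(1 + y/b), x/a + z/c = (1/µ)(1 − y/b); Bell's own arrangement of signs may differ. All function names are hypothetical.

```python
from fractions import Fraction


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def lambda_direction(a, b, c, lam):
    """Direction of the lambda-generator: cross product of the plane normals."""
    a, b, c, lam = map(Fraction, (a, b, c, lam))
    return cross((1 / a, -lam / b, 1 / c), (1 / a, 1 / (lam * b), -1 / c))


def mu_direction(a, b, c, mu):
    """Direction of the mu-generator under the assumed sign convention."""
    a, b, c, mu = map(Fraction, (a, b, c, mu))
    return cross((1 / a, -mu / b, -1 / c), (1 / a, 1 / (mu * b), 1 / c))


def parallel(u, v):
    """Two vectors are parallel exactly when their cross product vanishes."""
    return cross(u, v) == (0, 0, 0)


if __name__ == "__main__":
    a, b, c = 2, 3, 5
    print(lambda_direction(a, b, c, 2))
    print(parallel(lambda_direction(a, b, c, 2), mu_direction(a, b, c, Fraction(-1, 2))))
    print(parallel(lambda_direction(a, b, c, 2), mu_direction(a, b, c, 3)))
```

Under this convention the two directions are parallel only for λ = −1/µ, which is the case the proof then rules out by locating the common points.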

Remark 39. Since A1B1 does not intersect A2B2, α ≠ 0 [Bell [2, p.165, l.3]]. Since A3B3 does not intersect A1B1 ∪ A2B2, 0 ≠ β ≠ α [Bell [2, p.165, l.3]].
If no two lines given in Bell [2, p.164, l. −3] are parallel and we can follow the procedure given in Bell [2, p.164, l. −3–p.165, l.12] to produce a conicoid, then the equation given in Bell [2, p.165, l.12] represents a hyperbolic paraboloid [Bell [2, p.165, l.14]].

Proof 1. (Choose the method that can eliminate most impossibilities) Since no two lines of the three lines given in Bell [2, p.164, l. −3] are parallel, l ≠ 0, m ≠ 0. Δ = α²β²m²l²(α − β)² ≠ 0. By the Δ-column of Fine–Thompson [33, p.232, Table], we have either case (a) or case (c). Furthermore, D = 0, so we have case (c) [Fine–Thompson [33, p.232, Table]].

Proof 2. (Find out the most distinguishable difference between two options) [Bell [2, p.163, l. −3–p.164, l. −4]] proves that if three non-intersecting lines are not parallel to the same plane, then any conicoid passing through these three lines is a hyperboloid of one sheet. In fact, the converse is also true. Namely,

Lemma. Any three generators of the same system of a one-sheet hyperboloid cannot be parallel to the same plane.

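Proof of the Lemma. The direction-cosines of a λ-generator are proportional to
\[
\left(\frac{1}{a}, -\frac{\lambda}{b}, \frac{1}{c}\right) \times \left(\frac{1}{a}, \frac{1}{b\lambda}, -\frac{1}{c}\right) = \left(\frac{1}{bc}\left(\lambda - \frac{1}{\lambda}\right),\; \frac{2}{ac},\; \frac{1}{ab}\left(\lambda + \frac{1}{\lambda}\right)\right).
\]
For three generators with distinct parameters λ1, λ2, λ3,
\[
\begin{vmatrix}
\frac{1}{bc}\left(\lambda_1 - \frac{1}{\lambda_1}\right) & \frac{2}{ac} & \frac{1}{ab}\left(\lambda_1 + \frac{1}{\lambda_1}\right) \\
\frac{1}{bc}\left(\lambda_2 - \frac{1}{\lambda_2}\right) & \frac{2}{ac} & \frac{1}{ab}\left(\lambda_2 + \frac{1}{\lambda_2}\right) \\
\frac{1}{bc}\left(\lambda_3 - \frac{1}{\lambda_3}\right) & \frac{2}{ac} & \frac{1}{ab}\left(\lambda_3 + \frac{1}{\lambda_3}\right)
\end{vmatrix}
= \frac{4(\lambda_2 - \lambda_1)(\lambda_3 - \lambda_2)(\lambda_1 - \lambda_3)}{a^2b^2c^2\,\lambda_1\lambda_2\lambda_3} \neq 0.
\]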

Since the conicoid passes through three non-intersecting lines which are not parallel to the same plane [Bell [2, p.163, l. −3–l. −2]], by the Lemma, the conicoid cannot be a one-sheet hyperboloid. Since no two of the three given lines are parallel, the cases (d) and (e) given in Fine–Thompson [33, p.232, Table] are impossible.
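The Lemma amounts to saying that the determinant formed by the directions of three generators of one system does not vanish when the parameters are distinct. The sketch below checks this numerically with exact fractions. It assumes the direction ((λ − 1/λ)/(bc), 2/(ac), (λ + 1/λ)/(ab)) for a λ-generator (a sign convention that may differ from Bell's), and the function names are hypothetical.

```python
from fractions import Fraction


def generator_direction(a, b, c, lam):
    """Assumed direction of a lambda-generator of x^2/a^2 + y^2/b^2 - z^2/c^2 = 1."""
    a, b, c, lam = map(Fraction, (a, b, c, lam))
    return ((lam - 1 / lam) / (b * c), 2 / (a * c), (lam + 1 / lam) / (a * b))


def det3(r0, r1, r2):
    """Determinant of a 3x3 matrix given by its rows."""
    return (r0[0] * (r1[1] * r2[2] - r1[2] * r2[1])
            - r0[1] * (r1[0] * r2[2] - r1[2] * r2[0])
            + r0[2] * (r1[0] * r2[1] - r1[1] * r2[0]))


def coplanarity_determinant(a, b, c, lams):
    """Zero exactly when the three generator directions are parallel to one plane."""
    return det3(*(generator_direction(a, b, c, lam) for lam in lams))


if __name__ == "__main__":
    print(coplanarity_determinant(1, 1, 1, (1, 2, 3)))  # nonzero: distinct parameters
    print(coplanarity_determinant(1, 1, 1, (2, 2, 3)))  # zero: a repeated parameter
```

The determinant vanishes only when two parameters coincide, so three distinct generators of one system are never parallel to a common plane.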

Proof 3. (Use the main property of the target of proof) A hyperbolic paraboloid is our target of proof and the statement given in Bell [2, p.150, l.2–l.3] is its main property. The direction cosines of the λ-generator $z = -\frac{\beta my}{\lambda}$, $lx(\alpha - \beta) - \beta my = \lambda$ are proportional to
\[
\left(0, \frac{\beta m}{\lambda}, 1\right) \times (l(\alpha - \beta), -\beta m, 0) = \left(\beta m,\; l(\alpha - \beta),\; -\frac{\beta m}{\lambda}l(\alpha - \beta)\right).
\]
Thus, all the λ-generators are parallel to the plane lx(α − β) − βmy = 0. By the Lemma in Proof 2, the conicoid cannot be a one-sheet hyperboloid.

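Remark 40. The sections of the paraboloids $\frac{x^2}{a^2 - \lambda} + \frac{y^2}{b^2 - \lambda} = 2z - \lambda$, $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 2z$ by the plane YOZ consist of confocal parabolas [Bell [2, p.176, l. −8–l. −6]].

Proof. Since $y^2 = 4 \cdot \frac{b^2 - \lambda}{2}\left(z - \left(\frac{b^2}{2} - \frac{b^2 - \lambda}{2}\right)\right)$, the common focus of these parabolas is $\left(0, 0, \frac{b^2}{2}\right)$.

Remark 41. $\alpha^2 = \frac{(a^2 - \lambda_1)(a^2 - \lambda_2)(a^2 - \lambda_3)}{(b^2 - a^2)(c^2 - a^2)}$ [Bell [2, p.178, l. −14]].

Proof.
\[
(a^2 - \lambda)(b^2 - \lambda)(c^2 - \lambda) - \alpha^2(b^2 - \lambda)(c^2 - \lambda) - \beta^2(c^2 - \lambda)(a^2 - \lambda) - \gamma^2(a^2 - \lambda)(b^2 - \lambda) = -(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3).
\]
Let $\lambda = a^2$.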

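Remark 42. $\sum \frac{(\beta n - \gamma m)^2}{(b^2 - \lambda)(c^2 - \lambda)} = \frac{l^2}{a^2 - \lambda} + \frac{m^2}{b^2 - \lambda} + \frac{n^2}{c^2 - \lambda}$ [Bell [2, p.180, l.16]].

Proof. By the equation given in Bell [2, p.180, l.14–l.15],
\[
\left(\frac{l^2}{a^2 - \lambda} + \frac{m^2}{b^2 - \lambda} + \frac{n^2}{c^2 - \lambda}\right)\left(\frac{\alpha^2}{a^2 - \lambda} + \frac{\beta^2}{b^2 - \lambda} + \frac{\gamma^2}{c^2 - \lambda}\right) - \left(\frac{l^2}{a^2 - \lambda} + \frac{m^2}{b^2 - \lambda} + \frac{n^2}{c^2 - \lambda}\right) = \left(\frac{\alpha l}{a^2 - \lambda} + \frac{\beta m}{b^2 - \lambda} + \frac{\gamma n}{c^2 - \lambda}\right)^2.
\]
If we subtract the term on the right-hand side of the above equation from the first term on the left-hand side, we obtain the term on the left-hand side of the desired equation.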

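Remark 43. $\frac{\alpha^2}{a^2(a^2 - \lambda_1)} + \frac{\beta^2}{b^2(b^2 - \lambda_1)} + \frac{\gamma^2}{c^2(c^2 - \lambda_1)} \neq 0$ [Bell [2, p.183, l.9]].

Proof. $\frac{\alpha x}{a^2 - \lambda_1} + \frac{\beta y}{b^2 - \lambda_1} + \frac{\gamma z}{c^2 - \lambda_1} = 1$ is the tangent plane at P of the confocal $\frac{x^2}{a^2 - \lambda_1} + \frac{y^2}{b^2 - \lambda_1} + \frac{z^2}{c^2 - \lambda_1} = 1$.
Let PT touch the conicoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$ at $T = (\alpha_1, \beta_1, \gamma_1)$. Then $\frac{\alpha_1 x}{a^2} + \frac{\beta_1 y}{b^2} + \frac{\gamma_1 z}{c^2} = 1$ is the tangent plane at T of the conicoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$.
By Bell [2, p.180, l.8–l.10], $\frac{\alpha_1\alpha}{a^2(a^2 - \lambda_1)} + \frac{\beta_1\beta}{b^2(b^2 - \lambda_1)} + \frac{\gamma_1\gamma}{c^2(c^2 - \lambda_1)} = 0$. Consequently,
$\frac{\alpha^2}{a^2(a^2 - \lambda_1)} + \frac{\beta^2}{b^2(b^2 - \lambda_1)} + \frac{\gamma^2}{c^2(c^2 - \lambda_1)} \neq 0$.
This is because $(a_1, a_2, a_3) \cdot [(b_1, b_2, b_3) + t(c_1, c_2, c_3)] = 0$ is a linear equation in t.

Remark 44. The centre of the section of the conicoid by the plane QRS lies on PC [Bell [2, p.184, l.12–l.13]].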

Proof. Suppose the plane OPW cuts the ellipsoid in an ellipse E and cuts the ellipse on the plane QRS in points A, B. Then make appropriate coordinate transformations so that $\frac{\alpha x}{a^2} + \frac{\beta y}{b^2} = 1$ is the polar line AB of P = (α,β) with respect to the ellipse E, $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$. If A = (x1,y1), B = (x2,y2), then $C_1 = \left(\frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2}\right) = \left(\frac{\alpha a^2b^2}{a^2\beta^2 + b^2\alpha^2}, \frac{\beta a^2b^2}{a^2\beta^2 + b^2\alpha^2}\right)$. Hence PC1 passes through the origin, i.e., the center of the ellipse E. We may make W(ϕ) circle around the line OP, but the middle point of A(ϕ)B(ϕ) is the constant C1. Therefore, C1 is the center of the ellipse on the plane QRS.

Remark 45. By Das [24, p.334, l. −2–l. −1], the equation to the conicoid will be of the form $\frac{x^2}{\lambda_1} + \frac{y^2}{\lambda_2} + \frac{z^2}{\lambda_3} = k\left(\frac{p_1x}{\lambda_1} + \frac{p_2y}{\lambda_2} + \frac{p_3z}{\lambda_3}\right)^2$ [Bell [2, p.184, l. −4]].

Remark 46. By Table 1, the three planes pass through one line if (i) λ = 1/a², α = 0; or (ii) λ = 1/b², β = 0; or (iii) λ = 1/c², γ = 0 [Bell [2, p.189, l. −12–l. −11]].

Remark 47. The equations to the directrix corresponding to (α,0,γ) are x = ξ, z = ζ [Bell [2, p.191, l.6–l.7]].

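Proof. Let $A^2 = \frac{a^2 - b^2}{a^2b^2}$, $B^2 = \frac{b^2 - c^2}{b^2c^2}$. Then
\[
-A^2(x - \xi)^2 + B^2(z - \zeta)^2 = [B(z - \zeta) + A(x - \xi)][B(z - \zeta) - A(x - \xi)],
\]
\[
\begin{cases} B(z - \zeta) + A(x - \xi) = 0 \\ B(z - \zeta) - A(x - \xi) = 0 \end{cases}
\;\Leftrightarrow\;
\begin{cases} 2B(z - \zeta) = 0 \\ 2A(x - \xi) = 0. \end{cases}
\]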

Remark 48. The equation to the conicoid that contains the x-axis as a generator is $by^2 + cz^2 + 2fyz + 2gzx + 2hxy + 2vy + 2wz = 0$ [Bell [2, p.200, l.7–l.9]].

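Proof. Substituting $(x,y,z) = (\alpha,0,0)$ into the equation $ax^2 + by^2 + cz^2 + 2fyz + 2gzx + 2hxy + 2ux + 2vy + 2wz + d = 0$, we have $a\alpha^2 + 2u\alpha + d = 0$. Since every point of the x-axis lies on the conicoid, this holds for all $\alpha$; hence $a = u = d = 0$.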

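Remark 49. (Centers) Bell [2, §152–§153] Consider the system of equations given in Bell [2, p.216, (1), (2), & (3)]. Let
$r$ = rank of the coefficient matrix $\begin{pmatrix} a & h & g \\ h & b & f \\ g & f & c \end{pmatrix}$ and
$r'$ = rank of the augmented matrix $\begin{pmatrix} a & h & g & u \\ h & b & f & v \\ g & f & c & w \end{pmatrix}$. Let
$S = \begin{vmatrix} a & h & g & u \\ h & b & f & v \\ g & f & c & w \\ u & v & w & d \end{vmatrix}$.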

Table 2: The classification of conicoids

The set of centers    Ranks                 The type of conicoids
a point (a)           r = 3, S ≠ 0 (b)      ellipsoid or hyperboloid
a point (a)           r = 3, S = 0          cone
a line (c)            r = r′ = 2            elliptic or hyperbolic cylinder, pair of intersecting planes
a plane (d)           r = r′ = 1            pair of parallel planes
empty set ∅           r = 2, r′ = 3 (e)     paraboloid
empty set ∅           r = 1, r′ = 2 (f)     parabolic cylinder

(a) See Bell [2, §155]. (b) See Table 1. (c) See Bell [2, §157]. (d) See Bell [2, §159]. (e) See Bell [2, §156]. (f) See Bell [2, §158].
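The rank test of Table 2 can be sketched in a few lines. The following is only an illustration under our own assumptions: the function names `rank` and `classify` are hypothetical, the coefficients follow the general equation ax² + by² + cz² + 2fyz + 2gzx + 2hxy + 2ux + 2vy + 2wz + d = 0, and the condition S ≠ 0 is tested as "the 4×4 matrix has rank 4". It does not distinguish the subcases that Table 2 itself leaves merged.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Look for a pivot in this column among the rows not yet used.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def classify(a=0, b=0, c=0, f=0, g=0, h=0, u=0, v=0, w=0, d=0):
    """Classify a conicoid by the ranks r, r' and the determinant S of Table 2."""
    coeff = [[a, h, g], [h, b, f], [g, f, c]]
    augmented = [[a, h, g, u], [h, b, f, v], [g, f, c, w]]
    full = augmented + [[u, v, w, d]]
    r, r_aug = rank(coeff), rank(augmented)
    if r == 3:
        # S != 0 exactly when the 4x4 matrix has full rank.
        return "ellipsoid or hyperboloid" if rank(full) == 4 else "cone"
    table = {
        (2, 2): "elliptic or hyperbolic cylinder, or pair of intersecting planes",
        (1, 1): "pair of parallel planes",
        (2, 3): "paraboloid",
        (1, 2): "parabolic cylinder",
    }
    return table.get((r, r_aug), "not covered by Table 2")


if __name__ == "__main__":
    print(classify(a=1, b=1, w=-1))   # x^2 + y^2 - 2z = 0
    print(classify(a=1, v=-1))        # x^2 - 2y = 0
```

Exact rational arithmetic is used so that the rank is not affected by rounding; with floating-point coefficients a tolerance would be needed instead.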

Remark 1. The classification of cases for centers should be determined by the final solution of a system of equations rather than the solution process. Consider the parabola x² = 4ay with polar coordinates. Let A = (cos π/4, sin π/4), B = (cos π/3, sin π/3). The middle point for the chord along $\overrightarrow{OA}$ = (∞, π/4) ≠ (∞, π/3) = the middle point for the chord along $\overrightarrow{OB}$. We may find the algebraic solution (∞,∞) for the center [Remark 16] because the Cartesian coordinates are inadequate to tell the above difference. In this example, the equation for the second central plane is 4a = 0, which is the empty set instead of a plane. Thus, this standard type becomes an exception for the classification given in Bell [2, p.216, l.−2–p.217, l.16]. If the set of centers is empty and we allow a point involving ∞ to be its element due to a tool abuse, then all the theorems to which the false existence of elements leads will be meaningless.
Remark 2. The classification by the set of centers is not as good as the classification by Fine–Thompson [33, p.283, Table] because the latter is finer when the set of centers consists of one point.
Remark 3. The classification by Fine–Thompson [33, p.283, Table] is not as good as the above classification by ranks because the latter is finer in Fine–Thompson [33, p.283, Table, case (e)].
Remark 4. The classification by the set of centers is not as good as the above classification by ranks because the latter is finer when the set of centers is empty.
Remark 50. (Using the standard form of conicoids to simplify the proofs of theorems about symmetric matrices) Consider the system of equations given in Bell [2, p.216, (1), (2), & (3)]. Let
$r$ = rank of the coefficient matrix $\begin{pmatrix} a & h & g \\ h & b & f \\ g & f & c \end{pmatrix}$ and
$r'$ = rank of the augmented matrix $\begin{pmatrix} a & h & g & u \\ h & b & f & v \\ g & f & c & w \end{pmatrix}$. Let
$D = \begin{vmatrix} a & h & g \\ h & b & f \\ g & f & c \end{vmatrix}$ and $\Delta = \begin{vmatrix} a & h & g & u \\ h & b & f & v \\ g & f & c & w \\ u & v & w & d \end{vmatrix}$.
Bell [2, p.220, l.21–l.22] says that [(r = 2, r′ = 3) ⇒ (the conicoid is a paraboloid)]. Fine–Thompson [33, p.283, Table] says that [(D = 0, Δ ≠ 0) ⇒ (the conicoid is a paraboloid)]. Hence (r = 2, r′ = 3) ⇔ (D = 0, Δ ≠ 0).
Proof. Since the rank and the determinant of a matrix are invariant under nonsingular linear transformations, we may use standard forms of conicoids to prove this theorem. Using a complete list [Fine–Thompson [33, §237]] of standard forms of conicoids to check whether any form satisfies the property (r = 2, r′ = 3), we find that only the standard form $\frac{x^2}{a^2} \pm \frac{y^2}{b^2} = \frac{2z}{c}$ of a paraboloid satisfies this property. Consequently, paraboloids can be characterized by the property (r = 2, r′ = 3).

Remark 51. The principal planes corresponding to λ1 and λ2 pass through the vertex [Bell [2, p.222, l.4–l.5]].
Proof. $0 = l_1\frac{\partial F}{\partial x'} + m_1\frac{\partial F}{\partial y'} + n_1\frac{\partial F}{\partial z'}$ [Bell [2, p.221, l.3–l.4; p.212, l.10–l.12]]
$= 2[x'(al_1 + hm_1 + gn_1) + y'(hl_1 + bm_1 + fn_1) + z'(gl_1 + fm_1 + cn_1) + ul_1 + vm_1 + wn_1]$
$= 2[\lambda_1(l_1x' + m_1y' + n_1z') + ul_1 + vm_1 + wn_1]$ [Bell [2, p.212, l.3]].
By Bell [2, p.221, l.3–l.4; p.212, l.10–l.12], (x′,y′,z′) is on the principal plane corresponding to λ1.
Remark 52. The equations to the line of centers of circular sections are $f\left(x + \frac{u}{\lambda_1}\right) = g\left(y + \frac{v}{\lambda_1}\right) = h\left(z + \frac{w}{\lambda_1}\right)$ [Bell [2, p.229, l.9–l.10]].
Proof. Consider a conicoid of revolution. Let L be the line of centers of circular sections. By Bell [2, p.228, (i), (ii), (iii)], L is the axis of the conicoid and is perpendicular to the planes that cut the conicoid in circles.

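Remark 53. By Fine–Thompson [33, p.124, (3′)],
$\begin{vmatrix} a-\lambda & h & g \\ h & b-\lambda & f \\ g & f & c-\lambda \end{vmatrix} = 0$, $\begin{vmatrix} a_1-\lambda & h_1 & g_1 \\ h_1 & b_1-\lambda & f_1 \\ g_1 & f_1 & c_1-\lambda \end{vmatrix} = 0$
have the same roots [Bell [2, p.231, l.−9–l.−8]].
Remark 54. By Fine–Thompson [33, p.276, l.−7–l.−6],
$\begin{vmatrix} a-\lambda & h & g & u \\ h & b-\lambda & f & v \\ g & f & c-\lambda & w \\ u & v & w & d-\lambda \end{vmatrix} = 0$, $\begin{vmatrix} a_1-\lambda & h_1 & g_1 & u_1 \\ h_1 & b_1-\lambda & f_1 & v_1 \\ g_1 & f_1 & c_1-\lambda & w_1 \\ u_1 & v_1 & w_1 & d_1-\lambda \end{vmatrix} = 0$
have the same roots [Bell [2, p.233, l.3–l.4]].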

Remark 55. Bell [2, p.231, l.−3–p.233, l.10] and Fine–Thompson [33, §370] give two proofs of the fact that S is an invariant. The former proof is simpler than the latter one.
Remark 56. Their curve of intersection is a cubic curve [Bell [2, p.239, l.1]].

Proof. Suppose a plane P cuts the first cylinder D1 in conic C1 and the second cylinder D2 in conic C2. By Bell [2, p.238, l.−3], the third cylinder D3 ⊃ D1 ∩ D2. Consequently, D1 ∩ D2 is the curve C given in [Bell [2, p.239, l.3]].
P ∩ C = P ∩ (D1 ∩ D2) = (P ∩ D1) ∩ (P ∩ D2) = C1 ∩ C2 consists of three points [Bell [2, p.239, l.10]].

Remark 57. As P tends to R, Q tends to P [Bell [2, p.240, l.4]].

Proof. The S1-generators in the system to which PQ belongs are disjoint. Consequently, Q moves toward R rather than away from it as P tends to R.

Remark 58. If these planes are distinct, they meet the conicoids in two other common generators of the opposite system to OZ [Bell [2, p.242, l. 4– 3]]. − − Proof. Let A1 and A2 denote the above two planes. S1 S2 = OZ C, where C is a conic. ∪ ∪ S1 touches S2 at each point of OZ, so OZ should be counted as a conic. Since C is a conic, Ai C = ∅(i 1,2 ). Consequently, Ai C is a line contained in S1 S2. ∩ 6 ∈ { } ∩ ∪ Remark 59. Let the conics in which the xy-plane cuts the conicoids be f ax2 +2hxy+by2 +2gx+2 f y+c = ≡ 0, and f + λx2 0 [Bell [2, p.247, l.1–l.4]]. ≡ Proof. Let 2ax0x+2by0y+2cz0z+h(y0z+z0y)+g(z0x+x0z)+ f (x0y+y0x)+u(x+x0)+v(y+ y0)+w(z+z0)+d = 0 be the common tangent plane at (x,y,z)=(x0,y0,z0). One of A,B has the coordinates (0,y0,0), where y0 = 0. 6 2by0y+hy0z+ f y0x+ux+v(y+y0)+wz+d = 0 and 2b′y0y+h′y0z+ f ′y0x+u′x+v′(y+y0)+ w′z + d′ = 0 represent the same plane. Consequently, b = b′,h = h′, f = f ′,u = u′,v = v′,w = w′,d = d′.

Remark 60. m = m′ and p = p′ [Bell [2, p.247, l.15]].

Proof. The two conics given in Bell [2, p.247, l.13–l.14] have the same tangent line at A, B. One of A, B has the coordinates (y0,0), where y0 ≠ 0.
2by0y + f(y + y0) + c + my0z + pz = 0 and 2by0y + f(y + y0) + c + m′y0z + p′z = 0 represent the same tangent line, so m = m′ and p = p′.
Remark. In this proof, we use the hypothesis that the line joining the points of contact is not a common generator. The tangent line here is meaningful only when the conic is nondegenerate.
Remark 61. The curve of intersection consists of two conics which cross at A and B [Bell [2, p.247, l.−16–l.−15]].
Proof. By Bell [2, p.247, l.−17–l.−16], S1 ∩ S2 is contained in two planes P1, P2 passing through AB. S1 ∩ P1 is a conic.
Remark 62. By Bell [2, p.110, Ex. 6], the equation to the other conicoid is of the form S + λuv = 0 [Bell [2, p.248, l.16–l.18]].
Remark 63. Three conditions must be satisfied if a conicoid is to touch a given plane at a given point and therefore the general equation should contain three disposable constants [Bell [2, p.248, l.−1–p.249, l.2]].

Proof. By Bell [2, p.103, l.10], we see that three conditions must be satisfied if a conicoid is to touch a given plane at a given point. A conicoid is determined by nine conditions [Bell [2, p.196, l.−9]]. At point A, the tangent plane of the second conicoid must be the tangent plane of the first conicoid. Thus, we have used three conditions for the point A. Similarly, we have used another three conditions for the point B. Consequently, there are three conditions left for the undetermined coefficients λ, µ, ν.

Remark 64. The three conicoids S = 0, S′ = 0, S′′ = 0 intersect in eight points [Bell [2, p.252, l.−14–l.−13]].
Proof. S = 0, S′ = 0 intersect in a quartic curve (x(t),y(t),z(t)). Substituting (x,y,z) = (x(t),y(t),z(t)) into S′′, we obtain a polynomial of degree eight in t.
Remark 65. An inflectional tangent of a one-sheet hyperboloid meets the surface in infinitely many coincident points [Bell [2, p.262, l.8]].
Remark 66. By Bell [2, p.132, (2)], $\frac{a^2l^2}{a^2-r^2} + \frac{b^2m^2}{b^2-r^2} + \frac{c^2n^2}{c^2-r^2} = 0$ [Bell [2, p.267, l.−14]].
Remark 67. The common tangents are given by y = 0, $\pm\sqrt{a^2-b^2}\,x \pm \sqrt{b^2-c^2}\,z = b\sqrt{a^2-c^2}$ [Bell [2, p.269, l.15–l.16]].

Proof. Suppose a common tangent touches the circle $x^2 + z^2 = b^2$ at $(x,y,z) = (x_1,0,z_1)$ and the ellipse $\frac{z^2}{a^2} + \frac{x^2}{c^2} = 1$ at $(x,y,z) = (x_2,0,z_2)$. Then $x_1x + z_1z = b^2$ is the same as $\frac{z_2}{a^2}z + \frac{x_2}{c^2}x = 1$.
Remark 68. axξ + czζ = abc is at least a double tangent plane [Bell [2, p.269, l.22–l.23]] because two tangential points can be provided by the common tangent to the circle $x^2 + z^2 = b^2$ and the ellipse $\frac{z^2}{a^2} + \frac{x^2}{c^2} = 1$.
Remark 69. This conicoid is a paraboloid if rt ≠ s², and a parabolic cylinder if rt = s² [Bell [2, p.270, l.16–l.17]].
Proof. The rank of $\begin{pmatrix} r & s & 0 \\ s & t & 0 \\ 0 & 0 & 0 \end{pmatrix}$ is 1 if rt = s² and 2 if rt ≠ s².
$\begin{vmatrix} r & s & 0 & 0 \\ s & t & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{vmatrix} = -\begin{vmatrix} r & s \\ s & t \end{vmatrix}$. The results follow from Fine–Thompson [33, p.283, Table].

Remark 70. (Trinity of consecutive points, approach to the same point, and contact of higher order) The definition given in Bell [2, p.279, l.−12–l.−10] is a rigorous statement of the definition given in Weatherburn [124, vol.1, p.12, l.6–l.8]. The definitions of osculating plane, osculating circle, osculating sphere given in Bell [2, p.279, l.−12–l.−10; p.292, l.1–l.3; l.−3–l.−1] are based on the same idea of osculation. The second approach is heuristic, systematic and unified. In contrast, the definitions given in Kreyszig [66, p.33, Table 10.1; p.51, l.4 & l.17] look artificial. The common idea of these three definitions becomes vague in the third approach. The important step for the construction process (the circle PQR [Bell [2, p.292, l.2]] or the sphere PQRS [Bell [2, p.292, l.−2]]) in the second approach is lost in the third approach and cannot be restored by using the results in the third approach alone. In hindsight, consecutive points can be considered a brief expression for contact of higher order [Kreyszig [66, p.50, l.−6; p.51, l.6]]. The contact of second or third order can easily be generalized to the nth order.
Remark 71. Bell [2, p.270, l.−10–l.−5] follows from Fine–Thompson [33, p.137, l.−12–l.−9].
Remark 72. Because the definition given in Bell [2, p.279, l.−12–l.−10] is equivalent to the conditions given in Bell [2, p.281, (1) & (2)], the equation to the osculating plane is
$\frac{(\xi-x)f_x + (\eta-y)f_y + (\zeta-z)f_z}{x'^2 f_{xx} + \cdots + 2y'z' f_{yz} + \cdots} = \frac{(\xi-x)\phi_x + (\eta-y)\phi_y + (\zeta-z)\phi_z}{x'^2 \phi_{xx} + \cdots + 2y'z' \phi_{yz} + \cdots}$ [Bell [2, p.281, l.18–l.19]].
Remark 73. By Bell [2, p.92, l.14–l.15], Ot1 is the normal to the tangent plane to the reciprocal cone [Bell [2, p.286, l.−1–p.287, l.2]].
Remark 74. C2S1 is the locus of points equidistant from A2, A3, A4 [Bell [2, p.299, l.15–l.16]].

Proof. C1S1 = M1C1S1 ∩ M2C2S2. Similarly, C2S2 = M2C2S2 ∩ M3C3S3.
Since {S1} = M1C1S1 ∩ M2C2S2 ∩ M3C3S3, C2S2 passes through S1.
Remark 75. C1K differs from C1M1 − C2M2 by an infinitesimal of higher order [Bell [2, p.299, l.−7–l.−6]].
Proof. C1M1 − C2M2 = C1M2 − C2M2 ≈ C1C2 ≈ C1K as C2 → C1.
Remark 76. Since M2C2S2 passes through C1 and S1, C1S1C2 = M2C2S2 [Bell [2, p.300, l.−2–l.−1]].
Remark 77. t is introduced to make the equations f = 0, $f_\alpha = 0$ homogeneous [Bell [2, p.308, l.−13–l.−12]].
Remark. Read Bell [2, p.94, l.−6–l.−1].
$f(x,y,z,t,\alpha) = a(\alpha)x^2 + b(\alpha)y^2 + c(\alpha)z^2 + 2f(\alpha)yz + 2g(\alpha)zx + 2h(\alpha)xy + 2u(\alpha)xt + 2v(\alpha)yt + 2w(\alpha)zt + d(\alpha)t^2$.
$f(x,y,z,1,\alpha) = f(x,y,z,\alpha)$. α can be expressed in terms of x, y, z, t by solving the equation $f_\alpha = 0$.
Remark 78. $\delta\alpha\,\frac{\partial}{\partial\alpha} f(\alpha + \theta_1\delta\alpha, \beta + \delta\beta) + \delta\beta\,\frac{\partial}{\partial\beta} f(\alpha, \beta + \theta_2\delta\beta) = 0$ [Bell [2, p.311, l.10]].
Proof. $f(\alpha + \delta\alpha, \beta + \delta\beta) - f(\alpha,\beta)$
$= f(\alpha + \delta\alpha, \beta + \delta\beta) - f(\alpha, \beta + \delta\beta) + f(\alpha, \beta + \delta\beta) - f(\alpha,\beta)$
$= \delta\alpha\,\frac{\partial}{\partial\alpha} f(\alpha + \theta_1\delta\alpha, \beta + \delta\beta) + \delta\beta\,\frac{\partial}{\partial\beta} f(\alpha, \beta + \theta_2\delta\beta)$ [by the mean value theorem].
Remark 79. d = 0 if and only if two consecutive generators are coplanar [Bell [2, p.314, l.14–l.15]].

Proof. The generator $\frac{x-\alpha}{a} = \frac{y-\beta}{b} = \frac{z}{1}$ is parallel to u = (a,b,1).
The consecutive generator $\frac{x-(\alpha+\delta\alpha)}{a+\delta a} = \frac{y-(\beta+\delta\beta)}{b+\delta b} = \frac{z}{1}$ is parallel to v = (a + δa, b + δb, 1).
Let A = (α,β,0) and B = (α + δα, β + δβ, 0). Then $[\overrightarrow{AB},u,v] = \delta a\,\delta\beta - \delta\alpha\,\delta b$.
d = 0 ⇔ δaδβ − δαδb = 0 [Bell [2, p.314, l.5]]
⇔ $[\overrightarrow{AB},u,v] = 0$
⇔ $\overrightarrow{AB}$, u, v are coplanar [Kreyszig [66, p.18, l.6–l.8]].
Remark 80. If there are drawbacks in a classical method, all we have to do is provide ideas to improve them. If there is a gap in its proof, we simply fill the gap. In other words, a remedy rather than a thorough revamp is all we need. Introducing improvements only where they are needed makes the key to each improvement stand out.
Example 1.1. (Precision improvement of a classical method) Let S be a ruled surface [Bell [2, p.313, l.−1]]. If α′b′ − β′a′ = 0 [Bell [2, p.314, l.15; p.313, l.−5]], then S is developable [Bell [2, p.314, l.15–l.16]].
Proof. We have d = O(δt³) in Bell [2, p.314, l.12–l.13], but Bell jumps to the conclusion that d = 0. Thus, there is a gap that needs to be filled.
Let the directrix of S be y(s) = (α(s),β(s),0), z(s) = (a(s),b(s),1), and x(s,t) = y(s) + tz(s).
S is developable
⇔ $0 = |y'\ z\ z'| = \begin{vmatrix} \alpha' & \beta' & 0 \\ a & b & 1 \\ a' & b' & 0 \end{vmatrix} = \beta'a' - \alpha'b'$ [Kreyszig [66, p.169, Theorem 59.1]].

Remark. In the above proof, we use the method of differential geometry to fill the gap of a classical proof. Thus, we see the advantage of modern geometry. At the same time, we also see the concept of “consecutive” generators is useful to the intuitive understanding of a ruled surface although it is difficult to make its definition rigorous. Consequently, classical geometry and differential geometry are complementary to each other.
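The determinant used in the proof of Example 1.1 is easy to check mechanically. The sketch below is only an illustration: the helper names `det3` and `developable_det` are ours, and the two sample surfaces (a cylinder with constant direction (a,b,1), and the ruled surface x(s,t) = (s,0,0) + t(0,s,1), i.e. y = xz) are chosen by us as test cases.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def developable_det(alpha_p, beta_p, a, b, a_p, b_p):
    """|y' z z'| for y = (alpha, beta, 0), z = (a, b, 1); *_p denotes d/ds."""
    return det3([[alpha_p, beta_p, 0], [a, b, 1], [a_p, b_p, 0]])


if __name__ == "__main__":
    # The determinant always reduces to beta' a' - alpha' b'.
    for _ in range(5):
        al, be, a, b, ap, bp = (random.randint(-9, 9) for _ in range(6))
        assert developable_det(al, be, a, b, ap, bp) == be * ap - al * bp
    # Cylinder: constant direction, so a' = b' = 0 and the determinant vanishes.
    print(developable_det(1, 2, 3, 4, 0, 0))
    # y = xz: alpha = s, beta = 0, a = 0, b = s; the determinant is nonzero.
    print(developable_det(1, 0, 0, 5, 0, 1))
```

Integer inputs keep the comparison exact, so the identity is tested without any floating-point tolerance.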

Remark 81. The generators of $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$ through the point (a cos α, b sin α, 0) are the lines $\frac{x - a\cos\alpha}{a\sin\alpha} = \frac{y - b\sin\alpha}{-b\cos\alpha} = \frac{z}{\pm c}$ [Bell [2, Appendix, p.iv, l.−15–l.−6]].
Proof. $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} - 1 = \left[\left(\frac{x}{a}\cos\alpha + \frac{y}{b}\sin\alpha\right)^2 - 1\right] + \left[\left(\frac{x}{a}\sin\alpha - \frac{y}{b}\cos\alpha\right)^2 - \frac{z^2}{c^2}\right]$.
Consequently, $A = \{(x,y,z) \mid \frac{x}{a}\cos\alpha + \frac{y}{b}\sin\alpha = 1\} \cap \left(\{(x,y,z) \mid \frac{x}{a}\sin\alpha - \frac{y}{b}\cos\alpha = \frac{z}{c}\} \cup \{(x,y,z) \mid \frac{x}{a}\sin\alpha - \frac{y}{b}\cos\alpha = -\frac{z}{c}\}\right)$
$= \{(x,y,z) \mid \frac{x - a\cos\alpha}{a\sin\alpha} = \frac{y - b\sin\alpha}{-b\cos\alpha} = \frac{z}{\pm c}\}$ [Bell [2, Appendix, p.iv, l.−11–l.−6]] is on the hyperboloid.
Since all plane sections of a conicoid are conics [Bell [2, p.74, Ex. 2]], A must be the intersection of the tangent plane $\frac{x}{a}\cos\alpha + \frac{y}{b}\sin\alpha = 1$ and the hyperboloid.
Remark 82. By the formulas given in Bell [2, p.170, l.11], the shortest distance between consecutive generators of the same system of a hyperboloid or paraboloid does not vanish [Bell [2, p.314, l.−8–l.−6]].
Remark 83. A developable surface is the locus of the tangents to, or the envelope of the osculating planes of, its edge of regression [Bell [2, p.317, l.14–l.16]].

Proof. The generators of the developable surface are tangents to a curve [Bell [2, p.316, l.1–l.2]]. By Bell [2, p.310, Ex. 6], the ruled surface generated by the tangents to the curve is the envelope of the osculating plane of the curve, and has the curve for its edge of regression.
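Returning to Remark 81, the claim that both lines lie on the one-sheet hyperboloid can be checked numerically. The following is a minimal sketch under our own assumptions: the function names and the line parameter s (the common value of the three ratios in Remark 81, up to the sign of z) are ours, and the semi-axes and angle are arbitrary sample values.

```python
import math


def generator_point(a, b, c, alpha, s, sign=1):
    """Point with parameter s on the generator through (a cos(alpha), b sin(alpha), 0).

    sign = +1 or -1 selects one of the two generators (z = +c*s or z = -c*s).
    """
    x = a * math.cos(alpha) + s * a * math.sin(alpha)
    y = b * math.sin(alpha) - s * b * math.cos(alpha)
    z = sign * c * s
    return x, y, z


def hyperboloid_residual(a, b, c, point):
    """Value of x^2/a^2 + y^2/b^2 - z^2/c^2 - 1; zero on the hyperboloid."""
    x, y, z = point
    return x * x / (a * a) + y * y / (b * b) - z * z / (c * c) - 1


if __name__ == "__main__":
    a, b, c, alpha = 3.0, 2.0, 1.5, 0.7
    worst = max(
        abs(hyperboloid_residual(a, b, c, generator_point(a, b, c, alpha, s, sign)))
        for s in (-2.0, -0.5, 0.0, 1.0, 3.0)
        for sign in (1, -1)
    )
    print(worst < 1e-9)
```

The residual vanishes identically because (cos α + s sin α)² + (sin α − s cos α)² − s² = 1, which is the algebraic identity behind the decomposition in the proof.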

Remark 84. Sneddon [105, p.20, l.1–p.21, l.7] provides a rigorous proof of Bell [2, p.318, l.−9–l.−1].
Remark. When reading classical books, one should be familiar with their frequent mistakes and should know how to correct them. Otherwise, one cannot appreciate these books. In most cases, the statement of a theorem is correct, but the author fails to provide a rigorous proof. If we read Bell [2, p.318, l.−9–l.−1] alone, we do not know the proof can be improved. Likewise, if we read Sneddon [105, p.20, l.1–p.21, l.7] alone, we do not know it is an improvement of a classical theorem.
Remark 85. By Kreyszig [66, p.76, l.−2–l.−1], for some "polygonizations" of the surface, the area may be infinite and for other ones finite [Blaga [6, p.144, l.3–l.4]].
Remark 86. A given in Blaga [6, p.144, l.10] means $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ in Carmo [12, p.154, l.14]; Blaga [6, p.144, (4.11.1)] means Carmo [12, p.154, (1)].
Remark 87. By Jacobson [61, vol. 2, p.151, l.13], the scalar product on R³ is nondegenerate and the same is true for its restriction to any subspace [Blaga [6, p.145, l.1–l.2]]. By Jacobson [61, vol. 2, p.149, l.2], the matrix G is invertible [Blaga [6, p.145, l.2]]. However, it is easier to prove the last statement using O’neill [86, p.47, Lemma 1.8].
Remark 88. (Reduction from a large number of cases to a smaller one)
I. $(e_i \wedge e_j) \cdot (e_k \wedge e_l) = \begin{vmatrix} e_i \cdot e_k & e_j \cdot e_k \\ e_i \cdot e_l & e_j \cdot e_l \end{vmatrix}$ [Carmo [12, p.13, l.14]].

Proof. There are 3⁴ = 81 cases to consider. If i = j or k = l, the proof is trivial. Thus, it suffices to consider the cases when i ≠ j and k ≠ l.
If the equality holds for (i, j, k, l), then it holds for (j, i, k, l), (i, j, l, k), and (j, i, l, k) too. Thus, it suffices to consider the cases when (i, j), (k, l) ∈ {(1,2), (1,3), (2,3)}.
If (i, j) = (k, l), the proof is trivial; if the equality holds for (i, j, k, l), then it holds for (k, l, i, j). Thus, it suffices to consider the following three cases: (i, j) = (1,2), (k, l) = (1,3); (i, j) = (1,2), (k, l) = (2,3); (i, j) = (1,3), (k, l) = (2,3).
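The reduction above can be complemented by a brute-force check of all 81 cases. The sketch below is only an illustration; it assumes that ∧ is the usual cross product on R³ and that e1, e2, e3 is the standard basis (indexed 0, 1, 2 in the code), and the function names are ours.

```python
from itertools import combinations, product

# Standard basis of R^3, indexed 0, 1, 2.
E = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def identity_holds(i, j, k, l):
    """Check (e_i x e_j).(e_k x e_l) = (e_i.e_k)(e_j.e_l) - (e_j.e_k)(e_i.e_l)."""
    lhs = dot(cross(E[i], E[j]), cross(E[k], E[l]))
    rhs = dot(E[i], E[k]) * dot(E[j], E[l]) - dot(E[j], E[k]) * dot(E[i], E[l])
    return lhs == rhs


def count_verified():
    """Number of index quadruples (out of 3**4) for which the identity holds."""
    return sum(identity_holds(*idx) for idx in product(range(3), repeat=4))


def reduced_cases():
    """The cases left after the symmetry reduction: i < j, k < l, (i, j) < (k, l)."""
    pairs = list(combinations(range(3), 2))
    return list(combinations(pairs, 2))


if __name__ == "__main__":
    print(count_verified(), len(reduced_cases()))
```

The brute-force count and the three reduced cases serve different purposes: the first confirms the identity, the second confirms that the case reduction in the proof leaves exactly three checks to do by hand.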

II. $(e_i \wedge e_j) \wedge e_k = (e_i \cdot e_k)e_j - (e_j \cdot e_k)e_i$ [Carmo [12, p.14, l.3]].
Proof. There are 3³ = 27 cases to consider. If i = j, the proof is trivial. Thus, it suffices to consider the cases when i ≠ j.
If the equality holds for (i, j, k), then it holds for (j, i, k).
If k ∉ {i, j}, the proof is trivial. Thus, it suffices to consider the cases when k ∈ {i, j}. Namely, (i, j, k) ∈ {(1,2,1), (1,2,2), (1,3,1), (1,3,3), (2,3,2), (2,3,3)}.
Remark 89. The torsion remains invariant under a change of orientation [Carmo [12, p.18, l.−2–l.−1]].
Proof. I. Let β(s) = α(−s). Then
β′(s) = −α′(−s).
β′′(s) = α′′(−s). Thus,
$n_\beta(s) = n_\alpha(-s)$ [since $k_\beta(s) = |\beta''(s)| = |\alpha''(-s)| = k_\alpha(-s)$].
II. $b'_\alpha(s) = \tau_\alpha(s)n_\alpha(s)$ [Carmo [12, p.18, l.16]]. Thus,
$b'_\alpha(-s) = \tau_\alpha(-s)n_\alpha(-s)$.
III. $b_\beta(s) = -b_\alpha(-s)$ [Carmo [12, p.18, l.−3]].
$b'_\beta(s) = b'_\alpha(-s)$.
$\tau_\beta(s) = \tau_\alpha(-s)$ [by I and II].
Remark 90. Carmo [12, p.72, l.1–l.2] says that there exists a neighborhood N of r such that y(N) ⊂ M. Actually, there exists a neighborhood N of r such that y(N) ⊂ M ∩ x(U) [O’neill [86, p.151, Exercise 13]].
Remark 91. Since F(U × {0}) = x(U) [Carmo [12, p.71, l.22]], $F^{-1} \circ y(N) = (x^{-1} \circ y(N)) \times \{0\}$; $(x^{-1} \circ y(N)) \times \{0\}$ can be regarded as $x^{-1} \circ y(N) = h(N)$ [Carmo [12, p.72, l.3]].
Remark 92. There are two orthogonal normal lines in Ri [Carmo [12, p.116, l.15–l.16]].

Proof. $\left|\frac{\partial(\bar{x},\bar{y})}{\partial(u,v)}\right| \neq 0$ at pi [Carmo [12, p.117, l.5]]. Thus, in the $(\bar{x},\bar{y},\bar{z})$ coordinate system, the normal vector at pi is $\left(0,0,\frac{\partial(\bar{x},\bar{y})}{\partial(u,v)}\right)$. By Carmo [12, p.116, l.14–l.15], there is a normal vector in Ri with the form (×,×,0). These two normal vectors are orthogonal.

Remark 93. It suffices to verify that ⟨dNp(w1), w2⟩ = ⟨w1, dNp(w2)⟩ for a basis {w1, w2} of Tp(S) [Carmo [12, p.140, l.−8–l.−7]].
Proof. For the definition of a self-adjoint linear map, see Carmo [12, p.214, l.−18–l.−17].
Let v = aw1 + bw2, w = cw1 + dw2. Then
⟨dNp(v), w⟩ = ac⟨dNp(w1), w1⟩ + ad⟨dNp(w1), w2⟩ + bc⟨dNp(w2), w1⟩ + bd⟨dNp(w2), w2⟩;
⟨v, dNp(w)⟩ = ac⟨w1, dNp(w1)⟩ + ad⟨w1, dNp(w2)⟩ + bc⟨w2, dNp(w1)⟩ + bd⟨w2, dNp(w2)⟩.
⟨dNp(w1), w2⟩ = dNp(w1) · w2 = ⟨w2, dNp(w1)⟩.
Remark 94. By Carmo [12, p.155, l.−15–l.−14], this matrix is not necessarily symmetric, unless {xu, xv} is an orthonormal basis [Carmo [12, p.154, l.16–l.17]].
Remark 95. Carmo [12, p.158, Proposition 1].

Proof. A. Suppose the coordinate curves are lines of curvature so that F = 0 and f = 0 [Kreyszig [66, p.93, Theorem 27.4]].
$|w|^2 = Eu^2 + Gv^2 \geq G(u^2 + v^2)$ [We may assume E ≥ G].
By Carmo [12, p.158, l.−6], $\lim_{w \to 0}(R/|w|^2) = 0$.
B. For all x(u,v) sufficiently near an elliptic point p, d has the same sign as IIp(w) [Carmo [12, p.159, l.1–l.3]].

Proof. $d = \frac{1}{2}(eu^2 + gv^2) + R$ [Carmo [12, p.158, l.−2]]
$= \frac{1}{2}(k_1Eu^2 + k_2Gv^2) + o(|w|^2)$ [by A and Kreyszig [66, p.93, (27.11)]]
$\geq \frac{1}{2}k_2G(u^2 + v^2) + o(u^2 + v^2) > 0$.
C. x(u,v), x(ū,v̄) belong to distinct sides of Tp(S) [Carmo [12, p.159, l.6]].

Proof. a. Let x(u,v) lie in the part of surface between the normal sections $k_n = k_1 > 0$ and $k_n = \frac{k_1}{2}$.
$d = \frac{1}{2}(k_1Eu^2 + k_2Gv^2) + o(|w|^2)$
$= \frac{1}{2}|w|^2k_n + o(|w|^2)$ [Carmo [12, p.149, l.7]] ($\frac{k_1}{2} \leq k_n \leq k_1$).
b. Let x(ū,v̄) lie in the part of surface between the normal sections $k_n = k_2 < 0$ and $k_n = \frac{k_2}{2}$.
$d = \frac{1}{2}|\bar{w}|^2k_n + o(|\bar{w}|^2)$ [Carmo [12, p.149, l.7]] ($k_2 \leq k_n \leq \frac{k_2}{2}$).

Remark 96. For the proofs of Carmo [12, p.176, Theorem 1 & Theorem 2], Carmo [12, p.177, l.7–l.9] gives an 11-page reference: W. Hurewicz, Lectures on Ordinary Differential Equations, M.I.T. Press, Cambridge, Mass., 1958, Chap. 2. Everyone knows the proof of Carmo [12, p.176, Theorem 1]. For the proof of Carmo [12, p.176, Theorem 2], all one needs to read is actually less than two pages of material: Coddington–Levinson [16, p.26, l.4–p.27, l.−13].
Remark. (The theorem of mean [Coddington–Levinson [16, p.26, l.11]]) Let g(t) = f((1 − t)x + ty). Then f(y) − f(x) = g(1) − g(0) = g′(c) = ∇f((1 − c)x + cy) · (y − x).
Remark 97. By Carmo [12, p.177, l.−12; p.176, l.−1, the formula on the right], $d\tilde{\alpha}_p$ maps the unit vector of t-axis into w(0,0) [Carmo [12, p.177, l.−5–l.−4]]; by Coddington–Levinson [16, p.27, (7.12)], $d\tilde{\alpha}_p$ maps the unit vector of y-axis into itself [Carmo [12, p.177, l.−4]].
Remark 98. Because V × {0} ∩ {(x,y,t) ∈ R³; x = 0} is 1-dim instead of 0-dim and $f(x,y) = \mathrm{Proj}_y \circ \tilde{\alpha}_p^{-1}$, W may be taken sufficiently small so that $df_q \neq 0$ for all q ∈ W [Carmo [12, p.178, l.2]].
Remark 99. A first integral f : R² \ {(0,0)} → R is f(x,y) = x² + y² [Carmo [12, p.178, l.6]].

Proof. By Carmo [12, p.175, l.−3],
α : V × I → U
$(x,y,t) \mapsto ((x^2 + y^2)^{1/2}\cos t, (x^2 + y^2)^{1/2}\sin t)$. Then
$\tilde{\alpha} : (0,y,t) \mapsto (y\cos t, y\sin t)$. Thus,
$f(y\cos t, y\sin t) = \mathrm{Proj}_y(0,y,t) = y$.
Remark 100. The differentiability of w [Carmo [12, p.180, l.−3]] follows from O’neill [86, p.151, Exercise 12].
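As a numerical companion to Remark 99, one can check that x² + y² is constant along the curves t ↦ α(x,y,t) written in the proof above. This is only a sketch: the function names `flow` and `first_integral` are ours, and the formula for α is taken exactly as stated in the proof, so the check is no more general than that formula.

```python
import math


def flow(x, y, t):
    """The map alpha(x, y, t) = (r cos t, r sin t) with r = (x^2 + y^2)^(1/2)."""
    r = math.hypot(x, y)
    return r * math.cos(t), r * math.sin(t)


def first_integral(x, y):
    """Candidate first integral f(x, y) = x^2 + y^2."""
    return x * x + y * y


if __name__ == "__main__":
    x0, y0 = 0.6, -1.3
    values = [first_integral(*flow(x0, y0, t)) for t in (0.0, 0.5, 1.0, 2.5, 6.0)]
    # All values agree with x0^2 + y0^2 up to rounding.
    print(max(abs(v - first_integral(x0, y0)) for v in values) < 1e-12)
```

A function that is constant along every trajectory in this way is what the text calls a first integral; the gradient condition df ≠ 0 is not tested here.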

Remark 101. In a neighborhood of p, each trajectory (point-set) of intergral curves of r or r′ is a trajectory of coordinate curves of x [Carmo [12, p.183, l.10–l.12]]. ϕ Proof. V U¯ ←→x q = x(u,v) ! ( f1(q), f2(q))=(u,v) is a diffeomorphism. By Carmo [12, p.177, Lemma], q S : f1(q)= c = x(u,v) S : u = c and q S : f2(q)= d = x(u,v) S : v = d . { ∈ } { ∈ } { ∈ } { ∈ } 2 2 Remark. In the proof Carmo [12, p.184, Corollary 3], e(u′) + 2 f u′v′ + g(v′) = 0 [Carmo [12, p.160, (7); p.183, l. 7]] comes naturally with a parameter t. If we require that the coordinate − curves be parametrized with the same parameter t [Carmo [12, p.183, l.5–l.7]], then the neces- sary and sufficient conditions would be stronger than the condition given in Carmo [12, p.182, Theorem]. See Kreyszig [66, p.90, Theorem 26.1]. Similarly, for Carmo [12, p.185, Corol- lary 4], if we require that the coordinate curves be parametrized with the same parameter t as (u )2 uv (v )2 ′ − ′ that in E F G = 0 [Kreyszig [66, p.91, (27.4)]], then the necessary and sufficient

e f g

conditions could be found in Kreyszig [66, p.93, Theorem 27.4].

Remark 102. By comparing Carmo [12, pp. 223–224, Figure 4-4] with Kreyszig [66, p.163, Fig. 51.2], we see that hand draw is much better than computer graphics. Remark 103. For a proof of Carmo [12, p. 229, Exercise 14], see https://math.stackexchange. com/questions/150448/proof-that-angle-preserving-map-is-conformal. Remark 104. (Take one thing at a time) ∑ α(ti 1 ti) +∑(ti 1 ti) α′(ti) | | − − | − − | || ∑(ti 1 ti)supsi α′(si) ∑(ti 1 ti) α′(ti) ≤| − − | |− − − ε | || ∑(ti 1 ti)sups α′(si) α′(ti) [Carmo [12, p.475, l. 4–l. 2]]. ≤| − − i | − |≤ 2 − − Proof. The first inequality follows from Rudin [97, p.99, Theorem 5.20]; the second inequality follows from the following inequalties: 0 sup α (si) α (ti) ≤ si | ′ |−| ′ | = sup ( α (si) α (ti) ) sup α (si) α (ti) . ≤ si | ′ |−| ′ | ≤ si | ′ − ′ | Remark. Rudin [97, p.125, Theorem 6.35] provides a step-by-step proof of Carmo [12, p.475, Solution to §1-3, Exercise 8]. The former proof is clearer because Rudin considers the subinterval [xi 1,xi] first [Rudin [97, p.125, l. 6–l. 1]]. In contrast, the latter proof is confusing because − − − Carmo attempts to omit this first step and jump directly to sum over the partition of the entire interval [a,b]. Furthermore, to prove A B , we should let C = B and divide the proof into | |≤| | | | two parts: Part I. A C; Part II. A C. If we know B > 0, we should not write B as B . For ≤ − ≤ | | the expression given in Carmo [12, p.475, l. 3], we should write it as ∑(ti ti 1)sups α′(si) − − − i | |− ∑(ti ti 1) α′(ti) rather than ∑(ti 1 ti)sups α′(si) ∑(ti 1 ti) α′(ti) . − − | | | − − i | |− − − | || 21 Remark 105. [A1, ,Am;A1′ , ,Am′ ]=[A1, ,Am;E1′ , ,Em′ ]det(A1′ , ,Am′ ) [Courant–John [22, vol. 2, p.199,··· l.1–l.2]] follows··· from the··· definition of··· determinant. ··· Remark 106. By Courant–John [22, vol. 2, p.518, l.12–l.13], T(S) is Jordan-measurable [Courant–John [22, vol. 2, p.538, l.15]]. Remark 107. 
Since the Jacobian is positive, the orientation is preserved [Courant–John [22, vol. 2, p.559, l.10–l.11]].

Proof. A vector can be viewed as the tangent vector of a curve. From n to t, the direction rotates counterclockwise. Hence, from n′ to t′, the direction rotates counterclockwise [Courant–John [22, vol. 2, p.559, l.10–l.11]].
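For a linear map, the statement of Remark 107 reduces to a one-line identity: if J is the 2×2 Jacobian matrix, then the signed area of the pair (Jn, Jt) equals det(J) times the signed area of (n, t). The sketch below checks this under our own assumptions: the helper names are hypothetical, the map is taken to be linear (its own Jacobian), and a positive signed area stands for "counterclockwise from the first vector to the second".

```python
def cross2(p, q):
    """Signed area of the pair (p, q); positive if q is counterclockwise from p."""
    return p[0] * q[1] - p[1] * q[0]


def apply(J, p):
    """Image of the vector p under the 2x2 matrix J."""
    return (J[0][0] * p[0] + J[0][1] * p[1], J[1][0] * p[0] + J[1][1] * p[1])


def det2(J):
    return J[0][0] * J[1][1] - J[0][1] * J[1][0]


def preserves_sense(J, n, t):
    """True if the rotation sense from n to t equals that from J n to J t."""
    return cross2(n, t) * cross2(apply(J, n), apply(J, t)) > 0


if __name__ == "__main__":
    n, t = (1, 0), (0, 1)
    shear = [[2, 1], [1, 3]]        # positive determinant
    reflection = [[1, 0], [0, -1]]  # negative determinant
    print(preserves_sense(shear, n, t), preserves_sense(reflection, n, t))
```

For a nonlinear transformation the same identity applies to the differential at each point, which is how the sign of the Jacobian enters the remark.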

Remark 108. (How we deal with a problem that may easily cause us to commit errors)
Proving the equality given in Courant–John [22, vol. 2, p.568, l.−12–l.−11] may easily cause us to commit errors. Even worse, the situation is too confusing to allow us to locate errors. Is it because reality often goes against mathematical conventions? If so, how should we prevent an error? If we commit an error, how should we find it and then correct it?
The advantage of the method given in Courant–John [22, vol. 2, p.567, l.9–p.568, l.−8] over the direct calculation is that we need not carry out the somewhat complicated calculation of the second derivatives of u [Courant–John [22, vol. 2, p.567, l.20–l.22]]. However, proving the equality given in Courant–John [22, vol. 2, p.568, l.−12–l.−11] may easily cause one to commit errors unless one is familiar with the consequences of choosing an orientation for a curve.
Define Rn as in Courant–John [22, vol. 2, p.567, l.−6–l.−4].
Let the polar coordinates of A, B, C, and D be (r + h, θ), (r + h, θ + k), (r, θ + k), and (r, θ).
I. The first parameterization γ1 of Cn:
Let s0 < s4. Define γ1 : [s0,s4] → Cn such that γ1(s0) = γ1(s4) = A, γ1(s1) = B, γ1(s2) = C, and γ1(s3) = D.
II. The second parameterization γ2 of Cn:
Define γ2 such that γ2 : [θ,θ + k] → arc AB, γ2 : [r + h,r] → BC, γ2 : [θ + k,θ] → arc CD, γ2 : [r,r + h] → DA. The four segments of Cn are parameterized respectively as four different functions; it does not matter even if their domains intersect.
According to convention, the domain [a,b] of a curve must satisfy a < b. γ2 : [r + h,r] → BC does not comply with this convention. In fact, it reverses the sense of line segment γ1[s1,s2] [Courant–John [22, vol. 1, p.334, l.−19]]. Since the principal normal is defined as the turning direction of the tangent vector, the principal normal of a point on γ2[r + h,r] is the opposite of the principal normal of the corresponding point on γ1[s1,s2].
III.
The third parameterization γ3 of Cn:
In such a case, the key to preventing errors is to preserve the sense of the parameterization γ1 when parameterizing Cn. In order to fulfill this goal, all we have to do is reverse the orientation of each of the domains of the two segments of the parameterization γ2: γ2 : [r + h,r] → BC, γ2 : [θ + k,θ] → arc CD. The rest of the segments of the parameterization γ2 remain the same. The parameterization so formed is called γ3.
Based on the parameterization γ3, we may easily prove the equality given in Courant–John [22, vol. 2, p.568, l.−12–l.−11]. This is because a line integral is invariant under parameter changes if the orientation of the domain of a curve is preserved and also because the only segments of γ3 whose principal normals do not point away from the origin or away from the polar axis are γ3 : [θ,θ + k] → arc CD and γ3 : [r,r + h] → DA. Since $\frac{du}{dn} = \nabla u \cdot n$, the principal normals pointing toward the origin and those pointing toward the polar axis contribute the two minus signs in the formula.
If we unfortunately choose the parameterization γ2, we can still make it right as long as we pay attention to the above-mentioned remark about γ2.
Remark 109. The principal normal n is the direction that will become the direction of the tangent vector after a counterclockwise rotation by π/2, i.e., n = (dy, −dx)/ds, so $\int_C v \cdot n\,ds = \int_C v_1\,dy - v_2\,dx$ [Courant–John [22, vol. 2, p.574, l.1]].
Remark 110. (Generalized orientations)
When studying a generalized definition, we should understand its primitive version, its entire process of evolution, and the reason for the necessity of generalization. If we proceed directly toward the most general version in axiomatic approaches, its setting usually requires a stranger language and less familiar structures which may blur the essential idea, and the algorithm to check the definition usually becomes less effective. Thus, an improper approach to generalized definitions may easily lead to an empty formality and make it difficult for us to see the advantages of generalized definitions over the primitive version. Providing several non-trivial examples alone is not enough.
I. The approach given in Courant–John [22, vol. 2] aims at the origin, the insight, and the essential idea.
(1). Choosing an advantageous setting makes us easily see the entire process of evolution [Courant–John [22, vol. 2, p.575, l.−12–l.−8]].
(2). Two ordered sets of vectors in Rⁿ, (A1,⋯,An) and (B1,⋯,Bn), have the same orientation if and only if [A1,⋯,An;B1,⋯,Bn] > 0 [Courant–John [22, vol. 2, p.196, l.3–l.11]].
(3). Two ordered pairs of independent vectors on the tangent plane of a surface, (ξ,η) and (ξ′,η′), have the same orientation if and only if [ξ,η;ξ′,η′] > 0 [Courant–John [22, vol. 2, p.577, (40b)]].
(4). The orientations Ω(π∗(P)) determined by Courant–John [22, vol. 2, p.577, (40a)] from pairs of tangential vectors ξ(P), η(P) vary continuously with P if the unit normal vector ζ given by Courant–John [22, vol. 2, p.578, (40d)] depends continuously on P [Courant–John [22, vol. 2, p.578, l.−19–l.−16]]. The positive unit normal of S∗ is defined by Ω(ζ,ξ,η) = Ω(x,y,z) [Courant–John [22, vol. 2, p.579, (40i)]].
(5).
S∗ has the same orientation with respect to two ordered pairs of parameters (u,v) and (u′,v′) provided $\frac{d(u',v')}{d(u,v)} > 0$ [Courant–John [22, vol. 2, p.581, (40s)]].
(6). We use (5) [Courant–John [22, vol. 2, p.586, (41e)]] instead of the positive unit normal to generalize the concept of orientation for a surface because on a manifold in higher dimensions there is no unique normal vector or “side” of S we can associate with S [Courant–John [22, vol. 2, p.583, l.−7–p.585, l.1]].
II. In contrast, although the orientation preserving or reversing for a vector space automorphism given in Spivak [108, vol. 1, p.114, l.−5] is defined the same way as I (2), the setting for defining the concept of orientation is a non-trivial n-plane bundle [Spivak [108, vol. 1, p.116, l.−3]], a generalization of tangent bundle. The unfamiliar setting and the direct axiomatic approach to the most general version may blur the essential idea. Therefore, the approach given in Spivak [108, vol. 1, p.114, l.−8–p.118, l.−7] is definitely not suitable for beginners; providing several non-trivial examples in the end is still not good enough.
Remark 111. What the paragraph given in Courant–John [22, vol. 2, p.623, l.9–l.15] says is simply Courant–John [22, vol. 2, p.104, The Fundamental Theorem].
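The orientation test in Remark 110, I (2) can be tried out in R³. The sketch below rests on an assumption we state explicitly: we read the bracket [A1,A2,A3;B1,B2,B3] as the determinant of the matrix of scalar products Ai · Bk, which equals det(A1,A2,A3) · det(B1,B2,B3). If Courant–John's bracket is normalized differently, only the helper `bracket` would need to change. All function names are ours.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def bracket(A, B):
    """Assumed reading of [A1,A2,A3; B1,B2,B3]: determinant of (A_i . B_k)."""
    return det3([[dot(Ai, Bk) for Bk in B] for Ai in A])


def same_orientation(A, B):
    """Two ordered triples have the same orientation iff the bracket is positive."""
    return bracket(A, B) > 0


if __name__ == "__main__":
    standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    cyclic = [(0, 1, 0), (0, 0, 1), (1, 0, 0)]    # cyclic permutation
    swapped = [(0, 1, 0), (1, 0, 0), (0, 0, 1)]   # one transposition
    print(same_orientation(standard, cyclic), same_orientation(standard, swapped))
```

Because the bracket factors as a product of two determinants, the test is symmetric in the two triples and does not require either of them to be orthonormal.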

Remark 112. Since $\begin{vmatrix} x_u & x_v \\ y_u & y_v \end{vmatrix} = C \neq 0$, $u = \phi(x,y)$, $v = \psi(x,y)$ are continuous and have continuous derivatives for $(x, y)$ satisfying (4) [Courant–John [22, vol. 2, p.626, l.1–l.4]].

Remark 113. It obviously applies to the whole expression [Courant–John [22, vol. 2, p.640, l. −14]].
Proof. Applying Courant–John [22, vol. 2, p.637, (26)] to the individual term $a\chi_i$, we have $\iiint_R (a_x\chi_i + a(\chi_i)_x)\,dx\,dy\,dz = \iint_\tau a\chi_i\xi\,dA$.

Remark 114. The surface integral vanishes [Courant–John [22, vol. 2, p.640, l. −8]].
Proof. Let $R = \{P \mid PQ < 2\varepsilon_Q\}$, $\tau = \partial R$. Then a, b, c vanish outside $\{P \mid PQ \le \varepsilon_Q\}$, so they vanish on τ.

Remark 115. Since Karamcheti [64, p.64, (4.11)] is independent of coordinate systems, a rigid motion of space does not change the formula to be proved [Courant–John [22, vol. 2, p.641, l.8–l.9]].

Remark 116. By Courant–John [22, vol. 2, p.630, l. −12–l. −11], $\iint_\tau c\zeta\,dA = \iint c\,dx\,dy$ [Courant–John [22, vol. 2, p.641, l. −10, (29)]].

Remark 117. For the same reason that $(1,0,F_u)$ and $(0,1,F_v)$ are tangential vectors to the surface of Courant–John [22, vol. 2, p.633, (21)] [Courant–John [22, vol. 2, p.634, l.7]], $A^2 = (g_{x_2}, 1, 0, \dots, 0, 0), \dots, A^m = (g_{x_m}, 0, \dots, 0, 1)$ are tangential vectors to $\partial s$ [Courant–John [22, vol. 2, p.650, l.8–l.9]].

Remark 118. $\Omega(\partial s^*) = \varepsilon\,\Omega(x_2, \dots, x_m)$ [Courant–John [22, vol. 2, p.650, l. −6, (40c)]].
Proof. $\Omega(\partial s^*) = \varepsilon\,\Omega(A^2, \dots, A^m)$ [Courant–John [22, vol. 2, p.649, (39)]]
$= \varepsilon\,\Omega(x_2, \dots, x_m)$ [Courant–John [22, vol. 2, p.197, (82f)]].

Remark 119. The proof of Deo [25, p.152, Theorem 4.5.11] is not as effective and complete as that given in http://www.savory.de/maths17.htm. However, the consideration of the inequality m < 6 [Deo [25, p.152, l.12]] may improve the latter proof somewhat.

Remark 120. The definition of t in Eisenhart [27, p.29, l.11] comes from the θ in Weatherburn [124, vol. 1, p.11, l. −7].

Remark 121. The claim in Hicks [55, p.3, Example 6] follows from Rudin [97, p.196, Theorem 9.18] and Example 4 of http://www.math.ubc.ca/~feldman/m428/manifolds.pdf.

Remark 122. The claim in Hicks [55, p.3, Example 7] follows from the $C^\infty$-version [Hicks [55, p.10, Theorem] or O’Neill [86, p.161, Theorem 5.4]] of the inverse function theorem [Rudin [97, p.193, Theorem 9.17]].

Remark 123. By O’Neill [86, p.186, Exercise 5], M need not be Hausdorff [Hicks [55, p.3, l. −11]]. The concept of Hausdorffness becomes relevant to smooth manifolds in passing from a Riemannian metric to a distance function [Hicks [55, p.3, l. −10–l. −9]]; see Lee [73, p.339, l. −10–l. −9].

Remark 124. $[fX, gY] = f(Xg)Y - g(Yf)X + fg[X,Y]$ prevents the bracket mapping from being a tensor [Hicks [55, p.8, l. −1–p.9, l.1]].
Proof. If the Lie bracket were a tensor, then $[fX, gY] = fg[X,Y]$ would hold [Lee [73, p.311, Proposition 12.10; p.318, Lemma 12.24]].
However, $[x^j \frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}] = -\frac{\partial}{\partial x^i}$ [Lee [73, p.188, Proposition 8.28(d)]] and $[\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}] = 0$ [Lee [73, p.187, (8.10)]].
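
The counterexample in the proof of Remark 124 can be checked directly on a test function. The following is only a sketch: the smooth test function $h$ is a name introduced here for illustration, $x^j$ is a single fixed coordinate (no summation over $j$), and the `align*` environment assumes the amsmath package.

```latex
\begin{align*}
\Bigl[x^j \frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\Bigr] h
 &= x^j \frac{\partial^2 h}{\partial x^i\,\partial x^j}
    - \frac{\partial}{\partial x^j}\Bigl(x^j \frac{\partial h}{\partial x^i}\Bigr) \\
 &= x^j \frac{\partial^2 h}{\partial x^i\,\partial x^j}
    - \frac{\partial h}{\partial x^i}
    - x^j \frac{\partial^2 h}{\partial x^j\,\partial x^i}
  = -\frac{\partial h}{\partial x^i}.
\end{align*}
% If the bracket were tensorial, the left side would equal
% x^j \cdot 1 \cdot [\partial/\partial x^i, \partial/\partial x^j] h = 0,
% which contradicts the nonzero right side.
```

The same value is obtained from the product formula of Remark 124 with $f = x^j$, $g = 1$, $X = \partial/\partial x^i$, $Y = \partial/\partial x^j$: only the term $-g(Yf)X$ survives.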

Remark 125. $f_*[X,Y] = [f_*X, f_*Y]$ [Hicks [55, p.10, l. −15]].
Proof. $f_*[X,Y](g) = [X,Y](g \circ f)$ [Spivak [108, vol. 1, p.109, l. −5]].
$[f_*X, f_*Y](g) = (f_*X)_{f(p)}[(f_*Y)g] - (f_*Y)_{f(p)}[(f_*X)g]$
$= (f_*X)_{f(p)}\,g_1 - (f_*Y)_{f(p)}\,g_2$ [$g_1, g_2 \in C^\infty(f(m))$; $g_1 \circ f = Y(g \circ f)$, $g_2 \circ f = X(g \circ f)$]
$= X(g_1 \circ f) - Y(g_2 \circ f)$
$= XY(g \circ f) - YX(g \circ f)$.

Remark 126. Hicks [55, p.13, l.5–l.9] says the definition of submanifold is based on Spivak [108, vol. 1, p.59, Theorem 10 (2)]; see Spivak [108, vol. 1, p.63, l.3]. Hicks [55, p.13, l.11–l.13] is based on Spivak [108, vol. 1, p.63, the upper figure]. Hicks [55, p.13, l. −8–l. −5] says the production of submanifolds is also based on Spivak [108, vol. 1, p.59, Theorem 10 (2)]; we may identify f with i because f is univalent. The equivalence of the immersion given in Hicks [55, p.13, l. −13] and that given in Spivak [108, vol. 1, p.60, l. −2] follows from Jacobson [61, vol. 2, p.45, Theorem 4]. Hicks [55, p.14, l.3–l.5] follows from Dugundji [26, p.343, Theorem 3.3]. Hicks [55, p.16, l.2–l.11] is based on Spivak [108, vol. 1, p.62, the lower figure]. The writing style of Hicks [55, Sec. 1.6] is not reader-friendly.

Remark 127. Hicks [55, p.15, Problem 13].

Proof. $\sigma(t) = \sigma(0) + \frac{\sigma''(0)}{2!}[t^2 + o(t^2)]$ [Spivak [108, vol. 1, p.222, l.1 & l.3]]
$= \sigma(0) + [X,Y]_m[t^2 + o(t^2)]$ [Spivak [108, vol. 1, p.224, Theorem 16]].
Hence, $\gamma(t) = \gamma(0) + [X,Y]_m[t + o(t)]$.

Remark 128. The definition of $\bar{D}_XY$ [Hicks [55, p.18, l. −2]] originates from O’Neill [86, p.78, Lemma 5.2]. Hicks [55, p.20, (5)] follows from the formula given in Spivak [108, vol. 1, p.212, l.8]. By O’Neill [86, p.127, Theorem 1.4; p.148, Lemma 3.8] and §Hypersurfaces in n-dimensional space in https://en.wikipedia.org/wiki/Normal_(geometry), such an N always exists locally [Hicks [55, p.21, l. −9]]. The definition of conjugacy between two nonzero vectors in $M_p$ [Hicks [55, p.24, l. −14–l. −13]], where M is a hypersurface in $R^n$, originates from the definition of conjugacy between two nonzero vectors in $S_p$ [Carmo [12, p.150, l.1–l.2]], where S is a surface in $R^3$. For the details of Hicks [55, p.25, l. −9–l. −7], see Kreyszig [66, p.97, l. −10–p.98, l. −1].

Remark 129. Since L is self-adjoint, there are always (n − 1) independent directions of curvature [Hicks [55, p.24, l.20]].

Proof. $\eta = L^*$ [Hicks [55, p.22, l.7–l.8]]. By Halmos [50, p.156, Theorem 1], L is diagonalizable.

Remark 130. Hicks [55, p.26, (8)] should be corrected to $D_XY = \bar{D}_XY + \langle LX, Y\rangle N$.
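
The sign in Remark 130 can be checked in two lines. The sketch below rests on assumptions that should be compared with Hicks: $N$ is a unit normal field, $X, Y$ are tangent to $M$, $D_XY$ is the tangential component of $\bar{D}_XY$, and the Weingarten map is taken to be $L(X) = \bar{D}_XN$.

```latex
\begin{align*}
0 = X\langle Y, N\rangle
  &= \langle \bar{D}_XY, N\rangle + \langle Y, \bar{D}_XN\rangle
  \quad\Longrightarrow\quad
  \langle \bar{D}_XY, N\rangle = -\langle LX, Y\rangle, \\
\bar{D}_XY &= D_XY + \langle \bar{D}_XY, N\rangle N
            = D_XY - \langle LX, Y\rangle N .
\end{align*}
% Rearranging gives D_X Y = \bar{D}_X Y + <LX, Y> N.
```

Under the opposite convention $L(X) = -\bar{D}_XN$ the sign of the normal term flips, so the correction depends on which definition of $L$ is in force.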

Remark 131. $\bar{D}_TT \neq 0$ [Hicks [55, p.27, l. −3]].
Proof. By Hicks [55, p.27, l. −7], g is not a geodesic in $R^n$.
By Hicks [55, p.19, l. −5], $\bar{D}_TT \neq 0$.

Remark 132. Hicks [55, p.27, Corollary].

Proof. By Hicks [55, p.27, l. −4], $\bar{D}_TT \neq 0$.
By Hicks [55, p.27, l.5], $D_TT = 0$.
$\bar{D}_TT = -\langle LT, T\rangle N$ [Hicks [55, p.26, (8)]].
For every t belonging to the domain of g, $\bar{D}_TT(t)$ is normal to both $M_1$ and $M_2$. That is, $M_1$ and $M_2$ have the same tangent hyperplane.

Remark 133. Hicks [55, p.28, (9)] follows from Hicks [55, p.21, l.12]. The dependence of L(X) on N follows from Hicks [55, p.21, (7)].

Remark 134. The fact that a closed connected surface is developable iff $K \equiv 0$ [Hicks [55, p.32, l. −8–l. −7]] follows from Kreyszig [66, p.174, Theorem 5.2] and Lee [73, p.99, Proposition 5.1].

Remark 135. $K(t,s) = -\frac{c^2}{(1 - 2as + a^2s^2 + c^2s^2)^2}$ [Hicks [55, p.34, l. −5]].
Proof. I. Let $y(t,s) = f(t) + sX(t)$.
$y_t = A = (1 - a(t)s)T(t) + scN(t)$ [Hicks [55, p.33, l.12]].
$y_s = X(t)$; $y_{ss} = 0$.
$y_t \times y_s = (as - 1)N(t) - scT(t)$.
$EG - F^2 = \|y_t \times y_s\|^2$ [O’Neill [86, p.47, Lemma 1.8]]
$= (as - 1)^2 + (sc)^2$.
II. In view of the proof of Struik [109, p.190, (5-5)], the formula still holds even if $i' \cdot i' \neq 1$.
Let $x$, $i$ be $f(t)$, $X(t)$ given in Hicks [55, p.32, Fig. 2.5] respectively. Then
$(x'\, i\, i') = (f'(t)\; X(t)\; \bar{D}_TX)$
$= c\,(T\,X\,N)$ [Hicks [55, p.33, l.8]]
$= -c$ [Hicks [55, p.33, l.2]].

Remark 136. By Lee [73, p.176, Lemma 8.6], we may imbed $X_p$ in a $C^\infty$ field X about p [Hicks [55, p.36, l. −2–l. −1]]. By Hicks [55, p.24, l. −14], we may let $L = fI$ [Hicks [55, p.37, l.1]]. By Hicks [55, p.37, l.11] we may assume h > 0, so $h^2 \ge hk$ [Hicks [55, p.41, l.1–l.2]].

Remark 137. In order to prove that $P_{a,t}$ is a linear isomorphism [Hicks [55, p.57, l. −2]], by Jacobson [61, vol. 2, p.45, Theorem 4], it suffices to prove that $P_{a,t}$ is injective.

Remark 138. For a detailed proof of Hicks [55, p.70, Theorem], read Lee [73, p.339, Theorem 13.29].

Remark 139. $a_1(p) \neq 0$ [Hicks [55, p.124, l.14]].
Proof. If $a_1(p)$ were zero, then we would have
$\frac{\partial}{\partial x_1}(p) = X_p$ [Hicks [55, p.124, l.13]]
$= \sum_{i=2}^{n} a_i(p)\frac{\partial}{\partial x_i}(p)$ [Hicks [55, p.124, l.13–l.14]].
Then we would have a contradiction because $\frac{\partial}{\partial x_1}(p), \dots, \frac{\partial}{\partial x_n}(p)$ are linearly independent.

Remark 140. $F_*(\frac{\partial}{\partial u_1}) = X$ [Hicks [55, p.124, l. −1]] follows from $c_*(\frac{d}{dt}|_t) = \frac{dc}{dt}|_t$ [Spivak [108, vol. 1, p.110, l.4]].
By Hicks [55, p.124, l. −5], $F(0, a_2, \dots, a_n) = \phi^{-1}(0, a_2, \dots, a_n)$ [Hicks [55, p.125, l.1–l.2]].

Remark 141. $F_*(\frac{\partial}{\partial u_i})_{\text{origin}} = \frac{\partial}{\partial x_i}(p)$ for $i = 2, \dots, n$ [Hicks [55, p.125, l.2]].
Proof. $x = \phi$.
$F_*[(\frac{\partial}{\partial u_i})_{\text{origin}}](f) = (F_*[(\frac{\partial}{\partial u_i})_{\text{origin}}])|_{F(0)=p}(f)$
$= \frac{\partial}{\partial u_i}(f \circ F)|_0$ [Spivak [108, vol. 1, p.109, l. −5]]
$= \frac{\partial}{\partial u_i}(f \circ \phi^{-1})|_0 = \frac{\partial}{\partial u_i}(f \circ x^{-1})|_{x(p)}$
$= \frac{\partial f}{\partial x_i}|_p$ [Spivak [108, vol. 1, p.46, l.9]].

Remark 142. $F_{*0} = I$ [Hicks [55, p.125, l.1–l.2]].
Proof. $F_*(\frac{\partial}{\partial u_1})_b = X_{F(b)}$ [Hicks [55, p.124, l. −1]]
$\Rightarrow F_*(\frac{\partial}{\partial u_1})_0 = X_p$
$= \frac{\partial}{\partial x_1}(p)$ [Hicks [55, p.124, l. −16]].
The matrix of $F_*$ with respect to the basis $\{\partial_{u_1}, \dots, \partial_{u_n}\}$ of $TR^n$ and the basis $\{\partial_{x_1}, \dots, \partial_{x_n}\}$ of $M_p$ is the identity matrix [Hicks [55, p.125, l.2]].

Remark 143. // [Hicks [55, p.125, l.7]].

Proof. (a). $\forall b \in B(0,r)$, $F_*((\frac{\partial}{\partial u_1})_b) = X_{F(b)}$ [Hicks [55, p.125, l.3]].
(b). $y_i = u_i \circ F^{-1}$ [Hicks [55, p.125, l.7]].
$\frac{\partial f}{\partial y_1} \circ y^{-1} = \frac{\partial (f \circ y^{-1})}{\partial u_1}$ [Spivak [108, vol. 1, p.46, l.9]]
$= \frac{\partial (f \circ F)}{\partial u_1}$ [$F_{*0} = I$, so we may define a coordinate system $y = F^{-1}$ in a neighborhood of p]
$= F_*(\frac{\partial}{\partial u_1})(f)$ [Spivak [108, vol. 1, p.109, l. −5]]
$= (Xf)(F(b))$ [by (a)].

Remark 144. $\dim \bar{P} = k - 1$ [Hicks [55, p.125, l. −17]].
Proof. $Y_2, \dots, Y_k$ are linearly independent and belong to $\bar{P}$, so $\dim \bar{P} \ge k - 1$.
$X_1$ does not belong to $\bar{P}$, so $\dim \bar{P} < k$.

Remark 145. $\bar{P}_p \subset (V_0)_p$ [Hicks [55, p.125, l. −12]] follows from Spivak [108, vol. 1, p.107, l. −2].

Remark 146. Hicks [55, p.125, l. −6–l. −4] says the following:
“First project U onto π(U) so that we may borrow the coordinate system $z_2, \dots, z_n$ of π(U). Then construct integral curves with parameter $y_1$ and initial points in $V_0$ as in Spivak [108, vol. 1, p.205, Figure].”
This is a case in which words explain more clearly than mathematical formulas.

Remark 147. In order to prove $[\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_k}] = P$, it suffices to prove that $Y_i x_j = 0$ ($i = 1, \dots, k$; $j = k+1, \dots, n$) [Hicks [55, p.125, l. −3–l. −1]].
Proof. By Spivak [108, vol. 1, p.107, Theorem 3], $Y_i \in [\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_k}]$ ($i = 1, \dots, k$).
$Y_2, \dots, Y_k$ are linearly independent and $Y_1$ cannot be expressed as a linear combination of them.

Remark 148. $\phi_* Y_r = e_r$ [Hicks [55, p.127, l. −15]].
Proof. $\phi_* = \begin{pmatrix} 1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 1 & 0 & \cdots & 0 \end{pmatrix}$.

Remark 149. By O’Neill [86, p.161, Theorem 5.4], there is a neighborhood V of a and a map F which is a diffeomorphism of V onto $F(V) \subset U$ [Hicks [55, p.127, l. −14–l. −13]].

Remark 150. By O’Neill [86, p.160, Definition 5.3] and $F_*(e_r) = Y_r$ for $r \le k$, (3) follows [Hicks [55, p.127, l. −10–l. −9]].

Remark 151. Kreyszig [66, p.82, Theorem 24.1] follows from https://math.stackexchange.com/questions/784972/show-that-two-intersecting-curves-on-a-regular-surface-with-the-same-osculating.

Remark 152. By O’Neill [86, p.218, l. −7], any point of an ellipsoid is an elliptic point [Kreyszig [66, p.84, l. −4–l. −1]]. Any point of a hyperboloid of one sheet [Bell [2, p.99, (2); p.100, Fig. 30]] is a hyperbolic point. Proof: Replace $a_3$ in $K = \frac{1}{a_1^2 a_2^2 a_3^2 \|Z\|^4}$ [O’Neill [86, p.218, l. −7]] with $a_3 i$. By O’Neill [86, p.215, l. −2], any point of a hyperbolic paraboloid is a hyperbolic point [Kreyszig [66, p.85, l.10–l.11]].

Remark 153. The argument given in Kreyszig [66, p.138, l. −14–l. −13] is not as clear as that given in Struik [109, p.129, l.9–l.22].

Remark 154. $\Gamma_{lji}\,u^{i\prime}u^{j\prime}u^{l\prime} = \Gamma_{ilj}\,u^{i\prime}u^{j\prime}u^{l\prime}$ [Kreyszig [66, p.141, l.16]].

Proof. $\Gamma_{lji}\,u^{i\prime}u^{j\prime}u^{l\prime} = \Gamma_{lij}\,u^{j\prime}u^{i\prime}u^{l\prime}$ [rename i, j]
$= \Gamma_{lij}\,u^{i\prime}u^{j\prime}u^{l\prime}$.
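
The chain in the proof of Remark 154 can be written out with the reason for each step. This is a sketch under one labeled assumption: the last equality uses symmetry of the Christoffel symbols of the first kind in their first two indices, $\Gamma_{lij} = \Gamma_{ilj}$, which should be checked against Kreyszig's definition. Summation over repeated indices is understood.

```latex
\begin{align*}
\Gamma_{lji}\,u^{i\prime}u^{j\prime}u^{l\prime}
 &= \Gamma_{lij}\,u^{j\prime}u^{i\prime}u^{l\prime}
    && \text{(rename the dummy indices } i \leftrightarrow j\text{)} \\
 &= \Gamma_{lij}\,u^{i\prime}u^{j\prime}u^{l\prime}
    && \text{(the scalar factors commute)} \\
 &= \Gamma_{ilj}\,u^{i\prime}u^{j\prime}u^{l\prime}
    && \text{(assumed symmetry } \Gamma_{lij} = \Gamma_{ilj}\text{)}.
\end{align*}
```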

Remark 155. By Lee [73, p.331, Proposition 13.9; p.332, Theorem 13.14 (a) ⇒ (b)], we may introduce on S coordinates $\tilde{u}^1, \tilde{u}^2$ with origin P so that $\tilde{g}_{11} = \tilde{g}_{22} = 1$ and $\tilde{g}_{12} = 0$ at P [Kreyszig [66, p.148, l.9–l.10]].

Remark 156. By Rudin [99, p.292, Theorem 13.11 (a) ⇔ (b)], the part $F^*$ of $S^*$ which is bounded by $C^*$ and $C_0^*$ can be mapped conformally onto a plane annulus bounded by two concentric circles [Kreyszig [66, p.209, l. −11–l. −10]].

Remark 157. For Lee [73, p.13, Proposition 1.17(a)], see Example 6.116 of http://www.lcwangpress.com/papers/methods.pdf.

Remark 158. By solution 2 of https://math.stackexchange.com/questions/1165705/a-vector-space-over-an-infinite-field-is-not-a-finite-union-of-proper-subspaces and solution 1 of https://math.stackexchange.com/questions/381731/for-linear-subspaces-of-equal-dimension-there-exists-a-common-complement, it is possible to find a subspace Q of dimension n − k whose intersections with both P and P′ are trivial [Lee [73, p.24, l. −6–l. −5]].

Remark 159. For the solution of Lee [73, p.45, Exercise 2.27], see https://math.stackexchange.com/questions/2612420/giving-a-counterexample-for-the-extension-lemma-of-smooth-functions.

Remark 160. For the proof of Lee [73, p.59, Proposition 3.14], see Tu [117, p.371, 8.7].

Remark 161. The fundamental group is a covariant functor from Top* to Grp [Lee [73, p.75, l.9–l.10]].
Proof. Read Munkres [83, p.334, Theorem 52.4].

Remark 162. For the solution of the last problem in Lee [73, p.79, Exercise 4.4], see https://math.stackexchange.com/questions/2632426/showing-that-the-composition-of-maps-of-constant-rank-does-not-have-to-be-of-con.

Remark 163. For the solution of Lee [73, p.87, Exercise 4.24], see https://math.stackexchange.com/questions/2169483/smooth-embedding-that-isnt-an-open-or-closed-map.

Remark 164. For the solution of Lee [73, p.89, Exercise 4.27], see https://math.stackexchange.com/questions/2646999/smooth-map-that-is-a-topological-submersion-but-not-a-smooth-submersion.

Remark 165. For each $n \ge 1$, the map $q : S^n \to RP^n$ defined in Example 2.13(f) is a two-sheeted smooth covering map [Lee [73, p.92, l.2–l.4]].

Proof. $RP^n$ is a topological manifold [Lee [73, p.6, Example 1.5]].
$RP^n$ is a smooth manifold [Lee [73, p.21, Example 1.33]].
$q : S^n \to RP^n$ is a smooth map [Lee [73, p.38, Example 2.13(f)]].
$RP^n$ is connected because $R^{n+1} \setminus \{0\}$ is connected and $\pi : R^{n+1} \setminus \{0\} \to RP^n$ is continuous.
$\hat{\iota}$ and $\hat{\pi}$ in Lee [73, p.38] are local diffeomorphisms, so $q = \pi \circ \iota$ is a local diffeomorphism.
q is a smooth covering map [Lee [73, p.91, Proposition 4.33(c)]].

Remark 166. For a proof of Lee [73, p.96, Problem 4-9], see https://math.stackexchange.com/questions/3556355/showing-smooth-structure-of-a-covering-space-of-a-smooth-manifold-is-unique.

Remark 167. These are the only topology and smooth structure on S with this property [Lee [73, p.109, l. −2–l. −1]].
Proof. The only way to make $F : N \to F(N)$ a homeomorphism is to establish a one-to-one correspondence between the open sets in N and those in F(N) via the bijection F.
The only way to make $F : N \to F(N)$ a diffeomorphism is to establish a one-to-one correspondence between the coordinate charts in the maximal atlas on N and those on F(N) via the bijection F.

Remark 168. For a proof of Lee [73, p.110, Exercise 5.20], see https://math.stackexchange.com/questions/3468476/exercise-5-20-from-john-lees-ism-every-open-subset-of-a-immersed-submanifold.

Remark 169. Lee [73, p.116, Proposition 5.35] follows from Lee [73, p.70, Proposition 3.23; p.81, l. −14–l. −13; p.116, l.1–l.6].

Remark 170. Lee [73, p.117, Corollary 5.39].

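Proof. $v \in T_pS = \operatorname{Ker} d\Phi_p$ [Lee [73, p.117, Proposition 5.38]]
$\Leftrightarrow \begin{pmatrix} \frac{\partial \Phi^1}{\partial x^1} & \cdots & \frac{\partial \Phi^1}{\partial x^m} \\ \vdots & \ddots & \vdots \\ \frac{\partial \Phi^k}{\partial x^1} & \cdots & \frac{\partial \Phi^k}{\partial x^m} \end{pmatrix} \begin{pmatrix} v^1 \\ \vdots \\ v^m \end{pmatrix} = 0$
$\Leftrightarrow (\forall i,\ 1 \le i \le k)\ \begin{pmatrix} \frac{\partial \Phi^i}{\partial x^1} & \cdots & \frac{\partial \Phi^i}{\partial x^m} \end{pmatrix} \begin{pmatrix} v^1 \\ \vdots \\ v^m \end{pmatrix} = 0$
$\Leftrightarrow (\forall i,\ 1 \le i \le k)\ \bigl(\sum_{j=1}^{m} v^j \frac{\partial}{\partial x^j}\bigr)[\Phi^i] = 0$.

Remark 171. Lee [73, p.118, Exercise 5.40].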

Proof. 1. $\Phi \circ \iota$ is constant on S.
$d\Phi_p \circ d\iota_p = 0$. Namely,
$\operatorname{Im} d\iota_p \subseteq \operatorname{Ker} d\Phi_p$.
2. $\dim \operatorname{Ker} d\Phi_p = \dim T_pM - \operatorname{rank} d\Phi_p$ [Lee [73, p.627, Corollary B.21]]
$= \dim T_pS$ [Lee [73, p.105, Theorem 5.12]].

Remark 172. For a proof of Lee [73, p.118, Proposition 5.41], see https://math.stackexchange.com/questions/892306/inward-and-outward-pointing-tangent-vectors.

Remark 173. Lee [73, p.119, Exercise 5.44].

Proof. (a). $f^{-1}(0) = \partial M$ [Lee [73, p.118, l. −10]].
(b). For a vector v tangent to ∂M, $vf = 0$ [by (a)].
(c). For an inward-pointing tangent vector v, $vf > 0$ [by (a) and the coordinate representation, $(x \in M \setminus \partial M) \Rightarrow f(x) > 0$].
(d). For an outward-pointing tangent vector v, $vf < 0$ [by (c) and Lee [73, p.118, l. −14–l. −13]].
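
A model case may make (b)–(d) concrete. The sketch below is an illustration chosen here, not taken from Lee's text: $M$ is the closed upper half-space $H^n = \{x^n \ge 0\}$ and the boundary defining function is $f = x^n$.

```latex
\begin{align*}
 v &= \sum_{i=1}^{n} v^i \frac{\partial}{\partial x^i}\Big|_p,
 \qquad p \in \partial H^n, \qquad f(x) = x^n, \\
 vf &= \sum_{i=1}^{n} v^i \frac{\partial x^n}{\partial x^i}(p) = v^n .
\end{align*}
% v is tangent to the boundary  iff v^n = 0, i.e. vf = 0;
% v is inward-pointing          iff v^n > 0, i.e. vf > 0;
% v is outward-pointing         iff v^n < 0, i.e. vf < 0.
```

In boundary coordinates a general defining function reduces to this picture up to a positive factor, which is the content of the coordinate argument in (c).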

Remark. By Lee [73, p.106, Corollary 5.14], ∂M is an (n − 1)-dimensional embedded submanifold. In p.22, the proof of Theorem 67 in https://wj32.org/wp/wp-content/uploads/2012/12/Introduction-to-Smooth-Manifolds.pdf, the following statements can be proved by intuition:
$(\frac{\partial f}{\partial x^i})(p) = 0$ ($i = 1, \dots, n-1$) and we may choose $(\frac{\partial f}{\partial x^n})(p) > 0$.

Remark 174. Lee [73, p.123, Problem 5-3].

Proof. (a) follows from Lee [73, p.81, Theorem 4.12]; (b) and (c) follow from Lee [73, p.87, Proposition 4.22].

Remark 175. For a proof of Lee [73, p.123, Problem 5-4], see https://math.stackexchange.com/questions/1791853/intuition-of-immersed-versus-embedded-submanifolds.

Remark 176. For a proof of Lee [73, p.123, Problem 5-5], see https://math.stackexchange.com/questions/893692/why-is-the-irrational-winding-of-the-torus-not-locally-path-connected?noredirect=1 and https://math.stackexchange.com/questions/537125/dense-curve-on-torus-not-an-embedded-submanifold.

Remark 177. For a proof of Lee [73, p.124, Problem 5-12], see https://math.stackexchange.com/questions/3568235/problem-5-12-john-lees-smooth-manifolds-a-smooth-covering-map-restricted-to-a.

Remark 178. For a proof of Lee [73, p.124, Problem 5-13], see https://math.stackexchange.com/questions/2755709/showing-a-dense-curve-in-the-torus-is-weakly-embedded.
Remark 1. $\varphi(e^{it}, e^{is}) = (t, e^{is - i\alpha t})$.
Remark 2. (Correction for the last statement of the first answer) $\gamma(-\varepsilon, \varepsilon)$ is a path component of $U \cap S$. By Lee [73, p.608, Proposition A.43(a)], we may let this path component be $V \cap S$, where V is open in $T^2$. Then $f^{-1}(\gamma(-\varepsilon, \varepsilon)) = f^{-1}(V)$ is open in M.

Remark 179. For a proof of Lee [73, p.124, Problem 5-15], see https://math.stackexchange.com/questions/729104/immersed-submanifold.

Remark 180. Lee [73, p.124, Problem 5-18].

Proof. (a). Let $f : (-\pi, \pi) \to S^1 \setminus \{-1\}$, $t \mapsto e^{it}$. Then the domain of $f^{-1}$ cannot be extended to C.
(b). Let $g : (-\infty, 0) \times \{0\} \to R$, $(x, 0) \mapsto -1/x$. Then the domain of g cannot be extended to $R^2$.

Remark 181. For a proof of Lee [73, p.124, Problem 5-20], see https://math.stackexchange.com/questions/2525649/tangent-space-of-an-immersed-submanifold.

Remark 182. For a proof of Lee [73, p.124, Problem 5-21], see https://math.stackexchange.com/questions/896589/if-b-is-a-regular-value-of-f-f-1-infty-b-is-a-regular-domain.

Remark 183. For a proof of Lee [73, p.124, Problem 5-22], see https://math.stackexchange.com/questions/3567228/lees-smooth-manifolds-problem-5-22-proving-theorem-5-48-on-the-existence-of-de.
Remark 1. The fact that 0 is a regular value of f follows from the formula given in Lee [73, p.119, l.5].
Remark 2. The case when D is compact:
By Lee [73, p.46, Proposition 2.28], we may obtain an exhausting function $f_0 \ge 1$ for the open submanifold $M \setminus D$. Then replace $f_0$ with f. If one checks the proofs of the results for the old f when D is not restricted to the compact case, one will find that these proofs remain valid for the new f when D is compact, except that trivial changes are required.

Remark 184. A proof of Lee [73, p.148, Problem 6-14] follows from Lee [73, p.101, Proposition 5.7] and https://mathoverflow.net/questions/73334/intersection-of-non-transverse-submanifolds.

Remark 185. Lee [73, p.148, Problem 6-15, (1) ⇔ (3)].
Proof. ⇒: $T_{(p,g(p))}(M \times N) = T_p(M) \oplus T_{g(p)}(N)$ [Lee [73, p.59, Proposition 3.14]]
$= T_{(p,g(p))}S \oplus T_{(p,g(p))}(\{p\} \times N)$ [(1) ⇔ (2) and Lee [73, p.55, Proposition 3.6(d)]].
⇐:
I. Since $S \cap (\{p\} \times N)$ is a one-point set, $\pi_M|_S$ is bijective.
II. For every $p \in M$, the submanifolds S and $\{p\} \times N$ intersect transversely.
Since $\dim(\{p\} \times N) = n < \dim(M \times N)$, we must have $\dim d(\pi_M|_S)_{(p,q)}(T_{(p,q)}S) \ge m$.
Consequently, $\dim d(\pi_M|_S)_{(p,q)}(T_{(p,q)}S) = m$.
III. $d(\pi_M|_S)$ is injective locally (by I and the fact that $d(\pi_M|_S) = \pi_M|_S$ locally). Therefore,
$\dim d(\pi_M|_S)_{(p,q)}(T_{(p,q)}S) = \dim T_{(p,q)}S$.
IV. $\dim T_{(p,q)}S = m$ [by II and III].
Thus, for every $p \in M$, $d(\pi_M|_S)_{(p,q)} : T_{(p,q)}S \to T_pM$ is a vector space isomorphism.
By Lee [73, p.79, Theorem 4.5], $\pi_M|_S : S \to M$ is a local diffeomorphism.
By Lee [73, p.80, Proposition 4.6(f)], $\pi_M|_S : S \to M$ is a diffeomorphism.
The desired result follows from (1) ⇔ (2).

Remark 186. https://math.stackexchange.com/questions/902313/why-are-clopen-sets-a-union-of-connected-components proves that H is a union of components [Lee [73, p.156, l. −16]].

Remark 187. By Munkres [83, p.480, Theorem 79.2; p.481, Lemma 79.3], $\mathrm{Aut}_\pi(E)$ acts transitively on each fiber of π if and only if π is a normal covering map [Lee [73, p.163, l. −8–l. −7]].

Remark 188. For a proof of Lee [73, p.202, Problem 8-18(b)], see https://math.stackexchange.com/questions/1967049/f-related-vector-field/1967816. For a proof of Lee [73, p.202, Problem 8-18(c)], see https://math.stackexchange.com/questions/3207333/f-related-vector-fields-and-surjective-submersion.

Remark 189. $di_e(X_e) = -X_e$ in p.50, l.16–l.17 [Lee [73, p.203, Problem 8-25]] of https://wj32.org/wp/wp-content/uploads/2012/12/Introduction-to-Smooth-Manifolds.pdf follows from pp. 31–32, Theorem 96(2) [Lee [73, p.171, Problem 7-2(b)]] of the same document.

Remark 190. These definitions agree where they overlap [Lee [73, p.214, l.9]].

Proof. $[t_1 - \delta, t_1 + \delta] \times U_1 \subset W$ [Lee [73, p.214, l.4]] $\Rightarrow t_1 \in D^{(p)}$.
$(t - t_1, \theta(t_1, p)) \in (-\varepsilon, \varepsilon) \times U_0 \subset W \Rightarrow t - t_1 \in D^{(\theta(t_1, p))}$.
The desired result follows from Lee [73, p.211, (9.7)] [Lee [73, p.213, l. −20–l. −19]].

Remark 191. Lee [73, p.222, Theorem 9.24] follows from Lee [73, p.219, l. −9–l. −7].

Remark 192. By Lee [73, p.83, Theorem 4.14(c)], it suffices to show that each $p \in M$ has a neighborhood U such that $D \cap \pi^{-1}(U)$ is an embedded submanifold (possibly with boundary) in $\pi^{-1}(U) \subseteq E$ [Lee [73, p.265, l. −6–l. −4]].

Remark 193. For a proof of Lee [73, p.270, Problem 10-9], see https://math.stackexchange.com/questions/1666582/extension-of-smooth-functions-on-embedded-submanifolds and p.62, l. −12 in https://wj32.org/wp/wp-content/uploads/2012/12/Introduction-to-Smooth-Manifolds.pdf.

Remark 194. For a proof of Lee [73, p.270, Problem 10-11], see https://math.stackexchange.com/questions/783769/why-is-this-bundle-homomorphism-a-isomorphism.

Remark 195. The concrete meaning of the transition function τ∗(p) [Lee [73, p.277, l.13]] when the dual bundle is the cotangent bundle.

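Solution. By Lee [73, p.64, (3.12)],
$\tilde{v} = \begin{pmatrix} \tilde{v}^1 \\ \vdots \\ \tilde{v}^n \end{pmatrix} = \begin{pmatrix} \frac{\partial \tilde{x}^1}{\partial x^1} & \cdots & \frac{\partial \tilde{x}^1}{\partial x^n} \\ \vdots & \ddots & \vdots \\ \frac{\partial \tilde{x}^n}{\partial x^1} & \cdots & \frac{\partial \tilde{x}^n}{\partial x^n} \end{pmatrix} \begin{pmatrix} v^1 \\ \vdots \\ v^n \end{pmatrix} = Av$.
By changing $(x, v, \tilde{x}, \tilde{v})$ to $(\tilde{x}, \tilde{v}, x, v)$, we have
$v = \begin{pmatrix} v^1 \\ \vdots \\ v^n \end{pmatrix} = \begin{pmatrix} \frac{\partial x^1}{\partial \tilde{x}^1} & \cdots & \frac{\partial x^1}{\partial \tilde{x}^n} \\ \vdots & \ddots & \vdots \\ \frac{\partial x^n}{\partial \tilde{x}^1} & \cdots & \frac{\partial x^n}{\partial \tilde{x}^n} \end{pmatrix} \begin{pmatrix} \tilde{v}^1 \\ \vdots \\ \tilde{v}^n \end{pmatrix} = A^{-1}\tilde{v}$
$= \tau(p)\tilde{v}$ [Lee [73, p.277, l.4; p.253, l.6]].
The formula given in Lee [73, p.277, l.4] can be written as
$\xi = \begin{pmatrix} \xi_1 \\ \vdots \\ \xi_n \end{pmatrix} = \begin{pmatrix} \frac{\partial \tilde{x}^1}{\partial x^1} & \cdots & \frac{\partial \tilde{x}^n}{\partial x^1} \\ \vdots & \ddots & \vdots \\ \frac{\partial \tilde{x}^1}{\partial x^n} & \cdots & \frac{\partial \tilde{x}^n}{\partial x^n} \end{pmatrix} \begin{pmatrix} \tilde{\xi}_1 \\ \vdots \\ \tilde{\xi}_n \end{pmatrix}$ [Lee [73, p.277, l.4; p.253, l.3 & l.6]]
$= B\tilde{\xi}$. Then $B = A^T = (\tau^{-1}(p))^T = \tau^*(p)$.

Remark 196. Lee [73, p.299, Problem 11-2].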

Proof. Let $B$ [$B^*$, $B^{**}$, resp.] be a basis of $V$ [$V^*$, $V^{**}$, resp.]. By solution 1 in https://math.stackexchange.com/questions/35779/what-can-be-said-about-the-dual-space-of-an-infinite-dimensional-real-vector-spa and Lee [73, p.620, Exercise B.5(b)], $|B^*| > |B|$ (∗).
(a). By Dugundji [26, p.46, Definition 7.3], ∃ injection $B \to B^*$.
(b). If ∃ injection $B^* \to B$, then $|B^*| \le |B|$, which contradicts (∗).
(c). By (∗), $|B^{**}| > |B|$ (∗∗).
If ∃ injection $B^{**} \to B$, then $|B^{**}| \le |B|$, which contradicts (∗∗).
Remark. We want to prove the following two statements in https://math.stackexchange.com/questions/35779/what-can-be-said-about-the-dual-space-of-an-infinite-dimensional-real-vector-spa:
(1). $|V| = d|F|$.
(2). $\{f_c \mid c \in F \setminus \{0\}\}$ is linearly independent.
Proof. (1). Every $v \in V$ can be represented as a finite subset of $B \times F$. Then (1) follows from Dugundji [26, p.54, 8.8].
(2) follows from Jacobson [61, vol. 2, p.24, Exercise 1].

Remark 197. For the detailed proofs of Lee [73, p.322, Proposition 12.32 (b), (c) & (d)], read p.2, l. −14–p.3, l.4; p.4, l.1–p.6, l. −12; p.7, l. −5–l. −1 in https://webspace.science.uu.nl/~ban00101/lecnotes/lieder.pdf.

Remark 198. $d(\theta_{t_0})_p^* : T^k(T^*_{\theta_{t_0}(p)}M) \to T^k(T^*_pM)$ is linear and smooth (hence continuous), so
$\frac{d}{ds}\big|_{s=0}\, d(\theta_{t_0})_p^*\, d(\theta_s)^*_{\theta_{t_0}(p)}(A_{\theta_s(\theta_{t_0}(p))}) = d(\theta_{t_0})_p^*\, \frac{d}{ds}\big|_{s=0}\, d(\theta_s)^*_{\theta_{t_0}(p)}(A_{\theta_s(\theta_{t_0}(p))})$ [Lee [73, p.324, l. −16–l. −15]].

Remark 199. For a proof of Lee [73, p.331, Exercise 13.10], see https://math.stackexchange.com/questions/3409110/on-the-pullback-of-a-riemannian-metric.

Remark 200. For a proof of Lee [73, p.346, Problem 13-17], see https://www.ams.org/journals/proc/1961-012-06/S0002-9939-1961-0133785-8/S0002-9939-1961-0133785-8.pdf.
Remark 1. For the information about conformal metrics, read https://math.stackexchange.com/questions/1677824/on-conformal-metrics-notation. Conformal metrics $d, d'$ generate the same topology.
Remark 2. For all $x, y \in M$, if $S(x, r(x)) \cap S(y, r(y)) \neq \emptyset$, then $|r(x) - r(y)| = d(x,y)$; otherwise, $|r(x) - r(y)| < d(x,y)$.
Remark 3. ω(x) can be constructed by using partitions of unity.
That is, ω(x) is constructed by the formula given in Lee [73, p.46, l. −10] except that j [the coefficient of $\psi_j$] should be replaced with the maximum of $r^{-1}$ on an appropriate compact set.
Remark 4. $(M, d')$ is complete.

Proof. Let $\{x_n\}$ be a Cauchy sequence in $(M, d')$.
We may assume $\{x_n\} \subset S'(x_1, 1/3)$. Since $S'(x_1, 1/3) \subset S(x_1, r(x_1)/2)$ and the latter set is relatively compact in $(M, d)$, $\exists x \in \bar{S}(x_1, r(x_1)/2) : x_n \to x$. Note that $x \in \bar{S}'(x_1, 1/3)$.

Remark 201. Lee [73, p.361, Lemma 14.16 (b)] follows from Lee [73, p.353, Proposition 14.8; p.355, (14.3) & l. −4–l. −1]; Lee [73, p.361, Lemma 14.16 (c)] follows from Lee [73, p.361, Lemma 14.16 (b); p.285, Proposition 11.25].

Remark 202. d satisfies (iv) [Lee [73, p.366, l.6]]. The meaning of this statement: In terms of a coordinate chart, the meaning in the generalized sense and that in the narrow sense should be consistent. The proof of the statement follows from Lee [73, p.363, (14.19); p.278, Proposition 11.11(d)].

Remark 203. We may use Lee [73, p.322, Proposition 12.32(c)] to prove Lee [73, p.372, Exercise 14.34]. $Vf = \mathcal{L}_Vf$ [the last identity in Lee [73, p.373, l.3]] follows from Lee [73, p.322, Proposition 12.32(a)].

Remark 204. For a solution of Lee [73, p.374, Problem 14-2], see https://math.stackexchange.com/questions/1017259/for-which-k-n-the-k-covector-is-decomposable-14-2-from-lee. Note that in answer 1 the argument is incorrect if $2 \le k \le \lfloor \frac{n}{2} \rfloor$ and k is odd.

Remark 205. By Lee [73, p.163, l. −8–l. −7], $\mathrm{Aut}_\pi(E)$ acts transitively on each fiber of π [Lee [73, p.392, l. −10–l. −9]].

Remark 206. By Munkres [83, p.345, Theorem 54.4] and Lee [73, p.392, Theorem 15.36; p.397, Problem 15-3], $RP^n$ is orientable if and only if n is odd [Lee [73, p.393, l.4–l.5]].

Remark 207. By Munkres [83, p.488, Theorem 81.2] and Lee [73, p.163, l. −6], $\pi_1(M, p)/H$ is isomorphic to $\mathrm{Aut}_{\hat{\pi}}(\hat{M})$ [Lee [73, p.397, l.6–l.7]].

Remark 208. For a proof of Lee [73, p.397, Problem 15-3], see https://math.stackexchange.com/questions/1560981/the-antipodal-map-is-orientation-preserving-iff-n-is-odd.

Remark 209. For a proof of Lee [73, p.397, Problem 15-4], see https://math.stackexchange.com/questions/4163920/prove-that-flow-map-theta-t-is-orientation-preserving.

Remark 210. $U \cap \partial W_i \subset \partial B_i$ [Lee [73, p.409, l.5]] can be proved using the concept of limit point.

Remark 211. Without the supplement of https://math.stackexchange.com/questions/2901181/left-invariance-of-differential-forms-vs-left-invariance-of-vector-fields, the discussion given in Lee [73, p.410, l.17–p.411, l.13] is essentially incomplete.

Remark 212. For a proof of Lee [73, p.415, Exercise 16.18], see https://math.stackexchange.com/questions/2855110/are-the-spaces-mathbbhn-and-overline-mathbbrn-homeomorphic.

Remark 213. For a proof of Lee [73, p.423, Exercise 16.31], see https://math.stackexchange.com/questions/4163799/divergence-operator-on-an-oriented-riemannian-manifold-does-not-depend-on-the-ch.

Remark 214. For a proof of Lee [73, p.430, Proposition 16.38(c)], see https://math.stackexchange.com/questions/623511/pullback-of-density.

Remark 215. Lee [73, p.433, Exercise 16.47] follows from Lee [73, p.389, Exercise 15.30; p.433, Exercise 16.46(a)].

Remark 216. By Lee [73, p.381, l.18] and https://math.stackexchange.com/questions/903440/why-is-the-integral-of-any-orientation-form-over-mathbbs1-non-zero, any orientation form on $S^1$ has nonzero integral [Lee [73, p.450, l.11–l.12]].

Remark 217. By Munkres [83, p.345, Theorem 54.5], $\mathrm{Hom}(\pi_1(S^1, 1), R)$ is 1-dimensional [Lee [73, p.450, l.13]].

Remark 218. The groups on both sides are trivial when p > 1 [Lee [73, p.450, l. −9–l. −8]].

Proof. Let $k \in \{p - 1, p\}$.
$H^k_{dR}(U) \oplus H^k_{dR}(V) \simeq H^k_{dR}(R^n) \oplus H^k_{dR}(R^n)$ [since U, V are diffeomorphic to $R^n$; by Lee [73, p.446, Corollary 17.12]]
$= 0$ [by Lee [73, p.447, Corollary 17.16]].
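
For orientation, the “groups on both sides” in Remark 218 are presumably the outer terms of the following segment of the Mayer–Vietoris sequence for $S^n = U \cup V$. The sketch assumes the standard form of the sequence; the label $\delta$ for the connecting homomorphism is introduced here for illustration.

```latex
\[
H^{p-1}_{dR}(U) \oplus H^{p-1}_{dR}(V)
 \longrightarrow H^{p-1}_{dR}(U \cap V)
 \xrightarrow{\ \delta\ } H^{p}_{dR}(S^n)
 \longrightarrow H^{p}_{dR}(U) \oplus H^{p}_{dR}(V)
\]
% For p > 1 both k = p - 1 and k = p satisfy k >= 1, so the two outer
% terms vanish and exactness makes \delta an isomorphism.
```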

Remark 219. $H^p_{dR}(S^n) \simeq H^{p-1}_{dR}(U \cap V)$ [Lee [73, p.450, l. −8]].
Proof. $0 \xrightarrow{F_1} H^{p-1}_{dR}(U \cap V) \xrightarrow{F_2} H^p_{dR}(S^n) \xrightarrow{F_3} 0$.
$0 = \operatorname{Im} F_1 = \ker F_2$; $\operatorname{Im} F_2 = \ker F_3 = H^p_{dR}(S^n)$.
$H^p_{dR}(S^n) = \operatorname{Im} F_2$
$\simeq H^{p-1}_{dR}(U \cap V)/\ker F_2$ [by the vector space version of Jacobson [61, vol. 1, p.44, l. −11–l. −8]]
$\simeq H^{p-1}_{dR}(U \cap V)$ [since $\ker F_2 = 0$].
Remark. The vector space version of Jacobson [61, vol. 1, p.44, l. −11–l. −8] is often written in the form of Jacobson [61, vol. 2, p.45, Theorem 4], perhaps because the algebra textbook authors all copy a theorem’s version from van der Waerden’s Modern Algebra [van der Waerden [120, vol. 2, p.105, l. −9–l. −8]] or because the algebraists never encounter the Mayer–Vietoris theorem [Lee [73, p.449, Theorem 17.20]].

Remark 220. For a proof of Lee [73, p.450, Exercise 17.22], see https://math.stackexchange.com/questions/43056/exact-differential-n-forms.

Remark 221. Lee [73, p.453, Theorem 17.28].

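Proof. Let $\Phi : Z^n_c(R^n) \to R$, $\omega \mapsto \int_{R^n} \omega$.
If $\omega \in B^n_c(R^n)$, then $\exists \eta \in \Omega^{n-1}_c(R^n) : \omega = d\eta$.
Take balls $B, B'$ : $B' \supset \bar{B} \supset B \supset \operatorname{supp} \eta$. Then
$\int_{R^n} \omega = \int_{R^n} d\eta = \int_{\bar{B}'} d\eta = \int_{\partial \bar{B}'} \eta = 0$.
Thus, Φ induces $\bar{\Phi} : H^n_c(R^n) \to R$, $[\omega] \mapsto \int_{R^n} \omega$.
$\bar{\Phi}$ is well-defined, linear and surjective.
Lee [73, p.453, Theorem 17.28] follows from Lee [73, p.452, Lemma 17.27].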

Remark 222. For a proof of Lee [73, p.469, Lemma 18.2], see https://math.stackexchange.com/questions/2941050/boundary-operator-of-n-chain-composed-with-itself-is-zero.

Remark 223. For a proof of Lee [73, p.470, Proposition 18.3(a)], see Hatcher [52, p.110, Proposition 2.8]. For a proof of Lee [73, p.470, Proposition 18.3(b)], see Hatcher [52, p.109, Proposition 2.6 & Proposition 2.7] and p.103, Proposition 6.6 in http://math.hunter.cuny.edu/thompson/topology_notes/chapter%20six.pdf. In view of p.103, l. −9–l. −8 in http://math.hunter.cuny.edu/thompson/topology_notes/chapter%20six.pdf, we see that the formulation of Lee [73, p.470, Proposition 18.3(b)] is incorrect. Lee [73, p.470, Proposition 18.3(c)] is Hatcher [52, p.111, Corollary 2.11]. By Hatcher [52, p.111, l. −5], Hatcher [52, p.111, Corollary 2.11] follows from Hatcher [52, p.111, Theorem 2.10]. The statement given in Lee [73, p.471, l.6–l.8] means $(i_0)_* = (i_1)_*$, whose proof is similar to the one given in Hatcher [52, p.113, l.1–l.4]. The map $h : C_p(M) \to C_{p+1}(M \times I)$ satisfying Lee [73, p.471, (18.2)] may be defined as follows:
$h(\sigma) = \sum_i (-1)^i (\sigma \times \mathbb{1})|[v_0, \dots, v_i, w_i, \dots, w_n]$. Then
$P = F \circ h$ [Hatcher [52, p.112, l.21]] and
$f_* = (F \circ i_0)_* = F_* \circ (i_0)_* = F_* \circ (i_1)_* = (F \circ i_1)_* = g_*$ [Lee [73, p.445, l. −8]; Hatcher [52, p.113, l.4]].

Remark 224. The meaning of “dimensional reason” [Lee [73, p.493, l. −11]] is given by https://math.stackexchange.com/questions/2382017/1-form-criterion-for-smooth-distribution. By Lee [73, p.277, l.12], the rough sections $E_i$ ($i = 1, \dots, n$) [Lee [73, p.493, l. −9]] are smooth. The local frame criterion [Lee [73, p.493, l. −7]] refers to Lee [73, p.258, Proposition 10.19].

Remark 225. By Lee [73, p.258, Example 10.18], we may choose a smooth local frame $X_1, \dots, X_k$ for D [Lee [73, p.497, l.13]].

Remark 226. “V,W have this property” [Lee [73, p.499, l.7]] means “V,W are π-related to $\frac{\partial}{\partial x}, \frac{\partial}{\partial x}$”.

Remark 227. By Lee [73, p.496, l. −8–l. −7], w = c is an integral manifold of D [Lee [73, p.499, l. −3]].

Remark 228. By Lee [73, p.608, Proposition A.43(a)], each of which is open in H [Lee [73, p.500, l.8]].

Remark 229. By Lee [73, p.112, Theorem 5.29], $F|_B$ is smooth from B into $S \cap H$ [Lee [73, p.501, l.10]].

Remark 230. Lee [73, p.610, Exercise A.51].

Proof. I. The case when M is a second-countable Hausdorff space: Dugundji [26, p.174, Ex. 3; Theorem 6.3; p.228, Definition 3.1; p.229, Theorem 3.2].
II. The case when M is a metric space: Munkres [83, p.179, Theorem 28.2].

Remark 231. Lee [73, p.611, Exercise A.54].

Proof. (a). Bourbaki [10, part 1, p.103, Corollary 2; p.104, Proposition 6].
(b). Bourbaki [10, part 1, p.101, Theorem 1, b) ⇒ a); p.104, Proposition 6].
(c). Bourbaki [10, part 1, p.98, Proposition 2, c) ⇒ a); p.104, Proposition 6].
(d). Bourbaki [10, part 1, p.99, Proposition 5 d); p.98, Proposition 2, b) ⇒ a); p.104, Proposition 6].
(e). Bourbaki [10, part 1, p.98, Proposition 3 a); p.104, Proposition 6].

Remark 232. A has an empty interior in A, which is a contradiction [Lee [73, p.612, l.9]].

Proof. For the subspace topology on A, $\operatorname{Int}_A A = A \neq \emptyset$. Thus, we reach a contradiction.

Remark 233. For a proof of Lee [73, p.614, Exercise A.67], see https://math.stackexchange.com/questions/139195/show-that-a-star-like-region-is-simply-connected.

Remark 234. For a proof of Lee [73, p.615, Exercise A.75], see https://math.stackexchange.com/questions/1074341/covering-map-is-proper-iff-it-is-finite-sheeted.

Remark 235. For a proof of Lee [73, p.616, Proposition A.79], see Munkres [83, p.342, Lemma 54.1; p.343, Lemma 54.2; p.344, Theorem 54.3] and https://math.stackexchange.com/questions/755355/show-that-if-b-is-simply-connected-then-p-is-a-homeomorphism.

Remark 236. For a proof of Lee [73, p.620, Exercise B.5(a)], see Proposition 1 in https://planetmath.org/ZornsLemmaAndBasesForVectorSpaces; for a proof of Lee [73, p.620, Exercise B.5(b)], see Proposition 2 in https://planetmath.org/CardinalitiesOfBasesForModules.

Remark 237. For a proof of Lee [73, p.638, Exercise B.48], see https://math.stackexchange.com/questions/1393301/frobenius-norm-of-product-of-matrix.

Remark 238. In Lee [73, p.648, l.13–l.16], all of the definitions of $I$, $|I|$, $\partial_I$, and $(x - a)^I$ are incorrect. They should be corrected as follows:
$I = (i_1, \dots, i_n)$, where $i_j$ ($1 \le j \le n$) are nonnegative integers.
$m = |I| = i_1 + \cdots + i_n$.
$\partial_I = \frac{\partial^m}{\partial (x^1)^{i_1} \cdots \partial (x^n)^{i_n}}$.
$(x - a)^I = (x^1 - a^1)^{i_1} \cdots (x^n - a^n)^{i_n}$.

Remark 239. There are $\binom{n+k}{n-1}$ terms on the right-hand side of (C.9) [Lee [73, p.649, l.13]].

Proof. Read https://math.stackexchange.com/questions/2410399/how-to-find-all-the-n-tuples-with-nonnegative-integers-that-add-up-to-a-fixed-in.
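The count in Remark 239 can be checked by brute force for small cases. The following is a minimal sketch, assuming that the terms on the right-hand side of (C.9) are indexed by the multi-indices I with |I| = k + 1; the function names are illustrative and do not come from Lee or from the linked answer.

```python
from itertools import product
from math import comb


def count_multi_indices(n: int, m: int) -> int:
    """Count n-tuples of nonnegative integers (i_1, ..., i_n) with sum m by enumeration."""
    return sum(1 for index in product(range(m + 1), repeat=n) if sum(index) == m)


def compare(n: int, k: int) -> tuple[int, int, int]:
    """Return (brute-force count, binomial(n+k, n-1), n**(k+1)) for |I| = k + 1."""
    m = k + 1  # assumed order of the multi-indices in (C.9)
    return count_multi_indices(n, m), comb(n + k, n - 1), n ** (k + 1)


if __name__ == "__main__":
    for n in range(1, 5):
        for k in range(0, 4):
            brute, closed, bound = compare(n, k)
            print(n, k, brute, closed, bound)
```

The last column is the bound $n^{k+1}$ from the remark that follows; it counts ordered choices of k + 1 indices, each of which determines a multi-index.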

Remark. $\binom{n+k}{n-1} \le n^{k+1}$.

Remark 240. $\operatorname{Bd}(M \times N) = ((\operatorname{Bd} M) \times N) \cup (M \times (\operatorname{Bd} N))$ [Munkres [85, p.5, l.6]] follows from https://math.stackexchange.com/questions/2090102/find-the-boundary-of-a-product-of-sub-spaces-of-topological-spaces.

Remark 241. The openness of St(x,K) [Munkres [85, p.70, l.11–l.12]] follows from Dugundji [26, p.172, l. −15]. This is the only place where I have found the correct definition and proof for the star of a vertex.

Remark 242. By https://math.stackexchange.com/questions/1692955/simplicial-complex-not-locally-finite-then-not-locally-compact, the complex K is locally finite if $|K|$ is locally compact [Munkres [84, p.11, l. −5–l. −4]].

Remark 243. Munkres [84, p.14, Exercise 9].

Proof. Let $\tau$ be the coherent topology on $|K|$ [Deo [25, p.116, l.18–l.19]] and let $\tau_d$ be the subspace topology induced on $|K|$ by the metric space $(\mathbb{R}^J, d)$ [Deo [25, p.115, l. −3]]. Then $\tau \supset \tau_d$.
$\tau \subset \tau_d$ follows from the theorem in https://planetmath.org/theunionofalocallyfinitecollectionofclosedsetsisclosed.

Remark 244. The Klein bottle is homeomorphic to $P^2 \# P^2$ [Munkres [84, p.39, l. −2–l. −1]].

Proof. Munkres [84, p.39, Figure 6.9] only gives one part of the proof. The orientation of each 2-simplex in the figure is the counterclockwise direction; the left figure represents the Klein bottle. The other part of the proof is to establish a homeomorphism between the right figure in Munkres [84, p.39, Figure 6.9] and the first figure in Munkres [84, p.38, Figure 6.8] using the last two figures in Munkres [84, p.38, Figure 6.8]. The last two figures in Munkres [84, p.38, Figure 6.8] represent $P^2 \# P^2$. Let us consider its left branch first. It is a 2-simplex denoted by the path $D \to D \to -C$ using its directed boundary segments rather than the order of its vertices. The minus sign represents the clockwise direction. The starting point and the end point of the path $D \to D$ are the same. C is a closed curve. The same 2-simplex can also be represented by $D \to -C \to D$. The corresponding 2-simplex in the right figure of Munkres [84, p.39, Figure 6.9] is $C \to -B \to C$. Although the starting point and the end point of the path seem to look different, we can relabel vertices to make the path closed so that it may represent a 2-simplex.

Remark 245. If M in O’neill [86, p.146, Definition 3.5] is considered a differentiable manifold, then O’neill [86, p.6, Definition 2.1] satisfies O’neill [86, p.146, Definition 3.5] [O’neill [86, p.6, l. −2–l. −1]].

Remark 246. For the proof of O’neill [86, p.39, Theorem 7.10], see Lee [73, p.657, Theorem C.34].

Remark 247. O’neill [86, p.50, Exercise 12]

Proof. I. The construction of the formula given in O’neill [86, p.50, l. −7].
Procedure. Let $f = \cos\vartheta$ and $g = \sin\vartheta$. Then $\frac{g}{f} = \tan\vartheta$. Thus, $\vartheta = \tan^{-1}\frac{g}{f}$.
$\frac{d\vartheta}{dt} = \frac{f^2}{f^2 + g^2} \cdot \frac{f g' - g f'}{f^2}$ [$\frac{d}{dt}\tan^{-1} t = \frac{1}{1 + t^2}$]
$= f g' - g f'$.
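The formula $\vartheta' = f g' - g f'$ obtained in Part I can be checked numerically. The sketch below is only an illustration: the angle function `theta` is a hypothetical choice made for the test and does not come from O’neill, and the derivatives of f and g are approximated by central differences.

```python
import math


def theta(t: float) -> float:
    """Hypothetical smooth angle function, chosen only for this check."""
    return 0.3 + math.sin(2.0 * t)


def dtheta(t: float) -> float:
    """Exact derivative of the hypothetical angle function above."""
    return 2.0 * math.cos(2.0 * t)


def f(t: float) -> float:
    return math.cos(theta(t))


def g(t: float) -> float:
    return math.sin(theta(t))


def derivative(fn, t: float, h: float = 1e-6) -> float:
    """Central-difference approximation of fn'(t)."""
    return (fn(t + h) - fn(t - h)) / (2.0 * h)


def angle_rate(t: float) -> float:
    """Evaluate f g' - g f', which Part I identifies with theta'."""
    return f(t) * derivative(g, t) - g(t) * derivative(f, t)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.3):
        print(t, angle_rate(t), dtheta(t))
```

Both printed columns should agree up to the finite-difference error, since $f^2 + g^2 = 1$ holds for this choice of f and g.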

II. Want to prove ( f cosϑ + gsinϑ)′ = 0.

Proof. $f^2 + g^2 = 1 \Rightarrow f f' + g g' = 0$.
$(f\cos\vartheta + g\sin\vartheta)' = (f' + g\vartheta')\cos\vartheta + (g' - f\vartheta')\sin\vartheta$
$= (f f' + g g')(f\cos\vartheta + g\sin\vartheta) = 0$.

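III. By II, $f\cos\vartheta + g\sin\vartheta = c$, where c is a constant. Then
$c = f(0)\cos\vartheta_0 + g(0)\sin\vartheta_0 = 1$.
$(f - \cos\vartheta)^2 + (g - \sin\vartheta)^2 = f^2 + g^2 + \cos^2\vartheta + \sin^2\vartheta - 2(f\cos\vartheta + g\sin\vartheta) = 0$.

Remark 248. (a). $\theta_1 = d\rho$. (b). $\theta_2 = \rho\cos\varphi\, d\vartheta$. (c). $\theta_3 = \rho\, d\varphi$. (d). $\omega_{12} = \cos\varphi\, d\vartheta$. (e). $\omega_{13} = d\varphi$. (f). $\omega_{23} = \sin\varphi\, d\vartheta$. [O’neill [86, p.94, l.13–l.15]].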

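Proof. (a). $\theta_1(v) = v \cdot F_1(p)$
$= v_1\cos\varphi\cos\vartheta + v_2\cos\varphi\sin\vartheta + v_3\sin\varphi$ [O’neill [86, p.83, l. −18]].
$d\rho(v) = v[\rho]$ [O’neill [86, p.23, Definition 5.2]]
$= \sum v_i \frac{\partial\rho}{\partial x_i}$ [O’neill [86, p.12, Lemma 3.2]], where $\rho = (x_1^2 + x_2^2 + x_3^2)^{1/2}$.
(b). $\theta_2(v) = v \cdot F_2(p)$
$= -v_1\sin\vartheta + v_2\cos\vartheta$ [O’neill [86, p.83, l. −17]].
$\rho\cos\varphi\, d\vartheta(v) = \rho\cos\varphi\, v[\vartheta] = \rho\cos\varphi \sum v_i \frac{\partial\vartheta}{\partial x_i}$, where $\vartheta = \tan^{-1}\frac{x_2}{x_1}$.
(d). $v[\cos\varphi\cos\vartheta] = \cos\vartheta(-\sin\varphi)v[\varphi] + \cos\varphi(-\sin\vartheta)v[\vartheta]$.
$v[\cos\varphi\sin\vartheta] = \sin\vartheta(-\sin\varphi)v[\varphi] + \cos\varphi\cos\vartheta\, v[\vartheta]$.
$\omega_{12}(v) = \nabla_v F_1 \cdot F_2$ [O’neill [86, p.85, l. −10]]
$= -\sin\vartheta\, v[\cos\varphi\cos\vartheta] + \cos\vartheta\, v[\cos\varphi\sin\vartheta]$ [O’neill [86, p.83, l. −18–l. −17]]
$= \cos\varphi\, v[\vartheta]$.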

Remark. For proofs of (a), (b), (c), we have to use the expansion given in O’neill [86, p.12, Lemma 3.2], while for the proofs of (d), (e), (f), we need not. Remark 249. O’neill [86, p.111, Exercise 4].

Proof. Let $\lambda$ be a characteristic root of C. Then $\exists x : Cx = \lambda x$.
$(x,x) = (Cx,Cx) = (\lambda x,\lambda x) = |\lambda|^2 (x,x)$
$\Rightarrow |\lambda|^2 = 1$.
Since the characteristic polynomial of C has real coefficients, C must have a real characteristic root $\pm 1$.
Suppose C has a triple characteristic root $-1$, then $\det C = -1$, a contradiction. Hence C has a real characteristic root $+1$. Let $e_3$ be its eigenvector: $C(e_3) = e_3$, $e_1 \perp e_3$, and $e_2 = e_3 \times e_1$.
Since $0 = C(e_1) \cdot C(e_3) = C(e_1) \cdot e_3$, $C(e_1)$ can be expressed as $\cos\vartheta\, e_1 + \sin\vartheta\, e_2$.
Similarly, $C(e_2) = a e_1 + b e_2$.
$C(e_1) \cdot C(e_2) = 0 \Rightarrow a\cos\vartheta + b\sin\vartheta = 0$.
$\det C = 1 \Rightarrow \begin{vmatrix} \cos\vartheta & \sin\vartheta \\ a & b \end{vmatrix} = 1$.

The desired results follow by solving a,b.
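The final step can be written out explicitly. The following is a short worked computation using only the two conditions stated in the proof; it is a sketch of one way to solve for a and b, not a quotation from O’neill.

```latex
\begin{aligned}
a\cos\vartheta + b\sin\vartheta &= 0, \\
b\cos\vartheta - a\sin\vartheta &= 1.
\end{aligned}
% Multiply the first equation by \cos\vartheta and the second by \sin\vartheta, then subtract:
%   a(\cos^2\vartheta + \sin^2\vartheta) = -\sin\vartheta, so a = -\sin\vartheta.
% Multiply the first equation by \sin\vartheta and the second by \cos\vartheta, then add:
%   b(\sin^2\vartheta + \cos^2\vartheta) = \cos\vartheta, so b = \cos\vartheta.
\begin{aligned}
C(e_1) &= \cos\vartheta\, e_1 + \sin\vartheta\, e_2, \\
C(e_2) &= -\sin\vartheta\, e_1 + \cos\vartheta\, e_2, \\
C(e_3) &= e_3.
\end{aligned}
```

In other words, under these conditions C acts as a rotation through the angle $\vartheta$ about the axis spanned by $e_3$.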

Remark 250. $\bar{E}_i = F_i$ [O’neill [86, p.121, l. −12]].

Proof. $\bar{E}_i \cdot F_j \le 1$ [by the Schwarz inequality].
$\bar{E}_i \cdot F_j = 1 \Leftrightarrow \bar{E}_i = F_j$.

Remark 251. dg is never zero on M [O’neill [86, p.130, l.6]].

Proof. $f(\alpha_1(t), \alpha_2(t)) = c$.
$\frac{\partial f}{\partial x}\alpha_1'(t) + \frac{\partial f}{\partial y}\alpha_2'(t) = 0$.
Hence $\alpha' \neq 0 \Leftrightarrow df \neq 0$.

Remark 252. Solution to O’neill [86, p.143, Exercise 13(a)]: $u = g(t) \Rightarrow t = g^{-1}(u) \Rightarrow h(t) = h \circ g^{-1}(u) = f(u)$.

Remark 253. O’neill [86, p.151, Exercise 12].

Proof. $\Rightarrow$: $V(x(u,v)) = f(u,v)x_u(u,v) + g(u,v)x_v(u,v)$.
Let $y(u,v)$ be the resulting vector after $x_v(u,v)$ is rotated by angle $\pi/2$. Then
$V(x(u,v)) \cdot y(u,v) = f(u,v)\, x_u(u,v) \cdot y(u,v)$, where $x_u(u,v) \cdot y(u,v) \neq 0$.
If $x_u(u,v) \cdot y(u,v) = 0$, then $x_u(u,v), x_v(u,v)$ would be linearly dependent, a contradiction.
Thus, $f(u,v) = \frac{V(x(u,v)) \cdot y(u,v)}{x_u(u,v) \cdot y(u,v)}$ is differentiable.
Similarly, $g(u,v)$ is differentiable.

Remark 254. (In order to keep up with modern research, we should adopt a new viewpoint toward the inverse function theorem) The inverse function theorem usually refers to the version given in O’neill [86, p.161, Theorem 5.4]. In order to keep up with modern research, we should adopt a new viewpoint toward the inverse function theorem and interpret it as the following natural and complete version that can be illustrated by a geometric figure (i.e., a linear isomorphism between tangent spaces is equivalent to a diffeomorphism between coordinate neighborhoods). For the proof of O’neill [86, p.161, Theorem 5.4], most authors of differential geometry textbooks pass the buck to the authors of advanced calculus textbooks. However, students who have studied advanced calculus may still not know where to start the proof because mappings of surfaces are more complicated than mappings of open sets in planes. The difficulty lies in the attempt to fit many requirements. Before I start the proof, I point out two facts: Spivak [108, vol. 1, p.41, (1)] and the chain rule [by Carmo [12, p.73, l.6]].

(The inverse function theorem) Let $F : M \to N$ be a mapping of surfaces and $p \in M$. Then
($F_{*p} : T_p(M) \to T_{F(p)}(N)$ is a linear isomorphism at $p \in M$)
$\Leftrightarrow$ ($F : M \to N$ is a local diffeomorphism at $p \in M$) [O’neill [86, p.161, Theorem 5.4]].

Proof. $\Leftarrow$: $F^{-1} \circ F = I \Rightarrow d(F^{-1}) \circ dF = I$ [Carmo [12, p.91, Exercise 24]].
$\Rightarrow$: By Spivak [108, vol. 1, p.56, Theorem 9(1)], $y \circ F \circ x^{-1} : U \to V$ is a diffeomorphism. The result follows from Spivak [108, vol. 1, p.41, (1) & (3)].

Remark 1. $\det[\frac{\partial x^i}{\partial u^j}(p)] = \det[\frac{\partial (x^i \circ u^{-1})}{\partial u^j}(u(p))] \neq 0$; by the $\mathbb{R}^n$-version of the inverse function theorem, there are a neighborhood $W_1$ of $u(p)$ and a neighborhood $W_2$ of $x(p)$ such that $x \circ u^{-1}$ is a diffeomorphism from $W_1$ onto $W_2$ [Spivak [108, vol. 1, p.57, l.6–l.8]].

Remark 2. The word “square” given in Spivak [108, vol. 1, p.57, l. −1] should have been replaced with “rectangle”; $D_m\psi^m$ given in Spivak [108, vol. 1, p.57, l. −1] should have been replaced with $D_n\psi^m$.

Remark 3. By Rudin [97, p.196, Theorem 9.20], we can write $\psi^r(a) = \bar{\psi}^r(a_1, \dots, a_k)$, $r = k + 1, \dots, m$ [Spivak [108, vol. 1, p.58, l.1–l.2]]. Note that $\bar{\psi}^r \in C^\infty$.

Remark 4. For the proof of Spivak [108, vol. 1, p.59, Theorem 10(2)], we may incorporate the proof for the special case x = I and that for the general case into one.

Remark 255. O’neill [86, p.220, Exercise 9(a)].

Proof. S(xu)= axu + bxv. aE = S(xu) xu · = U xuu [O’neill [86, p.213, Lemma 4.2]] · = l. Remark 256. O’neill [86, p.230, Exercise 3]. Proof. I. Let the profile curve be α(u)=(g(u),h(u),0). Then α′(u)=(g′(u),h′(u),0). II. All meridians are geodesics.

Proof. $\beta(u) = x(u,v_0) = (g(u), h(u)\cos v_0, h(u)\sin v_0)$.
$\beta'(u) = (g'(u), h'(u)\cos v_0, h'(u)\sin v_0)$.
$\beta''(u) = (g''(u), h''(u)\cos v_0, h''(u)\sin v_0)$.
$U_\beta = \frac{(h', -g'\cos v_0, -g'\sin v_0)}{\sqrt{g'^2 + h'^2}}$ [O’neill [86, p.234, l. −8]].
$g'^2 + h'^2 = 1$ [O’neill [86, p.238, l.16]]
$\Rightarrow g'g'' + h'h'' = 0$
$\Rightarrow \beta''(u) \parallel U_\beta$.

III. The parallel through α(t) is a geodesic iff α′(t) is parallel to the axis of rotation.

Proof. $\gamma(v) = x(u_0,v) = (g(u_0), h(u_0)\cos v, h(u_0)\sin v)$.
$\gamma'(v) = (0, -h(u_0)\sin v, h(u_0)\cos v)$.
$\gamma''(v) = (0, -h(u_0)\cos v, -h(u_0)\sin v)$.
$U_\gamma = \frac{(h'(u_0), -g'(u_0)\cos v, -g'(u_0)\sin v)}{\sqrt{g'^2(u_0) + h'^2(u_0)}}$ [O’neill [86, p.234, l. −8]].

IV. $\gamma''(v) \parallel U_\gamma$
$\Leftrightarrow h'(u_0) = 0$ [by III]
$\Leftrightarrow \alpha'(u_0) = (g'(u_0), 0, 0)$ [by I]
$\Leftrightarrow \alpha'(u_0)$ is parallel to the x-axis.

Remark 257. ($k_\mu = -\frac{h'}{c}$, $k_\pi = \frac{1}{ch'}$) [O’neill [86, p.241, l. −5]] should have been corrected as ($k_\mu = \frac{h'}{c}$, $k_\pi = -\frac{1}{ch'}$).

Remark 258. (A proof should be natural and straightforward: one should not make a great fuss about little things) The proof of Kreyszig [66, p.206, Theorem 66.1] is natural and straightforward. In contrast, the proof given in O’neill [86, p.257, l. −8–p.259, l.13] uses two big theorems: O’neill [86, p.255, Theorem 2.7] (O’neill [86, p.257, l. −3–l. −2]) and O’neill [86, p.179, Theorem 7.6] (O’neill [86, p.259, l.11]). The former big theorem uses connection equations which can be avoided by Kreyszig [66, p.206, l. −8]; the proof of the latter big theorem uses reduction to absurdity which can be avoided by using the definition of compactness [Kreyszig [66, p.207, l.1–l.3]]. These two big theorems may easily distract us from the theme of Kreyszig [66, p.206, Theorem 66.1]. Because O’neill fails to explain why every point in M is in such a region $\mathcal{O}$, I prove O’neill [86, p.257, Lemma 3.2] as follows:

Proof. I. Let $\mathcal{O}$ be the range of a coordinate patch in M [O’neill [86, p.124, Definition 1.1]].
By O’neill [86, p.178, Theorem 7.5], $\mathcal{O}$ is orientable.
By O’neill [86, p.246, Lemma 1.2], there exists an adapted frame field on $\mathcal{O}$.
II. Let p be a fixed point in $\mathcal{O}$, q be an arbitrary point in $\mathcal{O}$, and α be a curve in $\mathcal{O}$ such that α(0) = p and α(1) = q.
$\frac{dK(\alpha(t))}{dt} = \alpha'(t)[K]$ [O’neill [86, p.149, Definition 3.10]]
$= dK[\alpha'(t)]$ [O’neill [86, p.23, Definition 5.2]]
$= 0$ [O’neill [86, p.258, l.1–l.2]]. Thus,
K is constant on α[0,1]. In particular, K(p) = K(q).
III. Let q be an arbitrary point in M and β be a curve in M such that β(0) = p and β(1) = q.
β[0,1] is compact. Consequently, there exist a finite number of $\mathcal{O}_i$ such that $\beta[0,1] \subset \cup_i \mathcal{O}_i$.
By II, K is constant on each of $\{\mathcal{O}_i\}$. Therefore,
K is constant on β[0,1].

Remark. We assume that the surface is of class $r \ge 3$ because we use the partial derivatives of λ in Kreyszig [66, p.206, (66.3)].

Remark 259. We can arrange that $k_1 > k_2 > 0$ in $\mathcal{N}$ [O’neill [86, p.262, l. −9]].

Proof. By O’neill [86, p.246, Lemma 1.2], there exists an adapted frame field in $\mathcal{N}$.
By O’neill [86, p.254, Lemma 2.6], there exists a principal frame field in $\mathcal{N}$ [since m is not umbilic].
Another option is $k_2 < k_1 < 0$. If it is so, we may replace U with −U [O’neill [86, p.202, l. −11–l. −10]] to obtain $-k_2 > -k_1 > 0$.

Remark 260. $k_1$ has a local maximum at m and $k_2$ has a local minimum [O’neill [86, p.262, l. −7]].

Proof. $k_1 - k_2 > 0$ has a maximum in $\mathcal{N}$ at m [O’neill [86, p.262, l. −8]]
$\Leftrightarrow (k_1 - k_2)^2$ has a maximum in $\mathcal{N}$ at m
$\Leftrightarrow (k_1 + k_2)^2$ has a maximum in $\mathcal{N}$ at m [since $K = k_1 k_2$ is constant]
$\Leftrightarrow k_1 + k_2$ has a maximum in $\mathcal{N}$ at m.
$2k_1 = (k_1 - k_2) + (k_1 + k_2)$ has a maximum in $\mathcal{N}$ at m.
$k_2$ has a minimum in $\mathcal{N}$ at m [since $K = k_1 k_2$ is constant].

Remark 261. O’neill [86, p.271, Exercise 14].

Proof. $P(x,y,z) = (\frac{2x}{2-z}, \frac{2y}{2-z})$ [O’neill [86, p.160, l.12]].
$x(u,v) = (\cos v\cos u, \cos v\sin u, 1 + \sin v)$ [O’neill [86, p.135, l.3]].
$x_u(u,v) = (-\cos v\sin u, \cos v\cos u, 0)$ [O’neill [86, p.135, l.9]].
$x_v(u,v) = (-\sin v\cos u, -\sin v\sin u, \cos v)$ [O’neill [86, p.135, l.10]].
$P_*(x_u) = \frac{\partial(P \circ x)}{\partial u}$ [O’neill [86, p.160, Definition 5.3]]
$= \frac{2\cos v}{1 - \sin v}(-\sin u, \cos u)$.
$P_*(x_v) = \frac{\partial(P \circ x)}{\partial v}$ [O’neill [86, p.160, Definition 5.3]]
$= \frac{2}{1 - \sin v}(\cos u, \sin u)$.
$\|P \circ x(u,v)\| = \frac{2\cos v}{1 - \sin v}$.
$\lambda \circ x(u,v) = \frac{2}{1 - \sin v}$.
The desired result follows from the conformal mapping version of O’neill [86, p.269, Exercise 1, (d) ⇒ (b)] [O’neill [86, p.269, l.4–l.6]].

Remark. We can also write $P_*(x_u)$ in the form of the chain rule [Taylor–Mann [111, p.341, Theorem III]]:
$P_*(x_u) = \begin{pmatrix} \frac{2}{1-\sin v} & 0 & \frac{2\cos v\cos u}{(1-\sin v)^2} \\ 0 & \frac{2}{1-\sin v} & \frac{2\cos v\sin u}{(1-\sin v)^2} \end{pmatrix} \begin{pmatrix} -\cos v\sin u \\ \cos v\cos u \\ 0 \end{pmatrix} = \frac{2\cos v}{1-\sin v}\begin{pmatrix} -\sin u \\ \cos u \end{pmatrix}$.

Remark 262. $\omega_{12}(E_1) = d\theta_1(E_1,E_2)$ [O’neill [86, p.272, l.9]].

Proof. $d\theta_1(E_1,E_2) = [\omega_{12} \wedge \theta_2](E_1,E_2)$ [O’neill [86, p.249, Theorem 1.7(1)]]
$= \omega_{12}(E_1)$ [O’neill [86, p.153, Definition 4.3]].

By O’Neill [86, p.153, Lemma 4.2], $\omega_{12}$ satisfies the first structure equations [O’Neill [86, p.272, l. −7]].

Remark 263. The Gauss mapping carries its profile curve in one-to-one fashion onto a quarter of a great circle in Σ [O’Neill [86, p.291, l.5–l.7]].

Proof. I. The graph of y = h(x) cannot intersect the positive x-axis, so h > 0. $h' < 0$, so as $x \to +\infty$, $h(x) \to 0$. Thus, as $x \to +\infty$, the direction of the surface normal approaches the negative y-axis.
II. By O’Neill [86, p.241, l. −9], $h'(0) = -\infty$. Thus, the surface normal at (0,c,0) has the direction of the positive x-axis.
III. $h'' > 0$.

Remark 264. By Blaga [6, p.120, l.8], $F|_M : M \to \bar{M}$ is an isometry of surfaces [O’Neill [86, p.297, l. −16]]. $F_*(v) = (F \circ \alpha)'(0)$ [O’Neill [86, p.297, l. −1, the first equality]] follows from O’neill [86, p.160, Definition 5.3]. $(F \circ \alpha)'(0) = F_*(v)$ [O’Neill [86, p.297, l. −1, the last equality]] follows from O’neill [86, p.38, Theorem 7.8].

Remark 265. Except that the parallels in Struik [109, p.121, Fig. 3-3(a)] are bounded and periodic, all the parallels and meridians in Struik [109, p.121, Fig. 3-3] are unbounded [O’Neill [86, p.275, Exercise 5(b)]].

Remark 266. The Möbius strip is a part of the projective plane $\mathbb{RP}^2$ [Spivak [108, vol. 1, p.15, the lower right figure]], so $\mathbb{RP}^2$ is not orientable. Thus, $\mathbb{RP}^2$ cannot embed in $\mathbb{R}^3$ [O’Neill [86, p.317, l. −12–l. −11]].

Remark 267. O’Neill [86, p.325, Exercise 5].

b Proof. (a). ψα = ω12(α )dt [O’Neill [86, p.323, l.7]] − a ′ [O’Neill [86, p.167, Definition 6.1]] = α ω12 R − d [O’Neill [86, p.170, Theorem 6.5]] = x R ω12 = Kθ1 θ2 [O’Neill [86, p.312, l.8]] Rx ∧ KdM [O’Neill [86, p.292, l.4]]. = Rx (b). Let dM be the u v -area form of O’Neill [86, p.323, Example 3.7]. Then R ( , ) dM = r2 sinθdθdu [Wangsness [121, p.33, Figure 1-40]; since dθ = dv]. − − ψα = x KdM [by (a)] π 2π 2 1 2 = 0R π 2 r sinθdθdu 2 v0 r − π − 2 = 2Rπ Rπ sinθdθ v0 − 2 − = 2π sinv0. − R Remark 268. O’Neill [86, p.325, Exercise 7].

Proof. Let $Y = fE_1$.
$Y' = f'E_1 + f\omega_{21}(\alpha')E_2$ [O’Neill [86, p.320, l. −6]]
$\Rightarrow F_*(Y') = f'\bar{E}_1 + f\omega_{21}(\alpha')\bar{E}_2$ [since $F_*$ is linear].
$F_*(Y) = \bar{Y} = f\bar{E}_1$ [since $F_*$ is linear].
$(\bar{Y})' = f'\bar{E}_1 + f\bar{\omega}_{21}(F_*(\alpha'))\bar{E}_2$ [O’Neill [86, p.320, l. −6]]
$= f'\bar{E}_1 + f(F^*\bar{\omega}_{21})(\alpha')\bar{E}_2$ [O’Neill [86, p.163, l.8]]
$= f'\bar{E}_1 + f\omega_{21}(\alpha')\bar{E}_2$ [O’Neill [86, p.273, Lemma 5.3(2)]].

Remark 269. O’neill [86, p.325, Exercise 8].

Proof. I. f¯i(F(p)) = W¯ E¯i = F (W) F (Ei)= W Ei = fi(p). · ∗ · ∗ · II. By O’neill [86, p.165, Exercise 8(a)], α′[ f¯iF]= F α′[ f¯i]. Thus, ∗ V[ fi](p)= V¯ [ f¯i](F(p)). III. (a). Let W = f1E1. ∇VW = V[ f1]E1 + f1ω12(V)E2 [O’neill [86, p.318, l. 7]] − ∇VW = V[ f1]E¯1 + f1ω12(V)E¯2 [since F is linear] ⇒ ∗ = V¯ [ f¯i]+ f1ω12(V)E¯2 [by II]. (b). ω¯12(V¯ )= ω¯12(F V)=(F∗ω¯ )(V) ∗ = ω12(V) [O’neill [86, p.273, Lemma 5.3(2)]]. f¯1ω¯12(V¯ )= f¯1(F(p))ω12(V) = f1(p)ω12(V) [by I]. (c). ∇VW = V¯ [ f¯1]E¯1 + f¯1ω¯12(V¯ )E¯2 [by (b)] = ∇ ¯ W¯ [O’neill [86, p.318, l. 7]]. V − Remark 270. O’neill [86, p.325, Exercise 9].

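Proof. $(\frac{x_v}{\sqrt{G}})_u \cdot (\frac{x_u}{\sqrt{E}}) = [\omega_{21}(x_u)E_1] \cdot E_1$ [O’neill [86, p.320, l. −6]]
$= \omega_{21}(x_u)$.
$(\frac{x_v}{\sqrt{G}})_u = \frac{1}{\sqrt{G}}x_{vu} + x_u[\frac{1}{\sqrt{G}}]x_v$ [O’neill [86, p.80, Corollary 5.4(3)]].
$\omega_{21}(x_u) = \frac{(\sqrt{E})_v}{\sqrt{G}}$ [O’neill [86, p.277, (3); l.11–l.12]].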

Remark 271. O’neill [86, p.337, Exercise 2].

Proof. Let $h : t \mapsto at$.
$\gamma_{av}(0) = p$, $\gamma_{av}'(0) = av$.
$(\gamma_v \circ h)(0) = \gamma_v(0) = p$.
$(\gamma_v \circ h)'(0) = \gamma_v'(0)h'(0) = av$.
By O’neill [86, p.330, Lemma 4.6], $\gamma_v \circ h$ is a geodesic.
By O’neill [86, p.328, Theorem 4.3], $\gamma_{av}(t) = \gamma_v \circ h(t)$.

Remark 272. By Coddington–Levinson [16, p.22, Theorem 7.1], y is differentiable [O’neill [86, p.341, l. −10]].

Remark 273. y is regular at the origin [O’neill [86, p.341, l. −9]].

Proof. If the rank of $x_*$ at the origin is less than 2, then there exists an ε > 0 such that the dimension of $x(D_\varepsilon)$ is 0 or 1 [Spivak [108, vol. 1, p.59, Theorem 10]]. This would contradict the fact that for every direction v, $\gamma_v(u)$ exists at p [O’neill [86, p.328, Theorem 4.3]].

Remark 274. $EG - F^2$ is never zero [O’neill [86, p.343, l.4–l.5]].

Proof. Let p be the chosen point on the geometric surface M. Then $x_u(p)$ and $x_v(p)$ are tangent vectors at p. Since the tangent plane can be embedded in $\mathbb{R}^3$, we may still construct $x_u(p) \times x_v(p)$. Even though we cannot use $x_u(p) \times x_v(p)$ to represent the surface normal, which is meaningless for a geometric surface, we may use it to express the area form [O’neill [86, p.284, l.4]]. Thus, we may use $x_u(p) \times x_v(p)$ as an auxiliary tool to show $EG - F^2 > 0$ if x is regular at p [Carmo [12, p.54, l.3–l.5]; O’neill [86, p.49, Lemma 1.8]].

Remark 275. e1 = U1(p),e1 = U1(p) is a frame at p [O’neill [86, p.344, l.1–l.2]]. { } Proof. x(u,v)=(ucosv,usinv) [O’neill [86, p.340, l. 17]]. − E1 = xu/√E,E2 = xv/√G [O’neill [86, p.277, (1)]], where 2 E = 1 ,F = 0,G = u [O’neill [86, p.340, l. 10]]. g g2 − Since E2 is a differentiable frame field, it is continuous. Thus, E2(0,0)= lim0

Remark 279. Let r = α1(θ),r = α2(θ)( π/2 θ π/2) be two curves written in polar coordinates. If − ≤ ≤ α1(θ) > α2(θ)( π/2 < θ < π/2), then L(α1) > L(α2) [O’neill [86, p.359, l.8–l.9]]. − π/2 Proof. L(α)= π/2 α(θ)dθ. − Remark 280. F(β)= G(β) [O’neillR [86, p.363, l. 16]]. − 44 Proof. I. There exists a curve α(t) from p = α(0) to q = α(t∗) such that L(α) < ρ(p,q)+ε. For every t [0,t ], there exists a normal neighborhood N of α(t) [O’neill [86, p.341, Lemma ∈ ∗ t,δ(t) 5.3]]. Since α[0,t ] is compact, there exist t0,t1, ,tn such that 0 t0 < t1 < < tn t and ∗ ··· ≤ ··· ≤ ∗ n N α[0,t ]. ∪i=0 ti,δ(ti) ⊃ ∗ II. Let Ai = nt ,δ(t ). Suppose F (v)= G (v), where v = α′(ti 1). By the argument given in i i ∗ ∗ − O’neill [86, p.363, l.12–l.20], we have F G on Ai 1 Ai. We may take z Ai 1 Ai; construct ≡ − ∩ ∈ − ∩ a geodesic from α(ti 1) to z and a geodesic from z to α(ti); make these two geodesics parts of β. By the triangle inequality− [O’neill [86, p.351, Exercise 5(b)(iii)]], there exists a small normal neighborhood N of α(ti) such that for every w N, there exists a normal neighborhood of w ∈ containing z. Then F G on N. Thereby, we may continue to consider Ai Ai 1. ≡ ∩ + Remark 281. H is geodesically complete [O’Neill [86, p.363, l. 2]; proof: O’Neill [86, p.335, l. 4]]. e1 = − − U1(p),e2 = U2(p) [O’neill [86, p.364, l.10]; see O’Neill [86, p.312, l. 8]]. Let x,x¯ be the resulting geodesic polar mappings of H and N [O’Neill [86, p.364, l.11–l.12];− see O’Neill [86, p.340, l. 9]]. √G = sinhu [O’Neill [86, p.364, l.20]; proof: O’Neill [86, p.354, l. 9]]. At the pole P, the− preservation of inner product is an honest consequence of continuity [O’Neill− [86, p.364, l. 5–l. 4]; see Lee [73, p.327, l. 3 & p.328, l.1]]. The largest normal neighborhood of p − − − in Σ is not all of Σ [O’Neill [86, p.365, l.1–l.3]; proof: If it were all of Σ, then it would contradict the injectiveness of the geodestic polar mapping given in O’Neill [86, p.341, Lemma 5.3]]. 
The frames determining F2 are chosen so that the derivative maps of F1 and F2 agree at p∗ [O’Neill [86, p.365, l.7–l.8]; see O’Neill [86, p.340, l. 9; p.364, l. 12]]. N is connected [O’Neill [86, − − p.365, l.11]; proof: O’Neill [86, p.348, Theorem 5.9]]. F maps Σ onto N [O’Neill [86, p.365, l.12]; proof: Lee [73, p.99, Proposition 5.1 & Proposition 5.2; p.339, Theorem 13.29] and the methods given in O’Neill [86, p.179, Theorem 7.6; p.180, Exercise 6]]. Isometries preserve the Gaussian curvature [O’Neill [86, p.366, l. 1]; proof: O’Neill [86, p.273, Theorem 5.4]]. − For some unit tangent vector u, F (u)= α′(t0) [O’Neill [86, p.273, Theorem 5.4]; proof: if the domain and the codomain have the∗ same dimension, then a linear transformation’s injectiveness is equivalent to its surjectiveness.]. If M is a compact surface in E3 with constant Gaussian curvature K(> 0), and F : M E3 is an isometric immersion, then F is an isometry of M onto → a Euclidean sphere Σ of radius 1/√K in E3 [O’Neill [86, p.368, l. 11–l. 9]; proof: By O’Neill − − [86, p.262, Theorem 3.7], M is a sphere Σ of radius 1/√K in E3. F(M) is a compact surface in E3 [O’Neill [86, p.367, l. 6]] with constant curvature K [O’Neill [86, p.273, Theorem 5.4]]. − By O’Neill [86, p.262, Theorem 3.7], F(M) is a sphere of radius 1/√K in E3. Id : M Σ is an → isometry. By O’Neill [86, p.363, Theorem 7.2], F = Id.]. O’Neill [86, p.368, l.8–l.11] says that if we replace “a proper isometric imbedding” with “an isometric immersion” in the hypothesis of O’Neill [86, p.367, Lemma 7.7], then F(M) is a surface in E3 [proof: take the domain of a proper patch in F(M) smaller than that of the corresponding proper patch in M]. Remark 282. Invariance of domain [Spivak [108, vol. 1, p.3, Theorem 1]].

Proof. By Dugundji [26, p.359, Corollary 3.2], it suffices to prove the following lemma:
Let $f : V^n \to \mathbb{R}^n$ [Dugundji [26, p.3, l.20]] be continuous and injective. Then
$f(0) \in \operatorname{Int}[f(V^n)]$ [Dugundji [26, p.71, Definition 4.8]].

Proof. G is closed in $V^n$
$\Rightarrow$ G is compact
$\Rightarrow$ f(G) is compact in $\mathbb{R}^n$
$\Rightarrow$ f(G) is closed in $\mathbb{R}^n$. Thus,
$f : V^n \to f(V^n)$ is a homeomorphism.

Remark 283. Since the two pieces of cloth will pass through each other [Spivak [108, vol. 1, p.16, l.7]], f is not one-to-one [Spivak [108, vol. 1, p.16, l. −16]].

Remark 284. Using the atlas $A_2$ of Example 3 in http://www.math.ubc.ca/~feldman/m428/manifolds.pdf, we easily see that the projections $P_1$ and $P_2$ from points $(0, \dots, 0, 1)$ and $(0, \dots, 0, -1)$ are $C^\infty$-related [Spivak [108, vol. 1, p.39, l. −4–p.40, l.1]].

Remark 285. $\frac{\partial x^i}{\partial x^j}(p) = \delta^i_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$ [Spivak [108, vol. 1, p.47, l.3]].

Proof. I. Case $i \neq j$:
$x^{-1}(x(p) + he_j)$ is the j-th coordinate curve whose i-coordinate is constant; by Spivak [108, vol. 1, p.46, l. −2–l. −1], $\frac{\partial x^i}{\partial x^j}(p) = 0$.
II. Case $i = j$:
$x(x^{-1}(x(p) + he_j)) = x(p) + he_j$
$\Rightarrow x^i(x^{-1}(x(p) + he_j)) = x^i(p) + h$
$\Rightarrow x^i(x^{-1}(x(p) + he_j)) - x^i(p) = h$.

Remark 286. Because (x is a diffeomorphism if and only if $x_*$ is a vector space isomorphism), the rank of $(y \circ f \circ x^{-1})_*$ equals the rank of $f_*$ [Spivak [108, vol. 1, p.52, l.8–l.10]].

Remark 287. It clearly suffices to consider the case where M and N are $\mathbb{R}^n$ [Spivak [108, vol. 1, p.56, l. −8–l. −7]].

Proof. $\varphi : B^n \to \mathbb{R}^n$
$\varphi(x) = \frac{x}{\sqrt{1 - \|x\|^2}}$.
$\psi : \mathbb{R}^n \to B^n$
$\psi(y) = \frac{y}{\sqrt{1 + \|y\|^2}}$.
$\psi \circ \varphi(x) = x$.
$\varphi \circ \psi(y) = y$.

Remark 288. The curve itself manages to be differentiable by slowing down to velocity 0 at the point (0,0) [Spivak [108, vol. 1, p.61, l. −4–l. −3]].

Proof. $h(x) = \begin{cases} (-\exp(-x^{-2}), \exp(-x^{-2})) & \text{if } x \le 0 \\ (\exp(-x^{-2}), \exp(-x^{-2})) & \text{if } x > 0 \end{cases}$.
By Spivak [108, vol. 1, p.61, l.7], $\Pr_1(h(x))$ is differentiable at x = 0.
By Spivak [107, p.29, Problem 2-25], $\Pr_2(h(x))$ is differentiable at x = 0.

Remark 289. Spivak [108, vol. 1, p.61, l. −2–l. −1].

Proof. Let $y = \begin{cases} \sqrt{1 - (x + 1)^2} & \text{if } -1 < x \le 0 \\ \sqrt{1 - (x - 1)^2} & \text{if } 0 \le x < 1 \end{cases}$.
Hence the required h will be $h(x) = \begin{cases} (-\exp(-x^{-2}), \sqrt{1 - (1 - \exp(-x^{-2}))^2}) & \text{if } x \le 0 \\ (\exp(-x^{-2}), \sqrt{1 - (1 - \exp(-x^{-2}))^2}) & \text{if } x > 0 \end{cases}$.

Remark 290. $M_1$ may even be a dense subset of M [Spivak [108, vol. 1, p.62, l. −4–l. −3]].

Proof. Let $\beta : \mathbb{R} \to [0,1] \times [0,1]$
$t \mapsto (t, t\alpha - [t\alpha])$, where $\alpha \in \mathbb{R} \setminus \mathbb{Q}$.
Then identify left 0 with right 1; lower 0 with upper 1.
By https://math.stackexchange.com/questions/903142/for-an-irrational-number-a-the-fractional-part-of-na-for-n-in-mathbb-n-is, $\{n\alpha - [n\alpha] \mid n \in \mathbb{N}\}$ is dense in [0,1].

Remark 291. $q' \in U_1 \cap V$ is uniquely determined by $q \in V$ [Spivak [108, vol. 1, p.63, l.7–l.8]].

Proof. $y : V \to y(V)$ is a diffeomorphism.
Point $q' \in V$ is uniquely determined by the y-coordinates given in Spivak [108, vol. 1, p.63, l.7–l.8].
By Spivak [108, vol. 1, p.63, l.2], $q' \in U_1 \cap V$.

Remark 292. Spivak [108, vol. 1, p.65, Proposition 12] follows from Spivak [108, vol. 1, p.56, Theorem 9(2)].
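The density statement used in the proof of Remark 290 can be illustrated numerically. The sketch below is an illustration only, assuming the sample value $\alpha = \sqrt{2}$ (any irrational number would serve); the function name `max_gap` is hypothetical. It computes the largest gap left on the circle [0,1) by the fractional parts of α, 2α, ..., Nα, which should shrink as N grows.

```python
import math


def max_gap(alpha: float, n_terms: int) -> float:
    """Largest gap between consecutive fractional parts of n*alpha, n = 1..n_terms.

    The points are treated as lying on a circle, matching the identification
    of 0 with 1 made in the proof.
    """
    points = sorted((n * alpha) % 1.0 for n in range(1, n_terms + 1))
    gaps = [b - a for a, b in zip(points, points[1:])]
    gaps.append(points[0] + 1.0 - points[-1])  # wrap-around gap across the identified endpoints
    return max(gaps)


if __name__ == "__main__":
    alpha = math.sqrt(2.0)  # assumed sample irrational number
    for n_terms in (10, 100, 1000):
        print(n_terms, max_gap(alpha, n_terms))
```

A shrinking largest gap is what density predicts: every subinterval of [0,1] longer than the largest gap must contain one of the points.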

Remark 293. $\sum_{U \in \mathcal{O}} \psi_U > 0$ everywhere [Spivak [108, vol. 1, p.69, l.3]].

Proof. $\{U'\}$ is a refinement of $\mathcal{O}$ [Dugundji [26, p.161, Definition 1.2]; Spivak [108, vol. 1, p.67, Theorem 14]].
$\forall p \in M$, $\exists U' : p \in U' \subset \bar{U}' \subset U$.
Consequently, $\psi_U(p) = 1$.

Remark 294. $\operatorname{supp} f \subset U$ [Spivak [108, vol. 1, p.69, l. −6]].

Proof. I. The proof of this statement is not difficult, but Spivak’s confusing notations make it extremely difficult.
First, “For each $p \in C$ choose an open set $U_p \subset U$ with compact closure” should have been replaced with “For each $p \in C$ choose an open set $U_p$ such that $\bar{U}_p$ is a compact subset of U”.
Let $\mathcal{O}$ be defined as the $\mathcal{O}$ given in Spivak [108, vol. 1, p.68, l. −8];
let $\mathcal{O}_1$ be defined as the $\mathcal{O}$ given in Spivak [108, vol. 1, p.69, l.11];
let $f = \sum_{W \in \mathcal{O}_1'} \varphi_W$, where $\mathcal{O}_1' = \{W \in \mathcal{O}_1 : W \subset U_p \text{ for some } p \in C\}$.
II. $f(q) \neq 0 \Rightarrow (\exists W \in \mathcal{O}_1' : \varphi_W(q) \neq 0)$.
$W \in \mathcal{O}_1' \Rightarrow$ ($W \in \mathcal{O}_1$ and $W \subset U_p$, where $p \in C$).
III. $p \in \operatorname{supp} f \Rightarrow (\exists p_i : f(p_i) \neq 0 \text{ and } p_i \to p)$.
By II, $\exists W_i \in \mathcal{O}_1' : \varphi_{W_i}(p_i) \neq 0$. Thus, $p_i \in W_i$.
There is an open set V containing p that intersects only a finite number of members W in $\mathcal{O}_1$ because $\mathcal{O}_1$ is locally finite.
Let them be $W_1, \dots, W_k$. There exist $i_0 \in \{1, \dots, k\}$ and a sequence $i_m$ such that $p_{i_m} \in W_{i_0}$ and $p_{i_m} \to p$.
$p \in \bar{W}_{i_0} \subset \bar{U}_p \subset U$.

Remark 295. The existence of the imbedding $i : M \to \mathbb{R}^N$ given in Spivak [108, vol. 1, p.91, l. −4] comes from Spivak [108, vol. 1, p.70, Theorem 17], where N = nk + k [Spivak [108, vol. 1, p.70, l.9]].

Remark 296. By Dugundji [26, p.343, Theorem 3.3], one cannot comb the hair on a sphere [Spivak [108, vol. 1, p.94, l. −5]]. $E_1$ in O’neill [86, p.247, l.10] is an example.

Remark 297. $\lambda(2\pi) = -\lambda(0)$ [Spivak [108, vol. 1, p.96, l.2]].

Proof. $\lambda(0)(1,0,0)_{(2,0,0)} = \lambda(0) f_*((0,1)_{(0,0)})$ [Spivak [108, vol. 1, p.95, l. −8–l. −7]]
$= f_*((0,\lambda(0))_{(0,0)}) = f_*((0,\lambda(2\pi))_{(2\pi,0)})$ [The same dashed tangent vector must be taken at (2,0,0)]
$= \lambda(2\pi) f_*((0,1)_{(2\pi,0)}) = \lambda(2\pi)(-1,0,0)_{(2,0,0)}$.

Remark 298. There is no way to map T(M,i), fibre by fibre, homeomorphically onto $M \times \mathbb{R}^2$ [Spivak [108, vol. 1, p.96, l.4–l.5]].

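Proof. Let M be the Möbius strip. Assume there is a homeomorphism g such that the following diagram commutes:

[Commutative diagram: $g : T(M,i) \to M \times \mathbb{R}^2$ across the top, with $\pi : T(M,i) \to M$ and $\pi' : M \times \mathbb{R}^2 \to M$ down to M, so that $\pi' \circ g = \pi$]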

. Since f (0,λ(θ))(θ,0) is the non-zero dashed tangent vector at f (θ,0) and g is one-to-one, ∗ ∗ λ(θ)(g f ) (0,1)(θ,0) = g ( f (0,λ(θ))(θ,0)) = 0. Hence λ(θ) = 0(θ [0,2π]). This contradicts ◦ ∗ ∗ ∗ 6 6 ∈ λ(2π)= λ(0) [Spivak [108, vol. 1, p.96, l.2]]. − Remark 299. By https://www.reddit.com/r/math/comments/abmscl/spivaks_differential_ geometry_and_the_notion_of_a/, the Mobius¨ strip M is a 1-dimensional vector bundle over S1, but is not a trivial bundle [Spivak [108, vol. 1, p.98, l. 5–p.99, l.6]]. Lee [73, pp. 251– − 252, Example 10.3] also proves the same facts using a more complicated language. An argument using a complicated language may blur the key idea that can be easily recognized if the argument is expressed in a simple and flexible language. https://math.stackexchange.com/ questions/1688248/are-two-vector-bundles-mobius-band-and-s1-times- mathbbr-isomorphic-as-vect gives two proofs of the fact that M is not a trivial bundle, but Spivak’s proof is the most straightforward one among the three because he uses the shortest argument to reach the contradiction. Remark 300. In order to prove tni : TM T(M,i) is an equivalence [Spivak [108, vol. 1, p.105, l. 15]], we must show tni is∗ a homeomorphism→ [Spivak [108, vol. 1, p.98, l.6]]. Consequently,− we need to know the∗ differentiable structure (and hence the topology) on TM. This differentiable structure on TM is given in Spivak [108, vol. 1, p.110, l. 4–p.111, l. 6] or the proof of − − [p.1, Lemma 4.1] in http://idv.sinica.edu.tw/ftliang/diff_geom/*diff_ geometry(I)/bltangent.pdf. Remark 301. It is a simple exercise to use the coordinate system x to transfer this result from Rn to M [Spivak [108, vol. 1, p.108, l.10–l.11]].

Proof. $f - f(p) = \sum_{i=1}^n x^i (\frac{\partial (f \circ x^{-1})}{\partial u^i} \circ x)$ [Spivak [108, vol. 1, p.107, Lemma 2]]
$= \sum_{i=1}^n x^i \frac{\partial f}{\partial x^i}$.

Remark 302. If $\sum_{i=1}^n c_i \frac{\partial}{\partial x^i}|_0 f = 0$ for all f, then $\forall i$, $1 \le i \le n$, $c_i = 0$ [Spivak [108, vol. 1, p.108, l. −9–l. −8]].

Proof. Fix $i_0$. Then $c_{i_0} = \sum_{i=1}^n c_i \frac{\partial}{\partial x^i}|_0 x^{i_0} = 0$.

Remark 303. $(f_*(l))(g) = l(g \circ f)$ [Spivak [108, vol. 1, p.109, l.5]].

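Proof. I. $l(g\circ f) = \sum_{i=1}^n a^i \frac{\partial (g\circ f)}{\partial x^i}\big|_p$ [Spivak [108, vol. 1, p.108, l. −4]]
$= \sum_{i=1}^n a^i D_i(g\circ f\circ x^{-1})(x(p))$ [Spivak [108, vol. 1, p.46, l.9]].
$D(g\circ f\circ x^{-1})(x(p)) = D(g\circ y^{-1})(y(f(p)))\,D(y\circ f\circ x^{-1})(x(p))$ [Spivak [108, vol. 1, p.89, l. −10, (1)]]
$= \left(\sum_{j} \frac{\partial g}{\partial y^j}\big|_{f(p)} \frac{\partial (y^j\circ f)}{\partial x^i}\big|_p\right)_i$ [Spivak [108, vol. 1, p.46, l.9]], where the sum runs over the coordinates $y^j$ and the free index $i$ labels the entries of the row vector.
II. $f_*(l) = f_*([x,a]_p) = [y, D(y\circ f\circ x^{-1})(x(p))(a)]_{f(p)}$ [Spivak [108, vol. 1, p.104, l.12, (c)]].
$D(y\circ f\circ x^{-1})(x(p))(a) = \left(\sum_{i=1}^n \frac{\partial (y^j\circ f)}{\partial x^i}\big|_p a^i\right)^j$ [Spivak [108, vol. 1, p.46, l.9]].

Remark 304. $\sum_{i=1}^n a^i \frac{\partial}{\partial x^i}\big|_p$ corresponds to $a_p$ when we identify $T\mathbb{R}^n$ with $\varepsilon^n(\mathbb{R}^n)$ [Spivak [108, vol. 1, p.109, l.7]].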

Proof. $l = \sum_{i=1}^n a^i \frac{\partial}{\partial x^i}\big|_p$ corresponds to $[x,a]_p$ [Spivak [108, vol. 1, p.108, l. −4]], where $x$ is the identity coordinate system.
$t^n : T\mathbb{R}^n \to \varepsilon^n(\mathbb{R}^n)$,
$[x,a]_p \mapsto (p,a) = a_p$ [Spivak [108, vol. 1, p.104, l.3]].

Remark 305. $T(S^1\times S^1)$ is trivial [Spivak [108, vol. 1, p.118, l. −2–l. −1]].

Proof. $T(S^1\times S^1) = T(S^1)\times T(S^1) \simeq (S^1\times\mathbb{R})\times(S^1\times\mathbb{R}) \simeq (S^1\times S^1)\times\mathbb{R}^2$.

Remark 306. The Möbius strip is not orientable [Spivak [108, vol. 1, p.119, l.3–l.9]].

Proof. If $TM$ were orientable, then $TM|S$ [Spivak [108, vol. 1, p.97, l. −7]] would be orientable [Spivak [108, vol. 1, p.119, l. −8–l. −7]].
Let $\mu_p$ ($p\in S$) be the orientation for $S$; we could choose $v_p, w_p$ as in Spivak [108, vol. 1, p.119, l.7–l.9]. If $\mu_p$ were continuous on $S$, its second nonzero component would also be continuous on $S$. This would contradict the statement given in Spivak [108, vol. 1, p.95, l. −4–l. −3].

Remark 307. Spivak [108, vol. 1, p.130, Problem 1].

Proof. (a). In essence, part (a) is a topological problem dressed up in the language of metrics and manifolds. Consequently, the statement of the problem is somewhat confusing. If the author were to use the language of topology instead, then the statement would become "The topological space $(M, T = \{M, \mathbb{R}, U_2, \mathbb{R}\setminus\{0\}, \emptyset\})$ is not a Hausdorff space".
(b). Choose the radius and center coordinates of each ball to belong to $\mathbb{Q}$.
(c). Arrange in a sequence all pairs $(A_i,A_j)$ with $\bar A_i \subset A_j$ and take a sequence of $f_n : x(U) \to [0,1]$ ($n\in\mathbb{N}$ corresponds to $(A_i,A_j)$), with $\operatorname{supp} f_n \subset x(U)$ as in Pervin [90, p.159, l.12–l. −12], where
$f_n(\bar A_i) = 1$, $f_n(x(U)\setminus A_j) = 0$ ($*$) [Pervin [90, p.159, l. −12]].
By using regularity twice, we have $p \in A_i \subset \bar A_i \subset A_j \subset x(U)\setminus C$.
By ($*$), $f_n(p) = 1$ and $f_n(x(U)\cap C) = 0$ because $A_j \subset x(U)\setminus C \Rightarrow x(U)\setminus A_j \supset x(U)\cap C$.
(d). In Spivak [108, vol. 1, p.131, l.1], $f_{ij} : x_i(U_i) \to [0,1]$. In Spivak [108, vol. 1, p.131, l.3], $f_{ij} : U_i \to [0,1]$ [by identifying $U_i$ with $x_i(U_i)$]. If we consider $x_i(U_i)\cap\{x\in\mathbb{R}^n : |x| < k\}$ instead of $x_i(U_i)$, we may assume $\operatorname{supp} f_{ij}$ is compact. Then $g_{ij}\in C(M)$.
In Spivak [108, vol. 1, p.131, l.4], a bounded metric $\bar d$ on $\mathbb{R}^n$ can be defined as $\bar d(x,y) = \min\{1, |x-y|\}$.
(e). Let $p,q\in M$, $p\neq q$.
If there is a $U_i$ containing both $p$ and $q$, then by (c), there exists $f_{ij}$ such that $f_{ij}(p) = 1$ and $f_{ij}(q) = 0$.
If there is a $U_i$ such that $p\in U_i$ and $q\notin U_i$, then by the proof of (c), there exists $f_{ij}$ such that $f_{ij}(p) = 1$ and $\operatorname{supp} f_{ij}\subset U_i$. Then $g_{ij}(q) = 0$.
Remark 1. The proof of Pervin [90, p.159, Urysohn's metrization theorem] is simpler, clearer and more complete than the approach from (c) to (e).
Remark 2. The normality of $X$ [Pervin [90, p.159, l. −14]] may also follow from Pervin [90, p.81, Theorem 5.3.6; p.92, Theorem 5.5.6].

Remark 308. Spivak [108, vol. 1, pp.139–140, Problem 23].

Proof. (a). (y,e) π 1(y) ∈ ′− [(y,e) Y E, f (y)= π(e)] ⇔ ∈ × 1 [(y,e) Y E,e π− ( f (y))]. ⇔ ∈ × ∈ 1 Consequently, a vector space structure can be defined on π′− (y) by using the vector space struc- 1 ture on π− ( f (y)) [Spivak [108, vol. 1, p.139, l. 5–l. 2]]. 1 1 − − 1 1 1 Thus the fibre π− (π(e)) = π− ( f (y)) is isometric to the fibre π′− (y) of π′− ( f − (U)) [Spivak [108, vol. 1, p.139, l. 1]]. 1 − 1 1 1 1 (b). Both π′′− (y) and π′− (y) are isomorphic to π− (y), so π′′− (y) and π′− (y) are isomorphic. 1 1 1 1 1 1 1 Both π′′− ( f − (U)) and π′− ( f − (U)) are homeomorphic to f − R, so π′′− ( f − (U)) and 1 1 × π′− ( f − (U)) are homeomorphic. (c). g f = g˜ f˜. (d). See◦ Spivak◦ [108, vol. 1, p.97, l. 7]. − (e). Theg orientation of ξ 1 n t : π− (U) U R ⇒ n → × 1 n (x,ei) ∑ j=1 ai j(x)(x,e j) [by identifying π− (x) with x R (x U), see Spivak [108, vol. 1, p.116,7→ l. 5]], { }× ∈ − where U is open and connected, and detai j(x) = 0 is continuous in x U [Spivak [108, vol. 1, 6 ∈ p.117, l.1–l.4]]. There exists an open connected set V such that f (V) U. ⊂ By the isomorphism on each fibre given in Spivak [108, vol. 1, p.139, l. 1], 1 n − t′ : π′− (V) V R n→ × 1 n (y,ei) ∑ ai j( f (y))(y.e j) [by identifying π (y) with y R (y V)], 7→ j=1 ′− { }× ∈ where detai j( f (y)) = 0 is continuous in y V. 6 ∈ (f). f ([0,2π ε])(ε > 0), where f is given in Spivak [108, vol. 1, p.12, l.7], is orientable; see (d). − (g). Fix connected open set U in B. If π∗(ξ) were orientable, we could let its orientation µe =[e1, ,en], and 1 1 1 n ··· t′ : π′− (π− (U)) π− (U) R , n → × 1 n (e,ei) ∑ ai j(π(e))(e,e j) [by identifying π (e) with e R (e V), see Spivak [108, 7→ j=1 ′− { }× ∈

50 vol. 1, p.116, l. 5]]. − 1 where detai j(π(e)) = 0 would have the same sign in e π− (U) (by the compatibility condition 1 6 ∈ even if π− (U) is not connected). π˜ : π 1(e) π 1(π(e)) is a vector space isomorphism [Spivak [108, vol. 1, p.139, l. 1]], ′− → − − so the the same ordered basis would be allowed to be carried over to the following fibre in 1 π− (U). 1 n t : π− (U) U R →n × 1 n (x,ei) ∑ j=1 ai j(x)(x,e j) [by identifying π− (x) with x R (x U)], 7→ {1 }× ∈ where detai j(x) = 0 would have the same sign in e π (U). This would contradict the hypoth- 6 ∈ − esis that ξ is not orientable.

Remark 309. Spivak [108, vol. 1, p.141, Problem 26].

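Proof. (a). $T(M\times N) = \pi_M^*(TM)\oplus\pi_N^*(TN)$:
$[x\times y,(u,v)]_{(m,n)} \mapsto [((m,n),[x,u]_m),((m,n),[y,v]_n)]$.
(b). $\det\begin{pmatrix} \left(\frac{\partial y^i_{(M)}}{\partial x^j}\right) & 0 \\ 0 & \left(\frac{\partial y^i_{(N)}}{\partial x^j}\right)\end{pmatrix} = \det\left(\frac{\partial y^i_{(M)}}{\partial x^j}\right)\cdot\det\left(\frac{\partial y^i_{(N)}}{\partial x^j}\right) > 0$ [Spivak [108, vol. 1, p.120, l. −1]].
(c). By (a), $\pi_M^*(TM)$ is orientable. By Spivak [108, vol. 1, p.140, Problem 23 (g)], $TM$ is orientable.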
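The determinant step in part (b) rests on the block-diagonal form of the Jacobian of a product chart. As a sketch, writing $A$ and $B$ for the two diagonal blocks (names introduced here only for illustration, not taken from Spivak):

```latex
\[
\det\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} = \det A \cdot \det B,
\qquad
\det A > 0,\ \det B > 0
\;\Longrightarrow\;
\det\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} > 0 .
\]
```

So orientation-preserving transition maps on each factor give orientation-preserving transition maps on the product.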

1 1 Remark 310. By Jacobson [61, vol. 2, p.58, l. 11–l. 10], (y′ y′ )− =(y x )− [Spivak [108, vol. 1, − − ∗ ◦ ∗ ∗ ◦ ∗ p.150, l. 6]]. − Remark 311. ω(X) C∞ [Spivak [108, vol. 1, p.150, l. 4]]. ∈ − Proof. X : M TM n → i ∂ i ∞ p Xp = ∑ a i p, where a C [Spivak [108, vol. 1, p.112, l. 6–l. 5]]. 7→ i=1 ∂x | ∈ − − ω : M T ∗M → n j ∞ p ωp = ∑ ω j(p)dx (p), where ω j C [Spivak [108, vol. 1, p.152, l.2–l.3]]. 7→ j=1 ∈ ω(X) : M R → n i p ωp(Xp)= ∑ a ωi [Spivak [108, vol. 1, p.151, l. 4]] 7→ i=1 − C∞. ∈ Remark 312. d f is a C∞ section of T M [Spivak [108, vol. 1, p.150, l. 3]]. ∗ − n ∂ f i ∂ f ∞ Proof. d f = ∑ i dx [Spivak [108, vol. 1, p.152, l. 7]], where i C . j=1 ∂x − ∂x ∈ dxi is the dual basis of ∂ [Spivak [108, vol. 1, p.151, l. 4]]. { } { ∂xi } − The result follows by an argument similar to the proof of the statement given in Spivak [108, vol. 1, p.112, l. 6–l. 5]. − − ∞ Remark. It would be difficult to prove that d f is a C section of T ∗M if we were to use the definition d f (p)(X)= X( f ) [Spivak [108, vol. 1, p.150, l. 1]]. By the way, d f (p)(X)= X( f ) follows from O’neill [86, p.23, Definition 5.2]. − i α n ∂x ∂x′ β Remark 313. By Spivak [108, vol. 1, p.52, l.3–l.6], ∑i=1 β i = δα [Spivak [108, vol. 1, p.170, l.2]]. ∂x′ ∂x Remark 314. By Spivak [108, vol. 1, p.166, l. 4–p.167, l.1], the evaluation map is just the identity map − [Spivak [108, vol. 1, p.171, l.2]].

51 Remark 315. Bβγ given in Spivak [108, vol. 1, p.171, l. 1] satisfies Spivak [108, vol. 1, p.169, ( )]. µν − ∗ Proof. B j2 j3 = n Ai3 j2 j3 ( ), i1i2 ∑i3=1 i1i2i3 i ∗∗i i β β β ′β1β2β3 j1 j2 j3 ∂x 1 ∂x 2 ∂x 3 ∂x′ 1 ∂x′ 2 ∂x′ 3 Aα α α = ∑ i ,i ,i A α α α j j j . 1 2 3 1 2 3 i1i2i3 ∂x′ 1 ∂x′ 2 ∂x′ 3 ∂x 1 ∂x 2 ∂x 3 j1, j2, j3 B′β1β2 n A′α3β1β2 α1α2 = ∑α3=1 α1α2α3 i i i α β β j1 j2 j3 ∂x 1 ∂x 2 n ∂x 3 ∂x′ 3 ∂x′ 1 ∂x′ 2 = ∑ i ,i ,i A ∑ j j j 1 2 3 i1i2i3 ∂x′α1 ∂x′α2 α3=1 ∂x′α3 ∂x 1 ∂x 2 ∂x 3 j1, j2, j3 i i β β n i3 j2 j3 ∂x 1 ∂x 2 ∂x′ 1 ∂x′ 2 = ∑ i ,i ∑ A j j [Spivak [108, vol. 1, p.170, l.2, the second equality]] 1 2 i3=1 i1i2i3 ∂x′α1 ∂x′α2 ∂x 2 ∂x 3 j2, j3 i i β β j2 j3 ∂x 1 ∂x 2 ∂x′ 1 ∂x′ 2 = ∑ i ,i B j j [by ( )]. 1 2 i1i2 ∂x′α1 ∂x′α2 ∂x 2 ∂x 3 j2, j3 ∗∗ Remark 316. Spivak [108, vol. 1, p.174, Problem 1].

∂(g f ) 1 Proof. (a). ∂t◦i = Di(g f x− )(x(p)) [Spivak [108, vol. 1, p.46, l.9]] 1 1◦ ◦ = Di([g y− ] [y f x− ])(x(p)) m ◦ ◦ 1◦ ◦ 1 1 j = ∑ j=1 D j(g y− )((y f x− )(x(p))Di[y f x− ] (x(p)) [by the chain rule]. ◦ ◦ ◦ 1 ◦ ◦ (b). f ([x,v]p)=[y,D(y f x− )(x(p))(v)] f (p)) [Spivak [108, vol. 1, p.104, (c)]]. ∗ ◦ ◦ 0 . 1 1 1 1 . ∂(y f x− ) ∂(y f x− ) ◦∂t1◦ x ◦∂tn◦ x 0 . ◦ ···. . ◦   f ([x,(0, ,0,1,0, ,0)]p)=  . .. . 1. ∗ ··· ··· n 1 n 1   ∂(y f x− ) ∂(y f x− ) 0  ◦ 1◦ x ◦ n◦ x   ∂t ◦ ··· ∂t ◦ .  .   0   j ∂ j m ∂yk f ∂   (c). ( f ∗dy )( ∂xi p)=(dy )( f (p))∑k=1 ∂x◦i (p) ∂yk f (p) [by (b) and Spivak [108, vol. 1, p.155, l. 3; p.156, l.1]]| | −∂(y j f ) = ∂x◦i (p). j ∂ j ∂ (d). f ∗(dy )( ∂xi )=(dy )( f (p))( f ∂xi ) [Spivak [108, vol. 1, p.155, l. 3; p.156, l.1]]. j1 jn∗ − f ∗(∑ j1, , jn a j1, , jn dy dy ) ··· ··· ⊗···⊗ j1 jn = ∑ j1, , jn a j1, , jn ( f (p))( f ∗dy ) ( f ∗dy ) [Spivak [108, vol. 1, p.164, l.9]] ··· ··· j⊗···⊗ j m ∂y 1 f i m ∂y n f i a f p ◦ dx 1 p ◦ dx n p [by (c)]. = ∑ j1, , jn j1, , jn ( ( ))[∑i1=1 i1 ( )] [∑in=1 ∂xin ( )] ··· ··· ∂x ⊗···⊗ Remark 317. Spivak [108, vol. 1, p.178, Problem 6].

Proof. (a). $(f^{**}(v^{**}))(\lambda) = v^{**}(f^*\lambda) = (f^*\lambda)(v)$
$= \lambda(fv)$ [Spivak [108, vol. 1, p.148, l.10]]
$= i_W(fv)(\lambda)$ [Spivak [108, vol. 1, p.178, l. −3]].
(b). Read p.231, l. −5–p.234, l.10 in https://www.ams.org/journals/tran/1945-058-00/S0002-9947-1945-0013131-6/S0002-9947-1945-0013131-6.pdf.

Remark 318. $\phi\in C^\infty$ [Spivak [108, vol. 1, p.203, l.7]] follows from Pontryagin [91, p.20, Theorem 2; p.179, (32); p.180, (37)] [Fix $\tau$ in Pontryagin [91, p.179, (35)] as $t_0 = 0$].

Remark 319. $\chi_{*0} = I$ [Spivak [108, vol. 1, p.206, l. −2]].

Proof. Let $\alpha(t) = (t,0,\dots,0)$. $\alpha$ can be considered the identity function.
(i). $\alpha'(0) = \frac{\partial}{\partial t}\big|_0$.
(ii). $\chi_*(\alpha'(0)) = (\chi\circ\alpha)'(0)$ [O'Neill [86, p.160, Definition 5.3]].
(iii). $\chi\circ\alpha(t) = \chi(t,0,\dots,0)$
$= \phi_t(0,0,\dots,0)$ [Spivak [108, vol. 1, p.206, l.5]]
$= \phi_0(t,0,\dots,0)$
$= (t,0,\dots,0)$ [since $\phi_0 = I$]. Then
$(\chi\circ\alpha)'(0) = \frac{\partial}{\partial t}\big|_0$.
(iv). The desired result follows from (i), (ii), and (iii).

Remark 320. $\chi_*\left(\frac{\partial}{\partial t^1}\right) = X\circ\chi$ [Spivak [108, vol. 1, p.207, l.2]].

Proof. $\chi_*\left(\frac{\partial}{\partial t^1}\big|_a\right)(f) = (Xf)(\chi(a))$ [Spivak [108, vol. 1, p.206, l.11]]
$= X_{\chi(a)}(f)$ [Spivak [108, vol. 1, p.113, l.10]]
$= [X\circ\chi(a)](f)$.
The desired result follows from the equality given in Spivak [108, vol. 1, p.113, l. −2].

Remark 321. $X = \frac{\partial}{\partial x^1}$ [Spivak [108, vol. 1, p.207, l.3]].

Proof. $\frac{\partial f}{\partial x^1} = \frac{\partial (f\circ x^{-1})}{\partial t^1}\circ x$ [Spivak [108, vol. 1, p.46, l.9]].
$\frac{\partial f}{\partial x^1}\circ\chi = \frac{\partial f}{\partial x^1}\circ x^{-1} = \frac{\partial (f\circ x^{-1})}{\partial t^1} = \frac{\partial (f\circ\chi)}{\partial t^1}$
$= \chi_*\left(\frac{\partial}{\partial t^1}\right)(f)$ [Spivak [108, vol. 1, p.206, l.7]]
$= (Xf)(\chi(a))$ [Spivak [108, vol. 1, p.206, l.11]].
Note that $\chi(a)\in M$.

Remark 322. $L_X(f\cdot\omega) = Xf\cdot\omega + f\cdot L_X\omega$ [Spivak [108, vol. 1, p.209, l. −7, (4)]].

Proof. $(L_X(f\omega))(X_p) = \lim_{h\to 0}\frac{1}{h}[\phi_h^*(f\omega)(p) - (f\omega)(p)](X_p)$ [Spivak [108, vol. 1, p.207, l. −6]]
$= \lim_{h\to 0}\frac{1}{h}[f(\phi_h(p))\,\omega(\phi_h(p))(\phi_{h*}X_p) - f(p)\,\omega(\phi_h(p))(\phi_{h*}X_p)]$
$\quad + f(p)\lim_{h\to 0}\frac{1}{h}[\phi_h^*\omega(p)(X_p) - \omega(p)(X_p)]$ [Spivak [108, vol. 1, p.207, l. −4]]
$= (Xf)(p)\lim_{h\to 0}[\omega(\phi_h(p))(\phi_{h*}X_p)] + f(p)(L_X\omega)(p)(X_p)$ [Spivak [108, vol. 1, p.207, l.5; l. −6]].
By the equality given in Spivak [108, vol. 1, p.113, l. −2], we may prove that (4) is valid in a neighborhood of $p$.

Remark 323. $L_X[\omega(Y)] = (L_X\omega)(Y) + \omega(L_XY)$ [Spivak [108, vol. 1, p.209, l. −4, (5)]].

Proof. $L_X[\omega(Y)] = \lim_{h\to 0}\frac{\omega(Y)(\phi_h(p)) - \omega(Y)(p)}{h}$ [Spivak [108, vol. 1, p.207, l.5; l.10]].
$(L_X\omega)(Y_p) = \lim_{h\to 0}\frac{1}{h}[(\phi_h^*\omega)(p)(Y_p) - \omega(p)(Y_p)]$ [Spivak [108, vol. 1, p.207, l. −6]]
$= \lim_{h\to 0}\frac{1}{h}[\omega(\phi_h(p))(\phi_{h*}Y_p) - \omega(p)(Y_p)]$ [Spivak [108, vol. 1, p.207, l. −4]].
$\omega(\phi_h(p))Y_{\phi_h(p)} - \omega(\phi_h(p))(\phi_{h*}Y_p) = \omega(\phi_h(p))[Y_{\phi_h(p)} - Y_p + Y_p - \phi_{h*}Y_p]$.

Remark 324. $L_XY$ [Spivak [108, vol. 1, p.208, l.4]] is defined in terms of the integral curves $\phi_h$ of the vector field $X$, so it is defined independently of any coordinate system [Spivak [108, vol. 1, p.213, l.6–l.7]].

Remark 325. $g\in C^\infty(-\varepsilon,\varepsilon)$ [Spivak [108, vol. 1, p.213, l. −7–l. −6]].

Proof. $g(t) = \int_0^1 f'(st)\,ds$ [Spivak [108, vol. 1, p.107, l.9]].
$g'(t) = \int_0^1 f''(st)\,s\,ds$ [Rudin [99, p.27, Theorem 1.34 or p.246, Exercise 16]].
Similarly, $g''(t) = \int_0^1 f'''(st)\,s^2\,ds$. Then
($g'$ is differentiable in $(-\varepsilon,\varepsilon)$)
$\Rightarrow g'\in C(-\varepsilon,\varepsilon)$ [i.e. $g\in C^1(-\varepsilon,\varepsilon)$].

Remark 326. $\lim_{h\to 0}(Yg_h)(\phi_{-h}(p)) = (Yg_0)(p)$ [Spivak [108, vol. 1, p.214, l. −8–l. −7]].

Proof. I. By Spivak [108, vol. 1, p.213, l. −3] ["there is a function f" should have been replaced with "there is a function g"], $g\in C^\infty$. Then $\frac{\partial g}{\partial x^i}\in C^\infty$. Define

$Yg : (-\varepsilon,\varepsilon)\times U \to \mathbb{R}$,
$(h,q)\mapsto Yg_h(q)$,

where $U$ is a compact neighborhood of $p$ and $g_h(q) = g(h,q)$. By Spivak [108, vol. 1, p.107, Theorem 1], $Yg\in C^\infty((-\varepsilon,\varepsilon)\times U)$.
II. By Rudin [97, p.135, Theorem 7.11], in order to prove

limn ∞ limh 0(Ygn)(φ h(p))= limh 0 limn ∞(Ygn)(φ h(p)) [ gn can be any sequence of ghn → → − → → − { } { } satisfying hn 0], it suffices to prove that Ygn converges uniformly on [ ε/2,ε/2] U. → − × This requirement follows from I because Yg is uniformly continuous on [ ε/2,ε/2] U. − × 1 n α 1 k+1, ,an) )) Remark 327. For the precise meaning of χ(a , ,a ) = π α (φ (0, ,0,a [Spivak [108, vol. ··· a a1 ··· ··· ··· 1, p.220, l. 9]], read p.8, l. 12 in https://syafiqjohar.files.wordpress.com/ 2018/12/frobenius-1.pdf− − .

Remark 328. $(f\circ c)'(0) = D_1(f\circ\alpha_3)(0,0) + D_2(f\circ\alpha_3)(0,0)$ [Spivak [108, vol. 1, p.223, l.6]].

Proof. $(f\circ c)(t) = f\circ\alpha_3(t,t)$ [Spivak [108, vol. 1, p.222, l. −6]].
$(f\circ c)'(t) = D_1(f\circ\alpha_3)(t,t) + D_2(f\circ\alpha_3)(t,t)$ [by the chain rule].
Let $t = 0$.

Remark 329. $D_2(f\circ\alpha_3)(0,0) = D_1(f\circ\alpha_2)(0,0) + D_2(f\circ\alpha_2)(0,0)$ [Spivak [108, vol. 1, p.223, l.7]].

Proof. $\alpha_3(0,t) = \alpha_2(t,t)$ [Spivak [108, vol. 1, p.222, l. −3, (b)]]
$\Rightarrow (f\circ\alpha_3)(0,t) = (f\circ\alpha_2)(t,t)$.
$D_2(f\circ\alpha_3)(0,t) = D_1(f\circ\alpha_2)(t,t) + D_2(f\circ\alpha_2)(t,t)$ [by differentiating the two sides of the last equality with respect to $t$].
Let $t = 0$.

Remark 330. Spivak [108, vol. 1, p.223, l.11] follows from $c'(0)[f] = (f\circ c)'(0)$ [O'Neill [86, p.149, Definition 3.10]].

Remark 331. Spivak [108, vol. 1, pp.231–232, Problem 5].

Proof 1. The solution follows from Pontryagin [91, p.179, l.11–l.17]. For (a), let (t, µ,τ,ξ,x)= (t,y,0,x,α(t,y,x)). For (b), let (t, µ,τ,ξ,x)=(t,,0,x,α(t,x)).

Proof 2. (a). I. If $(\bar\alpha^1,\bar\alpha^2) = \bar\alpha : (-b,b)\times W \to V\times U$ is a flow for $\bar f$ in a neighborhood of $(y_0,x_0)$, so that
$\frac{\partial}{\partial t}\bar\alpha(t,y,x) = \bar f(t,\bar\alpha(t,y,x))$,
$\bar\alpha(0,y,x) = (y,x)$.

Proof. We may rephrase Spivak [108, vol. 1, p.194, l. −3–l. −1] as follows:
$\alpha : (-b,b)\times\bar B_a(x_0) \to U$ satisfies
$\frac{\partial\alpha(t,x)}{\partial t} = f(\alpha(t,x))$,
$\alpha(0,x) = x$.
Here $\bar B_a(x_0)$ corresponds to $W$ and $U$ corresponds to $V\times U$.
II. We can write $\bar\alpha(t,y,x) = (y,\alpha(t,y,x))$ for some $\alpha$, and $\alpha$ satisfies ($*$).
Proof. Let $\alpha(t,y,x) = \mathrm{Pr}_2(\bar\alpha(t,y,x))$. Then
$\alpha(0,y,x) = \mathrm{Pr}_2(\bar\alpha(0,y,x))$ [by definition]
$= \mathrm{Pr}_2(y,x)$ [Spivak [108, vol. 1, p.231, l. −2]]
$= x$.
$\frac{\partial}{\partial t}\alpha(t,y,x) = f(t,\bar\alpha(t,y,x))$ [Spivak [108, vol. 1, p.231, l. −3]]
$= f(t,(y,\alpha(t,y,x)))$ [Spivak [108, vol. 1, p.232, l.1]].

(b). I. By applying Pontryagin [91, p.177, (B)] to Pontryagin [91, p.178, (29)], we have $f\in C^1 \Rightarrow \alpha\in C^1(W)$ [Spivak [108, vol. 1, p.232, l.9–l.12]].
II. $D_1D_2\alpha(t,x) = D_2D_1\alpha(t,x)$
$= D_2 f(\alpha(t,x))$ [Spivak [108, vol. 1, p.232, l.7 or l. −7]]
$= \frac{\partial f}{\partial z}\frac{\partial z}{\partial x}$ [let $z = \alpha(t,x)$; assume $n = 1$]
$= D_2 f(\alpha(t,x))\,D_2\alpha(t,x)$ [since $z\in\mathbb{R}^n$, to which $D_2$ refers].

Remark 332. Spivak [108, vol. 1, p.235, Problem 8].

Proof. (a). Let $(x,U)$ be the coordinate system in a neighborhood of $p\in M$.
Let $(y,V)$ be the coordinate system in a neighborhood of $q\in N$. Then
$(x\times y, U\times V)$ is the coordinate system in a neighborhood of $(p,q)\in M\times N$.
$f(x_1,\dots,x_m,y_1,\dots,y_n)\in C^\infty$ [by hypothesis].
Fix $y_1,\dots,y_n$. Then $f(\cdot,y_1,\dots,y_n)\in C^\infty$.
(b). Read Spivak [108, vol. 1, p.207, l.5; p.113, l.11].
(c). (i). $\phi_t\in C^\infty$ [Spivak [108, vol. 1, p.203, l.7]]
$\Rightarrow \phi_{t*}\in C^\infty$ [since the matrix elements are partial derivatives of $\phi_t$]
$\Rightarrow \phi_*\in C^\infty$ [multiply the matrix and the column vector $v$].
(ii). $\lim_{h\to 0}\frac{1}{h}[(\phi_h^*\omega)(X_p) - \omega(X_p)]$
$= \lim_{h\to 0}\frac{1}{h}[\omega(\phi_h(p))(\phi_{h*}X_p) - \omega(X_p)]$ [Spivak [108, vol. 1, p.207, l. −4]]
$= \frac{\partial A}{\partial h}$, where $A = \omega(\phi_h(p))(\phi_{h*}X_p)\in C^\infty$ [(i) and Spivak [108, vol. 1, p.150, l. −4]].
(d). Read Spivak [108, vol. 1, p.208, l.4].

Remark 333. Spivak [108, vol. 1, p.246, l. −5–l. −1] follows from Lee [73, p.491, l.8–l.12].

Remark 334. Spivak [108, vol. 1, p.271, Problem 7].

∂β m j Proof. (a). I. (u,t)= ∑ t f j(ut,β(u,t)) [Spivak [108, vol. 1, p.271, l.6]]. ∂u j=1 · ∂α Proof. ∂t j (t)= f j(t,α(t)) [Spivak [108, vol. 1, p.254, ( )]]. Then ∂α ∗ ∂t j (ut)= f j(ut,α(ut))u. Namely, ∂β(u,t) ∂t j = f j(ut,β(u,t))u.

55 II. β(0,t)= x [Spivak [108, vol. 1, p.271, l.7]]. Proof. β(0,t)= α(0 t)= α(0) · = x [Spivak [108, vol. 1, p.254, l.6]]. III. We can solve such equations [Spivak [108, vol. 1, p.271, l.8–l.9]]. Proof. Treat u as t in Spivak [108, vol. 1, pp.231–232, Problem 5(a)]; treat t as y in Spivak [108, vol. 1, pp.231–232, Problem 5(a)]. By Spivak [108, vol. 1, p.232, l.2], we may obtain the solution β. IV. If we choose ε as the b in Spivak [108, vol. 1, p.231, l. 5], then this ε is for all t W [Spivak − ∈ [108, vol. 1, p.271, l.9–l.10]]. (b). I. By the natural version of Spivak [108, vol. 1, p.194, Theorem 2], there exists a unique ∂β m j solution β(u,t) for the ODE ∂u (u,t)= ∑ j=1 t f j(ut,β(u,t)), and the initial conditions β(0,t)= x. · II. We may assume ε = 1 [Spivak [108, vol. 1, p.271, l. 8–l. 7]]. − − Proof. Let S : ( 1,1) εW ( ε,ε) W −1 × → − × (u1,t1) (εu1,t /ε)=(u,t) and β1 = β S. Then ∂β1 7→ m j ◦ (u1,t1)= ∑ t f j(u1t1,β1(u1,t1)). ∂u1 j=1 1 · ∂β ∂β(1,vt) (c). ∂t j (v,t)= ∂t j [by (b)] = v ∂β (1,vt) [by the chain rule]. · ∂(vt j) ∂β (d). ∂t j (v,t)= f j(vt,β(v,t)).

∂ j ∂ f j i ∂ f j n ∂ f j j k i k Proof. I. ∂v [v f j(vt,β(v,t))] = f j + v[t ∂ vt j + ∑i= j t ∂ vti + ∑k=1 k (t f j + ∑i= j t fi )] · ( ) 6 ( ) ∂x 6 j ∂ f j i ∂ fi j n ∂ f j k i n ∂ fi k = f j +vt ∂ vt j +v∑i= j t ∂ vt j +vt ∑k=1 xk f j +v∑i= j t ∑k=1 xk f j [Spivak [108, vol. 1, p.254, ( ) 6 ( ) ∂ 6 ∂ ( )]]. ∗∗∂ ∂β ∂ ∂β II. ∂v ∂t j (v,t)= ∂t j ∂v (v,t) ∂ m i = ∂t j ∑i=1 t fi(vt,β(v,t)) [Spivak [108, vol. 1, p.271, l.6]] · k k j ∂ f j j n ∂ f j ∂β (v,t) i ∂ fi n ∂ fi ∂β (v,t) = f j +t v ∂ vt j +t ∑k=1 xk ∂t j + ∑i= j t [ ∂ vt j v + ∑k=1 xk ∂t j ]. ( ) ∂ 6 ( ) ∂ ∂β(0,t) ∂β III. ∂t j = 0 ∂(vt j) (1,0 t) [by (c)]. · n · f j(0t,β(v,t)) R [Spivak [108, vol. 1, p.254, l.2]]. ∈ (e). I. α(vt)= β(1,vt) [by definition] = β(v,t) [by (b)]. II. α(0)= β(1,0) [by definition] = β(1,0t) = β(0,t) [by (b)] = x [Spivak [108, vol. 1, p.271, l.7]]. ∂β(1,t) III. ∂t j = f j(t,β(1,t)) [by (d)]. Consequently, for all t W [Spivak [108, vol. 1, p.271, l.10]], ∂α(t) ∈ ∂t j = f j(t,α(t)). Remark 335. Since dimΩn(Rn)= 1 [Spivak [108, vol. 1, p.279, Theorem 3]], of any two alternating n-linear functions on Rn, one is a multiple of the other [Spivak [108, vol. 1, p.274, l.6–l.7]].

Remark 336. By the equality given in Lee [73, p.353, l. −3], every $\omega\in\Omega^k(\mathbb{R}^n)$ is a linear combination of the functions $(v_1,\dots,v_k)\mapsto\det$ of a $k\times k$ minor of $\begin{pmatrix} v_1\\ \vdots\\ v_k\end{pmatrix}$ [Spivak [108, vol. 1, p.280, l. −5–l. −3]].

Remark 337. A clearer version of Spivak [108, vol. 1, p.281, Corollary 6] is given in Lee [73, p.379, Proposition 15.3].

Remark 338. $f^*$ is $C^\infty$ [Spivak [108, vol. 1, p.282, l.7]].

Proof. Let $(x,U)$ be a coordinate neighborhood of $p\in M$; $(y,V)$ be a coordinate neighborhood of $f(p)\in N$;
$dx^1,\dots,dx^n$ be a basis for $M_p^*$; and $dy^1,\dots,dy^m$ be a basis for $N^*_{f(p)}$.
By Spivak [108, vol. 1, p.174, Problem 1(c)], $f^*$ can be expressed as an $m\times n$ matrix whose elements are $C^\infty$.

Remark 339. The argument given in p.80, l. −19 & l. −8–l. −4 in https://www.ime.usp.br/~gorodski/teaching/mat5799-2013/ch4.pdf gives a neat proof of Spivak [108, vol. 1, p.284, Theorem 9]. By Lee [73, p.359, Exercise 14.14] and p.4, Lemma 7.2 in http://homepages.warwick.ac.uk/~masgak/manifolds/notes4.pdf, "$\Omega^n(TM)$ has a nowhere 0 section implies that $\Omega^n(TM)$ is trivial" [Spivak [108, vol. 1, p.285, l.10–l.11]]. "If $\Omega^n(TM)$ is trivial, then $\Omega^n(TM)$ has a nowhere 0 section" follows from the equality given in p.80, l. −7 of https://www.ime.usp.br/~gorodski/teaching/mat5799-2013/ch4.pdf. The proof method of Spivak [108, vol. 1, p.284, Theorem 9] can be used to prove the general statement given in Spivak [108, vol. 1, p.285, l.12–l.15].

Remark 340. $H^2(T)\approx\mathbb{R}$ [Spivak [108, vol. 1, p.577, l. −1]] follows from Tu [117, p.303, l. −17–l. −14].

Remark 341. $xz - y^2 = 0$ is a cone [Struik [109, p.4, l. −11–l. −9]].

Proof. The equation of an elliptic cone: $\frac{X^2}{a^2} + \frac{Y^2}{b^2} - \frac{Z^2}{c^2} = 0$ [R. E. Johnson, F. L. Kiokemeister & E. S. Wolk, Calculus, 2nd ed., Boston: Allyn and Bacon, 1971, p.516, 7.17]. Let $y = \frac{Y}{b}$, $x = \frac{Z}{c} + \frac{X}{a}$, $z = \frac{Z}{c} - \frac{X}{a}$.

Remark 342. (Fundamental theorem for space curves) The proof given in Struik [109, p.29, l. −10–p.31, l. −1] is better than that given in Blaga [6, p.72, l.6–p.73, l. −1].

Remark 343. $dv = d\varphi = 0$ [Struik [109, p.60, l.13]] should be corrected to $\delta v = \delta\varphi = 0$.
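The substitution in the proof of Remark 341 can be checked by direct expansion. The following worked step uses only the symbols already introduced there ($X, Y, Z, a, b, c$):

```latex
\begin{align*}
xz - y^2
  &= \Bigl(\frac{Z}{c} + \frac{X}{a}\Bigr)\Bigl(\frac{Z}{c} - \frac{X}{a}\Bigr) - \frac{Y^2}{b^2} \\
  &= \frac{Z^2}{c^2} - \frac{X^2}{a^2} - \frac{Y^2}{b^2} \\
  &= -\Bigl(\frac{X^2}{a^2} + \frac{Y^2}{b^2} - \frac{Z^2}{c^2}\Bigr).
\end{align*}
```

Hence $xz - y^2 = 0$ holds exactly when the elliptic-cone equation holds.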

Remark 344. $x\cdot a'(v_1) + p'(w_1) = 0$ [Struik [109, p.66, l. −13]].

Proof. Let $f(u) = x\cdot a(u)$. Then $f'(u) = x\cdot a'(u)$.

Remark 345. On a cylinder the striction line is indeterminate [Struik [109, p.193, l.4–l.5]].

Proof. For a cylinder $X_t$ is a constant vector [Hicks [55, p.32, Fig. 2.5]; Struik [109, p.66, l. −3]]. Consequently, $\bar D_T X = 0$. Thus, $a = 0 = c$ [Hicks [55, p.33, l.8]].
$s = \frac{a}{a^2+c^2}$ [Hicks [55, p.34, l.5]]
$= \frac{0}{0}$.

Remark 346. The statement that the first three derivatives of $\mathbf{s}$ and $R^2$ with respect to $s$ vanish [Weatherburn [124, vol. 1, p.22, l.6–l.7]] can be justified from the following viewpoint: It is true that $\mathbf{s}$ and $R^2$ depend on $s$ if $P$ varies, but here we consider only the fact that the osculating sphere has contact of order three with the given curve at $P$. Thus, $P$ is fixed; $\mathbf{s}$ and $R^2$ should be treated as constants.

Remark 347. By Kreyszig [66, p.34, l. −14], the curvature of the helix does not vanish [Weatherburn [124, vol. 1, p.26, l. −8]].

Remark 348. By O'Neill [86, p.61, Corollary 3.5; p.62, Lemma 3.6], the cylinder on which the helix is drawn is a circular cylinder [Weatherburn [124, vol. 1, p.28, l. −14–l. −13]].

Remark 349. The point $\mathbf{r}_1$ lies in the normal plane to the given curve [Weatherburn [124, vol. 1, p.32, l.13–l.14]].

Proof. $[\mathbf{r}_1(s),\mathbf{r}(s)]$ lies on the tangent line at $\mathbf{r}_1(s)$, which is perpendicular to $\mathbf{t}(s)$ [Kreyszig [66, p.52, l.18]]. Since the normal plane at $\mathbf{r}(s)$ is perpendicular to $\mathbf{t}(s)$ [Kreyszig [66, p.32, Fig. 10.2]], $[\mathbf{r}_1(s),\mathbf{r}(s)]$ lies on the normal plane at $\mathbf{r}(s)$.

Remark. The definition of involute given in Kreyszig [66, p.52, l.18] is original. http://mathworld.wolfram.com/Involute.html uses the consequential property [Kreyszig [66, p.52, (15.2); l. −4–l. −1]; Weatherburn [124, vol. 1, p.30, l.2–l.4]] of the above definition as the definition. The former definition is a simple characteristic property, while the latter definition provides one procedure of construction. When considering the converse problem [Kreyszig [66, p.53, l. −4–l. −1]], we would like to choose the former definition because fewer steps need to be reversed. See the above proof. If we use the former definition, there is a single infinitude of evolutes [Weatherburn [124, vol. 1, p.33, l. −1–p.34, l.1]]. If we use the latter definition, the curve has a unique evolute [http://mathworld.wolfram.com/Involute.html]. This is because the horizontal line segment given in Weatherburn [124, vol. 1, p.30, Fig. 6] must be in the direction of the principal normal at $\mathbf{r}_1$ [$\tilde c$ in Kreyszig [66, p.54, (15.6)] must be $\frac{\pi}{2}$]. Thus, the latter definition destroys the symmetry between involutes and evolutes.

Remark 350. $F(a,b) + \frac{\partial}{\partial a}F(a,b) + \frac{\partial}{\partial b}F(a,b) = 0$ [Weatherburn [124, vol. 1, p.48, l.16]] follows from the Taylor formula [Struik [109, p.55, (1-2)]] just as Struik [109, p.66, (4-2)] follows from Rolle's theorem.

Remark 351. $h$ has the same sign as the principal radii of curvature, $\alpha$ and $\beta$ [Weatherburn [124, vol. 1, p.74, l. −9–l. −8]].

Proof. The side of a tangent plane at $P$ is positive if it is the side to which the surface normal at $P$ points. The principal radius of curvature is positive if the corresponding centre of curvature is located on the positive side of the tangent plane.

Remark 352. If the lines of curvature are taken as parameter curves, then the conjugate directions are parallel to conjugate diameters of the indicatrix [Weatherburn [124, vol. 1, p.81, l.14–l. −11]].

Proof 1. The statement follows from Weatherburn [124, vol. 1, p.75, l.1–l.2].

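Proof 2. By §Parametric representation in https://en.wikipedia.org/wiki/Ellipse, $\left(\frac{\cos\theta}{\sqrt{\alpha}},\frac{\sin\theta}{\sqrt{\beta}}\right) : \left(\frac{\cos\theta'}{\sqrt{\alpha}},\frac{\sin\theta'}{\sqrt{\beta}}\right)$ are the directional cosines of conjugate diameters of the indicatrix [Weatherburn [124, vol. 1, p.75, l.1–l.2]].
$\left(\frac{\cos\theta}{\sqrt{\alpha}},\frac{\sin\theta}{\sqrt{\beta}}\right)\cdot\left(\frac{\cos\theta'}{\sqrt{\alpha}},\frac{\sin\theta'}{\sqrt{\beta}}\right) = 0$ [Bell [2, p.115, l. −1]].

Remark 353. Weatherburn [124, vol. 1, p.86, (28)] is equivalent to Weatherburn [124, vol. 1, p.86, (31)].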

Proof. $\Leftarrow$: $\frac{\partial^2}{\partial u\,\partial v}(\log E - \log G) = 0$.
Let $f(u,v) = \frac{\partial^2}{\partial u\,\partial v}\log E$. Then $f(u,v) = \frac{\partial^2}{\partial u\,\partial v}\log G$.
Because $\frac{\partial^2}{\partial v\,\partial u}\left(\log E - \iint f(u,v)\,du\,dv\right) = 0$,
$E e^{-\iint f(u,v)\,du\,dv}$ is a function of $u$ only.
Similarly, $G e^{-\iint f(u,v)\,du\,dv}$ is a function of $v$ only.
Let $\lambda = e^{\iint f(u,v)\,du\,dv}$.

Remark 354. Weatherburn [124, vol. 1, p.103, Ex.3] follows from Weatherburn [124, vol. 1, p.102, (7′)] [Bell [2, p.365, Ex. 1]] and Bell [2, p.365, Ex. 3].

Remark 355. Of all geodesics through a given point, those which bisect the angles between the lines of curvature have the greatest torsion [Weatherburn [124, vol. 1, p.104, l. −12–l. −10]] if $\kappa_b > \kappa_a$ [Weatherburn [124, vol. 1, p.104, (12)]].

Remark 356. In order to justify the statement given in Weatherburn [124, vol. 1, p.109, l.14–l.16], one should read Kreyszig [66, p.138, Fig. 42.1; (42.3)(b)].

Remark 357. Weatherburn [124, vol. 1, p.112, Example (1)].

Proof. I. The statements given in this example are somewhat sloppy, so we should clarify their precise meanings. E,F,G refer to the patch r(u,v) on surface S. Curve φ(u,v)= 0 refers to the point set r(u,v) S φ(u,v)= 0 . { ∈ | } By Weatherburn [124, vol. 1, p.119, (20)], 2 2 2 2 kg = Hu (v + λu + 2µu v + νv ) Hv (u + lu + 2mu v + nv ) [Weatherburn [124, vol. 1, ′ ′′ ′ ′ ′ ′ − ′ ′′ ′ ′ ′ ′ p.112, l.12–l.13]]. Want to prove u′ = v′ = 1 [Weatherburn [124, vol. 1, p.112, l.4]]. φ2 φ1 Θ − 2 2 2 2 u′ v′ u′v′ Eu′ +2Fu′v′+Gv′ Proof. 2 = 2 = = 2 2 . φ φ φ1φ2 Eφ 2Fφ φ +Gφ 2 1 − 2 1 2 1 2 2 − dr 2 Eu′ + 2Fu′v′ + Gv′ =(r1u′ + r2v′ =( ds ) = 1.

∂Θ Fφ Gφ (F1φ2+Fφ12 G1φ1 Gφ11)Θ (Fφ2 Gφ1) ∂ 2 1 − − − − ∂u II. ∂u ( −Θ )= Θ2 . ∂Θ Fφ Eφ (F2φ1+Fφ12 E2φ2 Eφ22)Θ (Fφ1 Eφ2) ∂ 1 2 − − − − ∂v ∂v ( −Θ )= Θ2 , where 2 2 Θ = Eφ 2Fφ1φ2 + Gφ . 2 − 1 ∂Θ 1 2 2 1/2 2 2 =q(Eφ 2Fφ1φ2 + Gφ ) (E1φ + 2Eφ2φ21 2F1φ1φ2 2Fφ11φ2 2Fφ1φ21 + G1φ + ∂u 2 2 − 1 − 2 − − − 1 2Gφ1φ11). ∂Θ 1 2 2 1/2 2 2 = (Eφ 2Fφ1φ2 + Gφ ) (E2φ + 2Eφ2φ22 2F2φ1φ2 2Fφ12φ2 2Fφ1φ22 + G2φ + ∂v 2 2 − 1 − 2 − − − 1 2Gφ1φ12). ∂ Fφ2 Gφ1 III. ∂u ( −Θ )= 2 2 1 2 2 (F1φ2+Fφ12 G1φ1 Gφ11)(Eφ 2Fφ1φ2+Gφ ) (Fφ2 Gφ1)(E1φ +2Eφ2φ21 2F1φ1φ2 2Fφ11φ2 2Fφ1φ21+G1φ +2Gφ1φ11) − − 2 − 1 − 2 − 2 − − − 1 Θ3 . 3 ∂ Fφ2 Gφ1 3 2 2 2 2 2 2 Θ ∂u ( −Θ )=(F1Eφ2 2F1Fφ1φ2 +F1Gφ1 φ2 +FEφ2 φ12 2F φ1φ2φ12 +FGφ1 φ12 G1Eφ1φ2 + 2 3 − 2 2 2 − − 2FG1φ φ2 G1Gφ GEφ φ11 + 2FGφ1φ2φ11 G φ φ11) 1 − 1 − 2 − 1

59 1 3 2 2 2 2 2 1 2 1 2 ( 2 FE1φ2 +FEφ2 φ21 FF1φ1φ2 F φ2 φ11 F φ1φ2φ21 + 2 FG1φ1 φ2 +FGφ1φ2φ11 2 GE1φ1φ2 − 2 − − 2− 1 3 2 2 − − GEφ1φ2φ21 + GF1φ1 φ2 + GFφ1φ2φ11 + GFφ1 φ21 2 GG1φ1 G φ1 φ11). ∂ Fφ1 Eφ2 − − ∂v ( −Θ )= 2 2 1 2 2 (F2φ1+Fφ12 E2φ2 Eφ22)(Eφ 2Fφ1φ2+Gφ ) (Fφ1 Eφ2)(E2φ +2Eφ2φ22 2F2φ1φ2 2Fφ12φ2 2Fφ1φ22+G2φ +2Gφ1φ12) − − 2 − 1 − 2 − 2 − − − 1 Θ3 . 3 ∂ Fφ1 Eφ2 2 2 3 2 2 2 3 Θ ∂v ( −Θ )=(F2Eφ1φ2 2F2Fφ1 φ2 +F2Gφ1 +FEφ2 φ12 2F φ1φ2φ12 +FGφ1 φ12 E2Eφ2 + 2 2 2 −2 2 − 1 2 − 2 2E2Fφ1φ2 E2Gφ1 φ2 E φ2 φ22 +2EFφ1φ2φ22 EGφ1 φ22) ( 2 FE2φ1φ2 +FEφ1φ2φ22 FF2φ1 φ2 2 − 2 2 −1 3 2 1 − 3 2 −2 2 2 − − F φ1φ2φ12 F φ1 φ22 + 2 FG2φ1 +FGφ1 φ12 2 EE2φ2 E φ2 φ22 +EF2φ1φ2 +EFφ2 φ12 +EFφ1φ2φ22 1 2 − − − − 2 EG2φ1 φ2 EGφ1φ2φ12). 3 ∂ Fφ2 −Gφ1 ∂ Fφ1 Eφ2 Θ [ ∂u ( −Θ )+ ∂v ( −Θ )] 3 2 2 2 1 3 1 3 2 1 2 = F1Eφ2 2F1Fφ1φ2 G1Eφ1φ2 +2FG1φ1 φ2 2 G1Gφ1 ( 2 FE1φ2 FF1φ1φ2 + 2 FG1φ1 φ2 1 −2 2 − 3 1 3 − 2 − 2 − 1 2 1 3 − 2 GE1φ1φ2 ) F2Fφ1 φ2 +F2Gφ1 2 E2Eφ2 +2E2Fφ1φ2 E2Gφ1 φ2 ( 2 FE2φ1φ2 + 2 FG2φ1 1 2 − 2 2 − 2 − − − 2 EG2φ1 φ2) H (φ11φ2 2φ1φ2φ12 + φ22φ1 ) 3 3− 2− 2 2 1 3 1 3 2 1 2 = Θ [F1Eu′ +2F1Fv′u′ +G1Ev′u′ +2FG1v′ u′ + 2 G1Gv′ ( 2 FE1u′ +FF1v′u′ + 2 FG1u′v′ + 1 2 2 3 1 3 2 − 2 1 2 1 3 2 GE1v′u′ ) F2Fv′ u′ F2Gv′ 2 E2Eu′ 2E2Fv′u′ E2Gv′ u′ ( 2 FE2v′u′ 2 FG2v′ 1 2 − 2 2 − 2 − − 2 − − − − − 2 EG2v′ u′)] Θ H (φ11u′ + 2v′u′φ12 + φ22v′ ) [Weatherburn [124, vol. 1, p.112, l.4]]. 2 − 2 φ11u + 2v u φ12 + φ22v = Θ(u v v u ) [Weatherburn [124, vol. 1, p.112, l.9]]. ′ ′ ′ ′ − ′ ′′ − ′ ′′ Remark 358. Weatherburn [124, vol. 1, p.114, Ex.].

Proof. I. In order to fully understand the problem given in Weatherburn [124, vol. 1, p.114, Ex.], we should define envelope and edge of regression rigorously as in Kreyszig [66, p.254, l.17; p.256, l. −4].
Draw tangent lines to the geodesics at $P$ and at its consecutive (forward or backward) point on $C$ (suppose $C$ has a sense). These two consecutive tangent lines determine a plane. These planes constitute a one-parameter family of planes (we may use the arc length $s$ of $C$ as the parameter). The above tangent lines are their characteristics. The characteristics (generators) constitute an envelope (a developable surface, as said in Weatherburn [124, vol. 1, p.114, l. −18]). The intersection of two consecutive characteristics is the point corresponding to $P$ on the edge of regression of the envelope (note that the intersection point is not on $C$).
II. By Weatherburn [124, vol. 1, p.42, l.4], the generators are the tangents to the edge of regression [Weatherburn [124, vol. 1, p.115, l.4]].
III. $0 = Gv' + r\,\mathbf{r}_1\cdot\mathbf{r}_{12}\,v'$ [Weatherburn [124, vol. 1, p.115, l.7]] because $\mathbf{r}_2\cdot\mathbf{r}_{11} = 0$ follows from Weatherburn [124, vol. 1, p.90, l.13].
IV. $F = 0$, $E = 1$ [Weatherburn [124, vol. 1, p.115, l.8]] follow from Weatherburn [124, vol. 1, p.114, l. −11].
V. $v'$ is not zero [Weatherburn [124, vol. 1, p.115, l.9]].

Proof. Assume that $v'(s_0) = 0$. Then the tangent line to $C$ at $C(s_0)$ and the tangent line to the geodesic at $C(s_0)$ would coincide. Their directions would not be conjugate. In addition, the tangent plane at the forward consecutive point of $C(s_0)$ would be the same as that at the backward consecutive point of $C(s_0)$. The two tangent planes would not intersect in a line. This would contradict the definition of conjugate directions given in Weatherburn [124, vol. 1, p.80, l.6–l.7].

Remark 359. For Willmore [126, p.24, Exercise 8.1], we let $\alpha(u) = (ae^u\cos u, ae^u\sin u, be^u)$ and $s(u) = \int_{-\infty}^{u}\|\alpha'\|$ [O'Neill [86, p.51, l. −7]]. Then $s = (2a^2+b^2)^{1/2}e^u$. The desired results follow from O'Neill [86, p.69, Theorem 4.3].

Remark 360. Willmore [126, p.61, l.11–l.20].

Proof. I. Let $T$ be defined as in Willmore [126, p.59, (11.1)].
($\frac{\partial T}{\partial v} = 0$, $V = 0$) $\Rightarrow$ ($\frac{\partial T}{\partial v'} = Fu' + Gv' = $ constant $\alpha$) [Willmore [126, p.60, (11.2)(ii)]], i.e.,
$F\,du + G\,dv = \alpha\,ds$. Squaring both sides, we have
$F^2du^2 + 2FG\,du\,dv + G^2dv^2 = \alpha^2ds^2 = \alpha^2(E\,du^2 + 2F\,du\,dv + G\,dv^2)$ ($*$). Then
$\frac{dv}{du} = \frac{FG - \alpha^2F \pm (\alpha^4F^2 - \alpha^4EG - \alpha^2F^2G + \alpha^2G^2E)^{1/2}}{\alpha^2G - G^2}$.
II. If $G = \alpha^2$, then $u = $ constant $c$ [by I, ($*$)]. Then
$V = 0$ [Willmore [126, p.60, l.9]]. Thus, to see whether $u = c$ is a geodesic, it suffices to check whether $U = 0$.
By I, $Fu' + Gv' = \alpha$, so $v' = \alpha/G$.
$\frac{\partial T}{\partial u'} = 0$, $\frac{\partial T}{\partial u} = \frac{\alpha^2G_1}{2G^2}$.
($u = c$ is a geodesic) $\Leftrightarrow$ $G_1(c) = 0$ [Willmore [126, p.60, (11.2)(i)]].
III. ($F^2/E$ is a constant) $\Leftrightarrow$ $2EF_1 = FE_1$
$\Leftrightarrow$ ($v = c$ is a geodesic) [Willmore [126, p.58, (10.7)]].

Remark 361. $\tan\alpha = Hv'/u'$ [Willmore [126, p.62, l.5]] follows from Willmore [126, p.42, (6.3)]. $(\dot{\mathbf{r}}\times\ddot{\mathbf{r}})\cdot(\mathbf{r}_1\times\mathbf{r}_2) = (\dot{\mathbf{r}}\cdot\mathbf{r}_1)(\ddot{\mathbf{r}}\cdot\mathbf{r}_2) - (\dot{\mathbf{r}}\cdot\mathbf{r}_2)(\ddot{\mathbf{r}}\cdot\mathbf{r}_1)$ [Willmore [126, p.63, l. −18]] follows from O'Neill [86, p.209, Ex. 6]. The fact that $\mathbf{r}''$ is in the direction of $\ddot{\mathbf{r}}$ [Willmore [126, p.64, l.2]] follows from O'Neill [86, p.69, l.9; p.66, l.16].
Let ${}^{(w)}\Gamma_{ijk}$ be defined as in Willmore [126, p.64, (12.5)]; ${}^{(k)}\Gamma_{jki}$ be defined as in Kreyszig [66, p.127, (38.3)]. Then ${}^{(w)}\Gamma_{ijk} = {}^{(k)}\Gamma_{jki}$.
${}^{(w)}\Gamma^1_{jk} = H^{-2}(G\,{}^{(w)}\Gamma_{1jk} - F\,{}^{(w)}\Gamma_{2jk})$ [Willmore [126, p.65, (12.9)(i)]]
$= H^{-2}(G\,{}^{(k)}\Gamma_{jk1} - F\,{}^{(k)}\Gamma_{jk2}) = g^{11}\,{}^{(k)}\Gamma_{jk1} + g^{12}\,{}^{(k)}\Gamma_{jk2}$ [Kreyszig [66, p.113, (34.2)]]
$= {}^{(k)}\Gamma^1_{jk}$ [Kreyszig [66, p.127, (38.5)]]. Similarly, ${}^{(w)}\Gamma^2_{jk} = {}^{(k)}\Gamma^2_{jk}$.

Remark 362. For a proof of Willmore [126, p.68, Exercise 13.2], see https://conversationofmomentum.wordpress.com/2014/09/07/geodesics-on-a-cone/.
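Returning to step I of the proof of Remark 360, the formula for $dv/du$ can be recovered by treating ($*$) as a quadratic. The sketch below assumes $G^2 \neq \alpha^2 G$ and writes $q$ for $dv/du$; the symbol $q$ is shorthand introduced here only for this check.

```latex
% q := dv/du (shorthand used only in this check)
\begin{align*}
&(G^2 - \alpha^2 G)\,q^2 + 2(FG - \alpha^2 F)\,q + (F^2 - \alpha^2 E) = 0, \\
&(FG - \alpha^2 F)^2 - (G^2 - \alpha^2 G)(F^2 - \alpha^2 E)
   = \alpha^4 F^2 - \alpha^4 EG - \alpha^2 F^2 G + \alpha^2 G^2 E, \\
&q = \frac{FG - \alpha^2 F \pm
      (\alpha^4 F^2 - \alpha^4 EG - \alpha^2 F^2 G + \alpha^2 G^2 E)^{1/2}}
     {\alpha^2 G - G^2}.
\end{align*}
```

The first line is ($*$) divided by $du^2$ with all terms moved to one side; the second is one quarter of its discriminant; the third is the quadratic formula with numerator and denominator multiplied by $-1$.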

Remark 363. $\kappa_g = H(u'\mu - v'\lambda)$ [Willmore [126, p.71, (15.5)]].

Proof. $\sin(-\frac{\pi}{2}) = \frac{H(u'\mu - v'\lambda)}{|\kappa_g|}$ [Willmore [126, p.42, (6.3); p.71, l.11]; Kreyszig [66, p.138, Fig. 42.1]].
$\kappa_g = -|\kappa_g|$ [Kreyszig [66, p.138, Fig. 42.1]].

Remark 364. By Willmore [126, p.42, (6.2)], $2T = 1$ [Willmore [126, p.72, l.8]].

Remark 365. For a developable surface, the central point of a generator may be interpreted as the point of contact of the generator with the edge of regression, and the edge of regression itself becomes the line of striction [Willmore [126, p.111, l.9–l.12]].

Proof. The first statement follows from Weatherburn [124, vol.1, p.42, Fig. 8]. For a cone, both the edge of regression and the line of striction are the vertex of the cone; for a cylinder, both the edge of regression and the line of striction are $\emptyset$. Thus, all we have to do is focus on tangential developables [Struik [109, p.67, l. −16–l. −15]]. This surface is generated by the tangent lines to a space curve $x(s)$ [Struik [109, p.64, l.1–l.2]]. This tangent line is the intersection (i.e. characteristic) of the consecutive osculating planes [Weatherburn [124, vol. 1, p.12, l.5–l.6; p.45, l.15–l.16]]. If we interpret tangent lines as generators, the intersection of two consecutive tangent lines (i.e. the intersection of three consecutive osculating planes) is the central point [because the distance between the two tangent lines is 0]. If we interpret tangent lines as characteristics, the intersection of two consecutive characteristics is a characteristic point [Struik [109, p.67, l. −8]]. Thus, the line of striction is the edge of regression [i.e. $x(s)$].

Remark 366. Willmore [126, p.111, Exercise 8.1].

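Proof. $\dot{b}\cdot\dot{r} + v\dot{b}^2 = 0$ [Willmore [126, p.109, (8.9)]].
$\dot{b} = -\tau n$ [Willmore [126, p.14, (4.13)]]. Since $\dot{r} = t$ is perpendicular to $n$, $\dot{b}\cdot\dot{r} = 0$. Consequently, $v = 0$.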

Remark 367. Willmore [126, p.111, Exercise 8.2].

Proof. $p = \frac{[t, n, \dot{n}]}{|\dot{n}|^2}$ [Willmore [126, p.108, (8.6)]]
$= \frac{b\cdot\dot{n}}{|\dot{n}|^2}$ [Willmore [126, p.12, (4.11)]]
$= \frac{\tau}{\kappa^2 + \tau^2}$ [Willmore [126, p.14, (4.13)]].

Remark 368. Willmore [126, p.130, l.9–l.10, (3.2)] follows from Weatherburn [124, vol. 1, §41, (1)–(4); §43, (7) & (8)].

2. Partial differential equations (Courant–Hilbert [21]; Sneddon [105]; Sneddon [106])

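Remark 1. $ds^2 = \sum_n h_n^2 (d\xi_n)^2$, where $h_n^2 = (\frac{\partial x}{\partial\xi_n})^2 + (\frac{\partial y}{\partial\xi_n})^2 + (\frac{\partial z}{\partial\xi_n})^2 = [(\frac{\partial\xi_n}{\partial x})^2 + (\frac{\partial\xi_n}{\partial y})^2 + (\frac{\partial\xi_n}{\partial z})^2]^{-1}$ [Feshbach–Morse [32, part 1, p.115, l.13–l.14]].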

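Proof. Let $A = \begin{pmatrix} \frac{\partial x}{\partial\xi_1} & \frac{\partial x}{\partial\xi_2} & \frac{\partial x}{\partial\xi_3} \\ \frac{\partial y}{\partial\xi_1} & \frac{\partial y}{\partial\xi_2} & \frac{\partial y}{\partial\xi_3} \\ \frac{\partial z}{\partial\xi_1} & \frac{\partial z}{\partial\xi_2} & \frac{\partial z}{\partial\xi_3} \end{pmatrix}$.
Then $A$ is an orthogonal matrix (in the sense that its columns are mutually orthogonal).
$ds^2 = \begin{pmatrix} dx & dy & dz \end{pmatrix}\begin{pmatrix} dx \\ dy \\ dz \end{pmatrix} = \begin{pmatrix} d\xi_1 & d\xi_2 & d\xi_3 \end{pmatrix} A^tA \begin{pmatrix} d\xi_1 \\ d\xi_2 \\ d\xi_3 \end{pmatrix}$.
$A^tA = \mathrm{diag}\{(\frac{\partial x}{\partial\xi_1})^2 + (\frac{\partial y}{\partial\xi_1})^2 + (\frac{\partial z}{\partial\xi_1})^2,\ (\frac{\partial x}{\partial\xi_2})^2 + (\frac{\partial y}{\partial\xi_2})^2 + (\frac{\partial z}{\partial\xi_2})^2,\ (\frac{\partial x}{\partial\xi_3})^2 + (\frac{\partial y}{\partial\xi_3})^2 + (\frac{\partial z}{\partial\xi_3})^2\}$.

Remark. By O’Neill [86, p.148, Lemma 3.8], $(\frac{\partial\xi_1}{\partial x}, \frac{\partial\xi_1}{\partial y}, \frac{\partial\xi_1}{\partial z})$ is the normal vector field on the surface $\xi_1(x,y,z) = c$.

Remark 2. $\frac{\partial(u,g)}{\partial(x,y)} = 0 \Leftrightarrow u = w[g(x,y)]$ [Courant–Hilbert [21, vol. 2, p.5, l.5–l.7]].

Proof. $\Leftarrow$: $\begin{vmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{vmatrix} = \begin{vmatrix} w'\frac{\partial g}{\partial x} & w'\frac{\partial g}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{vmatrix} = 0$.
$\Rightarrow$: $0 = (\frac{\partial g}{\partial x}, \frac{\partial g}{\partial y})\cdot(\frac{\partial u}{\partial y}, -\frac{\partial u}{\partial x})$.
The latter vector is perpendicular to $\nabla u$, so $\nabla u$ is parallel to $\nabla g$.
By O’Neill [86, p.148, Lemma 3.8], $\nabla g$ is perpendicular to $g(x,y) = C_1$, where $C_1$ is a constant. Consequently, $\nabla u$ is perpendicular to $g(x,y) = C_1$.
Express $g(x,y) = C_1$ as a curve of parameter $t$: $\alpha(t) = (\alpha_1(t), \alpha_2(t))$. Then
$\frac{d}{dt}u(\alpha(t)) = \nabla u\cdot(\alpha_1'(t), \alpha_2'(t)) = 0$.
Therefore, $u(\alpha(t)) = C_2$, where $C_2$ is a constant. Let $w(C_1) = C_2$. Then $u(x,y) = u(\alpha(t)) = C_2 = w(C_1) = w[g(x,y)]$.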
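The direction $\Leftarrow$ of Remark 2 can be checked numerically. The following is a minimal sketch, not part of the cited texts: the functions `g`, `u_dependent` (with $w = \sin$), `u_independent` and the sample points are illustrative assumptions, and the Jacobian is estimated by central differences.

```python
import math

def jacobian_ug(u, g, x, y, h=1e-5):
    """Central-difference estimate of d(u, g)/d(x, y) at the point (x, y)."""
    ux = (u(x + h, y) - u(x - h, y)) / (2 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return ux * gy - uy * gx

def g(x, y):
    return x * x + y * y              # illustrative choice of g

def u_dependent(x, y):
    return math.sin(g(x, y))          # u = w(g) with w = sin (assumption)

def u_independent(x, y):
    return x - y                      # not a function of g alone

points = [(0.3, 0.7), (1.1, -0.4), (-0.8, 0.5)]
dependent = [jacobian_ug(u_dependent, g, x, y) for x, y in points]
independent = [jacobian_ug(u_independent, g, x, y) for x, y in points]
```

For `u_dependent` the estimated Jacobian vanishes up to discretization error, whereas for `u_independent` it equals $2x + 2y$ at each sample point.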

Remark. Read Sneddon [105, p.20, Theorem 3]. Remark 3. Sneddon [105, p.8, (6)]

Proof. Assume that the string is flexible. Its shape changes according to the forces applied to it. Considering the vertical and horizontal components of these forces, we have $\frac{dy}{dx} = \frac{W}{H}$ and $\frac{dy}{ds} = \frac{W}{T}$.

Remark 4. For the proof of Sneddon [105, p.9, Theorem 1], see Hartman [51, p.8, Theorem 1.1].

Remark 5. The statement given in Sneddon [105, p.19, l.5–l.6] can be interpreted as Arnold [1, p.17, Theorem; Fig. 4] or Coddington–Levinson [16, p.22, Theorem 7.1; p.23, Fig. 2]. The latter interpretation is more advanced than the former one.

Remark 6. By Halmos [50, p.14, Theorem 2; p.23, Theorem 2] and O’Neill [86, p.23, l. −3–l. −1], there must exist a function $\mu(x,y,z)$ such that $\mu P = \frac{\partial F}{\partial x}$, $\mu Q = \frac{\partial F}{\partial y}$, $\mu R = \frac{\partial F}{\partial z}$ [Sneddon [105, p.22, l.6–l.7]].

Remark 7. The family of planes given in Sneddon [105, p.31, (14)] has the common axis $\{(c,0,c) \mid c \in \mathbb{R}\}$.

Remark 8. (The Cauchy–Kowalewski theorem) The general PDE $F(x, D^\alpha u) = 0$ [John [62, p.60, (2.29a)]] can be reduced to
$0 = \frac{\partial F}{\partial x_n} + \sum_\alpha \frac{\partial F}{\partial p_\alpha} D_n D^\alpha u$ [John [62, p.60, (2.29b)]]. Thus, it suffices to solve the general $m$th-order quasi-linear PDEs [John [62, p.56, (2.2)]]. The proof of the Cauchy–Kowalewski theorem is given in John [62, §3.3.(d)]. However, the notations given in John [62, p.58, (2.11), (2.11), (2.20); p.63, (3.5)] are confusing and some prerequisites given in John [62, pp.56–73] are unnecessary for the proof. One should pack light for a long trip. In order to understand the essence of the proof, one should read Evans [30, §4.6] first because Evans’ proof is leaner, simpler, and more carefully written.

Remark 1. John [62, §§ 3.3.(a)–(c)] provides a complete theory of real analytic functions. Fritzsche–Grauert [37, p.15, Theorem 3.8] provides a complete version of John [62, p.70, Theorem]. For the proof of the claim given in John [62, p.71, l. −15–l. −13], see Fritzsche–Grauert [37, p.13, Theorem 3.5].
Although the formula given in John [62, p.68, l.6] summarizes the result of the chain rule, there is something important missing. For a full understanding of the formula, one should check the formula in detail for simple cases such as $\alpha = (0,\cdots,0,1), (0,\cdots,0,2)$.

Remark 2. Evans [30, p.224, Definition] indicates that for the purpose of proving the Cauchy–Kowalewski theorem, it is unnecessary to consider the necessary and sufficient condition for a surface to be characteristic when $N > 1$ [John [62, p.59, l. −15–l. −11]]. This is a good observation. The only reason for considering the necessary and sufficient condition for a surface $S$ [John [62, p.74, l.18]] to be noncharacteristic is to reduce the general form [John [62, p.56, (2.2)]] to the standard form [John [62, p.74, (3.30)]]. Once we obtain the standard form, we need not consider this necessary and sufficient condition any more.

Remark 9. $z = \sum_{r=1}^{n} f_r(v_r)$ leads to a linear partial differential equation of the $n$th order [Sneddon [105, p.89, l. −10–l. −8]].

Proof. $z = f_1(x + ay) + f_2(x + a\omega y) + f_3(x + a\omega^2 y)$, where $\omega^3 = 1$.
$z_{yyy} = a^3(f_1^{(3)} + f_2^{(3)} + f_3^{(3)}) = a^3 z_{xxx}$.

Remark 10. Bell [2, §217] shows that if $z = f(x,y)$ is a developable surface, then $rt = s^2$ [Sneddon [105, p.89, l. −3–l. −1]].

Remark 11. This is equivalent to taking $G$ and $R$ to be zero in equation (7) [Sneddon [105, p.92, l.3–l.4]].

Proof. Let $\varphi = \varphi_0 e^{i\omega t}$ [Wangsness [121, p.462, (27-49)]]. Then $\frac{\partial\varphi}{\partial t} = i\omega\varphi_0 e^{i\omega t}$ and $\frac{\partial^2\varphi}{\partial t^2} = -\omega^2\varphi_0 e^{i\omega t}$.

Remark 12. Sneddon [105, p.93, (18) & (19)] follows from the formula given in Riley–Hobson–Bence [94, p.374, l.15].

Remark 13. $\frac{\partial\psi}{\partial n} = -4\pi\sigma$, where $\sigma$ is the surface density of the charge on the conductor [Sneddon [105, p.142, l. −10–l. −9]].

Proof. $E = \frac{\sigma}{\varepsilon_0}$ [Wangsness [121, p.85, (6-4)]].
$E = \mathbf{E}\cdot n = -\nabla\psi\cdot n$ [Jackson [60, p.34, (1.16)]] $= -\frac{\partial\psi}{\partial n}$. Therefore,
$-\frac{\partial\psi}{\partial n} = \frac{\sigma}{\varepsilon_0}$ (SI units).
By Jackson [60, p.781, Table 2], $\frac{\partial\psi}{\partial n} = -4\pi\sigma$ (Gaussian units).

Remark 14. If $\psi \equiv 0$ on $S$ and is bounded at infinity, then $\psi \equiv 0$ outside $V$ [Sneddon [105, p.152, l.1–l.2]].

Proof. I. There exists a function $f$ analytic outside $V$ such that $\psi = \Re f$. Let $g = e^f$. Then $|g| = e^{\Re f}$. Note that $g$ is bounded outside $V$; so $g$ is analytic at infinity.
II. By Rudin [99, p.302, Theorem 14.8], the exterior of $V$ can be considered $\Omega$ given in Rudin [99, p.274, (1)]. Let $M(x) = \sup\{|g(x + iy)| : -\infty < y < \infty\}$ $(a \le x \le b)$. By Rudin [99, p.275, (3)],
$M(x)^{b-a} \le M(a)^{b-x}M(b)^{x-a} = 1$.
Thus, $\Re f \le 0$. Similarly, $\Re(-f) \le 0$.

Remark 15. The exterior Dirichlet problem is equivalent to the corresponding interior Dirichlet problem [Sneddon [105, p.152, l.3–l.5]]. The proof given in Sneddon [105, p.152, l.3–l. −5] is based on Kelvin’s inversion theorem [Sneddon [105, p.165, l.6]].

Remark 16. ψ0 is continuous at the origin and takes the value 1 there [Sneddon [105, p.153, l.6]].

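Proof. Let $a^2 = x^2 + y^2$ and $u = z' - z$.
$\psi(x,y,z) = \int_{-z}^{1-z}\frac{u\,du}{(u^2+a^2)^{1/2}} + z\int_{-z}^{1-z}\frac{du}{(u^2+a^2)^{1/2}}$
$= (u^2+a^2)^{1/2}\big|_{-z}^{1-z} + z\ln|u + \sqrt{u^2+a^2}|\,\big|_{-z}^{1-z}$.
Let $\psi_0 = (u^2+a^2)^{1/2}\big|_{-z}^{1-z} + z\ln|[1 - z + \sqrt{(1-z)^2+a^2}][z + \sqrt{z^2+a^2}]|$.

Remark 17. Sneddon [105, p.156, l. −7, (6)] follows from Sneddon [104, p.68, (18.10)].

Remark 18. By Wangsness [121, p.33, (1-101)], $q_r = -\frac{\partial\psi}{\partial r}$ and $q_\theta = -\frac{1}{r}\frac{\partial\psi}{\partial\theta}$ [Sneddon [105, p.158, l.11–l.12]].

Remark 19. By Jackson [60, p.18, (I.17)], $\kappa(\frac{\partial\psi_1}{\partial r}) - \frac{\partial\psi_2}{\partial r} = 4\pi\lambda P_n(\cos\theta)$ on $r = a$ [Sneddon [105, p.158, l. −8]].

Remark 20. By Wangsness [121, p.99, (7-8)] and Sneddon [105, p.158, l. −8, the second equation], the energy due to the interior of the sphere is $\frac{\kappa}{8\pi}\int\psi_1(\frac{\partial\psi_1}{\partial n})\,dS$ [Sneddon [105, p.159, l.1–l.2]].

Remark 21. $A_s = \frac{2}{[J_1(\lambda_s)]^2}\int_0^1\rho f(\rho)J_0(\lambda_s\rho)\,d\rho$ [Sneddon [105, p.160, l.7, (21)]] follows from Guo–Wang [48, p.425, (6)].

Remark 22. “The theory of Fourier series” given in Sneddon [105, p.160, l. −10] should have been replaced with “the Sturm–Liouville theory”. By Coddington–Levinson [16, p.197, Theorem 4.1],
$f(x,y) = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} f_{mn}\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}$, where
$f_{mn} = \frac{4}{ab}\int_0^a\int_0^b f(x,y)\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}\,dx\,dy$ [Sneddon [105, p.160, l. −9–l. −8]]. Note that $A_{mn} = f_{mn}\,\mathrm{sech}(\gamma_{mn}c)$ [Sneddon [105, p.160, l. −6]] should have been replaced with $A_{mn} = f_{mn}\,\mathrm{csch}(\gamma_{mn}c)$.

Remark 23. The inversion carries a sphere $S'$ into itself if and only if $S'$ is orthogonal to $S$ [Sneddon [105, p.164, l. −10–l. −9]].

Proof. Let $K$ be a plane passing through the centers of $S$ and $S'$. Then consider the circles $S\cap K$ and $S'\cap K$.

Remark 24. Since $\frac{x}{r} = \frac{\xi}{\rho}$, $\frac{y}{r} = \frac{\eta}{\rho}$, $\frac{z}{r} = \frac{\zeta}{\rho}$, we have $\xi = \frac{a^2x}{r^2}$, $\eta = \frac{a^2y}{r^2}$, $\zeta = \frac{a^2z}{r^2}$ [Sneddon [105, p.164, l. −6, (1)]].

Remark 25. $\frac{\partial}{\partial x}(\frac{a^2}{r^2}\frac{\partial\psi}{\partial x}) = \frac{a^2}{r}\frac{\partial^2}{\partial x^2}(\frac{a}{r}\psi) - \psi\frac{\partial^2}{\partial x^2}(\frac{1}{r})$ [Sneddon [105, p.165, l.2]] should have been corrected as
$\frac{\partial}{\partial x}(\frac{a^2}{r^2}\frac{\partial\psi}{\partial x}) = \frac{a}{r}\frac{\partial^2}{\partial x^2}(\frac{a}{r}\psi) - \frac{a^2}{r}\psi\frac{\partial^2}{\partial x^2}(\frac{1}{r})$.

Remark 26. $\nabla^2\psi = \frac{r^6}{a^6}[\frac{\partial}{\partial x}(\frac{a^2}{r^2}\frac{\partial\psi}{\partial x}) + \frac{\partial}{\partial y}(\frac{a^2}{r^2}\frac{\partial\psi}{\partial y}) + \frac{\partial}{\partial z}(\frac{a^2}{r^2}\frac{\partial\psi}{\partial z})]$ [Sneddon [105, p.164, l. −3]].

Proof. $(\frac{\partial^2}{\partial\xi^2} + \frac{\partial^2}{\partial\eta^2} + \frac{\partial^2}{\partial\zeta^2})\psi = \frac{1}{h_1h_2h_3}(\frac{\partial}{\partial x}[\frac{h_2h_3}{h_1}\frac{\partial\psi}{\partial x}] + \frac{\partial}{\partial y}[\frac{h_3h_1}{h_2}\frac{\partial\psi}{\partial y}] + \frac{\partial}{\partial z}[\frac{h_1h_2}{h_3}\frac{\partial\psi}{\partial z}])$ [Corson–Larrain–Larrain [20, p.23, (1-83)]], where
$h_1 = h_2 = h_3 = \frac{a^2}{r^2}$ [Feshbach–Morse [32, p.115, l. −5]].

Remark 27. By Sneddon [105, p.165, l.8–l.10] and p.31, Theorem 7 of http://www.math.harvard.edu/˜canzani/math253/Lecture5.pdf, $\frac{1}{4\pi}\int_S[\frac{\partial}{\partial r}(\frac{a}{r}\psi_0(\frac{a^2}{r^2}\mathbf{r}))]_{r=a}\,dS = -a\psi_0(0)$ [Sneddon [105, p.165, l. −3]]. Note that the integrand of the right side of the equality (2.5) on p.31 of http://www.math.harvard.edu/˜canzani/math253/Lecture5.pdf should have been corrected as $\varphi\,\partial_\nu\psi - \psi\,\partial_\nu\varphi$.

Remark 28. $\psi(\mathbf{r}) = \psi_0(\mathbf{r}) - \frac{a}{r}\psi_0(\frac{a^2\mathbf{r}}{r^2}) + \frac{a}{r}\psi(0)$ [Sneddon [105, p.166, l.7, (7)]] follows from Sneddon [105, p.165, (5); l. −4–l. −3] and Wangsness [121, p.179, Example].

Remark 29. $(\frac{\partial\psi}{\partial r})_{r=a} = (\frac{\partial\psi_0}{\partial r})_{r=a} - [\frac{a}{r^2}\psi_0(\frac{a^2\mathbf{r}}{r^2}) + \frac{a^3}{r^4}(\mathbf{r}\cdot\mathrm{grad}\,\psi_0)]_{r=a} + \frac{2}{a}\int_0^a[\frac{\lambda}{r^2}\psi_0(\frac{\lambda^2\mathbf{r}}{r^2}) + \frac{\lambda^3}{r^4}(\mathbf{r}\cdot\mathrm{grad}\,\psi_0)]_{r=a}\,d\lambda$ [Sneddon [105, p.166, l. −6–l. −5]].

Proof. Let $x' = \frac{a^2x}{r^2}$, $y' = \frac{a^2y}{r^2}$, $z' = \frac{a^2z}{r^2}$ and $\psi_1(\mathbf{r}) = \psi_0(\frac{a^2\mathbf{r}}{r^2})$.
$\begin{pmatrix} \frac{\partial\psi_1}{\partial x} \\ \frac{\partial\psi_1}{\partial y} \\ \frac{\partial\psi_1}{\partial z} \end{pmatrix} = \begin{pmatrix} \frac{\partial x'}{\partial x} & \frac{\partial y'}{\partial x} & \frac{\partial z'}{\partial x} \\ \frac{\partial x'}{\partial y} & \frac{\partial y'}{\partial y} & \frac{\partial z'}{\partial y} \\ \frac{\partial x'}{\partial z} & \frac{\partial y'}{\partial z} & \frac{\partial z'}{\partial z} \end{pmatrix}\begin{pmatrix} \frac{\partial\psi_1}{\partial x'} \\ \frac{\partial\psi_1}{\partial y'} \\ \frac{\partial\psi_1}{\partial z'} \end{pmatrix}$.
$\frac{\partial\psi_1}{\partial r} = (\frac{\partial\psi_1}{\partial x}, \frac{\partial\psi_1}{\partial y}, \frac{\partial\psi_1}{\partial z})\cdot\frac{\mathbf{r}}{r} = -\frac{a^2}{r^3}\nabla'\psi_1\cdot\mathbf{r}$.

Remark 30. Let $x' = \frac{\lambda^2x}{r^2}$, $y' = \frac{\lambda^2y}{r^2}$, $z' = \frac{\lambda^2z}{r^2}$ and $\psi_1(\mathbf{r}) = \psi_0(\frac{\lambda^2\mathbf{r}}{r^2})$. Then $\frac{\partial\psi_1}{\partial\lambda} = \frac{\partial\psi_1}{\partial x'}\frac{\partial x'}{\partial\lambda} + \frac{\partial\psi_1}{\partial y'}\frac{\partial y'}{\partial\lambda} + \frac{\partial\psi_1}{\partial z'}\frac{\partial z'}{\partial\lambda} = \frac{2\lambda}{r^2}[\nabla'\psi_1\cdot\mathbf{r}]$ [Sneddon [105, p.166, l. −4–l. −3]].

Remark 31. If we treat $\frac{\partial}{\partial n}$ as $\nabla\cdot n$, where $n$ is the outward normal of the area element, then the proof of Sneddon [105, p.168, (3)] can be simplified and the two proofs [Sneddon [105, p.168, l.1–l. −1; p.169, l.1–l.12]] can be merged into one. Note that the formula given in Sneddon [105, p.169, l.7–l.8] is incorrect.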
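The corrected identity stated in Remark 25 can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the test function `psi`, the value of the radius `A` and the sample point are arbitrary choices, and the derivatives on both sides are approximated by central differences.

```python
import math

A = 2.0  # radius a of the sphere of inversion (illustrative value)

def r(x, y, z):
    return math.sqrt(x * x + y * y + z * z)

def psi(x, y, z):
    return x * x * y + z * x          # arbitrary smooth test function (assumption)

def psi_x(x, y, z):
    return 2 * x * y + z              # exact partial derivative of psi in x

def d1(f, x, h=1e-5):
    """Central first difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    """Central second difference."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def both_sides(x, y, z):
    # Left side: d/dx [ (a^2 / r^2) * dpsi/dx ]
    lhs = d1(lambda s: A ** 2 / r(s, y, z) ** 2 * psi_x(s, y, z), x)
    # Right side: (a/r) d^2/dx^2 [(a/r) psi] - (a^2/r) psi d^2/dx^2 (1/r)
    rhs = (A / r(x, y, z)) * d2(lambda s: A / r(s, y, z) * psi(s, y, z), x) \
        - (A ** 2 / r(x, y, z)) * psi(x, y, z) * d2(lambda s: 1 / r(s, y, z), x)
    return lhs, rhs

lhs, rhs = both_sides(0.9, 0.4, 0.6)
```

The two values agree up to discretization error, which is consistent with expanding $\frac{\partial^2}{\partial x^2}(\frac{1}{r}\psi)$ by the product rule.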

Remark 32. For the proof of $G(r_1,r_2) = G(r_2,r_1)$ [Sneddon [105, p.170, l.5, (9)]], see http://www.math.washington.edu/˜morrow/336_08/Green_Symmetry.pdf.

Remark 33. The proof of Sneddon [105, p.172, l.7, (19)] is incorrect. For a correct proof, see Jackson [60, p.65, l.1–l.9].

Remark 34. By differentiating $(1 - 2h\cos\Theta + h^2)^{-1/2} = \sum_{n=0}^{\infty} h^nP_n(\cos\Theta)$ [Watson–Whittaker [122, p.302, l.10]] with respect to $h$, we have
$\frac{1 - h^2}{(1 - 2h\cos\Theta + h^2)^{3/2}} = \sum_{n=0}^{\infty}(2n+1)h^nP_n(\cos\Theta)$ [Sneddon [105, p.173, l. −7]].

Remark 35. In order to clarify what Sneddon [105, p.175, l.7–l.22] says, one should read Fomin–Gelfand [35, p.192, l.1–p.193, l.3; p.195, l.4–p.196, l.5; p.198, l.1–p.200, l.9; p.202, l. −9–l. −6; §41.4].

Remark 36. Sneddon [105, p.314, l. −3–l. −1] considers three consecutive tangents on the edge of regression; see Bell [2, p.310, Ex. 6; §215 & §216].

Remark 37. (Applications of Laplace’s equations)
(a) Confocal conicoids can be a set of equipotentials [Smythe [103, §5.00–§5.01]].
Remark. Smythe [103, p.122, l.3–l. −11] provides a brief report about confocal conicoids. For the full story, see Hilbert–Cohn-Vossen [56, p.19, l.7–p.23, l. −9]. For a rigorous treatment, see Bell [2, §115–§118].
(b) Charged conducting ellipsoids, disks, ribbons, and needles [Smythe [103, §5.02–§5.03]; Griffiths [47, p.41, Problem 2.52]]
Remark. $\sigma = -\varepsilon(\nabla V)_{\theta=0}$ given in Smythe [103, p.123, l. −1] should have been corrected as $\sigma = \varepsilon|\nabla V|_{\theta=0}$. Strictly speaking, Griffiths [46, p.90, (2.36)] is the only correct formula. Since $\frac{\partial V_{\text{below}}}{\partial n} = 0$, $|\nabla V|_{\theta=0}$ can be regarded as $-\frac{\partial V_{\text{above}}}{\partial n}$. Note that $A$ can be calculated from the last equality given in Smythe [103, p.123, l. −7] and that the equality follows from the condition $\theta \to \infty$.

Remark 38. By differentiating $\alpha^{2n+\nu-1}\{\alpha J_\nu(\alpha) + 2nJ_{\nu-1}(\alpha)\}$ with respect to $\alpha$, we have Sneddon [106, p.27, l.8, (2.1.9)].

Remark 39. Sneddon [106, p.27, l.12, (2.1.10)] should have been corrected as
$\int_0^s(s^2 - \rho^2)^{-1/2}J_1(\xi\rho)\,d\rho = (\xi s)^{-1}\{1 - \cos(\xi s)\}$.

Remark 40. $\int_0^\infty\xi^{-1}\sin(a\xi)J_0(\rho\xi)\,d\xi = \begin{cases}\frac{\pi}{2} & \text{if } 0 \le \rho < a \\ \sin^{-1}(a/\rho) & \text{if } \rho > a\end{cases}$ [Sneddon [106, p.28, l.6, (2.1.15)]].

Proof. $\int_0^\infty\xi^{-1}\sin(a\xi)J_0(\rho\xi)\,d\xi = \frac{2}{\pi}\int_0^{\pi/2}d\theta\int_0^\infty\xi^{-1}\sin(a\xi)\cos(\rho\xi\sin\theta)\,d\xi$ [Sneddon [106, p.27, (2.1.11)]]
$= \frac{1}{\pi}\int_0^{\pi/2}d\theta\int_0^\infty\frac{\sin[(a+\rho\sin\theta)\xi] + \sin[(a-\rho\sin\theta)\xi]}{\xi}\,d\xi$.
Let $S = \int_0^{\pi/2}d\theta\int_0^\infty\frac{\sin[(a+\rho\sin\theta)\xi] + \sin[(a-\rho\sin\theta)\xi]}{\xi}\,d\xi$.
If $0 \le \rho < a$, $S = \int_0^{\pi/2}d\theta\int_0^\infty\frac{\sin[(a+\rho\sin\theta)\xi]}{\xi}\,d\xi + \int_0^{\pi/2}d\theta\int_0^\infty\frac{\sin[(a-\rho\sin\theta)\xi]}{\xi}\,d\xi = \frac{\pi}{2}(\frac{\pi}{2} + \frac{\pi}{2})$ [Rudin [99, p.244, (7)]].
If $\rho > a$, $S = \int_0^{\pi/2}d\theta\int_0^\infty\frac{\sin[(a+\rho\sin\theta)\xi]}{\xi}\,d\xi + \int_0^{\sin^{-1}(a/\rho)}d\theta\int_0^\infty\frac{\sin[(a-\rho\sin\theta)\xi]}{\xi}\,d\xi - \int_{\sin^{-1}(a/\rho)}^{\pi/2}d\theta\int_0^\infty\frac{\sin[(\rho\sin\theta-a)\xi]}{\xi}\,d\xi$
$= \frac{\pi^2}{4} + \frac{\pi}{2}\sin^{-1}(a/\rho) - \frac{\pi}{2}(\frac{\pi}{2} - \sin^{-1}(a/\rho))$.

Remark 41. (The Hankel inversion theorem) [Sneddon [106, p.29, (2.1.22)]; Titchmarsh [113, §8.18]]

Remark 1. $J_\nu(\lambda y) = \frac{A\cos(\lambda y) + B\sin(\lambda y)}{(\lambda y)^{1/2}} + O(\frac{1}{(\lambda y)^{3/2}})$ [Titchmarsh [113, p.241, l.5]] follows from Guo–Wang [48, p.379, (5)].

Remark 2. For the Riemann–Lebesgue theorem [Titchmarsh [113, p.241, l. −14]], see Rudin [98, p.169, Theorem 7.5].

Remark 3. $\int_x^{x+\delta}\chi_1(y)y^{\nu+1}\,dy\int_0^\lambda J_\nu(xu)J_\nu(yu)u\,du = \chi_1(x+\delta)\int_\xi^{x+\delta}y^{\nu+1}\,dy\int_0^\lambda J_\nu(xu)J_\nu(yu)u\,du$ [Titchmarsh [113, p.242, l.5–l.6]] follows from Watson–Whittaker [122, p.66, l. −5].

Remark 4. By Watson [123, p.404, (6)], $\int_0^\lambda J_\nu(xu)J_{\nu+1}(yu)\,du = O(1)$ as $\lambda \to \infty$. The proof given in Titchmarsh [113, p.242, l.9–l.11] is incorrect.

Remark 42. $V_{2n+3}(\rho,z) = \rho^{-n}e^{-zt}J_n(\rho t)$ satisfies Sneddon [106, p.25, (1.6.1)] [Sneddon [106, p.32, l.4–l.6]]. This is because $J_n(x)$ $(x = \rho t)$ satisfies Jackson [60, p.112, (3.77)].

Remark 43. (Contour integrals for Bessel functions)
$S_{\nu,\alpha,\beta,\gamma}(\rho,t;a) = \int_0^\infty J_\alpha(\rho x)J_\beta(tx)x^{\gamma+1}\,dx + \frac{2}{\pi}\sin\frac{(\alpha+\beta+\gamma-2\nu)\pi}{2}K_{\nu,\alpha,\beta,\gamma}(\rho,t;a)$ [Sneddon [106, p.35, l.15–l.16, (2.2.9)]].

Proof. I. Let $C_R = \{Re^{i\theta} \mid 0 \le \theta \le \frac{\pi}{2}\}$ and $F(z) = \frac{J_\nu(az) + iY_\nu(az)}{J_\nu(az)}J_\alpha(\rho z)J_\beta(tz)z^{1+\gamma}$. We want to prove $\lim_{R\to\infty}\int_{C_R}F(z)\,dz = 0$.
Proof. $|\cos(az - \frac{\nu\pi}{2} - \frac{\pi}{4})|^{-1} = 2|e^{i(az - \frac{\nu\pi}{2} - \frac{\pi}{4})} + e^{-i(az - \frac{\nu\pi}{2} - \frac{\pi}{4})}|^{-1} \le 4e^{\Re(iaz)}$ (note that $\Re(iz) < 0$).
By Guo–Wang [48, pp.378–379], we have
$|H_\nu^{(1)}(az)| \sim \sqrt{\frac{2}{\pi aR}}\,e^{\Re(iaz)}$ and $|J_\alpha(\rho z)| \le \sqrt{\frac{2}{\pi aR}}\,e^{\Re(i\rho z)}$.
We may assume that $\arg(iz)$ lies between $\frac{\pi}{2} + \delta$ and $\pi$.

∞ Jν (aiy)+iYν (aiy) 1+γ II. F(z)dz = Jα (ρiy)Jβ (tiy)(iy) d(iy). [∞i,0] − 0 Jν (aiy) (1) Jν (Raiy)+ iYν (aiy)= RHν (aiy) [Watson [123, p.73, (1)]] 2 ν 1 = π Kν (ay)i− − [Jackson [60, p.116, (3.101)]]. ν Jν (aiy)= i Iν (ay) [Jackson [60, p.116, (3.100)]]. ℜ(i 2ν+α+β+γ+1)= ℜ[eπi/2( 2ν+α+β+γ+1)] − − − − ℜ[(cos[( 2ν + α + β + γ)/2]+ isin[( 2ν + α + β + γ)/2])i] − − − = sin[( 2ν + α + β + γ)/2]. − p III. Let Γ = ∑s=1 γs. Then p Γ F(z)dz = πi∑s=1 ResF(λs) [Gonzalez´ [42, p.683, Lemma 9.4]]. − 2 Yν (aλs)= − [Watson [123, p.76, l.2–l.3]] R πaλsJν′ (aλs) = 2 [Watson [123, p.45, (4)]]. πaλsJν+1(aλs) limp ∞ Γ ℜF(z)dz = Sν,α,β,γ (ρ,t;a) [Guo–Wang [48, p.422, l.4–l.10]]. IV. The→ desired result follows− from Watson [123, p.482, l.4–l.5] and Cauchy’s theorem. R ∞ δ+1 2 (δ+β+γ 2ν)π Remark 44. S = J (ux)Jγ (vx)x dx+ sin − K (u,v) [Sneddon [106, p.35, l. 4– ν∗,H,β,γ,δ 0 β π 2 ν∗,H,β,γ,δ − l. 3, (2.2.10)]]. − R 67 iθ π δ+1 Proof. I. Let CR = Re 0 θ and F(z)= φ(z)J (uz)Jγ (vz)z , where { | ≤ ≤ 2 } β z Jν′ (z)+iYν′ (z) +H Jν (z)+iYν (z) φ(z)= { } { } . Then limR ∞ C F(z)dz = 0. zJν′ (z)+HJν (z) → R 2 (δ+β+γ 2ν)π II. [∞i,0] ℜF(z)dz = π sin 2 − Kν∗,H,β,γ,δ (uR,v). p III.R Let Γ = ∑s=1 γs. Then p F(z)dz = πi∑ ResF(µs) [Gonzalez´ [42, p.683, Lemma 9.4]]. Γ − s=1 p Jβ (uµs)Jγ (vµs) 2+δ R ℜF(z)dz = 2∑ 2 2 2 2 µ (0 < u < 1,0 < v < 1) [Watson [123, p.480, l.21– Γ s=1 (µ ν +H )J (µs) s − s − ν Rl.24]]. d H2+µ2 ν2 Proof. (zJ′ (z)+ HJν (z)) z=µ = − Jν (µs). dz ν | s − µs zJ (z)+ HJ (z) zY (z)+ HY (z) J (z) Y (z) ν′ ν ν′ ν = z2 ν′ ν′ d (zJ (z)+ HJ (z)) d (zY (z)+ HY (z)) J (z) Y (z) dz ν′ ν dz ν′ ν ν′′ ν′′ J (z) Y (z) J (z) Y (z) + Hz( ν′ ν + ν ν′ ) J (z) Y (z) J (z) Y (z) ν′′ ν′ ν′ ν′′ 2 Jν (z) Yν ( z) 2(z2 ν2+ H2) +(H + H) = − [Watson [123, p.76, (1), (5), & (6)]]. J (z) Y ( z) πz ν′ ν′

IV. The desired result follows from Watson [123, p.482, l. 21–l. 19] [we must assume that − − ν H] and Cauchy’s theorem. ≥ ∞ ∞ ∞ sinh(xy) πy Remark 45. ∑ J0(nu)sin(nx)= J0(tu)sin(xt)dt I0(uy)g dy; n=1 0 − 0 sinh(πy) − ∞ ∞ ∞ cosh(xy) πy ∑n=1 J1(nu)sin(nx)= R0 J1(tu)cos(xt)dt R 0 sinh(πy) I1(uy)g− dy [Sneddon [106, p.37, l.3 & l.10]]. − R R Proof. Note that we must assume u + x < 2π to insure the convergence of the integrals given in Sneddon [106, p.37, (2.2.15); (2.2.16)] at infinity. Let z = Reiθ , where R = N + 1 and 0 θ π/2. ≤ ≤ ( csc(πz) )= 2( eiπz + e iπz ) 1 4eℜ(iπz) (note that ℜ(iz) < 0). | | | − | − ≤ 2 ℜ( iuz) J0(uz) e [Guo–Wang [48, p.379, (5)]]. | |≤ πuR − ℜ( ixz) sin(xz),cosq(xz)= O(e − ).

Remark 46. $\int_0^\infty J_1(tu)\cos(xt)\,dt = \frac{1}{u} - \frac{xH(x-u)}{u\sqrt{x^2-u^2}}$ [Sneddon [106, p.37, l.12]] follows from Sneddon [106, p.28, (2.1.17), (2.1.18a) & (2.1.18b); p.29, (2.1.18c)]. Note that $\frac{1}{2}\frac{(1)_k(3/2)_k}{k!\,(2)_k} = (-1)^{k+1}\binom{-1/2}{k+1}$.

3. Probability and statistics (Borovkov [9]; Chung [15])

(A). Borovkov [9]

Remark 1. Let $P_\xi = \Phi_{0,1}$ and $\eta = \alpha + \sigma\xi$. Then $F_\eta(x) = \Phi_{\alpha,\sigma}((-\infty,x))$ [Borovkov [9, p.38, l.15–l.16]].

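Proof.
$F_\eta(x) = F_\xi(\frac{x-\alpha}{\sigma})$ [Borovkov [9, p.38, l.14]]
$= \Phi(\frac{x-\alpha}{\sigma})$ [Borovkov [9, p.33, l. −3]]
$= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\frac{x-\alpha}{\sigma}}e^{-u^2/2}\,du$ [Borovkov [9, p.33, l. −3]]
$= \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x}e^{-(v-\alpha)^2/(2\sigma^2)}\,dv$ [let $v = \alpha + \sigma u$]
$= \Phi_{\alpha,\sigma}((-\infty,x))$ [Borovkov [9, p.33, (3)]].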
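The change of variable $v = \alpha + \sigma u$ in the proof above can be illustrated numerically. This is a minimal sketch under stated assumptions: the helper names `Phi` and `Phi_alpha_sigma`, the parameter values and the midpoint rule are illustrative choices and are not taken from Borovkov.

```python
import math

def Phi(x):
    """Standard normal distribution function, written via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_alpha_sigma(x, alpha, sigma, steps=20000, lower_sds=10.0):
    """Midpoint-rule integral of the normal density with mean alpha and
    standard deviation sigma over (alpha - lower_sds * sigma, x)."""
    lo = alpha - lower_sds * sigma
    if x <= lo:
        return 0.0
    h = (x - lo) / steps
    total = 0.0
    for i in range(steps):
        v = lo + (i + 0.5) * h
        total += math.exp(-(v - alpha) ** 2 / (2 * sigma ** 2))
    return total * h / (sigma * math.sqrt(2 * math.pi))

alpha, sigma = 1.5, 2.0   # illustrative parameters
checks = [(x, Phi((x - alpha) / sigma), Phi_alpha_sigma(x, alpha, sigma))
          for x in (-1.0, 1.5, 4.0)]
```

Each triple in `checks` holds $x$, the value $\Phi(\frac{x-\alpha}{\sigma})$ and the directly integrated value of $\Phi_{\alpha,\sigma}((-\infty,x))$; the last two agree up to quadrature error.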

Remark. Let $P_\lambda = \Phi_{\alpha,\sigma}$. By the above proposition, $P_\lambda = P_\eta$, i.e., $F_\lambda(x) = F_\eta(x)$. One may ask whether $\lambda = \eta$, but this is a meaningless question. If we consider the literal meaning of random variable, we see that the sole purpose of a random variable is to construct a probability on $(\mathbb{R}, \mathfrak{B})$ for calculation [Borovkov [9, p.28, Definition 2]]. As far as probability theory is concerned, there is no difference between $\lambda$ and $\eta$. We write $F_\lambda(x) = F_\eta(x)$ as $\alpha + \sigma\xi(\omega) = \eta(\omega) = x = \lambda(\omega)$ symbolically.

Remark 2. $f_{g(\xi)}(y) = f(g^{-1}(y))(g^{-1}(y))'$ [Borovkov [9, p.38, l.19]].

Proof.
$f_{g(\xi)}(y) = \frac{dF_{g(\xi)}(y)}{dy}$ [Borovkov [9, p.36, l.19]]
$= \frac{dF(g^{-1}(y))}{dy}$ [Borovkov [9, p.38, l.7]].

Remark 3. $\int\varphi_{\alpha,\sigma^2}(x)\,dx_1\cdots dx_r = 1$ [Borovkov [9, p.41, l. −2]].

Proof. We may assume that $A = \mathrm{diag}\{\sigma_1^{-2},\cdots,\sigma_r^{-2}\}$. Then $|A| = (\sigma_1\cdots\sigma_r)^{-2}$.

Remark 4. $\mathfrak{B} = \sigma(\mathcal{A})$, where $\mathcal{A}$ is the algebra of finite unions of semi-intervals $[\cdot,\cdot)$ [Borovkov [9, p.47, l.14]].

Remark. The definition of $\mathcal{A}$ given in Borovkov [9, p.47, l.14] is not clear.

Remark 5. Let $\alpha(\xi) = \{\xi^{-1}(A) \mid A \in \mathcal{A}\}$, where $\mathcal{A}$ is the algebra of finite unions of semi-intervals $[\cdot,\cdot)$. Then $\sigma(\alpha(\xi)) = \sigma(\xi)$ [Borovkov [9, p.47, l. −8]].

Proof. $\sigma(\alpha(\xi))$ is the $\sigma$-algebra generated by $\alpha(\xi)$.
$\sigma(\xi) = \{\xi^{-1}(B) \mid B \text{ is a Borel set}\}$ [Borovkov [9, p.46, l.18]].

Remark. The definition of $\alpha(\xi)$ given in Borovkov [9, p.47, l. −9–l. −8] is confusing.

Remark 6. We may represent $\cup_{i=1}^{n}A_i$ as a union of disjoint events from $\mathcal{A}$ [Borovkov [9, p.48, l. −13]].

Proof. Let $c_1 < a_1 < c_2 < a_2$, $d_1 < b_1 < d_2 < b_2$. Then
$[\xi_1^{-1}([a_1,a_2))\cap\xi_2^{-1}([b_1,b_2))]\setminus[\xi_1^{-1}([c_1,c_2))\cap\xi_2^{-1}([d_1,d_2))]$
$= [\xi_1^{-1}([c_2,a_2))\cap\xi_2^{-1}([b_1,d_2))]\cup[\xi_1^{-1}([a_1,c_2))\cap\xi_2^{-1}([d_2,b_2))]\cup[\xi_1^{-1}([c_2,a_2))\cap\xi_2^{-1}([d_2,b_2))]$.
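The decomposition in the proof of Remark 6 is a statement about half-open rectangles in the $(\xi_1,\xi_2)$-plane, pulled back by $(\xi_1,\xi_2)$. A minimal sketch of a check on a finite grid follows; the numerical endpoints and the grid are illustrative assumptions chosen to satisfy $c_1 < a_1 < c_2 < a_2$ and $d_1 < b_1 < d_2 < b_2$.

```python
from itertools import product

def rect(x_lo, x_hi, y_lo, y_hi, grid):
    """Grid points lying in the half-open rectangle [x_lo, x_hi) x [y_lo, y_hi)."""
    return {(x, y) for x, y in grid if x_lo <= x < x_hi and y_lo <= y < y_hi}

# Illustrative endpoints with c1 < a1 < c2 < a2 and d1 < b1 < d2 < b2.
c1, a1, c2, a2 = 0, 1, 3, 5
d1, b1, d2, b2 = 0, 2, 4, 6

axis = [k / 2 for k in range(-2, 14)]      # sample values -1.0, -0.5, ..., 6.5
grid = list(product(axis, axis))

# Left side: [a1, a2) x [b1, b2) minus [c1, c2) x [d1, d2).
difference = rect(a1, a2, b1, b2, grid) - rect(c1, c2, d1, d2, grid)

# Right side: the three rectangles named in the proof.
pieces = [
    rect(c2, a2, b1, d2, grid),
    rect(a1, c2, d2, b2, grid),
    rect(c2, a2, d2, b2, grid),
]
union = set().union(*pieces)
pairwise_disjoint = all(p.isdisjoint(q)
                        for i, p in enumerate(pieces) for q in pieces[i + 1:])
```

On this grid `difference` equals `union` and the three pieces are pairwise disjoint, in line with the set identity of the proof.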

Remark 7. Section 3.3 given in Borovkov [9, p.48, l. −4] refers to Borovkov [9, p.39, l. −9–l. −7].

Remark 8. By Borovkov [9, p.428, Theorem 1], it suffices to define probability on the algebra of these sets [Borovkov [9, p.49, l.10–l.11]].

Remark 9. $\int g(\xi(\omega))P(d\omega) = \int g(x)P_\xi(dx)$ [Borovkov [9, p.50, l.12]].

Proof. The case of an indicator of a set $A$ follows from $I_A(\xi(\omega)) = I_{\xi^{-1}(A)}(\omega)$ and Borovkov [9, p.28, l. −3]. The general case follows from Borovkov [9, p.435, (1)].

Remark 10. Borovkov [9, p.50, (16)] follows from Rudin [97, p.112, Theorem 6.14(b)(i)]; Borovkov [9, p.50, (17)] follows from Borovkov [9, p.434, Lemma 1; p.435, (1)].

Remark 11. $\int_a^b g(x)\,dF(x) = g(x)F(x)\big|_a^b - \int_a^b F(x)\,dg(x)$ [Borovkov [9, p.51, l.13]] follows from Rudin [97, p.122, Theorem 6.30].

Remark 12. Let $F(x)$ be absolutely continuous and have the form $\int_{-\infty}^{x}p(t)\,dt$, where $p(t)$ is Riemann integrable. If $g(x)$ is Riemann integrable, then the Stieltjes integral $\int g(x)\,dF(x) = \int g(x)p(x)\,dx$ becomes a conventional Riemann integral [Borovkov [9, p.51, l. −3–l. −1]].

Proof. If $p \in L^1(\mathbb{R}^1)$ and if $F(x) = \int_{-\infty}^{x}p(t)\,dt$, then $F \in NBV$, $F$ is absolutely continuous, and $F'(x) = p(x)$ a.e. $[m]$ [Rudin [99, p.176, Theorem 8.17]].
If $g \in \mathcal{R}$, $F' \in \mathcal{R}$ on $[a,b]$, then $g \in \mathcal{R}(F)$ and $\int g(x)\,dF(x) = \int g(x)F'(x)\,dx$ [Rudin [97, p.115, Theorem 6.17]].

Remark 1. If we assume $F \in NBV$ alone, then $F(x)$ is not necessarily absolutely continuous [Rudin [99, p.177, Theorem 8.18]]. If $F(x)$ is absolutely continuous, then $F$ is differentiable a.e., $F' \in L^1(\mathbb{R}^1)$, and $F(x) = \int_{-\infty}^{x}F'(t)\,dt$ [Rudin [99, p.177, Theorem 8.18]].

Remark 2. Read Rudin [99, p.176, Theorem 8.16; p.21, Proposition 1.24(d)].

Remark 13. (16) and (17) coincide when $F(x)$ is continuous and $g(x)$ is a function of bounded variation [Borovkov [9, p.51, l.11–l.12]].

Proof. By Rudin [97, p.109, Theorem 6.9], $g \in \mathcal{R}(F)$. Then Borovkov [9, p.50, (16)] follows from Rudin [97, p.112, Theorem 6.14(b)(ii)]. Now we can use an argument similar to that given in Borovkov [9, p.50, l. −6–p.51, l.19] to prove (16)=(17).

Remark 14. If $P(\xi = k) = q^k(1-q)$, where $q \in (0,1)$, $k = 0,1,2,\cdots$, then $P(\xi \ge x+k \mid \xi \ge k) = P(\xi \ge x)$ [Borovkov [9, p.61, l.8–l.10]].

Proof. $P(\xi \ge k) = q^k$.
$P(\xi \ge x+k \mid \xi \ge k) = \frac{P(\xi \ge x+k)}{P(\xi \ge k)} = q^{x+k}/q^k = q^x = P(\xi \ge x)$.

Remark 15. By Borovkov [9, p.49, l.10–l.11], we may construct the probability space $\langle\Omega, \mathfrak{F}, P\rangle$ given in Borovkov [9, p.63, l.23].

Remark 16. If there exists a $z \ge 1$ such that $v(z) < \infty$, then $v(k) < \infty$ for all $k \ge 1$ [Borovkov [9, p.64, l.5]].

Proof. $v(k) = kv(1)$ [by induction].
$zv(1) = v(z) < \infty \Rightarrow v(1) < \infty$.

Remark 17. $Eg(\xi) = \int z\,dF_{g(\xi)}(z) = \int g(x)\,dF_\xi(x)$ [Borovkov [9, p.64, Theorem 1]].

Proof. The second equality follows from $P_{g(\xi)} = P_\xi\circ g^{-1}$ [Borovkov [9, p.38, l.4]] and the following lemma:
Lemma. (Integration by change of variable for a pushforward measure) Let $X, Y$ be measure spaces and $g: X \to Y$, $f: Y \to \mathbb{R}$. Then
$\int_X(f\circ g)\,d\mu = \int_Y f\,d\nu$, where $\nu(B) = \mu(g^{-1}(B))$ is a measure defined for all measurable $B \subseteq Y$.

Remark 18. $\nu$ satisfies the definition of a Markov or stopping time [Borovkov [9, p.66, l.17]].

Proof. I. Since $(\xi_1 + \xi_2)(\omega) < x \Leftrightarrow \omega \in \cup_{r\in\mathbb{Q}}\{\omega \mid \xi_1(\omega) < r, \xi_2(\omega) < x - r\}$, $\sigma(\xi_1 + \xi_2) \subset \sigma(\xi_1,\xi_2)$.
II. $\{\omega \mid \nu(\omega) \le n\} = \cup_{k=1}^{n}\{\omega \mid S_k \ge a + bk\} \in \mathfrak{F}_{1,n}$ [by I].

Remark 19. $E(\xi_k : \nu \ge k) = P(\nu \ge k)E\xi_k$ [Borovkov [9, p.67, l.1]].

Proof. I. $P(\nu \ge k)P(\xi_k \in [\frac{i-1}{2^n}, \frac{i}{2^n})) = P(\xi_k \in [\frac{i-1}{2^n}, \frac{i}{2^n}); \nu \ge k)$.
II. We may assume $\xi_k \ge 0$. Let $E_{n,i} = \xi_k^{-1}([\frac{i-1}{2^n}, \frac{i}{2^n}))$, $F_n = \xi_k^{-1}([n,\infty))$, and
$g_n = \sum_{i=1}^{n2^n}\frac{i-1}{2^n}I_{E_{n,i}} + nI_{F_n}$.
$P(\nu \ge k)\int g_n(\omega)P(d\omega) = \int_{\nu\ge k}g_n(\omega)P(d\omega)$ [by I].
Let $n \to \infty$. Then $g_n \uparrow \xi_k$ [Rudin [99, p.16, Theorem 1.17]].

Remark 20. The sequences $\{\xi_k^{(1)}\}_{k=1}^{\infty}, \{\xi_k^{(2)}\}_{k=1}^{\infty}, \cdots$ are mutually independent [Borovkov [9, p.67, l.19]].

Proof. $\xi_1,\cdots,\xi_n,\cdots$ are independent
$\Leftrightarrow$ For any subsequence $n_1,n_2,\cdots,n_k$, $\xi_{n_1},\xi_{n_2},\cdots,\xi_{n_k}$ are independent [Borovkov [9, p.44, Definition 10]]
$\Leftrightarrow$ For any subsequence $n_1,n_2,\cdots,n_k$, $\sigma(\xi_{n_1}),\sigma(\xi_{n_2}),\cdots,\sigma(\xi_{n_k})$ are independent [Borovkov [9, p.47, l.8–l.9]]
$\Leftrightarrow$ $\sigma(\xi_1),\cdots,\sigma(\xi_n),\cdots$ are independent [Borovkov [9, p.44, Definition 10]].
Since an event in the algebra generated by $\sigma(\xi_1^{(n)}),\cdots,\sigma(\xi_k^{(n)}),\cdots$ can be represented as a disjoint union of events of the form $(\xi_{n_1}^{(n)})^{-1}([a_1,b_1))\cap\cdots\cap(\xi_{n_k}^{(n)})^{-1}([a_k,b_k))$ [Borovkov [9, p.48, l. −13]], $\sigma(\xi_1^{(n)},\cdots,\xi_k^{(n)},\cdots)$, $n = 1,2,3,\cdots$ are independent [Borovkov [9, p.44, Theorem 4]].

Remark 21. $P(\eta > 2z) \le (1 - 2^{-z})^2$ [Borovkov [9, p.68, l.15]].

Proof. $P(\eta > z) \le 1 - 2^{-z}$ can be interpreted as follows:
Given the total capital $2z$, the probability that no one wins after $z$ games is no more than $1 - 2^{-z}$.
After $z$ games, $z_1, z_2$ may change, but the total capital $2z$ stays the same. Consequently, the same probability can be used for the next run of $z$ games.

Remark 22. $P(\eta_k \le \zeta_k) = \int dF(t)\,G(t+0)$ [Borovkov [9, p.69, l.6]].

Proof. $G(t+0) = \lim_{\varepsilon\downarrow 0}\int_{-\infty}^{t+\varepsilon}g(y)\,dy = \int_{y\le t}g(y)\,dy$.
$P(\eta_k \le \zeta_k) = \sum_x\sum_{y\le x}P(x,y) = \int_t\int_{y\le t}f(t)g(y)\,dy\,dt$.

Remark 23. The variance is equal to the moment of inertia of the distribution of unit mass along the line [Borovkov [9, p.69, l. −11–l. −9]].

Proof. $\mu$ = center of mass = $\frac{\sum_i m_ir_i}{\sum_i m_i}$ = expectation.
Let $\mu = 0$, $\sum_i m_i = 1$. Then moment of inertia = $\sum_i m_ir_i^2$ = variance.

Remark 24. Note that $z = o(n^{2/3}) \Rightarrow p^* \sim p$ [Borovkov [9, p.94, l.3]].

Remark 25. For fixed $y$, $\lim_{n\to\infty}P(\zeta_n < y) = \Phi(y)$ [Borovkov [9, p.99, l.9–l.10]].

Proof 1. By Borovkov [9, p.99, (20)], we have $P(n^{1/7}\ \zeta_n$ […] $\Leftrightarrow [\forall\varepsilon > 0, P(\cap_{n=1}^{\infty}\cup_{m=n}^{\infty}E_m^{\varepsilon}) = 0]$.
$\forall\varepsilon > 0$, $0 = P(\cap_{n=1}^{\infty}\cup_{m=n}^{\infty}E_m^{\varepsilon})$
$= \lim_{n\to\infty}P(\cup_{m=n}^{\infty}E_m^{\varepsilon})$ [Rudin [99, p.17, Theorem 1.19(e)]]
$\ge \inf_{1\le n<\infty}\sup_{m\ge n}P(E_m^{\varepsilon})$
$= \limsup_{n\to\infty}P(E_n^{\varepsilon})$.

Remark 27. If $\xi_n$ is monotonically increasing or decreasing then convergence $\xi_n \xrightarrow{p} \xi$ implies that $\xi_n \xrightarrow{a.s.} \xi$ [Borovkov [9, p. 108, l. −1–p.109, l.1]].

Proof. It suffices to prove that if $\xi_n \downarrow$ and $\xi_n \xrightarrow{p} 0$, then there exists a subsequence $\xi_{n_k}$ such that $\xi_{n_k} \xrightarrow{a.s.} 0$.
Given $k$, there exists $N(k) > N(k-1)$ such that for $n \ge N(k)$, $P(\{\omega \mid \xi_n(\omega) > 2^{-k}\}) < 2^{-k}$.
Let $E_k = \{\omega \mid \xi_{N(k)} > 2^{-k}\}$. Then $P(E_k) < 2^{-k}$.
$\omega \notin \cup_{i=k}^{\infty}E_i \Rightarrow \omega \in \cap_{i=k}^{\infty}E_i^c \Rightarrow (\forall i \ge k, \xi_{N(i)} \le 2^{-i}) \Rightarrow \xi_{N(i)} \to 0$.
$P(\cap_{k=1}^{\infty}\cup_{i=k}^{\infty}E_i) \le P(\cup_{i=k}^{\infty}E_i) \le \sum_{i=k}^{\infty}P(E_i) \le \sum_{i=k}^{\infty}2^{-i} = 2^{-k+1}$.

Remark 28. $|\xi_n - \xi_m|^r \le C_r(|\xi_n - \xi|^r + |\xi_m - \xi|^r)$ for some $C_r$ [Borovkov [9, p. 111, l.5–l.6]].

Proof. $|f(\omega) + g(\omega)|^r \le (|f(\omega)| + |g(\omega)|)^r \le [2\max(|f(\omega)|, |g(\omega)|)]^r$
$= 2^r\max(|f(\omega)|^r, |g(\omega)|^r) \le 2^r(|f(\omega)|^r + |g(\omega)|^r)$.

Remark 29. By Gray [45, p.136, Lemma 5.11], $\eta_n$ is uniformly integrable together with $\xi_n$ [Borovkov [9, p. 113, l.19–l.20]].

Remark 1. The proof given in Borovkov [9, p. 113, l. −14–l. −8] is not as direct as that given in Roch [95, p.2, l.2–l. −7] since the latter proof need not use Gray [45, p.136, Lemma 5.11].

Remark 2. The formulation of Borovkov [9, p. 113, Theorem 5] is not as complete and organized as that of Chung [15, p.101, Theorem 4.5.4].

Remark 3. When we need to consider Fourier transforms on $\mathbb{R}^n$, all the proofs of Borovkov [9, p. 113, Theorem 5], Chung [15, p.101, Theorem 4.5.4] and Roch [95, p.1, Theorem 10.3] can be generalized from the case of a random variable to that of a random vector.
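The elementary inequality used in the proof of Remark 28, $|f + g|^r \le 2^r(|f|^r + |g|^r)$, can be spot-checked on random inputs. The sketch below is illustrative only: the sampling ranges, the seed and the small tolerance for floating-point rounding are assumptions.

```python
import random

def c_r_bound_holds(f, g, r, tol=1e-12):
    """Check |f + g|^r <= 2^r (|f|^r + |g|^r), allowing a tiny rounding slack."""
    return abs(f + g) ** r <= 2 ** r * (abs(f) ** r + abs(g) ** r) + tol

random.seed(0)  # fixed seed so the sample is reproducible
samples = [(random.uniform(-5, 5), random.uniform(-5, 5), random.uniform(0.1, 4.0))
           for _ in range(10000)]
all_hold = all(c_r_bound_holds(f, g, r) for f, g, r in samples)
```

Taking $f = \xi_n - \xi$ and $g = \xi - \xi_m$ gives the bound of Remark 28 with $C_r = 2^r$.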

Remark 30. $T = (\cup_{F(x+)\ne F(x)}\{F(x)\})\cup(\cup_{F\equiv C_i \text{ on a certain interval } I_i}\{C_i\})$ [Borovkov [9, p. 118, l.12–l.14]].

Remark 31. Some remarks on the proof sketch of Borovkov [9, p. 119, Theorem 4A] given in Borovkov [9, p. 119, l.14–l.18]

Remark 1. The intuitive meaning of the statement “$H$ is continuous at each point of a Borel set $B$ such that $P(\xi \in B) = 1$ (i.e. $P_\xi(B) = 1$ [Borovkov [9, p. 33, l.7]])” is “$H$ is continuous a.e. $[P_\xi]$”.

Remark 2. $P_{H(\xi)} = P_\xi\circ H^{-1}$.

Remark 3. Strictly speaking, we cannot directly use Borovkov [9, p. 112, Theorem 4] to prove Borovkov [9, p. 119, Theorem 4A]. A delicate modification is required and a rigorous proof is given in Wellner [125, p.7, Theorem 1.2] (Note that Wellner should have added “$X_n \Rightarrow X$” to the hypothesis). In other words, we should use Wellner [125, pp. 4–5, Theorem 1.1 (ix)] instead of Wellner [125, pp. 4–5, Theorem 1.1 (i)].

Remark 32. Both $b_n\xi_n \Rightarrow 0$ [Borovkov [9, p.120, l.8]] and $h(b_n\xi_n)\xi_n \Rightarrow H'(a)\xi$ [Borovkov [9, p.120, l.11]] follow from Slutsky’s theorem [Geyer [39, p.18, l. −6]] which can be proved using Geyer [39, p.17, Theorem 11; p.18, Theorem 13].

Remark 1. One may wonder if one can replace Geyer [39, p.18, Theorem 13] with Billingsley [4, p.23, Theorem 2.8] in the proof of Slutsky’s theorem. In fact, one cannot. This is because a product measure, by definition, requires that its component measures be independent. However, $X, Y$ in Geyer [39, p.18, l. −3] may be dependent.

Remark 2. $d(Y_n,Y) \xrightarrow{w} 0$ [Geyer [39, p.18, l.2]]
$\Leftrightarrow d(Y_n,Y) \xrightarrow{p} 0$ [Billingsley [4, p.27, l.6–l.16]]
$\Leftrightarrow Y_n \xrightarrow{p} Y$
$\Leftrightarrow Y_n \xrightarrow{w} Y$ [Billingsley [4, p.27, l.6–l.16]].

Remark 3. The second assertion is proved in the same way [Borovkov [9, p.120, l.12]] because it suffices to replace $x \mapsto f(x,Y)$ in Geyer [39, p.18, l. −16] with $x \mapsto f(x,x,Y)$.

Remark 33. The wording of Borovkov [9, p. 121, Corollary 6] is confusing. The corollary should have been formulated as follows:

“Let $G_n, G \in \mathcal{G}$. If every subsequence $(G_{n_k})$ of $(G_n)$ with $G_{n_k} \Rightarrow F_k \in \mathcal{G}$ satisfies $F_k = G$, then $G_n \Rightarrow G$”.
Remark 34. If $1 \in \mathcal{L}$ and $\int f\,dF_n \to \int f\,dF$, $F \in \mathcal{G}$, then $F \in \mathcal{F}$ [Borovkov [9, p.122, l.−13–l.−12]].
Proof. $\int dF_n \to \int dF$
$\Rightarrow 1 = F_n(+\infty) - F_n(-\infty) = F(+\infty) - F(-\infty)$.
Remark 35. $(F \in \mathcal{F}) \Rightarrow (F_n \Rightarrow F)$ [Borovkov [9, p.122, l.−12–l.−11]].
Proof. By Borovkov [9, p.120, Theorem 8], there exist a $G_l \in \mathcal{G}$ and a subsequence $F_{n_l}$ of $F_n$ such that $F_{n_l} \Rightarrow G_l$.
Let $F_{n_k}$ be any subsequence of $F_n$ such that $F_{n_k} \Rightarrow G_k$. Then $\forall f \in \mathcal{L}$, $\lim_k \int f\,dF_{n_k} = \int f\,dG_k$.
By Borovkov [9, p.122, (13)], $\lim \int f\,dF_{n_k} = \int f\,dF$.
Since $\mathcal{L}$ is a distribution determining class, $G_k = F = G_l$.
By Borovkov [9, p.121, Corollary 6], $F_n \Rightarrow F$.
Remark 36. By Borovkov [9, p.131, Corollary 1], $\{e^{itx} \mid t \in \mathbb{R}\}$ is a distribution determining class [Borovkov [9, p.124, l.11–l.12]].
Remark 37. Except for Borovkov [9, p.114, l.10–l.−1], all the content of chapter 6 can be extended to the multidimensional case [Borovkov [9, p.124, l.14–l.15]]. Since the case of a random variable and that of a random vector are quite similar, most of the extension work is routine. To facilitate the extension, it suffices to pay attention to the following points:
1. By extending Borovkov [9, p.116, Definition 6] to the multidimensional case, the domain of $F_m$ is $\mathbb{R}^n$, but $F_m(\mathbb{R}^n) \subset [0,1]$.
2. For $n = 2$, replace $f_\varepsilon(t)$ in Borovkov [9, p.116, l.18] with $f_{\varepsilon;x,y}(t,s) = f_{\varepsilon;x}(t) f_{\varepsilon;y}(s)$ and $F_n(x_j) - F_n(x_{j-1})$ in Borovkov [9, p.117, l.10] with $[F_n(x_j,y_j) - F_n(x_j,y_{j-1})] - [F_n(x_{j-1},y_j) - F_n(x_{j-1},y_{j-1})]$. See Borovkov [9, p.39, l.−12].
3. In Borovkov [9, p.118, l.15], replace $F_m^{-1} \not\to F^{-1}$ with $\exists j\ (1 \le j \le n): \mathrm{Pr}_j F_m^{-1} \not\to \mathrm{Pr}_j F^{-1}$.
4. For Borovkov [9, p.120, Theorem 7], replace the derivative with partial derivatives. See Rudin [97, p.191, l.11–l.12]. In Borovkov [9, p.120, l.3], replace $H'(a)\xi$ with
$\begin{pmatrix} \partial H_x/\partial x & \partial H_x/\partial y \\ \partial H_y/\partial x & \partial H_y/\partial y \end{pmatrix} \begin{pmatrix} \xi_x \\ \xi_y \end{pmatrix}$.
5. In Borovkov [9, p.120, l.−6–l.−4], replace $-\infty$ with $(\{-\infty\} \times [-\infty,+\infty]) \cup ([-\infty,+\infty] \times \{-\infty\})$ and $+\infty$ with $(+\infty,+\infty)$ for $n = 2$.

6. In order to derive the multivariate analog of Borovkov [9, p.124, (14)], we use Spivak [108, p.343, Theorem 4] (let $\omega = F_n f$ and note that $df = \sum_i \frac{\partial f}{\partial x_i}\,dx_i$) instead of Rudin [97, p.122, Theorem 6.30].
Remark 38. $E\xi\eta = E\xi\,E\eta$ [Borovkov [9, p.125, l.12]].

Proof. $\int (x_1 + ix_2)(y_1 + iy_2)\,dF_{\xi_1\xi_2\eta_1\eta_2}(x_1,x_2,y_1,y_2)$
$= \int (x_1 + ix_2)(y_1 + iy_2)\,P_{\xi_1\xi_2\eta_1\eta_2}(d(x_1,x_2,y_1,y_2))$
$= \int (x_1 + ix_2)\,P_{\xi_1\xi_2}(d(x_1,x_2)) \int (y_1 + iy_2)\,P_{\eta_1\eta_2}(d(y_1,y_2))$ (since $(\xi_1,\xi_2)$ and $(\eta_1,\eta_2)$ are independent, we can use the Fubini theorem)
$= \int (x_1 + ix_2)\,dF_{\xi_1\xi_2}(x_1,x_2) \int (y_1 + iy_2)\,dF_{\eta_1\eta_2}(y_1,y_2)$.
Remark. Note that $\xi_1$ and $\xi_2$ may be dependent. If $\xi_1$ and $\xi_2$ were to be considered separately, then measures on $\mathbb{R}^n$ would be easily confused with the product measure.
Remark 39. $Ee^{it\xi} = \int e^{itx}\,dF_\xi(x)$ [Borovkov [9, p.125, l.14]].
Proof. Let $\eta = e^{it\xi}$ and $z = g(x) = e^{itx}$.
$Ee^{it\xi} = E\eta = \int z\,dF_\eta(z)$
$= \int e^{itx}\,dF_\xi(x)$ (integration by change of variable for the pushforward measure $P_\eta = P_\xi \circ g^{-1}$ [Borovkov [9, p.38, l.4]]).
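The change of variable in Remark 39 can be checked on a finite distribution. The following is a minimal sketch, assuming a hypothetical three-point law for ξ (the values and probabilities in `LAW_XI` are illustrative and do not come from Borovkov). It computes the expectation once against the pushforward law of η = g(ξ) and once against the law of ξ.

```python
import cmath
from collections import defaultdict

def expectation_via_xi(law_xi, t):
    """Integrate e^{itx} against the law of xi: sum over x of e^{itx} P(xi = x)."""
    return sum(p * cmath.exp(1j * t * x) for x, p in law_xi.items())

def pushforward_law(law_xi, t, digits=12):
    """Law of eta = g(xi) with g(x) = e^{itx}, i.e. P_eta = P_xi o g^{-1}.

    Values of g that coincide (up to rounding) are merged into one atom,
    as the pushforward measure requires.
    """
    law_eta = defaultdict(float)
    for x, p in law_xi.items():
        z = cmath.exp(1j * t * x)
        key = (round(z.real, digits), round(z.imag, digits))
        law_eta[key] += p
    return dict(law_eta)

def expectation_via_eta(law_eta):
    """Integrate z against the law of eta: sum over z of z P(eta = z)."""
    return sum(p * complex(re, im) for (re, im), p in law_eta.items())

# Hypothetical law of xi (illustrative values only).
LAW_XI = {0.0: 0.2, 1.0: 0.5, 3.0: 0.3}
```

For t = 2π every atom of ξ is mapped to the single point z = 1, so the pushforward law collapses to one atom while both expectations still agree.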

Remark 40. $\varphi_{S_2}(t) = \varphi_{\xi_1}(t)\varphi_{\xi_2}(t)$ [Borovkov [9, p.126, l.6]].
Proof. Let $\eta_1 = e^{it\xi_1}$ and $\eta_2 = e^{it\xi_2}$.
$\varphi_{\xi_1+\xi_2}(t) = Ee^{it(\xi_1+\xi_2)}$ [Borovkov [9, p.125, Definition 1]]
$= E(\eta_1\eta_2)$
$= E\eta_1\,E\eta_2$ (by Borovkov [9, p.47, Theorem 6], $\eta_1$ and $\eta_2$ are independent; then use Borovkov [9, p.65, (7)])

$= \varphi_{\xi_1}(t)\varphi_{\xi_2}(t)$.
Remark 41. By Rudin [99, p.226, Corollary], a Laplace transform $\beta(s)$ on the half-line $s \ge 0$ determines uniquely the ch.f. $\varphi_\xi(\lambda)$ [Borovkov [9, p.127, l.16–l.17]].
Remark 42. $E\eta^k = 0$ for odd $k$, and $E\eta^k = \sigma^k (k-1)(k-3)\cdots 1$ for $k = 2, 4, \cdots$ [Borovkov [9, p.128, l.7–l.8]].
Proof. $E\eta^k = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} x^k e^{-x^2/(2\sigma^2)}\,dx$
$= \frac{\sigma^k}{\sqrt{2\pi}} \int_{-\infty}^{\infty} u^k e^{-u^2/2}\,du$ (let $x = \sigma u$).
$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} u^{2n} e^{-u^2/2}\,du = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} u^{2n} e^{-u^2/2}\,du$
$= \frac{2^n}{\sqrt{\pi}} \int_0^{\infty} w^{n-1/2} e^{-w}\,dw$ (let $u = \sqrt{2w}$)
$= \frac{2^n}{\sqrt{\pi}}\,\Gamma(n + \tfrac{1}{2})$.
Remark 43. By Rudin [99, p.197, Theorem 9.6], $\varphi_\xi(t) \to 0$ as $t \to \infty$ if the distribution of $\xi$ has a density [Borovkov [9, p.129, l.14]].
Remark. Note that the definition of the Fourier transform given in Rudin [99, p.192, (4)] is different from that given in Borovkov [9, p.125, l.−10–l.−9].
Remark 44. Let $f$ be the density of the distribution of $\xi$. If $f$ has integrable derivatives $f', \cdots, f^{(k)}$, then $|\varphi_\xi(t)| = o(|t|^{-k})$ as $|t| \to \infty$ [Borovkov [9, p.129, l.17–l.21]; Feller [31, vol. 2, p.514, Lemma 4]].
Proof. By Rudin [99, p.176, Theorem 8.17], $f', \cdots, f^{(k-1)}$ are absolutely continuous, and hence uniformly continuous. It suffices to prove the following lemma:

If $f$ is integrable and uniformly continuous on $[0,\infty)$, then $|f(t)| \to 0$ as $t \to \infty$. For the proof of the lemma, read
http://math.stackexchange.com/questions/92105/f-uniformly-continuous-and-int-a-infty-fx-dx-converges-imply-lim-x.
Remark 1. The last two proofs of the lemma can be generalized to the multidimensional case. The third proof is the most direct one because it does not use reduction to absurdity. However, the discussion given in the last two lines of the third proof should have been divided into two cases: Case $|f(x_0)| < \varepsilon$ and Case $|f(x_0)| \ge \varepsilon$.
Remark 2. For the multidimensional case, $f'(x)\,dx$ given in Borovkov [9, p.129, l.−14] should be replaced with $df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i$.
Remark 45. Formula (4) can be obtained from (5) [Borovkov [9, p.131, l.4]].
Proof. Let $h = f * \frac{I([-y,-x])}{y-x}$ be the density of the distribution of $\eta$.
I. $h(0) = \left(f * \frac{I([-y,-x])}{y-x}\right)(0) = \frac{F(y) - F(x)}{y-x}$.
II. By Borovkov [9, p.126, l.11], $\varphi_\eta(t) = \left[\frac{e^{-itx} - e^{-ity}}{(y-x)it}\right]\varphi_\xi(t)$.
III. By II and Borovkov [9, p.130, l.−9], $\varphi_\eta(t)$ is integrable with respect to $t$.
By Borovkov [9, p.130, l.−1], $h(0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \varphi_\eta(t)\,dt$.
By I and II, $\frac{F(y) - F(x)}{y-x} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-itx} - e^{-ity}}{(y-x)it}\,\varphi_\xi(t)\,dt$.
Remark 46. By the equality given in Rudin [98, p.169, l.8], $\tilde{g}$ is absolutely integrable [Borovkov [9, p.135, l.−6]].
Remark 47. If $\varphi_\xi(t)$ is known, the ch.f. of any subcollection of random variables $(\xi_{k_1}, \cdots, \xi_{k_j})$ can be obtained by putting all $t_k$ except $t_{k_1}, \cdots, t_{k_j}$ equal to 0 [Borovkov [9, p.139, l.3–l.4]].
Proof. It suffices to consider an indicator $I([a,b))$.

$\int I([a,b))(x_1)\,P_{\xi_1\xi_2}(dx_1,dx_2)$
$= \int I([a,b) \times (-\infty,\infty))(x_1,x_2)\,P_{\xi_1\xi_2}(dx_1,dx_2)$
$= [F_{\xi_1\xi_2}(b,\infty) - F_{\xi_1\xi_2}(b,-\infty)] - [F_{\xi_1\xi_2}(a,\infty) - F_{\xi_1\xi_2}(a,-\infty)]$ [Borovkov [9, p.39, l.−11]]
$= F_{\xi_1\xi_2}(b,\infty) - F_{\xi_1\xi_2}(a,\infty)$ [Borovkov [9, p.29, FF2]]
$= F_{\xi_1}(b) - F_{\xi_1}(a)$ [Borovkov [9, p.29, FF1]]
$= \int I([a,b))(x_1)\,P_{\xi_1}(dx_1)$.
Remark 48. (An index set must be chosen properly: one more candidate would be too many and one less would be too few) The ch.f. of a random variable uniquely determines its distribution function [Borovkov [9, p.139, l.18–l.19]].

Proof. Let $n = 2$ and $\Delta = (a_1,b_1) \times (a_2,b_2)$ [Borovkov [9, p.130, l.6–l.7]].
In order to define $F_\xi(0)$, we need to find $\Delta_i$ so that $\cup_i \Delta_i = (-\infty,0) \times (-\infty,0)$, where the $\Delta_i$’s are mutually disjoint. This will ensure that except for countable $i$, $P_\xi(\partial\Delta_i) = 0$. Then, by left-continuity of $F_\xi$ and the inversion formula [Borovkov [9, p.130, l.6–l.9]], we may define $F_\xi(0) = \sup_i P(\xi \in \Delta_i)$.
Case I. (Improper choice: one more candidate would be too many)
If we define $F_\xi(0) = \sup_{\Delta \subset (-\infty,0) \times (-\infty,0),\ P_\xi(\partial\Delta) = 0} P(\xi \in \Delta)$, at best we define only an unsolved problem. This is because many recruited candidates are unqualified, but we put them into consideration and have no way to get rid of them in order to satisfy the condition. Confucius says, “Only by carefully distinguishing what one knows from what one doesn’t may one have a deeper understanding.”
Case II. (Wrong choice: one less would be too few)

$\forall k \in \mathbb{N}$, let $\Delta_k = (-k, -\frac{1}{k}) \times (-k, -\frac{1}{k})$. Suppose we define $F_\xi(0) = \sup_k P(\xi \in \Delta_k)$. Since $P_\xi(\partial\Delta_k)$ may not be 0, there may not be enough $\Delta_k$’s that can satisfy the condition $\cup_{P_\xi(\partial\Delta_k) = 0} \Delta_k = (-\infty,0) \times (-\infty,0)$. Thus, we choose too few candidates.
Case III. (Proper choice)
$\forall x \ge 1$, let $\Delta_x = (-x, -\frac{1}{x}) \times (-x, -\frac{1}{x})$. By Rudin [99, p.17, Theorem 1.19(d)], we define $F_\xi(0) = \sup_{x \ge 1,\ P_\xi(\partial\Delta_x) = 0} P(\xi \in \Delta_x)$. Then the boundaries $\partial\Delta_x$ are mutually disjoint. Consequently, except for countable $x$, $P_\xi(\partial\Delta_x) = 0$.
Remark. Mathematics discusses the process of finding a solution rather than just proving that a given solution is true.
Remark 49. $1 - \frac{1}{2} t A^{-1} t^T + o(\sum t_k^2) = 1 + it\,E\xi^T - \frac{1}{2} t M t^T + o(\sum t_k^2)$ [Borovkov [9, p.141, l.3]].
Proof. For the one-dimensional case, $\varphi^{(k)}(0) = i^k E\xi^k$ [Borovkov [9, p.126, l.−16]]. For the multidimensional case, consider the partial derivatives of $\varphi$.
Remark 50. The strong law of large numbers and the central limit theorem [Borovkov [9, p.151, Theorem 1; p.152, Theorem 2]; Lindgren [74, p.155, Khintchine’s theorem; p.158, central limit theorem]]
Remark 1. The proofs of both Lindgren [74, p.155, Khintchine’s theorem; p.158, central limit theorem] and Borovkov [9, p.151, Theorem 1; p.152, Theorem 2] are essentially the same except that the former proofs use the lemma given in Lindgren [74, p.156, l.6–l.8] while the latter proofs do not. It is easy for the former proofs to be generalized to the multidimensional case, but it is difficult for the latter proofs. Furthermore, $-\frac{t^2}{2} + o(1) \to -\frac{t^2}{2}$ given in Borovkov [9, p.153, l.3] is incorrect because $o(1)$ refers to $t \to 0$ rather than $n \to \infty$.
Remark 2. (Stronger convergences) The proofs of both Lindgren [74, p.155, Khintchine’s theorem; p.158, central limit theorem] and Borovkov [9, p.151, Theorem 1; p.152, Theorem 2] use Borovkov [9, p.132, Theorem 2]. Therefore, the convergences in both Khintchine’s theorem and the central limit theorem are essentially weak convergences.
The proof of Borovkov [9, p.151, Theorem 1] gives the weak convergence $F_{S_n/n} \Rightarrow a$. By Lindgren [74, p.154, Theorem B], the weak convergence can be strengthened to the convergence in probability $S_n/n \xrightarrow{p} a$. The strong law of large numbers [Chung [15, p.133, Theorem 5.4.2 (8)]] strengthens the convergence in probability $S_n/n \xrightarrow{p} a$ further to the almost sure convergence $S_n/n \xrightarrow{a.s.} a$ [Borovkov [9, p.151, l.−11]]. The proof of Borovkov [9, p.152, Theorem 2] gives the weak convergence $F_{\zeta_n} \Rightarrow \Phi$. By Borovkov [9, p.116, Theorem 6], we have the pointwise convergence $F_{\zeta_n}(x) \to \Phi(x)$ $(x \in \mathbb{R})$. By Parzen [87, p.438, Exercise 5.2], $F_{\zeta_n}(x) \to \Phi(x)$ uniformly in $x \in \mathbb{R}$.
Remark. The strong law of large numbers for the Bernoulli scheme follows from Borovkov [9, p.91, Theorem 2; p.109, Theorem 1].
Remark 3. If the metric space $S$ given in Billingsley [4, p.3, l.9] is the real line $\mathbb{R}$, then the proof given in Lindgren [74, p.154, l.−5–p.155, l.5] is more intuitive than that given in Billingsley [4, p.27, l.5–l.15].
Remark. For the right side of the inequality given in Lindgren [74, p.155, l.2], note that $P(Y_n = k - \varepsilon) = 0$ [Billingsley [4, p.26, Theorem 2.1(iii)]].
Remark 4. The motivation to choose $Z_n$ in formulating the central limit theorem [Lindgren [74, p.157, l.8–l.−3]].
I. Choose $Z_n$ instead of $S_n$ to keep track of the shape of the limiting distribution function.

II. Choose $Z_n$ instead of $Y_n$ to avoid the singularity of the limiting distribution function. By the weak law of large numbers, $F_{Y_n} \Rightarrow F_{I_{EX}}$, where $F_{I_{EX}}$ has a single jump at $EX$.
III. Standardization (mean = 0, var = 1), which keeps the limiting distribution function from shrinking or expanding, leads us from $Y_n$ to $Z_n$ naturally.
Remark 5. (Motivation for using characteristic functions; key points vs. details; natural proofs; physical meanings) Both Reif [93, p.35, l.6–p.40, l.8] and Borovkov [9, p.75, Theorem 7; §5.1–§5.3; §5.5; §7.1–§7.4; §7.6; §8.1–§8.2] discuss the strong law of large numbers and the central limit theorem. The former indicates the motivation for using characteristic functions to prove these theorems [Reif [93, p.36, l.1–l.17]] and reveals that the key idea of proving these theorems is simple, original and excellent [Reif [93, p.36, l.17–l.−1]]. However, the former lacks details; its statements are crude; its proofs are not rigorous. Although the latter provides details, accuracy, and rigor, its proofs lack motivation and its key points are vague. The way to keep the merits of both approaches is to select the key statements in the former and find their corresponding rigorous ones in the latter.
I. $P(x)\,dx$ [Reif [93, p.35, l.10]] → $dF(x)$ [Borovkov [9, p.29, l.−17]]. For the latter expression, every component (the set of elementary outcomes, σ-algebra, probability) of the probability space [Borovkov [9, p.17, Definition 6]] is clearly specified. Thus, we have a rigorous mathematical structure ready to hand.
II. Reif [93, p.35, (1·10·2)] → Borovkov [9, p.54, l.7–l.16; p.126, l.11]. This can lead to the equality given in Borovkov [9, p.97, l.8].
III. By using the Dirac δ function, we obtain Reif [93, p.35, (1·10·2) & (1·10·3)] = Reif [93, p.36, (1·10·4)]. Reif [93, p.36, (1·10·5)] → Borovkov [9, p.126, l.4–l.6; p.130, (5)].
Reif [93, p.36, (1·10·5)], which leads to Reif [93, p.36, (1·10·6)], gives the reason why we should use characteristic functions to prove the strong law of large numbers and the central limit theorem. Consequently, do not consider the inversion formula an unnatural thing. In fact, based on the physical consideration given in Reif [93, p.36, l.1–l.17], only through the use of characteristic functions and the inversion formula may we have a simple, natural and general [Reif [93, p.37, l.−7–l.−6]] method of dealing with the convergence of the sum of a sequence of independent identically distributed random variables. The approaches given in Borovkov [9, p.97, l.−12–p.99, l.11] and in the proof of Chung [15, p.114, Theorem 5.2.2] are artificial, while the proofs of Borovkov [9, p.151, Theorem 1; p.152, Theorem 2] are natural.
IV. For the Riemann–Lebesgue lemma [Borovkov [9, p.129, 8]; Rudin [99, p.197, Theorem 9.6]], Reif [93, p.38, l.1–l.4] provides its physical meaning and the motivation for its formulation. The proof given in Reif [93, p.38, Remark] is not as good as the proof given in https://en.wikipedia.org/wiki/Riemann%E2%80%93Lebesgue_lemma.
V. Reif [93, p.38, l.1–l.12] provides the motivation for formulating the central limit theorem [Reif [93, p.39, l.−7]; Borovkov [9, p.152, Theorem 2]].
Remark 51. Royden [96, p.62, Proposition 15] shows that the definition given in Royden [96, p.56, Definition], the definition of $A$ [Borovkov [9, p.428, l.7–l.8]], and the definition of $F_N$ [Borovkov [9, p.430, l.−8]] are equivalent. By Borovkov [9, p.429, Theorem 2], the extension of $P$ given in Borovkov [9, p.429, l.2] is the same as the extension of $P$ given in Borovkov [9, p.430, l.−8].

Remark 52. We may easily extend the function $F$ on $D$ to a non-decreasing and left-continuous function on $\mathbb{R}$ by defining $F(x) = \sup_{d \in D,\ d \le x} F(d)$ [Borovkov [9, p.455, l.16–l.18]].

(B). Chung [15]

Remark 1. Chung [15, p.46, Exercise 4] can be proved by replacing $X$ with $X/c$ in Chung [15, p.45, Theorem 3.2.1].
Remark 2. Theorem 3.3.2 [Chung [15, p.54, l.−8–l.−4]]
Proof. $\{\omega \mid f(X_1,X_2) \in B\} = \{\omega \mid (X_1,X_2) \in f^{-1}(B)\}$.
We may approach $B$ by lattices. See Borovkov [9, p.426, l.9; p.428, l.7–l.8, Lemma 1 and Theorem 1].
Remark 3. $\sum_{k>n} \frac{c}{k^2 \log k} \sim \frac{c}{n \log n}$ [Chung [15, p.118, l.−10]] because $\int_n^\infty \frac{1}{x^2 \log x}\,dx = \left.-\frac{1}{x \log x}\right|_n^\infty - \int_n^\infty \frac{1}{x^2 (\log x)^2}\,dx$.
Remark 4. $\int_{M_k} X'_{k+1}\,dP = 0$ [Chung [15, p.124, l.8]].
Proof. $\int_{M_k} X'_{k+1}\,dP = \int_\Omega 1_{M_k} X'_{k+1}\,dP$
$= \int_\Omega 1_{M_k}\,dP \int_\Omega X'_{k+1}\,dP$ (by independence).
Remark 5. $\sum_n X_n$ converges a.e. [Chung [15, p.127, l.6]].
Proof. By Chung [15, p.127, (10)],
$\forall \varepsilon, \exists m_0 \ge 1 : m \ge m_0 \Rightarrow P\{\max_{m < j \le n} |S_{m,j}| > 2\varepsilon\} < \frac{\varepsilon}{1-\varepsilon}$.
Let $n \to \infty$. Then
$\forall \varepsilon, \exists m_0 \ge 1 : m \ge m_0 \Rightarrow P\{\sup_{j>m} |S_{m,j}| > 2\varepsilon\} \le \frac{\varepsilon}{1-\varepsilon}$.
Let $\varepsilon = 2^{-(k+1)}$ and $m_0 = m_k$. Then
$\forall k \ge 1, \exists m_k \ge 1 : m \ge m_k \Rightarrow P\{\sup_{j>m} |S_{m,j}| > 2^{-k}\} \le 2^{-k}$.
Let $T_m(\omega) = \sup_{j>m} |S_{m,j}(\omega)|$. Then $T_m \xrightarrow{p} 0$.
By Borovkov [9, p.109, Theorem 1], $T_m \xrightarrow{a.s.} 0$.
$\exists k \ge 1 : m \ge k \Rightarrow |T_m(\omega)| < \varepsilon$.
Remark 6. The proof given in https://en.wikipedia.org/wiki/Kronecker%27s_lemma is better than the proof of Chung [15, p.129, Kronecker’s lemma].

Remark 7. $\sum_n P\{|X_1| > n\} < \infty$ [Chung [15, p.133, l.8]] follows from Chung [15, p.45, Theorem 3.2.1].
Remark 8. If $\sum_n P\{|X_n| \ge a_n\} = +\infty$, then $\limsup_n \frac{|S_n|}{a_n} = +\infty$ a.e. [Chung [15, p.136, l.13–l.14]]
Proof. By an argument similar to that given in Chung [15, p.118, l.−2–p.119, l.2], $P\{|S_n| > a_n/2 \text{ i.o.}\} = 1$.
Consequently, $\limsup_n \frac{|S_n|}{a_n} \ge \frac{1}{2}$ a.e. $(*)$
$\forall A > 0$, $\sum_n P\{|X_n| \ge A a_n\} = +\infty$. Otherwise, $\exists A > 0 : \sum_n P\{|X_n| \ge A a_n\} < +\infty$. By an argument similar to that given in Chung [15, p.135, l.4–p.136, l.12], $\lim_n \frac{|S_n|/A}{a_n} = 0$ a.e. This would contradict $(*)$.
By an argument similar to that given in Chung [15, p.118, l.−2–p.119, l.2], $P\{|S_n| > A a_n/2 \text{ i.o.}\} = 1$.
Namely, $\forall A > 0$, $\exists Z(A)$ with $P(Z(A)) = 0$ : $\omega \in \Omega \setminus Z(A) \Rightarrow \limsup_n \frac{|S_n(\omega)|}{a_n} \ge \frac{A}{2}$.
Remark 9. (Indigo blue is extracted from the indigo plant but is bluer than the plant it comes from) Most mathematical theorems do not come from nowhere. A new theorem is often a supplement, a stronger version, an analog or an extension of an old theorem. The generation of this kind of derivative makes up a significant part of the development of a theory. Thus, indigo blue is extracted from the indigo plant but is bluer than the plant it comes from. The

following evidence convinces us that we should learn to control rather than follow the flow of a proof.
I. Supplements: Chung [15, p.133, Theorem 5.4.2, (9)] is a supplement of Chung [15, p.133, Theorem 5.4.2, (8)]. The former discusses Case $E(|X_1|) = \infty$, while the latter discusses Case $E(|X_1|) < \infty$.
II. Stronger versions: Chung [15, p.133, Theorem 5.4.2] is stronger than Chung [15, p.114, Theorem 5.2.2]. The convergence in probability of the latter is strengthened to the almost sure convergence of the former.
III. Extensions: Chung [15, p.134, Theorem 5.4.3] is an extension of Chung [15, p.133, Theorem 5.4.2, (9)] because the hypothesis of the former is more flexible than that of the latter. Similarly, Chung [15, p.121, Theorem 5.3.1] is an extension of Chung [15, p.50, Chebyshev’s inequality] and Chung [15, p.116, Theorem 5.2.3] is an extension of Chung [15, p.114, Theorem 5.2.2].
The example of strengthening given in II is a special case of Chung [15, p.126, Theorem 5.3.4]. Since Borovkov [9, p.151, Theorem 1] is the same as Chung [15, p.114, Theorem 5.2.2], we expect the proof of Borovkov [9, p.151, Theorem 1] and that of Chung [15, p.133, Theorem 5.4.2, (8)] to be similar, but they are actually different. In order to organize the structure of the proof of Chung [15, p.133, Theorem 5.4.2, (8)], we may use Borovkov [9, p.151, Theorem 1] and Chung [15, p.126, Theorem 5.3.4] to prove Chung [15, p.133, Theorem 5.4.2, (8)]. The proof provided by this method is more compatible with that of Borovkov [9, p.151, Theorem 1] than that of Chung [15, p.133, Theorem 5.4.2, (8)].
IV. Analogs: The formula given in Ellison–Ellison [28, p.265, l.−10] is an analog of Ellison–Ellison [28, p.46, Theorem 2.5].
Remark 1. The proof of Chung [15, p.133, Theorem 5.4.2, (8)] and that of Loève [75, p.251, l.15–p.252, l.6] are essentially the same.
Remark 2.
(Control the work flow by dividing it into several applicable sections) It seems magical that Chung [15, p.121, Theorem 5.3.1] makes the hypothesis of Borovkov [9, p.75, (13)] more flexible [Chung [15, p.121, l.−3–l.−1]]. If we compare Chung [15, p.122, l.−2–l.−1] with Borovkov [9, p.75, l.4], we find that their key ideas are essentially the same except that the former divides the work flow into several applicable sections [Chung [15, p.122, l.10]]. This technique of segmentation designed for independence is also used in the proofs of Chung [15, p.123, Theorem 5.3.2; p.126, Theorem 5.3.4].

4. Calculus of variations (Courant–Hilbert [21, vol. 1, chap. IV]; Fomin–Gelfand [35])

(A). Courant–Hilbert [21, vol. 1, chap. IV]

Remark 1. The area of $A'_1 A_h A'_{n+1}$ is greater than that of $A_1 A_h A_{n+1}$ [Courant–Hilbert [21, vol. 1, p.167, l.8]].
Proof. Let lengths $OA$ and $OB$ be fixed. Then the area of $\triangle AOB$ is the largest when $\angle AOB$ is a right angle.
Remark 2. The requirement that $y(x)$ have continuous derivatives up to the $2n$-th order [Courant–Hilbert [21, vol. 1, p.190, l.10]] comes from Courant–Hilbert [21, vol. 1, p.190, l.−8, (21)].
Remark 3. Courant–Hilbert [21, vol. 1, p.190, l.−8, (21)] follows from the statement given in Courant–Hilbert [21, vol. 1, p.185, l.−19–l.−17].
Remark 4. $[F]_y$ vanishes identically if and only if $F(x) = \frac{d}{dx} G(x,y)$ [Courant–Hilbert [21, vol. 1, p.194, l.1–l.2]].

Proof. $[F]_y$ vanishes identically
$\Leftrightarrow \frac{\partial A}{\partial y} - \frac{\partial B}{\partial x} = 0$ [Courant–Hilbert [21, vol. 1, p.193, l.−10]]
$\Leftrightarrow F(x)\,dx = dG(x,y)$ [Courant–John [22, vol. 2, p.104, l.16–l.21]].
Remark 5. $[F]_u$ vanishes identically in any admissible function $u$ if and only if $F = A_x + B_y$, where $A$ and $B$ are functions of $x$, $y$, and $u$.
Proof. I. The following three questions talk about the same thing (question 1 is the same as question 3) even if they have different outlooks:
1. (From the viewpoint of calculus of variations) If $\int F \equiv$ constant for any admissible function $y$ (with given boundary conditions), what form does $F$ have?
2. (From the viewpoint of differential equations) If the Euler expression $[F]_y$ vanishes identically for any admissible function $y$, what form does $F$ have?
3. (From the viewpoint of differential geometry) If the differential form $\omega$ in Spivak [108, vol.1, p.354, Stokes’ theorem] is specified, what form does $d\omega$ have?
II. $\iint (\nabla \times G) \cdot \hat{n}\,dS = \oint G \cdot dr$ is a special case of Spivak [108, vol.1, p.354, Stokes’ theorem].
Let $G = (-B, A, 0)$. Then
$G \cdot dr = (-B, A, 0) \cdot (dx, dy, dz) = -B\,dx + A\,dy$.
$(\nabla \times G) \cdot \hat{n} = \left[\left(\frac{\partial 0}{\partial y} - \frac{\partial A}{\partial z}\right) i + \left(\frac{\partial (-B)}{\partial z} - \frac{\partial 0}{\partial x}\right) j + \left(\frac{\partial A}{\partial x} + \frac{\partial B}{\partial y}\right) k\right] \cdot k = \frac{\partial A}{\partial x} + \frac{\partial B}{\partial y}$.
Remark 6. By O’Neill [86, p.219, Exercise 2], the expression $\frac{u_{xx}u_{yy} - u_{xy}^2}{(1 + u_x^2 + u_y^2)^{3/2}}$ is the Gaussian curvature of the surface $z = u(x,y)$ multiplied by $\sqrt{1 + u_x^2 + u_y^2}$ [Courant–Hilbert [21, vol. 1, p.196, l.1–l.4]].
Remark 7. By Courant–John [22, vol. 2, p.429, (29a)], the integral of the curvature over a segment of a surface depends only on the tangent planes of the surface along the boundary of the segment [Courant–Hilbert [21, vol. 1, p.196, l.5–l.7]].
Remark 8. According to Rudin [99, p.186, (1)], Courant–Hilbert [21, vol. 1, p.199, (33)] should have been corrected as follows:
$\iint F(x,y,z,z_x,z_y)\,dx\,dy$
$= \iint F\left(x,y,z, -\frac{\partial(y,z)/\partial(u,v)}{\partial(x,y)/\partial(u,v)}, -\frac{\partial(z,x)/\partial(u,v)}{\partial(x,y)/\partial(u,v)}\right) \left|\frac{\partial(x,y)}{\partial(u,v)}\right| du\,dv$
$= \iint \mathfrak{F}\left(x,y,z, \frac{\partial(y,z)}{\partial(u,v)}, \frac{\partial(z,x)}{\partial(u,v)}, \frac{\partial(x,y)}{\partial(u,v)}\right) du\,dv$.
Remark 9. Note that if $\varphi$ is piecewise continuous and $\int_{x_0}^{x_1} (\varphi - c)^2\,dx = 0$, then $\varphi \equiv c$ on $[x_0,x_1]$ [Courant–Hilbert [21, vol. 1, p.201, l.10–l.11]].

Remark 10. $y'$ may be expressed as a continuously differentiable function $\varphi(x,y,F_{y'})$ [Courant–Hilbert [21, vol. 1, p.202, l.−13–l.−12]].
Proof. Let $z = F_{y'}(x,y,y')$. Then $F_{y'}(x,y,y') - z = 0$.
Since $F_{y'y'} \ne 0$, by the implicit function theorem, there exists a continuously differentiable function $\varphi(x,y,z)$ such that $y' = \varphi(x,y,z)$ and

$F_{y'}(x,y,\varphi(x,y,z)) = z$.
Remark 11. The argument given in Courant–Hilbert [21, vol. 1, p.212, l.7–l.8] is oversimplified. For details, see Reif [93, §A·10].
Remark 12. $T_x : T_y = (e + f y') : (f + g y')$ is the condition for an orthogonal intersection [Courant–Hilbert [21, vol. 1, p.213, l.−6–l.−5]].
Proof. I. Let $\alpha(t) = x(u(t), v(t))$ and $L(\alpha) = \int_{t_0}^{t_1} |\alpha'(t)|\,dt$. Then
$\frac{d\alpha(t)}{dt} = x_u \frac{du}{dt} + x_v \frac{dv}{dt}$.

$\left|\frac{d\alpha(t)}{dt}\right|^2 = e\left(\frac{du}{dt}\right)^2 + 2f\left(\frac{du}{dt}\right)\left(\frac{dv}{dt}\right) + g\left(\frac{dv}{dt}\right)^2$.
Let $t = x = u$, $y = v$. Then $\sqrt{e + 2f y' + g y'^2}\,dx$ is the arc length element of a surface $x(u,v)$.
II. $\frac{dx(x,y(x))}{dx} = x_u + x_v y'$.
$(x_u + x_v y') \cdot x_u = e + f y'$.
$(x_u + x_v y') \cdot x_v = f + g y'$.

(B). Fomin–Gelfand [35]

Remark 1. Fomin–Gelfand [35] is a good book because it rigorously defines the variation of a functional $\delta J[h]$ [Fomin–Gelfand [35, p.11, l.−10–p.12, l.3]], carefully considers the theoretical development of calculus of variations, and clearly provides the required hypotheses for the theory. However, there are some drawbacks in its notation which make it difficult to understand the essential meaning of a theorem.
Example 1.2. (Better notations)
$\Delta F_{y'} = \Delta x\,\bar{F}_{y'x} + \Delta y\,\bar{F}_{y'y} + \Delta y'\,\bar{F}_{y'y'}$ given in Fomin–Gelfand [35, p.17, l.−14] should have been corrected as follows:
$\Delta F_{y'} = \Delta x\,F_{y'x}(x + \theta\Delta x, y + \theta\Delta y, y' + \theta\Delta y') + \Delta y\,F_{y'y}(x + \theta\Delta x, y + \theta\Delta y, y' + \theta\Delta y') + \Delta y'\,F_{y'y'}(x + \theta\Delta x, y + \theta\Delta y, y' + \theta\Delta y')$, where $0 < \theta < 1$. See Taylor’s polynomial formula [Mann–Taylor [76, p.208, (7.5-4)]].
Explanations. By Fomin–Gelfand [35, p.17, l.14–l.15], $F$ satisfies the condition given in Mann–Taylor [76, p.207, l.−14–l.−13].
“Certain intermediate curves” given in Fomin–Gelfand [35, p.17, l.−12] refer to $(y + \theta\Delta y)(x)$.
Another drawback of Fomin–Gelfand [35] is that it states the required hypotheses only once at the beginning of the book [Fomin–Gelfand [35, p.14, l.8–l.15; p.22, l.13–l.21]] and never mentions them again. When readers need to use them, they have probably already forgotten them. Without reviewing these hypotheses, the readers may have difficulty in quoting proper theorems in a proof.
Example 1.3. The right side of the equality given in Fomin–Gelfand [35, p.23, l.20] omits the small terms denoted by the dots because $(\Delta(x,y) \to 0) \Rightarrow$ ($h$, $h_x$, and $h_y$ simultaneously approach 0).
This is because $z(x,y)$ satisfies the condition given in Fomin–Gelfand [35, p.22, l.19].
Remark 2. Fomin–Gelfand [35, p.24, l.8] should have been stated in detail as follows: Let $R$ be a closed region in the $xy$-plane. We want to find a function $z(x,y)$ so that the graph $z = z(x,y)$ gives the least surface area, where $z(x,y)$ must satisfy Fomin–Gelfand [35, p.22, conditions 1, 2 and 3].
Remark 3. $J[z] = \iint_R \sqrt{1 + z_x^2 + z_y^2}\,dx\,dy$ [Fomin–Gelfand [35, p.24, l.9]].
Proof. Let $A$ be the surface area of the graph $z = z(x,y)$. By O’Neill [86, p.281, l.6],
$\Delta A = \|\Delta x\,x_x \times \Delta y\,x_y\| = \|x_x \times x_y\|\,\Delta x\,\Delta y$,
where $x(x,y) = (x,y,z(x,y))$.
$x_x = (1,0,z_x)$, $x_y = (0,1,z_y)$ and
$x_x \times x_y = \begin{vmatrix} i & j & k \\ 1 & 0 & z_x \\ 0 & 1 & z_y \end{vmatrix} = (-z_x, -z_y, 1)$.
Hence $\|x_x \times x_y\| = \sqrt{1 + z_x^2 + z_y^2}$.
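The cross product in the proof of Remark 3 can be checked numerically. The following is a minimal sketch, assuming the patch x(x, y) = (x, y, z(x, y)) and a plain midpoint rule; the function names are illustrative and come from neither Fomin–Gelfand nor O’Neill.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def area_element(zx, zy):
    """Norm of x_x × x_y for the patch x(x, y) = (x, y, z(x, y))."""
    xx = (1.0, 0.0, zx)   # partial derivative of the patch with respect to x
    xy = (0.0, 1.0, zy)   # partial derivative of the patch with respect to y
    n = cross(xx, xy)     # equals (-zx, -zy, 1)
    return math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)

def graph_area(zx, zy, a, b, c, d, n=200):
    """Midpoint-rule approximation of J[z] over the rectangle [a, b] x [c, d].

    zx and zy are callables returning the partial derivatives of z.
    """
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            total += area_element(zx(x, y), zy(x, y))
    return total * hx * hy
```

For the plane z = x + 2y over the unit square, the partial derivatives are the constants 1 and 2, so `graph_area` should return a value close to √6.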

Remark 4. Substituting $x = r(\theta - \sin\theta) + c$, $y = r(1 - \cos\theta)$ [Fomin–Gelfand [35, p.26, l.−1]] directly into the Euler equation, we may find it is a solution. However, it is more difficult to guess the solution given in Fomin–Gelfand [35, p.26, l.−1] from the Euler equation than to guess Marion–Thornton [77, p.212, (6.23)] from Marion–Thornton [77, p.212, (6.22)]. Consequently, it is important to choose the right coordinate system. In Fomin–Gelfand [35, p.15, (13)], the set of admissible functions is $\{h \in C^1[a,b] \mid h(a) = h(b) = 0\}$, while in Fomin–Gelfand [35, p.25, (27); l.−8], the set of admissible functions is $C^1[a,b]$. Fomin–Gelfand [35, p.25, (28)] can be proved as follows:
Proof 1. The set of admissible functions for Fomin–Gelfand [35, p.25, (27)] is larger, but we consider only a smaller set of admissible functions $\{h \in C^1[a,b] \mid h(a) = h(b) = 0\}$ [Fomin–Gelfand [35, p.25, l.−8]]. Similar to the proof of Fomin–Gelfand [35, p.15, (14)], Fomin–Gelfand [35, p.25, (28)] follows from Fomin–Gelfand [35, p.25, (27)].
Proof 2. Suppose $y_0 \in C^1[a,b]$ is the curve for which Fomin–Gelfand [35, p.25, (26)] has an extremum. Then let $h \in C^1[a,b]$ vary, but let its endpoints be 0: $h(a) = 0$ and $h(b) = 0$. Then Fomin–Gelfand [35, p.25, (28)] follows from the necessary condition of an extremum and Fomin–Gelfand [35, p.25, (27)].
During the process of finding a solution for the extremum of a functional, we should have the following understanding: Fomin–Gelfand [35, pp.26–27, Example] illustrates the fact that the necessary condition for an extremum of a functional greatly reduces the scope of solutions. However, before we discuss the sufficient condition for an extremum, the statement that the particle reaches the line $x = b$ in the shortest time is unproven.
Remark 5. By O’Neill [86, p.281, FIG. 6.15], $\frac{\Delta\sigma}{\Delta\sigma_1} \to \begin{vmatrix} x_u & x_v \\ y_u & y_v \end{vmatrix}$ as $\Delta\sigma_1 \to 0$ [Fomin–Gelfand [35, p.30, l.−15]].
Remark 6. $\int_a^b \frac{d\Phi}{dx}\,dx$ takes the same value along all the curves satisfying the boundary conditions (2) [Fomin–Gelfand [35, p.36, l.18–l.19]].
Proof. $\int_a^b \frac{d\Phi}{dx}\,dx = \Phi(b,B_1,\cdots,B_n) - \Phi(a,A_1,\cdots,A_n)$.
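The statement of Remark 6 can be illustrated numerically for n = 1. The following is a minimal sketch, assuming a hypothetical Φ(x, y) = x²y + sin y and two test curves with the same endpoints; neither is taken from Fomin–Gelfand. It integrates the total derivative dΦ/dx along each curve with a composite Simpson rule.

```python
import math

def phi(x, y):
    """Hypothetical Φ(x, y), chosen only for illustration."""
    return x * x * y + math.sin(y)

def total_derivative(x, y, dy):
    """dΦ/dx along a curve y(x): Φ_x + Φ_y * y'."""
    phi_x = 2.0 * x * y
    phi_y = x * x + math.cos(y)
    return phi_x + phi_y * dy

def integrate_along(curve, dcurve, a, b, n=2000):
    """Composite Simpson rule for the integral of dΦ/dx over [a, b]; n must be even."""
    h = (b - a) / n

    def g(x):
        return total_derivative(x, curve(x), dcurve(x))

    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3.0
```

Along both y = x and y = x³ on [0, 1], which share the boundary values y(0) = 0 and y(1) = 1, the integral should agree with Φ(1, 1) − Φ(0, 0) up to quadrature error.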

Remark 7. The arc length between the points corresponding to the values $t_1$ and $t_2$ of parameter $t$ equals

$J[u,v] = \int_{t_1}^{t_2} \sqrt{E u'^2 + 2F u'v' + G v'^2}\,dt$

[Fomin–Gelfand [35, p.37, l.−10–l.−9]].
Proof.

$\sqrt{E u'^2 + 2F u'v' + G v'^2}\,dt = \sqrt{\frac{dr}{dt} \cdot \frac{dr}{dt}}\,dt$ (since $\frac{dr}{dt} = u' r_u + v' r_v$)
$= \|r'(t)\|\,dt$ (by O’Neill [86, p.51, l.13]).
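The arc-length formula of Remark 7 can be compared with a direct measurement of the curve in space. The following is a minimal sketch, assuming the unit sphere in geographic coordinates as a hypothetical surface patch, for which E = cos²v, F = 0, G = 1; the patch and the test curve are illustrative choices, not taken from Fomin–Gelfand.

```python
import math

def r(u, v):
    """Hypothetical surface patch: the unit sphere in geographic coordinates."""
    return (math.cos(v) * math.cos(u), math.cos(v) * math.sin(u), math.sin(v))

def first_fundamental_form(u, v):
    """Coefficients E = r_u·r_u, F = r_u·r_v, G = r_v·r_v for the patch above."""
    return math.cos(v) ** 2, 0.0, 1.0

def length_by_form(u, v, du, dv, t1, t2, n=2000):
    """Midpoint rule for J[u, v] = ∫ sqrt(E u'^2 + 2 F u' v' + G v'^2) dt."""
    h = (t2 - t1) / n
    total = 0.0
    for k in range(n):
        t = t1 + (k + 0.5) * h
        E, F, G = first_fundamental_form(u(t), v(t))
        total += math.sqrt(E * du(t) ** 2 + 2 * F * du(t) * dv(t) + G * dv(t) ** 2)
    return total * h

def length_by_polyline(u, v, t1, t2, n=20000):
    """Length of the inscribed polygon through r(u(t), v(t)); it does not use E, F, G."""
    h = (t2 - t1) / n
    pts = [r(u(t1 + k * h), v(t1 + k * h)) for k in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))
```

The polyline length is computed without the first fundamental form, so agreement between the two functions is an independent check of the formula rather than a restatement of it.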

Remark 8. The proof of Fomin–Gelfand [35, p.40, (16)] can be found in Courant–Hilbert [21, vol.1, p.196, l.−8–p.198, l.9].

Remark 9. $\Delta J = \int_a^b (F_y h + F_{y'} h' + \cdots + F_{y^{(n)}} h^{(n)})\,dx + \cdots$ [Fomin–Gelfand [35, p.41, l.−4]].
Proof. $F(x, y + h, y' + h', \cdots, y^{(n)} + h^{(n)}) - F(x, y, y', \cdots, y^{(n)})$
$= F_y(x, y + \theta h, y' + \theta h', \cdots, y^{(n)} + \theta h^{(n)})\,h + F_{y'}(x, y + \theta h, y' + \theta h', \cdots, y^{(n)} + \theta h^{(n)})\,h' + \cdots + F_{y^{(n)}}(x, y + \theta h, y' + \theta h', \cdots, y^{(n)} + \theta h^{(n)})\,h^{(n)} + \cdots$.
Remark 10. An obvious generalization of Lemma 1 of Sec. 3.1 given in Fomin–Gelfand [35, p.42, l.11] refers to letting $h(x) = (x - x_1)^n (x_2 - x)^n$.
Remark 11. Isoperimetric problems: The proof of Bendersky [3, p.143, Theorem 27.6] is simpler than that of Fomin–Gelfand [35, p.43, Theorem 1]. This is because the latter unnecessarily uses the complicated concept of variational derivative. See Fomin–Gelfand [35, p.28, l.−10; Figure 3]. The concept of variational derivative may help in understanding the circumstance, but it has little to do with the purpose of calculus of variations. The calculus of variations uses $y$ as an independent variable. We should consider it as the base element of the theory and should not relate it further to $x$. At best, the latter proof is only a special case of the former one.
Holonomic problems: Similarly, the proof of Bendersky [3, pp.146–147, Theorem 27.8(i)] is simpler than that of Fomin–Gelfand [35, p.46, Theorem 2].
Non-holonomic problems: Bendersky [3, p.148, l.10–p.150, l.−5] provides a detailed proof of Fomin–Gelfand [35, p.48, Remark 1]. Only through simplification may we have a clear idea about how to deal with complicated cases.
“$y = y(x)$ is not an extremal of $K[y]$” given in Fomin–Gelfand [35, p.43, l.19] should have been corrected as “$\delta G \ne 0$”. “Either $y_0(x)$ is an extremal of $G[y]$” given in Bendersky [3, p.143, l.−10] should have been corrected as “either $g_y - \frac{d}{dx} g_{y'} \equiv 0$”. This is because the variation of a functional being 0 is only a necessary condition of its having an extremum.
Remark 12.
The integral on the right side of the equality given in Fomin–Gelfand [35, p.49, l.11] should have been corrected as $\int y(s)x'(s)\,ds$. This is because Fomin–Gelfand [35, p.49, l.11] considers only the part of a circle that passes through the endpoints of $[-a,a]$ and has length smaller than the semicircle. If $l$ is greater than the semicircle, then $y$ would not be a function of $x$.

Remark 13. By Fomin–Gelfand [35, p.170, (80)], $h(x_0) \sim y'(x_0)\,\delta x_0$ [Fomin–Gelfand [35, p.56, l.3–l.4]].
Remark 14. “We can solve equations (8) for $y'_1, \cdots, y'_n$ as functions of variables $x, y_1, \cdots, y_n, p_1, \cdots, p_n$” [Fomin–Gelfand [35, p.58, l.9–l.11]].

Proof. Let $G_i = F_{y'_i} - p_i$. Then
$\frac{\partial(G_1, \cdots, G_n)}{\partial(y'_1, \cdots, y'_n)} = \frac{\partial(F_{y'_1}, \cdots, F_{y'_n})}{\partial(y'_1, \cdots, y'_n)} = \det\|F_{y'_i y'_k}\| \ne 0$.
Then use the implicit function theorem.
Remark 15. (10) is an extremal [Fomin–Gelfand [35, p.58, l.−5]].
Proof. The definition of extremal can be found in Fomin–Gelfand [35, p.15, l.−16]. The desired result follows from Fomin–Gelfand [35, p.35, Theorem].
Remark 16. (Free endpoints) In order to be consistent with the argument given in Fomin–Gelfand [35, p.61, l.9–l.22], Fomin–Gelfand [35, p.59, l.−15–p.60, l.14] should have been corrected as follows:
Among all smooth curves whose endpoints $P_0(x_0,y_0)$ and $P_1(x_1,y_1)$ lie on two given curves $x = \varphi(y)$ and $x = \psi(y)$, find the curve $y(x)$ for which $J[y] = \int_{x_0}^{x_1} F(x,y,y')\,dx$ has an extremum.

Solution. $\delta J = F_{y'}\,\delta y\,\big|_{y=y_0}^{y=y_1} + (F - y' F_{y'})\,\delta x\,\big|_{y=y_0}^{y=y_1}$.
Let $\delta x_0 = [\varphi'(y_0) + \varepsilon_0]\,\delta y_0$ and $\delta x_1 = [\psi'(y_1) + \varepsilon_1]\,\delta y_1$.
Remark. Courant–Hilbert [21, vol.1, chap. IV, §5.2, pp.211–214] parametrizes the integrand with parameter $t$, while Fomin–Gelfand [35, p.61, l.10] parametrizes the surfaces for the constraint with parameters $y$ and $z$. Because Fomin–Gelfand [35, p.61, l.11] involves only an integral with respect to $x$, Fomin–Gelfand [35, p.61, l.9–l.22] uses $x$ as the independent variable when finding $J$, and uses $y, z$ as independent variables when finding $\delta J$; this easily confuses the readers. In contrast, Courant’s argument proceeds more naturally by changing the integral with respect to $x$ to the integral with respect to $t$ [Courant–Hilbert [21, vol.1, p.211, l.−16]]. Furthermore, Courant’s argument is consistent with Mann–Taylor [76, p.178, l.−9–p.181, l.8; p.182, l.−7–p.183, l.−16] without losing the latter’s feature, and is thereby heuristic. Note that the two equalities

$\xi(\lambda T_x - F_{\dot{x}}) = 0, \quad \eta(\lambda T_y - F_{\dot{y}}) = 0$

$\xi(\lambda T_x - \lambda_0 F_{\dot{x}}) = 0, \quad \eta(\lambda T_y - \lambda_0 F_{\dot{y}}) = 0$.
The statement “We may take $\lambda_0 = 1$” given in Courant–Hilbert [21, vol.1, p.212, l.8] should have been placed after the above two equalities; this is because only after knowing how to express $\lambda$ in terms of $\lambda_0$ may we be able to choose $\lambda_0$.
Remark 17. In order to prove the three equalities given in Fomin–Gelfand [35, p.69, l.−4], one should read O’Neill [86, chap. 1, sec. 5]; the explanation given in Fomin–Gelfand [35, p.69, l.12–l.15] is not clear. In the following, we prove only $\frac{\partial H}{\partial x} = -\frac{\partial F}{\partial x}$:
Proof. Let $U_1, \cdots, U_{2n+1}$ be the natural frame field for $(x, y_1, \cdots, y_n, p_1, \cdots, p_n) \in E^{2n+1}$ [O’Neill [86, p.9, l.11]].

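$dH(U_1) = U_1[H]$ (by O’Neill [86, p.23, Definition 5.2])
$= \frac{\partial H}{\partial x}$ (by O’Neill [86, p.23, Definition 5.2]).
$\left(-\frac{\partial F}{\partial x}\,dx - \sum_{i=1}^{n} \frac{\partial F}{\partial y_i}\,dy_i + \sum_{i=1}^{n} y'_i\,dp_i\right)(U_1) = -\frac{\partial F}{\partial x} \cdot 1 - \sum_{i=1}^{n} \frac{\partial F}{\partial y_i} \cdot 0 + \sum_{i=1}^{n} y'_i \cdot 0$ (by O’Neill [86, p.23, l.−1])
$= -\frac{\partial F}{\partial x}$.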
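The identity ∂H/∂x = −∂F/∂x proved in Remark 17 can be checked by finite differences for n = 1. The following is a minimal sketch, assuming a hypothetical integrand F(x, y, y′) = ½(1 + x²)y′² − xy², chosen so that y′ depends on x when p is held fixed; it is not an example from Fomin–Gelfand.

```python
def F(x, y, yp):
    """Hypothetical integrand F(x, y, y'), for illustration only."""
    return 0.5 * (1.0 + x * x) * yp * yp - x * y * y

def yp_from_p(x, y, p):
    """Solve p = F_{y'}(x, y, y') = (1 + x^2) y' for y'."""
    return p / (1.0 + x * x)

def H(x, y, p):
    """Hamiltonian H = -F + y' p, with y' expressed through (x, y, p)."""
    yp = yp_from_p(x, y, p)
    return -F(x, y, yp) + yp * p

def dH_dx(x, y, p, h=1e-5):
    """Central difference for ∂H/∂x with y and p held fixed."""
    return (H(x + h, y, p) - H(x - h, y, p)) / (2.0 * h)

def dF_dx(x, y, yp, h=1e-5):
    """Central difference for ∂F/∂x with y and y' held fixed."""
    return (F(x + h, y, yp) - F(x - h, y, yp)) / (2.0 * h)
```

The two derivatives hold different variables fixed (p in one case, y′ in the other), which is exactly the point of the proof: the terms produced by the x-dependence of y′ cancel because p = F_{y′}.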

Remark 18. For the geometric meaning of Fomin–Gelfand [35, p.73, (20)], read Cherkaev [13, p.2, §1.1 Legendre and Young–Fenchel transforms, Geometric interpretation]. Replacing the independent variables $(x,y,y')$ with $(x,y,p)$ illustrates the duality between points and lines.
Remark 19. By Fomin–Gelfand [35, p.74, (21)], the inverse function of $f'$ exists locally. Note that the negative sign of the second term on the right side of the equality given in Fomin–Gelfand [35, p.74, (23)] should have been corrected to a positive sign. The difference between Fomin–Gelfand [35, p.74, (23)] and Fomin–Gelfand [35, p.74, l.10] is that the extremum of the latter can be a minimum. Let
$H(p_1, \cdots, p_n) = \sup_{(\xi_1, \cdots, \xi_n) \in (\text{domain of } f)} \left[-f(\xi_1, \cdots, \xi_n) + \sum_{i=1}^{n} \xi_i p_i\right]$,

where the domain of H consists of the $(p_1,\dots,p_n)$ for which the above supremum is finite. Then the Legendre transformation H of f is a convex function.

Proof. For each fixed $(\xi_1,\dots,\xi_n)$, the expression in brackets is an affine, hence convex, function of $(p_1,\dots,p_n)$, and the supremum of a set of convex functions is convex.

Remark. To understand a theorem, one should understand not only its statement and proof but also its origins: what thought leads to this theorem? In proving the above theorem, if f is a proper convex function, then one will naturally think of the Hadamard global inverse function theorem [Ho [57, p.240, Corollary]]. The thinking journey of mathematicians cannot be different. Others can discover a theorem, yet one cannot even understand its origins; this is because one’s background and experience fail to meet the required standard.

Remark 20. The second equality in Fomin–Gelfand [35, p.75, (28)] can be obtained by applying integration by parts to the second term of Fomin–Gelfand [35, p.75, (27)].

Remark 21. Fomin–Gelfand [35, p.75, l.−8–p.76, l.−11] shows that the y(x) for which Fomin–Gelfand [35, p.75, (24)] has an extremum and the y(x) for which Fomin–Gelfand [35, p.75, (27)] has an extremum are the same. In Fomin–Gelfand [35, pp.75–77, sec.18.2], for the process of deriving Euler equations from finding an extremum of a functional, the description with independent variables (x, y, y′) is one viewpoint; through the Legendre transformation, the description with independent variables (x, y, p) is an equivalent dual viewpoint. Fomin–Gelfand [35, pp.75–77, sec.18.2] shows that in the process there are two propositions whose two descriptions are equivalent; in fact, for any proposition in the process, its two descriptions are equivalent.

Remark 22. $\partial H/\partial x = 0 \Rightarrow \partial F/\partial x = 0$ [Fomin–Gelfand [35, p.80, l.−5]].

Proof. By Fomin–Gelfand [35, p.68, (6)], $H = -F + \sum_{i=1}^{n} y_i' p_i$.
By Fomin–Gelfand [35, p.58, l.9–l.11], $y_i'$ can be expressed as a function of $x, y_1,\dots,y_n, p_1,\dots,p_n$.
By Fomin–Gelfand [35, p.70, (11)], $\dfrac{\partial y_i'}{\partial x} = \dfrac{\partial^2 H}{\partial x\,\partial p_i} = \dfrac{\partial^2 H}{\partial p_i\,\partial x} = 0$.
Thus, $y_i'$ can be expressed as a function of $y_1,\dots,y_n, p_1,\dots,p_n$.
Since $\partial H/\partial x = 0$, by the definition of partial derivative, $\partial F/\partial x = 0$.

Remark 23. The flow chart of Fomin–Gelfand [35, p.80, l.−20–l.−13] is as follows: Replace y and y′ in Fomin–Gelfand [35, p.80, (45)(i)] with y(x) and y′(x) respectively. Consequently, we can express x∗ as a function of x; then we can solve for x in terms of x∗. Replace y and y′ in Fomin–Gelfand [35, p.80, (45)(ii)] with y(x) and y′(x) respectively. Then, expressing x, y(x), y′(x) all in terms of x∗, we obtain y∗ = y∗(x∗).

Remark 24. Note that Fomin–Gelfand [35, p.84, l.−8–p.85, l.14] has not proved that the action is least. It only shows that the equations of motion can be derived from the concept of least action. Therefore, the principle of least action is a reasonable axiom. In fact, we need not assume that much: the assumption that the action has an extremum equally leads to the equations of motion.

Remark 25. Fomin–Gelfand [35, p.86, l.14–p.88, l.8] shows the conservation of momentum and that of angular momentum from the viewpoint of vector components in the Cartesian coordinate system. Landau–Lifshitz [67, §7 & §9] shows the same thing from the viewpoint of vectors in the Cartesian coordinate system. Morin [82, pp. VI-13–VI-14, Examples 1–3] adopts the Lagrangian method and discusses the same thing from the viewpoint of generalized coordinates. This way allows us to discuss momentum and angular momentum simultaneously. Note that Morin [82, p. VI-16, l-4–l.5] defines the concept of symmetry.

Remark 26. Canonical Euler equations represent the characteristic system associated with the Hamilton–Jacobi equation [Fomin–Gelfand [35, p.90, l.−12–l.−10]].

Proof. Read Bendersky [3, p.182, l.−3–p.183, l.−1].
Bendersky [3, p.182, (56)] should be corrected to “$G(x_1,\dots,x_m,u,p_1,\dots,p_m) = 0$, where $p_i \overset{\text{def}}{=} \partial u/\partial x_i$,”.
Bendersky [3, p.183, (57)] follows from Tkachev [115, p.2, l.1–l.9].

Remark 27. Fomin–Gelfand [35, p.91, l.−3–p.92, l.1]

Explanation. For the definition of complete integral, see Sneddon [105, p.49, l.−21–l.−5]. The reason that a complete integral requires n parameters can be found in Sneddon [105, p.69, l.18–l.22] or Gibbon [40, p.2, l.−11–l.−10]. The constants are $c_1, c_2, \dots, c_{n-1}, f(c_1, c_2, \dots, c_{n-1})$, where f is an arbitrary function.

Remark 28. By Fomin–Gelfand [35, p.78, l.12–l.13], the solutions of the new and old canonical Euler equations [Fomin–Gelfand [35, p.77, (34) & (35)]] are the same. The $y_i, p_i$ in Fomin–Gelfand [35, p.93, l.−3–l.−2] satisfy the new canonical Euler equations [Fomin–Gelfand [35, p.93, l.−12]], so they also satisfy the original canonical Euler equations [Fomin–Gelfand [35, p.92, (82)]]. Therefore, it is unnecessary to repeat the check given in Fomin–Gelfand [35, p.92, l.−12–p.93, l.7].

Remark 29. By the statement given in Fomin–Gelfand [35, p.78, l.8], Fomin–Gelfand [35, p.92, (82)] and the system of equations given in Fomin–Gelfand [35, p.93, l.−12] have the same solution [Fomin–Gelfand [35, p.93, l.−1]].

Remark 30. The second variation of the functional J[y] is uniquely defined [Fomin–Gelfand [35, p.99, l.−11–l.−10]].

Proof. The difference of any two quadratic functionals is a quadratic functional. If $\varphi_2[h]$ is a quadratic functional and $\varphi_2[h]/\|h\|^2 \to 0$ as $\|h\| \to 0$, then $\varphi_2[h] \equiv 0$: for fixed h, the ratio $\varphi_2[th]/\|th\|^2 = \varphi_2[h]/\|h\|^2$ does not depend on t, so letting $t \to 0$ shows that it equals 0.

Remark 31. If we do not assume that P > 0 on [a,b] [Fomin–Gelfand [35, p.108, l.−2–l.−1]], then we have no way to obtain the normal form of a system of differential equations needed to quote Pontryagin [91, p.179, Theorem 15].

Remark 32. Consider Fomin–Gelfand [35, p.110, Figure 7]. Let $A = (x_2, t_2)$. By Fomin–Gelfand [35, p.110, l.−21], $\exists\,\varepsilon > 0,\ x_3 \in [a,b] : h(x_3, t_2 - \varepsilon) = 0$ [Fomin–Gelfand [35, p.110, l.−17–l.−16]].

Remark 34. We would have h(x,t)= 0,hx(x,t)= 0 simultaneously [Fomin–Gelfand [35, p.110, l. 9]]. − hx(x,t) Proof. By the implicit function theorem, t′(x)= . − ht (x,t) t (x)= 0 hx(x,t)= 0. ′ ⇒ Remark. This statement contradicts the statement given in Fomin–Gelfand [35, p.109, l. 10]. − Remark 35. Case D [Fomin–Gelfand [35, p.110, l. 8–l. 6]] shows that (x,t) [a,b] [0,1] h(x,t)= − − { ∈ × | 0 [a,b] 0 = (a,0) . This case will be considered in case E. }∩ × { } { } Remark 36. The functional (36) is positive definite for all t except possibily t = 1 [Fomin–Gelfand [35, p.111, l.12–l.13]]. Proof. Suppose h 0 on [a,b] and h(a)= h(b)= 0. Then h cannot be a constant on [a,b]. 6≡ b b Thus, h 0 on [a,b]. If t = 0, (1 t)h 2dx > 0. If 0 < t < 1, tPh 2dx > 0. ′ 6≡ a − ′ a ′ R R 86 Remark 37. M˜ =(a˜,y(a˜)) is the limit as α 0 of the points of intersection of extremals y = y (x) and → α∗ y = y(x) [Fomin–Gelfand [35, p.115, l.5–l.6]].

Proof. There exists a point of intersection $(x_\alpha, y_\alpha)$ of $y = y^*_\alpha(x)$ and $y = y(x)$ such that $x_\alpha \in (\tilde{a} - \beta, \tilde{a} + \beta)$, where $\beta \to 0$ as $\alpha \to 0$.

Remark 38. (Derivation of the equation of the vibrating membrane)
In order to solve a problem effectively, we must quickly understand the circumstance with minimum effort, and then directly attack the heart of the matter. The local consideration given in [§6.1; http://personal.egr.uri.edu/sadd/mce565/Ch6.pdf] provides a simple derivation of the equation of the vibrating membrane. Newton’s law is the only requirement. Considering a circular membrane with polar coordinates only complicates the circumstance [§4.3.1; https://theses.lib.vt.edu/theses/available/etd-08022005-145837/unrestricted/Chapter4ThinPlates.pdf].
Fomin–Gelfand [35, p.164, (48)] is derived from the viewpoint of the calculus of variations. The derivation starts with the Hamiltonian principle and ends with the Euler equation. The principle acts like an axiom and the equation acts like a theorem. The formal development makes it difficult to see the key point. The benefit of this approach is that it provides the boundary condition [Fomin–Gelfand [35, p.164, (51)]] simultaneously.

Remark. By Courant–John [22, vol. 2, p.553, (6)], $\iint_R \big[\frac{\partial}{\partial x}(u_x\psi) + \frac{\partial}{\partial y}(u_y\psi)\big]\,dx\,dy = \int_\Gamma \frac{\partial u}{\partial n}\,\psi\,ds$ [Fomin–Gelfand [35, p.163, l.−12–l.−10]].

The global consideration given in [§Vibrating Membranes; http://www.math.iit.edu/~fass/Notes461_Ch7Print.pdf] increases the difficulties of the following problems:
1. Finding the tensile force $F_T$ [p.6, l.4].
2. The balance of forces [p.7, (1)].
3. Physical explanations of the vector triple product [p.9, l.2].
4. There is no displacement u on the right-hand side [p.10, l.−2–l.−1].
The formal operations given in [p.9, l.4; (2); p.12, −1] make it difficult to see the key point.

Remark 39. By Ince [59, p.217, l.12–l.14], the Sturm–Liouville system given in Fomin–Gelfand [35, p.198, l.3–l.8] is self-adjoint.

Remark 40.
The fact given in Fomin–Gelfand [35, p.204, l.1–l.8] follows from Birkhoff–Rota [5, p.268, Theorem 3].

Remark 41. By Hartman [51, p.4, Remark 2], $y_n^{(1)} \to y^{(1)}$ uniformly [Fomin–Gelfand [35, p.204, l.18]].

Remark 42. Fomin–Gelfand [35, p.207, Problem 8] can be solved by using Zygmund [127, vol. I, p.57, (8.1); p.59, (8.7)] and Rudin [99, p.230, Theorem 10.28].

Remark 43. The Ritz method is an effective tool for studying Sturm–Liouville problems [Fomin–Gelfand [35, pp.198–205, §41]].

I. Calculus tools for finding extrema of functions: Kaplan [63, §2.19; §2.20].
Tools in the calculus of variations for finding extrema of functionals: direct methods (the Rayleigh–Ritz method; the method of finite differences) and Euler equations [Courant–Hilbert [21, vol.1, chap. IV, §2]].

II. Solving Sturm–Liouville problems effectively [Fomin–Gelfand [35, pp.196–197, Remark 2]] by the Ritz method [Fomin–Gelfand [35, p.196, Theorem]]: construct a complete sequence of functions $\varphi_n$ as in Fomin–Gelfand [35, p.195, (8)]; this sequence allows us to reduce the problem of finding the minimum of the functional J[y] to the problem of finding the minimum of the function $J[\alpha_1\varphi_1 + \cdots + \alpha_n\varphi_n]$ of the n variables $\alpha_1,\dots,\alpha_n$ [Fomin–Gelfand [35, p.195, (10)]]. Thus, it suffices to calculate the $y_n$ given in Fomin–Gelfand [35, p.196, l.13–l.14] by using calculus tools for finding extrema of functions.

III. The existence of $\lambda^{(1)}$ given in Fomin–Gelfand [35, p.200, (24)] is more constructive and effective than the existence of $\mu_0$ given in Coddington–Levinson [16, p.195, l.−9].

Explanation. (A).
1. M defined as in Fomin–Gelfand [35, p.199, l.5] can be computed by calculus.
2. For a system’s solution, we may replace its function (uncountable) form y(x) with its sequence (countable) form $\alpha_k$ as in Fomin–Gelfand [35, p.199, (18)]. Thus, J[y] is transformed to $J(\alpha_1\varphi_1 + \cdots + \alpha_n\varphi_n)$, a quadratic form in $\alpha_1,\dots,\alpha_n$.
The minimum of the latter can be computed by the methods given in Kaplan [63, §2.19; §2.20].
3. Define $\lambda_n^{(1)}, y_n^{(1)}$ (n = 1, 2, …) as in Fomin–Gelfand [35, p.199, l.−10–l.−7]. Then $\lambda_{n+1}^{(1)} \le \lambda_n^{(1)}$ [Fomin–Gelfand [35, p.200, (23)]]. Define $\lambda^{(1)}$ as in Fomin–Gelfand [35, p.200, (24)]. After obtaining $\lambda_1^{(1)},\dots,\lambda_m^{(1)}$, we know that $\lambda^{(1)}$ is between $\lambda_m^{(1)}$ and the lower bound of $\{\lambda_n^{(1)}\}$. Thus, the possible range of $\lambda^{(1)}$ gets shorter and shorter as the process goes on. In Fomin–Gelfand [35, p.201, l.−14–p.203, l.−3], we use the method of Lagrange multipliers to obtain Fomin–Gelfand [35, p.203, (36)] and then use Fomin–Gelfand [35, p.201, Lemma 2] to prove Fomin–Gelfand [35, p.202, (32)].

(B). In contrast, $\mu_0 = \sup_{\|u\|=1} |(\mathcal{G}u,u)|$ ($u \in C$ on [a,b]) [Coddington–Levinson [16, p.195, l.2; l.−9]]. The existence of the supremum is derived by reduction to absurdity [Rudin [97, p.11, l.−17–l.−16]]. We have no way to know its location on the real line. Furthermore, as we collect more elements of the index set ($u \in I$) and find $\sup\{(\mathcal{G}u,u) \mid u \in I\}$, this procedure does not help narrow down the search scope of the final supremum.

Remark. Based on (A), one can easily create an effective computer program to find $\lambda^{(1)}$. However, the idea given in (B) is useless for finding $\mu_0$ using a computer. Mathematicians should put more effective material than the content given in Coddington–Levinson [16, p.194, l.−6–p.197, l.8] into mathematical textbooks.

IV. By III, $\lambda^{(1)}, \lambda^{(2)}, \dots$; $y^{(1)}, y^{(2)}, \dots$ [Fomin–Gelfand [35, §41.4]] can be effectively calculated using the method of Lagrange multipliers, while the existence of $\mu_k$ (k = 0, 1, 2, …) given in Coddington–Levinson [16, p.195, l.−9–p.196, l.−2] is derived from the (k + 1)th level of reduction to absurdity.
Furthermore, the fact that the process of finding $\mu_0, \mu_1, \dots$ can be continued is proved by reduction to absurdity [Coddington–Levinson [16, p.197, l.1–l.7]], while the process of constructing $\lambda^{(1)}, \lambda^{(2)}, \dots$ can be continued because each step of the process satisfies the conditions of the method of Lagrange multipliers.
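The claim in (A) that the Ritz approximations can be programmed, and that they decrease toward the eigenvalue, can be illustrated by a small sketch. Everything specific here is an assumption made for illustration and is not taken from Fomin–Gelfand: the model problem −y″ = λy on [0, 1] with y(0) = y(1) = 0 (so P ≡ 1, Q ≡ 0, exact first eigenvalue π²), and the polynomial trial functions `phi1` = x(1 − x) and `phi2` = x²(1 − x)². The one-term and two-term Ritz values are the smallest roots of det(A − λB) = 0.

```python
from fractions import Fraction
from math import pi, sqrt

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_deriv(p):
    """Differentiate a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:]

def integral_01(p):
    """Exact integral of the polynomial over [0, 1]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

# Assumed trial functions; both vanish at x = 0 and x = 1.
phi1 = [0, 1, -1]        # x - x^2
phi2 = [0, 0, 1, -2, 1]  # x^2 - 2x^3 + x^4
basis = [phi1, phi2]

# A[i][j] = integral of phi_i' * phi_j',  B[i][j] = integral of phi_i * phi_j.
A = [[integral_01(poly_mul(poly_deriv(u), poly_deriv(v))) for v in basis] for u in basis]
B = [[integral_01(poly_mul(u, v)) for v in basis] for u in basis]

# One-term Ritz value: the Rayleigh quotient of phi1.
lam_one = A[0][0] / B[0][0]

# Two-term Ritz value: smallest root of det(A - lam * B) = a*lam^2 + b*lam + c.
a = B[0][0] * B[1][1] - B[0][1] ** 2
b = -(A[0][0] * B[1][1] + A[1][1] * B[0][0] - 2 * A[0][1] * B[0][1])
c = A[0][0] * A[1][1] - A[0][1] ** 2
disc = b * b - 4 * a * c
lam_two = (-float(b) - sqrt(float(disc))) / (2 * float(a))

print(float(lam_one), lam_two, pi ** 2)
```

With these choices the one-term value is exactly 10 and the two-term value is about 9.8697, already close to π² from above, which matches the monotone narrowing described in (A).3.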

5. Landau–Lifshitz [67]

Remark 1. Hamilton’s principle or the principle of least action [Landau–Lifshitz [67, §2]; Marion–Thornton [77, §7.1–§7.4]]
(a). Hamilton’s principle originated from expressing Newton’s law in terms of generalized coordinates [Shapiro [102, §2.1]; Marion–Thornton [77, p.257, l.−18–l.−17]].
(b). For wide applications, we need the most general version: Landau–Lifshitz [67, p.2, l.6–l.16].
(c). Landau–Lifshitz [67, p.2, l.−2–l.−1] emphasizes that the equations of motion are derived using the property that the action has an extremum rather than the property that the action has a minimum. Nevertheless, from the viewpoint of reality, the action cannot have a maximum.

(d). Landau–Lifshitz [67, p.2, l.17–l.20].
(e). The tool of generalized coordinates is required to formulate the general version of Hamilton’s principle [Landau–Lifshitz [67, p.9, (5.5)]]. Generalized coordinates can also be flexibly customized to special situations [Marion–Thornton [77, p.221, Example 6.5; l.−1]] so that the laws of mechanics take their simplest form [Landau–Lifshitz [67, p.5, l.4]]. When the coordinates are generalized, some features are still preserved [Landau–Lifshitz [67, p.9, l.−8–l.−5; p.14, l.−8–l.−7]].

Remark 2. “The equations of motion for a closed system do not involve time explicitly” [Landau–Lifshitz [67, p.13, l.12–l.13]] means that “the Lagrangian does not depend explicitly on time” [Landau–Lifshitz [67, p.13, l.−4–l.−3]]. Consequently, the system is autonomous [Pontryagin [91, p.103, l.−6]].

Remark 3. One of the arbitrary constants in the solution of the equations can always be taken as an additive constant t0 in time [Landau–Lifshitz [67, p.13, l.14–l.16]].

Proof 1. If we use Hamilton’s characteristic function $W(q_i, P_i)$ [Van Esch [29, p.8, (40); l.7]] as the generating function for a canonical transformation, the new Hamiltonian is $K(Q_i, P_i) = P_1$ [Van Esch [29, p.8, l.17]]. By Fomin–Gelfand [35, p.70, (11)], $\dfrac{dQ_1}{dt} = \dfrac{\partial K}{\partial P_1} = \dfrac{\partial P_1}{\partial P_1} = 1$. Hence we have $Q_1 = t + t_0$ [Van Esch [29, p.8, (44)]].

Proof 2. Before we examine the eligibility of time to be a constant of integration, we must characterize the features of a constant of integration. Suppose the general solution of a system of differential equations has n arbitrary constants. If s is an additive constant and H(s) is a solution, then for any s0, H(s + s0) is also a solution. Conversely, if y is any solution of the system whose trajectory passes through a given point, then y = H(s + s0) for some s0. If s has the above two properties, we call s a constant candidate. Being a constant candidate is obviously a necessary condition for s to be a constant of integration. But is it a sufficient condition? I would say yes. The argument proceeds as follows: by induction, we may construct a new constant candidate by keeping the previous constant candidates constant until we find all n of them. By this time, the n constant candidates must be constants of integration because there can be no more constant candidates. Thus, each inductive step completes exactly one nth of the workload. As regards time’s eligibility for being a constant candidate, read Pontryagin [91, pp.104–106, (A) and (B)].

Remark. Note that for a nonautonomous system, time cannot be treated as a constant of integration.

Example 1.4. (Nonautonomous system) $\dfrac{dx}{dt} = 2t - t^2 + x$.
$x(t) = e^t + t^2$ is a solution, while $x(t + c)$ is not if $c \neq 0$.

Remark 4.
Eliminating $t + t_0$ from the 2s equations
\[ q_i = q_i(t + t_0, C_1, C_2, \dots, C_{2s-1}),\qquad \dot q_i = \dot q_i(t + t_0, C_1, C_2, \dots, C_{2s-1}), \]
we can express the 2s − 1 arbitrary constants $C_1, C_2, \dots, C_{2s-1}$ as functions of q and $\dot q$ [Landau–Lifshitz [67, p.13, l.16–l.18]].

Explanation. The procedure goes as follows: Fix $q_i, \dot q_i$ by setting t = 0. Solve for $t_0$ from one of the 2s equations and then substitute it into the remaining 2s − 1 equations. Next, solve for $C_1, C_2, \dots, C_{2s-1}$ from these 2s − 1 equations. Thus, once we choose the initial time $t_0$, the remaining 2s − 1 constants depend only on the initial conditions.

Remark 5. p.4, l.9–p.5, l.7 of https://ocw.mit.edu/courses/aeronautics-and-astronautics/16-07-dynamics-fall-2009/lecture-notes/MIT16_07F09_Lec26.pdf provides the proofs of the statements in Landau–Lifshitz [67, p.100, l.−20–l.−3].

Remark 6. Solution to Landau–Lifshitz [67, p.102, Problem 2(e)].

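Proof. Let α be the angle between the $x_3$-axis and the boundary of the cone, and let $\rho_d$ denote the density. In the inner integrals below, the upper limit ρ is the radius of the cross-section at height z, that is, $\rho = z\tan\alpha$.
\[ \int y^2\,dV = \iiint \rho^2\sin^2\varphi\;\rho\,d\rho\,d\varphi\,dz = \pi\int_0^{R\cot\alpha}\!\!\int_0^{\rho} r^3\,dr\,dz = \frac{\pi}{4}\,\frac{(R\cot\alpha)^5}{5}\,\tan^4\alpha = \frac{\pi R^5\cot\alpha}{20}. \]
\[ \int z^2\,dV = 2\pi\int_0^{R\cot\alpha} z^2\int_0^{\rho} r\,dr\,dz = \frac{\pi R^5\cot^3\alpha}{5}. \]
\[ I_1' = \rho_d\int (y^2 + z^2)\,dV = \rho_d\int (x^2 + z^2)\,dV = I_2',\qquad I_3' = \rho_d\int (x^2 + y^2)\,dV = 2\rho_d\int y^2\,dV. \]
\[ Y = \frac{\rho_d}{\mu}\int_0^{R\cot\alpha}\!\!\int_0^{2\pi}\!\!\int_0^{\rho} z\,r\,dr\,d\varphi\,dz \quad \text{[Marion–Thornton [77, p.330, (9.4)]]} \quad = \frac{\rho_d\,\pi\tan^2\alpha}{\mu}\int_0^{R\cot\alpha} z^3\,dz = \frac{3}{4}h. \]
$I_{ik}' = I_{ik} + \mu(a^2\delta_{ik} - a_i a_k)$ [Landau–Lifshitz [67, p.101, (32.12)]], where $a = \begin{pmatrix} 0 \\ 0 \\ -\frac{3}{4}h \end{pmatrix}$.

Remark 7. Solution to Landau–Lifshitz [67, p.102, Problem 2(f)].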

Proof. $I' = \frac{2}{5}\mu_{US}$ [Landau–Lifshitz [67, p.102, l.8]] $= \frac{8}{15}\pi\rho_{US}$.
\[ \frac{I_1}{\rho_{ell}} = \iiint (y^2 + z^2)\,dx\,dy\,dz = \frac{1}{2}\,abc\,\frac{I'}{\rho_{US}}\,(b^2 + c^2) = \frac{4}{15}\pi abc\,(b^2 + c^2). \]
\[ I_1 = \frac{1}{5}\Big[\rho_{ell}\Big(\frac{4}{3}\pi abc\Big)\Big](b^2 + c^2) = \frac{1}{5}\mu_{ell}(b^2 + c^2). \]

Remark 8. Remark 39

Remark 9. By González [43, p.426, Corollary 5.21, (5.17-10)], $s = \operatorname{sn}\tau$ [Landau–Lifshitz [67, p.118, l.−10]].
$\operatorname{cn}\tau = \sqrt{1 - \operatorname{sn}^2\tau}$ [Landau–Lifshitz [67, p.118, l.−8]] follows from González [43, p.422, Theorem 5.37, 4];
$\operatorname{dn}\tau = \sqrt{1 - k^2\operatorname{sn}^2\tau}$ [Landau–Lifshitz [67, p.118, l.−8]] follows from González [43, p.422, Theorem 5.37, 5].

Remark 10. By González [43, p.426, (5.11-6)], $K = \displaystyle\int_0^1 \frac{ds}{\sqrt{(1 - s^2)(1 - k^2 s^2)}}$ [Landau–Lifshitz [67, p.118, l.−1, (37.11)(i)]].
By González [43, p.422, Table; l.−6], their period in the variable τ is 4K [Landau–Lifshitz [67, p.118, l.−3]].

6. Marion–Thornton [77]

Remark 1. Marion–Thornton [77] was a textbook for theoretical mechanics when I was a sophomore. Before I took the course, my classmates told me that the textbook was a difficult one. At the end of the semester I barely passed the course. After reading it forty-five years later, I finally found the reason why it was so difficult for me to read then: in the beginning of the book the author uses the concept of tensor to describe vectors; see, for example, the proof of Marion [77, p.28, (1.78)]. Because I did not have the background in tensors then, no wonder I got lost. A tensor is a physical quantity that has a scheme for unification: various physical quantities are defined by the same scheme, which keeps the model simple and consistent and facilitates the proof of relationships among physical quantities. This is useful in studying relativity and, thereby, electrodynamics. This is the advantage and also the disadvantage of tensors: because this is its sole function, it restricts to

formal and superficial discussion without regard to its physical and geometric meanings. The material given in Marion–Thornton [77, chap. 1] is not enough. One should supplement it with Kreyszig [66, chap. V].

Remark 2. The Lissajous curve represented by Marion–Thornton [77, p.105, (3.27)] is closed if and only if $\omega_x/\omega_y$ is a rational fraction [Marion–Thornton [77, p.106, l.2–l.4]].

Proof. The period of x is $2\pi/\omega_x$ and the period of y is $2\pi/\omega_y$. Therefore, the combined period is
\[ \operatorname{lcm}\Big(\frac{2\pi}{\omega_x}, \frac{2\pi}{\omega_y}\Big) = \frac{2\pi}{\omega_x\omega_y}\operatorname{lcm}(\omega_y, \omega_x) = \frac{2\pi}{\gcd(\omega_x, \omega_y)}. \]

Remark 3. Marion–Thornton [77, p.159, Figure 4-11] should be supplemented with Arnold [1, §12.9].

Remark 4. The proof of Marion–Thornton [77, p.193, (5.33)]: read Jackson [60, p.33, l.−3, the last equality; Fig. 1.7].

Remark. We give another proof of Marion–Thornton [77, p.194, (5.37)] as follows:

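Proof.
\[ \mathbf{g}(\mathbf{r}) = -G\int \frac{\rho(\mathbf{s})(\mathbf{r} - \mathbf{s})}{|\mathbf{r} - \mathbf{s}|^3}\,dv. \]
\[ \nabla\cdot\Big(\frac{\mathbf{r}}{|\mathbf{r}|^3}\Big) = -\nabla^2\Big(\frac{1}{|\mathbf{r}|}\Big) \quad \text{(by Wangsness [121, p.36, (1-141)])} \]
\[ \phantom{\nabla\cdot\Big(\frac{\mathbf{r}}{|\mathbf{r}|^3}\Big)} = 4\pi\delta(\mathbf{r}) \quad \text{(by Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1477, (6)])}. \]
\[ \nabla\cdot\mathbf{g}(\mathbf{r}) = -4\pi G\int \rho(\mathbf{s})\,\delta(\mathbf{r} - \mathbf{s})\,dv = -4\pi G\rho(\mathbf{r}). \]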

This proof is worse than that given in Marion–Thornton [77, p.192, l.−3–p.194, l.5]. This is because the former proof unnecessarily uses the difficult concept of the Dirac delta function, while the latter proof uses only the geometric concept of solid angle.

Remark 5. By comparing Marion–Thornton [77, p.214, (6.31)] with Marion–Thornton [77, p.216, (6.36)], one sees that the right choice of coordinate system may greatly simplify the procedure of problem solving.

Remark 6. On the right side of the equality in Marion–Thornton [77, p.220, l.9], $\eta_1, \eta_2$ are dependent because of the constraint, while the equality in Marion–Thornton [77, p.220, (6.66)] contains only $\eta_1$. Therefore, in order to solve a variational problem with constraints, we first transform it to a variational problem without constraints.

Remark 7. Lack of thorough understanding of mathematics often leads physicists to overdo things, coin new terms, or make a fuss because of inexplicable worry. Marion–Thornton [77, p.225, l.−8–l.−1] attempts to explain δJ and δy. However, it fails to scratch the itch, i.e., to grasp the main points. For a rigorous definition of δJ, read Fomin–Gelfand [35, p.12, l.3]. δy is the difference between an admissible curve and the actual one. As to how to set up admissible curves, read Fomin–Gelfand [35, p.47, l.2; p.14, l.8–l.15; p.22, l.13–l.21].

Remark 8. (Hero’s principle vs. Fermat’s principle) [Marion–Thornton [77, p.229, l.−9–p.230, l.7]]
Hero’s principle: a light ray traveling from one point to another takes the shortest possible path.
Fermat’s principle: a light ray travels from one point to another by a path that requires the least time.
The former can be used only in a single medium, while the latter can be used in different media, so it has wider applications. Note that light travels at different speeds in different media.

Remark 9. We should not tolerate an explanation that looks plausible but is actually wrong. Mistakes must be corrected and confusion must be clarified. It increases the difficulty for others to provide explanations because they must clarify his confusion. In Marion–Thornton [77, p.220, l.9], we consider $\partial J/\partial\alpha$, where α directly affects only y and z. Therefore, Marion–Thornton [77, p.219, (6.63)(i)] can be proved as follows:
\[ (*)\qquad \frac{dg}{d\alpha} = \frac{\partial g}{\partial x}\frac{dx}{d\alpha} + \frac{\partial g}{\partial y}\frac{dy}{d\alpha} + \frac{\partial g}{\partial z}\frac{dz}{d\alpha},\quad \text{where } \frac{dx}{d\alpha} = 0. \]
Finding the total derivative of g with respect to α has nothing to do with finding the total derivative of g with respect to x:
\[ (**)\qquad \frac{dg}{dx} = \frac{\partial g}{\partial x} + \frac{\partial g}{\partial y}\frac{dy}{dx} + \frac{\partial g}{\partial z}\frac{dz}{dx} \quad \text{[Marion–Thornton [77, p.249, (7.63)]]}. \]
Therefore, it is a digression to mention the total derivative of g with respect to x when discussing $\partial J/\partial\alpha$. However, if we multiply (*) by dα and (**) by dx respectively and then compare the two resulting equations, we will see that the latter has an extra term $\frac{\partial g}{\partial x}dx$. The mistake we commit here is comparing two resulting equations that come from different origins.
Marion–Thornton [77, p.249, l.−4–p.250, l.1] says, “the variation process involved in Hamilton’s Principle holds the time constant at the endpoints, we could add to Equation 7.64 a term $(\partial f_k/\partial t)\,dt$ without affecting the equations of motion.” Its mentioning of $(\partial f_k/\partial t)\,dt$ when discussing $\partial J/\partial\alpha$ is a mistake; its explanation about endpoints is rubbish.
The explanation dt = 0 given in Flygare [34, p.4, l.9] is problematic. This is because α affects q directly and q is a function of t; α will indirectly affect t through q.

Remark 10. (Reduction with separation of variables in mind)
We are given the two differential equations Marion–Thornton [77, p.253, (7.87) & (7.88)]. We want to express λ in terms of θ using a single integration even though Marion–Thornton [77, p.253, (7.88)] is a differential equation of the second order. See Marion–Thornton [77, p.254, (7.93)]. The key to reducing the second-order differential equation to a first-order one is using Marion–Thornton [77, p.253, (7.90)]. Then we can solve the resulting differential equation by the method of separation of variables [Marion–Thornton [77, p.253, (7.91)]].

Remark. Hartman [51, p.50, l.−3–p.51, l.2, Lemma 3.1] provides an example of reduction by some known solutions.

Remark 11. Marion–Thornton [77, p.255, (7.101)] follows from the definition of partial derivatives.

Remark 12. Marion–Thornton [77, p.261, l.−18–l.−8] points out which step is the key to proving H = T + U and which step is the key to proving $dH/dt = 0$ in the proof of energy conservation: Marion–Thornton [77, p.259, (7.122)] is the crucial factor leading to H = T + U = total energy E [Marion–Thornton [77, p.261, (7.130)]]; $\partial U/\partial\dot q = 0$ is the crucial factor leading to $dH/dt = 0$ [Marion–Thornton [77, p.260, (7.127); p.261, l.3]]. To explain with concrete examples, for the latter claim, see Thornton [112, pp.207–208, Solution 7-22]; for the former claim, see Thornton [112, pp.199–200, Solution 7-17]. It is difficult to see $H \neq T + U$ by direct comparison [Thornton [112, p.199, (3), (4); p.200, (15)]], but the falsehood of Marion–Thornton [77, p.259, (7.122)] for this case follows easily from Thornton [112, p.199, (3)] and Border [7, p.1, Euler’s theorem].
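The way Euler’s theorem enters Remark 12 can be made explicit with a short worked equation. This is only a sketch under an assumption that is not taken from the cited solutions: the kinetic energy T is taken to be a polynomial of degree at most two in the generalized velocities, with illustrative coefficients $a_{jk}$, $b_j$, $c$ that may depend on the coordinates and on time.

```latex
\[
T = \sum_{j,k} a_{jk}\,\dot q_j\,\dot q_k + \sum_j b_j\,\dot q_j + c
\quad\Longrightarrow\quad
\sum_j \dot q_j\,\frac{\partial T}{\partial \dot q_j}
  = 2\sum_{j,k} a_{jk}\,\dot q_j\,\dot q_k + \sum_j b_j\,\dot q_j ,
\]
\[
2T - \sum_j \dot q_j\,\frac{\partial T}{\partial \dot q_j} = \sum_j b_j\,\dot q_j + 2c .
\]
```

Under this assumption the relation $\sum_j \dot q_j\,\partial T/\partial\dot q_j = 2T$ holds identically only when the linear and constant terms vanish, that is, when T is homogeneous of degree two in the velocities; if they do not vanish, the relation fails, which is the kind of failure the remark refers to.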

Remark 13. (Conservation of linear momentum)
The infinitesimal displacement δr given in Marion–Thornton [77, p.262, l.2, (7.132)] is not as specific as that given in Landau–Lifshitz [67, p.15, l.11]. The δ of δL given in Landau–Lifshitz [67, p.15, l.14] comes from the Lagrangian rather than the action, so it is difficult to recognize that r + δr are the admissible (virtual) functions. In contrast, Fomin–Gelfand [35, pp.86–87, 2] considers the action [Fomin–Gelfand [35, p.81, (49); p.83, l.1]], so the link between the transformations in Fomin–Gelfand [35, p.86, l.27] and their role as admissible functions [Fomin–Gelfand [35, p.82, (52)]] is clear. Fomin–Gelfand [35, p.86, l.23–l.25] considers only infinitesimal transformations. For the relationship between infinitesimal transformations and finite transformations, read Vvedensky [119, §7.3.1 & §7.5]. Without the guide of the theory of Lie groups, the discussion about angular momentum given in Cohen-Tannoudji–Diu–Laloë [17, vol. 1, chap. VI, §B, §D.1.a, Complement BVI] looks disorganized. In contrast, with the theory of Lie groups as a guide, Vvedensky [119, §7.4.1 & §7.4.2] clarifies the inner structure of angular momentum. Compared with Vvedensky [119, §7.3.1], the discussion in Marion–Thornton [77, p.35, l.−22–p.36, l.−1] is marginal. It is unnecessary to represent the infinitesimal rotation δθ by a vector [Marion–Thornton [77, p.262, l.−2–l.−1; p.263, Fig. 7-8]]; see Fomin–Gelfand [35, p.87, l.−18–l.−5]. If the goal is to be natural, simple, and easy to understand, representing a matrix by a vector is an improper approach.

Remark 14. The $q_k$ and the $p_k$ are independent, whereas the $q_k$ and the $\dot q_k$ are not [Marion–Thornton [77, p.269, l.−18–l.−17]].

Correction and explanation. The Lagrangian $L(t, q, \dot q)$ is defined as $L(t, q, z)\big|_{z=\dot q}$. Consequently, $\dot q$ means the variable slot z of L in Lagrangian mechanics. Since q and z are independent variables of L, q and $\dot q$ are independent. We are not talking about calculus here. The fact that q and p are independent follows from the implicit function theorem [Fomin–Gelfand [35, p.58, l.9–l.11]]. The argument given in Van Esch [29, p.3, l.28–l.32] is incorrect: once things enter the head, it is difficult to forget them.
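The “variable slot” reading of $\dot q$ can be illustrated with a few lines of code. This is a sketch only: the Lagrangian `L`, the constants `m` and `k`, and the path `q_path` are hypothetical choices made for illustration. The point is that the partial derivatives are taken with respect to independent slots, and the path is substituted only afterwards.

```python
from math import cos, sin

def L(t, q, z, m=2.0, k=3.0):
    # Hypothetical Lagrangian with three independent slots (t, q, z);
    # z is the "velocity slot" and is not tied to q at this stage.
    return 0.5 * m * z**2 - 0.5 * k * q**2

def dL_dz(t, q, z, eps=1e-6):
    # Partial derivative in the z slot, holding t and q fixed.
    return (L(t, q, z + eps) - L(t, q, z - eps)) / (2 * eps)

def dL_dq(t, q, z, eps=1e-6):
    # Partial derivative in the q slot, holding t and z fixed.
    return (L(t, q + eps, z) - L(t, q - eps, z)) / (2 * eps)

# An assumed path and its velocity, substituted only after differentiating.
def q_path(t):
    return cos(t)

def qdot_path(t):
    return -sin(t)

t0 = 0.5
p = dL_dz(t0, q_path(t0), qdot_path(t0))  # generalized momentum along the path
```

For this `L`, the value `p` agrees with m·q̇(t0), and `dL_dq` returns the same number whatever is placed in the z slot, which is what treating q and z as independent variables of L means in practice.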

Remark 15. It is possible to find transformations that render all the coordinates cyclic [Marion–Thornton [77, p.270, l.17–l.18]].

Proof. Read Van Esch [29, §3].

Remark a. If the Hamiltonian does not contain time explicitly, then $q_j$ is a linear function of time [Van Esch [29, p.4, l.−16]].

Proof. By Fomin–Gelfand [35, p.70, (11)], $\dfrac{dq_i}{dt} = \dfrac{\partial H}{\partial P_i}$.

Remark b. The $Q_i$ are cyclic [Van Esch [29, p.8, l.−17]].

Proof. By Van Esch [29, p.8, (41)], $H\big(q_i, \frac{\partial W}{\partial q_i}\big) = P_1 = K(Q_i, P_i)$, so the $Q_i$ do not appear in $K(Q_i, P_i)$.

Remark c. All $P_i$ are constant, and all $Q_i$ for $i \neq 1$ are constant [Van Esch [29, p.8, l.−16]].

Proof. By Fomin–Gelfand [35, p.70, (11)], $\dfrac{dP_i}{dt} = -\dfrac{\partial K}{\partial Q_i} = -\dfrac{\partial P_1}{\partial Q_i} = 0$ and $\dfrac{dQ_i}{dt} = \dfrac{\partial K}{\partial P_i} = \dfrac{\partial P_1}{\partial P_i} = \delta_{1i}$.

Remark d. The new Hamiltonian would have been $K = K(P_1, P_2, \dots, P_n)$ [Van Esch [29, p.8, l.−8–l.−7]].

Proof. The result follows from Van Esch [29, p.7, (39) & p.8, (45)].

Remark e. Van Esch [29, p.8, (46)]

Proof. $H\big(q_i, \frac{\partial W}{\partial q_i}\big) = K(P_1, P_2, \dots, P_n)$.
By Fomin–Gelfand [35, p.70, (11)], $\dfrac{dQ_i}{dt} = \dfrac{\partial K}{\partial P_i} = \nu_i$.

Remark 16. (Do not abuse the term “independence”) [Marion–Thornton [77, §7.11]]
In classical mechanics, whenever we discuss “independence”, it refers to the variables of a function “F or f” and the implicit function theorem [Mann–Taylor [76, p.225, Theorem 1]]. Since z = f(x, y) [Mann–Taylor [76, p.225, (8.1-1)]], x and y are independent variables and z is a dependent variable. We often say that x and y are independent when trying to express a function in terms of x and y.

Example 1.5. By the implicit function theorem, we can express $y_i'$ as a function of $x, y_1, \dots, y_n, p_1, \dots, p_n$ [Fomin–Gelfand [35, p.68, l.14–l.15]], where $x, y_1, \dots, y_n, p_1, \dots, p_n$ are independent. H is defined as $-F(x, y_1, \dots, y_n, y_1', \dots, y_n') + \sum_{i=1}^{n} y_i' p_i$ [Fomin–Gelfand [35, p.68, (6)]], so H can be expressed as a function of $x, y_1, \dots, y_n, p_1, \dots, p_n$, where $x, y_1, \dots, y_n, p_1, \dots, p_n$ are independent.

Marion–Thornton [77, p.272, l.12] says that the $\delta q_j$ and the $\delta\dot q_j$ are not independent. In that context, we have no function with variables $\delta q_j$ and $\delta\dot q_j$ to talk about. Instead, we are discussing integration by parts, which can be described without using the term “independence”. Consequently, this is not a proper case for using the term “independence”. In Marion–Thornton [77, p.274, l.5], we discuss a differential equation instead of a function itself. We could say that $p_j$ and $q_j$ turn out to be connected by a differential equation without saying that the assumed independence will eventually be eliminated [Marion–Thornton [77, p.274, l.9–l.10]]. In order to show the role that independence plays in this context and to indicate the proper timing for using it, I would say: with the goal of deriving the canonical Lagrange equations in mind, studying independence is the first step, which facilitates the operation of the function H.

Remark 17. For the equipartition theorem [Marion–Thornton [77, p.278, l.−7]], read Reif [93, p.202, l.−12–l.−11; p.204, (6.2.8); p.249, (7.5.7)].

Remark 18. Marion–Thornton [77, p.279, (7.212)] should be corrected as follows:
\[ \mathbf{r}\cdot\nabla U = \frac{dU}{dr}\cdot r = k(n + 1)r^{n+1} = (n + 1)U. \]

Proof. $\dfrac{\partial U}{\partial x} = (n + 1)kr^n\dfrac{\partial r}{\partial x} = (n + 1)kr^{n-1}x$.
$\dfrac{\partial U}{\partial x}\cdot x = (n + 1)kr^{n-1}x^2$.
$\nabla U\cdot\mathbf{r} = (n + 1)kr^{n+1} = \dfrac{dU}{dr}\cdot r$.

Remark 19. By Marion–Thornton [77, p.301, Fig. 8-8] or Symon [110, p.130, l.7–l.−9], all possible orbits of a body moving in a potential proportional to 1/r are conic sections [Marion–Thornton [77, p.300, l.−2–l.−1]].

Remark 20. Marion–Thornton [77, p.301, (8.42), (8.43); p.302, Fig. 8-9]

Proof. Read Symon [110, p.131, l.8–p.132, l.5]. Marion–Thornton [77, p.301, (8.42)] can be proved by comparing Marion–Thornton [77, p.300, (8.41)] with Symon [110, p.131, (3.244)]. Marion–Thornton [77, p.301, (8.43)] follows from Symon [110, p.132, (3.245)]. The geometric representation of α given in Marion–Thornton [77, p.301, Fig. 8-9] can be proved as follows:

Proof. $\alpha^2 + (2a\varepsilon)^2 = (2a - \alpha)^2$.
$\alpha = a - a\varepsilon^2 = a(1 - \varepsilon^2)$.

Remark 21. (Expanding the scope of application without loss of efficiency: methods of determining the stability of circular orbits) [Marion–Thornton [77, §8.10]]
Marion–Thornton [77, p.317, l.5–l.7] gives a method of determining the stability of circular orbits. However, few examples can take advantage of this method; they are limited to simple functions such as the one given in Marion–Thornton [77, p.317, (8.75)]. For the complicated function given in Marion–Thornton [77, p.319, (8.96)], the method would require a tremendous amount of calculation. Instead, we should use the method given in Marion–Thornton [77, p.317, l.19–p.319, l.−8].

Remark 22. $U = -\frac{1}{4}\rho g\,(b^2 + 2bx - x^2)$ [Marion–Thornton [77, p.335, l.1]].
Proof. The length $l_1$ of the falling part $= \frac{b-x}{2}$.
The mass $M_1$ of the falling part $= \frac{(b-x)\rho b}{2b}$.
The length $l_2$ of the motionless part $= \frac{b+x}{2}$.
The mass $M_2$ of the motionless part $= \frac{(b+x)\rho b}{2b}$.
$U = U_1 + U_2 = g\,\frac{(b-x)\rho b}{2b}\left[-\left(x + \frac{b-x}{4}\right)\right] + g\,\frac{(b+x)\rho b}{2b}\left(-\frac{b+x}{4}\right)$.

Remark 23. $\Delta p = (\rho \Delta x)(2u)$ [Thornton [112, p.289, l.1]].
Explanation. $\Delta x$ at the bend has the downward velocity $-u$ before reaching the bottom.
$\Delta x$ at the bend has the upward velocity $u$ after reaching the bottom.

Remark 24. The maximum value of θ is 180° [Marion–Thornton [77, p.350, l.−12–l.−11]].
Explanation. This is because in the spherical coordinate system θ runs from 0° to 180°. The 3-dimensional picture should be like Symon [110, p.139, Fig. 3.42]. Marion–Thornton [77, p.348, Fig. 9-10(c)] is only the cross section containing the moving path of the particle $m_1$.

Remark 25. Marion–Thornton [77, p.354, (9.87b)]

Proof. $\frac{\sin\psi}{\cos\psi} = \frac{\sin\theta}{\cos\theta + \frac{m_1}{m_2}}$ [Marion–Thornton [77, p.350, (9.69)]].
$\sin(\psi - \theta) = -\frac{m_1}{m_2}\sin\psi$.
$\theta = \arcsin\left(\frac{m_1}{m_2}\sin\psi\right) + \psi$.
$1 - \cos\theta = 1 - \left[\pm\sqrt{1 - \left(\frac{m_1}{m_2}\sin\psi\right)^2}\,\right]\cos\psi + \frac{m_1}{m_2}\sin^2\psi$.
$\frac{T_1}{T_0} = \frac{m_1^2 + 2m_1m_2 + m_2^2 - 2m_1m_2\left(1 - \left[\pm\sqrt{1 - \left(\frac{m_1}{m_2}\sin\psi\right)^2}\,\right]\cos\psi + \frac{m_1}{m_2}\sin^2\psi\right)}{(m_1+m_2)^2}$ [Marion–Thornton [77, p.354, (9.87a)]]
$= \frac{m_1^2\left(\cos^2\psi \pm 2\cos\psi\sqrt{\left(\frac{m_2}{m_1}\right)^2 - \sin^2\psi} + \left(\frac{m_2}{m_1}\right)^2 - \sin^2\psi\right)}{(m_1+m_2)^2}$.
Remark. Marion–Thornton [77, p.354, (9.87a)] is derived from Marion–Thornton [77, p.348, Fig. 9-11a]. See Marion–Thornton [77, p.353, l.9]. However, Marion–Thornton [77, p.354, (9.87a)] also applies to Marion–Thornton [77, p.348, Fig. 9-11b] since $\theta_f$ [resp. $\theta_b$] determines $v_{1,f}$ [resp. $v_{1,b}$]. The LAB scattering angle ψ cannot uniquely determine the final velocity in the LAB system for the case $m_1 > m_2$ since $v_1$ is double-valued.

Remark 26. Marion–Thornton [77, p.354, (9.88)]

Proof. $\tan\zeta = \cot\frac{\theta}{2}$ [Marion–Thornton [77, p.350, (9.73)]]
$\Rightarrow \sec^2\zeta = 1 + \tan^2\zeta = 1 + \cot^2\frac{\theta}{2} = \csc^2\frac{\theta}{2}$
$\Rightarrow \cos^2\zeta = \sin^2\frac{\theta}{2}$
$\Rightarrow 1 - \cos\theta = 2\sin^2\frac{\theta}{2} = 2\cos^2\zeta$.
By Marion–Thornton [77, p.354, (9.87a)], $\frac{T_2}{T_0} = \frac{2m_1m_2}{(m_1+m_2)^2}(1 - \cos\theta) = \frac{4m_1m_2}{(m_1+m_2)^2}\cos^2\zeta$.
By Marion–Thornton [77, p.351, (9.74)], $\zeta \le \frac{\pi}{2}$.

Remark 27. $(-5x^2 + 26x - 5 \ge 0) \Rightarrow (\frac{1}{5} \le x \le 5)$ [Marion–Thornton [77, p.358, l.13]].
Proof. $(5x - 1)(x - 5) \le 0$.
$(5x - 1 \ge 0$ and $x - 5 \le 0)$ or $(5x - 1 \le 0$ and $x - 5 \ge 0)$.
The latter option leads to a contradiction, so $\frac{1}{5} \le x \le 5$.

Remark 28. By Marion–Thornton [77, p.291, (8.15)], the value of $r_{\min}$ is a root of the radical in the denominator in Equation 9.121 [Marion–Thornton [77, p.365, l.10]].

Remark 29. Marion–Thornton [77, p.369, (9.138)]

Proof. It suffices to prove the equality given in Landau–Lifshitz [67, p.53, l.11].
By Landau–Lifshitz [67, p.38, (15.14)], $(r = r_{\min} \Rightarrow \varphi = 0)$ and $(r = \infty \Rightarrow \cos\varphi_0 = \frac{1}{e})$.

Remark 30. The total cross section for the Rutherford scattering is infinite [Marion–Thornton [77, p.371, l.−19–l.−18]].
Proof. By Marion–Thornton [77, p.370, (9.140); p.371, (9.143)],
$\sigma_t = 2\pi \int_0^{\pi} \frac{k^2}{(4T_0')^2}\,\frac{1}{\sin^4(\theta/2)}\,\sin\theta\, d\theta \approx O\left(\int_0 \theta^{-3}\, d\theta\right) = \infty$.

Remark 31. Marion–Thornton [77, p.389, (10.9a); p.390, (10.9b) & (10.9c)] follow from Marion–Thornton [77, p.389, (10.5)]. Marion–Thornton [77, p.390, Figure 10-2] simply provides a geometric interpretation of Marion–Thornton [77, p.389, (10.9a); p.390, (10.9b) & (10.9c)].

Remark 32. The Q in Marion–Thornton [77, p.390, (10.12)] can be arbitrary because the r in Marion–Thornton [77, p.388, (10.2)] need not be the origin of the rotating frame.

Remark 33. $\ddot{\mathbf{R}}_f = \boldsymbol{\omega} \times \dot{\mathbf{R}}_f$ [Marion–Thornton [77, p.396, l.−8]].

Proof. Let the x′y′z′ frame be the fixed inertial frame, and let the x′′y′′z′′ frame be the rotating frame with O′′ = O′ and z′′-axis = z′-axis. See Marion–Thornton [77, p.396, Figure 10-5]. Then by Marion–Thornton [77, p.390, (10.12)],
$\ddot{\mathbf{R}}_f = \left(\frac{d\dot{\mathbf{R}}_f}{dt}\right)_{\text{fixed}} = \left(\frac{d\dot{\mathbf{R}}_f}{dt}\right)_{\text{rotating}} + \boldsymbol{\omega} \times \dot{\mathbf{R}}_f$.
$\dot{\mathbf{R}}_f$ is the velocity of point O. Since point O and the x′′y′′z′′ frame rotate with the same angular velocity ω, $\dot{\mathbf{R}}_f$ is a constant vector in the x′′y′′z′′ frame. Therefore, $\left(\frac{d\dot{\mathbf{R}}_f}{dt}\right)_{\text{rotating}} = 0$.

Remark 34. Because Chasles’ theorem is more specific and delicate than the statement given in Marion–Thornton [77, p.412, l.1–l.3], it is incorrect to say that the former is more general than the latter [Marion–Thornton [77, p.412, l.−5]]. A simple proof of Chasles’ theorem is given in Landau–Lifshitz [67, p.98, l.−3–l.−1].

Remark 35. The angular momentum L rotates with an angular velocity ω in such a way that it traces out a cone whose axis is the axis of rotation [Marion–Thornton [77, p.421, Figure 11-4]].

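Proof. $\mathbf{r}_1 = (r_1\sin\theta\cos\varphi,\; r_1\sin\theta\sin\varphi,\; r_1\cos\theta)$, where $\dot\varphi = \omega$ and θ is the angle between $\mathbf{r}_1$ and ω.
$\mathbf{v}_1 = \omega\,(-r_1\sin\theta\sin\varphi,\; r_1\sin\theta\cos\varphi,\; 0)$.
$\mathbf{r}_1 \times \mathbf{v}_1 = \omega r_1^2 \sin\theta\,(-\cos\theta\cos\varphi,\; -\cos\theta\sin\varphi,\; \sin\theta) = \omega r_1^2 \sin\theta\,(\cos\theta\cos(\varphi+\pi),\; \cos\theta\sin(\varphi+\pi),\; \sin\theta)$.
$\mathbf{L} \parallel \mathbf{r}_1 \times \mathbf{v}_1$.

Remark 36. It is incorrect to say that we rotate through an angle of $\cos^{-1}\left(\sqrt{2/3}\right)$ about the $x_2'$-axis in the second step [Marion–Thornton [77, p.436, l.13–l.14]]. This is because $\cos^{-1}\left(\sqrt{2/3}\right) \in (0, \pi/2)$, but the required angle should be in $(-\pi/2, 0)$. In Marion–Thornton [77, p.419, Figure 11-3], let P = (1,1,0), Q = (1,1,1). The second rotation θ about the $x_2'$-axis should rotate the direction of the $x_1'$-axis (i.e. $\overrightarrow{OP}$) to the direction of $\overrightarrow{OQ}$. That is,
$\begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_3' \\ x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} x_3'' \\ x_1'' \\ x_2'' \end{pmatrix}$, where $\begin{pmatrix} x_3' \\ x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} \sqrt{1/3} \\ \sqrt{2/3} \\ 0 \end{pmatrix}$ and $\begin{pmatrix} x_3'' \\ x_1'' \\ x_2'' \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$.
Therefore, $\cos\theta = \sqrt{2/3}$, $\sin\theta = -\sqrt{1/3}$.

Remark 37. The angular velocity of a rigid body [Landau–Lifshitz [67, p.97, l.−5–p.98, l.2]] refers to the angular velocity of the body frame [Landau–Lifshitz [67, p.96, l.−12–l.−11]] with respect to the fixed space frame. Therefore, $\vec\omega = \dot{\vec\varphi} + \dot{\vec\theta} + \dot{\vec\psi}$ [Marion–Thornton [77, p.443, l.−18]] may follow from Symon [110, p.452, l.19–l.22] and Landau–Lifshitz [67, p.111, l.1–l.13].

Remark 1. The change resulting from the infinitesimal rotation $\delta\vec\theta$ [Marion–Thornton [77, p.35, (1.106); p.36, Figure 1-19]] can be expressed in matrix form: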

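Proof. Let $\vec r = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$.
$\delta\vec r = \delta\vec\theta \times \vec r$ [Marion–Thornton [77, p.35, (1.106)]]
$= \begin{vmatrix} \hat e_1 & \hat e_2 & \hat e_3 \\ \delta\theta_1 & \delta\theta_2 & \delta\theta_3 \\ x_1 & x_2 & x_3 \end{vmatrix} = \begin{pmatrix} 0 & -\delta\theta_3 & \delta\theta_2 \\ \delta\theta_3 & 0 & -\delta\theta_1 \\ -\delta\theta_2 & \delta\theta_1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$.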
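The matrix form in the proof above can be checked numerically. The following is a minimal sketch in plain Python; the helper names `cross`, `skew`, and `matvec` and the sample values are illustrative assumptions, not notation from Marion–Thornton.

```python
def cross(a, b):
    """Ordinary cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def skew(d):
    """Antisymmetric matrix built from d = (d1, d2, d3) as in the proof."""
    d1, d2, d3 = d
    return ((0.0, -d3, d2),
            (d3, 0.0, -d1),
            (-d2, d1, 0.0))


def matvec(m, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


# Illustrative (hypothetical) values for the infinitesimal rotation and position.
dtheta = (1e-3, -2e-3, 3e-3)
r = (1.0, 2.0, 3.0)

print(cross(dtheta, r))         # delta r computed as a cross product
print(matvec(skew(dtheta), r))  # delta r computed with the antisymmetric matrix
```

Both printed triples should agree up to rounding, which is all the proof asserts.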

Remark 2. Why can we neglect higher-order infinitesimals in Goldstein–Poole–Safko [41, p.165, (4.67)]?

Explanation. The general infinitesimal rotation $1 + [\varepsilon_{ij}]$ can be represented by
$A = d\vec\psi\, d\vec\theta\, d\vec\varphi = \begin{pmatrix} 1 & (d\varphi + d\psi) & 0 \\ -(d\varphi + d\psi) & 1 & d\theta \\ 0 & -d\theta & 1 \end{pmatrix}$ [Goldstein–Poole–Safko [41, p.165, l.14]].
Namely, $d\vec\Omega = \mathbf{i}\, d\theta + \mathbf{k}\,(d\varphi + d\psi)$ [Goldstein–Poole–Safko [41, p.165, l.15]].
Our goal is to divide $\delta\vec r = \delta\vec\theta \times \vec r$ [Marion–Thornton [77, p.35, (1.106)]] by dt and then let $dt \to 0$ in order to obtain $\vec v = \vec\omega \times \vec r$ [Marion–Thornton [77, p.35, (1.105)]].
We keep the first-order infinitesimals because only the first-order limits [$\lim_{dt\to 0} \frac{d\varphi}{dt}$, $\lim_{dt\to 0} \frac{d\theta}{dt}$, $\lim_{dt\to 0} \frac{d\psi}{dt}$] can be nonzero. In contrast, the second-order limits [$\lim_{dt\to 0} \frac{d\varphi\, d\theta}{dt}$, $\lim_{dt\to 0} \frac{(d\varphi)^2}{dt}$] must be zero. Consequently, we may ignore the higher-order infinitesimals.
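The claim that second-order terms drop out after dividing by dt can be illustrated numerically. The sketch below is an assumption-laden illustration: it uses finite rotations about the x- and z-axes by the same small angle `eps` (a stand-in for the infinitesimals), and measures how far the two orders of composition differ. The function names are hypothetical.

```python
import math


def rot_x(angle):
    """Rotation matrix about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(angle):
    """Rotation matrix about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def commutator_size(eps):
    """Largest entry of Rx(eps)Rz(eps) - Rz(eps)Rx(eps) in absolute value."""
    ab = matmul(rot_x(eps), rot_z(eps))
    ba = matmul(rot_z(eps), rot_x(eps))
    return max(abs(ab[i][j] - ba[i][j]) for i in range(3) for j in range(3))


for eps in (1e-1, 1e-2, 1e-3):
    size = commutator_size(eps)
    # size / eps shrinks with eps (first-order limit is zero),
    # while size / eps**2 stays close to 1 (the difference is second order).
    print(eps, size / eps, size / eps ** 2)
```

The order dependence is of second order in the angle, so it disappears in the limit that defines the angular velocity.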

Remark 3. The general infinitesimal rotation given by Goldstein–Poole–Safko [41, p.165, l.14] is a matrix, so it is necessary to prove that it behaves as a vector [i.e. to show that the composition is independent of order]. Goldstein–Poole–Safko [41, p.165, (4.67)] provides the proof. In Marion–Thornton [77, p.35, (1.106)], the general infinitesimal rotation is defined by the vector $d\vec\theta$ that satisfies $\delta\vec r = \delta\vec\theta \times \vec r$. Thus, $d\vec\theta$ is already a vector whose magnitude and direction are the same as those in Landau–Lifshitz [67, p.18, l.−9–l.−8]. Consequently, the proof given in Marion–Thornton [77, p.36, l.8–l.−1] is totally unnecessary and the statement given in Marion–Thornton [77, p.36, l.3–l.4] is not true. From the viewpoint of vectors, we understand that two velocities applied to the same position can be added. What requires an explanation is why two angular velocities can be added. This is the main problem, and its answer is provided by the distributive law for vectors. The problem of showing that the composition is independent of order [Marion–Thornton [77, p.36, l.4–l.7]; Goldstein–Poole–Safko [41, p.165, (4.67)]] is only a side problem. The key to proving that the composition for $d\vec\theta$ is independent of order is also the distributive law for vectors. See Marion–Thornton [77, p.36, l.8–l.−1]. Actually, the proof given in Marion–Thornton [77, p.36, l.8–l.−1] is essentially the same as the proof given in Symon [110, p.452, l.−15–l.−7] except that the summands in the former proof are infinitesimal rotations, while the summands in the latter proof are not restricted to infinitesimal angular velocities. In my opinion, matrices are a good tool for describing the result of a motion, but are not appropriate for describing the process of a motion, for example, angular velocities.

Remark 38. If the angular velocity vector ω lies along a principal axis of the body, then the motion is trivial [Marion–Thornton [77, p.449, l.5–l.6]].

Proof. For force-free motion, the angular momentum L is constant.
If ω lies along a principal axis of the body, say the $x_1$-axis, then $\omega = \begin{pmatrix} \omega_1 \\ 0 \\ 0 \end{pmatrix}$.
$L = I\omega$ [Marion–Thornton [77, p.420, (11.20b)]], where $I = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{pmatrix}$ is constant. Consequently, $\omega_1$ is constant.

Remark 39. Marion–Thornton [77, p.449, l.19–p.450, l.3] [Landau–Lifshitz [67, p.115, l.−7–l.−1]] provides one method of solving the ODEs given in Marion–Thornton [77, p.449, (11.132)]. Symon [110, p.446, l.17–l.−6] provides a less tricky method of solving the same ODEs. Now we would like to supply more details about the proof of Symon [110, p.446, (11.19)]. By Symon [110, p.446, (11.18)] and Pontryagin [91, p.22, Theorem 3],
$\begin{pmatrix} \omega_1 \\ \omega_2 \end{pmatrix} = e^{i\theta} \begin{pmatrix} e^{i\beta\omega_3 t} \\ -i e^{i\beta\omega_3 t} \end{pmatrix} + e^{-i\theta} \begin{pmatrix} e^{-i\beta\omega_3 t} \\ i e^{-i\beta\omega_3 t} \end{pmatrix} = 2 \begin{pmatrix} \cos(\beta\omega_3 t + \theta) \\ \sin(\beta\omega_3 t + \theta) \end{pmatrix}$
is also a solution of the above ODEs.
Remark. The key [Marion–Thornton [77, p.449, (11.133)–(11.136)]] to solving the coupled equations in Marion–Thornton [77, p.449, (11.132)] is based on the physical observation of precession [Marion–Thornton [77, p.450, Figure 11-12]]. The key [Marion–Thornton [77, p.470, (12.2)]] to solving the coupled equations in Marion–Thornton [77, p.470, (12.1)] is based on the physical observation of oscillation [Marion–Thornton [77, p.470, l.6]]. Two coupled circuits [Wangsness [121, p.460, Figure 27-9; (27-38); (27-39)]] provide another example.
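The solution quoted in Remark 39 can be sanity-checked numerically. The sketch below assumes that the coupled ODEs have the form $\dot\omega_1 + \Omega\omega_2 = 0$, $\dot\omega_2 - \Omega\omega_1 = 0$ with $\Omega = \beta\omega_3$; this form, the constants `OMEGA` and `THETA`, and the function names are assumptions made for illustration only.

```python
import cmath
import math

OMEGA = 0.7  # stands for beta * omega_3 (illustrative value)
THETA = 0.3  # phase constant (illustrative value)


def complex_combination(t):
    """e^{i theta}(e^{i Omega t}, -i e^{i Omega t}) + its complex conjugate."""
    z = cmath.exp(1j * (OMEGA * t + THETA))
    w1 = z + z.conjugate()
    w2 = -1j * z + 1j * z.conjugate()
    return w1, w2


def real_form(t):
    """The real form 2(cos(Omega t + theta), sin(Omega t + theta))."""
    phase = OMEGA * t + THETA
    return 2.0 * math.cos(phase), 2.0 * math.sin(phase)


def residuals(t, h=1e-6):
    """Central-difference residuals of the two assumed ODEs at time t."""
    w1_plus, w2_plus = real_form(t + h)
    w1_minus, w2_minus = real_form(t - h)
    w1, w2 = real_form(t)
    d1 = (w1_plus - w1_minus) / (2.0 * h)
    d2 = (w2_plus - w2_minus) / (2.0 * h)
    return d1 + OMEGA * w2, d2 - OMEGA * w1


for t in (0.0, 0.5, 2.0):
    print(complex_combination(t), real_form(t), residuals(t))
```

The complex combination has vanishing imaginary part and matches the real form, and the residuals are zero up to finite-difference error.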

Remark 40. In the argument for reaching the conclusions given in Marion–Thornton [77, p.452, l.−9–p.453, l.3], a condition on $e_2$ is imposed in Marion–Thornton [77, p.452, l.8–l.9]. The same conclusions [Symon [110, p.446, l.−5–l.−3; p.448, l.1–l.2]] are reached without assuming this extra condition [Symon [110, p.448, Fig. 11.1]]. Thus, the former argument should have explained why we may assume the extra condition without loss of generality. Landau–Lifshitz [67, p.111, l.−20–l.−18] supplies the reason.

Remark 41. The inertia tensor given in Marion–Thornton [77, p.455, (11.149)] is incorrect. The inertia tensor is $I = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_1 & 0 \\ 0 & 0 & I_3 \end{pmatrix}$ if the coordinate axes are the principal axes of inertia and the origin is the center of mass [Landau–Lifshitz [67, p.101, l.9]]. However, the center of mass is at $\begin{pmatrix} 0 \\ 0 \\ h \end{pmatrix}$ in this case, so I should be replaced with $I' = \begin{pmatrix} I_1 + Mh^2 & 0 & 0 \\ 0 & I_1 + Mh^2 & 0 \\ 0 & 0 & I_3 \end{pmatrix}$ [Landau–Lifshitz [67, p.101, (32.12)]]. Thus, the kinetic energy should be $T'_{\text{rot}} = \frac{1}{2}(I_1 + Mh^2)(\dot\theta^2 + \dot\varphi^2 \sin^2\theta) + \frac{1}{2} I_3 (\dot\psi + \dot\varphi\cos\theta)^2$ [Landau–Lifshitz [67, p.112, l.6]].
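The replacement of I by I′ in Remark 41 is an instance of the parallel-axis rule. As a minimal sketch, assuming the rule in the form $I'_{ik} = I_{ik} + M(a^2\delta_{ik} - a_i a_k)$ for a displacement a of the origin from the center of mass, the shifted tensor can be computed as follows; the function name and the numerical values are hypothetical.

```python
def parallel_axis(inertia_cm, mass, a):
    """Shift an inertia tensor from the center of mass by the vector a.

    Uses I'_ik = I_ik + M (a^2 delta_ik - a_i a_k); stated here as an assumption.
    """
    a2 = sum(component * component for component in a)
    return [[inertia_cm[i][j] + mass * ((a2 if i == j else 0.0) - a[i] * a[j])
             for j in range(3)]
            for i in range(3)]


# Illustrative symmetric top: I1 = 2.0, I3 = 0.5, M = 3.0, h = 0.4.
I1, I3, M, h = 2.0, 0.5, 3.0, 0.4
inertia_cm = [[I1, 0.0, 0.0], [0.0, I1, 0.0], [0.0, 0.0, I3]]

shifted = parallel_axis(inertia_cm, M, (0.0, 0.0, h))
for row in shifted:
    print(row)
```

With the displacement along the symmetry axis, only the two transverse moments pick up the extra term $Mh^2$, and $I_3$ is unchanged.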

Remark 42. The particles oscillate always out of phase and with frequency ω1 [Marion–Thornton [77, p.472, l.10]].

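Proof. Out of phase: $\eta_2(t) \equiv 0 \Rightarrow x_1 + x_2 \equiv 0 \Leftrightarrow x_2 = -x_1$.
When $x_1$ and $x_2$ are expressed in terms of cos or sin, their phase difference is 180°.
Frequency $\omega_1$:
$\eta_2 \equiv 0 \Rightarrow \begin{cases} x_1 = \frac{1}{2}(\eta_2 + \eta_1) = \frac{1}{2}\eta_1 = \frac{1}{2}\left(C_1^{+} e^{i\omega_1 t} + C_1^{-} e^{-i\omega_1 t}\right) \\ x_2 = \frac{1}{2}(\eta_2 - \eta_1) = -\frac{1}{2}\eta_1 = -\frac{1}{2}\left(C_1^{+} e^{i\omega_1 t} + C_1^{-} e^{-i\omega_1 t}\right). \end{cases}$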

Remark 43. Marion–Thornton [77, p.482, (12.56)] follows from Marion–Thornton [77, p.481, (12.55)].

Remark 44. $\sum_{j,k} m_{jk} a_{jr} a_{kr} > 0$ [Marion–Thornton [77, p.482, l.−20–l.−17]].
Proof. We may adjust $\delta_r$ so that $\sin^2(\omega_{r_0} t - \delta_{r_0}) > 0$ and, for every $r \ne r_0$, $\sin(\omega_r t - \delta_r) = 0$.
Since $(\omega_{r_0} \ne 0 \Rightarrow T > 0)$, $\sum_{j,k} m_{jk} a_{jr_0} a_{kr_0} > 0$.

Remark 45. Solving the problem of coupled harmonic oscillators [Marion–Thornton [77, chap. 12]; Cohen-Tannoudji–Diu–Laloë [17, vol.1, pp.575–585, Complement $H_V$]].
(1) From the viewpoint of differential equations: by changing variables [Marion–Thornton [77, p.471, (12.11)]], we may make the coupled differential equations given in Marion–Thornton [77, p.470, (12.1)] completely separable [Marion–Thornton [77, p.471, (12.14)]].
(2) From the viewpoint of individual particles using Newtonian mechanics: the results are summarized in Marion–Thornton [77, p.487, Table 12-1]; the pictorial features are given in Marion–Thornton [77, p.472, Fig. 12-2].
(3) From the viewpoint of the entire system using the Lagrangian in Lagrangian mechanics:
(a). If the equations connecting the generalized coordinates and the rectangular coordinates do not explicitly contain the time, then the kinetic energy has the form given in Marion–Thornton [77, p.476, (12.18)].
(b). The expansion of the potential energy in a Taylor series about the equilibrium configuration yields Marion–Thornton [77, p.476, (12.32)].
(c). The Lagrangian equations yield Marion–Thornton [77, p.478, (12.38)]. By substituting Marion–Thornton [77, p.478, (12.39)] into Marion–Thornton [77, p.478, (12.38)], we have Marion–Thornton [77, p.479, (12.40)]. In order to find the solutions of Marion–Thornton [77, p.479, (12.40)], we first solve Marion–Thornton [77, p.479, (12.42)] for ω. Then for each $\omega_r$, we solve Marion–Thornton [77, p.479, (12.40)] to obtain the corresponding eigenvector $a_r$.
(d). Using Marion–Thornton [77, p.483, (12.63)], we simultaneously diagonalize T and U [Marion–Thornton [77, p.484, (12.65); (12.66)]]. Then the Lagrangian equations in normal coordinates become completely separable [Marion–Thornton [77, p.485, l.4]].
(4) From the viewpoint of the entire system using the Hamiltonian operator in quantum mechanics: the first equality given in Marion–Thornton [77, p.480, (12.45)] is a special case of Cohen-Tannoudji–Diu–Laloë [17, vol.1, p.576, (4)]. By Cohen-Tannoudji–Diu–Laloë [17, vol.1, pp.584–585, Complement $H_V$, 2d], we find that $\langle X_G \rangle(t)$ and $\langle X_R \rangle(t)$ oscillate at angular frequencies of $\omega_G$ and $\omega_R$, which agrees with the classical result.
Remark. As we go to a more advanced level and widen our consideration, new physical meanings of mathematical equations continue to develop, and the meanings of equations become richer and more delicate. Nonetheless, the meanings in older theories are still well preserved in a newer theory.

Remark 46. Marion–Thornton [77, p.505, (12.160)]

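Proof. Let $D_n(x) = \sum_{k=-n}^{n} e^{ikx}$. Then by Rudin [97, p.174, (77)],
$D_n\left(\frac{r\pi}{n+1}\right) = \begin{cases} 2n+1 & \text{if } r = 0 \\ (-1)^{r+1} & \text{if } r \ne 0 \ (0 < |r| < 2(n+1)). \end{cases}$
$\sum_{j=1}^{n} \sin\left(j\frac{r\pi}{n+1}\right)\sin\left(j\frac{s\pi}{n+1}\right) = -\frac{1}{4}\left[D_n\left(\frac{(r+s)\pi}{n+1}\right) - D_n\left(\frac{(r-s)\pi}{n+1}\right)\right]$.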
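The identity used in this proof, together with the resulting orthogonality relation $\sum_{j=1}^{n} \sin\left(j\frac{r\pi}{n+1}\right)\sin\left(j\frac{s\pi}{n+1}\right) = \frac{n+1}{2}\delta_{rs}$ for $1 \le r, s \le n$, can be checked directly. This is a minimal numerical sketch; the function names and the choice n = 6 are illustrative assumptions.

```python
import math


def dirichlet(n, x):
    """D_n(x) = sum_{k=-n}^{n} e^{ikx}; the imaginary parts cancel in pairs."""
    return sum(math.cos(k * x) for k in range(-n, n + 1))


def sine_sum(n, r, s):
    """Left-hand side: sum over j of sin(j r pi/(n+1)) sin(j s pi/(n+1))."""
    step = math.pi / (n + 1)
    return sum(math.sin(j * r * step) * math.sin(j * s * step)
               for j in range(1, n + 1))


def via_kernel(n, r, s):
    """Right-hand side written with the Dirichlet kernel."""
    step = math.pi / (n + 1)
    return -0.25 * (dirichlet(n, (r + s) * step) - dirichlet(n, (r - s) * step))


n = 6
for r in range(1, n + 1):
    row = [round(sine_sum(n, r, s), 10) for s in range(1, n + 1)]
    print(row)  # (n + 1) / 2 on the diagonal, 0 elsewhere, up to rounding
```

The case r = s is where the value $D_n(0) = 2n + 1$ enters; for r ≠ s the two kernel values have the same sign and cancel.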

Remark. Marion–Thornton [77, p.505, (12.160)] is a finite version of Marion–Thornton [77, p.514, (13.7)]. The latter can be further generalized as the closure relation in integral form given in Cohen-Tannoudji–Diu–Laloë [17, vol.1, p.100, (A-32)]. The equivalent statements for an orthonormal basis in a Hilbert space are given in Rudin [99, p.90, Theorem 4.18]. In terms of projections in state space, an orthonormal basis can be expressed as Cohen-Tannoudji–Diu–Laloë [17, vol.1, p.123, (C-9)]. See Cohen-Tannoudji–Diu–Laloë [17, vol.1, p.123, l.−5–p.124, l.10].

Remark 47. Because the initial displacement was symmetrical, the subsequent motion must also be symmetrical, so none of the even modes (for which the center position of the string is a node) are excited [Marion–Thornton [77, p.516, l.13–l.17]].

Proof. The amplitude of the normal mode with frequency $\omega_r$ is determined by (the eigenvector $a_{jr}$) or $\sin\left(\frac{r\pi x}{L}\right)$ [Marion–Thornton [77, p.500, (12.138); p.502, (12.150); p.514, (13.5)]].
$\sin\left(\frac{r\pi}{L}\cdot\frac{L}{2}\right) = 0$ if and only if r is even.
$\sin\left(\frac{r\pi(L-x)}{L}\right) = \sin\left(\frac{r\pi x}{L}\right)$ if r is odd.

Remark 48. (Avoiding contradictions: from the Galilean invariance to the Lorentz invariance) [Marion–Thornton [77, chap. 14]]
Avoiding contradiction is an important mathematical method. A theory cannot allow contradiction or inconsistency. If a theory cannot explain certain phenomena, we should modify it so that the new theory can explain them and, in special cases, the results of the new theory reduce to those of the old one.
The Michelson–Morley experiment [Fowler [36]] suggests that the velocity of light is constant, independent of any relative motion of the source and the observer. However, the Galilean transformation is inconsistent with this suggestion [Marion–Thornton [77, p.548, l.−14–l.−7]]. Maxwell’s equations are invariant in form under the Lorentz transformation [Wangsness [121, §29-5]]. For the contradiction caused by the Galilean invariance given in Marion–Thornton [77, p.549, l.9–l.11], the solution is to use Lorentz transformations instead of Galilean transformations. Muon decay provides an experimental verification of special relativity [Marion–Thornton [77, p.555, l.−14–p.556, l.−16]]. The speed of light is invariant under Lorentz transformations [Marion–Thornton [77, p.552, l.1–l.5]]. Linear momentum is not conserved according to special relativity if we use the conventions for momentum of classical physics [Marion–Thornton [77, p.564, l.5–l.6]]. The key to resolving this inconsistency is to use the proper time in the definition of linear momentum [Marion–Thornton [77, p.564, l.21–l.26; p.565, Example 14.6]]. Marion–Thornton [77, p.567, Example 14.7] shows that the relativistic kinetic energy reduces to the classical result for small speeds, $u \ll c$.
If we use the position 4-vector X with $x_4 = ict$ to construct the Lorentz transformation matrix [Marion–Thornton [77, p.572, (14.77)]], then the momentum vector p given in Marion–Thornton [77, p.564, (14.45)] becomes the momentum–energy 4-vector $P = (\mathbf{p}, i\frac{E}{c})$ [Marion–Thornton [77, p.573, (14.91)]], where E is the total energy. The contradiction given in Marion–Thornton [77, p.574, l.−7–l.−1] forces us to modify the velocity addition rule: Marion–Thornton [77, p.576, (14.98)]. In order to make the Lagrange equations [Marion–Thornton [77, p.578, (14.107)]] accommodate the relativistic momentum vector [Marion–Thornton [77, p.578, (14.108)]], we must modify the definition of the Lagrangian [Marion–Thornton [77, p.578, (14.113)]]. Because mass and energy are interrelated in relativity theory, it is no longer meaningful to speak of a “center-of-mass” system; in relativistic kinematics, one uses a “center-of-momentum” coordinate system instead. Such a system possesses the same essential property as the previously used center-of-mass system: the total linear momentum in the system is zero [Marion–Thornton [77, p.579, l.−16–l.−11]]. This modification of the coordinate system leads to Marion–Thornton [77, p.582, (14.128); (14.129)], which reduce to the classical results given in Marion–Thornton [77, p.350, (9.69); (9.73)] when $\gamma_1 \to 1$.

Remark 49. One can always find a suitable inertial frame in which the events occur at the same point in space but at different times [Marion–Thornton [77, p.571, l.2–l.3]].

Proof. $0 = \Delta x' = \frac{\Delta x - v\Delta t}{\sqrt{1 - \frac{v^2}{c^2}}} \Rightarrow \Delta x - v\Delta t = 0 \Rightarrow v = \frac{\Delta x}{\Delta t}$.

Remark 50. Since two events can only be connected by a signal with speed $\le c$, events with a spacelike interval cannot be causally connected [Marion–Thornton [77, p.571, l.7–l.8]].
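The proof of Remark 49 can be turned into a short numerical check. The sketch below assumes the standard one-dimensional Lorentz boost and units with c = 1; the function names `boost` and `rest_frame_velocity` are hypothetical. For a timelike separation, the velocity $v = \Delta x/\Delta t$ is below c and brings the two events to the same place; for a spacelike separation no such frame exists, which is the situation of Remark 50.

```python
import math


def boost(dx, dt, v, c=1.0):
    """Separation (dx', dt') seen from a frame moving with velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (dx - v * dt), gamma * (dt - v * dx / c ** 2)


def rest_frame_velocity(dx, dt, c=1.0):
    """Velocity of the frame in which the two events occur at the same point.

    Only defined for a timelike separation, |dx| < c |dt|.
    """
    if abs(dx) >= c * abs(dt):
        raise ValueError("separation is not timelike; no such frame exists")
    return dx / dt


# Illustrative timelike separation (c = 1): dx = 3, dt = 5.
v = rest_frame_velocity(3.0, 5.0)
dx_prime, dt_prime = boost(3.0, 5.0, v)
print(v, dx_prime, dt_prime)
```

The quantity $c^2\Delta t^2 - \Delta x^2$ is the same in both frames, so $\Delta t'$ is the proper time between the events.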

Remark 51. The phase velocity of a light wave in a medium for which the index of refraction is less than unity is greater than c, but the phase velocity does not correspond to the signal velocity in such a medium; the signal velocity is indeed less than c [Marion–Thornton [77, p.576, l.7–l.10]].
Explanation. To see why the group velocity need not correspond to the speed of information (energy) in a wave, notice that in general, by superimposing simple waves with different frequencies and wavelengths, we can easily produce a waveform with a group velocity greater than the propagation speeds of the constituent waves. For more details, read Mathpages [78].

Remark 52. Marion–Thornton [77, p.581, (14.122a); (14.122b)]

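Proof. (b). $(m_1 c\gamma_1 + m_2 c)\sqrt{\gamma_2'^2 - 1} = m_1 c\,\gamma_2' \sqrt{\gamma_1^2 - 1}$.
$\frac{\left(\gamma_1 + \frac{m_2}{m_1}\right)^2}{\gamma_1^2 - 1} = \frac{\gamma_2'^2}{\gamma_2'^2 - 1}$.
(a). $m_1^2(\gamma_1'^2 - 1) = m_2^2(\gamma_2'^2 - 1)$, where $\gamma_2'$ can be substituted by Marion–Thornton [77, p.581, (14.122b)].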
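The two relations in this proof can be checked against an independent quantity. The sketch below works in units with c = 1 and makes two labeled assumptions: the closed form for $\gamma_2'$ is the one obtained by solving relation (b) above, and the total center-of-momentum energy is taken to be $\sqrt{m_1^2 + m_2^2 + 2m_1m_2\gamma_1}$ (the invariant mass of a projectile $m_1$ hitting a target $m_2$ at rest). The function names are hypothetical.

```python
import math


def gamma2_prime(m1, m2, gamma1):
    """gamma_2' obtained by solving relation (b) for gamma_2' (c = 1)."""
    ratio = m2 / m1
    return (gamma1 + ratio) / math.sqrt(1.0 + 2.0 * gamma1 * ratio + ratio ** 2)


def gamma1_prime(m1, m2, gamma1):
    """gamma_1' from relation (a): m1^2 (g1'^2 - 1) = m2^2 (g2'^2 - 1)."""
    g2 = gamma2_prime(m1, m2, gamma1)
    return math.sqrt(1.0 + (m2 / m1) ** 2 * (g2 ** 2 - 1.0))


def invariant_mass(m1, m2, gamma1):
    """Total energy in the center-of-momentum frame (assumed formula, c = 1)."""
    return math.sqrt(m1 ** 2 + m2 ** 2 + 2.0 * m1 * m2 * gamma1)


# Illustrative values: m1 = 1, m2 = 3, gamma1 = 2.
m1, m2, g1 = 1.0, 3.0, 2.0
g1p, g2p = gamma1_prime(m1, m2, g1), gamma2_prime(m1, m2, g1)
print(g1p, g2p, m1 * g1p + m2 * g2p, invariant_mass(m1, m2, g1))
```

If the relations are consistent, the sum of the two center-of-momentum energies $m_1\gamma_1' + m_2\gamma_2'$ equals the invariant mass.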

Remark 53. $\psi + \zeta < \frac{\pi}{2}$ [Marion–Thornton [77, p.582, l.−8]].
Proof. If $\zeta = \frac{\pi}{2} - \psi$, then $\tan\zeta = \cot\psi = \frac{1}{\tan\psi}$.
tan is an increasing function on $[0, \frac{\pi}{2})$, so $\zeta < \frac{\pi}{2} - \psi$.

7. Watson–Whittaker [122]

Remark 1. How do we choose γ and z so that $\frac{|\varphi(\zeta) - b|}{|\varphi(t) - b|} < 1$ [Watson–Whittaker [122, p.129, l.−3]]?

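Solution. $\varphi(a) = b$ and $\varphi'(a) \ne 0$. By Rudin [99, p.231, Theorem 10.30], there exist an open set U containing a and an open set V containing b such that φ is one-to-one on U and $\varphi(U) = V$.
Take $\varepsilon > 0$ such that $\{v : |v - b| = \varepsilon\} \subset V$.
Let $\gamma = \{t \in U : |\varphi(t) - b| = \varepsilon\}$.
$\exists\, \delta > 0 \ni |t - a| < \delta \Rightarrow |\varphi(t) - b| < \varepsilon$.
Let $z \in \{t \in U : |t - a| < \delta\}$.
$\zeta \in [a, z] \Rightarrow \left(\frac{|\varphi(\zeta) - b|}{|\varphi(t) - b|} < 1\right.$, where $\left. t \in \gamma\right)$.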

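Remark. $f'(\zeta) = \frac{1}{2\pi i}\int_\gamma \frac{f'(t)\varphi'(\zeta)}{\varphi(t) - \varphi(\zeta)}\, dt$ [Watson–Whittaker [122, p.129, l.−4]].
Proof. $\varphi(t) - \varphi(\zeta) = (t - \zeta) g(t)$, where $g(\zeta) = \varphi'(\zeta) \ne 0$.
$\frac{\varphi'(\zeta)}{\varphi(t) - \varphi(\zeta)} = \frac{\varphi'(\zeta)/g(t)}{t - \zeta}$, where g(t) has no zeros on U.
$\left.\frac{\varphi'(\zeta)}{g(t)}\right|_{t=\zeta} = 1$.

Remark 2. $\int_C [\theta(z)]^{-n}\theta'(z)\, dz = 0$ if $n > 1$ [Watson–Whittaker [122, p.131, l.−10]].
Proof. $\int_C [\theta(z)]^{-n}\theta'(z)\, dz = \left.\frac{[\theta(z)]^{-n+1}}{-n+1}\right|_C = 0$.

Remark 3. The equation $\zeta = a + t\varphi(\zeta)$ has one root in the interior of C [Watson–Whittaker [122, p.133, l.8]].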

Proof. Assume that $\zeta = a + t\varphi(\zeta)$ has two roots $\zeta_1, \zeta_2$ in the interior of C. Then
$\theta(\zeta_i) = (\zeta_i - a)\theta_1(\zeta_i) = \frac{\zeta_i - a}{\varphi(\zeta_i)} = t \quad (i = 1, 2)$.
By the argument given in Watson–Whittaker [122, p.131, l.−10–l.−9], we see that $\theta(z) = t$ has one root in the interior of C.

Remark 4. The necessary condition of Guo–Wang [48, p.56, l.−12–l.−9, Theorem] can be used as a supplement to Watson–Whittaker [122, §10.3; §10.31; §10.32]. However, for the proof of the sufficient condition of Guo–Wang [48, p.56, l.−12–l.−9, Theorem], it is better to read Watson–Whittaker [122, §10.3; §10.31; §10.32]. Watson–Whittaker [122, p.198, l.1–l.2; p.199, l.−10–l.−6] are important points, but Guo–Wang [48, §2.4] fails to mention them. For the case $\rho_1 - \rho_2 =$ a positive integer, the second solution given in Guo–Wang [48, p.60, l.−4] fails to be represented in the form required by Guo–Wang [48, p.56, (7)]. The term-by-term integration given in Watson–Whittaker [122, p.200, l.−7–l.−5] can be justified by Rudin [99, p.27, Theorem 1.34].

Remark 5. (I) $\left(\frac{z-a}{z-b}\right)^k \left(\frac{z-c}{z-b}\right)^l P\begin{Bmatrix} a & b & c & \\ \alpha & \beta & \gamma & z \\ \alpha' & \beta' & \gamma' & \end{Bmatrix} = P\begin{Bmatrix} a & b & c & \\ \alpha + k & \beta - k - l & \gamma + l & z \\ \alpha' + k & \beta' - k - l & \gamma' + l & \end{Bmatrix}$ [Watson–Whittaker [122, p.207, l.10]].
Proof. Let $u = P\begin{Bmatrix} a & b & c & \\ \alpha & \beta & \gamma & z \\ \alpha' & \beta' & \gamma' & \end{Bmatrix}$. At the singularity z = a with the exponent α, u can be considered as $(z - a)^{\alpha} + \cdots$. Consequently, $\left(\frac{z-a}{z-b}\right)^k \left(\frac{z-c}{z-b}\right)^l u = A(z - a)^{\alpha + k} + \cdots$.

Remark 6. $u = A\left(\frac{z-a}{z-b}\right)^{\alpha} + B_1\left(\frac{z-a}{z-b}\right)^{\alpha}\log\left(\frac{z-a}{z-b}\right)$ [Watson–Whittaker [122, p.208, l.−11]] follows from the theorem given in Ince [59, p.137, l.−17–l.−14].

Remark 7. $\Gamma'(z) = \Gamma(z)\int_0^{\infty}\left\{e^{-x} - \frac{1}{(1+x)^z}\right\}\frac{dx}{x}$ [Watson–Whittaker [122, p.247, l.−3]].
Proof. $\left\{e^{-x} - \frac{1}{(1+x)^z}\right\}\frac{1}{x}$ can be expressed as a power series in x in the neighborhood of x = 0.
$\left|\int^{\infty}\frac{dx}{x(1+x)^z}\right| \le \int^{\infty}\frac{dx}{x^{\Re z + 1}}$.
The desired result follows from Rudin [99, p.27, Theorem 1.34].

Remark 8. “The integral converges uniformly when the real part of z is positive” [Watson–Whittaker [122, p.248, l.−13–l.−12]] should have been corrected as “The integral converges uniformly on compact subsets of $\{z \in \mathbb{C} \mid \Re z > 0\}$”.

Remark 9.
(The finishing touch) Providing a solution to a problem alone is not enough; the author should tell the readers where the solution comes from. This brings the readers to an advantageous point for a bird’s-eye view of the circumstances. By substitution, $(z-a)^{\alpha}(z-b)^{\beta}(z-c)^{\gamma}\int_C (t-a)^{\beta+\gamma+\alpha'-1}(t-b)^{\gamma+\alpha+\beta'-1}(t-c)^{\alpha+\beta+\gamma'-1}(z-t)^{-\alpha-\beta-\gamma}\, dt$ [Watson–Whittaker [122, p.292, l.−14]] is a solution of Riemann’s differential equation [Watson–Whittaker [122, p.283, l.13–l.16]]. Watson–Whittaker [122, p.293, l.−19–l.−14] uses the definition of the beta function and the binomial theorem to prove that the given integral form is a solution, but Watson–Whittaker [122, §14·6] fails to explain where the form comes from. In contrast, Lebedev [72, p.239, (9.1.3); (9.1.4)] indicates that the integral form is built by means of the definition of the beta function and the binomial theorem. The approach given in Guo–Wang [48, §4.5] is even better because Guo–Wang [48, p.150, (3)] comes from Guo–Wang [48, §2.14, (23)]. Guo–Wang [48, §2.14] shows that the Euler transform is an important tool for solving a differential equation of Fuchsian type with three singularities.
Remark. Guo–Wang [48, p.150, l.−4–p.151, l.3] gives a detailed calculation to find M in Lebedev [72, p.240, l.4–l.5].

Remark 10. (Musket to kill a butterfly) The differentiation under the integral sign for $\int_0^{\infty}\frac{t\, dt}{(z^2+t^2)(e^{2\pi t}-1)}$ [Watson–Whittaker [122, p.250, l.−4]] ($\int_0^{\infty}\frac{\arctan(t/z)}{e^{2\pi t}-1}\, dt$ resp. [Watson–Whittaker [122, p.250, l.−2]]) can be justified by either the classical method [Titchmarsh [114, p.100, l.7–l.13]] or the modern method [Rudin [99, p.246, Exercise 16]]. For the latter method, we let $X = [0,\infty)$, $d\mu = t(e^{2\pi t}-1)^{-1}dt$, and $\varphi(z,t) = (z^2+t^2)^{-1}$ for $\int_0^{\infty}\frac{t\, dt}{(z^2+t^2)(e^{2\pi t}-1)}$; we let $X = [0,\infty)$, $d\mu = t(e^{2\pi t}-1)^{-1}dt$, and $\varphi(z,t) = \frac{\arctan(t/z)}{t}$ for $\int_0^{\infty}\frac{\arctan(t/z)}{e^{2\pi t}-1}\, dt$. In hindsight, the uniform convergence of the infinite integral in Titchmarsh [114, p.100, l.12] hints at the boundedness of φ in Rudin [99, p.246, Exercise 16] for most cases.
Remark. The modern method attacks the goal directly by using theorems flexibly. A complex measure need not distinguish a compact integral contour from a noncompact one. A single proof is good enough for dealing with both compact and noncompact cases. Furthermore, the proof is free from complex analysis except for using the definition of analytic functions. In contrast, the classical method must follow a formal, tedious, and inflexible procedure. In order to ensure the finiteness of a contour integral, the Borel measure [Rudin [99, p.49, l.10]] on $[0,\infty)$ must distinguish a compact integral contour from a noncompact one. In fact, in order to include the case of a noncompact integral contour, the modification and supplement have to use almost all the theorems in complex analysis [Titchmarsh [114, §2.8–§2.84]] and, thus, lead to unnecessary complications.
The proof of Rudin [99, p.246, Exercise 16] is simpler than that of the theorem given in Titchmarsh [114, p.99, l.2–l.9] because the former proof only uses Rudin [99, p.27, Theorem 1.34] and does not involve any of the unnecessary fuss given in Titchmarsh [114, §2.8–§2.84]. Consequently, the modern method is better than the classical method. However, the classical method is still extensively used by modern authors [Lebedev [72, p.240, l.1]; Guo–Wang [48, p.120, l.−9; p.121, l.6]; Lang [71, chap. XII]]. Perhaps these authors are not familiar with the modern method.

Remark 11. The proof of Guo–Wang [48, p.111, (2)] is incorrect because the rough estimate $O(|z|^{-2n})$ [Guo–Wang [48, p.111, l.9]] cannot attain the required accuracy $O(|z|^{-(2n+1)})$. For a correct proof, see Watson–Whittaker [122, p.251, l.−11–p.252, l.10].
Remark. $K_z \le \csc(2\Delta)$ [Watson–Whittaker [122, p.252, l.10]].
Proof. Let $z = x + iy$ and $\theta = \arg z$. Then $|\theta| \le \frac{\pi}{2} - \Delta$.
If $x^2 \ge y^2$, then $K_z = 1$ [Watson–Whittaker [122, p.252, l.−4]].
If $x^2 < y^2$, then $\frac{\pi}{4} < |\theta| \le \frac{\pi}{2} - \Delta$
$\Rightarrow \frac{\pi}{2} < 2|\theta| \le \pi - 2\Delta$
$\Rightarrow \left|\frac{2xy}{x^2+y^2}\right| = |\sin(2\theta)| \ge \sin(\pi - 2\Delta) = \sin(2\Delta)$.

Remark 12. (Determine $\arg(1 - t)$ on a contour around the branch point t = 1)
We need a method rather than correct results. A step based on a guess may lead to the desired result this time but fail the next time. For example, if the choice $\arg(-1) = \pi$ can lead to the desired result, we want to know why we cannot choose $\arg(-1) = -\pi$. Thus, if one provides correct results without a method, one may still make mistakes sometimes. Ten correct examples are not as good as one correct method. Only when a complete method is provided may we check if results are correct. When encountering a situation where a confusion may easily occur, we

104 should deliberately clarify the confusion rather avoid discussing it. Example. After the circuit (1+), arg(1 t)= 2π [Watson–Whittaker [122, p.257, l.1–l.2]]. − Proof. t = 1 + δ exp(is), where δ > 0, π s π. − ≤ ≤ 1 t = δ exp(is)= δ exp[i(s + s0)]. − − Before the circuit (1+), s = π and s + s0 = arg(1 t)= 0 [Watson–Whittaker [122, p.257, − − l.1]]. Hence s0 = π. After the circuit (1+), s = π. So arg(1 t)= s + s0 = 2π. − π 1 t Remark 13. ζ(s)= O[exp 2 t +( 2 σ it)log 1 s + iarctan 1 σ ]ζ(1 s) [Watson–Whittaker [122, { | | − − | − | −π } 1− t p.275, l.21]] should have been corrected as ζ(s)= O[exp t +( σ it)log 1 s +iarctan − ]ζ(1 { 2 | | 2 − − | − | 1 σ } − s). The phrase “, from the results already obtained for ζ(s,a),” [Watson–Whittaker [122, p.275,− l.22–l.23]] should have been crossed out. Remark 14. The equalities in Watson–Whittaker [122, p.276, l.7–l.9] follow from the equalities in Watson– 1/2 σ Whittaker [122, p.276, l.2–l.3; l.5] by multiplying the same factor O( t − ) as that used in proving the formula given in Watson–Whittaker [122, p.275, l. 11]. | | − Remark 15. ζ(s,a)= O( t τ(σ) log t ), where | | | | τ(σ)= 1 σ,(σ 0);τ(σ)= 1 ,(0 σ 1 );τ(σ)= 1 σ,( 1 σ 1);τ(σ)= 0(σ 1) 2 − ≤ 2 ≤ ≤ 2 − 2 ≤ ≤ ≥ [Watson–Whittaker [122, p.276, l.11–l.13]].

Proof. I. Let $\delta > 0$ be arbitrary. By Watson–Whittaker [122, p.275, l.3–l.13; p.276, l.7–l.9], we have
$\zeta(s,a) = \begin{cases} O(|t|^{1/2-\sigma}) & \text{if } \sigma \le -\delta \\ O(|t|^{1/2}\log|t|) & \text{if } -\delta \le \sigma \le \delta \\ O(|t|^{1/2}) \text{ or } O(|t|^{1-\sigma}) & \text{if } \delta \le \sigma \le 1-\delta \\ O(|t|^{1-\sigma}\log|t|) & \text{if } 1-\delta \le \sigma \le 1+\delta \\ O(1) & \text{if } \sigma \ge 1+\delta, \end{cases}$
where the O constants depend on δ.
II. Fix $\delta > 0$. We want to use the $O_\delta$ constants to construct a global O constant for the formula in Watson–Whittaker [122, p.276, l.11]. The old rule refers to taking the $O_\delta$ constant given in I; the new rule refers to taking τ(σ) according to the definitions given in Watson–Whittaker [122, p.276, l.13].
1. Case $-\delta < \sigma < 0$: $O_\delta(|t|^{1/2}\log|t|)$ (old); $O(|t|^{1/2-\sigma}\log|t|)$ (new). Consequently, the old rule can be replaced with the new one. That is, the old $O_\delta$ constant can be used as the new O constant.
2. Case $\sigma > \delta$: $O_\delta(|t|^{1/2})$ (old); $O(|t|^{1/2}\log|t|)$ (new).
3. Case $1-\delta < \sigma < 1$: $O_\delta(|t|^{1-\sigma}\log|t|)$ (old); $O(|t|^{1-\sigma}\log|t|)$ (new).
4. Case $1 < \sigma < 1+\delta$: $O_\delta(|t|^{1-\sigma}\log|t|)$ (old); $O(\log|t|)$ (new).
Remark. The definitions of τ(σ) on various segments are consistent.

Remark 16. When a = 1, this reduces to the formula found previously in §12·33 for $|\arg z| \le \frac{\pi}{2} - \Delta$ [Watson–Whittaker [122, p.278, l.−10–l.−9]].

Proof. φm′ +2 =(m + 2)φm+1(0) [Guo–Wang [48, p.3, (11)]] = φm+1(0) [Guo–Wang [48, p.4, (14)(iii)]]. r 1 Let m = 2r 1(r 1). Then φm 1(0)= φ2r(0)=( ) Br [Guo–Wang [48, p.2, (4)(iii)]]. − ≥ + − −

Remark. After the domain of $z$ extends from $|\arg z| \le \frac{\pi}{2} - \Delta$ to $|\arg z| \le \pi - \delta$, the accuracy weakens from $O(z^{-2n-1})$ [Watson–Whittaker [122, p.252, l.8]] to $O(z^{-2n-1+\eta})$ $(\eta > 0)$. In fact, if we replace the integral $\int_{-n-1/2-\infty i}^{-n-1/2+\infty i}$ [Watson–Whittaker [122, p.277, l.−7]] with $\int_{-n-(1-\eta)-\infty i}^{-n-(1-\eta)+\infty i}$ $(\eta > 0)$, then $O(z^{-n-1/2})$ given in Watson–Whittaker [122, p.278, l.−11] will become $O(z^{-n-1+\eta})$. For our case, $n$ should be replaced with $2n$.

Remark 17. If $\Re(c - a - b) > 0$, then $\lim_{n\to+\infty} u_n = 0$ [Watson–Whittaker [122, p.282, l.7–l.8]].
Proof. $\log\Gamma(a + n) = (n + a - 1/2)\log n - n + O(1)$ [Watson–Whittaker [122, p.279, l.4]].
$\log u_n = \log\Gamma(a + n) + \log\Gamma(b + n) - \log\Gamma(c - 1 + n) - \log\Gamma(1 + n) + \text{constant} = (a + b - c)\log n + O(1)$. Consequently, $u_n = O(n^{\Re(a+b-c)})$.
Remark 18. (Linear transformations of the hypergeometric function)
I. By comparing Watson–Whittaker [122, §14·3 & §14·4] with Guo–Wang [48, §4.3], we have the following results:
(A). The former considers the general equation of Fuchsian type having three regular singularities [Guo–Wang [48, p.68, (1)]], while the latter considers the standard hypergeometric equation [Watson–Whittaker [122, p.207, Example]]. It is sufficient for our purpose to consider the standard type. In addition, it is much simpler.
(B). The former lists 24 solutions first, and then keeps 6 of them by eliminating repetitions. Through this trial-and-error approach, Watson–Whittaker [122, §14·4] finally obtains three pairs of solutions, each pair corresponding to a regular singularity [Watson–Whittaker [122, p.286, l.−17–l.−16]]. The ineffective counting shows that we should redesign our counting plan to fit our needs. That is, we should use the correspondence between solution pairs and regular singularities as the guide to redesign our counting plan. This is exactly the approach of Guo–Wang [48, §4.3].
(C). Furthermore, Guo–Wang [48, p.141, (4) & (5)] can be derived from Guo–Wang [48, p.140, (2) & (3)] by inspection [Watson–Whittaker [122, p.207, (I) & (II)]]. We may establish a similar relationship between Guo–Wang [48, p.141, (6) & (7)] and Guo–Wang [48, p.140, (2) & (3)]. It would be more difficult to recognize the above simple relationships from Watson–Whittaker [122, §14·3 & §14·4].
II. By comparing Guo–Wang [48, §4.8] with Lebedev [72, §9.5], we have the following results:
(A). (Calculations vs. inspection) By $z' = 1 - z$, the hypergeometric equation [Lebedev [72, p.248, (9.5.4)]] is transformed to the hypergeometric equation with parameters $\alpha' = \alpha$, $\beta' = \beta$, and $\gamma' = 1 + \alpha + \beta - \gamma$ [Lebedev [72, p.248, l.17–l.19]]. This approach cannot find the solutions of the latter differential equation without awkward calculations. In contrast, if we express the solutions of the hypergeometric equation by Riemann’s P-equation [Watson–Whittaker [122, p.206, l.−7]], then the solution of the transformed hypergeometric equation can be found by inspection [Watson–Whittaker [122, p.207, (I) & (II)]]. Through Riemann’s P-equation, the transformation between two singularities [] can be viewed as the transformation between two hypergeometric equations with different parameters. The solutions there are all obtained by inspection.
(B). Lebedev [72, §9.5] shows that Lebedev [72, p.249, (9.5.8) & (9.5.9); p.250, (9.5.10)] all follow from Lebedev [72, p.249, (9.5.7); p.247, (9.5.1) & (9.5.2)]. Based on the list of linear transformations given in Lebedev [72, p.246, l.−14], the discussion given in Lebedev [72, §9.5] is complete. In contrast, Guo–Wang [48, §4.8] fails to discuss Lebedev [72, p.249, (9.5.8); p.250, (9.5.10)] and fails to establish the relationship between Guo–Wang [48, p.160, (4)] and

Guo–Wang [48, p.160, (8)].
(C). The formula given in Watson–Whittaker [122, p.289, l.3–l.5] and the one given in Watson–Whittaker [122, p.291, l.3–l.5] are proved the hard way because they both use the contour integrals of Barnes’ type [p.286, l.−7–p.287, l.3; p.289, l.−18–l.−17] and the residue theorem. Furthermore, some cases are difficult to handle [Watson–Whittaker [122, p.290, l.14–l.16]]. In fact, we can still prove Lebedev [72, p.247, (9.5.1) & (9.5.2); p.248, (9.5.4); pp.249–250, (9.5.7)–(9.5.10)] without using any integral representation. For example, the proof of Lebedev [72, p.247, (9.5.1)] can be replaced with the proof of Guo–Wang [48, p.143, (9)]; the proof of Lebedev [72, p.244, (9.3.4)] can be replaced with the proof given in Watson–Whittaker [122, §4·11].
Remark. Reversing the order of summation and integration can be justified by Rudin [99, p.150, Theorem 7.8] or Rudin [99, p.27, Theorem 1.34]. By proving the above statement both ways, we may see the close relationship between the Fubini theorem and Lebesgue’s dominated convergence theorem.
Remark 19. For the theorem about Barnes’ integral representation for the hypergeometric function [Watson–Whittaker [122, p.288, l.6–l.11]], the proof given in Guo–Wang [48, §4.6] is clearer than that given in Watson–Whittaker [122, §14·5]. By Watson–Whittaker [122, §13·6], we obtain
$\ln\Gamma(\lambda + s) = (s + \lambda - \frac12)\ln s - s + \frac12\ln(2\pi) + O(s^{-1+\eta})$,
where $\eta$ is an arbitrary small positive number. By the existence and uniqueness of the Laurent series, we have Guo–Wang [48, p.155, (6)]. In order to prove Guo–Wang [48, p.155, (7)], we do not need so accurate an estimate. The rough estimate $\ln\Gamma(\lambda + s) = (s + \lambda - \frac12)\ln s - s + \frac12\ln(2\pi) + O(1)$ is good enough. See Watson–Whittaker [122, p.279, l.4].
Remark 20. $\frac{\Gamma(a)\Gamma(b)}{\Gamma(c)}F(a,b;c;z) = \frac{\Gamma(a)\Gamma(a-b)}{\Gamma(a-c)}(-z)^{-a}F(a,1-c+a;1-b+a;z^{-1}) + \frac{\Gamma(b)\Gamma(b-a)}{\Gamma(b-c)}(-z)^{-b}F(b,1-c+b;1-a+b;z^{-1})$ [Watson–Whittaker [122, p.289, l.3–l.4]] should have been corrected as
$\frac{\Gamma(a)\Gamma(b)}{\Gamma(c)}F(a,b;c;z) = \frac{\Gamma(a)\Gamma(b-a)}{\Gamma(c-a)}(-z)^{-a}F(a,1-c+a;1-b+a;z^{-1}) + \frac{\Gamma(b)\Gamma(a-b)}{\Gamma(c-b)}(-z)^{-b}F(b,1-c+b;1-a+b;z^{-1})$.
Remark 21. The fact that the function $I$ is not analytic at $\infty$ [Watson–Whittaker [122, p.291, l.−6]] follows from Ince [59, p.161, l.14]. The generalized hypergeometric equation [Watson–Whittaker [122, p.291, l.−5]] refers to Guo–Wang [48, p.68, (1); l.−12] or the equation given in Watson–Whittaker [122, p.206, l.−13–l.−11].
Remark 22. $M_{k,m}(z) = z^{1/2+m}e^{-z/2}\left\{1 + \frac{1/2+m-k}{1!(2m+1)}z + \frac{(1/2+m-k)(3/2+m-k)}{2!(2m+1)(2m+2)}z^2 + \cdots\right\}$ and
$M_{k,-m}(z) = z^{1/2-m}e^{-z/2}\left\{1 + \frac{1/2-m-k}{1!(1-2m)}z + \frac{(1/2-m-k)(3/2-m-k)}{2!(1-2m)(2-2m)}z^2 + \cdots\right\}$ are two linearly independent solutions near $z = 0$ of
$\frac{d^2W}{dz^2} + \left\{-\frac14 + \frac{k}{z} + \frac{1/4 - m^2}{z^2}\right\}W = 0$ [Watson–Whittaker [122, p.337, l.−7–p.338, l.2]].
Proof. I. $F(\alpha,\gamma,z)$ and $z^{1-\gamma}F(\alpha-\gamma+1,2-\gamma,z)$ are two linearly independent solutions near $z = 0$ of
$z\frac{d^2y}{dz^2} + (\gamma - z)\frac{dy}{dz} - \alpha y = 0$ [Statement: Guo–Wang [48, p.297, (1), (2) & (3)]; proof: Lebedev [72, p.262, l.−7–p.263, l.9]].
II. By means of the transformation $y = e^{z/2}z^{-\gamma/2}w(z)$, the equation $z\frac{d^2y}{dz^2} + (\gamma - z)\frac{dy}{dz} - \alpha y = 0$ is transformed to
$w'' + \left[-\frac14 + \left(\frac{\gamma}{2} - \alpha\right)\frac1z + \frac{\gamma}{2}\left(1 - \frac{\gamma}{2}\right)\frac{1}{z^2}\right]w = 0$,
i.e., $w'' + \left[-\frac14 + \frac{k}{z} + \frac{1/4 - m^2}{z^2}\right]w = 0$ $(\gamma = 1 + 2m;\ \gamma/2 - \alpha = k)$.
$M_{k,m}(z) = e^{-z/2}z^{1/2+m}F(1/2 + m - k, 1 + 2m, z)$ [Guo–Wang [48, p.301, (5)]] and
$M_{k,-m}(z) = e^{-z/2}z^{1/2-m}F(1/2 - m - k, 1 - 2m, z)$ [Guo–Wang [48, p.301, (6)]] are two linearly independent solutions near $z = 0$ of the last differential equation.
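The statement of Remark 22 can be sanity-checked numerically. The following is only a sketch under stated assumptions: the helper names `kummer`, `whittaker_m` and `residual` are illustrative, the confluent series is truncated after a fixed number of terms, $z$ is taken real and positive, and $W''$ is approximated by a central difference, so the residual is small rather than exactly zero.

```python
import math

def kummer(a, c, z, terms=80):
    """Partial sum of F(a, c, z) = sum_n (a)_n / (c)_n * z**n / n! (truncated series)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # ratio of consecutive terms: (a+n)/(c+n) * z/(n+1)
        term *= (a + n) / (c + n) * z / (n + 1)
    return total

def whittaker_m(k, m, z):
    """M_{k,m}(z) = exp(-z/2) * z**(1/2+m) * F(1/2+m-k, 1+2m, z), for real z > 0."""
    return math.exp(-z / 2) * z ** (0.5 + m) * kummer(0.5 + m - k, 1 + 2 * m, z)

def residual(k, m, z, h=1e-3):
    """W'' + (-1/4 + k/z + (1/4 - m^2)/z^2) W, with W'' from a central difference."""
    w = lambda x: whittaker_m(k, m, x)
    second = (w(z + h) - 2 * w(z) + w(z - h)) / (h * h)
    return second + (-0.25 + k / z + (0.25 - m * m) / (z * z)) * w(z)

# Both M_{k,m} and M_{k,-m} should give a residual close to zero.
res_plus = residual(0.3, 0.2, 1.5)
res_minus = residual(0.3, -0.2, 1.5)
```

The sample values $k = 0.3$, $m = \pm 0.2$, $z = 1.5$ are arbitrary; any $m$ with $2m$ not an integer keeps both series well defined.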

Remark 1. Watson–Whittaker [122, p.338, l.3–l.10] gives a brief summary of the above proof. Watson–Whittaker [122, §16·1] shows that $M_{k,m}(z)$ and $M_{k,-m}(z)$ are solutions of Watson–Whittaker [122, p.337, (B)] by substitution. This approach has the shortcoming of losing the beautiful structure of the solution.
Remark 2. (Grasping the overall situation)
Hypergeometric functions and confluent hypergeometric functions are closely related. We must build as many paths between the two topics as possible. When we discuss confluent hypergeometric functions, of course, we have to include their characteristic properties. Furthermore, for each property, we should find its corresponding property in hypergeometric functions, treat the latter as a motivation of the former and use the latter to prove the former. Precisely because the circumstances are complicated, we should give a rigorous proof rather than touch on it lightly. Otherwise, the discussion is incomplete.
Sneddon [104, p.32, l.1–l.18] sets a good example for discussing confluent hypergeometric functions. It says that
I. By replacing $x$ with $x/\beta$ in Sneddon [104, p.23, (8.1)] (its formal solution is given by Watson–Whittaker [122, p.207, l.7]), $F(\alpha,\beta;\gamma;x/\beta)$ is a solution of $x\left(1 - \frac{x}{\beta}\right)\frac{d^2y}{dx^2} + \left\{\gamma - \left(1 + \frac{\alpha+1}{\beta}\right)x\right\}\frac{dy}{dx} - \alpha y = 0$.
II. By Hartman [51, pp.4–5, Theorem 2.4], $\lim_{\beta\to\infty}F(\alpha,\beta;\gamma;x/\beta)$ is a solution of $x\frac{d^2y}{dx^2} + (\gamma - x)\frac{dy}{dx} - \alpha y = 0$. Consequently, by the uniqueness of solution, we have
III. $\lim_{\beta\to\infty}F(\alpha,\beta;\gamma;x/\beta) = \sum_{r=0}^{\infty}\frac{(\alpha)_r}{(\gamma)_r}\cdot\frac{x^r}{r!}$.
By comparison, Lebedev [72, §9.9] only mentions III. However, its proof is incorrect: “a comparison of (9.1.2) and (9.1.3)” given in Lebedev [72, p.261, l.11] should have been replaced with “a comparison of (9.1.6) and (9.11.1)”. Guo–Wang [48, §6.1] only mentions the transformation from hypergeometric equation to confluent hypergeometric equation [Guo–Wang [48, p.297, l.1–l.7]]. Therefore, both discussions are incomplete. Watson–Whittaker [122, chap. XVI] is poorly written because it is almost independent of Watson–Whittaker [122, chap. XIV]. A better proof of III is given as follows:

(α)n(β)n x n (α)n(β)n 1 n Proof. For 0 n t < n + 1, let fβ (t)= ( ) ,g(t)= ( ) . ≤ ≤ n!(γ)n β n!(γ)n 2 Let x R and β 2R. Then | |≤ | |≥ f (t) g(t) and lim f (t)= (α)n xn(0 n t < n + 1). β β ∞ β n!(γ)n ≤∞ ∞ → ≤ ≤ Treat ∑n=0 as 0 and apply Rudin [99, p.27, Theorem 1.34] to this case. F(α,γ,z)= ezRF(γ α,γ, z) [Guo–Wang [48, p.298, (6)]] can be proved similarly. − − z z β z Proof. F(α,β,γ, β =(1 β )− F(β,γ α,γ, z β ) [Guo–Wang [48, p.143, (10)]]. − − − (β)n(γ α)n z n (β)n(γ α)n 1 n For 0 n t < n + 1, let fβ (t)= − ( ) and g(t)= − ( ) . Then ≤ ≤ n!(γ)n z β n!(γ)n 2 f (t) g(t) if z R and β 2R. − β ≤ | |≤ | |≥ n ( )n 1 Remark 23. The coefficient of z in the product of absolutely convergent series on the left is − F( + m n! 2 − k, n;2m + 1;1) [Watson–Whittaker [122, p.338, l. 17–l. 15]]. − − −

108 1 q 1 1 n ( +m k)q( n)q n ( ) ( +m k)qn!/(n q)! Proof. F( + m k, n;2m + 1;1)= ∑ 2 − − = ∑ − 2 − − . 2 − − q=0 q!(2m+1)! q=0 q!(2m+1)! Remark 24. (The Chu–Vandermonde identity) Kummer’s first formula is still true when m + 1/2 + k is an integer, by a slight modification of the analysis of §14 11 [Watson–Whittaker [122, p.338, l. 2– l. 1]]. · − − This footnote is misleading. One will not get the result by modifying the argument of §14 11. Instead, we prove the equivalence of the following three identities first: · (a). F( n,b;c;1)=(c b)n/(c)n [W. N. Bailey, Generalized Hypergeometric Series, New York: Stechert-Hafner− Service− Agency,1964, p.3, l. 8–l. 6]. n n − − (b). (x + a)n = ∑k=0 (x)k(a)n k. k − (c). x+a = ∑n x a . Then we can prove (a) without assuming the hypothesis ℜ(c a + n k=0 k n k − n) > 0. −    Proof. I. (a) (b). ⇔ n ( n)k(b)k Proof. (c b)n = ∑ − (c)n k=0 k!(c)k n n− k k = ∑k=0 ( b k+1)k(c+k 1)n k [( n)k =( ) (n k+1)k;( ) (b)k =( b k+1)k;(c)n/(c)k = k − − − − − − − − − − (c + k 1)n k]. −  − II. (b) (c). ⇔ n x a 1 n k x 1 n k a 1 n k k n 1 Proof. (c) − −n − = ∑k=0 −k− −n−k− because k =( ) −k− . Namely, Γ(n x a)⇔ n Γ(k x) Γ(n k a) − − Γ(n+1)−Γ(−x a)= ∑k=0 Γ(k+1)−Γ( x) Γ(n k+−1)−Γ( a) .    − − n n − − − ( x a)n = ∑k=0 ( x)k( a)n k. − − k − − − III. (c) follows from  the binomial theorem.

Proof. $\sum_n \binom{r+s}{n}x^n = (1 + x)^{r+s} = (1 + x)^r(1 + x)^s = \sum_k \binom{r}{k}x^k \sum_n \binom{s}{n-k}x^{n-k} = \sum_n\left[\sum_{k=0}^{n}\binom{r}{k}\binom{s}{n-k}\right]x^n$.
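Identity (c) can be confirmed on small cases by exact integer arithmetic. This is a minimal sketch, restricted to non-negative integer $r$ and $s$ (the range in which the binomial-theorem proof applies directly); the function name `vandermonde_rhs` is illustrative.

```python
from math import comb

def vandermonde_rhs(r, s, n):
    """Right-hand side of (c): sum over k of C(r, k) * C(s, n - k)."""
    return sum(comb(r, k) * comb(s, n - k) for k in range(n + 1))

# Compare both sides for all non-negative integers r, s < 8 and 0 <= n <= r + s.
checks = [
    comb(r + s, n) == vandermonde_rhs(r, s, n)
    for r in range(8)
    for s in range(8)
    for n in range(r + s + 1)
]
all_match = all(checks)
```

`math.comb(n, k)` returns 0 when `k > n`, which matches the convention that the corresponding binomial coefficients vanish.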

( 1 +m)( 3 +m) (n m+ 1 ) Γ( n+ 1 m)Γ( 1 ) 2 2 ··· − 2 − 2 − 2 Remark 25. n!(2m+1)(2m+2) (2m+n) 1 n 1 n Γ( 2 m 2 )Γ( 2 2 ) ··· 1 1− − − Γ( 2 m)Γ( 2 ) = − 1 n 1 n [Watson–Whittaker [122, p.339, l.7–l.8]] should have n!(2m+1)(2m+2) (2m+n)Γ( 2 m 2 )Γ( 2 2 ) been corrected as··· − − − ( 1 +m)( 3 +m) (n+m 1 ) Γ( n+ 1 m)Γ( 1 ) 2 2 ··· − 2 − 2 − 2 n!(2m+1)(2m+2) (2m+n) Γ( 1 m n )Γ( 1 n ) ··· 2 − − 2 2 − 2 ( )nΓ( 1 m)Γ( 1 ) = − 2 − 2 . n!(2m+1)(2m+2) (2m+n)Γ( 1 m n )Γ( 1 n ) ··· 2 − − 2 2 − 2 Remark 1. F(2α,2β;α + β + 1 ;x)= F(α,β;α + β + 1 ;4x(1 x)) [Watson–Whittaker [122, 2 2 − p.339, l.5]] follows from Lebedev [72, p.252, (9.6.8)]. Remark 2. The result can be extended to the whole domain arg(1 z) < π [Lebedev [72, | ± | p.253, l. 2–l. 1]]. This is because the analytic continuation of F(α,β;γ;z) can be extended to − − the plane cut along [1,∞] [Lebedev [72, p.241, l.5–l.6]]. 1 1 z/2 k (0+) k 1/2+m t k 1/2+m t Remark 26. Wk m(z)= Γ(k + m)e z ( t) (1 + ) e dt [Watson–Whittaker , − 2πi 2 − − ∞ − − − z − − [122, p.339, l. 13–l. 12]] follows from Watson–Whittaker [122, p.292, l. 15–l. 10] and Guo– R Wang [48, p.95,− l. 8].− − − − Remark. (Methodical solutions) The differential equation given in Watson–Whittaker [122,

109 p.291, l. 11–l. 7] belongs to a special type. The given solution is justified simply by substitu- tion [Watson–Whittaker− − [122, p.292, l. 15–l. 10]]. We do not know from where the integrand − − comes. The underdeveloped solution based on guess, luck, and trial-and-error such as Watson– Whittaker [122, p.339, l. 13–l. 12] cannot be considered a methodical solution. In contrast, the − − integral solution given in Guo–Wang [48, p.305, l.10–l.19; §6.4] is built by a systematic method which applies to the wider class of equations of Laplacian type [Guo–Wang [48, §2.13]]. In fact, the integrand and the path of integration [Guo–Wang [48, p.302, l.4–l.13]] can be specified by the Laplace transform. Consequently, the latter solution is more methodical than the former one. Remark 27. “k m + 1 / N 0 ” [Watson–Whittaker [122, p.343, l. 1–p.344, l.1]] should have been cor- ± 2 ∈ ∪ { } − rected as “k m 1 / N 0 ”. ± − 2 ∈ ∪ { } Remark 28. The formula given in Watson–Whittaker [122, p.344, l.4] follows from the equalities given in Guo–Wang [48, p.311, l.7–l.9]. m imφ Remark 29. By the first formula given in Watson–Whittaker [122, p.325, l. 3], the curves on which Pn (cosθ)e vanishes are n m parallels of latitude and 2m medians [Watson–Whittaker− [122, p.392, l. 4– − − l. 3]]. We may ignore the zeros of the factor (z2 1)m/2. − − Remark 30. When ω2/ω1 is rational, then the function reduces to a singly-periodic function; when ω2/ω1 is irrational, then the function reduces to a constant [Watson–Whittaker [122, p.429, l. 5–l. 2]]. For details, see Gonzalez´ [43, p.366, Theorem 5.1 & Theorem 5.2]. − − Remark 31. In consequence of Saks–Zygmund [101, p.341, (12.2)], if a meromorphic function f (z) has an essential singularity at z = a, so does 1/ f (z) [Watson–Whittaker [122, p.431, l.9]]. Remark 32. Before reading the proof [Watson–Whittaker [122, p.465, l. 12–l. 2]] of the theorem given in Watson–Whittaker [122, p.465, l. 14–l. 
13], we should read− the− penultimate paragraph of − − Watson–Whittaker [122, p.430] first. Simply put, if an integral contour passes through any pole of the integrand, the contour should be replaced with a congruent and similarly-situated one that does not.

Remark 33. If $a, b, a', b'$ are suitably chosen constants, each of the functions
$\frac{a\vartheta_1^2(z) + b\vartheta_4^2(z)}{\vartheta_2^2(z)}$, $\frac{a'\vartheta_1^2(z) + b'\vartheta_4^2(z)}{\vartheta_3^2(z)}$
is a doubly-periodic function (with periods $\pi$, $\pi\tau$) having at most a simple pole in each cell [Watson–Whittaker [122, p.466, l.15–l.19]]. For details, see Guo–Wang [48, p.504, l.2–l.3].
Remark 34. The statement given in Watson–Whittaker [122, p.468, l.1–l.2] means that
$2\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -1 & 1 & 1 & 1 \\ 1 & -1 & 1 & 1 \\ 1 & 1 & -1 & 1 \\ 1 & 1 & 1 & -1 \end{pmatrix}\begin{pmatrix} w' \\ x' \\ y' \\ z' \end{pmatrix}$.
Remark 35. Watson–Whittaker [122, p.468, the data in l.7–l.9, (i), (ii), (iii), (iv); p.469, the formula in l.6] can be effectively proved using Watson–Whittaker [122, p.464, Example 2] instead of definitions.
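The matrix relation in Remark 34 is reciprocal: writing $M$ for the $4\times4$ matrix with $-1$ on the diagonal and $1$ elsewhere, $M^2 = 4I$, so $2v = Mv'$ is equivalent to $2v' = Mv$. A short sketch that checks this with plain lists (the names `M`, `matmul` and `M2` are illustrative):

```python
# Matrix with -1 on the diagonal and 1 elsewhere.
M = [[-1 if i == j else 1 for j in range(4)] for i in range(4)]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

M2 = matmul(M, M)
four_identity = [[4 if i == j else 0 for j in range(4)] for i in range(4)]

# Round trip: start from primed variables, map to unprimed, then map back.
primed = [1.0, 2.0, -3.0, 0.5]
unprimed = [sum(M[i][j] * primed[j] for j in range(4)) / 2 for i in range(4)]
back = [sum(M[i][j] * unprimed[j] for j in range(4)) / 2 for i in range(4)]
```

Since $M^2 = 4I$, the vector `back` coincides with `primed`.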

Remark 36. [r]′(r = 1,2,3,4) is symmetric in w,x,y,z [Watson–Whittaker [122, p.468, l.14–l.15]].

Proof. By the equations given in Watson–Whittaker [122, p.467, l. 5–l. 2], switching x,y is − − equivalent to switching x′,y′. Remark 37. The formula given in Watson–Whittaker [122, p.470, Example] follows from Watson–Whittaker [122, p.467, Corollary].

Remark 38. $\vartheta_3'(z) = \vartheta_3(z)\left[\sum_{n=1}^{\infty}\frac{2iq^{2n-1}e^{2iz}}{1 + q^{2n-1}e^{2iz}} - \sum_{n=1}^{\infty}\frac{2iq^{2n-1}e^{-2iz}}{1 + q^{2n-1}e^{-2iz}}\right]$ [Watson–Whittaker [122, p.471, l.6]].
Proof. $\vartheta_3(z) = G\prod_{n=1}^{\infty}(1 + 2q^{2n-1}\cos 2z + q^{4n-2})$ [Watson–Whittaker [122, p.469, l.−2]].
$\log\vartheta_3(z) = \log G + \sum_{n=1}^{\infty}\log(1 + 2q^{2n-1}\cos 2z + q^{4n-2})$.
By Watson–Whittaker [122, p.33, l.3–l.4], $\sum_{n=1}^{\infty}|\log(1 + 2q^{2n-1}\cos 2z + q^{4n-2})|$ converges uniformly on compact subsets of $\mathbb{C}\setminus\{\text{the zeros of } \vartheta_3(z)\}$.
$1 + 2q^{2n-1}\cos 2z + q^{4n-2} = (1 + q^{2n-1}e^{2iz})(1 + q^{2n-1}e^{-2iz})$.
The desired result follows from Rudin [99, p.230, Theorem 10.28].
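The formula of Remark 38 can be checked numerically by comparing the termwise derivative of the Fourier series $\vartheta_3(z) = 1 + 2\sum_{n\ge1} q^{n^2}\cos 2nz$ with $\vartheta_3(z)$ times the series in brackets. The sketch below assumes a real nome $0 < q < 1$ and truncates all series; the function names are illustrative.

```python
import cmath

def theta3(z, q, terms=60):
    """Truncated Fourier series 1 + 2 * sum q^(n^2) cos(2 n z)."""
    return 1 + 2 * sum(q ** (n * n) * cmath.cos(2 * n * z) for n in range(1, terms))

def theta3_prime(z, q, terms=60):
    """Termwise derivative of the truncated Fourier series."""
    return -4 * sum(n * q ** (n * n) * cmath.sin(2 * n * z) for n in range(1, terms))

def log_derivative_series(z, q, terms=60):
    """Bracketed series of Remark 38, i.e. the logarithmic derivative of the product."""
    e_plus, e_minus = cmath.exp(2j * z), cmath.exp(-2j * z)
    total = 0j
    for n in range(1, terms):
        a = q ** (2 * n - 1)
        total += 2j * a * e_plus / (1 + a * e_plus) - 2j * a * e_minus / (1 + a * e_minus)
    return total

q, z = 0.3, 0.4 + 0.1j
gap = abs(theta3_prime(z, q) - theta3(z, q) * log_derivative_series(z, q))
```

The constant $G$ drops out of the logarithmic derivative, so it does not need to be computed.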

Remark. (The Taylor series vs. L’Hôpital’s rule in terms of convergence) Sometimes, only after studying advanced mathematics may we understand how we should properly deal with elementary mathematics. In order to study infinite products of analytic functions, we must master the concept of uniform convergence. Thus, it is important to see how the Taylor series and L’Hôpital’s rule affect convergence. Among proofs for the case of pointwise convergence, we should select the ones applicable to the case of uniform convergence. Watson–Whittaker [122, p.33, l.1–l.7] shows that the absolute convergence of $\sum\log(1 + a_n)$ is equivalent to that of $\sum a_n$ using the Taylor series. The proof is applicable to the case of uniform convergence. The section “Convergence criteria” of https://en.wikipedia.org/wiki/Infinite_product proves that the convergence of $\sum\log(1 + a_n)$ is equivalent to that of $\sum a_n$ using L’Hôpital’s rule. The proof is not applicable to the case of uniform convergence because $\lim_{z\to a}\frac{f(z)}{g(z)} = \frac{f'(a)}{g'(a)}$ refers to a single point $z = a$.

8. Watson [123]

Remark 1. $A_n = 2J_n(n\varepsilon)/n$, $B_n = -2(\varepsilon/n)J_n'(n\varepsilon)$ [Watson [123, p.6, l.−17]].
Proof. I. See p.7, l.7 in http://www.murison.alpheratz.net/Maple/KeplerSolve/KeplerSolve.pdf.
II. Let $-\varepsilon\cos E = \sum_{n=0}^{\infty}B_n\cos(nM)$.
$B_0 = -\frac{1}{\pi}\int_0^{\pi}\varepsilon\cos E\,dM$
$= -\frac{1}{\pi}\int_0^{\pi}\varepsilon\cos E\,(dE - \varepsilon\cos E\,dE)$ (because $M = E - \varepsilon\sin E$)
$= \varepsilon^2/2$.
Assume $n \ne 0$.
$B_n = -\frac{2}{\pi}\int_0^{\pi}\varepsilon\cos E\cos nM\,dM$
$= -\frac{2}{\pi}\int_0^{\pi}\frac{\varepsilon\sin nM}{n}\sin E\,dE$ [integration by parts]
$= -\frac{2\varepsilon}{n\pi}\int_0^{\pi}\sin n(E - \varepsilon\sin E)\sin E\,dE$
$= -2(\varepsilon/n)J_n'(n\varepsilon)$ [Watson [123, p.19, l.−6]].
Remark 2. $\varepsilon^2\left(\frac{du}{d\varepsilon} + u^2\right) + \varepsilon u - n^2(1 - \varepsilon^2) = 0$ [Watson [123, p.7, l.4]].
Remark. Let $A_n = e^{\int u\,d\varepsilon}$ and let $L_1(A_n) = 0$ be a second order differential equation. Then $L_1(A_n) = e^{\int u\,d\varepsilon}L_2(u)$, where $L_2(u) = 0$ is a first order differential equation.
Remark 3. Since $z = z(h, \mu)$, $\frac{\partial z}{\partial\mu}$ is meaningful [Watson [123, p.28, l.−14]].
Remark 4. The singularities of $z$ qua function of $h$ are at $h = e^{\pm i\theta}$.
Proof. $h = 0$ is a singularity of $\log h$.
$h = 0$ is a singularity of $e^{\frac12\log h} = \sqrt{h}$.
$h = e^{\pm i\theta}$ are singularities of $z = \frac{1 - \sqrt{(h - e^{i\theta})(h - e^{-i\theta})}}{h}$.

111 n n 1 ∞ h d − 1 cosθ Remark 5. ∑n=1 n! n 1 [ f (µ) φ ′(µ)] converges in h < −| 2 | [Watson [123, p.28, l. 15]]. dµ − { } | | − Proof. By Watson–Whittaker [122, p.133, l.3–l.4], µ = cosθ < 1. 1 cosθ | | | | Take h < −| | . Then | | 2 h√1 z2 2 h < 1 cosθ z cosθ on z = 1. | − |≤ | | −| |≤| − | | | Hence the hypothesis given in Watson–Whittaker [122, p.133, l.5] is satisfied.

Remark 6. $z = \frac{1 - \sqrt{1 - 2\mu h + h^2}}{h}$ [Watson [123, p.28, l.−8]].
Proof. Solve $z = \mu - \frac{h}{2}(1 - z^2)$ and obtain $z = \frac{1 \pm \sqrt{1 - 2\mu h + h^2}}{h}$.
Consider $0 < h < 1$.
$1 - 2\mu h + h^2 > 1 - 2h + h^2 \Rightarrow 1 + \sqrt{1 - 2\mu h + h^2} > 1 - (1 - h) = h \Rightarrow z > 1$ for the plus sign.
The last inequality contradicts the statement given in Watson [123, p.28, l.−9]. By Watson–Whittaker [122, p.133, l.8], we should take the minus sign.
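The choice of sign in Remark 6 can be illustrated numerically. The sketch below assumes real $\mu$ with $|\mu| < 1$ and $0 < h < 1$; the names `z_minus` and `z_plus` are illustrative. Both roots satisfy the quadratic relation, but only the minus root tends to $\mu$ as $h \to 0$, while the plus root exceeds 1.

```python
import math

def z_minus(mu, h):
    """Root with the minus sign: (1 - sqrt(1 - 2*mu*h + h^2)) / h."""
    return (1 - math.sqrt(1 - 2 * mu * h + h * h)) / h

def z_plus(mu, h):
    """Root with the plus sign: (1 + sqrt(1 - 2*mu*h + h^2)) / h."""
    return (1 + math.sqrt(1 - 2 * mu * h + h * h)) / h

def relation_gap(z, mu, h):
    """Distance from satisfying z = mu - (h/2) * (1 - z^2)."""
    return abs(z - (mu - h / 2 * (1 - z * z)))

mu, h = 0.3, 0.2
gap_minus = relation_gap(z_minus(mu, h), mu, h)
gap_plus = relation_gap(z_plus(mu, h), mu, h)
small_h_value = z_minus(mu, 1e-6)  # should be close to mu
```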

Remark 7. iθ iθ n k e e 1 2n 2 − 2n sin2n θ =( − − )2n = + cos((2n 2k)θ) 2i 22n n 22n ∑ k −   k=0   [Watson [123, p.29, l.9]]. ∞ m n+2m m Remark 8. Jn(t)= ∑ ( ) amt [( ) J0(t)] [Watson [123, p.29, l. 11]]. m=0 − − − Proof. If n is odd, then 1 π 2 2 (n 1) 2 Jn(t)= π ( 1) − 0 cosnxsin(t cosx)dx [Watson [123, p.21, (8)]] π − 1 ∞ 1 ma tn+2m 1 m cos cosx dx [Watson [123, p.29, l.3]] = π 0 ∑m=0( ) mR ( ) ( ) ∞ m − n+2m m − = ∑ ( ) amt [( ) J0(t)] [Watson [123, p.21, (9)]]. mR=0 − − Remark 9. The change of order of summation and integration given in Watson [123, p.30, l.10] can be justified by Rudin [97, p.137, Theorem 7.14]. Remark 10. The integrand is unaffected if χ is increased by π while ψ is simultaneously decreased by π [Watson [123, p.31, l. 4–l. 3]]. The geometric meaning of this statement is shown in Figure − − 1.1. ψ ψ


Figure 1.1: Change of the domain of integration

Remark 11. $\cos 2n\theta = \sum_{s=0}^{n}(-1)^s\frac{n\cdot(n+s-1)!}{(n-s)!(2s)!}(2\sin\theta)^{2s}$ [Watson [123, p.33, l.17]].

112 n k 2n 2 n k 2k Proof. cos2nθ = ∑k=0( 1) 2k (1 sin θ) − sin θ [Hobson [58, p.48, (29)]]. − −s 2s s 2n n k Consequently, the coefficient of ( 1) sin θ is ∑k=0 2k s−k 2s 1− 2n 1 − s n(n 1)(n 2) (n s+1) s 2− 2− = 2 −1 3 5− (2···s 1−) ∑k=0 s k k   · · ··· − − s n(n 1)(n 2) (n s+1) n+s 1 = 2 −1 3 5− (2···s 1−) s− because  p+q · s· ··· p− q s = ∑k=0 k s k .  −  1 n  s (2n+1) (n+s)! 2s+1 Remark 12. sin(2n + 1)θ = 2 ∑s=0( ) (n s)!(·2s+1)! (2sinθ) [Watson [123, p.33, l.18]]. − − Proof. sin(2n + 1)θ = ∑n ( 1)k 2n+1 (1 sin2 θ)n k sin2k+1 θ [Hobson [58, p.48, (28)]]. k=0 − 2k+1 − − Consequently, the coefficient of ( 1)s sin2s+1 θ is s 2n+1 n k  ∑k=0 2k+1 s−k − 2s+1 2n 1 − s (2n+1)n(n 1)(n 2) (n s+1) s 2 2− = 2 1 −3 5 (2−s+1···) − ∑k=0 s k k   · · ··· − s (2n+1)n(n 1)(n 2) (n s+1) n+s = 2 1 −3 5 (2−s+1···) − s .   · · ··· Remark 13. Mistakes in a textbook can determine  a reader’s academic level: inability to discover them, ability to discover them, or ability to correct them. The argument given in Watson [123, p.35, l.6-l.7] should have been corrected as follows:

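Correction. By Watson [123, p.16, (4)], $|J_n(z)| \le \frac{|z/2|^n}{n!}\exp\left(\frac{|z|^2}{4}\right)$.
$\sum_{n=0}^{\infty}\frac{(m+n)!}{n!}|J_{m+2n}(z)| + \sum_{n=0}^{\infty}\frac{(m+n)!}{n!}|J_{m+2n+2}(z)|$
$\le \left|\frac{z}{2}\right|^m\exp\left(\frac{|z|^2}{4}\right)\sum_{n=0}^{\infty}\frac{|z/2|^{2n}}{n!} + \left|\frac{z}{2}\right|^{m+2}\exp\left(\frac{|z|^2}{4}\right)\sum_{n=0}^{\infty}\frac{|z/2|^{2n}}{n!}$.
Remark 14. Watson [123, p.35, l.13–l.16]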

Proof.

z m! (0+) ( )m = um 1 exp(z/u)du [express exp in series] 2 2m+1πi − Z m! (0+) z 1 dum = exp (t 1/t) dt 2m+1πi {−2 − }· m dt Z (0+) ∞ 1 z (m + 2n) (m + n 1)! m+2n 1 = exp (t 1/t) · − t − dt[the binomial theorem] 2πi {−2 − } ∑ n! Z n=0 ∞ (m + 2n) (m + n 1)! = ∑ · − Jm+2n(z) [Watson [123, p.14, (1)]]. n=0 n!

p (z/2)2m 2m k 2p p 2m (2p) Remark 15. ∑ ∑ ( ) 2mCk(2m 2k) = 2∑ z P [Watson [123, p.36, l.11–l.12]]. m=0 (2m)! k=0 − − m=0 2m Proof.

2m k 2p m 1 k 2p ( 1) 2mCk(2m 2k) − ( 1) 2mCk(2m 2k) ∑ − 2m − = ∑ − 2m − k=m+1 2 (2m)! k=0 2 (2m)! (2p) = P2m [Watson [123, p.36, l.3]].

Remark 16. The justification for the term-by-term differentiation given in Watson [123, p.37, l.9]

Proof. By Watson [123, p.34, l.17–l.19], the series given in Watson [123, p.36, l.−5] converges uniformly on compact subsets. Since the integrand of the integral on the left side of the equality given in Watson [123, p.36, l.−3] is nonnegative, the series on the right side converges uniformly on compact subsets. Then use Rudin [99, p.230, Theorem 10.28].

Remark 17. By the formula given in Coddington–Levinson [16, p.28, l.16–l.17], $\frac{d}{dz}[zW\{J_\nu(z), J_{-\nu}(z)\}] = 0$ [Watson [123, p.42, l.−6]].
Remark 18. $W\{J_\nu(z), J_{-\nu}(z)\} = -\frac{2\sin\nu\pi}{\pi z}$ [Watson [123, p.43, (2)]].
Proof. If we trace back the definition of $O(z^2)$, in this context it represents a convergent series of the form $A_2z^2 + A_3z^3 + \cdots$.
Remark 19. The integral given in Watson [123, p.48, (7)] is meaningful only when $\Re\nu > \frac12$ [Watson [123, p.48, l.−1]].
Proof.
$\int_0^{\varepsilon}|\sin(z\cos\theta)\sin^{2\nu-2}\theta\cos\theta|\,d\theta = O\left(\int_0^{\varepsilon}|\sin\theta|^{2\Re\nu-2}\,d\theta\right)$
$= O\left(\int_0^{\varepsilon}\theta^{2\Re\nu-2}\,d\theta\right)$
$= O\left(\left.\frac{\theta^{2\Re\nu-1}}{2\Re\nu-1}\right|_0^{\varepsilon}\right)$
$< \infty$ (since $\Re\nu > \frac12 \Rightarrow 2\Re\nu > 1$).

Remark 20. $(n + 1)C^{\nu}_{n+1}(t) = (2\nu + n)tC^{\nu}_n(t) - (1 - t^2)\frac{dC^{\nu}_n(t)}{dt}$ [Watson [123, p.50, l.−9]].
Proof. By Guo–Wang [48, p.275, (7)], it suffices to prove
$(2\nu + n)tC^{\nu}_n(t) - (n + 1)C^{\nu}_{n+1}(t) = (2\nu + n - 1)C^{\nu}_{n-1}(t) - ntC^{\nu}_n(t)$.
Namely, $(2\nu + 2n)tC^{\nu}_n(t) = (n + 1)C^{\nu}_{n+1}(t) + (2\nu + n - 1)C^{\nu}_{n-1}(t)$.
This equality follows from Guo–Wang [48, p.274, (3)].
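The recurrence of Remark 20 can be tested numerically. The sketch below builds $C^{\nu}_n(t)$ from the standard three-term recurrence $(k+1)C^{\nu}_{k+1} = 2(k+\nu)tC^{\nu}_k - (k+2\nu-1)C^{\nu}_{k-1}$ with $C^{\nu}_0 = 1$, $C^{\nu}_1 = 2\nu t$ (assumed here to be the relation cited as Guo–Wang (3)), and approximates the derivative by a central difference; the function names are illustrative.

```python
def gegenbauer(n, nu, t):
    """C_n^nu(t) via (k+1) C_{k+1} = 2(k+nu) t C_k - (k+2nu-1) C_{k-1}."""
    prev, cur = 1.0, 2 * nu * t
    if n == 0:
        return prev
    for k in range(1, n):
        prev, cur = cur, (2 * (k + nu) * t * cur - (k + 2 * nu - 1) * prev) / (k + 1)
    return cur

def derivative(n, nu, t, h=1e-5):
    """Central-difference approximation of dC_n^nu/dt."""
    return (gegenbauer(n, nu, t + h) - gegenbauer(n, nu, t - h)) / (2 * h)

def identity_gap(n, nu, t):
    """|(n+1) C_{n+1} - [(2nu+n) t C_n - (1 - t^2) C_n']|."""
    lhs = (n + 1) * gegenbauer(n + 1, nu, t)
    rhs = (2 * nu + n) * t * gegenbauer(n, nu, t) - (1 - t * t) * derivative(n, nu, t)
    return abs(lhs - rhs)

worst_gap = max(identity_gap(n, 0.7, 0.3) for n in range(6))
```

The sample values $\nu = 0.7$, $t = 0.3$ are arbitrary; the gap is limited only by the finite-difference error.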

Remark. By the first formula given in Watson–Whittaker [122, p.330, l.3],
$(n + 1)C^{\nu}_{n+1}(t) = (2\nu + n)tC^{\nu}_n(t) - (1 - t^2)(2\nu)C^{\nu+1}_{n-1}(t)$. Therefore,
$nC^{\nu}_n(z) = (n - 1 + 2\nu)zC^{\nu}_{n-1}(z) - 2\nu(1 - z^2)C^{\nu-1}_{n-2}(z)$ [the second formula given in Watson–Whittaker [122, p.330, l.3]] should have been corrected as
$nC^{\nu}_n(z) = (n - 1 + 2\nu)zC^{\nu}_{n-1}(z) - 2\nu(1 - z^2)C^{\nu+1}_{n-2}(z)$.
Remark 21. Gegenbauer’s generalization of Poisson’s integral [Watson [123, §3.32]]
The proof of Watson [123, p.50, (3)] given in Watson [123, p.50, l.11–l.18] requires the prerequisite given in Watson–Whittaker [122, §15.8]. It is difficult to locate its key step among the complicated formulas in Watson–Whittaker [122, §15.8]. In contrast, the following proof by induction highlights its key step: the recurrence formula $(n + 1)C^{\nu}_{n+1}(t) = (2\nu + n)tC^{\nu}_n(t) - (1 - t^2)\frac{dC^{\nu}_n(t)}{dt}$.

Proof. For $n = 0$, Watson [123, p.50, (3)] reduces to Watson [123, p.48, (4)].
$\frac{(-i)^{n+1}\Gamma(2\nu)(n+1)!\,(z/2)^{\nu}}{\Gamma(\nu+\frac12)\Gamma(\frac12)\Gamma(2\nu+n+1)}\int_{-1}^{1}e^{izt}(1 - t^2)^{\nu-\frac12}C^{\nu}_{n+1}(t)\,dt$
$= \frac{(-i)^{n+1}\Gamma(2\nu)\,n!\,(z/2)^{\nu}}{\Gamma(\nu+\frac12)\Gamma(\frac12)\Gamma(2\nu+n)}\int_{-1}^{1}e^{izt}(1 - t^2)^{\nu-\frac12}\,tC^{\nu}_n(t)\,dt - \frac{(-i)^{n+1}\Gamma(2\nu)\,n!\,(z/2)^{\nu}}{\Gamma(\nu+\frac12)\Gamma(\frac12)\Gamma(2\nu+n+1)}\int_{-1}^{1}e^{izt}(1 - t^2)^{\nu+\frac12}\frac{dC^{\nu}_n(t)}{dt}\,dt$ [Watson [123, p.50, l.−9]]
$= -J'_{\nu+n}(z) + \frac{(\nu+n)J_{\nu+n}(z)}{z}$ [Watson–Whittaker [122, p.330, l.3, the first equation]; induction hypothesis]
$= J_{\nu+n+1}(z)$ [Watson [123, p.45, (1) & (2)]].
Remark. It is troublesome to find a starting point or to go through details if we try to prove a complicated theorem from scratch. In particular, it is difficult for a reader who follows this kind of proof to recognize its essence. A good beginning is half the battle. The initial step of mathematical induction often provides such an appropriate starting point.
Remark 22. $\iint_{m\ge0}e^{i\varpi n}m^{2\nu-1}\,d\omega = \iint_{n\ge0}e^{i\varpi l}n^{2\nu-1}\,d\omega$ [Watson [123, p.51, l.−11]].
Proof. Suppose we change the triple $(l, m, n)$ in cyclic order. If $l$ is replaced with $n$, then $n$ should be replaced with $m$. If we keep units the same, then the area element $d\omega$ is invariant under the change of axis labels, as shown in Figure 1.2.

Figure 1.2: dω is invariant under a cyclic permutation of axis labels

Remark 23. $\left[\frac{d^r(1 - t^2)^n}{dt^r}\right]_{t=1} = (-1)^n\cdot{}_rC_n\cdot n!\cdot\frac{n!\,2^{2n-r}}{(2n-r)!}$ [Watson [123, p.53, l.−12]].
Proof. $(uv)^{(r)}(t) = \sum_{k=0}^{r}\binom{r}{k}u^{(r-k)}(t)v^{(k)}(t)$ (the Leibniz rule for higher derivatives).
Let $u = (1 + t)^n$ and $v = (1 - t)^n$.
For $t = 1$, one need only consider the term $k = n$.
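The closed form in Remark 23 can be verified with exact integer arithmetic by differentiating the polynomial $(1 - t^2)^n$ directly. This is a sketch for $n \le r \le 2n$, reading the formula as $(-1)^n\binom{r}{n}\,n!\,\frac{n!\,2^{2n-r}}{(2n-r)!}$; the helper names are illustrative.

```python
from math import comb, factorial

def coefficients(n):
    """Coefficients (lowest degree first) of (1 - t^2)^n."""
    coeffs = [0] * (2 * n + 1)
    for j in range(n + 1):
        coeffs[2 * j] = (-1) ** j * comb(n, j)
    return coeffs

def derivative_at_one(coeffs, r):
    """Exact r-th derivative of the polynomial evaluated at t = 1."""
    return sum(
        c * (factorial(d) // factorial(d - r))
        for d, c in enumerate(coeffs)
        if d >= r
    )

def closed_form(n, r):
    """(-1)^n * C(r, n) * n! * n! * 2^(2n - r) / (2n - r)!, valid for n <= r <= 2n."""
    return (-1) ** n * comb(r, n) * factorial(n) * (factorial(n) // factorial(2 * n - r)) * 2 ** (2 * n - r)

all_equal = all(
    derivative_at_one(coefficients(n), r) == closed_form(n, r)
    for n in range(1, 7)
    for r in range(n, 2 * n + 1)
)
```

For $r < n$ every Leibniz term still carries a positive power of $(1 - t)$, so the derivative at $t = 1$ is zero.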

Remark 24. The equality given in Watson [123, p.54, l.12] follows from Coddington–Levinson [16, p.89, Theorem 6.5].
Remark 25. $\psi(m + 1) = \frac11 + \frac12 + \cdots + \frac1m - \gamma$ [Watson [123, p.60, l.−4]].
Incorrect proof.
$\psi(m + 1) = -\gamma + \sum_{n=1}^{\infty}\left(\frac1n - \frac{1}{m+n}\right)$ [Guo–Wang [48, p.108, (10)]]
$= -\gamma + \sum_{n=1}^{\infty}\frac1n - \sum_{n=m+1}^{\infty}\frac1n$ (incorrect step).

Correct proof.

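$\Gamma(z + 1) = z\Gamma(z)$
$\Rightarrow \Gamma'(z + 1) = \Gamma(z) + z\Gamma'(z)$
$\Rightarrow \frac{\Gamma'(z + 1)}{\Gamma(z)} = 1 + z\frac{\Gamma'(z)}{\Gamma(z)}$
$\Rightarrow \frac{\Gamma'(z + 1)}{\Gamma(z + 1)} = \frac1z + \frac{\Gamma'(z)}{\Gamma(z)}$.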
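The recurrence $\psi(z + 1) = \frac1z + \psi(z)$ and the resulting formula of Remark 25 can be checked numerically. The Python standard library has no digamma function, so the sketch below approximates $\psi$ by a central difference of `math.lgamma`; this approximation, the hard-coded value of Euler's constant, and the helper names are assumptions of the example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for the check

def digamma(x, h=1e-5):
    """Central-difference approximation of psi(x) = d/dx ln Gamma(x), for x > h."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def harmonic(m):
    """H_m = 1/1 + 1/2 + ... + 1/m."""
    return sum(1.0 / j for j in range(1, m + 1))

# Remark 25: psi(m + 1) = H_m - gamma for positive integers m.
worst_gap = max(
    abs(digamma(m + 1) - (harmonic(m) - EULER_GAMMA)) for m in range(1, 11)
)

# The recurrence used in the correct proof, at a non-integer point.
recurrence_gap = abs(digamma(3.5) - digamma(2.5) - 1 / 2.5)
```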

Remark 26. We may generalize Euler’s solutions to the two differential equations given in Watson [123, p.62, l. 8–l. 2] as follows: − − Let y = x1/2u and z = 2na1/2x1/(2n). Then the following two differential equations are equivalent: 3/2 d2y (n 2)/(2n) x dx2 + ax − y = 0; 2 z2 d u + z du +(z2 n2)u = 0. dz2 dz − Remark 27. Jν (z) and Y ν (z) approach their limits Jn(z) and Y n(z), as ν n, uniformly with respect to z, → except in the neighborhood of z = 0, where n is any integer, positive or negative [Watson [123, p.63, l. 13–l. 12]]. − − Proof. For Jν (z), read Watson [123, p.44, l. 16–l. 15] and Fritzsche–Grauert [37, p.15, Theo- − − rem 3.8; p.5, l.2–l.7]. For Y ν (z), read Watson [123, p.63, l.12–l.14].

Remark 28. By Watson [123, p.66, l.−13–l.−12] and Rudin [99, p.230, Theorem 10.28], $Y_\nu$ and its derivatives are continuous functions of $\nu$ [Watson [123, p.66, l.−10]].
Remark 29. By Watson [123, p.16, (4)] and Rudin [99, p.230, Theorem 10.28], the order of the operations $\Sigma$ and $\nabla_0$ in the equality given in Watson [123, p.67, l.17] can be interchanged and $\sum_{n=1}^{\infty}(-1)^{n-1}\frac{J_{2n}(z)}{n}$ is an analytic function of $z$ near the origin [Watson [123, p.67, l.−12–l.−11]].
Remark 30. $J_0(z)$ and $Y^{(0)}(z)$ form a fundamental system of solutions [Watson [123, p.67, l.−11]].
Proof. Assume $Y^{(0)} = cJ_0(z)$.
$Y^{(0)} - J_0(z)\log z \to 0$ as $z \to 0$ [Watson [123, p.67, (1)]]. Then
$(c - \log z)J_0(z) \to 0$ as $z \to 0$, a contradiction.
Remark 31. $Y^{(0)} = \frac12\mathbf{Y}_0(z) + (\log 2 - \gamma)J_0(z)$ [Watson [123, p.67, (2)]].
Proof. $\frac12\mathbf{Y}_0(z) = (\gamma + \log\frac{z}{2})J_0(z) + o(1)$ as $z \to 0$ [Watson [123, p.60, (3)]].
$Y^{(0)} = J_0(z)\log z + o(1)$ as $z \to 0$ [Watson [123, p.67, (1)]]. Then
$F(z) = Y^{(0)} - \frac12\mathbf{Y}_0(z) + (\gamma - \log 2)J_0(z) = o(1)$ as $z \to 0$.
Since $\nabla_0(F) = 0$, $F(z) = A\mathbf{Y}_0(z) + BJ_0(z)$.
Since $F(z) \to 0$ as $z \to 0$, $A = B = 0$ [Watson [123, p.60, (3)]].

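Remark 32. $\sum_{n=1}^{\infty}\frac{\cos 2n\theta}{n} = -\frac12\ln(4\sin^2\theta)$ [Watson [123, p.68, l.17]].
Proof. I. $-\ln(1 - z) = \sum_{n=1}^{\infty}\frac{z^n}{n}$, where $z \in \bar{B}(0,1)\setminus\{1\}$.
Proof. $(1 - z)\sum_{n=1}^{m}\frac{z^n}{n} = z - \sum_{n=2}^{m}\frac{z^n}{n(n-1)} - \frac{z^{m+1}}{m}$ converges uniformly on $\bar{B}(0,1)$.
Hence $\sum_{n=1}^{m}\frac{z^n}{n}$ converges uniformly on $\bar{B}(0,1)\setminus B(1,\delta)$.
II. Substitute $z = e^{2i\theta}$ into I and then take the real part: $-\ln|1 - e^{2i\theta}| = -\ln(2|\sin\theta|)$.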
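Step II of the proof of Remark 32 gives $\sum_{n\ge1}\frac{\cos 2n\theta}{n} = -\ln(2|\sin\theta|)$, equivalently $-\frac12\ln(4\sin^2\theta)$. Because the series converges only conditionally, a numerical check needs many terms; the sketch below uses a plain partial sum with an arbitrary cutoff, and the function name `partial_sum` is illustrative.

```python
import math

def partial_sum(theta, terms):
    """Partial sum of cos(2 n theta) / n for n = 1 .. terms."""
    return sum(math.cos(2 * n * theta) / n for n in range(1, terms + 1))

theta = 0.7
approx = partial_sum(theta, 200_000)
exact = -0.5 * math.log(4 * math.sin(theta) ** 2)
gap = abs(approx - exact)
```

The tail after $N$ terms is of order $1/(N|\sin\theta|)$, so the agreement degrades as $\theta$ approaches a multiple of $\pi$.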

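Remark 33. $\int_0^{\pi/2}\ln(2\sin\theta)\,d\theta = 0$ [Watson [123, p.69, l.−20]].
Proof. Let $l = \int_0^{\pi/2}\ln(\sin\theta)\,d\theta$. Then $l = \int_0^{\pi/2}\ln(\cos\theta)\,d\theta$.
$2l = \int_0^{\pi/2}\ln\frac{\sin 2\theta}{2}\,d\theta = \int_0^{\pi/2}\ln(\sin 2\theta)\,d\theta - \frac{\pi}{2}\ln 2 = l - \frac{\pi}{2}\ln 2$.
Hence $l = -\frac{\pi}{2}\ln 2$, and $\int_0^{\pi/2}\ln(2\sin\theta)\,d\theta = \frac{\pi}{2}\ln 2 + l = 0$.
Remark 34. $\lim_{a\to1-0}\int_0^{\pi/2}\cos(z\cos\theta)\cdot\ln(4a\sin^2\theta)\,d\theta = \int_0^{\pi/2}\cos(z\cos\theta)\cdot\ln(4\sin^2\theta)\,d\theta$ [Watson [123, p.69, l.−14]].
Proof. $\int_0^{\pi/2}|\ln(\sin\theta)|\,d\theta = -\int_0^{\pi/2}\ln(\sin\theta)\,d\theta = \frac{\pi}{2}\ln 2$.
$|\ln(4a\sin^2\theta)| \le 2|\ln\sin\theta| + \ln 8$ (if $\frac12 < a \le 1$).
The desired equality follows from Rudin [99, p.27, Theorem 1.34].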
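The value in Remark 33 can be confirmed by a simple quadrature. The sketch below uses the midpoint rule, which never evaluates the integrand at the logarithmic singularity $\theta = 0$; the step count is an arbitrary choice and the function name `midpoint_integral` is illustrative.

```python
import math

def midpoint_integral(f, a, b, steps):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# Integral of ln(2 sin(theta)) over [0, pi/2]; expected to be close to 0.
value = midpoint_integral(lambda t: math.log(2 * math.sin(t)), 0.0, math.pi / 2, 200_000)

# Integral of ln(sin(theta)) over [0, pi/2]; expected to be close to -(pi/2) ln 2.
log_sine = midpoint_integral(lambda t: math.log(math.sin(t)), 0.0, math.pi / 2, 200_000)
```

Near the singularity the midpoint rule converges only at first order in the step size, which is why a fairly large number of steps is used.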

Remark 35. (Choosing the method of mathematical induction may help eliminate obstacles and automatically reduce the case to the simplest one) A reduction formula establishes a recurrence relation which can reduce the form of an integral to a simpler one. A recurrence relation is equivalent to an inductive step. If in a proof we choose to make a big jump without using a reduction formula, we may encounter difficulties. This is because we may fail to predict the situation and control the key if we jump too far.

Example 1.6. (Stokes' formula)
$$\int_0^{\pi/2}\cos^{2n}\theta\,\ln\sin\theta\,d\theta=-\frac{(2n)!}{2^{2n}(n!)^2}\left\{\frac{\pi}{2}\ln 2+\frac{\pi}{4}\left(\frac11+\frac12+\cdots+\frac1n\right)\right\}$$
[Watson [123, p.70, l.8]].

Complicated proof without using reduction formula. Let $u=\ln\sin\theta$ and $dv=\cos^{2n}\theta\,d\theta$. Then
$$v=\int\cos^{2n}\theta\,d\theta=\frac{\cos^{2n-1}\theta\sin\theta}{2n}+\frac{2n-1}{2n}\left(\frac{\cos^{2n-3}\theta\sin\theta}{2n-2}+\frac{2n-3}{2n-2}\left(\cdots+\frac34\left(\int\cos^2\theta\,d\theta\right)\cdots\right)\right)$$
$$=\frac{\cos^{2n-1}\theta\sin\theta}{2n}+\frac{2n-1}{2n}\left(\frac{\cos^{2n-3}\theta\sin\theta}{2n-2}+\frac{2n-3}{2n-2}\left(\cdots+\frac34\left(\frac{\theta}{2}+\frac{\sin\theta\cos\theta}{2}\right)\cdots\right)\right).$$
$$\int v\,du=\int\left(\frac{\cos^{2n}\theta}{2n}+\frac{2n-1}{2n}\left(\frac{\cos^{2n-2}\theta}{2n-2}+\frac{2n-3}{2n-2}\left(\cdots+\frac34\left(\frac{\theta\cot\theta}{2}+\frac{\cos^2\theta}{2}\right)\cdots\right)\right)\right)d\theta.$$
$$\frac{1}{2n}\int_0^{\pi/2}\cos^{2n}\theta\,d\theta=\frac{1}{2n}\cdot\frac{1\cdot3\cdots(2n-1)}{2\cdot4\cdots(2n)}\cdot\frac{\pi}{2}=\frac{\pi}{4}\cdot\frac{1\cdot3\cdots(2n-1)}{2\cdot4\cdots(2n)}\cdot\frac1n.$$
$$\frac{2n-1}{2n}\cdot\frac{1}{2n-2}\int_0^{\pi/2}\cos^{2n-2}\theta\,d\theta=\frac{2n-1}{2n}\cdot\frac{1}{2n-2}\cdot\frac{1\cdot3\cdots(2n-3)}{2\cdot4\cdots(2n-2)}\cdot\frac{\pi}{2}=\frac{\pi}{4}\cdot\frac{1\cdot3\cdots(2n-1)}{2\cdot4\cdots(2n)}\cdot\frac{1}{n-1}.$$
$$\vdots$$
$$\frac{2n-1}{2n}\cdot\frac{2n-3}{2n-2}\cdots\frac12\int_0^{\pi/2}\cos^2\theta\,d\theta=\frac{2n-1}{2n}\cdot\frac{2n-3}{2n-2}\cdots\frac12\cdot\frac{\pi}{4}=\frac{\pi}{4}\cdot\frac{1\cdot3\cdots(2n-1)}{2\cdot4\cdots(2n)}\cdot\frac11.$$
$$\int_0^{\pi/2}\theta\cot\theta\,d\theta=\theta\ln\sin\theta\,\Big|_0^{\pi/2}-\int_0^{\pi/2}\ln\sin\theta\,d\theta=\frac{\pi}{2}\ln 2.$$
$$\int_0^{\pi/2}v\,du=\frac{1\cdot3\cdots(2n-1)}{2\cdot4\cdots(2n)}\left\{\frac{\pi}{4}\left(\frac1n+\frac{1}{n-1}+\cdots+\frac11\right)+\frac{\pi}{2}\ln 2\right\}=\frac{(2n)!}{2^{2n}(n!)^2}\left\{\frac{\pi}{4}\left(\frac11+\frac12+\cdots+\frac1n\right)+\frac{\pi}{2}\ln 2\right\}.$$

Simple proof using reduction formula. Let $u=\cos^{2n-1}\theta\,\ln\sin\theta$ and $dv=\cos\theta\,d\theta$. Then $v=\sin\theta$ and
$$du=\left[\cos^{2n-1}\theta\,\frac{\cos\theta}{\sin\theta}-(2n-1)\cos^{2n-2}\theta\sin\theta\,\ln\sin\theta\right]d\theta.$$
$$\int v\,du=\int\left(\cos^{2n}\theta-(2n-1)\cos^{2n-2}\theta\,\ln\sin\theta+(2n-1)\cos^{2n}\theta\,\ln\sin\theta\right)d\theta.$$
$$\int u\,dv=uv-\int v\,du.$$
$$\int_0^{\pi/2}\cos^{2n}\theta\,\ln\sin\theta\,d\theta=\frac{2n-1}{2n}\int_0^{\pi/2}\cos^{2n-2}\theta\,\ln\sin\theta\,d\theta-\frac{1}{2n}\int_0^{\pi/2}\cos^{2n}\theta\,d\theta.$$
$$-\int_0^{\pi/2}\cos^{2n}\theta\,\ln\sin\theta\,d\theta=\frac{2n-1}{2n}\left(-\int_0^{\pi/2}\cos^{2n-2}\theta\,\ln\sin\theta\,d\theta\right)+\frac{1}{2n}\int_0^{\pi/2}\cos^{2n}\theta\,d\theta.$$
Thus, we can continue to reach $-\int_0^{\pi/2}\ln\sin\theta\,d\theta$ directly.

Remark. In the first proof, we have to consider both $\int\cos^{2n}\theta\,d\theta$ and $\int_0^{\pi/2}\cos^{2n}\theta\,d\theta$. In contrast, in the second proof we need only consider $\int_0^{\pi/2}\cos^{2n}\theta\,d\theta$. The red parts in the first proof are difficult parts which disappear in the second proof. Note that the big jump in the first proof applies to $\int\cos^{2n}\theta\,d\theta$ alone so that it loses the connection with $\ln\sin\theta$.

Remark 36. By Guo–Wang [48, p.109, (14)], $\psi(\frac12)=-\gamma-2\ln 2$ [Watson [123, p.73, l.13]]. The presentation given in Guo–Wang [48, §3.11] seems somewhat sketchy. I try to make it rigorous by adding some details.

Remark 1. By Rudin [99, p.325, l.4–l.5; l.12–l.13] and Guo–Wang [48, p.24, Theorem 5], $\psi(z)=-\gamma-\frac1z+\sum_{n=1}^{\infty}\left(\frac1n-\frac{1}{z+n}\right)$ [Guo–Wang [48, p.108, (5)]].

Remark 2. By Rudin [99, p.150, Theorem 7.8(a)], $\ln m=\int_0^{\infty}(e^{-t}-e^{-mt})\frac{dt}{t}$ [Guo–Wang [48, p.108, (8)]].

Remark. If two analytic functions defined on $\mathbb{C}\setminus(-\infty,0)$ coincide on $(0,\infty)$, then they coincide on $\mathbb{C}\setminus(-\infty,0)$.

Remark 3. Let $f(t)=\frac12+\frac1t-\frac{1}{1-e^{-t}}$. Then $-\frac12<f(t)\le 0$ for $t\ge 0$.

Proof. $f(0)=0$ and $f'(t)<0$ for $t>0$.

Remark 4. $\frac{e^{-t}}{t}-\frac{e^{-zt}}{1-e^{-t}}\in L^1(0,\infty)$ [Watson [123, p.108, l.−7]].

Proof. By Remark 3, $\left(\frac1t-\frac{1}{1-e^{-t}}\right)e^{-zt}\in L^1(0,\infty)$. By Remark 2, $\frac{e^{-t}-e^{-mt}}{t}\in L^1(0,\infty)$.

Remark 5. When $m\to\infty$, the values of the integrals containing the factor $e^{-mt}$ approach zero [Guo–Wang [48, p.108, l.−9–l.−8]].

Proof. If $0\le t<1$, $-\frac{e^{-mt}}{t}+\frac{e^{-zt}e^{-(m+1)t}}{1-e^{-t}}=\left(-\frac1t+O(1)\right)+\left(\frac1t+O(1)\right)=O(1)$.
If $t\ge 1$, $\left|-\frac{e^{-mt}}{t}+\frac{e^{-zt}e^{-(m+1)t}}{1-e^{-t}}\right|\le e^{-mt}+\frac{e^{[-\Re z-(m+1)]t}}{1-e^{-1}}\in L^1[1,\infty)$.
Then the desired result follows from Remark 4 and Rudin [99, p.27, Theorem 1.34].

Remark 37. The Riemann surface for the Bessel function of the first kind: Watson [123, p.75, (1) & (2)] follow from Watson [123, p.40, (8)].
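Returning to Remark 35, the reduction formula of the simple proof can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the midpoint rule is a crude stand-in for a proper quadrature near the logarithmic singularity at $\theta=0$. It iterates the recurrence from $\int_0^{\pi/2}\ln\sin\theta\,d\theta=-\frac{\pi}{2}\ln 2$ and compares the result with the closed form of Stokes' formula.

```python
import math

def wallis(n):
    """Integral of cos^(2n) over [0, pi/2], i.e. (pi/2) (2n)! / (2^(2n) (n!)^2)."""
    return math.pi / 2 * math.factorial(2 * n) / (4 ** n * math.factorial(n) ** 2)

def stokes_closed_form(n):
    """Right-hand side of Stokes' formula as stated in Example 1.6."""
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    coeff = math.factorial(2 * n) / (4 ** n * math.factorial(n) ** 2)
    return -coeff * (math.pi / 2 * math.log(2) + math.pi / 4 * harmonic)

def stokes_by_reduction(n):
    """Iterate I_k = (2k-1)/(2k) I_{k-1} - W_k/(2k), starting from I_0 = -(pi/2) ln 2."""
    value = -math.pi / 2 * math.log(2)
    for k in range(1, n + 1):
        value = (2 * k - 1) / (2 * k) * value - wallis(k) / (2 * k)
    return value

def stokes_by_quadrature(n, steps=20000):
    """Midpoint rule on [0, pi/2]; the midpoints avoid the singular endpoint theta = 0."""
    h = (math.pi / 2) / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h
        total += math.cos(theta) ** (2 * n) * math.log(math.sin(theta))
    return total * h
```

The recurrence and the closed form should agree to rounding error for every $n$, while the quadrature should agree only to the accuracy of the midpoint rule.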

Remark 38. (Systematic reduction of calculations by using the formula for the derivative of a determinant) In order to prove Watson [123, p.76, (7)–(11)], Watson suggests that we express successive derivatives of $J_\nu(z)$ and $Y_\nu(z)$ in terms of $J_\nu(z)$, $J_\nu'(z)$, and $Y_\nu(z)$, $Y_\nu'(z)$ by repeated differentiations of Bessel's equation. The method he suggests is not efficient for calculation. In fact, only the proof of (9) requires the differentiation of Bessel's equation. In order to derive the rest of the formulae effectively and systematically, we should use the formula for the derivative of a determinant. For example, (7) follows from

$$\begin{vmatrix}J_\nu(z)&Y_\nu(z)\\J_\nu''(z)&Y_\nu''(z)\end{vmatrix}'=\begin{vmatrix}J_\nu'(z)&Y_\nu'(z)\\J_\nu''(z)&Y_\nu''(z)\end{vmatrix}+\begin{vmatrix}J_\nu(z)&Y_\nu(z)\\J_\nu'''(z)&Y_\nu'''(z)\end{vmatrix}$$
[Coddington–Levinson [16, p.28, l.16–l.17]].
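The row-by-row rule for differentiating a determinant can be illustrated numerically. In the sketch below the Bessel functions are replaced by the stand-ins $\sin z$ and $e^{2z}$ (an assumption made only because their derivatives are available in closed form); the helper names are hypothetical.

```python
import math

def f(k, z):
    """k-th derivative (k = 0..3) of the stand-in sin(z)."""
    return [math.sin(z), math.cos(z), -math.sin(z), -math.cos(z)][k]

def g(k, z):
    """k-th derivative of the stand-in exp(2z)."""
    return 2 ** k * math.exp(2 * z)

def det_rows(i, j, z):
    """2x2 determinant whose rows are the i-th and j-th derivatives of (f, g)."""
    return f(i, z) * g(j, z) - g(i, z) * f(j, z)

def derivative_by_rows(z):
    """Differentiate the first row, then the second row, and add the two determinants."""
    return det_rows(1, 2, z) + det_rows(0, 3, z)

def derivative_by_difference(z, h=1e-5):
    """Central difference of the determinant with rows (f, g) and (f'', g'')."""
    return (det_rows(0, 2, z + h) - det_rows(0, 2, z - h)) / (2 * h)
```

Both functions should return the same value up to the finite-difference error, which mirrors how (7) is obtained without differentiating Bessel's equation repeatedly.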

Remark 39.
$$I_\nu(z)=\begin{cases}e^{-\frac12\nu\pi i}J_\nu(ze^{\frac12\pi i})&(-\pi<\arg z\le\frac{\pi}{2}),\\ e^{\frac32\nu\pi i}J_\nu(ze^{-\frac32\pi i})&(\frac{\pi}{2}<\arg z\le\pi)\end{cases}$$
[Watson [123, p.77, l.19–l.20]].

Remarks. If the theory of an old item is well established and we want to study a new item which is closely related to the old one, it is unnecessary to repeat the same type of theory for the new item. All we have to do is define the new item in terms of the old one. Watson [123, p.77, (1)] is obtained by replacing $z$ in Watson [123, p.38, (1)] with $iz$. The theory of $J_\nu(z)$ is well established, so we define $I_\nu(z)=e^{-\nu\pi i/2}J_\nu(iz)$ [Watson [123, p.77, l.12]].

Why is the definition of $I_\nu(z)$ divided into the two cases as shown in Watson [123, p.77, l.19–l.20]? First, it is convenient to calculate $J_\nu(z)$ by using its asymptotic expansion [Watson [123, p.199, (1)]], which is only valid in $|\arg z|<\pi$. Second, by Guo–Wang [48, p.374, (6); p.375, (11)(i)], $K_0(z)$ is expressed in terms of $\ln(z)$. Consequently, the domain of $I_\nu(z)$ should be $-\pi<\arg z\le\pi$. By Guo–Wang [48, p.366, (3); p.367, (6)(i)], $Y_0(z)$ is expressed in terms of $\ln(z)$. Consequently, the domain of $J_\nu(z)$ should be $-\pi<\arg z\le\pi$. The above definition transforms from the natural domain of $I_\nu(z)$ to the natural domain of $J_\nu(z)$. The factors before $J_\nu$ are designed to make the new definition compatible with the old definition given in Watson [123, p.77, (2)]. The domain of $J_\nu$ is a Riemann surface; $ze^{-\frac32\pi i}$ must go around in a circle on a sheet to reach $ze^{\frac12\pi i}$. If $\arg z=\frac{\pi}{2}$, $J_\nu(ze^{\frac12\pi i})$ and $J_\nu(ze^{-\frac32\pi i})$ may be different, but $e^{-\frac12\nu\pi i}J_\nu(ze^{\frac12\pi i})\equiv e^{\frac32\nu\pi i}J_\nu(ze^{-\frac32\pi i})$ [Watson [123, p.75, (1)]].

Remark 40. For the proof of Watson [123, p.78, (8)], it suffices to prove the case $-\pi<\arg z\le\frac12\pi$ [Rudin [99, p.225, Theorem 10.18]].

Remark 41. (Comparing the degree of similarity) We have two options in proving Watson [123, p.79, (1)(ii)]: use Watson [123, p.78, (6); p.79, (1)(i)] or use Watson [123, p.78, (8); p.74, (3)(i)]. Since Watson [123, p.73, (2)(i)] and Watson [123, p.78, (6)] are similar, $K_\nu$ is more closely related to $H_\nu$ than to $I_\nu$. Consequently, the latter option is the better choice.

Remark 42. The proof of Watson [123, p.79, (5) (resp. (6))] is similar to that of Watson [123, p.46, (5) (resp. (6))].
Watson [123, p.80, (12)] follows from Watson [123, p.78, (6); p.80, (1) & (2)].
Watson [123, p.80, (15)] follows from Watson [123, p.78, (8); p.73, (1); p.64, (2)(ii); p.62, (3)(i)].
Watson [123, p.80, (16)] follows from Watson [123, p.78, (5); p.79, (9)(iv)]. Note that $\psi(\frac12)=-\gamma-2\log 2$.
Watson [123, p.80, (20)] follows from Watson [123, p.80, (19); p.79, (4)].
Watson [123, p.81, (9) & (10)] follow from Watson [123, p.80, (14)].
Watson [123, p.82, (12) & (13)] follow from Watson [123, p.74, l.−4, (5)(i)].

Watson [123, p.82, (14) & (15)] follow from Watson [123, p.74, l.−1, (7)(i)].
By the following lemma, we have $\mathrm{ber}^2(z)+\mathrm{bei}^2(z)=\sum_{m=0}^{\infty}\frac{(\frac z2)^{4m}}{(m!)^2(2m)!}$ [Watson [123, p.82, (11)]].

Lemma.
$$\sum_{k=0}^{n}(-1)^k\binom nk^2=\begin{cases}0&\text{if }n\text{ is odd},\\(-1)^{n/2}\binom{n}{n/2}&\text{if }n\text{ is even}.\end{cases}$$

Proof of the Lemma. Consider the coefficient of $x^n$ on the two sides of the identity $(1+x)^n(1-x)^n=(1-x^2)^n$. The coefficient of $x^n$ on the left side is $\sum_{k=0}^{n}(-1)^k\binom nk^2$; the coefficient of $x^n$ on the right side is $0$ if $n$ is odd and $(-1)^{n/2}\binom{n}{n/2}$ if $n$ is even.

Remark 43. By Collatz [18, chap. II, §6], Watson [123, p.99, (1)] is integrable in finite terms in case $A=0$ or $m=-2$ [Watson [123, p.100, l.5]].

Remark 44. We express the differential equation given in Watson [123, p.101, l.13] in terms of I [Watson [123, p.101, l.17]] so that we may easily find the roots of the indicial equation [Guo–Wang [48, p.58, (14)]]. Watson [123, p.101, (1)(i)] holds because if we replace $z$ in the upper equality given in Watson [123, p.101, l.13] with $-z$, we obtain the lower equality, and vice versa. Watson [123, p.101, (1)(ii)] holds because the series solution $z^{p+1}\,{}_1F_1(p+\frac32;\frac14c^2z^2)$ of $\frac{d^2u}{dz^2}-c^2u=\frac{p(p+1)}{z^2}u$ has $z^{p+n}$ terms. Vandermonde's theorem given in Watson [123, p.102, l.−9] refers to the identity
$$(s+t)_n=\sum_{k=0}^{n}\binom nk(s)_k(t)_{n-k}.$$

Remark 45. The formula
$$\lim_{p\to-N}{}_1F_1(p+1;2p+2;-2cz)={}_1F_1(1-N;2-2N;-2cz)+(-)^{N-1}\frac{(N-1)!\,N!}{(2N-2)!\,(2N)!}(-2cz)^{2N-1}\cdot{}_1F_1(N;2N;-2cz)$$
[Watson [123, p.104, l.−11–l.−10]] should have been corrected as
$$\lim_{p\to-N}{}_1F_1(p+1;2p+2;-2cz)={}_1F_1(1-N;2-2N;-2cz)+(-)^{N-1}\frac{(N-1)!\,N!}{2\,(2N-2)!\,(2N)!}(-2cz)^{2N-1}\cdot{}_1F_1(N;2N;-2cz).$$

Remark 46. Watson [123, p.104, (7)] if and only if Watson [123, p.54, (4)] [Watson [123, p.105, l.1–l.2]].

Proof. One should not directly use Guo–Wang [48, p.94, (6)(i)] when $z$ is $0$ or a negative integer because $\Gamma(z)=\infty$. Instead, one should change $(z)_n$ to $(-)^n(-z-n+1)_n$ first and then use Guo–Wang [48, p.94, (6)(i)]. For this proof,
$$(1-N)_{N-r-1}=(-)^{N-r-1}(r+1)_{N-r-1};$$
$$(2-2N)_{N-r-1}=(-)^{N-r-1}(N+r)_{N-r-1}.$$
Similarly, one should not use Guo–Wang [48, p.99, (2)] when $z$ is an integer.
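The sign-reversal rule $(z)_n=(-)^n(-z-n+1)_n$ used in the proof of Remark 46 is a finite product identity, so it can be checked in exact arithmetic, including at the non-positive integers where $\Gamma(z)$ is infinite. The following is a minimal sketch; the function names are hypothetical.

```python
from fractions import Fraction

def pochhammer(z, n):
    """Rising factorial (z)_n = z (z+1) ... (z+n-1), with (z)_0 = 1."""
    result = Fraction(1)
    for k in range(n):
        result *= Fraction(z) + k
    return result

def reflected(z, n):
    """(-1)^n (-z-n+1)_n, the form used before applying the Gamma-function formula."""
    return (-1) ** n * pochhammer(-Fraction(z) - n + 1, n)

def special_cases_hold(N):
    """Check the two identities quoted in the proof for every r = 0, ..., N-1."""
    for r in range(N):
        n = N - r - 1
        sign = (-1) ** n
        if pochhammer(1 - N, n) != sign * pochhammer(r + 1, n):
            return False
        if pochhammer(2 - 2 * N, n) != sign * pochhammer(N + r, n):
            return False
    return True
```

For example, with $N=3$ and $r=0$ both sides of the first identity equal $2$ and both sides of the second equal $12$.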

Remark 47. (Lommel's formula) The solutions of
$$\prod_{r=0}^{n-1}(\vartheta+\alpha-2r\beta)(\vartheta+\alpha-2\beta\nu-2r\beta)\,u=(-)^n\beta^{2n}c^{2n}z^{2n\beta}u$$
are of the form $u=z^{\beta\nu-\alpha}\mathscr{C}_\nu(\gamma z^{\beta})$, where $\gamma=c\exp(r\pi i/n)$ $(r=0,1,\cdots,n-1)$ [Watson [123, p.107, l.13–l.16]].

Proof. Since $(\vartheta+a)(\vartheta+b)=(\vartheta+b)(\vartheta+a)$, it suffices to prove
$$(\vartheta+\alpha-2n\beta)(\vartheta+\alpha-2\beta\nu-2n\beta)(z^{2n\beta}u)=-\beta^2\gamma^2z^{2(n+1)\beta}u.$$
$$(\vartheta+\alpha-2n\beta)(\vartheta+\alpha-2\beta\nu-2n\beta)z^{2n\beta}u$$
$$=\left[z^2\frac{d^2}{dz^2}+(1+2\alpha-4n\beta-2\beta\nu)z\frac{d}{dz}+(\alpha-2n\beta)(\alpha-2\beta\nu-2n\beta)\right](z^{2n\beta}u)$$
$$=z^{2n\beta}\left[z^2\frac{d^2u}{dz^2}+(1+2\alpha-2\beta\nu)z\frac{du}{dz}+(\alpha^2-2\alpha\beta\nu)u\right]$$
$$=z^{2n\beta}[-\beta^2\gamma^2z^{2\beta}]u\quad\text{[Watson [123, p.97, (3)]]}.$$

Application. The above theorem can be used to solve $\frac{d^2}{dz^2}\left(z^{4\mu}\frac{d^2u}{dz^2}\right)=z^{2\mu}u$ for $\mu=0,1,4/5,3$ [Watson [123, p.107, l.19–p.108, l.9]].

Remark. For $r$ in Watson [123, p.107, l.16], by Guo–Wang [48, p.350, (18)], it is unnecessary to take $r=n,n+1,\cdots,2n-1$.

Remark 48. The formula given in Watson [123, p.108, l.−12] should have been corrected as
$$z^{1/2}\mathscr{C}_{p+1/2}(ciz)=(-ci)^{-p}z^{p+1}\left(\frac{d}{z\,dz}\right)^{p}\{z^{-1/2}\mathscr{C}_{1/2}(ciz)\}.$$

Proof. Assume the formula is true for case $p$. Consider case $p+1$.
$$(-ci)^{-(p+1)}z^{p+2}\left(\frac{d}{z\,dz}\right)^{p+1}\{z^{-1/2}\mathscr{C}_{1/2}(ciz)\}$$
$$=(-ci)^{-1}z^{p+2}z^{-1}\frac{d}{dz}\left[z^{-(p+1/2)}\mathscr{C}_{p+1/2}(ciz)\right]\quad\text{[by the induction hypothesis]}$$
$$=z^{1/2}\mathscr{C}_{p+3/2}(ciz)\quad\text{[Watson [123, p.83, (6)]]}.$$

Remark 49. If Riccati's equation $\frac{dy}{dz}=az^n+by^2$ is integrable in finite terms, then $n=-2$ or $-\frac{4m}{2m\pm1}$ $(m=0,1,2,\cdots)$ [Watson [123, p.123, l.23–l.26]].

Remark 1. $n\le\mu\le m$ [Watson [123, p.112, l.11]].

Proof. Let $\mu+1$ be the order of $\psi_{p,q}(z)$ [Watson [123, p.114, l.8]].
By Watson [123, p.114, l.10], $\mu+1\le m+1$.
By Watson [123, p.112, (1)], $\mu+1\ge n$.
If $\mu+1=n$, then (1) has a solution of order $n$ [Watson [123, p.114, l.11]].
If $\mu+1>n$, then $\mu\ge n$.

Remark 2. $\theta=l\,f_m(z)$, $\vartheta=\varsigma f_m(z)$, and $\Theta=e\,f_m(z)$ [Watson [123, p.112, l.15–l.16]].

Remark 3. By Pontryagin [91, p.51, (3)], $F=e^{\alpha\theta}\{\Phi_1+\Phi_2\theta\}$ [Watson [123, p.113, l.−9]].

Remark 4. If $G_{\Theta,\Theta}=AG_\Theta+BG$, then, by Collatz [18, p.111, l.12–l.13], $G=\Theta^{\gamma}\{\Phi_1+\Phi_2f_\mu(z)\}$ [Watson [123, p.115, l.12]].

Remark 5. By Collatz [18, p.111, (11.62)], $\gamma$ and $\delta$ are the roots of the equation $x(x-1)-Ax-B=0$ [Watson [123, p.115, l.15–l.16]].

Remark 6. $t$ is of order $\mu$ at most [Watson [123, p.116, l.12]] because if $u_1=e\,f_\mu(z)$, then $\log u_1=f_\mu(z)$ is of order $\mu$.

Remark 7. By Jackson [60, p.126, l.4–l.5], the Wronskian of any two solutions of (1) is a constant [Watson [123, p.117, l.1]].

Remark 8. “$u$ is an algebraic integral of (1)” means “$u$ is a solution of (1) and $u$ is algebraic over the field $\mathbb{C}(z)$”.

Remark 9. If not, all the roots of (3) would be zero [Watson [123, p.118, l.−2]].

Proof. By Jacobson [61, vol. 1, p.110, Ex.4], $p_1=\cdots=p_M=0$.
By Jacobson [61, vol. 1, p.108, l.5], $A(u,z)=u^M$.

Remark 10. $\lambda A_\lambda\cdot s!\,\frac{2\cdot4\cdot6\cdots s}{1\cdot3\cdot5\cdots(s-1)}=0$ follows from Watson–Whittaker [122, p.282, l.−1] and Guo–Wang [48, p.99, l.2].

Remark 11. $M$ must be even [Watson [123, p.120, l.6]].

Proof. Assume $M$ were odd. By Watson [123, p.120, l.1] and Jacobson [61, vol. 1, p.110, Ex.4], $p_1=p_3=\cdots=p_M=0$.
$A(u,z)=u^M+u^{M-2}p_2+\cdots+up_{M-1}$. $A$ contains a factor $u$, so $A$ cannot be irreducible.

Remark 12. The constant terms in the functions Br are not all zero [Watson [123, p.120, l.14]].

Proof. If for every $r\in\{1,2,\cdots,M/2\}$, the constant term in $B_r$ were zero, then for every $m\in\{1,\cdots,M\}$, the constant term in $p_m$ would be zero. By Jacobson [61, vol. 1, p.110, Ex.4], for every $r\in\{1,2,\cdots,M/2\}$, the constant term in $s_{2r}$ would be zero. This would contradict the statement given in Watson [123, p.120, l.4].

Remark 13. If Bessel's equation has an integral expressible in finite terms, then (2) must have a solution which is of order zero [Watson [123, p.121, l.1]].

Remark. By Watson [123, p.120, l.−14], the option [Watson [123, p.112, l.7]] that there exists a 0-order solution of Watson [123, p.120, l.−7, (1)] cannot occur.

Remark 14. An irreducible polynomial in $\mathbb{C}[z]$ has only simple roots [Van der Waerden [120, vol. I, p.120, l.1–l.2]], so [((3) has a pair of equal roots) $\Rightarrow$ ((3) is reducible)].

Remark 15. $\exists q:\kappa_q\ne0$ [Watson [123, p.121, l.−2]].

Proof. Assume that $\forall q$, $\kappa_q=0$.
By Watson [123, p.122, l.12], $\lambda=0$. Thus, $V$ becomes a constant. Similar to the argument given in Watson [123, p.122, l.13], we may prove that $A$ is reducible, a contradiction.

Remark 16. The $B_{1,q}$ are not all zero [Watson [123, p.122, l.−11–l.−10]].

Proof. If all the $B_{1,q}$ were zero, then $u=z^{-p}e^{\pm iz}$. That is, $w=1$ [Watson [123, p.122, l.−4]]. However, $w=1$ would not satisfy Watson [123, p.122, (7)]. Then $u$ would not satisfy Watson [123, p.120, (1)]. Therefore, the $B_{1,q}$ cannot all be zero.

Summary. Since Riccati's equation is a variant [Ince [59, p.24, l.−11–p.25, l.4]; Watson [123, p.96, (6)]] of Bessel's equation, we may use the language of Bessel functions to translate the above theorem as follows: If Bessel's equation for functions of order $\nu$ [Watson [123, p.117, l.−10]] is soluble in finite terms, then $2\nu$ is an odd integer. In the proof [Watson [123, §4·7–§4·74]] of the latter version, despite possible difficulties, all the problems can be solved except one. That is, an infinite power series cannot be expressed as a polynomial [Watson [123, p.123, l.4]]. From this, we may determine the cases in which Bessel's equation is soluble in finite terms [Watson [123, p.123, l.6–l.8]].

Remark 50. By Watson–Whittaker [122, p.390, l.16], the general solution of the Laplace equation is $V=\int_{-\pi}^{\pi}f(z+ix\cos u+iy\sin u,u)\,du$ [Watson [123, p.124, l.8–l.10]].

Remark 51. The two formulas given in Watson [123, p.126, l.7–l.8] follow from Watson–Whittaker [122, p.397, l.−4–l.−3].

Remark 52. $V_n=e^{ikct}\int_{-\pi}^{\pi}\int_0^{\pi}e^{ikr\cos\omega}\bar S_n(\theta,\phi;\omega,\psi)\sin\omega\,d\omega\,d\psi$ [Watson [123, p.126, l.14]] follows from Watson [123, p.125, l.−5; p.126, l.7–l.8]. Note that
$d\Omega=\sin u\,du\,dv$ (in $(u,v)$ coordinates)
$=\sin\omega\,d\omega\,d\psi$ (in $(\omega,\psi)$ coordinates).
For the definition of a surface harmonic, see Watson–Whittaker [122, p.392, l.−9].

Remark 53. $A_n(\theta,\phi)$ is a surface harmonic of degree $n$ [Watson [123, p.126, l.−1–p.127, l.1]] because $\{r^nP_n^m(\cos\theta)e^{im\phi}\}$ are the basis of the solution space of the Laplace equation [Coddington–Levinson [16, p.201, Riesz–Fischer Theorem]].

Remark 54.
The factor which involves $\theta$ must be of the form $P_n^m(\cos\theta)$; and the factor which involves $r$ is annihilated by the operator $\frac{d}{dr}\left(r^2\frac{d}{dr}\right)-n(n+1)+k^2r^2$, so that if this factor is to be analytic at the origin it must be a multiple of $J_{n+1/2}(kr)/\sqrt r$ [Watson [123, p.127, l.9–l.12]].

Proof. I. Let $-n(n+1)V=\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial V}{\partial\theta}\right)+\frac{1}{\sin^2\theta}\frac{\partial^2V}{\partial\phi^2}$ [Watson [123, p.127, l.8; l.11]], $V=P(\theta)Q(\phi)$, $m^2=-\frac1Q\frac{\partial^2Q}{\partial\phi^2}$ [Jackson [60, p.95, (3.4)]]. Then we will have Jackson [60, p.96, (3.6)].

II. Let $V=U(r)$. By the definition of $n$ given in I, the formula given in Watson [123, p.127, l.8] can be written as
$$\frac{1}{r^2}\left[\frac{\partial}{\partial r}\left(r^2\frac{\partial U}{\partial r}\right)-n(n+1)U\right]=\frac{1}{c^2}\frac{\partial^2U}{\partial t^2}=-k^2U.$$
Then
$$\frac{\partial^2U}{\partial r^2}+\frac2r\frac{\partial U}{\partial r}+\left(k^2-\frac{n(n+1)}{r^2}\right)U=0.$$
Dividing the equality by $k^2$, we will obtain Guo–Wang [48, p.376, (1); p.377, (3)].

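Remark 55. By Wangsness [121, p.377, Figure 24-2; p.378, (24-17); p.379, l.1], $e^{ik(ct+z)}$ represents a wave in the direction of the $z$-axis from $+\infty$ to $-\infty$ with angular frequency $kc$ and wavelength $2\pi/k$ [Watson [123, p.128, l.5–l.6]].

Remark 56. $e^{ikr\cos\theta}=\left(\frac{2\pi}{kr}\right)^{1/2}\sum_{n=0}^{\infty}c_ni^nJ_{n+1/2}(kr)P_n(\cos\theta)$ [Watson [123, p.128, l.10]] follows from Watson [123, p.126, l.−12; p.127, l.10].

Remark 57. $\frac{i^n}{n!}=(2\pi)^{1/2}\frac{c_ni^n}{2^{n+1/2}\Gamma(n+3/2)}\cdot\frac{(2n)!}{2^n(n!)^2}$ and $c_n=n+1/2$ [Watson [123, p.128, l.12–l.13]] follow from Watson [123, p.40, (8)], Guo–Wang [48, p.213, (6)] and Watson–Whittaker [122, p.240, l.−3].

Remark 58.
$$-\mathscr{C}_{\mu+1}(kz)\overline{\mathscr{C}}_\mu'(kz)+\mathscr{C}_\mu(kz)\overline{\mathscr{C}}_{\mu+1}'(kz)+\frac{1}{kz}\mathscr{C}_\mu(kz)\overline{\mathscr{C}}_{\mu+1}(kz)$$
$$=\mathscr{C}_\mu(kz)\overline{\mathscr{C}}_\mu(kz)-\frac12\mathscr{C}_{\mu-1}(kz)\overline{\mathscr{C}}_{\mu+1}(kz)-\frac12\mathscr{C}_{\mu+1}(kz)\overline{\mathscr{C}}_{\mu-1}(kz)$$
[Watson [123, p.134, l.−2–l.−1]].

Proof. $\overline{\mathscr{C}}_\mu'(kz)=\frac12\left(\overline{\mathscr{C}}_{\mu-1}(kz)-\overline{\mathscr{C}}_{\mu+1}(kz)\right)$ [Watson [123, p.82, (2)]].
$\overline{\mathscr{C}}_{\mu+1}'(kz)=\overline{\mathscr{C}}_\mu(kz)-\frac{\mu+1}{kz}\overline{\mathscr{C}}_{\mu+1}(kz)$ [Guo–Wang [48, p.368, l.11]].
Consequently, it suffices to prove
$$-\frac{\mu}{kz}\mathscr{C}_\mu(kz)\overline{\mathscr{C}}_{\mu+1}(kz)+\frac12\mathscr{C}_{\mu+1}(kz)\overline{\mathscr{C}}_{\mu+1}(kz)=-\frac12\mathscr{C}_{\mu-1}(kz)\overline{\mathscr{C}}_{\mu+1}(kz).$$
This formula follows from Watson [123, p.82, (1)].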
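The expansion of Remark 56 with the coefficients $c_n=n+\frac12$ of Remark 57 can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, $J_{n+1/2}$ is evaluated from its ascending series (adequate for moderate $kr$, not for large arguments), and the truncation orders are arbitrary choices.

```python
import cmath
import math

def bessel_half(n, x, terms=40):
    """J_{n+1/2}(x) from the ascending series sum (-1)^m (x/2)^(nu+2m) / (m! Gamma(nu+m+1))."""
    nu = n + 0.5
    return sum((-1) ** m * (x / 2) ** (nu + 2 * m)
               / (math.factorial(m) * math.gamma(nu + m + 1))
               for m in range(terms))

def legendre(n, c):
    """P_n(c) by the recurrence (k+1) P_{k+1} = (2k+1) c P_k - k P_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, c
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * c * p - k * p_prev) / (k + 1)
    return p

def plane_wave_series(kr, theta, n_max=30):
    """(2 pi / kr)^(1/2) * sum of c_n i^n J_{n+1/2}(kr) P_n(cos theta), with c_n = n + 1/2."""
    total = 0j
    for n in range(n_max + 1):
        total += (n + 0.5) * 1j ** n * bessel_half(n, kr) * legendre(n, math.cos(theta))
    return math.sqrt(2 * math.pi / kr) * total

def plane_wave_exact(kr, theta):
    """Left-hand side exp(i kr cos theta)."""
    return cmath.exp(1j * kr * math.cos(theta))
```

For moderate values such as $kr=2$ the truncated series should reproduce $e^{ikr\cos\theta}$ to near machine precision, which is a quick way to confirm that $c_n=n+\frac12$ rather than, say, $2n+1$.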

Remark 59. $\mathscr{C}_\mu'^2(kz)-\frac{\mu^2}{k^2z^2}\mathscr{C}_\mu^2(kz)=-\mathscr{C}_{\mu-1}(kz)\mathscr{C}_{\mu+1}(kz)$ [Watson [123, p.135, l.3]].

Proof. $\mathscr{C}_\mu'(kz)-\frac{\mu}{kz}\mathscr{C}_\mu(kz)=-\mathscr{C}_{\mu+1}(kz)$ [Guo–Wang [48, p.368, l.13]].
$\mathscr{C}_\mu'(kz)+\frac{\mu}{kz}\mathscr{C}_\mu(kz)=\mathscr{C}_{\mu-1}(kz)$ [Guo–Wang [48, p.368, l.11]].

Remark 60. By the formula given in Guo–Wang [48, p.368, l.13], we have
$$\frac{d}{dz}\left[z\mathscr{C}_{\mu+1}(z)\frac{\partial\overline{\mathscr{C}}_\mu(z)}{\partial\mu}-z\mathscr{C}_\mu(z)\frac{\partial\overline{\mathscr{C}}_{\mu+1}(z)}{\partial\mu}+\mathscr{C}_\mu(z)\overline{\mathscr{C}}_\mu(z)\right]$$
$$=z\left[\mathscr{C}_\mu(z)\frac{d^2}{dz^2}\frac{\partial\overline{\mathscr{C}}_\mu(z)}{\partial\mu}-\left(\frac{d^2}{dz^2}\mathscr{C}_\mu(z)\right)\frac{\partial\overline{\mathscr{C}}_\mu(z)}{\partial\mu}\right]+\mathscr{C}_\mu(z)\frac{d}{dz}\frac{\partial\overline{\mathscr{C}}_\mu(z)}{\partial\mu}-\left(\frac{d}{dz}\mathscr{C}_\mu(z)\right)\frac{\partial\overline{\mathscr{C}}_\mu(z)}{\partial\mu}$$
[Watson [123, p.135, l.16–l.19]].

Remark 61. Because of the statement given in Watson [123, p.39, l.−6], we require that $2\nu$ be not a negative integer [Watson [123, p.147, l.2]]. Because of the relation given in Watson [123, p.146, l.−7], we require that $2(\mu+\nu)$ be not a negative integer [Watson [123, p.147, l.2]]. Watson [123, p.146, (3)] has four linearly independent solutions $\alpha=\pm\mu\pm\nu$ [Watson [123, p.146, l.−6–l.−1]]. By comparing the “lowest” power of $z$ in $J_\mu(z)J_\nu(z)$ with those of the above four solutions, we must identify $J_\mu(z)J_\nu(z)$ with the solution $\alpha=\mu+\nu$ [Watson [123, p.147, (5)]].

Remark 62. (Contour integrals for special functions)
I. When we deal with a contour integral for a special function, all we have to do is to choose a point on the contour and assign a possible value to its argument. The definition of a contour integral for special functions is the same as that of a line integral in complex analysis [Rudin [99, p.217, (1)]]. To parameterize the integral contour for special functions, we often choose the argument of the integral's dummy variable as the parameter. Once

we choose a point on the contour and assign any possible value to its argument, then the value of the integral is determined by the direction of the contour. However, branch points frequently encountered in special functions may cause a lot of confusion and complications. In order to obtain the desired solution form and facilitate the calculations for the value of the integrand near a branch point, we must choose a proper point and assign a proper value to its argument.

II. Simple notations for complicated contour integrals: Watson–Whittaker [122, p.245, l.−7–l.−5; p.256, l.−7–l.−3].

III. Confusion and complications caused by branch points: Watson [123, p.161, l.−8–l.−7] says, “We take the phases of $t-1$ and $t+1$ to vanish at the point A.” Watson–Whittaker [122, p.257, l.1] says, “At the starting-point the arguments of $t$ and $1-t$ are both 0.” We may wonder if one variable with two conditions will cause a contradiction. Guo–Wang [48, p.353, l.−7] says, “Assume $\arg(1-t^2)=0$ along the path of integration.” One may wonder if this assumption is a prescription which we should not question. Watson–Whittaker [122, p.257, l.1–l.15] shows that there are many things concerning branch points to consider when we evaluate a contour integral. This confusion and these complications are not what the authors intend to cause. The only purpose of [Watson–Whittaker [122, p.257, l.1–l.15]; Watson [123, p.161, l.−8–l.−6]; Watson [123, p.163, (1); p.164, (2)]] is to tell us that if we want to choose a point and its argument properly to facilitate calculations, we must consider branch points first.

IV. Convergence: It is necessary to suppose that $\Re(\nu+\frac12)>0$ [Watson [123, p.161, l.−11–l.−10]]. This is because we must deal with branch points: the convergence of $\int_{-1}(t+1)^{\nu-1/2}\,dt$ or $\int^{+1}(t-1)^{\nu-1/2}\,dt$ requires $\Re(\nu+\frac12)>0$.

V.
The advantage of representing special functions by contour integrals: The two linearly independent solutions of the Bessel equation can be represented by the same integrand with different contours [Watson [123, p.163, (1); p.164, (2)]].

Remark 63. $\int_{\infty\exp(i\alpha)}^{(0+)}(-u)^{2\nu-1-2m}e^{-u}\,du=-2\pi i/\Gamma(2m-2\nu+1)$ [Watson [123, p.163, l.15–l.17]].

Proof. $\int_{\infty\exp(i\alpha)}^{(0+)}(-u)^{2\nu-1-2m}e^{-u}\,du=\int_{\infty}^{(0+)}(-u)^{2\nu-1-2m}e^{-u}\,du$ [González [42, pp.680–681, Lemma 9.2]]
$=-2\pi i/\Gamma(2m-2\nu+1)$ [Watson–Whittaker [122, p.245, l.−8]].

Remark. $-\frac{3\pi}{2}<\arg t<\frac{\pi}{2}$.
$-\frac{5\pi}{2}<\arg(-t)<-\frac{\pi}{2}$.
$-2\pi+\alpha<\arg u<\alpha$.
$|\arg(-u)-\alpha|<\pi$.

Remark 64. (Tying up loose ends) Both Watson [123, p.163, l.7–p.164, l.16] and Guo–Wang [48, p.355, l.1–p.356, l.−1] prove Watson [123, p.164, (3)]. However, both presentations have shortcomings. Let us tie up loose ends. Note that if we replace $z=Re^{i\theta}$ in González [42, pp.680–681, Lemma 9.2] with $z=Re^{-i\theta}$, then the lemma will become false. If the integrand is defined as in González [42, pp.680–681, Lemma 9.2], then the integral along the path $[0,\infty\exp(i\theta_1))$ equals the integral along the path $[0,\infty\exp(i\theta_2))$ if $\theta_1,\theta_2\in[0,\pi]$. For this range of available half-lines, the positive real axis is the initial half-line; the positive imaginary axis is the middle half-line; the negative real axis is the final half-line. For the above reason, we may replace the integral path $\int_0^{\infty}$ with $\int_0^{\infty\exp(i\theta)}$ $(0\le\theta\le\pi)$ or replace $\int_0^{i\infty}$ with $\int_0^{i\infty\exp(-i\theta)}$ $(|\theta|\le\frac{\pi}{2})$ if $|\arg z|<\frac{\pi}{2}$ [Guo–Wang [48, p.356, l.−7–l.−2]; Watson [123, p.164, l.4–l.14]]. Thus, the range of $\theta$ depends on which reference half-line we choose. For the notation $i\infty\exp(-i\omega)$, $[0,i\infty)$ represents our reference half-line and $\omega$ is used to satisfy the condition $|\arg z-\omega|<\frac{\pi}{2}$.
By taking $z_0\in\{z:|\arg z|<\frac{\pi}{2}\}\cap\{z:|\arg z-\omega|<\frac{\pi}{2}\}$ $(|\omega|<\frac{\pi}{2})$, we may extend the domain of $z$ from $\{z:|\arg z|<\frac{\pi}{2}\}$ to $\{z:|\arg z|<\frac{\pi}{2}\}\cup\{z:|\arg z-\omega|<\frac{\pi}{2}\}$ [Watson [123, p.164, l.4–l.14]]. For example, if we let $z_0\to$ the positive imaginary axis and let $\omega\to\frac{\pi}{2}$, we may extend the range of $z$ from $\{z:|\arg z|<\frac{\pi}{2}\}$ to $\{z:-\frac{\pi}{2}<\arg z<\pi\}$; if we let $z_0\to$ the negative imaginary axis and let $\omega\to-\frac{\pi}{2}$, we may extend the range of $z$ from $\{z:|\arg z|<\frac{\pi}{2}\}$ to $\{z:-\pi<\arg z<\frac{\pi}{2}\}$. In the former case, $\int_0^{i\infty\exp(i\theta)}=\int_0^{i\infty}$ $(-\frac{\pi}{2}\le\theta\le\pi)$; in the latter case, $\int_0^{i\infty\exp(i\theta)}=\int_0^{i\infty}$ $(-\pi\le\theta\le\frac12\pi)$. Thus, we will have more available half-lines along which the integrals equal the integral along the original positive imaginary axis.

Remark 65. How do we obtain the expression $\int_{\infty i}^{0}e^{-3\pi i(\nu-1/2)}(1-t^2)^{\nu-1/2}\,dt$ given in Watson [123, p.164, l.−7]?

Solution. Let $t=is$. Then $t^2-1<0$. Hence $|t^2-1|=1-t^2$.
$\arg t=-3\pi/2$.
$t+1=|t+1|e^{i(-3\pi/2-\eta)}$. $t-1=|t-1|e^{i(-3\pi/2+\eta)}$.
$t^2-1=|t^2-1|e^{-3\pi i}=(1-t^2)e^{-3\pi i}$.

Remark 66. How do we obtain the expression $\int_0^{-1}e^{-3\pi i(\nu-1/2)}(1-t^2)^{\nu-1/2}\,dt$ given in Watson [123, p.164, l.−6]?

Solution. By continuity, $\arg(0^-)=-\frac32\pi$. Consequently, if $t\in(-1,0)$, $\arg(t+1)=-\frac32\pi-\frac{\pi}{2}=-2\pi$.
$t+1=|t+1|e^{-2\pi i}$.
$t-1=|t-1|e^{-\pi i}$.

Remark 67. The natural domain of definition for $H_\nu^{(1)}(z)$ is $-\pi<\arg z<2\pi$; the natural domain of definition for $H_\nu^{(2)}(z)$ is $-2\pi<\arg z<\pi$ [Watson [123, p.167, l.10–l.12]; Guo–Wang [48, p.373, l.−5–l.−4]]. If we use the method of analytic continuation given in Watson [123, §6·12] alone, it would be difficult to see the above results. However, the establishment of the two relations given in Guo–Wang [48, p.373, (16) & (17)] greatly simplifies the proof of the above results. This is because the natural domain of definition for $W_{k,m}(z)$ is $|\arg z|<\frac32\pi$ [Watson–Whittaker [122, §16·4]].
Remark 68. (Integration on a Riemann surface with branch points) If we reduce a contour integral on a Riemann surface to an integral along a line segment, the value of the latter integral may depend on which sheet the line segment is in, while the former integral is an invariant quantity. When we reduce a contour integral on a Riemann surface to an integral along a line segment, we often have to degenerate a part of the contour to a point. In order to make the argument of points along the contour continuous and simplify the calculation of these arguments, we should restore the degenerated point to its corresponding nondegenerate part. For example, in order to prove Watson [123, p.168, (3)], we must prove that
$$\int_{\infty\exp i\beta}^{(0+)}e^{-u}(-u)^{\nu-1/2}\left(1+\frac{iu}{2z}\right)^{\nu-1/2}du=\left[e^{-i\pi(\nu-1/2)}-e^{i\pi(\nu-1/2)}\right]\int_0^{\infty\exp i\beta}e^{-u}u^{\nu-1/2}\left(1+\frac{iu}{2z}\right)^{\nu-1/2}du.$$

Proof. Let $I=\infty\exp i\beta$, $A=\delta e^{i(\beta-2\pi)}$, $B=\delta e^{i(\beta-\pi)}$, $C=\delta e^{i\beta}$; let $IA$ and $CI$ be line segments; let $AB$ and $BC$ be counterclockwise half-circles.

Note that $IA$ and $CI$ are on different sheets.
$\int_{\infty\exp i\beta}^{(0+)}=\int_{IAB}+\int_{BCI}$.
We take the argument of $u$ in the range between $\beta-2\pi$ and $\beta$.
$$\int_{BCI}=\int_0^{\infty\exp i\beta}e^{-u}(-u)^{\nu-1/2}\left(1+\frac{iu}{2z}\right)^{\nu-1/2}du=(e^{-\pi i})^{\nu-1/2}\int_0^{\infty\exp i\beta}e^{-u}u^{\nu-1/2}\left(1+\frac{iu}{2z}\right)^{\nu-1/2}du.$$
$$\int_{IAB}=\int_{\infty\exp i\beta}^{0}e^{-u}(-u)^{\nu-1/2}\left(1+\frac{iu}{2z}\right)^{\nu-1/2}du=(e^{\pi i})^{\nu-1/2}\int_{\infty\exp i\beta}^{0}e^{-u}u^{\nu-1/2}\left(1+\frac{iu}{2z}\right)^{\nu-1/2}du$$
$$=-(e^{\pi i})^{\nu-1/2}\int_0^{\infty\exp i\beta}e^{-u}u^{\nu-1/2}\left(1+\frac{iu}{2z}\right)^{\nu-1/2}du.$$
The ending point of the integration path $[\infty\exp i\beta,0]$ comes from the ending point of the integration path $IAB$, namely, $B$. So the argument of $u$ at $u=0$ is $\beta-\pi$. Then the argument of $-u$ at $u=0$ is $\beta$. Thus, $\beta-(\beta-\pi)=\pi$.

Similarly, in order to prove Guo–Wang [48, p.371, (11)], we must prove the equality given in Guo–Wang [48, p.371, l.7–l.8].

Proof. Let $I=1+i\infty$, $A=1+\delta e^{-3\pi i/2}$, $B=1+\delta e^{-\pi i}$, $C=1+\delta e^{-\pi i/2}$; let $IA$ and $CI$ be line segments; let $AB$ and $BC$ be counterclockwise half-circles.
$\int_{1+i\infty}^{(1+)}=\int_{IAB}+\int_{BCI}$.
Based on the restriction given in Guo–Wang [48, p.371, l.10], at the beginning point of the integration path, the argument of $t-1$ is $-3\pi/2$, while the argument of $1-t$ is $-\pi/2$, so
$$\int_{IAB}=\int_{1+i\infty}^{1}e^{izt}(t^2-1)^{\nu-1/2}\,dt=(e^{-\pi i})^{\nu-1/2}\int_{1+i\infty}^{1}e^{izt}(1-t^2)^{\nu-1/2}\,dt.$$
$$\int_{BCI}=\int_{1}^{1+i\infty}e^{izt}(t^2-1)^{\nu-1/2}\,dt=-\int_{1+i\infty}^{1}e^{izt}(t^2-1)^{\nu-1/2}\,dt=-(e^{\pi i})^{\nu-1/2}\int_{1+i\infty}^{1}e^{izt}(1-t^2)^{\nu-1/2}\,dt.$$
This is because at the beginning point of the integration path, the argument of $t-1$ is $\pi/2$, while the argument of $1-t$ is $-\pi/2$.

Remark. The above proof shows that $\int_{1+i\infty}^{(1+)}e^{izt}(t^2-1)^{\nu-1/2}\,dt=(S-T)U$, where $S=(e^{-\pi i})^{\nu-1/2}$, $T=(e^{\pi i})^{\nu-1/2}$, $U=\int_{1+i\infty}^{1}e^{izt}(1-t^2)^{\nu-1/2}\,dt$. Suppose we remove the restriction given in Guo–Wang [48, p.371, l.10]; say, at the beginning point of the integration path in $U$, we let the argument of $1-t$ be $-5\pi/2$. Then $U$ will gain a factor of $(e^{-2\pi i})^{\nu-1/2}$, $S$ will become $e^{\pi i(\nu-1/2)}$, and $T$ will become $e^{3\pi i(\nu-1/2)}$. Thus, no matter what value we choose for $U$, $(S-T)U$ is an invariant quantity.

Remark 69. The convergence at $t=\infty$ of the two integrals given in Watson [123, p.169, l.−3–l.−2] can be proved using integration by parts. Dirichlet's test for uniform convergence [Bromwich [11, p.114, (3)]] inspired this idea.

Remark 70. $H_\nu^{(1)}(z)=\frac{2(z/2)^{\nu}}{\Gamma(\nu+1/2)\Gamma(1/2)}\int_{1+\infty i}^{1}e^{izt}(1-t^2)^{\nu-1/2}\,dt$ [Watson [123, p.170, l.13]].

Proof. $H_\nu^{(1)}(z)=\frac{\Gamma(1/2-\nu)(z/2)^{\nu}}{\pi i\,\Gamma(1/2)}\int_{1+\infty i}^{(1+)}e^{izt}(t^2-1)^{\nu-1/2}\,dt$ [Watson [123, p.166, (4)]].
$$\int_{1+\infty i}^{(1+)}e^{izt}(t^2-1)^{\nu-1/2}\,dt=\left(1-e^{2\pi i(\nu-1/2)}\right)\int_{1+\infty i}^{1}e^{izt}(t^2-1)^{\nu-1/2}\,dt.$$
$$1-e^{2\pi i(\nu-1/2)}=e^{\pi i(\nu-1/2)}\cdot2i\sin[(1/2-\nu)\pi]=(-1)^{\nu-1/2}\cdot2i\cdot\frac{\pi}{\Gamma(1/2-\nu)\Gamma(1/2+\nu)}$$
[Guo–Wang [48, p.99, l.4]].

Remark. Every formula for $H_\nu^{(1)}(z)$, $J_\nu(z)$ and $Y_\nu(z)$ given in Watson [123, p.170, l.−18–l.−1] should have had the factor 2 added on the right-hand side of its equality. The step given in

Guo–Wang [48, p.371, l.8] gives the result, but fails to provide a method of getting the answer. According to the way that the solution is approached, very likely the reasoning contains guesses, and thus may be incorrect.

Remark 71. The formula given in Watson [123, p.175, l.−17] follows from Guo–Wang [48, p.102, (5)]. Using Rudin [99, p.27, Theorem 1.34] to interchange the order of summation and integration on the right of the equality given in Watson [123, p.175, l.−14], we obtain Watson [123, p.176, (1)]. It is unnecessary and not worthwhile to prove so rough a statement as the interchange of order of summation and integration by using such a delicate argument as in Watson [123, p.175, l.−10–p.176, l.1]. This shows exactly the advantage of Rudin [99, p.27, Theorem 1.34]. For the legacy of past mathematicians, we should separate wheat from chaff. For chaff, we do not need detailed explanations [Guo–Wang [48, p.357, l.−10–p.358, l.3]].

Remark 72. This is permissible if $\Re(\nu)>-1$ [Watson [123, p.177, l.−17]].

Proof. Take $\delta>0$ such that $\Re(\nu)+1>\frac{\delta}{2}+\frac{\delta^2}{2}$.
Let $C_R=\{t=c+Re^{i\theta}\mid\frac{\pi}{2}+R^{-\delta}\le\theta\le\pi\}$.
Then $|e^t|\le\exp(c-R\sin R^{-\delta})\le\exp(c-\frac{2}{\pi}R^{1-\delta})$.
$|\int_{C_R}e^tt^{-\nu-1}\,dt|\to0$ as $R\to+\infty$ because $\exp(-\frac{2}{\pi}R^{1-\delta})\le(\frac{2}{\pi}R^{1-\delta})^{-(1+\delta/2)}$.

Remark 73. $J_\nu^2(z)+J_{\nu+1}^2(z)\sim\frac{2}{\pi z}$ [Watson [123, p.200, l.16]] follows from Jackson [60, p.114, (3.91)(i)].

Remark 74. This expression is of lower order of magnitude (when $|z|$ is large) than the error due to the stopping at any definite term of the expression (1) [Watson [123, p.201, l.−2–p.202, l.1]].

Proof. $0<\arg z<\pi$.
$-\frac{\pi}{2}<\arg(-iz)<\frac{\pi}{2}\Rightarrow\Re(-iz)>0$.
$\frac{\pi}{2}<\arg(iz)<\frac{3\pi}{2}\Rightarrow\Re(iz)<0$.

Remark 75. Fix $z$ and let $\nu\to\infty$. Then
$$J_\nu(z)\sim\exp\left[\nu+\nu\log\left(\frac z2\right)-\left(\nu+\frac12\right)\log\nu\right]\cdot\left[c_0+\frac{c_1}{\nu}+\frac{c_2}{\nu^2}+\cdots\right],\quad\text{where }c_0=\frac{1}{\sqrt{2\pi}}$$
[Watson [123, p.225, l.−12–l.−11]].

Proof. By Watson [123, p.40, (5) & (7)], $J_\nu(z)=\frac{(z/2)^{\nu}}{\Gamma(\nu+1)}\left[1+\sum_{m=1}^{\infty}\frac{(-)^m(z/2)^{2m}}{m!(\nu+1)(\nu+2)\cdots(\nu+m)}\right]$.
By Guo–Wang [48, p.155, (6)], $\log\Gamma(\nu+1)=(\nu+\frac12)\log\nu-\nu+\frac12\log(2\pi)+O(\nu^{-1})$.

Remark 76. (Differentiation of a rational function whose denominator is a high power of a polynomial) Let $u_3=\frac{4z+10z^3+z^5}{8(1-z^2)^4}$ [Watson [123, p.226, l.−8]]. Find $u_3'$.

Solution. One may use the product rule $(fg)'=f'g+fg'$ and let $f=4z+10z^3+z^5$, $g=(8(1-z^2)^4)^{-1}$; or use the quotient rule $\left(\frac fg\right)'=\frac{gf'-fg'}{g^2}$ as follows:
$$u_3'=\frac18\cdot\frac{(1-z^2)^4(4+30z^2+5z^4)+8z(1-z^2)^3(4z+10z^3+z^5)}{(1-z^2)^8}=\frac18\cdot\frac{(1-z^2)(4+30z^2+5z^4)+8z(4z+10z^3+z^5)}{(1-z^2)^5}.$$

Remark. In the last step, one should not expand the expressions in the numerator of the previous step. Cancel the common factor of the numerator and the denominator first. This may avoid a lot of unnecessary computations. If one were to expand the expressions in the numerator of

127 3 5 7 9 11 13 the first step, the resulting u would be 64z+368z 1032z +273z +733z 381z 25z . The complicated 4 − 128(1 z2)17/2 − − expression would make it more difficult to identify it with− the value given in Watson [123, p.226, l. 8]. − Remark 77. The “=” sign given in Watson [123, p.227, (5)] should have been replaced with “ ” because the proof of the formula requires the use of Stirling’s Theorem. ∼
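The computation of $u_3'$ in Remark 76 can be checked mechanically. The sketch below, with hypothetical function names, compares the unsimplified quotient-rule expression with the simplified one in exact rational arithmetic; the same functions also accept floats, so the simplified form can be compared with a central difference of $u_3$.

```python
from fractions import Fraction

def f(z):
    """Numerator polynomial 4z + 10z^3 + z^5 of u_3."""
    return 4 * z + 10 * z ** 3 + z ** 5

def df(z):
    """Derivative of the numerator: 4 + 30z^2 + 5z^4."""
    return 4 + 30 * z ** 2 + 5 * z ** 4

def u3(z):
    """u_3 = (4z + 10z^3 + z^5) / (8 (1 - z^2)^4)."""
    return f(z) / (8 * (1 - z ** 2) ** 4)

def du3_unsimplified(z):
    """Quotient rule with g = (1 - z^2)^4 and g' = -8z (1 - z^2)^3, nothing cancelled."""
    g = (1 - z ** 2) ** 4
    dg = -8 * z * (1 - z ** 2) ** 3
    return (df(z) * g - f(z) * dg) / (8 * g ** 2)

def du3_simplified(z):
    """After cancelling the common factor (1 - z^2)^3."""
    return ((1 - z ** 2) * df(z) + 8 * z * f(z)) / (8 * (1 - z ** 2) ** 5)
```

Evaluating both derivative functions at a rational point such as `Fraction(1, 3)` should give identical fractions, which confirms that cancelling before expanding loses nothing.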

Remark 78. By Guo–Wang [48, p.384, l.−12–l.−9], the points $[u_0,v_0,\Re f(w_0)]$ are saddle points, or passes on the surface [Watson [123, p.235, l.20]]. By Guo–Wang [48, p.381, l.−4–p.382, l.6], $\Re f(w)$ must change as rapidly as possible [Watson [123, p.235, l.−13]]. In each of the following cases, the integral contour can be replaced with the steepest descent without changing the lower limit and the upper limit of the integral:
Case $x/\nu<1$ [Watson [123, p.238, Fig. 16]]: the contour given in Watson [123, p.176, (3)].
Case $x/\nu>1$ [Watson [123, p.239, Fig. 17]]: each contour in Watson [123, p.178, (2) & (3)].
Case $x/\nu=1$ [Watson [123, p.240, Fig. 18]]: each contour in Watson [123, p.176, (3); p.178, (2) & (3)].
The arrows given in Watson [123, p.238, Fig. 16; p.239, Fig. 17; p.240, Fig. 18] can be explained by Guo–Wang [48, p.383, l.3–l.4; l.8–l.9]. The $\tau$'s given in Watson [123, p.238, l.4; p.239, l.11] come from Guo–Wang [48, p.384, l.6].

Remark 79. $|\theta|\le[\exp(t^2/4)-1]/(2n-2)$ [Watson [123, p.273, l.−10]] is false when $n=2$. The inequality should have been corrected as $|\theta|\le[\exp(t^2/4)-1]/(n-1)$.

Remark 80. $\varepsilon_nJ_n(z)O_n(t)=\frac{z^n}{t^{n+1}}\left\{1-\frac{z^2-t^2}{4n}+O(n^{-2})\right\}$ [Watson [123, p.273, l.−4]].

Proof. $\theta=\frac{1}{n-1}\cdot\frac{t^2}{4}+\frac{1}{2(n-1)(n-2)}\left(\frac{t^2}{4}\right)^2+\cdots$ [Watson [123, p.273, (9)]].
The desired result follows from Rudin [97, p.64, Definition 3.48].

Remark 81. (Want to prove uniform convergence when convergence is given)
$\sum a_n\left(\frac zt\right)^n$ converges uniformly in $(z,t)$ $\Leftrightarrow$ $\sum a_nJ_n(z)O_n(t)$ converges uniformly in $(z,t)$ [Watson [123, p.274, l.3–l.5]].

Proof. Since $\left|\frac zt\right|\le\left(\limsup_{n\to\infty}\sqrt[n]{|a_n|}\right)^{-1}$, $\sum\frac{a_nz^n}{t^{n+1}}O(n^{-2})$ converges uniformly in $(z,t)$.
$\Rightarrow$: The desired result follows from Watson [123, p.273, l.−3] and Bromwich [11, p.113, l.−2].
$\Leftarrow$: We must assume that $|t|$ is bounded.
$\sum\frac{a_nz^n}{t^{n+1}}\left(1-\frac{z^2-t^2}{4n}\right)$ converges uniformly in $(z,t)$.
$$\left|\frac{z^2-t^2}{4}\sum_{n=m+1}^{m+p}\frac{a_nz^n}{nt^{n+1}}\right|\le|t|^2\left[1+\left(\limsup_{n\to\infty}\sqrt[n]{|a_n|}\right)^{-2}\right](4n)^{-1}\left|\sum_{n=m+1}^{m+p}\frac{a_nz^n}{t^{n+1}}\right|$$
[Bromwich [11, p.113, l.−2]].

Remark 82. The convergence of $\sum b_n/n$ [Watson [123, p.274, l.−4]] follows from Bromwich [11, p.113, (2)].

Remark 83. (Series rearrangement)
$$J_0(z)[O_0(t)-tO_1(t)]+J_1(z)\left[2O_1(t)-\tfrac12tO_2(t)+\tfrac12\right]+\sum_{n=2}^{\infty}J_n(z)\left[2O_n(t)-\frac{tO_{n+1}(t)}{n+1}-\frac{tO_{n-1}(t)}{n-1}+\frac{2n\sin^2(n\pi/2)}{n^2-1}\right]=0$$
[Watson [123, p.275, l.9–l.10]].

Proof.
$$\sum_{n=1}^{m}[J_{n-1}(z)+J_{n+1}(z)][tO_n(t)-\cos^2(n\pi/2)]/n$$
$$=J_0(z)tO_1(t)+J_1(z)[tO_2(t)-1]/2+\sum_{n=2}^{m-1}J_n(z)\left[\frac{tO_{n+1}(t)}{n+1}+\frac{tO_{n-1}(t)}{n-1}-\frac{2n\sin^2(n\pi/2)}{n^2-1}\right]$$
$$+J_m(z)\{tO_{m-1}(t)-\cos^2[(m-1)\pi/2]\}/(m-1)+J_{m+1}(z)[tO_m(t)-\cos^2(m\pi/2)]/m.$$

Remark 84. (Detailed analysis) The statements given in Watson [123, p.275, l.−3–l.−1] are oversimplified explanations of a complex argument. A detailed analysis should be as follows:

Proof. By the proof of Watson [123, p.14, (1)], $\sum_{n=0}^{\infty}\varepsilon_n\cos^2(n\pi/2)\cdot J_n(z)$ converges uniformly on $|z|\le B$, where $B$ is an arbitrary positive constant.
Let $F(n,t)=2O_n(t)-\frac{tO_{n+1}(t)}{n+1}-\frac{tO_{n-1}(t)}{n-1}+\frac{2n\sin^2(n\pi/2)}{n^2-1}$.
$J_0(z)[O_0(t)-tO_1(t)]+J_1(z)[2O_1(t)-\frac12tO_2(t)+\frac12]+\sum_{n=2}^{\infty}J_n(z)F(n,t)$ converges uniformly in $|z|\le r<R\le|t|$ [Watson [123, p.272, l.−2–l.−1]].
Assume $\exists n\ge2:F(n,t)\not\equiv0$.
Let $n_0=\min\{n\ge2\mid F(n,t)\not\equiv0\}$.
($\sum_{n=0}^{\infty}a_nz^{m+n}$ converges uniformly on $|z|\le1$, where $m\in\mathbb{N}$) $\Rightarrow$ ($\sum_{n=0}^{\infty}a_nz^n$ converges uniformly on $|z|\le1$).
Similarly, $F(n_0,t_0)+\sum_{n=n_0+1}^{\infty}[J_n(z)/J_{n_0}(z)]F(n,t_0)$ converges to 0 uniformly on a small neighborhood of $z=0$, where $t_0$ satisfies $F(n_0,t_0)\ne0$.
Since ($[|z|\to0]\Rightarrow[|J_n(z)|\to\frac{1}{n!}(\frac{|z|}{2})^n]$) [Watson [123, p.40, (8)]], we reach a contradiction.

Remark 85. The formula for $(\sqrt{1+z^2}+z)^m$ given in Hobson [58, p.277, l.6–l.8] is valid for $z=i\sin\phi$. By Rudin [99, p.225, Theorem 10.18], the formula for $(\sqrt{1+z^2}+z)^m$ is valid for all the points on the Riemann surface on $\mathbb{C}\setminus\{0\}$. We may use this formula to prove that $O_n(z)$ defined by Watson [123, p.280, l.9] is a polynomial in $1/z$ of degree $n+1$ [Watson [123, p.280, l.10–l.11]].

Remark 86. $\left(\int_C-\int_c\right)e^{z\psi(w)-\alpha x}\frac{dw}{w-\psi^{-1}(x)}=\int^{(\psi^{-1}(x)+)}e^{z\psi(w)-\alpha x}\frac{dw}{w-\psi^{-1}(x)}$ [Watson [123, p.280, l.4–l.5]].

Proof. Let $\theta\in[0,2\pi)$. The ray $\{te^{i\theta}\mid t\ge0\}$ intersects $C$ and $c$ at $A_\theta$ and $B_\theta$ respectively. Since $|\psi^{-1}(x)|e^{i\theta}$ lies between $A_\theta$ and $B_\theta$, the circle $|w|=|\psi^{-1}(x)|$ lies between $C$ and $c$.

Remark 87. Why can we use Sonine's general theorem [Watson [123, p.280, l.18–l.24]] to prove Watson [123, p.271, (1)] [Watson [123, p.281, l.9–l.16]]? By Watson [123, p.281, l.18], both sides of the equality given in Watson [123, p.271, (1)] are analytic functions of $t$. By Rudin [99, p.225, Theorem 10.18], it suffices to prove Watson [123, p.271, (1)] when $t>0$.
Thus, the hypothesis ∞ ∞ given in Watson [123, p.281, l.7] is satisfied; ∑n=0 and 0 given in Watson [123, p.281, l.2–l.3] can be interchanged. R u ∞ e− du 2 C C Remark 88. 0 (1 t2)z+2tu is convergent if (1 t )z/t x x 0 [Watson [123, p.282, l.1–l.2]]. − − ∈ \{ ∈ | ≤ } R (1 t2)z Proof. Let −2t = a + bi, where a,b R. u u ∈ If b = 0, then ∞ e− du ∞ e− du. 6 0 | (a+bi)+u | ≤ 0 b ∞ e u | | ∞ e u If b = 0, then aR > 0 and −R du − du. 0 | (a+bi)+u | ≤ 0 a R R u ∞ n n 2 ∞ e− du Remark 89. ∑ ( ) εnt On(z) is an asymptotic expansion of (1+t ) 2 for small positive values n=0 − 0 (1 t )z+2tu of t when argz < π [Watson [123, p.282, l.3–l.4]]. − | | R Proof. With integration by parts, we have u m ∞ e− du n m (2t) n+1 n+1 ∞ 2 (n+2) u 0 (1 t2)z+2tu = ∑m=0( ) m! [(1 t2)z]m+1 +( ) (n+1)!(2t) 0 [(1 t )z+2tu]− e− du. − − −u − n − ∞ e− du ∞ n (2t) RThe asymptotic series of 0 (1 t2)z+2tu is ∑n=0( ) n! [(1 t2)z]n+1 , i.e.,R − − − ∞ n ∞ (n+1) (n+k) 2k+n (n+1) ∑ ( ) n!∑ ··· R t z . n=0 − k=0 k! −

129 u 2 ∞ e− du The asymptotic series of (1 +t ) 0 (1 t2)z+2tu is − ∞ 2m 2m ∞ (2m+k 1)!(2k+2m) 2k+2m (2m+1) ∑m=0( ) 2 ∑k=0 − k! R t z− − m k k m + ∑∞ ( )2m+122m+1 ∑ (2 + )!(2 +2 +1)t2k+2m+1z (2m+2), i.e., m=0 − k=0 k! − s/2 2m s(s/2 m 1)! s (2m+1) 1/z + ∑s even 2 ∑m=0 2 (s/2− m−)! t z− ≥ − (s 1)/2 2m 1 s(s/2+m 1/2)! s 2m 2 − + ( + ) + ∑s odd ∑m=0 2 (s/2 m −1/2)! t z− . − − Remark 90. In order to be consistent with the original coordinate system, the triple integral π π/2 π/2 isinθ(zcosφ+Z sinφ cosψ) 2µ 2ν+2 2ν 0 0 π/2 e cos θ sin θ sin ψdφdθdψ given in Watson [123, p.376, l. 5] should− have been corrected as R R R −π π/2 π isinθ(zcosφ+Z sinφ cosψ) 2µ 2ν+2 2ν 0 0 0 e cos θ sin θ sin ψdφdθdψ. π/2 π/2 2π iϖ sinθ sinφ cosψ 2ν 2µ 2ν+2 Remark 91. R0 R 0 R 0 e cos φ sinφ cos θ sin θdψdφdθ π/2 π π iϖ sinθ cosφ 2µ+1 2ν+1 2µ 2ν =R 0 R 0 R 0 e sin θ cos θ sin φ sin ψdφdψdθ [Watson [123, p.377, l.10– l.12]]. R R R π/2 π/2 2π iϖ sinθ sinφ cosψ 2ν 2µ 2ν+2 Proof. 0 0 0 e cos φ sinφ cos θ sin θdψdφdθ π/2 iϖ sinθl 2ν 2µ 2ν+2 = 0 R n 0Re R n cos θ sin θdωdθ π/2 ≥ iϖ sinθn 2ν 2µ 2ν+2 = R0 RRm 0 e m cos θ sin θdωdθ π/2 π ≥π iϖ sinθ cosφ 2ν 2ν 2µ 2ν+2 = R0 RR0 0 e sin φ sin ψ cos θ sin θ sinφdφdψdθ π eiϖ sinθ cosφ l2ν+1n2µ d sin2ν d = R0 Rn 0R,l 0 ω ψ ψ π ≥ ≥ iϖ sinθ cosφ 2ν+1 2µ 2ν = R0 RRm 0,n 0 e n m dω sin ψdψ. ≥ ≥ Remark 92. (UsingR RR formulas in a table without care may easily result in mistakes) 1 π dθ = 1 , where the value of the square root is taken which makes a+√a2 + b2 > π 0 a ibcosθ √a2+b2 | | b [Watson− [123, p.384, l.6; l.12–l.13]]. | |R Proof. Let z = eiθ . Then ib ib 1 2az ibz2 ib a ibcosθ = a 2 (z + z¯)= a 2 (z + z )= −2z − . π− dθ 1−2π dθ − 1 π 0 a ibcosθ = 2 0 a ibcosθ [Let f (θ)= a ibcosθ . 
Then $f(\pi-\theta) = f(\pi+\theta) \Rightarrow \int_0^{\pi} f(\pi-\theta)\,d\theta = \int_0^{\pi} f(\pi+\theta)\,d\theta$]
$= \frac{1}{b}\oint_{|z|=1}\frac{dz}{(z-\alpha)(z-\beta)}$ [where $\alpha,\beta = i\left(-\frac{a}{b} \pm \sqrt{1+\frac{a^2}{b^2}}\right)$]
$= \frac{\pi}{\sqrt{a^2+b^2}}$ [by the residue theorem if we take the value of the square root such that $|\alpha| = \left|\frac{a-\sqrt{a^2+b^2}}{b}\right| < 1$, or equivalently $|a+\sqrt{a^2+b^2}| > |b|$ since $(a+\sqrt{a^2+b^2})(a-\sqrt{a^2+b^2}) = -b^2$].
Remark 1. The above argument is based on Conway [19, p.112, Example 2.9].
Remark 2. Using formulas in a table without care may easily result in mistakes. One is under the impression that once the solution form is obtained, the actual solution is determined. This is not so. If the resulting function is multivalued and the formula fails to indicate which value to choose, then the formula is useless. One should find a careful method to determine the correct value. If one uses such an unfinished formula in a proof, then the proof is incorrect. Such a mistake is often difficult to detect. Here are some examples. Gradshteyn–Ryzhik [44, formula 6.611.1] fails to indicate which value of the square root to choose in order to get the correct answer. Because $(1+z)^{-1/2}$ is a multivalued function, without assigning a specific value to $(1+z)^{-1/2}$, it would be incorrect to prove $\frac{1}{a}F(\frac{1}{2},1,1,-\frac{b^2}{a^2}) = \frac{1}{\sqrt{a^2+b^2}}$ [Guo–Wang [48, p.403, (3)]] by using $(1+z)^{\alpha} = F(-\alpha,\beta,\beta,-z)$ [Guo–Wang [48, p.137, (10)]]. For example, one can make $(1+z)^{-1/2}$ a single-valued function by defining it as in the binomial theorem. However, one has to pay a price for doing so. For example, there are two methods to calculate $\int_0^{\infty} e^{-at}J_0(bt)\,dt$: one is using the binomial theorem to calculate $\frac{1}{a}(1+\frac{b^2}{a^2})^{-1/2}$ (one cannot calculate the square root in any other way) [Guo–Wang [48, p.403, l.11, (3)]]; the other is interpreting the square root in the answer $\frac{1}{\sqrt{a^2+b^2}}$ [Watson [123, p.384, l.3–l.6]] in a more effective way: find the two square roots of $a^2+b^2$ and then choose the one satisfying $|a+\sqrt{a^2+b^2}| > |b|$.
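Example (numerical check; a sketch, not taken from the cited sources). The branch rule of Remark 92 can be tested for complex $a$, $b$. The sample values of `a` and `b`, the helper names, and the midpoint rule are assumptions chosen for illustration; the values are picked so that $a - ib\cos\theta$ does not vanish on $[0,\pi]$ and so that the principal square root is the wrong choice.

```python
import cmath
import math

def integral_mean(a, b, n=4000):
    """Midpoint-rule value of (1/pi) * int_0^pi dtheta / (a - i*b*cos(theta))."""
    h = math.pi / n
    total = 0j
    for k in range(n):
        theta = (k + 0.5) * h
        total += 1.0 / (a - 1j * b * math.cos(theta))
    return total * h / math.pi

def selected_root(a, b):
    """Square root s of a^2 + b^2 chosen so that |a + s| > |b|."""
    s = cmath.sqrt(a * a + b * b)  # principal value
    if abs(a + s) <= abs(b):
        s = -s                     # switch to the other square root
    return s

# Hypothetical sample point: here the principal root violates |a + s| > |b|.
a, b = -1 + 0.5j, 2 - 1j
numeric = integral_mean(a, b)
principal = cmath.sqrt(a * a + b * b)
chosen = selected_root(a, b)
print("numeric integral :", numeric)
print("1/chosen root    :", 1 / chosen)
print("1/principal root :", 1 / principal)
```

Under these assumptions the numerical value should agree with the reciprocal of the selected root and differ in sign from the reciprocal of the principal root, which is the kind of mistake Remark 2 warns about.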
Remark 93. $\frac{[b/(2a)]^{\nu}\,\Gamma(\mu+\nu)}{a^{\mu}\,\Gamma(\nu+1)}\left(1+\frac{b^2}{a^2}\right)^{1/2-\mu}{}_2F_1\left(\frac{\nu-\mu+1}{2},\frac{\nu-\mu}{2}+1;\nu+1;-\frac{b^2}{a^2}\right) = \frac{(b/2)^{\nu}\,\Gamma(\mu+\nu)}{(a^2+b^2)^{(\mu+\nu)/2}\,\Gamma(\nu+1)}\,{}_2F_1\left(\frac{\nu+\mu}{2},\frac{\nu-\mu+1}{2};\nu+1;\frac{b^2}{a^2+b^2}\right)$ [Watson [123, p.385, l.13–l.14]] follows from Guo–Wang [48, p.143, (10)].
Remark. We must prove that $1+\frac{b^2}{a^2}$ is not a negative real number [Guo–Wang [48, p.143, (10)]].

Proof. If $1+\frac{b^2}{a^2}$ were a negative real number, then we would have $a^2+b^2 = (-t)a^2$, where $t > 0$. Namely, $-(1+t)a^2 = b^2$. Then $|b| > |a|$ and $\arg b^2 = \arg a^2 \pm \pi$. Consequently, $\arg b = \arg a \pm \pi/2$. Then $|b|\cos\alpha = |\Im b| < \Re a = |a|\cos\alpha \Rightarrow |b| < |a|$. We would have a contradiction.
Remark 94. Watson [123, p.386, l.3–l.4, (5) & (6)] follow from Guo–Wang [48, p.137, (10)].
Remark 95. Watson [123, p.386, l.12–l.13, (7) & (8)] follow from Guo–Wang [48, p.209, the second formula of Exercise 22].
Remark 96. Watson [123, p.386, l. −10, (9)] follows from Watson [123, p.386, l.3, (5)].
Remark 97. Watson [123, p.387, l.7–l.8, (1)] follows from Watson [123, p.77, l.19–l.20; p.385, (2)] and Guo–Wang [48, p.143, (9); p.285, Exercise 29].
Remark 98. Since, by a change of variable, the integrals are expressible in terms of the ratio of b to a, no generality is lost [Watson [123, p.387, l. −8–l. −7]].
Remark. By Watson [123, p.385, (2)],
$\int_0^{\infty} e^{-at}J_{\nu}(bt)\,t^{\mu}\,dt = \frac{2^{-\nu}(b/a)^{\nu}\,\Gamma(\mu+1+\nu)}{a^{\mu+1}\,\Gamma(\nu+1)}\,{}_2F_1\left(\frac{\mu+1+\nu}{2},\frac{\mu+\nu+2}{2};\nu+1;-\frac{b^2}{a^2}\right)$.
During the proof of Watson [123, p.387, l.7–l.8, (1)], as far as a, b are concerned, the terms $b^{\nu}$, $a^{-\nu-\mu-1}$ will disappear; only the term ${}_2F_1\left(\frac{\mu+1+\nu}{2},\frac{\mu+\nu+2}{2};\nu+1;-\frac{b^2}{a^2}\right)$ remains. Since all we care about is the quotient $\frac{b^2}{a^2}$, we may assume that $a^2+b^2 = 1$ without loss of generality. In order to take advantage of the formula $\cosh^2\alpha - \sinh^2\alpha = 1$, we let $a = \cosh\alpha$, $b = i\sinh\alpha$ [Watson [123, p.387, l.3]]. Then $-\frac{b^2}{a^2} = \tanh^2\alpha$. Since $\pi i$ is a period of tanh, we may assume that $-\frac{\pi}{2} \le \Im\alpha \le \frac{\pi}{2}$ without loss of generality. Thus, $\Re(a \pm ib) = \Re(\cosh\alpha \mp \sinh\alpha) = \Re(e^{\mp(\Re\alpha+i\Im\alpha)}) > 0$. Consequently, the conditions given in Watson [123, p.385, l.7] are satisfied.
Remark 99. $\int_0^{\infty} e^{-t\cosh\alpha}K_{\nu}(t\sinh\alpha)\,t^{\mu}\,dt = \frac{\sin\mu\pi}{\sin(\mu+\nu)\pi}\,\Gamma(\mu-\nu+1)\,Q_{\mu}^{\nu}(\cosh\alpha)$ [Watson [123, p.387, l.13, (2)]] should have been corrected as $\int_0^{\infty} e^{-t\cosh\alpha}K_{\nu}(t\sinh\alpha)\,t^{\mu}\,dt = e^{-\nu\pi i}\,\Gamma(\mu-\nu+1)\,Q_{\mu}^{\nu}(\cosh\alpha)$.
Remark 100. $\int_0^{\infty} e^{-t\cos\beta}Y_{\nu}(t\sin\beta)\,t^{\mu}\,dt = -\frac{\sin\mu\pi}{\sin(\mu+\nu)\pi}\cdot\frac{\Gamma(\mu-\nu+1)}{\pi}\,[Q_{\mu}^{\nu}(\cos\beta+0i)\,e^{\nu\pi i/2} + Q_{\mu}^{\nu}(\cos\beta-0i)\,e^{-\nu\pi i/2}]$ [Watson [123, p.387, l. −16–l. −15, (4)]] should have been corrected as $\int_0^{\infty} e^{-t\cos\beta}Y_{\nu}(t\sin\beta)\,t^{\mu}\,dt = -e^{-\nu\pi i}\cdot\frac{\Gamma(\mu-\nu+1)}{\pi}\,[Q_{\mu}^{\nu}(\cos\beta+0i)\,e^{\nu\pi i/2} + Q_{\mu}^{\nu}(\cos\beta-0i)\,e^{-\nu\pi i/2}]$.
Remark 101. (How we detect errors in a textbook) The formula given in Watson [123, p.388, (6)] and Guo–Wang [48, p.442, l.3] should have been corrected as

$\int_0^{\infty} e^{-t\cosh\alpha}I_{\nu}(t)\,t^{\mu-1}\,dt = e^{-(\mu-1/2)\pi i}\sqrt{\frac{2}{\pi}}\,\frac{Q_{\nu-1/2}^{\mu-1/2}(\cosh\alpha)}{(\sinh\alpha)^{\mu-1/2}}$. (∗)
How do we detect errors in a textbook? When I find an error, my first response is usually to refuse to accept this fact and try to rationalize the opposite viewpoint. After all, many authors have copied the formula without finding it incorrect. For example, if we replace the factor $e^{-(\mu-1/2)\pi i}$ in (∗) with $\frac{\cos\nu\pi}{\sin(\mu+\nu)\pi}$, then we must consider $e^{-2\pi i\mu} = 1$ true. Consequently, I try to rationalize this consequence: If I properly choose the value of $\log(e^{-2\pi i})$, then $e^{-2\pi i\mu}$ can be 1. Nevertheless, I try to remember this odd experience so that I can easily find a reason when a problem occurs afterwards. However, this “rationalization” actually conceals a mistake because $e^{-2\pi i\mu} \neq 1$ if we let µ = 1/2. Thus, the reason why we fail to detect an error is that we have not gone far enough to foresee its consequences. Errors cannot withstand tests. Sooner or later they will be detected. Even if an error is not detected at the first checkpoint in application, it can hardly survive the second one. When I tried to use Watson [123, p.388, (6)] and Guo–Wang [48, p.259, (4)] to prove Watson [123, p.388, (7)], I found that the coefficient of $P_{-\nu-1/2}^{1/2-\mu}$ supposed to be nonzero becomes 0 and the coefficient of $P_{\nu-1/2}^{1/2-\mu}$ supposed to be 0 becomes very complicated if we consider $\sin(\mu+\nu)\pi = \sin(\mu-\nu)\pi$ (a consequence of $e^{-2\pi i\mu} = 1$) true. Thus, the world would fall into pieces as if Pandora’s box were opened. I became so frustrated that I had to choose the other option: $e^{-2\pi i\mu} = 1$ is not necessarily true. Then I found the counterexample: Case µ = 1/2. I could omit the story of proving Watson [123, p.388, (7)] and still make this paragraph logical, but this would destroy the evidence of true experience and eliminate the track of the natural thought for solving a problem.

µπi µ π Γ(µν+1) ν 1/2 Proof of ( ). (A). (Whipple’s formula) e Q (coshα)= P− − (cothα). − ν 2 √sinhα µ 1/2 ∗ − − p ν 1/2 z Proof. I. Let z = coshα; y = P− − (w), where w = . Then µ 1/2 (z2 1)1/2 − − − 2 d2u du µ2 (1 z ) dz2 2z dz +[ν(ν + 1) 1 z2 ]u − − 2 − − 2 2 5/4 2 d y dy 2 1 (ν+1/2) =(z 1)− (1 w ) dw2 2w dw +[(µ 4 ) 1 w2 ]y = 0. − { −µ −µ − − − } II. By I, u(z)= AQν (z)+ BPν (z). 3 By Guo–Wang [48, p.249, (8); p.254, (4)], we have B = 0 if we let ν satisfy Γ(ν + 2 )= ∞. µ Qν (z) µπi π III. µ A = e Γ(ν + µ + 1) 2 as x +∞. Pν (z) → → (B). Let coshα = cothβ. Then sinhpα = cschβ. ∞ t coshβ µ 1 Γ(µ+ν) ν 0 e− Iν (t)t − dt = sinhµ β P−µ (cothβ) [Watson [123, p.387, (1)]; Guo–Wang [48, p.249, (9)]]. − R The result follows from (A).

Remark 102. Watson [123, p.388, l.9, (8)] follows from Watson [123, p.388, (7)] and Guo–Wang [48, p.100, (8); p.285, Exercise 30].
Remark 103. $\int_0^{\infty} e^{-t\cosh\alpha}K_{\nu}(t)\,dt = \frac{\pi}{\sin\nu\pi}\,\frac{\sinh\nu\alpha}{\sinh\alpha}$ [Watson [123, p.388, l.12, (9)]].

Proof. $\int_0^{\infty} e^{-t\cosh\alpha}K_{\nu}(t)\,dt = \sqrt{\frac{\pi}{2}}\,\frac{\pi\nu}{\sin\nu\pi}\,P_{\nu-1/2}^{-1/2}(\cosh\alpha)\,\sinh^{-1/2}\alpha$ [Watson [123, p.388, (7)]; Guo–Wang [48, p.99, (3)]].
$P_{\nu-1/2}^{-1/2}(\cosh\alpha) = \frac{2}{\sqrt{\pi}}\,(\cosh\alpha+1)^{-1/2}\sinh^{1/2}\alpha\;{}_2F_1\left(\frac{1}{2}-\nu,\nu+\frac{1}{2};\frac{3}{2};\frac{1-\cosh\alpha}{2}\right)$ [Guo–Wang [48, p.255, (1)]].
${}_2F_1\left(\frac{1}{2}-\nu,\nu+\frac{1}{2};\frac{3}{2};\frac{1-\cosh\alpha}{2}\right) = \left(\frac{1+\cosh\alpha}{2}\right)^{1/2}{}_2F_1\left(1+\nu,1-\nu;\frac{3}{2};\frac{1-\cosh\alpha}{2}\right)$ [Guo–Wang [48, p.142, (8)]].
${}_2F_1\left(1+\nu,1-\nu;\frac{3}{2};\frac{1-\cosh\alpha}{2}\right) = {}_2F_1\left(\frac{1+\nu}{2},\frac{1-\nu}{2};\frac{3}{2};-\sinh^2\alpha\right)$ [Guo–Wang [48, p.179, (9)]]
$= \frac{\sinh(\nu\alpha)}{\nu\sinh\alpha}$ [Sneddon [104, p.40, l.4]].
Remark. Suppose we want to prove the equality of two functions with parameters. We first find the two differential equations that they satisfy and then manage to transform one of these differential equations to the other. This is the most powerful and effective method in analysis. Suppose we want to prove $\cos(nz) = {}_2F_1\left(\frac{1}{2}n,-\frac{1}{2}n;\frac{1}{2};\sin^2 z\right)$ [Sneddon [104, p.40, Problem 2(i)]]. When n is a positive even integer, we may express $\cos(nz)$ as a polynomial in $\sin^2 z$ and then use combinatorial analysis to prove $\cos(nz) = {}_2F_1\left(\frac{1}{2}n,-\frac{1}{2}n;\frac{1}{2};\sin^2 z\right)$. The proof is not easy. Even if we succeed in proving these special cases, to extend the equality from the case of positive even integers to the case of complex numbers is still a big problem to be solved. The method of mathematical induction and all the methods in complex analysis are not competent enough for this task. Whipple’s formula [Guo–Wang [48, p.293, Problem 57]] is another example.
Remark 104. The term-by-term integration given in Watson [123, p.393, l. −8] can be justified by Rudin [99, p.27, Theorem 1.34].

iθ 1 1/2 a t+ ν π Proof. Jν (at) maxθ Jν ( a e t) = O([ ] )e [Jackson [60, p.114, (3.91)(i)]]. | |≤ | | | | 2π a t | | | | 1 1/2 a t | | I ν ( a t)= O([ 2π a t ] e| | ) [Watson [123, p.203, (2)]]. | | | | | | Remark 105. Kummer’s first transformation [Watson [123, p.394, l.5]] refers to Watson [123, p.102, (1)]; Kummer’s second transformation [Watson [123, p.394, l. 11]] refers to Watson [123, p.104, (6)]. − Remark 106. (Applications of analytic continuation to the Weber–Schafheitlin integral: the right timing for a statement’s appearance) Suppose we choose the weakest possible conditions required in an argument to be our the- orem’s hypothesis. If the argument has used the method of analytic continuation [Rudin [99, §16.9–§16.16]] no more than once, then no confusion will occur. However, what should we do if the argument has used the method of analytic continuation more than once? Let us see the following example. m α β+γ+2m 1 ∞ ( ) (a/2) − − Γ(2α+2m)Γ(α β+γ+2m) Example. Let A(z)= ∑m=0 −z2α+2mm!Γ(α β+m+1)Γ(γ+m)Γ(α−β+γ+m) ; − − ∞ Jµ (at)Jν (at) B(z)= 0 γ α β dt, D1 = z ℜ(z) > 2a ; t − − α β+γ+2s 1 { | } 1 ∞i (a/2) − − Γ(2α+2s)Γ(α β+γ+2s) C(z)= R2πi ∞i z2α+2sΓ(α β+s+1)Γ(γ+s)Γ(α− β+γ+s) Γ( s)ds, D2 = z argz < π . − − − − { || | } Watson [123,R p.402, l.13–l.19] shows that (B,D1) is an analytic continuation of A; Watson [123, p.402, l. 10–l. 4] shows that (C,D2) is ananalytic continuation of A. Since D1 D2, we can − − ⊂ say that (B,D2) is an analytic continuation of A. In order to establish the first analytic continua- tion, we must impose the condition z D1. After establishing the second analytic continuation, ∈ we find that the condition z D1 can be weakened to the condition z D2. However, before we ∈ ∈ establish the second analytic continuation, there is no way to know that (B,D2) is an analytic continuation of A. Thus, the paragraph given in Watson [123, p.402, l. 13–l. 
11] has the problem with timing; we should collect enough evidence before we propose a hypothesis. Therefore, whenever we use the method of analytic continuation, we should check and record whether a change of the condition is needed so that we may easily clarify the relationship between cause and effect

133 in the proof structure. In fact, Watson [123, §13 4; §13 41] are self-contained, but its author has written the facts in · · the form of previews because of the timing problem. Every time he says that a condition ensures convergence, the readers may not be able to prove the fact at that moment, but they should be able to find the proof later in the section if they are patient enough. However, some impatient readers may think that they must find the proof somewhere else. The incorrect claim given in Guo–Wang [48, p.405, l.7–l.9] is sufficient to show that there are many people under the mistaken impres- ∞ Jµ (at)Jν (bt) dt sion. In fact, one cannot see the convergence of 0 tλ [Watson [123, p.399, l.2–l.5]] until one reads up to Watson [123, p.401, l.15]]. Similarly, one cannot see the convergence of R ∞ Jµ (at)Jν (at) dt 0 tλ (µ ν is an odd integer;0 > ℜ(λ) > 1) [Watson [123, p.403, l. 8]] until one reads up to Watson− [123, p.404, (3)]]. − − R m γ+2m 1 ∞ ( ) (z/2) − ∞ ct α+β+2m 1 Remark 1. ∑m=0 − m!Γ γ m 0 e− Jα β (at)t − dt is absolutely convergent when ( + ) | − | z < c [Watson [123, p.399, l. 9–l. 8]]. | | − R − ∞ ct α+β+2m 1 (2α+2m) Proof. By Watson [123, p.385, (2)], 0 e− Jα β (at)t − dt = O(c− ). Then use the ratio test. − R Remark 2. We impose the condition ℜz > 0 [Watson [123, p.399, l.18]] because zγ+2m+1 [Wat- son [123, p.399, l.20]] requires the consideration of the domain of logz. We impose the condition ∞ ct λ ℑz < c to ensure the convergence of e− Jγ 1(zt)t− dt; see Jackson [60, p.114, (3.91)]. | | − √ 2 2 Let D = z ℜz > 0, z < c , D′ = z ℜRz > 0, ℑz < c, z < a + c c and { | | | } { | m | γ+|2m 1 | | α β − } ∞ Jα β (at)Jγ 1(zt) ∞ ( ) (z/2) − (a/2) − Γ(2α+2m) ct − f (z)= 0 e− − γ α β dt = ∑m=0 − 2 2 α+m 2F1(α + m,1/2 β t − − m!Γ(γ+m) (a +c ) Γ(α β+1) 2 · − − − m;α βR + 1; a ). − a2+c2 Watson [123, p.399, l. 14–l. 5] shows that f (z) is analytic on D. Watson [123, p.400, l.1–l.23] − − shows that ( f ,D′) is an analytic continuation of ( f ,D). 
In order to prove the analyticity of f on D, we impose the condition z < c; after the establishment of the analytic continuation ( f ,D ), | | ′ we find that the condition z < c can be weakened to the condition z < √a2 + c2 c. Thus, using the method of analytic| | continuation is like ascending to a higher| | floor: our views− become broader and farther. Remark 3. By Rudin [97, p.135, Theorem 7.11], the limit of the series when c 0 is the same → as the value of the series when c = 0 [Watson [123, p.401, l.8–l.9]]. “Provided that the integral is convergent” [Watson [123, p.401, l. 6]] means “the condition given in Watson [123, p.399, l.3] − is satisfied”. at Remark 4. By Jackson [60, p.114, (3.91)], Jα β (at),Jγ 1(at)= O(e ). In order to ensure ∞ zt − − the convergence of e− Jα β (at)Jγ 1(at)dt, we impose the condition ℜz > 2a [Watson [123, p.402, l. 20]]. − − − R Remark 5. If ℜz > 0 and z < 2a, then | | m γ α β m 1 m ∞ Jα β (at)Jγ 1(at) 1 ∞ ( ) (a/2) − − − − z Γ(γ α β m)Γ(α+m/2) − 0 − γ α β dt = ∑m=0 − − − − t − − 2 m!Γ(1 β m/2)Γ(γ α m/2)Γ(γ β m/2) m m 1 γ α β+m − − − − − − R 1 ∞ ( ) (a/2)− − z − − Γ(α+β γ m)Γ(α/2 β/2+γ/2 m/2) + 2 ∑m=0 m!Γ(α/2− β/2 γ/2 m/2+1)Γ(β/2+γ/−2 −α/2 m/2)−Γ(α/2 β/−2+γ/2 m/2) [Watson [123, p.403, l.3– l.5]]. − − − − − − −

Proof. We choose Watson–Whittaker [122, p.288, l. −16–p.289, l.5] or Guo–Wang [48, p.154, Fig. 9] to be our primitive model for development. By Guo–Wang [48, p.100, (8)], $\Gamma(2\alpha+2s)$ provides the factor $2^{2s}$, so does $\Gamma(\alpha-\beta+\gamma+2s)$. The numerator of the integrand given in Watson [123, p.402, l. −8] provides the factor $(a/2)^{2s}$. Consequently, instead of considering

134 ( z)s cscsπ [Guo–Wang [48, p.155, l.10]], we should consider ( 4a )2s 1 | − | | 2z sinsπ | = O(exp[(N + 1/2)cosθ ln 4a 2 (N + 1/2)δ sinθ ]) (ln 2a > 0 because z < 2a) | 2z | − | | | z | | | O(exp[ 2 1/2(N + 1/2)ln 4a 2]) if π < θ 3π/4 or 3π/4 θ < π − 2z . = − 1/2 | | − ≤− ≤ (O(exp[ 2− (N + 1/2)]) if 3π/4 θ π/2 or π/2 θ 3π/4 − − ≤ ≤− ≤ ≤ γ α β (γ α β)lnz Remark 6. z − − = e − − . γ α β ℜ(γ α β)ln z argz ℑ(γ α β) z − − = e − − | |− · − − . | | γ α β ℜ(γ α β)lnc If ℜ(γ α β) > 0 and z = c 0, then z − − = e − − 0 [Watson [123, p.403, (1)]]. − − → | | → Remark 7. “It is supposed that these relations hold down to the end of §13 41.” [Watson [123, p.399, l.12]] should have been corrected as follows: · “In Watson [123, p.399, l.7–p.403, l. 9], (µ,ν,λ) (α,β,γ) is transfo rmed according to − ↔ the relations given in Watson [123, p.399, l.9–l.11]; α =(µ + ν λ + 1)/2. In Watson [123, − p.403, l. 8–p.404, l. 7], (µ,ν,λ) (α, p,λ) is transformed according to the relations given − − ↔ in Watson [123, p.403, l. 6–l. 5]; α =(µ + ν + 1)/2.” − − It is really confusing to use the same notation α in the same section [Watson [123, §13 41]] to · represent two different quantities. The latter α should have been replaced with another notation, for example, η. Remark 8. Without loss of generality we may assume that p = 0,1,2, [Watson [123, p.403, l. 8; l. 6–l. 5]]. ··· − − − Remark 9. Since ℜλ < 0, by Bromwich [11, p.203, l. 7–l. 5; p.204, l. 17–l. 15], both λ λ λ λ − − − − 2F1(α , p ;α p;1) and 2F1(α , p + 1 ;α + p + 1;1) diverge [Watson [123, − 2 − − 2 − − 2 − 2 p.404, l.10–l.11]]. The following supplements may help us understand the proof of the theorem given in Bromwich [11, p.204, l.13–l.22]: (1). In order to obtain an < 1 + 2 [Bromwich [11, p.34, l. 8]], we must impose the condition an+1 n − that σn’s are bounded. (2). ∑an converges lim(nan)= 0 [Bromwich [11, p.35, l.16–l.17]]. ⇔ Proof. I. nan = 1 + 1 [(µ 1) n + n ωn ]. (n+1)an+1 n n+1 n+1 nλ 1 II. 
: − − ⇒ By Bromwich [11, p.35, l.12], µ > 1. By I and Bromwich [11, Art. 39, Ex. 3], lim(nan)= 0. : ⇐ By Bromwich [11, p.35, l.12], µ 1. nan ≤ Case µ < 1: By I, 1. Hence, nan . (n+1)an+1 ≤n ր Case µ = 1: By induction, ∑m=1 am = O(nan). nk If ∑an diverges, then there exists an subsequence nk such that ∑m=1 am L = 0. 1 → 6 M > 0: L(nkan ) M. This contradicts lim(nan)= 0. ∃ | k − |≤ (3). By Rudin [97, p.62, Theorem 3.43], the hypergeometric series given in Bromwich [11, p.35, l. 16] converges for x = 1, if γ + 1 > α + β. (4).− Without imposing proper− conditions, the three theorems given in Bromwich [11, p.201, l.4; l.5; l. 10] cannot be valid. However, our goal is proving the theorem given in Bromwich [11, p.204,− l.13–l.22]. Consequently, all we have to do is impose some conditions so that the above three theorems are valid for the cases (1), (2), and (3) given in Bromwich [11, p.204, l. 3]. For example, the theorem given in Bromwich [11, p.201, l.5] is valid for case (3) because −

135 an Dn+1 1 an Dn+1 + > ( 1 and 0 as n ∞). The proof of limκn > 0 [Bromwich [11, an+1 Dn 2 an+1 Dn p.201,| | l. 10]] can| be proved| → as follows:→ → − 2 f ′(n) κn f (n+1) Proof. lim[ f (n)(1 + 2 f (n) + f (n) ) f (n) ] > 0 [Bromwich [11, p.201, l.4]]. 2 2 − f (n) + 2 f (n) f ′(n) f (n + 1) 1 − = 2 0 [ f (n + x) f ′(n + x) f (n) f ′(n)]dx − 1 d − = 2R0 dt [ f (n +t) f ′(n +t)]dt − 2 1 f ′ (n+t) = 2R0 f (n +t)( f (n+t) + f ′′(n +t))dt. − 2 For casesR (1), (2), and (3), f (n+t) 1. f (n +t), f ′ (n+t) 0 [Bromwich [11, p.201, l. 5]] as | f (n) |≤ ′′ f (n+t) → − n ∞. → (5). “lim(logn)[n an 2 1 2]] > 2 (convergence); lim(logn)[n an 2 1 2]] < 2 (diver- {| an+1 | − }− {| an+1 | − }− gence)” [Bromwich [11, p.202, l.5, (3)]] should have been corrected as “lim(logn)[n an 2 {| an+1 | − 1 2]] > 0 (convergence); lim(logn)[n an 2 1 2]] < 0 (divergence)”. }− {| an+1 | − }− (6). If α = 0, then am L > 0 as m ∞ [Bromwich [11, p.203, l.3–l.5]]. | | → → Proof. an 2 = 1 + ω . | an+1 | nλ δ λ an 2 δ λ 1 εn − 1 + εn − . Consequently, − ≤| an+1 | ≤ m δ λ m an 2 1 ε ∑ k − ∑ [Bromwich [11, p.95, l. 9]] − k=n ≤ k=n | an+1 | − 1 + ε ∑m kδ λ . ≤ k=n − (7). If µ = 1, then ∑an diverges [Bromwich [11, p.204, l. 7–l. 6]]. − − Proof. Assume that ∑an converges to a number L. n By induction, ∑m=1 am = O(nan). M > 0: L M. ∃ | nan |≤ Since L/n 0, an 0. → 6→ (2 µ)(3 µ) (n µ) (8). By induction, the sum of n terms of this series is − 1 2 3− (n···1)− [Bromwich [11, p.204, l. 4–l. 3]]. · · ··· − − − 1 µ (1 µ)(2 µ) (1 µ)(2 µ)(3 µ) (9). If 0 < α 1, then 1 + − + − − + − − − + diverges [Bromwich [11, ≤ 1 1 2 1 2 3 ··· p.204, l. 2–l. 1]]. · · · − − 1 µ 2 2(1 α) Proof. Case 0 < α < 1: 1 + n−1 1 + n −1 . | − | ≥ − Case α = 1: arg(1 iβ )= arcsin β β . − m − m ≈− m Consequently, arg[Π∞ (1 iβ )] ∑∞ β = β ∞. n+1 − m ≈− m=n+1 m − · Γ(2α+2s) 1 2(α+s)Γ(2α+2s) p Remark 10. Since lims α Γ(α+s p) = 2 lims α (s+α)Γ(α+s p) , the residue at s = α is ( ) /(2a) [Watson [123, p.404, l.→−14]]. − →− − − − − Remark 11. 
By Watson [123, p.403, (2), (λ = 1)] and Guo–Wang [48, p.94, (1); p.99, (3)], $\int_0^{\infty}\frac{J_{\mu}(at)J_{\nu}(at)}{t}\,dt = \frac{2}{\pi}\,\frac{\sin[(\nu-\mu)\pi/2]}{\nu^2-\mu^2}$ [Watson [123, p.404, l. −5]].
Remark 107. $\int_0^{\infty} J_{\mu}(bt)\,t^{\mu+1}\exp\left[-\frac{at^2}{2u}\right]dt = \frac{b^{\mu}}{a^{\mu+1}}\,u^{\mu+1}\exp\left(-\frac{b^2u}{2a}\right)$ [Watson [123, p.415, l. −10–l. −9]] follows from Watson [123, p.394, (4)].
Remark 108. The integral along this is zero [Watson [123, p.412, l.2–l.3]].

136 b2 a2 Proof. Let u = c + Rexp(iθ)= u exp(iθ1) and α = 2−a . | | 1 c exp( αu) = exp( α u cosθ1) < exp( α u sinδ), where δ = tan . | − | − | | − | | − R Remark 109. (Lommel’s theorem vs. Sturm’s theorem) [Watson [123, §15 2]] · Both Lommel’s theorem [Watson [123, p.478, l.7–l.8]] and Sturm’s theorem [Uspensky [118, p.142, l. 4–p.143, l.2]] discuss numbers of zeros; both of their proofs use the same the- − orem given in Uspensky [118, p.104, l. 7–l. 5]. However, in essence, they are solutions to different problrms (the former tries to find− infinitely− many zeros, while the latter asks for the ex- act number of zeros) and the starting points of their approaches are also different (the former first shows that a certain type of intervals contain at least one zero, and then tries to infinitely many such intervals [Watson [123, p.478, l.9–l.14]], while the latter requires a more delicate plan). C (x) xC (x) Remark 110. x C 2 t tdt x ν ν′ 2νβ(α sinνπ+β cosνπ) [Watson [123, p.481, l.9]] follows 0 ν ( ) = 2 dCν (x) d xC ′ (x) + π sinνπ − { ν } dx dx fromR Watson [123, p.64, (1)], the formula given in Watson [123, p.480, l. 12] and Jackson − [60, p.114, (3.89)]. C C Remark 111. At consecutive positive zeros of ν (x), ν′ (x) has opposite signs [Watson [123, p.481, l. 11– l. 10]]. − − Proof. Let a and b be the consecutive positive zeros of Cν (x). By Watson [123, p.480, l.3–l.4], there is exactly one zero c of Cν+1(x) between a and b. C C By the formula given in Guo–Wang [48, p.368, l.13], ν+1(x) and ν′ (x) have opposite signs at the zeros of Cν (x). C C If ν′ (x) were to have same signs at a and b, then c could not be a simple zero of ν+1(x). This would contradict the statement given in Watson [123, p.479, l.21–l.22].
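Example (numerical illustration; a sketch, not taken from Watson). Remark 111 can be observed in the special case $C_{\nu} = J_0$, using the standard relation $J_0'(x) = -J_1(x)$. The power-series helper `bessel_j`, the bisection routine, and the bracketing intervals are assumptions introduced here for illustration; the series is adequate only for moderate x.

```python
def bessel_j(n, x, terms=60):
    """Power series for J_n(x), integer n >= 0; adequate for moderate x."""
    term = 1.0
    for k in range(1, n + 1):
        term *= (x / 2.0) / k              # builds (x/2)^n / n!
    total = term
    for k in range(1, terms):
        term *= -(x / 2.0) ** 2 / (k * (k + n))
        total += term
    return total

def bisect(f, lo, hi, tol=1e-12):
    """Bisection on an interval [lo, hi] where f changes sign."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical brackets around the first three positive zeros of J_0.
brackets = [(2.0, 3.0), (5.0, 6.0), (8.0, 9.0)]
zeros = [bisect(lambda x: bessel_j(0, x), lo, hi) for lo, hi in brackets]
slopes = [-bessel_j(1, z) for z in zeros]  # J_0'(z) = -J_1(z)
for z, s in zip(zeros, slopes):
    print(f"zero {z:.6f}  J_0' = {s:+.4f}")
```

The printed derivative values should alternate in sign from one zero to the next, in line with the statement of Remark 111 for this one cylinder function.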

Remark 112. By the statement given in Watson [123, p.481, l.9–l.12], the zeros of $J_{\nu}(x)$ and $AJ_{\nu}(x)+BxJ_{\nu}'(x)$ ($B \neq 0$) are interlacing. By Lommel’s theorem [Watson [123, p.478, l.7–l.8]], $AJ_{\nu}(x)+BxJ_{\nu}'(x)$ has infinitely many positive zeros.

Remark 113. $\int_0^{x} t\,Y_0(\beta t)Y_0(\beta_0 t)\,dt = \frac{x}{\beta^2-\beta_0^2}\left[Y_0(\beta x)\frac{dY_0(\beta_0 x)}{dx} - Y_0(\beta_0 x)\frac{dY_0(\beta x)}{dx}\right] - \frac{4}{\pi^2(\beta^2-\beta_0^2)}\log\frac{\beta}{\beta_0}$ [Watson [123, p.483, l.3–l.4]] follows from Watson [123, p.134, (8)] and Jackson [60, p.114, (3.90)(i)].

9. Statistical and thermal physics (Pathria–Beale [89]; Reif [93])
(A). Pathria–Beale [89]
Remark 1. Definition of microstates: [Pathria–Beale [89, p.2, l. −11–l. −9]].
Remark 2. The dependence of Ω on V arises from the fact that it is the physical dimensions of the container that appear in the boundary conditions imposed on the wave functions of the system [Pathria–Beale [89, p.3, l. −3–l. −1]]. See Pathria–Beale [89, p.10, l. −11–l. −10].
Remark 3. $C_P \equiv T\left(\frac{\partial S}{\partial T}\right)_{N,P} = \left(\frac{\partial H}{\partial T}\right)_{N,P}$ [Pathria–Beale [89, p.9, l.7, (18)]].
Proof. The first equality follows from Reif [93, p.140, (4·4·8)].
$dH = \left(\frac{\partial H}{\partial T}\right)_P dT + \left(\frac{\partial H}{\partial P}\right)_T dP$ (∗) [by the chain rule].
$dE = đQ - P\,dV \Rightarrow dH = đQ + V\,dP$ (∗∗).
At constant pressure, $dP = 0$. Then
$C_P = \left(\frac{đQ}{dT}\right)_P$ [by definition]
$= \left(\frac{\partial H}{\partial T}\right)_P$ [by comparing (∗) with (∗∗)].
Remark 4. By Reif [93, p.64, l. −19–p.65, l.5], $\Omega(N,E,V) \propto V^N$ [Pathria–Beale [89, p.9, l. −3, (1)]].
Remark 5. $\varepsilon(n_x,n_y,n_z) = \frac{h^2}{8mL^2}(n_x^2+n_y^2+n_z^2)$; $n_x, n_y, n_z = 1,2,3,\cdots$ [Pathria–Beale [89, p.10, l. −9, (5)]] follows from Reif [93, p.360, l.3]. $\mathbf{k} = 2\pi\left(\frac{l}{a},\frac{m}{b},\frac{n}{c}\right)$; $l,m,n = 0,\pm 1,\pm 2,\cdots$ [Pathria–Beale [89, p.655, l.9, (15)]] follows from Reif [93, p.357, (9·9·13)].
Remark 6. By Pathria–Beale [89, p.13, (17)], $\frac{\partial\Sigma(N,V,E)}{\partial E} = \frac{3N}{2E}\,\Sigma(N,V,E)$ [Pathria–Beale [89, p.14, (17a)]].
Remark 7. (Ensemble average equals time average) Because Reif [93, §15·14] is clearer than Pathria–Beale [89, p.30, l. −7–p.31, l.10; p.31, l. −17–l. −1], we discuss only the former. The proof given in Reif [93, §15·14] fails to get to the heart of the matter. Furthermore, the apparatus used is too complicated to reveal the gap contained in $\langle\{\ \}\rangle = \{\langle\ \rangle\}$ [Reif [93, p.585, l.8]].
If we regard the phase space and the characteristic probability of the ensemble as a probability space [Borovkov [9, p.17, Definition 6]] and $S_n/n$ as the ensemble average, then the equality between the ensemble average and the time average [Reif [93, p.583, (15·14·1)] = Reif [93, p.583, (15·14·2)]] can be viewed as a version of the strong law of large numbers [Chung [15, p.133, Theorem 5.4.2(8)]]. The expectation $E(X_1)$ [Chung [15, p.133, l.5]; Pathria–Beale [89, p.26, the right side of (3)]] equals the time average because a volume element in phase space with larger probability is the volume element where the random variable stays longer.
Remark. In 1713, Jacob Bernoulli proved the strong law of large numbers for the special case of the Bernoulli scheme [Borovkov [9, p.91, Theorem 2]]. J. Willard Gibbs recognized its potential, so he introduced the concept of ensemble in 1902 in order to apply the law for the general case to statistical mechanics. The concept of ensemble was designed completely based on the assumptions of the strong law of large numbers: independence [Borovkov [9, p.91, l.8]; Tolman [116, p.44, l.14]]; identically distributed random variables [Borovkov [9, p.91, l.9]; Tolman [116, p.43, l.12–l.13, the same structure]]. It was not until the late 1920s that a rigorous proof of the strong law for the general case was accomplished through the efforts of A. Y. Khinchin, A. N. Kolmogorov and others. Thereby, the theoretical foundation of the statistical ensemble was well-established.
Remark 8. (Liouville’s theorem) The “local” density of the representative points, as viewed by an observer moving with a representative point, stays constant in time [Pathria–Beale [89, p.28, l. −8–l. −6]].
The above interpretation is confusing because it fails to get to the heart of the matter.
There are two ways to interpret Liouville’s theorem:
First, by definition,
$\frac{d\rho}{dt}\Big|_{t=t_0} = \lim_{t\to t_0}\frac{\rho(t,q(t),p(t)) - \rho(t_0,q(t_0),p(t_0))}{t-t_0}$,
where $\rho(t,q,p)$ can be considered a function on the curve $\gamma(t) = (t,q(t),p(t))$.
Second, by the chain rule,
$\frac{d\rho}{dt}\Big|_{t=t_0} = \frac{\partial\rho}{\partial t}\Big|_{(t,q,p)=(t_0,q(t_0),p(t_0))} + \sum_{i=1}^{3}\frac{\partial\rho}{\partial q_i}\Big|_{(t_0,q(t_0),p(t_0))}\frac{\partial q_i}{\partial t}\Big|_{t=t_0} + \sum_{i=1}^{3}\frac{\partial\rho}{\partial p_i}\Big|_{(t_0,q(t_0),p(t_0))}\frac{\partial p_i}{\partial t}\Big|_{t=t_0}$.
Remark 9. By Pathria–Beale [89, p.34, (14)] and Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.495, l.9], the number of eigenstates within the allowed energy interval is very nearly equal to $\Delta/\hbar\omega$ [Pathria–Beale [89, p.34, l. −4–p.35, l.1]].
Remark 10. By Guo–Wang [48, p.389, l. −12–l. −11], we see that for large N, the contribution from the rest of the circle is negligible [Pathria–Beale [89, p.46, l. −5]].
Remark 11. By Reif [93, p.612, (A·6·4); p.613, l.5], we see that for $\nu \gg 1$, the major contribution to this integral comes from the region of x that lies around the point x = ν and has a width of order $\sqrt{\nu}$ [Pathria–Beale [89, p.658, l.8–l.9]].
Remark 12. By Watson–Whittaker [122, p.253, l.10 & l.12], we have Pathria–Beale [89, p.658, l. −1, (27)].
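Example (numerical illustration; a sketch under a stated assumption). Remark 11 does not restate the integral; the sketch below assumes it is the factorial integral $\nu! = \int_0^{\infty} x^{\nu}e^{-x}\,dx$ used in deriving Stirling's formula. It computes the fraction of that integral contributed by the window $[\nu-3\sqrt{\nu},\ \nu+3\sqrt{\nu}]$. The window width of 3 and the helper names are illustrative choices, not taken from the cited texts.

```python
import math

def gamma_integrand(x, nu):
    """x^nu * e^(-x) / Gamma(nu + 1), evaluated in log form to avoid overflow."""
    return math.exp(nu * math.log(x) - x - math.lgamma(nu + 1.0))

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def window_fraction(nu, width=3.0):
    """Share of the (assumed) integral coming from |x - nu| <= width * sqrt(nu)."""
    half = width * math.sqrt(nu)
    return simpson(lambda x: gamma_integrand(x, nu), nu - half, nu + half)

fractions = {nu: window_fraction(nu) for nu in (100.0, 1000.0, 10000.0)}
for nu, frac in fractions.items():
    print(f"nu = {nu:8.0f}  fraction inside window = {frac:.4f}")
```

If the assumption about the integrand is right, nearly all of the integral should come from a window whose width scales like $\sqrt{\nu}$, which is the content of the remark.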

Remark 13. $\delta(x-x_0) = \lim_{\gamma\to 0}\frac{1}{\sqrt{4\pi\gamma}}\,e^{-(x-x_0)^2/4\gamma}$ [Pathria–Beale [89, p.661, l. −1, (43)]].
Proof. I. $\int_{-\infty}^{\infty} e^{-a(x+b)^2}\,dx = \sqrt{\frac{\pi}{a}}$ [Reif [93, p.609, (A·4·6)(i)]].
II. $\int_{-\infty}^{\infty} e^{-ax^2+bx+c}\,dx = e^{\frac{b^2}{4a}+c}\int_{-\infty}^{\infty} e^{-a(x-\frac{b}{2a})^2}\,dx$
$= \sqrt{\frac{\pi}{a}}\,e^{\frac{b^2}{4a}+c}$ [by I].
Let $a = \gamma$, $b = ix$, $c = 0$.
Remark. (Physics proofs vs. mathematics proofs) A physics proof is usually intuitive; it shows how we discover a new formula. In contrast, a mathematics proof is usually abstract; it shows how we prove it rigorously.
Example. $\lim_{g\to\infty}\int_{-\infty}^{\infty}\frac{\sin(gx)}{\pi x}f(x)\,dx = f(0)$.
A physics proof. The nodal separation π/g of sin(gx) becomes smaller and smaller as g becomes larger and larger. A small neighborhood $(x-\pi/g,\ x+\pi/g)$ of $x \neq 0$ contributes to the integral a period of $\sin(gx)[f(x)/x]$, which is 0, where $f(x)/x$ can be treated as a constant. In a neighborhood of x = 0, f(x) can be considered a constant. $\int_{-\infty}^{\infty}\frac{\sin(gx)}{\pi x}f(0)\,dx = f(0)$ follows from Rudin [99, p.244, (7)].
A mathematics proof. The formula $\lim_{g\to\infty}\int_{-\infty}^{\infty}\frac{\sin(gx)}{\pi x}f(x)\,dx = f(0)$ can be proved using an argument similar to the one given in Rudin [99, pp.243–244, Problem 10.44].
Remark 14. Pathria–Beale [89, p.663, l. −4, (9)] can be proved by induction as in http://math.stackexchange.com/questions/992525/how-to-derive-differential-volume-element-in-terms-of-spherical-coordinates-in-h.
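Example (numerical illustration of Remark 13; a sketch, not part of the cited proofs). The Gaussian kernel $\frac{1}{\sqrt{4\pi\gamma}}e^{-(x-x_0)^2/4\gamma}$ is integrated against a smooth test function for decreasing γ. The test function cos, the point $x_0 = 0.3$, the truncation of the integration range, and the helper names are assumptions chosen for illustration. For this particular test function the exact smeared value is $\cos(x_0)\,e^{-\gamma}$, which gives an independent check.

```python
import math

def gaussian_kernel(x, x0, gamma):
    """The kernel (4*pi*gamma)^(-1/2) * exp(-(x - x0)^2 / (4*gamma))."""
    return math.exp(-(x - x0) ** 2 / (4.0 * gamma)) / math.sqrt(4.0 * math.pi * gamma)

def smeared_value(f, x0, gamma, half_width=12.0, n=4000):
    """Midpoint-rule approximation of the integral of kernel(x) * f(x).

    The range is truncated to x0 +/- half_width * sqrt(2*gamma); the
    neglected Gaussian tails are far below double precision.
    """
    sigma = math.sqrt(2.0 * gamma)
    lo = x0 - half_width * sigma
    h = 2.0 * half_width * sigma / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += gaussian_kernel(x, x0, gamma) * f(x)
    return total * h

x0 = 0.3
gammas = (1e-1, 1e-2, 1e-3, 1e-4)
values = [smeared_value(math.cos, x0, g) for g in gammas]
errors = [abs(v - math.cos(x0)) for v in values]
for g, v, e in zip(gammas, values, errors):
    print(f"gamma = {g:.0e}  smeared = {v:.6f}  |error| = {e:.2e}")
```

The errors should shrink roughly in proportion to γ, consistent with the kernel acting as δ(x − x0) in the limit γ → 0.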

Remark 15. By Watson [123, p.48, (4)], integration over the angular coordinates $\theta_1,\theta_2,\cdots,\theta_{n-2}$ and φ yields factors
$\pi^{1/2}\,\Gamma\left(\frac{n-1}{2}\right)\left(\frac{2}{kr}\right)^{(n-2)/2}J_{(n-2)/2}(kr) \times B\left(\frac{n-2}{2},\frac{1}{2}\right)\cdot B\left(\frac{n-3}{2},\frac{1}{2}\right)\cdots B\left(1,\frac{1}{2}\right)\cdot 2\pi$ [Pathria–Beale [89, p.664, l.1–l.2]].
Remark 16. By Watson [123, p.40, (8)], $J_{\nu}(kr) \to (kr/2)^{\nu}/\Gamma(\nu+1)$ as $k \to 0$ [Pathria–Beale [89, p.664, l.7]].
Remark 17. $\frac{1}{(2\pi)^{n/2}}\int_0^{\infty} e^{-\alpha r^2}\left(\frac{1}{kr}\right)^{(n-2)/2}J_{(n-2)/2}(kr)\,r^{n-1}\,dr = \left(\frac{1}{4\pi\alpha}\right)^{n/2}\exp\left(-\frac{k^2}{4\alpha}\right)$ [Pathria–Beale [89, p.664, l. −8, (14)]].
Proof. By Watson [123, p.40, (8)] and Jackson [60, p.114, (3.91)], the integral converges. Its exact value can be evaluated by Guo–Wang [48, p.409, (17)].

(B). Reif [93]
Remark 1. If function f is more slowly varying than function g, then the Taylor series of f converges more rapidly than that of g [Reif [93, p.18, l.7–l.8]].
Remark. The example given in Reif [93, p.18, l.9–l.22] is marginal. Here is a typical example. Let $f(x) = \frac{1}{1-x^2}$ and $g(x) = \frac{1}{1-x}$, where $-1 < x < 1$. Note that $f(x) < g(x)$ if $x > 0$ and that the Taylor series of f requires fewer terms to reach a prescribed precision.
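Example (numerical illustration of the preceding Remark; a sketch). For $f(x) = 1/(1-x^2) = \sum x^{2k}$ and $g(x) = 1/(1-x) = \sum x^{k}$, the code counts how many nonzero Taylor terms are needed to reach a prescribed precision. The evaluation point x = 0.5, the tolerance, and the helper name are assumptions chosen for illustration; "terms" here means nonzero terms of the series about 0.

```python
def terms_needed(x, step, exact, tol, max_terms=10000):
    """Number of nonzero terms of sum_k x^(step*k) needed to get within tol of exact."""
    total = 0.0
    for k in range(max_terms):
        total += x ** (step * k)
        if abs(total - exact) < tol:
            return k + 1
    raise RuntimeError("tolerance not reached")

x, tol = 0.5, 1e-6
terms_f = terms_needed(x, 2, 1.0 / (1.0 - x * x), tol)  # f(x) = 1/(1 - x^2)
terms_g = terms_needed(x, 1, 1.0 / (1.0 - x), tol)      # g(x) = 1/(1 - x)
print("nonzero terms for f:", terms_f)
print("nonzero terms for g:", terms_g)
```

At this sample point the series of f should reach the tolerance with about half as many nonzero terms as the series of g.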

Remark 2. In this case [the number of instances in the ensemble where $u = u_i$ and where simultaneously $v = v_j$] is simply obtained by multiplying [the number of instances where $u = u_i$] by [the number of instances where $v = v_j$] [Reif [93, p.26, l.3–l.6]].
Example. Suppose a card $(u_i, v_j)$ is drawn from a standard deck of cards, where $U = \{u_i\} = \{\text{club, diamond, heart, spade}\}$ and $V = \{v_j\} = \{A, 2, 3, \cdots, 10, J, Q, K\}$. The ensemble $= U \times V$. Let $H = \{\text{heart}\}$ and $F = \{J, Q, K\}$. Then the number of elements in $H \times F$ = (the number of elements in H) × (the number of elements in F). Consequently,

$P(H \times F) = \frac{1}{4} \times \frac{3}{13} = P_u(H) \times P_v(F)$.
Remark. Due to its delicate nature, one should not discuss probability theory in loose language. For a rigorous treatment of statistical independence, see Lindgren [74, §1.3.3; §2.1.9, §2.2.3].
Remark 3. (Various aspects of generalization) The generalization from a discrete case to a continuous (or measurable) case [Reif [93, §1·9 & §1·10]]
From probability to probability density function: {Reif [93, p.617, l.2] → Reif [93, p.617, (A·7·10)], or Reif [93, p.10, (1·2·10); p.37, l.−10] → Reif [93, p.37, (1·10·9)], or Reif [93, p.32, Fig. 1·9·1(b)] → Reif [93, p.32, Fig. 1·9·1(a)]} → Rudin [99, p.16, Theorem 1.17].
Mean value: Reif [93, p.15, (1·4·6)] → Reif [93, p.35, (1·9·16)(i)] → Borovkov [9, (measurable functions) p.56, Definition 1].
Dispersion: Reif [93, p.16, (1·4·12)] → Reif [93, p.35, (1·9·16)(ii)] → Borovkov [9, (measurable functions) p.69, Definition 5].
Independence: Reif [93, p.7, l.−1–p.8, l.1; p.35, l.11–l.18] → Borovkov [9, (measurable sets) p.21, Definition 7; (measurable functions) p.42, Definition 8; Theorem 3].
For the discussion in the general case [Reif [93, p.35, l.7–p.36, l.−1]], w(s) is considered only a symbol. When one considers the case of equal step length as a particular case, one should calculate w(s) first [Reif [93, p.37, l.5]]. Only after finding its specific value may we have substance in discussion.
Remark 4. The strong law of large numbers and the central limit theorem [3, (A), Remark 50; Reif [93, §1·9–§1·11]]
Remark 5. Reversible processes [Reif [93, §3.2]]
Example. Reif [93, p.114, l.−17–p.116, l.3]
Remark 6. $\frac{\partial\Omega}{\partial x} = -\frac{\partial(\Omega\bar{Y})}{\partial E}$ [Reif [93, p.113, (3·8·6)]].
Proof.

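$\frac{\partial\sigma}{\partial E}\,\delta E = \frac{\partial(\sigma\,\delta E)}{\partial E}$ [consider $\delta E$ a fixed constant]
$= \frac{\partial(\Omega\bar{Y}\,dx)}{\partial E}$ [Reif [93, p.113, (3·8·3)]]
$= \frac{\partial(\Omega\bar{Y})}{\partial E}\,dx$.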

Remark 7. $C_y = T(\frac{\partial S}{\partial T})_y$ [Reif [93, p.140, (4·4·8)]].
Proof. I. $C_y = (\frac{đQ}{dT})_y$ [Reif [93, p.139, (4·4·1)]]
$= (\frac{T\,dS}{dT})_y$ [Reif [93, p.123, (3·11·4)]].

II. $(\frac{T\,dS}{dT})_y = T(\frac{\partial S}{\partial T})_y$.

Proof. $dS = \frac{\partial S}{\partial T}\,dT$.
$T\,dS = T\frac{\partial S}{\partial T}\,dT$.
$\frac{T\,dS}{dT} = T\frac{\partial S}{\partial T}$.

Remark 8. A physical interpretation for Reif [93, p.160, (5·4·2)]
Reif [93, p.160, (5·4·2)] is a solution of the PDE given in Reif [93, p.160, (5·4·1)]. Reif [93, p.160, l.17–p.19] provides a physical interpretation for the first term on the right side of the equality given in Reif [93, p.160, (5·4·2)]. One may wonder why we keep the volume constant at $V_0/(\nu/\nu_0)$. This is because the PDE given in Reif [93, p.160, (5·4·1)] is $dS = (\frac{\partial S}{\partial T})_V\,dT + (\frac{\partial S}{\partial V})_T\,dV$, where $(\frac{\partial S}{\partial T})_V$ keeps V constant.
Remark 9. Reif [93, p.161, (5·5·4)] follows from Halmos [50, p.14, Theorem 2; p.23, Theorem 2] and O’Neill [86, p.23, l.−3–l.−1].
Remark 10. By Green’s theorem, the area enclosed by the quadrilateral figure in Fig. 5·11·6 is equal to
$w = \int_a^b p\,dV + \int_b^c p\,dV + \int_c^d p\,dV + \int_d^a p\,dV$ [Reif [93, p.189, l.−4–l.−1]].
Remark 11. The first equality of Reif [93, p.218, l.−1, (6·6·22)] is a generalization of the first equality of Reif [93, p.75, l.12, (2·9·3)].
Remark 12. Finding extrema with subsidiary conditions
I. In calculus: The method of Lagrange multipliers [Reif [93, §A·10]].
II. In calculus of variations (usually consider minima): The analog of the method of Lagrange multipliers [Fomin–Gelfand [35, p.43, Theorem 1]].
III. In statistical mechanics [Reif [93, §6·8; §6·10]] (usually consider sharp maxima [Reif [93, p.202, l.8]]):
1. The method of Lagrange multipliers [Reif [93, p.229, l.−16–p.231, l.−13]].
The shortcoming: It is difficult to explain the statements given in Reif [93, p.232, l.13–l.14] in detail. Thus, we need the following more delicate methods:
2. Using the statistical trick: a rapidly increasing function multiplied by a rapidly decreasing function will produce a sharp maximum [Reif [93, p.222, l.1–p.223, l.8]]. The sharp maximum is produced by Reif [93, p.110, (3·7·14) or p.242, (7·2·15)].
3. Using the δ-function and its Fourier transform [Reif [93, p.223, l.9–p.225, l.15]].
Remark 13.
Reif [93, p.252, l.11, (7·6·6)] follows from Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.494, l.3, (B-34)].
Remark 14. But in the second case the magnetic moment is smaller by approximately the ratio of the electron to the nucleon mass [Reif [93, p.261, l.−9–l.−7]].
Proof. $\mu = iA$ [Halliday–Resnick [49, p.543, (30-10)]]
$= \frac{q}{2\pi r/v}(\pi r^2) = \frac{q}{2}vr = \frac{q}{2m}(mvr)$
$= \frac{q}{2m}(l\hbar)$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.649, (C-12)(ii)]].
Remark 15. By Borovkov [9, p.42, Theorem 3], the individual velocity components behave like statistically independent quantities [Reif [93, p.266, l.−2–l.−1]].
Remark 16. At this point one has $(\partial^2 p/\partial v^2)_T = 0$ [Reif [93, p.311, l.−9]].
Proof. Let the v-component of C in Reif [93, p.307, Fig. 8·6·1] be $v_C$. Then $v_1 < v_C < v_2$ [Reif [93, p.308, Fig. 8·6·2]], where $(\partial^2 p/\partial v^2)_T|_{v=v_1} > 0$ and $(\partial^2 p/\partial v^2)_T|_{v=v_2} < 0$. However, $v_1, v_2 \to v_C$ as $T \to T_4^-$ [Reif [93, p.307, Fig. 8·6·1; p.308, Fig. 8·6·2]].
Remark 17. For given $(n_1, n_2, \cdots)$, there are $\frac{N!}{n_1!\,n_2!\cdots}$ possible ways to put N distinguishable particles into individual quantum states such that there are $n_1$ particles in state 1, $n_2$ particles in state 2, etc.

Therefore, Reif [93, p.344, (9·4·2)] follows from Reif [93, p.343, (9·4·1)] and the postulate of equal a priori probabilities [Reif [93, p.54, l.−13–l.−12]].
Remark 18. Reif [93, §A·6] provides a quick but not rigorous proof of Stirling's formula [Reif [93, p.614, (A·6·14)]]. For a rigorous proof and a better estimate, see Watson–Whittaker [122, p.253, l.10].
Remark 19. The property of (A·7·2) implies all the characteristics of (A·7·1) [Reif [93, p.615, l.5]].
Proof. Let $\delta_\gamma$ be defined as in Reif [93, p.615, (A·7·3)] and $\delta^* = \lim_{\gamma\to 0} \delta_\gamma$.
Then both $\delta^*$ and $\delta$ satisfy Reif [93, p.615, (A·7·2)].
By Rudin [99, p.31, Theorem 1.39 (b)], $\delta^* = \delta$.
10. Electrodynamics (Griffiths [46]; Jackson [60]; Matveev [79]; Sadiku [100]; Wangsness [121])

(A). Griffiths [46]
Remark 1. For the proof of Griffiths [46, p.37, (1.59)], see Spivak [107, p.102, Theorem 4-13].
Remark 2. Griffiths [46, p.77, (2.20)] follows from Griffiths [46, p.76, (2.19)] and Karamcheti [64, p.72, (4.30)].
Remark 3. $\oint \mathbf{E}\cdot d\mathbf{a} = E(4\pi r^2)$ in Griffiths [47, p.25, Problem 2.11] follows from Griffiths [46, p.98, (v)].
Remark 4. $\mathbf{E} = \frac{1}{4\pi\varepsilon_0}\frac{q}{R^3}\,r\,\hat{r}$ $(r < R)$ in Griffiths [47, p.28, Problem 2.21] follows from Corson–Lorrain–Lorrain [20, p.53, (3-28)].
Remark 5. If $F(x,y) = \tan^{-1}\left(\frac{\sin\frac{\pi y}{a}}{\sinh\frac{\pi x}{a}}\right)$, then $\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} = 0$ [Griffiths [46, p.132, l.4]].
Proof. Let $D = (\sinh\frac{\pi x}{a})^2 + (\sin\frac{\pi y}{a})^2$.
$\frac{\partial F}{\partial y} = \frac{\pi}{a}\,\frac{\sinh\frac{\pi x}{a}\cos\frac{\pi y}{a}}{D}$.
$\frac{\partial^2 F}{\partial y^2} = -(\frac{\pi}{a})^2\,\frac{\sinh\frac{\pi x}{a}\sin\frac{\pi y}{a}}{D} - 2(\frac{\pi}{a})^2\,\frac{\sinh\frac{\pi x}{a}\sin\frac{\pi y}{a}\cos^2\frac{\pi y}{a}}{D^2}$.
$\frac{\partial F}{\partial x} = -\frac{\pi}{a}\,\frac{\cosh\frac{\pi x}{a}\sin\frac{\pi y}{a}}{D}$.
$\frac{\partial^2 F}{\partial x^2} = -(\frac{\pi}{a})^2\,\frac{\sinh\frac{\pi x}{a}\sin\frac{\pi y}{a}}{D} + 2(\frac{\pi}{a})^2\,\frac{\sinh\frac{\pi x}{a}\sin\frac{\pi y}{a}\cosh^2\frac{\pi x}{a}}{D^2}$.
$\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} = -2(\frac{\pi}{a})^2\,\frac{\sinh\frac{\pi x}{a}\sin\frac{\pi y}{a}}{D} + 2(\frac{\pi}{a})^2\,\frac{\sinh\frac{\pi x}{a}\sin\frac{\pi y}{a}\,[\cosh^2\frac{\pi x}{a} - \cos^2\frac{\pi y}{a}]}{D^2} = 0$, because $\cosh^2\frac{\pi x}{a} - \cos^2\frac{\pi y}{a} = D$.
Remark 6. Differentiation of an integral at a singularity: see Griffiths [46, p.157, Problem 3.42 (b)]; Jackson [60, p.35, l.2–l.−4].
Remark 7. Evidently in such cases $\nabla\times\mathbf{P}$ is automatically zero [Griffiths [46, p.178, l.−11–l.−10]].
Proof. I. Spherical symmetry: The charge density $\rho(\mathbf{r})$ depends only on the distance between $\mathbf{r}$ and the origin.
$\mathbf{D}(\mathbf{r}) = D(r)\,\hat{r}$.
II. Cylindrical symmetry: The charge density $\rho(\mathbf{r})$ depends only on the perpendicular distance between $\mathbf{r}$ and a cylindrical axis. This requires a charge distribution that is infinitely long.
$\mathbf{D}(\mathbf{r}) = D(\rho)\,\hat{\rho}$.
III. Plane symmetry: The charge density $\rho(\mathbf{r})$ depends only on the perpendicular distance between $\mathbf{r}$ and a plane. This requires a charge distribution that extends infinitely in two directions.
$\mathbf{D}(\mathbf{r}) = \operatorname{sgn}(z)\,D(|z|)\,\hat{z}$ [let the xy-plane be the plane of symmetry].
Remark 8. $\nabla\times\mathbf{P}$ is infinite at the boundary [Griffiths [46, p.182, l.−11]].
Proof. $0 \neq \oint \mathbf{P}\cdot d\mathbf{l} = \iint_S \nabla\times\mathbf{P}\cdot\mathbf{n}\,dS$ [Karamcheti [64, p.75, (4.42)]].
Let h be the boundary-crossing side in Griffiths [46, p.182, Figure 4.21]. Then
$0 \neq \oint \mathbf{P}\cdot d\mathbf{l} \approx \nabla\times\mathbf{P}\cdot\mathbf{n}\,h\,dl$. Let $h \to 0$.
Remark 9.
| Electrostatics | Magnetostatics | Common features |
|---|---|---|
| Coulomb’s law [Griffiths [46, p.59, (2.1)]] | Biot–Savart law [Griffiths [46, p.215, (5.32)]; Wangsness [121, p.225, (14-2)]] | Experimental laws [Griffiths [46, p.59, l.9]; Wangsness [121, p.218, l.5]] used as axioms or starting points to build the theories [Griffiths [46, p.216, l.−18–l.−17]]. |
| A uniform infinite line charge: Electric field: Griffiths [46, p.63, (2.9)]. | An infinitely long straight current: Magnetic induction: Griffiths [46, p.217, (5.36)]. Magnetic force per unit length: Griffiths [46, p.217, (5.37)]. | |
| $\nabla\cdot\mathbf{E}$: Griffiths [46, p.70, (2.16)]. $\nabla\times\mathbf{E}$: Griffiths [46, p.77, (2.20)]. | $\nabla\cdot\mathbf{B}$: Griffiths [46, p.223, (5.48)]. $\nabla\times\mathbf{B}$: Griffiths [46, p.224, l.6; p.225, (5.54)]. | The divergence and curl determine a vector field. This feature characterizes Maxwell’s equations. |
| Gauss’ law: Griffiths [46, p.68, (2.13)] | Ampère’s law: Griffiths [46, p.225, (5.55)] | These laws facilitate calculating fields for sources with appropriate symmetry [Griffiths [46, p.71, l.−11–l.−4; p.229, l.4–l.9]]. |
| Infinite plane with uniform surface charge: Griffiths [46, p.74, (2.17)] | Infinite uniform plane current sheet: Griffiths [46, p.228, (5.57)]; Wangsness [121, p.244, (15-21)] | |
| Infinite parallel-plate capacitor: Griffiths [46, p.74, l.−2–l.−1] | Infinite long solenoid: Griffiths [46, p.228, (5.57)] | A simple device for producing strong uniform fields [Griffiths [46, p.228, l.−3–l.−2]]. |
| | Toroids: Griffiths [46, p.230, (5.58)]. | |

Remark 10. The vector potential is “circumferential” [Griffiths [46, p.238, l.15]].
Remark.
The explanation says that it mimics the magnetic field of the wire [Griffiths [46, p.238, l.15]]. In my opinion, this choice of the direction of the vector potential is purely a guess. This is because I have difficulty finding the similarity between a circle [Griffiths [46, p.238, Example 5.12]] and a straight line [Griffiths [46, p.226, Example 5.7]]. The statement should have been proved as follows:
Proof. $\mathbf{A} = \mathbf{A}(\rho)$ [by symmetry]
$= A_\rho(\rho)\hat{\rho} + A_\varphi(\rho)\hat{\varphi} + A_z(\rho)\hat{z}$.
I. $A_\rho(\rho) = 0$.
Proof 1. See Griffiths [46, p.227, l.−3–p.228, l.1].
Proof 2. Since $\nabla\times[A_\rho(\rho)\hat{\rho}] = 0$, we may discard the component $A_\rho(\rho)\hat{\rho}$ by letting $A_\rho(\rho) = 0$. After all, the only thing that matters is to find a simple $\mathbf{A}$ such that $\nabla\times\mathbf{A} = \mathbf{B}$. We should not waste our time finding a tricky proof like Proof 1.

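II. $A_z(\rho) = 0$.
The only nonzero component of $\nabla\times\mathbf{A} = \mathbf{B}$ is the $\hat{z}$ component [Griffiths [46, p.238, l.13]], so its $\hat{\varphi}$ component $\frac{\partial A_\rho}{\partial z} - \frac{\partial A_z}{\partial \rho} = 0$ [Wangsness [121, p.31, (1-88)]]. Thus, $\frac{\partial A_z}{\partial \rho} = 0$. Namely, $A_z(\rho) = \text{const}$. Since $\nabla\times(\text{const.}) = 0$, we may let $A_z(\rho) = 0$.
Remark. The discussion given in Griffiths [46, p.227, l.−3–p.228, l.1] is meaningless because $\nabla\times\mathbf{A} = \mathbf{B}$ can be replaced with $\nabla\times\mathbf{A}_1 = \mathbf{B}$, where $\mathbf{A}_1 = \mathbf{A} + \nabla f$.
(B). Jackson [60]
Remark 1. $(\mathbf{a}\cdot\nabla)\,\mathbf{n} f(r) = \frac{f(r)}{r}[\mathbf{a} - \mathbf{n}(\mathbf{a}\cdot\mathbf{n})] + \mathbf{n}(\mathbf{a}\cdot\mathbf{n})\frac{\partial f}{\partial r}$ [Jackson [60, front flyleaves: Vector Formulas]].
Proof. The formula follows from Wangsness [121, p.34, (1-121)].
Remark 2. $\nabla(\mathbf{x}\cdot\mathbf{a}) = \mathbf{a} + \mathbf{x}(\nabla\cdot\mathbf{a}) + i(\mathbf{L}\times\mathbf{a})$, where $\mathbf{L} = \frac{1}{i}(\mathbf{x}\times\nabla)$ [Jackson [60, front flyleaves: Vector Formulas]].
Proof. $\nabla\cdot\mathbf{a} = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z}$.
\[ \nabla\times\mathbf{a} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ a_x & a_y & a_z \end{vmatrix}. \]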

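\[ \mathbf{x}\times(\nabla\times\mathbf{a}) = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ x & y & z \\ \frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z} & \frac{\partial a_x}{\partial z} - \frac{\partial a_z}{\partial x} & \frac{\partial a_y}{\partial x} - \frac{\partial a_x}{\partial y} \end{vmatrix}. \]

\[ \mathbf{x}\times\nabla = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ x & y & z \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \end{vmatrix}. \]

\[ (\mathbf{x}\times\nabla)\times\mathbf{a} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y} & z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z} & x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x} \\ a_x & a_y & a_z \end{vmatrix}. \]
The desired result follows from Wangsness [121, p.34, (1-112); (1-121)].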
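The identity of Remark 2 can be spot-checked numerically before working through the determinants. The sketch below is illustrative only: the polynomial test field `a_field`, the sample point, and the central-difference step are assumptions chosen for this check, not anything taken from Jackson or Wangsness. It compares the gradient of the scalar x·a with a + x(∇·a) + (x×∇)×a, using iL = x×∇.

```python
def a_field(p):
    """Hypothetical smooth test field a(x); any differentiable field would do."""
    x, y, z = p
    return (x * y + z * z, y * z - x, x * x * z + y)

def partials(func, p, h=1e-5):
    """Central differences: result[m][k] approximates d func_k / d x_m."""
    rows = []
    for m in range(3):
        plus, minus = list(p), list(p)
        plus[m] += h
        minus[m] -= h
        fp, fm = func(plus), func(minus)
        rows.append([(u - v) / (2.0 * h) for u, v in zip(fp, fm)])
    return rows

def lhs(p):
    """grad(x . a), obtained by differentiating the scalar x . a directly."""
    def dot(q):
        return (sum(qi * ai for qi, ai in zip(q, a_field(q))),)
    return [row[0] for row in partials(dot, p)]

def rhs(p):
    """a + x (div a) + (x cross grad) cross a."""
    x, y, z = p
    a = a_field(p)
    jac = partials(a_field, p)  # jac[m][k] = d a_k / d x_m
    div = jac[0][0] + jac[1][1] + jac[2][2]
    # d[j][k] = (x cross grad)_j applied to a_k, matching the second determinant row
    d = [
        [y * jac[2][k] - z * jac[1][k] for k in range(3)],
        [z * jac[0][k] - x * jac[2][k] for k in range(3)],
        [x * jac[1][k] - y * jac[0][k] for k in range(3)],
    ]
    cross = (d[1][2] - d[2][1], d[2][0] - d[0][2], d[0][1] - d[1][0])
    return [a[i] + p[i] * div + cross[i] for i in range(3)]
```

Agreement at a few sample points is of course no proof; it only guards against sign or index slips when expanding the determinants by hand.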

Remark 3. Jackson [60, p.34, (1.27)]
Proof. By Jackson [60, p.33, l.−1], if the observation point x is infinitesimally close to the inner side, $\int_{\text{disk}} d\Omega = 2\pi$; if the observation point x is infinitesimally close to the outer side, $\int_{\text{disk}} d\Omega = -2\pi$. The potential of the remainder containing a hole alone is continuous when the observation point passes through the hole because the integrand on the right side of Jackson [60, p.33, (1.24)] has no singularities.
Remark 4. $\nabla^2(\frac{1}{|\mathbf{x} - \mathbf{x}'|}) = -4\pi\delta(\mathbf{x} - \mathbf{x}')$ [Jackson [60, p.35, l.−4, (1.31)]].
Remark. The concept of the δ generalized function originated in physics, and physicists treated it as a function for many years. Consequently, some physicists think that we may treat the generalized function as a function when introducing the concept, continue to treat it so until reaching the critical juncture, and then jump to the right track by treating it as a generalized function. Perhaps this way will allow us to avoid a contradiction. In fact, treating the generalized function as a function in the beginning has already planted the seeds of contradiction. A contradiction fails to appear only because we fail to foresee it at that time. Actually, a contradiction must occur. The concept of generalized function is a more delicate mathematical concept than that of function. The traditional mathematical language for functions is too crude to clearly explain the concept of generalized function.
The first proof. See Pathria [88, p.501, l.3–l.11].

Remark. The above argument has problems. See Redžić [92, p.2, Remark †]. However, if we handle the singular point carefully, we can make the above argument work as in the second proof.
The second proof. See Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1477, l.−5–p.1478, l.−6].

Remark. In order to satisfy the conditions given in Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1477, l.−1–p.1478, l.1], we may let $g_\varepsilon(r) = 1/\varepsilon$ $(|r| \le \varepsilon)$. The second proof is not rigorous because we treat the δ-function as a function rather than a generalized function [Rudin [98, p.141, l.−7–l.−3]]. We discover that behind this significant but seemingly contradictory argument, there is actually a rich, deep and refined theory. The new theory requires more delicate analysis, language and formulation so that its meaning would not be ambiguous. A generalized function may be a function or a function whose domain contains a point which does not have a well-defined function value. However, for any testing function, the generalized function must have a well-defined value. Thus, distribution theory is a new theory that we create to avoid the contradiction that the domain of a function contains a point whose function value cannot be defined. The convergence $\lim_{\varepsilon\to 0} \frac{\sin(x/\varepsilon)}{\pi x} \to \delta(x)$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1470, (10)]] should not be interpreted as convergence in the pointwise sense. Otherwise, we will have a contradiction [Pathria [88, p.498, l.−8–l.−7]]. If we interpret the convergence as convergence in the distribution sense [Rudin [98, p.146, l.9]], the previous contradiction will not occur. If we use the concept of generalized functions, the second proof actually shows that $g_\varepsilon \to -4\pi\delta$ in the distribution sense [Rudin [98, p.146, l.9]]. Thus, the second proof can be made rigorous using distribution theory, and so can the third proof.
The third proof. See Jackson [60, p.35, l.1–l.−4].
Remark 1. Jackson’s proof is correct, but is disorganized because it uses the methods of distribution theory, but fails to use the theory’s terminology. A theory has its structures. Only through the use of the theory’s terminology may we clarify the structure of the proof and preserve its logical rigor.
Remark 2.
The above proof can be translated into the language of distribution theory as follows: Let $r_a = \sqrt{r^2 + a^2}$.
$\lim_{a\to 0} \langle \nabla^2(\frac{1}{r_a}) \,|\, \rho \rangle = \lim_{a\to 0} \iiint_{|r|\le R} d^3x\, \nabla^2(\frac{1}{r_a})\,\rho(\mathbf{x})$
$= \lim_{a\to 0} 4\pi\varepsilon_0 \nabla^2\Phi_a(0) = -4\pi\rho(0) = \langle -4\pi\delta_0 \,|\, \rho \rangle$.
Thus, $\lim_{a\to 0} \nabla^2(\frac{1}{r_a}) = -4\pi\delta_0$ [Rudin [98, p.146, l.4–l.7]].
The fourth proof. See Redžić [92, p.5, l.6–p.6, l.9].
Remark 5. By Jackson [60, p.35, (1.31)] and Wangsness [121, p.36, (1-135)], $\nabla'^2(\frac{1}{|\mathbf{x} - \mathbf{x}'|}) = -4\pi\delta(\mathbf{x} - \mathbf{x}')$ [Jackson [60, p.36, l.−3]].
Remark. (How we understand the physical meaning of a mathematical theorem) In view of Jackson [60, p.36, l.−14–p.37, l.16; p.37, l.−15–l.−14; p.38, l.11–l.17; p.39, (1.42)–(1.46)], the concept of dipole layer is the key to understanding the physical meaning of Green’s theorem or those of boundary conditions. This is the reason why Jackson discusses dipole layers [Jackson [60, §1.6]] before boundary conditions [Jackson [60, §1.8–§1.10]]. However, it is difficult to understand the former topic without knowing dipoles or point dipoles in advance. Therefore, one would be better prepared if one reads Jackson [60, §1.6] again after becoming familiar with dipoles and point dipoles.
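The regularization behind the third proof of Remark 4 can be checked numerically. The sketch below rests on the standard closed form $\nabla^2 (r^2 + a^2)^{-1/2} = -3a^2/(r^2 + a^2)^{5/2}$; the function names, the Gaussian test density, the radius R = 1, and the midpoint-rule parameters are illustrative assumptions. It pairs the smooth Laplacian with a spherically symmetric test density and watches the result approach −4πρ(0) as a decreases.

```python
import math

def laplacian_reg(r, a):
    """Laplacian of 1/sqrt(r^2 + a^2); smooth everywhere for a > 0."""
    return -3.0 * a * a / (r * r + a * a) ** 2.5

def pair_with(rho, a, big_r=1.0, steps=100000):
    """Midpoint-rule estimate of the integral over |x| <= big_r of laplacian_reg * rho(|x|)."""
    h = big_r / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        # 4*pi*r^2 dr is the volume of a thin spherical shell
        total += 4.0 * math.pi * r * r * laplacian_reg(r, a) * rho(r)
    return total * h

def distance_from_limit(a):
    """|<laplacian_reg | rho> + 4*pi*rho(0)| for the illustrative density rho(r) = exp(-r^2)."""
    return abs(pair_with(lambda r: math.exp(-r * r), a) + 4.0 * math.pi)
```

For the constant density ρ = 1 the integral has the closed form $-4\pi R^3/(R^2 + a^2)^{3/2}$, which gives an independent check of the quadrature. The convergence holds only after integrating against a test density, which is the point of reading the limit in the distribution sense.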

Remark 6. $\frac{\partial \mathbf{x}}{\partial u} = U\hat{u}$ [Jackson [60, p.51, l.11]; Hawkins [53, p.89, (5-9)–(5-12); p.90, l.4–l.5]]. $\hat{u}$, $\hat{v}$ & $\hat{w}$ are pairwise orthogonal. By Rudin [99, p.186, (1)], $\delta(\mathbf{x} - \mathbf{x}') = \delta(u - u')\,\delta(v - v')\,\delta(w - w') \cdot UVW$ [Jackson [60, p.51, l.13]].
Remark 1. The Jacobian cannot vanish identically [Hawkins [53, p.88, l.1]]. Otherwise, one column can be expressed as a linear combination of the other two columns.
Remark 2. By O’Neill [86, p.148, Lemma 3.8], $\nabla z_1$ represents a vector at P normal to the surface $z_1 = c_1$ [Hawkins [53, p.89, l.−4–l.−3]].
Remark 3. These two sets of unit vectors are identical if the curvilinear coordinate system is orthogonal [Hawkins [53, p.90, l.4–l.5]].

Proof. Let $S_i$ be the surface of $z_i = z_i(y_1, y_2, y_3)$ and $T_i$ be the tangent plane to $S_i$ at $P = (y_1, y_2, y_3)$. Because the normal vector of the tangent plane $T_i$ at P is along the line of intersection of the tangent planes $T_j$ and $T_k$ ($j \neq k$, $j \neq i \neq k$) at P, the tangent vector $e_i$ of the coordinate curve $Z_i$ at P is the normal vector $E_i$ of the coordinate surface $S_i$ at P.
Remark 7. The last paragraph of Jackson [60, §2.2] says only that

\[ \sigma = -\varepsilon_0 \frac{\partial\Phi}{\partial x} = \begin{cases} -\dfrac{q}{4\pi a^2}\left(\dfrac{a}{y}\right) \dfrac{1 - \frac{a^2}{y^2}}{\left(1 + \frac{a^2}{y^2} - 2\frac{a}{y}\cos\gamma\right)^{3/2}} & \text{if } q \text{ is outside the sphere} \\[2ex] -\dfrac{q}{4\pi a^2}\left(\dfrac{a}{y}\right) \dfrac{\frac{a^2}{y^2} - 1}{\left(1 + \frac{a^2}{y^2} - 2\frac{a}{y}\cos\gamma\right)^{3/2}} & \text{if } q \text{ is inside the sphere.} \end{cases} \]
Remark. The normal derivative out of the conductor given in Jackson [60, p.60, l.−10] refers to $\frac{\partial\Phi}{\partial x}$ when q is outside the sphere.
Remark 8. l is 0 or a positive integer; m is an integer such that $-l \le m \le l$ [Jackson [60, p.107, l.−2–l.−1]].
Proof. If r is viewed as a constant, then the left side of Jackson [60, p.95, (3.1)] can be treated as the operator $L^2$ given in Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.662, (D-6-a)].
By Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.648, (C-10)], $l \ge 0$ (can be an integer or a half-integer).
Why do we choose $l \ge 0$ as our convention? The roots of the algebraic equation $j(j+1) = l(l+1)$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.648, (C-11)]] give the microscopic reason, while the roots of the differential equation $\frac{d^2U}{dr^2} - \frac{l(l+1)}{r^2}U = 0$ [Jackson [60, p.96, (3.7)]] give the macroscopic reason.
By Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.649, l.11], $-l \le m \le l$.
By Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.664, l.4–l.5], l is an integer.

Remark 9. (Proofs for Bessel functions)
Jackson [60, p.112, (3.79) & (3.80); p.113, (3.82) & (3.83)]: see Watson [123, p.39, l.1–p.40, l.18].
The series given in Jackson [60, p.113, (3.82) & (3.83)] converge for all finite values of x: see Guo–Wang [48, p.348, l.−4–p.349, l.3].
If ν is not an integer, these two solutions $J_{\pm\nu}(x)$ form a pair of linearly independent solutions to the second-order Bessel equation [Jackson [60, p.113, l.9–l.11]]: see Guo–Wang [48, p.348, l.8–l.10].

$N_\nu(x)$ is linearly independent of $J_\nu(x)$ [Jackson [60, p.113, l.−14]]: see Watson [123, p.76, l.6, (1)].
Jackson [60, p.113, (3.84)]: see Watson–Whittaker [122, p.356, l.3].
$N_n(x)$ [Jackson [60, p.113, (3.85)]] is a solution of the Bessel equation for functions of order n: see Watson [123, p.59, l.10; p.64, (1) & (2)].
The series expansion of $N_\nu(x)$ [Jackson [60, p.113, l.−13–l.−12]]: see Watson–Whittaker [122, p.372, l.7–l.8] or Guo–Wang [48, p.167, (5) & (6)].
Jackson [60, p.113, (3.87) & (3.88)]: see Watson [123, p.45, (1) & (2); p.66, (1) & (2); p.74, (3) & (4)] or Guo–Wang [48, p.349, (13) & (14); p.369, l.8–l.9].
Jackson [60, p.114, (3.89)]: see Watson [123, p.40, l.18] or Guo–Wang [48, p.348, (7)].
Jackson [60, p.114, (3.90)]: see Guo–Wang [48, p.367, (6)].
Jackson [60, p.114, (3.91)]: see Guo–Wang [48, p.379, (5) & (6)].
Each Bessel function has an infinite number of roots [Jackson [60, p.114, l.−19–l.−18]]: see Guo–Wang [48, p.420, l.−8–l.−7].
The asymptotic formula given in Jackson [60, p.114, l.−10]: see Guo–Wang [48, p.423, (3)].
Jackson [60, p.115, (3.95)]: see Guo–Wang [48, p.424, (3)] and Jackson [60, p.113, (3.87) & (3.88)].
The statements given in Jackson [60, p.116, l.−20–l.−1]: see Guo–Wang [48, p.374, l.12, (3), l.15–l.16, (7); p.375, l.1–l.2, (10) & (11); p.378, (3); p.379, (5)].
Jackson [60, p.118, (3.108)]: see Watson–Whittaker [122, p.362, l.9]; Eq. (5.24) in https://people.math.osu.edu/gerlach.1/math/BVtypset/node122.html#eq:tilted_planewave; Eq. (5.44) in https://people.math.osu.edu/gerlach.1/math/BVtypset/node127.html.
Remark 10. $\frac{1}{r}\frac{d^2}{dr^2}(r g_l(r,r')) - \frac{l(l+1)}{r^2} g_l(r,r') = -\frac{4\pi}{r^2}\delta(r - r')$ [Jackson [60, p.121, l.2, (3.120)]]
Proof. Let $G(\mathbf{x},\mathbf{x}') = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_{lm}(r \,|\, r',\theta',\phi')\, Y_{lm}^*(\theta',\phi')\, Y_{lm}(\theta,\phi)$ [Jackson [60, p.120, (3.118)] follows from a theorem similar to Coddington–Levinson [16, p.197, Theorem 4.1]]
$= \sum_{l=0}^{\infty}\sum_{m=-l}^{l} g_l(r,r')\, Y_{lm}^*(\theta',\phi')\, Y_{lm}(\theta,\phi)$ [G is symmetric in $(\theta,\phi)$ and $(\theta',\phi')$].
$\nabla_x^2 G = \frac{1}{r}\frac{\partial^2}{\partial r^2}(rG) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\frac{\partial G}{\partial\theta}) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 G}{\partial\phi^2}$ [Wangsness [121, p.33, l.−1, (1-105)]]
$= \sum_{l=0}^{\infty}\sum_{m=-l}^{l} [\frac{1}{r}\frac{\partial^2}{\partial r^2}(r g_l(r,r')) - \frac{l(l+1)}{r^2} g_l(r,r')]\, Y_{lm}^*(\theta',\phi')\, Y_{lm}(\theta,\phi)$ [Guo–Wang [48, p.242, l.−8–l.−7; p.244, (2)]].
$-4\pi\delta(\mathbf{x} - \mathbf{x}')$
$= -\frac{4\pi}{r^2}\delta(r - r')\,\delta(\phi - \phi')\,\delta(\cos\theta - \cos\theta')$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1477, (60)]]
$= -\frac{4\pi}{r^2}\delta(r - r') \sum_{l=0}^{\infty}\sum_{m=-l}^{l} Y_{lm}^*(\theta',\phi')\, Y_{lm}(\theta,\phi)$ [Jackson [60, p.108, (3.56)]].
Because $\nabla_x^2 G = -4\pi\delta(\mathbf{x} - \mathbf{x}')$ [Jackson [60, p.120, (3.116)]] and $\{Y_{lm}(\theta,\phi)\}_{lm}$ form a basis [Jackson [60, p.108, l.−14–l.−10; p.109, (3.58)]],
$[\frac{1}{r}\frac{d^2}{dr^2}(r g_l(r,r')) - \frac{l(l+1)}{r^2} g_l(r,r')]\, Y_{lm}^*(\theta',\phi') = -\frac{4\pi}{r^2}\delta(r - r')\, Y_{lm}^*(\theta',\phi')$.
Remark 11. $\rho(\mathbf{x}') = \frac{Q}{2\pi a^2}\delta(r' - a)\,\delta(\cos\theta')$ [Jackson [60, p.123, l.−17, (3.129)]]
Proof. If a = 1, the charge density on $|\mathbf{x}'| = a$ is $\frac{Q}{2\pi}\delta(r' - a)\,\delta(\cos\theta')$. The area of $|\mathbf{x}'| = a$ is proportional to the square of radius a, so the charge density on $|\mathbf{x}'| = a$ is inversely proportional to the square of radius a.
Remark 12. There is a sign difference between the first equality of Jackson [60, p.59, (2.5)] and the first equality of Jackson [60, p.124, (3.137)]. The difference can be explained using Griffiths

[46, p.90, (2.35) & (2.36)], where $\hat{n}$ refers to the outward normal of a sphere. For the case of Jackson [60, p.59, Figure 2.3], $\frac{\partial V_{\text{below}}}{\partial n} = 0$. Notice that the field lines are directed from the positive charge to the negative induced charges. In contrast, for the case of Jackson [60, p.124, (3.137)], $\frac{\partial V_{\text{above}}}{\partial n} = 0$.
Remark 13. (An easy way to make the discussion of δ(x) rigorous) $\delta(\phi - \phi') = \frac{1}{2\pi}\sum_{m=-\infty}^{\infty} e^{im(\phi - \phi')}$ on $[-\pi,\pi)$ [Jackson [60, p.125, l.13, (3.139)(ii)]]
1st proof. $D_n(x) = \frac{\sin(n+1/2)x}{\sin(x/2)}$ [Rudin [97, p.174, (77)]]
$\Rightarrow D_n(2x) = \frac{\sin(2n+1)x}{\sin x}$
$\Rightarrow \frac{1}{\pi}\lim_{n\to\infty} D_n(2x) = \delta(x)$ [Born–Wolf [8, p.897, (20) & (21)]]
$\Rightarrow \frac{1}{\pi}\sum_{n=-\infty}^{\infty} e^{in(2x)} = \delta(x)$ [Rudin [97, p.174, (76)(i)]]
$\Rightarrow \frac{1}{\pi}\sum_{n=-\infty}^{\infty} e^{inx} = \delta(x/2) = 2\delta(x)$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1471, (20)]].
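The two ingredients of the first proof, the closed form of the Dirichlet kernel and its δ-like action on a test function, can be illustrated numerically. This is a sketch under stated assumptions: the function names, the periodic test function exp(cos t), and the midpoint-rule resolution are illustrative choices, not part of Rudin's or Jackson's text.

```python
import cmath
import math

def dirichlet_sum(n, x):
    """D_n(x) as the finite sum of exp(i*k*x) for k = -n, ..., n."""
    return sum(cmath.exp(1j * k * x) for k in range(-n, n + 1)).real

def dirichlet_closed(n, x):
    """Closed form sin((n + 1/2) x) / sin(x / 2), valid when sin(x/2) != 0."""
    return math.sin((n + 0.5) * x) / math.sin(0.5 * x)

def fourier_partial_sum(f, x, n, steps=20000):
    """s_n(f; x) = (1/(2*pi)) * integral over [-pi, pi] of f(x - t) D_n(t) dt.

    Midpoint rule; with an even number of steps no midpoint equals t = 0,
    so the closed form of D_n never divides by zero.
    """
    h = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        t = -math.pi + (i + 0.5) * h
        total += f(x - t) * dirichlet_closed(n, t)
    return total * h / (2.0 * math.pi)
```

For a smooth 2π-periodic f, `fourier_partial_sum(f, x, n)` approaches f(x) as n grows, which is the sense in which $D_n/2\pi$ acts like δ on $[-\pi,\pi)$. The kernels themselves do not converge pointwise.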

2nd proof. Let f be a continuous function of bounded variation with a period 2π.
$s_n(f;x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)\,D_n(t)\,dt$ [Rudin [97, p.175, (82)]]
$\Rightarrow \int_{-\pi}^{\pi} f(x - t)\,(\frac{1}{2\pi}\sum_{m=-\infty}^{\infty} e^{imt})\,dt = \lim_{n\to\infty} s_n(f;x) = f$ [Royden [96, p.232, Proposition 18]; Zygmund [127, vol. 1, p.57, Theorem (8.1)(ii)]]
$= \int_{-\pi}^{\pi} f(x - t) \sum_{q=-\infty}^{\infty} \delta(t - 2\pi q)\,dt$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1473, (31)]]
$= \int_{-\infty}^{\infty} f(x - t)\,\delta(t)\,dt$
$= \int_{-\pi}^{\pi} f(x - t)\,\delta(t)\,dt$ [$\delta(t) = 0$ if $t \notin [-\pi,\pi)$].
$\Rightarrow \delta(t) = \frac{1}{2\pi}\sum_{m=-\infty}^{\infty} e^{imt}$ a.e. on $[-\pi,\pi)$ [Rudin [99, p.31, Theorem 1.39(b)]]
Remark 1. Strictly speaking, the first proof is not good because it fails to consider the requirement given in Reif [93, p.614, l.−7]. Neither proof is rigorous because a generalized function should not be treated as a function. In order to correct the problem, we should use the language of functional analysis. Actually, the required supplement is not much. For the discussion of the Dirac delta function, it requires only Rudin [98, p.142, l.−6, (1); p.155, (2) & (5)] to bridge the gap between a function and a generalized function. For the discussion of derivatives of δ [Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1476, b]], it requires only Rudin [98, p.144, (1), (2) & (3)] to bridge the gap between a function and a generalized function.
Remark 2. The discussion given in Cohen-Tannoudji–Diu–Laloë [17, vol. 2, Appendix II] is not good because a generalized function should not be treated as a function. It requires a rigorous theory to correct and support the discussion. The theory contained in Rudin [98, chap. 6] is rigorous, but it fails to directly apply to the Dirac delta generalized function. Many physicists fail to understand the theory. This is the reason why theory and applications are easily disconnected. Therefore, it is important to identify their connections.
Remark 14. $\frac{1}{\rho}\frac{d}{d\rho}(\rho\frac{dg_m}{d\rho}) - (k^2 + \frac{m^2}{\rho^2})g_m = -\frac{4\pi}{\rho}\delta(\rho - \rho')$ [Jackson [60, p.125, (3.141)]].
Proof.
Let $G(\mathbf{x},\mathbf{x}') = \frac{1}{2\pi^2}\sum_{m=-\infty}^{\infty}\int_0^\infty dk\, e^{im(\phi - \phi')}\cos[k(z - z')]\, g_m(k,\rho,\rho')$ [Jackson [60, p.125, (3.140)] follows from a theorem similar to Coddington–Levinson [16, p.197, Theorem 4.1]].
$\nabla_x^2 G = \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho\frac{\partial G}{\partial\rho}) + \frac{1}{\rho^2}\frac{\partial^2 G}{\partial\phi^2} + \frac{\partial^2 G}{\partial z^2}$ [Wangsness [121, p.31, (1-89)]]
$= \frac{1}{2\pi^2}\sum_{m=-\infty}^{\infty}\int_0^\infty dk\, e^{im(\phi - \phi')}\cos[k(z - z')]\,[\frac{1}{\rho}\frac{d}{d\rho}(\rho\frac{dg_m}{d\rho}) - (k^2 + \frac{m^2}{\rho^2})g_m]$.
$-4\pi\delta(\mathbf{x} - \mathbf{x}')$
$= -\frac{4\pi}{\rho}\delta(\rho - \rho')\,\delta(\phi - \phi')\,\delta(z - z')$ [Jackson [60, p.120, l.−2]]
$= -\frac{4\pi}{\rho}\delta(\rho - \rho')\,[\frac{1}{2\pi^2}\sum_{m=-\infty}^{\infty}\int_0^\infty dk\, e^{im(\phi - \phi')}\cos[k(z - z')]]$ [Jackson [60, p.125, (3.139)]].
Because $\nabla_x^2 G = -4\pi\delta(\mathbf{x} - \mathbf{x}')$ [Jackson [60, p.120, (3.116)]] and $\{e^{im\phi}\cos kz,\ e^{im\phi}\sin kz\}_{mk}$ form a basis,
$\frac{1}{\rho}\frac{d}{d\rho}(\rho\frac{dg_m}{d\rho}) - (k^2 + \frac{m^2}{\rho^2})g_m = -\frac{4\pi}{\rho}\delta(\rho - \rho')$.
Remark 15. (Why we should emphasize integration studies) It can be said that physics integrates various topics in differential equations. For example, Jackson [60, §3.11 Expansion of Green Functions in Cylindrical Coordinates] integrates the general form of 3-dim Green’s function [Jackson [60, p.38, (1.31); p.125, (3.142)]], the 1-dim Green’s function [Jackson [60, p.125, (3.143)]], the Sturm–Liouville system [Jackson [60, p.126, (3.145)]] and the Wronskian normalization [Jackson [60, p.126, (3.146)]]. In differential equations, we usually treat them as independent and disconnected topics. However, when we put them into one physical system simultaneously to serve a special purpose (for the present case, the computation of the potential of a unit point charge), we should consider and ensure the compatibility among these topics. They are interrelated. The assignment of the value of a parameter of one subsystem may affect another subsystem. By $(pW)' = \psi_1(p\psi_2')' - \psi_2(p\psi_1')'$, the Wronskian of two independent solutions of Jackson [60, p.126, (3.145)] is c/p(x), where c is a constant.
If we assign c to be $-4\pi$ as in Jackson [60, p.126, (3.146)], we will obtain Jackson [60, p.125, (3.143)], which can be used to compute the potential of a unit point charge. If we use Birkhoff–Rota [5, p.286, l.20] instead, our calculation will not obtain the correct potential. Besides leading us to the consideration of compatibility, integration studies may also help us
(1). Trace back to natural origins. From the viewpoint of one dimension alone, the formula given in Birkhoff–Rota [5, p.286, l.20] looks artificial. However, once it combines with integration studies, it will become natural: the integration through the normalization of Jackson [60, p.126, (3.146)] reveals that the radial consequence Jackson [60, p.125, (3.143)] originates from the natural symmetric Green’s function in three dimensions [Jackson [60, p.125, (3.142)]].
(2). Observe that a side problem for one subject may be the main problem of another subject. Coddington–Levinson [16, p.192, Theorem 2.2(iii)] says that as a function of t, G satisfies $Lx = lx$ for $t \neq \tau$. How about if $t = \tau$? Even if the answer may help better understand Coddington–Levinson [16, p.192, Theorem 2.2], we often ignore this side problem. This is because the δ-function is an indefinable object in the classical theory of ordinary differential equations. At best, we can only say that $G^{(n-1)}(\tau,\tau,l)$ does not exist [Coddington–Levinson [16, p.192, Theorem 2.2(ii)]]. However, in functional analysis, the δ-function can be rigorously defined [Rudin [98, p.141, l.−7–l.−3]]. Then the above side problem becomes interesting and can be completely solved [Rudin [98, p.206, Exercise 10; p.378, l.−6]].
Remark 16. $A = 4\pi$ [Jackson [60, p.126, l.−8]]
Proof. By substituting $\psi_1(x) = A I_m(x)$ and $\psi_2(x) = K_m(x)$ [Jackson [60, p.126, l.−21]] into Jackson [60, p.126, (3.146)], we have
$A\,W[I_m(x), K_m(x)] = -\frac{4\pi}{x}$ (∗).
Substitute the limiting forms [Jackson [60, p.116, (3.102) & (3.103)]] into (∗) and then compare the leading coefficients of the two sides of (∗).
Remark 17. If $f : [0,\infty) \to \mathbb{C}$, then Titchmarsh [113, p.2, (1.1.4)] is valid for $x \ge 0$. Jackson [60, p.126, (3.151)] follows from this formula.
Remark 18. Jackson [60, p.128, (3.163)] follows from Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.102, (A-47)].

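Remark 19. $G(\mathbf{x},\mathbf{x}') = \frac{16\pi}{ab}\sum_{l,m=1}^{\infty} \sin(\frac{l\pi x}{a})\sin(\frac{l\pi x'}{a})\sin(\frac{m\pi y}{b})\sin(\frac{m\pi y'}{b})\,\frac{\sinh(K_{lm}z_<)\,\sinh[K_{lm}(c - z_>)]}{K_{lm}\sinh(K_{lm}c)}$, where $K_{lm} = \pi(l^2/a^2 + m^2/b^2)^{1/2}$ [Jackson [60, p.129, (3.168)]].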

Proof. I. Let $G(\mathbf{x},\mathbf{x}') = \frac{16\pi}{ab}\sum_{l,m=1}^{\infty} \sin(\frac{l\pi x}{a})\sin(\frac{l\pi x'}{a})\sin(\frac{m\pi y}{b})\sin(\frac{m\pi y'}{b})\,g(l,m,z,z')$ (by symmetry and a theorem similar to Coddington–Levinson [16, p.197, Theorem 4.1]).
$\nabla_x^2 G = (\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2})G$
$= \frac{16\pi}{ab}\sum_{l,m=1}^{\infty} \sin(\frac{l\pi x}{a})\sin(\frac{l\pi x'}{a})\sin(\frac{m\pi y}{b})\sin(\frac{m\pi y'}{b})\,[\frac{\partial^2 g}{\partial z^2} - (\frac{l^2\pi^2}{a^2} + \frac{m^2\pi^2}{b^2})g]$.
$-4\pi\delta(\mathbf{x} - \mathbf{x}')$
$= -4\pi\delta(x - x')\,\delta(y - y')\,\delta(z - z')$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1477, (59)]]
$= -4\pi\delta(z - z')\sum_{l,m=1}^{\infty} \frac{4}{ab}\sin(\frac{l\pi x}{a})\sin(\frac{l\pi x'}{a})\sin(\frac{m\pi y}{b})\sin(\frac{m\pi y'}{b})$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.100, (A-32)]].
Because $\nabla_x^2 G = -4\pi\delta(\mathbf{x} - \mathbf{x}')$ [Jackson [60, p.120, (3.116)]] and $\{\sin(\frac{l\pi x}{a})\sin(\frac{m\pi y}{b})\}_{lm}$ are linearly independent,
$\frac{\partial^2 g}{\partial z^2} - K_{lm}^2\, g = -\delta(z - z')$.
II. The desired result follows from Birkhoff–Rota [5, p.286, (67)].
Remark. (The general method of finding a Green function’s eigenfunction expansion: using symmetry) In order to reduce the problem of finding a 3-dim Green function to the problem of finding a 1-dim Green function, we should summarize the proof of Jackson [60, p.121, l.2, (3.120)], the proof of Jackson [60, p.125, (3.141)], and Part I of the above proof as follows:
Put the unit charge into the volume of interest. Let $\mathbf{x}'$ be its position. Let x, y, z be the Green function’s three variables. Now use $z'$ to divide the volume into two regions: I. $\{\mathbf{x} \,|\, z < z'\}$; II. $\{\mathbf{x} \,|\, z > z'\}$. In these two regions, the Poisson equation $\nabla_x^2 G = -4\pi\delta(\mathbf{x} - \mathbf{x}')$ is reduced to the Laplace equation $\nabla_x^2 G = 0$. Let $\{\phi_{lm}(x,y)\}_{lm}$ be the basis of the solution space. By symmetry and a theorem similar to Coddington–Levinson [16, p.197, Theorem 4.1], we have $G(\mathbf{x},\mathbf{x}') = \sum_{lm} g_{lm}(z,z')\,\phi_{lm}(x,y)\,\phi_{lm}(x',y')$. By substituting this expression for G into $\nabla_x^2 G = -4\pi\delta(\mathbf{x} - \mathbf{x}')$ and using Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.100, (A-32)], we will obtain the equation for the 1-dim Green function.
In the last paragraph, we have used the fact that G is symmetric in $(x,y)$ and $(x',y')$.
In order to find the solutions of the equation for the 1-dim Green function, we should use the fact that G is symmetric in z and $z'$. This usage of symmetry is more subtle, more refined, and more interesting than the previous one. In view of the example given in Coddington–Levinson [16, p.222, l.9–l.14], the algebraic methods of solving the boundary value problems such as Jackson [60, p.121, l.2, (3.120)] ($g_l(0)$ is finite; $g_l(\infty) = 0$), Jackson [60, p.125, (3.141)] ($g_m(0)$ is finite; $g_m(\infty) = 0$), and Birkhoff–Rota [5, p.286, Theorem 12] are essentially the same. No wonder Jackson [60, p.120, l.1–l.9] and Jackson [60, p.125, l.−11–l.−3] have similar geometric interpretations for region I: $\{\mathbf{x} \mid z < z'\}$ and region II: $\{\mathbf{x} \mid z > z'\}$. Jackson should have quoted Birkhoff–Rota [5, p.286, Theorem 12] whenever necessary instead of repeating its proof many times.

Remark 20. Jackson [60, p.131, (3.175)] follows from Guo–Wang [48, p.403, (3)].

Remark 21. $B_l = \frac{P_l(|\cos\theta|)}{r^{l+1}}$ [Jackson [60, p.132, l.2, (3.176)]].
Proof. Fix ρ. By Jackson [60, p.131, (3.174)], $B_l(\rho,z)$ is even in z. Consequently, it suffices to consider the case $z \ge 0$. Assume $\frac{P_l(z/r)}{r^{l+1}} = \frac{(-1)^l}{l!}\frac{\partial^l}{\partial z^l}(\frac{1}{r})$. It suffices to prove $\frac{\partial}{\partial z}(\frac{P_l(z/r)}{r^{l+1}}) = -(l+1)\frac{P_{l+1}(z/r)}{r^{l+2}}$.

Proof. I.

$\frac{\partial}{\partial z}P_l(z/r) = P_l'(z/r)\frac{\partial(z/r)}{\partial z}$
$= -\frac{1}{r}P_l'(z/r)((z/r)^2-1)$
$= -\frac{1}{r}[l(z/r)P_l(z/r) - lP_{l-1}(z/r)]$ [Watson–Whittaker [122, p. 309, (V)]].

II.

$\frac{\partial}{\partial z}(\frac{P_l(z/r)}{r^{l+1}}) = [\frac{\partial}{\partial z}P_l(z/r)]r^{-(l+1)} + P_l(z/r)[-(l+1)]r^{-(l+2)}(z/r)$
$= -\frac{1}{r^{l+2}}[l(z/r)P_l(z/r) - lP_{l-1}(z/r)] + P_l(z/r)[-(l+1)]r^{-(l+2)}(z/r)$ [by I]
$= -\frac{1}{r^{l+2}}[(l+1)P_{l+1}(z/r)]$ [Watson–Whittaker [122, p. 308, (II)]].
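
The derivative identity used in the proof of Remark 21 can be spot-checked numerically. The following is only a minimal sketch and not part of the original argument: the helper names `legendre`, `f`, and `check_identity` are hypothetical, the Legendre polynomials are generated by the standard three-term recurrence, and the z-derivative is approximated by a central difference with an assumed step `h`.

```python
import math

def legendre(l, x):
    """P_l(x) from the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def f(l, rho, z):
    """P_l(z/r) / r^(l+1) with r = sqrt(rho^2 + z^2)."""
    r = math.hypot(rho, z)
    return legendre(l, z / r) / r ** (l + 1)

def check_identity(l, rho, z, h=1e-5):
    """Return (central-difference d/dz of f_l, -(l+1) * f_{l+1}) at (rho, z)."""
    lhs = (f(l, rho, z + h) - f(l, rho, z - h)) / (2 * h)
    rhs = -(l + 1) * f(l + 1, rho, z)
    return lhs, rhs

if __name__ == "__main__":
    for l in range(4):
        lhs, rhs = check_identity(l, rho=0.7, z=1.3)
        print(l, lhs, rhs)  # the two columns should agree to several digits
```

The sample point (ρ, z) = (0.7, 1.3) is arbitrary; any point with z ≥ 0 and r > 0 can be used.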

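Remark 22. Jackson [60, p.132, l.−5, (3.180)] follows from Guo–Wang [48, p.377, (3); p.406, (8)].

Remark 23. Jackson [60, p.133, l.−7, (3.185)] follows from Gradshteyn–Ryzhik [44, formula 6.752.1 & formula 6.752.2].

Remark 24. Jackson [60, p.134, l.10, (3.187)] follows from Guo–Wang [48, p.377, (3); p.406, (8); p.137, (10)].

Remark 25. $\nabla^2\mathbf{B} - k_3\frac{k_2\alpha}{k_1}\frac{\partial^2\mathbf{B}}{\partial t^2} = 0$ [Jackson [60, p.779, (A.9)]].
Proof. $\nabla\times(\nabla\times\mathbf{B}) = \nabla(\nabla\cdot\mathbf{B}) - \nabla^2\mathbf{B}$ [Wangsness [121, p.34, (1-120)]]
$= -\nabla^2\mathbf{B}$ [Jackson [60, p.778, (A.8)(iv)]].
$\nabla\times(\nabla\times\mathbf{B}) = 4\pi k_2\alpha(\nabla\times\mathbf{J}) + \frac{k_2\alpha}{k_1}\frac{\partial}{\partial t}(\nabla\times\mathbf{E})$ [Jackson [60, p.778, (A.8)(ii)]]
$= -k_3\frac{k_2\alpha}{k_1}\frac{\partial^2\mathbf{B}}{\partial t^2}$ [Jackson [60, p.778, (A.8)(iii)], with $\mathbf{J} = 0$ in a source-free region].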
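
As a consistency check on Remark 25, the coefficient $k_3 k_2\alpha/k_1$ should equal $1/c^2$ in any unit system. The sketch below is an illustration only: the values of $(k_1, k_2, \alpha, k_3)$ for SI and Gaussian units are the customary ones and are quoted here from memory as an assumption, so they should be compared with the table in Jackson's appendix; the function name `wave_coefficient` is hypothetical.

```python
import math

C_SI = 299_792_458.0           # speed of light in m/s
MU0 = 4e-7 * math.pi           # assumed classical SI value of mu_0
EPS0 = 1.0 / (MU0 * C_SI**2)   # epsilon_0 chosen so that mu_0 * eps_0 = 1/c^2
C_CGS = C_SI * 100.0           # speed of light in cm/s

def wave_coefficient(k1, k2, alpha, k3):
    """Coefficient of the second time derivative in the wave equation for B."""
    return k3 * k2 * alpha / k1

# Assumed unit-system constants (to be checked against Jackson's table).
SI = dict(k1=1.0 / (4 * math.pi * EPS0), k2=MU0 / (4 * math.pi), alpha=1.0, k3=1.0)
GAUSSIAN = dict(k1=1.0, k2=1.0 / C_CGS**2, alpha=C_CGS, k3=1.0 / C_CGS)

if __name__ == "__main__":
    print(wave_coefficient(**SI) * C_SI**2)          # expected to be close to 1
    print(wave_coefficient(**GAUSSIAN) * C_CGS**2)   # expected to be close to 1
```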

Remark 26. In Gaussian units, inductance has the dimensions $t^2l^{-1}$ [Jackson [60, p.783, l.−7]].
Proof. By Jackson [60, p.777, (A.2)], q has the dimensions $m^{1/2}l^{3/2}t^{-1}$.
Hence $I^2$ has the dimensions $ml^3t^{-4}$.
By $U = \frac{1}{2}LI^2$, where the energy U has the dimensions $ml^2t^{-2}$, L has the dimensions $t^2l^{-1}$.

(C). Matveev [79]

Remark 1. By Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.798, Fig. 4], there are four quantum-mechanical states on the level n = 2 [Matveev [79, p.21, l.6–l.7]].

(D). Sadiku [100]

Remark 1. The Lorentz condition can be obtained from the continuity equation [Sadiku [100, p.388, l.−4]] by using Wangsness [121, p.518, l.9–p.519, l.−3]. Born–Wolf [8, p.79, l.−19–l.−18] provides a simpler proof.

Remark 2. Sadiku [100, p.443, l.−4–p.444, l.8].
Proof. $\mathbf{E}_{1s} = [E_{i0}e^{-j\beta_1 z} + \Gamma E_{i0}e^{j\beta_1 z}]\mathbf{a}_x$.
Let $\mathbf{E}_1 = \mathbf{E}_{1s}e^{j\omega t}$.

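$|\mathbf{E}_1|^2 = (\mathbf{E}_{1s}e^{j\omega t})\cdot(\mathbf{E}_{1s}^*e^{-j\omega t})$
$= |E_{i0}|^2(1+\Gamma e^{2j\beta_1 z})(1+\Gamma^*e^{-2j\beta_1 z})$
$= |E_{i0}|^2[1+|\Gamma|^2+2|\Gamma|\cos(2\beta_1 z+\theta)]$, where $\Gamma = |\Gamma|e^{j\theta}$.
I. Case θ = 0:
$\cos(2\beta_1 z) = 1 \Rightarrow (z_{max} = -\frac{n\pi}{\beta_1} = -\frac{n\lambda_1}{2}$, where $n = 0,1,2,3,\cdots)$.
$\cos(2\beta_1 z) = -1 \Rightarrow (z_{min} = -\frac{(2n+1)\pi}{2\beta_1} = -\frac{(2n+1)\lambda_1}{4}$, where $n = 0,1,2,3,\cdots)$.
II. Case θ = π:
$\cos(2\beta_1 z+\pi) = 1 \Rightarrow (z_{max} = -\frac{(2n+1)\pi}{2\beta_1} = -\frac{(2n+1)\lambda_1}{4}$, where $n = 0,1,2,3,\cdots)$.
$\cos(2\beta_1 z+\pi) = -1 \Rightarrow (z_{min} = -\frac{n\pi}{\beta_1} = -\frac{n\lambda_1}{2}$, where $n = 0,1,2,3,\cdots)$.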
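
The two cases above can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the values of Γ, β₁, and λ₁ are arbitrary sample choices, and `predicted_maxima` uses the general condition 2β₁z + θ = −2nπ, which reduces to Cases I and II for θ = 0 and θ = π.

```python
import cmath
import math

def field_sq(z, beta, gamma, e_i0=1.0):
    """|E_1|^2 = |E_i0|^2 |1 + Gamma exp(2 j beta z)|^2 for z <= 0."""
    return abs(e_i0) ** 2 * abs(1 + gamma * cmath.exp(2j * beta * z)) ** 2

def closed_form(z, beta, gamma, e_i0=1.0):
    """|E_i0|^2 [1 + |Gamma|^2 + 2|Gamma| cos(2 beta z + theta)]."""
    mag, theta = abs(gamma), cmath.phase(gamma)
    return abs(e_i0) ** 2 * (1 + mag ** 2 + 2 * mag * math.cos(2 * beta * z + theta))

def predicted_maxima(beta, theta, count=3):
    """Positions z <= 0 where cos(2 beta z + theta) = 1 (assumes 0 <= theta < 2 pi)."""
    return [-(2 * n * math.pi + theta) / (2 * beta) for n in range(count)]

if __name__ == "__main__":
    lam = 1.0                    # sample wavelength
    beta = 2 * math.pi / lam
    for gamma in (0.5, -0.5):    # theta = 0 and theta = pi
        theta = cmath.phase(gamma)
        for z in predicted_maxima(beta, theta):
            print(gamma, z / lam, field_sq(z, beta, gamma))
```

With |Γ| = 0.5 the maxima equal (1 + |Γ|)² |E_i0|² and the minima equal (1 − |Γ|)² |E_i0|², which is the usual standing-wave picture.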

Remark. When discussing $|\mathbf{E}_1|$, we define $\mathbf{E}_1$ to be $\mathbf{E}_{1s}e^{j\omega t}$ instead of $\Re(\mathbf{E}_{1s}e^{j\omega t})$ [See Sadiku [100, p.391, (9.67)]].

Remark 3. $LC = \mu\varepsilon$ [Sadiku [100, p.475, (11.1)(i)]] follows from p.534, (11.1.6)(ii) in http://www.ece.rutgers.edu/~orfanidi/ewa/ch11.pdf;
$\frac{G}{C} = \frac{\sigma}{\varepsilon}$ [Sadiku [100, p.475, (11.1)(ii)]] follows from p.540, l.−6 in http://www.ece.rutgers.edu/~orfanidi/ewa/ch11.pdf.

Remark 4. I. Parameters for coaxial line [Sadiku [100, p.475, Table 11.1]]
$R = \frac{1}{2\pi\delta\sigma_c}[\frac{1}{a}+\frac{1}{b}]$ [Sadiku [100, p.428, (10.57)(i)]].
$L = \frac{\mu}{2\pi}\ln\frac{b}{a}$ [Sadiku [100, p.344, l.7]].
$G = 2\pi\sigma[\ln(b/a)]^{-1}$ [Sadiku [100, p.233, l.−2]].
$C = 2\pi\varepsilon[\ln(b/a)]^{-1}$ [Sadiku [100, p.227, (6.28)]].
II. Parameters for two-wire line [Sadiku [100, p.475, Table 11.1]]
$R = \frac{1}{\pi a\delta\sigma_c}$ [Sadiku [100, p.428, (10.57)(i)]].
$L = \frac{\mu}{\pi}\cosh^{-1}\frac{d}{2a}$ [Wangsness [121, p.465, (27-68)(i)]; Sadiku [100, p.475, (11.1)(i)]].
$G = \frac{\pi\sigma}{\cosh^{-1}\frac{d}{2a}}$ [Wangsness [121, p.465, (27-68)(i)]; Sadiku [100, p.475, (11.1)(ii)]].
$C = \frac{\pi\varepsilon}{\cosh^{-1}\frac{d}{2a}}$ [Wangsness [121, p.465, (27-68)(i)]].

Remark 5. Sadiku [100, p.477, l.3–l.10] clearly explains why Wangsness [121, chapter 27] considers (V,I) instead of (E,H). In fact, Sadiku [100, p.477, (11.2)] explains
1. The similarity between $Z_0$ and η [Sadiku [100, p.480, l.1]],
2. Why Wangsness [121, p.455, l.1–l.−3] and Wangsness [121, p.462, l.−3–p.463, l.11] run parallel to Wangsness [121, p.389, l.−7–p.391, l.5], and
3. Why Wangsness [121, p.463, l.12–l.−11] runs parallel to Wangsness [121, p.382, l.8–p.383, l.4].
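
The relations of Remark 3 can be confirmed directly from the table entries of Remark 4. The following is a minimal sketch with hypothetical function names and arbitrary sample values for the geometry and the material constants; it only evaluates the formulas quoted in Remark 4.

```python
import math

def coax_params(a, b, mu, eps, sigma):
    """(L, G, C) per unit length for a coaxial line with radii a < b."""
    log_ratio = math.log(b / a)
    L = mu / (2 * math.pi) * log_ratio
    G = 2 * math.pi * sigma / log_ratio
    C = 2 * math.pi * eps / log_ratio
    return L, G, C

def two_wire_params(a, d, mu, eps, sigma):
    """(L, G, C) per unit length for two wires of radius a and spacing d > 2a."""
    ac = math.acosh(d / (2 * a))
    L = mu / math.pi * ac
    G = math.pi * sigma / ac
    C = math.pi * eps / ac
    return L, G, C

if __name__ == "__main__":
    mu, eps, sigma = 4e-7 * math.pi, 2.25 * 8.854e-12, 1e-5   # sample values
    for L, G, C in (coax_params(1e-3, 3e-3, mu, eps, sigma),
                    two_wire_params(1e-3, 1e-2, mu, eps, sigma)):
        print(L * C / (mu * eps), (G / C) / (sigma / eps))   # both ratios should be close to 1
```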

Remark 6. Sadiku [100, p.478, (11.8) & (11.9)] are derived more naturally than Wangsness [121, p.462, (27-44) & (27-45)].

Remark 7. G = 0 [Sadiku [100, p.480, l.−6]] follows from p.540, l.−6 in http://www.ece.rutgers.edu/~orfanidi/ewa/ch11.pdf.

Remark 8. $I_{max} = V_{max}/Z_0$ and $I_{min} = V_{min}/Z_0$ [Sadiku [100, p.487, l.13]] follow from [p.555, (11.6.7); p.567, (11.12.4)] in http://www.ece.rutgers.edu/~orfanidi/ewa/ch11.pdf.
Note. p.557, (11.7.10) in http://www.ece.rutgers.edu/~orfanidi/ewa/ch11.pdf.
Sadiku [100, p.487, (11.39a) & (11.39b)] follow from [p.570, (11.13.6), (11.13.7); p.567, (11.12.6)] in http://www.ece.rutgers.edu/~orfanidi/ewa/ch11.pdf.

Remark 9. For the reason why we define $r = \frac{2d^2}{\lambda}$ [Sadiku [100, p.592, (13.8)]] as the boundary between the near and the far zones, see p.730, (15.7.3) in http://www.ece.rutgers.edu/~orfanidi/ewa/ch15.pdf.

Remark 10. (How we tailor calculations to our needs)
We often do a lot of unnecessary calculations for the radiation zone: Sadiku [100, p.591, l.−11–l.−1; p.595, l.1–l.4; p.599, l.−1–p.600, l.−8] and Wangsness [121, p.477, l.−5–p.478, l.3; p.482, l.−2–p.483, l.4]. However, most of them will never be used. p.734, l.−11–l.−2 in http://www.ece.rutgers.edu/~orfanidi/ewa/ch15.pdf shows how we should tailor calculations to our needs by avoiding unnecessary ones. The unnecessary calculations not only waste time and space but may also easily leave a gap in the theory due to the failure to provide the calculations that we should provide. For example, Sadiku [100, p.599, (13.33)] and p.792, (17.8.1) in http://www.ece.rutgers.edu/~orfanidi/ewa/ch17.pdf are given without proofs. We may use p.3, l.−4–p.8, l.4 in http://www.ece.mcmaster.ca/faculty/nikolova/antenna_dload/current_lectures/L12_Loop.pdf to prove Sadiku [100, p.600, (13.35a) & (13.35b)] and p.792, (17.8.1) in http://www.ece.rutgers.edu/~orfanidi/ewa/ch17.pdf. p.6, (12.23) in http://www.ece.mcmaster.ca/faculty/nikolova/antenna_dload/current_lectures/L12_Loop.pdf can be proved by using p.792, (17.8.1) in http://www.ece.rutgers.edu/~orfanidi/ewa/ch17.pdf and Maxwell equations: $\mathbf{H} = \nabla\times\mathbf{A}$ and $\nabla\times\mathbf{H} = \frac{\partial\mathbf{D}}{\partial t} = j\omega\varepsilon\mathbf{E}$ [Wangsness [121, p.375, (24-4)]].
Remark 1. Since $d\mathbf{l}$ [p.4, (12.14) in http://www.ece.mcmaster.ca/faculty/nikolova/antenna_dload/current_lectures/L12_Loop.pdf] is on the xy-plane, $\mathbf{A}$ cannot have $\hat{\mathbf{r}}$ or $\hat{\boldsymbol{\theta}}$ components. Consequently, $\mathbf{A} = A_\varphi\hat{\boldsymbol{\varphi}}$ [p.5, l.5 in http://www.ece.mcmaster.ca/faculty/nikolova/antenna_dload/current_lectures/L12_Loop.pdf].
Remark 2. $\int_0^{\pi}\cos(n\varphi)e^{jz\cos\varphi}d\varphi = \pi j^n J_n(z)$ [p.5, (12.19) in http://www.ece.mcmaster.ca/faculty/nikolova/antenna_dload/current_lectures/L12_Loop.pdf].
Proof. $\int_{-\pi}^{\pi}e^{jz\sin\theta}e^{-jn\theta}d\theta = 2\pi J_n(z)$ [p.14, l.−3 in http://www.math.psu.edu/papikian/Kreh.pdf].
Let $\varphi = \frac{\pi}{2}-\theta$. Then, using the 2π-periodicity of the integrand,
$\int_{-\pi}^{\pi}e^{jz\cos\varphi}[\cos(n\varphi)+j\sin(n\varphi)]d\varphi = 2\pi j^n J_n(z)$.
However, $\int_{-\pi}^{\pi}e^{jz\cos\varphi}\sin(n\varphi)d\varphi = 0$ (since the integrand is an odd function of φ).

Remark 11. By replacing l in Sadiku [100, p.486, (11.34)] with $l' = 0$ [Sadiku [100, p.486, l.−9]], we have $Z_{in} = Z_L$ [Sadiku [100, p.603, l.−4]].

Remark 12. D = 1.64 [Sadiku [100, p.607, (13.46)]].
Proof. It suffices to prove $\cos^2(\frac{\pi}{2}\cos\theta) \le \sin^2\theta$.
By drawing the graph of $y = 1-\cos\theta$, we have
$1-\cos\theta \le \frac{2}{\pi}\theta$ $[0 \le \theta \le \frac{\pi}{2}]$. Then
$\sin^2[\frac{\pi}{2}(1-\cos\theta)] \le \sin^2\theta$, which is the desired inequality because $\cos(\frac{\pi}{2}\cos\theta) = \sin[\frac{\pi}{2}(1-\cos\theta)]$.

(E). Wangsness [121]

Remark 1. The definition of $(\mathbf{A}\cdot\nabla)\mathbf{B}$ given in Wangsness [121, p.34, (1-121)] is awkward. In contrast, the following definition is so simple and heuristic that we may immediately recognize its exact meaning from the notation itself:
$(\mathbf{A}\cdot\nabla)\mathbf{B} = (\mathbf{A}\cdot\nabla B_x, \mathbf{A}\cdot\nabla B_y, \mathbf{A}\cdot\nabla B_z)$.

Remark 2. The argument given in Wangsness [121, p.65, l.21–l.23] is incorrect. $\nabla\cdot\int_{V'}\frac{\rho(\mathbf{r}')\hat{\mathbf{R}}\,d\tau'}{R^2} = \int_{V'}\rho(\mathbf{r}')\nabla\cdot(\frac{\hat{\mathbf{R}}}{R^2})d\tau'$ [Wangsness [121, p.65, (4-23)]] follows from Rudin [99, p.246, Exercise 16].

Remark 3. It is incorrect to prove $E = \sigma/\varepsilon_0$ [Wangsness [121, p.94, l.9]] by using Wangsness [121, p.85, (6-4)]. The correct proof can be found in Griffiths [46, p.74, Example 2.5].

Remark 4. By the last two figures in p.5, www.its.caltech.edu/~teinav/Lectures/Ph%201b/Lecture%2014%20-%2003-02-2017.pdf, we see that this is impossible because E is a conservative vector field [Wangsness [121, p.94, −6]].

Remark 5. Wangsness [121, p.112, l.14, (8-9)] is incorrect, and so is the argument given in Guo–Wang [48, p.216, l.12–l.14]. $(|x| < 1, |y| < 1)$ given in Wangsness [121, p.112, l.14, (8-9)] should have been corrected as $(|x| < 1, |y| < 1, |2x|+|y|^2 < 1)$ [Watson–Whittaker [122, p.302, l.6]]. Otherwise, let $y \nearrow 1$ and $x \searrow -1$. Then $t = -2x+y^2 \to 3$ does not satisfy Wangsness [121, p.111, (8-6)].

Remark 6. Both Wangsness [121, p.120, l.
−13–p.121, l.−5] and Choudhury [14, p.55, l.2–l.−7] describe the electric field lines of a point dipole. The latter writing style makes the reading easier. Although Sadiku [100, p.144, (4.83)] correctly explains why the lines of E are perpendicular to the equipotential surfaces [Wangsness [121, p.121, l.−6–l.−5; p.123, l.17–l.18]], O'Neill [86, p.148, Lemma 3.8] gives a better answer. This is because a geometric problem should be solved without using resources other than geometry.

Remark 7. The proof of (3) in https://en.wikipedia.org/wiki/Polarization_density provides the alternative proof of $\rho_b = -\nabla\cdot\mathbf{P}$ mentioned in Wangsness [121, p.144, l.−8]. This proof also allows us to prove Wangsness [121, p.143, (10-7); (10-8)] more rigorously. It is difficult to prove Wangsness [121, p.143, (10-7); (10-8)] directly from Wangsness [121, p.143, (10-6)] using Wangsness [121, p.69, (5-7); (5-8)] alone. However, if we use the above alternative proof to obtain Wangsness [121, p.143, (10-7)], then Wangsness [121, p.143, (10-8)] may follow from Rudin [99, p.31, Theorem 1.39(b)].

Remark 8. For dielectrics, the surface charge density is compatible with the volume charge density [Wangsness [121, p.145, l.4–l.5]].
Remark. The statement "we can take b to be approximately constant throughout this small volume" [Wangsness [121, p.134, l.7]] should be crossed out because it contradicts the statement "b increases in such a way in this process that $\lim_{h\to 0}(hb)$ remains finite" [Wangsness [121, p.134, l.12–l.13]].

Remark 9. In order to prove that D will not be changed [Wangsness [121, p.157, l.−6; p.167, l.10]], we should consider a rectangular box which surrounds the positive plate and whose bottom is parallel to the plate. Then we use the Gauss law: $\nabla\cdot\mathbf{D} = \rho_f$ [Wangsness [121, p.152, (10-41)]]. The formula given in Griffiths [46, p.182, l.−6] is derived by another method.

Remark 10. For the proof of Wangsness [121, p.164, (10-92)], see Jackson [60, p.166, l.
−2–p.167, l.17].

Remark 11. The discussion given in Wangsness [121, p.184, l.10–l.26] is used as the prerequisite for studying the example given in Wangsness [121, p.184, l.−12–p.185, l.13]. It is difficult to compare two types of sources (parallel charged lines vs. parallel cylindrical conductors) and then prove their equivalence using surface charge density. Actually, the prerequisite should have put emphasis on the solutions of Laplace equations rather than sources.
Let φ be a real harmonic function (i.e., $\nabla^2\phi = 0$) in a domain D. Then
$\phi(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\phi(z+re^{it})dt$ $(z \in D)$ [Rudin [99, p.259, l.−7]].
Corollary 1. $\max_{z\in\bar{D}}\phi(z) = \max_{z\in\partial D}\phi(z)$ and $\min_{z\in\bar{D}}\phi(z) = \min_{z\in\partial D}\phi(z)$.
Corollary 2. If $\phi_1 \equiv \phi_2$ on ∂D, then $\phi_1 \equiv \phi_2$ on D.
Let $U_1, U_2$ be two closed discs given in Wangsness [121, p.184, Figure 11-10], $D = \mathbb{C}\setminus(U_1\cup U_2)$, $\phi_1$ represent the potential of parallel charged lines, and $\phi_2$ represent the potential of parallel cylindrical conductors. Thus, we may use the sources of parallel charged lines given in Wangsness [121, p.78, Figure 5-8] to find the potential of parallel cylindrical conductors in D.
Remark. Wangsness [121, p.184, l.10–l.26] attempts to explain Corollary 2, but fails to hit the heart of the matter. Griffiths [46, p.127, Problem 3.11] and Jackson [60, p.88, Problem 2.8(b)] fail to ask the important question they should. Note that Jackson [60, p.39, (1.42); p.40, l.17–l.−13] are not helpful in proving Corollary 2.

Remark 12. Wangsness [121, pp.200–201, Exercise 11-24].
Proof. By substituting $\phi = \phi(\rho,\varphi) = U(\rho)V(\varphi)$ into
$0 = \nabla^2\phi = \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho\frac{\partial\phi}{\partial\rho}) + \frac{1}{\rho^2}\frac{\partial^2\phi}{\partial\varphi^2}$ [Wangsness [121, p.31, (1-89)]], we have
$\frac{\rho}{U(\rho)}[\frac{\partial U(\rho)}{\partial\rho}+\rho\frac{\partial^2U(\rho)}{\partial\rho^2}] = -\frac{1}{V(\varphi)}\frac{\partial^2V(\varphi)}{\partial\varphi^2} = m^2$.
$\frac{\partial^2V(\varphi)}{\partial\varphi^2}+m^2V(\varphi) = 0 \Rightarrow V(\varphi) = \cos m\varphi$ or $\sin m\varphi$.
$\frac{\rho}{U(\rho)}[\frac{\partial U(\rho)}{\partial\rho}+\rho\frac{\partial^2U(\rho)}{\partial\rho^2}] = m^2 \Rightarrow \rho^2\frac{\partial^2U(\rho)}{\partial\rho^2}+\rho\frac{\partial U(\rho)}{\partial\rho}-m^2U(\rho) = 0$
$\Rightarrow U(\rho) = \ln\rho$ if $m = 0$; $U(\rho) = \rho^m$ or $\rho^{-m}$ if $m \ne 0$ [by the Frobenius method].

Remark 13.
$dB = \frac{\mu_0I'a^2n\,dz_0}{2[a^2+(z_P-z_0)^2]^{3/2}}$ [Wangsness [121, p.231, (14-21)]].
Proof. $\mathbf{B}(z) = \frac{\mu_0I'a^2N}{2[a^2+z^2]^{3/2}}\hat{\mathbf{z}}$ [Wangsness [121, p.230, (14-18)]].
$\frac{dB}{dN} = \frac{\mu_0I'a^2}{2[a^2+z^2]^{3/2}}$, where $dN = n\,dz_0$ [Wangsness [121, p.231, l.8]] and $z = z_P-z_0$ can be viewed as constant [Wangsness [121, p.231, l.8–l.9]].
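
To see how the differential contribution of Remark 13 is used, one can integrate it over a solenoid of finite length and compare with the elementary antiderivative $u/\sqrt{a^2+u^2}$ of $a^2/(a^2+u^2)^{3/2}$. The sketch below is an illustration under stated assumptions: the solenoid is taken to occupy $-L/2 \le z_0 \le L/2$, the integral is approximated by the midpoint rule, and all function names and sample values are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # assumed SI value of mu_0

def dB_dz0(z_p, z0, a, n, current):
    """Integrand of Remark 13: axial field per unit length of winding at z0."""
    return MU0 * current * a ** 2 * n / (2 * (a ** 2 + (z_p - z0) ** 2) ** 1.5)

def axial_field_numeric(z_p, a, n, current, length, steps=20000):
    """Midpoint-rule integral of dB over z0 in [-length/2, length/2]."""
    h = length / steps
    total = 0.0
    for i in range(steps):
        z0 = -length / 2 + (i + 0.5) * h
        total += dB_dz0(z_p, z0, a, n, current)
    return total * h

def axial_field_closed(z_p, a, n, current, length):
    """Closed form obtained from the antiderivative u / sqrt(a^2 + u^2), u = z_p - z0."""
    u1 = z_p + length / 2
    u2 = z_p - length / 2
    return MU0 * n * current / 2 * (u1 / math.hypot(a, u1) - u2 / math.hypot(a, u2))

if __name__ == "__main__":
    args = dict(a=0.02, n=1000.0, current=1.5, length=0.5)   # sample values
    for z_p in (0.0, 0.1, 0.3):
        print(z_p, axial_field_numeric(z_p, **args), axial_field_closed(z_p, **args))
```

For a very long solenoid the closed form tends to $\mu_0 n I'$ at interior points, which is the familiar ideal-solenoid value.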

Remark 14. We can take r′ = const. [Wangsness [121, p.233, l.6–l.7]].

Explanation. Because in the original model $\mathbf{r}'$ refers to the fixed position of the current element $I'd\mathbf{s}'$ [Wangsness [121, p.218, Figure 13-1]], in this context $\mathbf{r}'$ refers to the instantaneous position of the charge $q'$ even though $q'$ moves with velocity $\mathbf{v}$.

Remark 15. (How to rigorously prove an intuitive statement) The illustration given in Wangsness [121, p.237, l.−10–l.−2] at best provides the idea of proof instead of a detailed proof. In order to highlight the key idea and provide a rigorous proof, we should simplify our model and make it typical so that we may easily generalize this special case to the general case. In other words, the following factors must be simplified: the shape of $C'$, the positions of P and $P+d\mathbf{s}$, and the solid angles. Considering symmetry and the simplification of solid angles, we let $C'$ be $|\mathbf{r}'| = a$ [where $\mathbf{r}' = (x,y,0)$], $P = (x,y,z)$ be on the positive z-axis, and $P' = P+(0,0,dz)$, where $dz > 0$. Let $T_0$ be the right circular cone with vertex P and base $C'$, $T_1$ be the right circular cone with vertex $P'$ and base $C'$, and $T_2$ be the right circular cone with vertex P and base $C'+(0,0,-dz)$. Then $T_1 \cong T_2$. Therefore, the solid angle $\Omega'$ subtended at the vertex of $T_1$ equals the solid angle subtended at the vertex of $T_2$.

Remark 16. $B_\varphi$ will depend only on ρ [Wangsness [121, p.247, l.1]].
Proof. By Wangsness [121, p.245, (15-26); Figure 15-11], $B_i$ is independent of d. Consequently, $B_\varphi$ is independent of z.

Remark 17. (Somewhat indirect calculations vs. direct calculations) [Wangsness [121, p.65, l.9–l.12;

p.248, l.2–l.4]]
I. (a). Somewhat indirect calculation of $\nabla\cdot\mathbf{E}$:
The geometric operation that transforms $\oint_S\mathbf{E}\cdot d\mathbf{a}$ to a solid-angle integral: Wangsness [121, p.58, (4-2) → (4-1)] by using Wangsness [121, p.58, (4-3) & (4-4)].
Calculus operations:
(i). Express $Q_{in}$ as a volume integral [Wangsness [121, p.60, l.15]].
(ii). Use the divergence theorem to transform $\oint_S\mathbf{E}\cdot d\mathbf{a}$ to a volume integral [Wangsness [121, p.60, the left-hand side of (4-9)]].
(b). Direct calculation of $\nabla\cdot\mathbf{E}$:
Express $\nabla\cdot\mathbf{E}$ in a form good for calculations: Wangsness [121, p.65, (4-22) → (4-23) → (4-24) → p.66, (4-24)].
The geometric operation that transforms the integral given in the right-hand side of Wangsness [121, p.66, (4-25)] to a solid-angle integral: Wangsness [121, p.66, (4-25) → (4-26)].
(c). Thus, from the viewpoint of argument structures, method I.(a) and method I.(b) are almost the same except the order of operations. However, since method I.(a) essentially considers the volume integral of $\nabla\cdot\mathbf{E}$, while method I.(b) directly considers $\nabla\cdot\mathbf{E}$, the latter method is more direct.
II. (a). Somewhat indirect calculation of $\nabla\times\mathbf{B}$:
The geometric operation that transforms $\oint_C\mathbf{B}\cdot d\mathbf{s}$ to a solid-angle integral: Wangsness [121, p.237, (15-2) → p.241, (15-10)] by using Wangsness [121, p.238, (15-3) & (15-4)].
Calculus operations:
(i). Express $I_{in}$ as a surface integral [Wangsness [121, p.204, (12-6)]].
(ii). Use Stokes' theorem to transform $\oint_C\mathbf{B}\cdot d\mathbf{s}$ to a surface integral [Wangsness [121, p.60, the left-hand side of (4-9)]].
(b). Direct calculation of $\nabla\times\mathbf{B}$:
Express $\nabla\times\mathbf{B}$ in a form good for calculations:
Wangsness [121, p.248, (15-30) → (15-32)] [by Wangsness [121, p.34, (1-119); p.36, (1-132)]].
Wangsness [121, p.248, (15-32) → (15-34)] [by Wangsness [121, p.35, (1-129); p.206, (12-15); p.248, l.−12]].
The geometric operation that transforms the integral given in the right-hand side of Wangsness [121, p.248, (15-34)] to a solid-angle integral: Wangsness [121, p.248, (15-34) → p.248, l.−2].
(c). Thus, from the viewpoint of argument structures, method II.(a) and method II.(b) are almost the same except the order of operations. However, since method II.(a) essentially considers the surface integral of $\nabla\times\mathbf{B}$, while method II.(b) directly considers $\nabla\times\mathbf{B}$, the latter method is more direct.

Remark 18. "$\nabla\cdot\mathbf{B} = 0 \Rightarrow (\exists\mathbf{A} : \mathbf{B} = \nabla\times\mathbf{A})$" is a general mathematical theorem rather than a conjecture [Wangsness [121, p.251, l.−14–l.−12]]. For its proof, see http://www.math.unl.edu/~mbrittenham2/classwk/208s04/inclass/divergence-frees_are_curls.pdf. The proof given in Wangsness [121, p.251, l.−8–p.252, l.3] is more suitable for physical applications.

Remark 19. The direction of arrows in Wangsness [121, p.255, Figure 16-2(a)] is compatible with Wangsness [121, p.253, (16-23)]; the compatibility between the direction of arrows in Wangsness [121, p.255, Figure 16-2(c)] and Wangsness [121, p.253, (16-23)] is clearer. $z = A_y(x,y,0) = Bx$ is a function of $(x,y)$. We can represent this function as a surface in $\mathbb{R}^3$. The intersection of $z = A_y(x,y,0) = Bx$ with $x = x_0$ is a straight line with height $Bx_0$ from the xy plane. If we project this straight line onto the xy plane, its height cannot be shown in the figure of projection. Therefore, we use the length of the directional segment to represent the height $z_0 = Bx_0$.

Remark 20. $\mathbf{E} = E_\varphi(\rho)\hat{\boldsymbol{\varphi}}$ [Wangsness [121, p.269, l.6]].

Proof. Because the ideal solenoid is infinite, there is no difference between $z_1$ and $z_2 \ne z_1$. Therefore, $\mathbf{E}$ should not depend on z.
$\mathbf{E}$ should not depend on φ because the z-axis is the axis of symmetry for the solenoid.
If we put the infinite ideal solenoid upside down, the picture should remain the same. If $\mathbf{E}(\rho)$ were originally to have a positive z-component, then this component would become a negative z-component after we do so. We would reach a contradiction. Therefore, $E_z(\rho) = 0$.
The direction of induced emf for a given closed path is either clockwise or counterclockwise by Faraday's law [Wangsness [121, p.264, (17-3)]]. Consequently, for the given circle C with radius $\rho_0$ [Wangsness [121, p.269, l.6–l.7]], the induced emf (or current) cannot have the component for $\hat{\boldsymbol{\rho}}$; so the induced electric field $\mathbf{E}(\rho) = E_\rho(\rho)\hat{\boldsymbol{\rho}} + E_\varphi(\rho)\hat{\boldsymbol{\varphi}} + E_z(\rho)\hat{\mathbf{z}}$ [Wangsness [121, p.29, (1-80)]] has only the component $E_\varphi(\rho)\hat{\boldsymbol{\varphi}}$.

Remark 21. The rotation given in Wangsness [121, p.277, l.8] is driven by the Lorentz force.

Remark 22. Remark 1

Remark 23. $L_{jj} > 0$ [Wangsness [121, p.280, l.−8–l.−7]].
Proof. The proof is based on Wangsness [121, p.281, Figure 17-16].
$I_j$ goes along $C_j$ counterclockwise, so $I_j > 0$ by convention.
Let $\mathbf{S}_j$ be the area enclosed by $C_j$. Then $\mathbf{S}_j$ points out of the page by convention.
By Wangsness [121, p.227, Figure 14-2], $\mathbf{B}$ also points out of the page.
By Wangsness [121, p.251, (16-6)], $\Phi_{jj} > 0$.
By Wangsness [121, p.280, (17-55)], $L_{jj} > 0$.

Remark 24. In the presence of matter:

| Polarization | Magnetization |
| Definition of polarization: Wangsness [121, p.141, (10-1)]. | Definition of magnetization: Wangsness [121, p.313, (20-1)]. |
| $\rho_b$: Wangsness [121, p.141, (10-1)]. | $\mathbf{J}_m$: Wangsness [121, p.315, (20-10)(i)]. |
| $\sigma_b$: Wangsness [121, p.141, (10-1)]. | $\mathbf{K}_m$: Wangsness [121, p.315, (20-10)(ii)]. |
| The formal derivations [Wangsness [121, p.142, l.13–p.143, l.9]] agree with the physical derivations [Griffiths [46, p.170, l.−8–p.171, l.14]]. | The formal derivations [Wangsness [121, p.313, l.−2–p.314, l.−4]] agree with the physical derivations [Furtok–Klein [38, p.63, l.−3–p.64, l.10]; Griffiths [46, p.266, l.−6–p.268, l.8]]. |
| The scalar potential: Wangsness [121, p.143, (10-9)]. | The vector potential: Wangsness [121, p.314, (20-9)]. |
| Scalar potential inside a dielectric: Although Wangsness [121, p.143, (10-9)] is valid at a field point in a vacuum region outside a dielectric, it can be extended to the region inside the dielectric by using the concept of spatial average [Choudhury [14, p.297, (7.10)]] and microscopic Maxwell's equations [because a dielectric can be viewed as a collection of bound charges after all]. Proof. "The average of microscopic charge densities = The macroscopic charge density" [Wangsness [121, p.146, (10-17)]] follows from Choudhury [14, p.303, (7.27)]. | Vector potential inside a matter: Although Wangsness [121, p.314, (20-9)] is valid at a field point in a vacuum region outside a matter, it can be extended to the region inside the matter by using the concept of spatial average [Choudhury [14, p.297, (7.10)]] and microscopic Maxwell's equations [because a macroscopic amount of matter at rest can be viewed as a collection of current whirls]. Proof. "The average of microscopic current densities = The macroscopic current density" follows from Choudhury [14, pp.307–308, (7.40)]. |
| Cavity definitions: E: Wangsness [121, p.148, l.7–l.23], D: Wangsness [121, p.152, l.−13–l.−1]. | Cavity definitions: B: Wangsness [121, p.317, l.−3–p.318, l.9], H: Wangsness [121, p.322, l.−7–p.323, l.7]. |
| Uniformly polarized sphere | Uniformly magnetized sphere |
| $\sigma(\theta') = P\cos\theta'$ [Wangsness [121, p.148, (10-27)]]. | $\mathbf{K}_m = M\sin\theta'\hat{\boldsymbol{\varphi}}'$ [Wangsness [121, p.319, (20-17)]]. |
| φ, E on the positive z-axis [Wangsness [121, p.148, l.−6–l.−5]] based on the potential of electric dipole moment [Wangsness [121, (8-21), (10-9)]] and Coulomb's law [Wangsness [121, (3-2), (5-1), (5-7), (5-8)]]: | B on the positive z-axis [Wangsness [121, p.319, l.14–l.15]] based on the potential of magnetic dipole moment [Wangsness [121, (20-3), (20-15)]] and the Biot–Savart law [Wangsness [121, (14-6), (14-7), (14-11)]]: |
| $\phi_o(z) = \frac{Pa^3}{3\varepsilon_0z^2}$ [Wangsness [121, p.150, (10-31)]]. | $B_{zo} = \frac{2\mu_0Ma^3}{3z^3}$ [Wangsness [121, p.320, (20-22)]]. |
| $\phi_i(z) = \frac{Pz}{3\varepsilon_0}$ [Wangsness [121, p.150, (10-31)]]. | $B_{zi} = \frac{2}{3}\mu_0M$ [Wangsness [121, p.320, (20-25)]]. |
| $E_{zo}(z) = \frac{2p}{4\pi\varepsilon_0z^3}$ [Wangsness [121, p.150, (10-35)]]. | |
| $E_{zi}(z) = -\frac{P}{3\varepsilon_0}$ [Wangsness [121, p.150, (10-3)]]. | |

Remark 25.

| Linear isotropic homogeneous dielectrics | Linear isotropic homogeneous magnetic materials |
| Parallel plate capacitor with $Q_f$ fixed. Capacitance: Wangsness [121, p.158, (10-64)]. Fields: Wangsness [121, p.157, (10-62); p.158, (10-67); p.159, (10-70)]. | Infinitely long ideal solenoid with $K_f$ unchanged [Wangsness [121, p.323, l.−7]]. Self-inductance: Wangsness [121, p.330, (20-65)]. Fields: Wangsness [121, p.328, (20-52); p.329, (20-64); p.322, (20-28)]. |
| Capacitance in general | Self-inductance in general |
| Given a free charge distribution $\rho_f$ in vacuum. | Given a free current distribution $\mathbf{J}_f$ in vacuum. |
| $\nabla\cdot\mathbf{D}_0 = \rho_f$ [Wangsness [121, p.60, (4-10)]]. | $\nabla\times\mathbf{H}_0 = \mathbf{J}_f$ [Wangsness [121, p.322, (20-29)]]. |
| $\nabla\times\mathbf{D}_0 = 0$ [Wangsness [121, p.68, (5-4)]]. | $\nabla\cdot\mathbf{H}_0 = 0$ [Wangsness [121, p.329, (20-57)]]. |
| By the existence part of Helmholtz's theorem [Choudhury [14, p.584, l.2]], $\mathbf{D}_0$ exists. | By the existence part of Helmholtz's theorem [Choudhury [14, p.584, l.2]], $\mathbf{H}_0$ exists. |
| Keep $\rho_f$ unchanged and fill the space with a l.i.h. dielectric of relative permittivity $\kappa_e$. | Keep $\mathbf{J}_f$ unchanged and fill the space with a l.i.h. magnetic material of relative permeability $\kappa_m$. |
| (The dielectric is l.i.h.) $\Rightarrow \nabla\times\mathbf{D} = \nabla\times(\varepsilon_0\mathbf{E}+\mathbf{P})$ [Wangsness [121, p.151, (10-40)]] $= \nabla\times(\varepsilon_0\mathbf{E}+\chi_e\varepsilon_0\mathbf{E})$ [Wangsness [121, p.155, (10-50)]] $= \varepsilon_0\kappa_e\nabla\times\mathbf{E} = 0$. | (The magnetic material is l.i.h.) $\Rightarrow \nabla\cdot\mathbf{H} = -\nabla\cdot\mathbf{M}$ [Wangsness [121, p.323, (20-34)]] $= -\chi_m\nabla\cdot\mathbf{H}$ [Wangsness [121, p.328, (20-52)]] $\Rightarrow \nabla\cdot\mathbf{H} = 0$. |
| $\nabla\cdot\mathbf{D} = \rho_f$ [Wangsness [121, p.152, (10-41)]]. | $\nabla\times\mathbf{H} = \mathbf{J}_f$ [Wangsness [121, p.322, (20-29)]]. |
| By the uniqueness part of Helmholtz's theorem [Choudhury [14, p.584, l.1]], $\mathbf{D} = \mathbf{D}_0$. | By the uniqueness part of Helmholtz's theorem [Choudhury [14, p.584, l.1]], $\mathbf{H} = \mathbf{H}_0$. |
| $\Delta\phi = \int_+^-\mathbf{E}\cdot d\mathbf{s} = \int_+^-\frac{\mathbf{D}}{\varepsilon}\cdot d\mathbf{s} = \int_+^-\frac{\mathbf{E}_0}{\kappa_e}\cdot d\mathbf{s} = \frac{\Delta\phi_0}{\kappa_e}$. | $\mathbf{B} = \mu\mathbf{H} = \kappa_m\mu_0\mathbf{H}_0 = \kappa_m\mathbf{B}_0$. $\Phi = \int\mathbf{B}\cdot d\mathbf{a} = \kappa_m\int\mathbf{B}_0\cdot d\mathbf{a} = \kappa_m\Phi_0$. |
| $C = \frac{Q_f}{\Delta\phi} = \frac{\kappa_eQ_f}{\Delta\phi_0} = \kappa_eC_0$. | $L = \Phi/I = \kappa_m\Phi_0/I = \kappa_mL_0$. |

Remark 1. If Wangsness [121, p.158, (10-64); p.330, (20-65)] are directly verified by experiments, then ($\nabla\cdot\mathbf{E} = \rho_b/\varepsilon_0$ [Choudhury [14, p.299, (7.17)(i); p.303, (7.27)]; Wangsness [121, p.146, (10-18)]]; $\nabla\times\mathbf{E} = 0$ [Choudhury [14, p.299, (7.17)(iv)]]) and ($\nabla\times\mathbf{H} = \mu_0\mathbf{J}_f$ [Choudhury [14, p.299, (7.17)(iii); p.306, (7.35)]]; $\nabla\cdot\mathbf{B} = 0$ [Choudhury [14, p.299, (7.17)(ii)]]) (which are derived from microscopic Maxwell equations) are indirectly verified by experiments. See Wangsness [121, p.147, l.4–p.148, l.6].
Remark 2. There is no free surface current density at ρ = a [Wangsness [121, p.331, l.4]].
Proof. Consider the volume element dτ with the base dA and the length ds [Wangsness [121, p.205, Figure 12-6 (a)]]. The total current in dτ is $I\,ds = J\,d\tau$.
Consider the side surface dS of dτ. The volume of dS is 0, so the current shared by dS is $0\cdot J$.
The free surface current density at ρ = a is $\frac{0\cdot J}{dS} = 0$.
Remark 3. In order to understand the example given in Wangsness [121, p.332, l.−16–p.333, l.23], only the calculations in proving Wangsness [121, pp.330–333, (20-67), (20-68), (20-69), (20-70), (20-71), (20-72)] are necessary; the rest of the calculations are unnecessary because the results derived by the latter calculations may follow more easily and directly from the theory of magnetostatics.

Remark 26. $\mathbf{J}_d$ is essential for the existence of electromagnetic waves [Wangsness [121, p.349, l.15–l.16]]. See Wangsness [121, p.375, (24-4) & (24-5)].

Remark 27. There is no longer any distinction between a partial and total time derivative in the second term [Wangsness [121, p.361, l.7–l.8]].

Proof. $\varepsilon_0(\mathbf{E}\times\mathbf{B})$ is a function of x, y, z, t.
$\int_{\text{all space}}\varepsilon_0(\mathbf{E}\times\mathbf{B})d\tau$ is a function of t only.

Remark 28. By Wangsness [121, p.382, (24-41)], α and β have the same sign. Therefore, we may assume that α and β are real and positive [Wangsness [121, p.382, l.−5]].

Remark 29. We should say that "$k\hat{\mathbf{z}}\times\mathbf{B} = -\frac{\omega}{v^2}\mathbf{E}$ is equivalent to (24-33)" rather than that "$k\hat{\mathbf{z}}\times\mathbf{B} = -\frac{\omega}{v^2}\mathbf{E}$ is consistent with (24-33)" [Wangsness [121, p.382, l.7]].

Remark 30. If we replace "P increasing" by "P decreasing" in these figures, we will also have to reverse the sense of rotation about the ellipse [Wangsness [121, p.398, l.5–l.7]].
Proof. In Wangsness [121, p.396, Figure 24-12], when P is increasing (i.e. moves along the positive axis), the front curve ∆ > 0 lags the curve ∆ = 0. When P is decreasing (i.e. moves along the negative axis), the front curve ∆ < 0 lags the curve ∆ = 0. Namely, $E_y$ will reach the maximum after $E_x$. Thus, we should reverse the sense of rotation in the first quadrant of Wangsness [121, p.397, Figure 24-14].
Remark. In Wangsness [121, p.396, Figure 24-12], the two points −∆, ∆ on the P-axis should have been replaced with −|∆|, |∆| respectively.

Remark 31. By Marion–Thornton [77, p.118, (3.52); p.119, (3.60), (3.61)], the displacement and velocity are no longer generally in phase with the applied oscillating force [Wangsness [121, p.399, l.−11–l.−10]].

Remark 32. $\omega_i = \omega_r = \omega_t$ [Wangsness [121, p.406, l.18]].
Proof. $a_1e^{-\omega_it} + a_2e^{-\omega_rt} = a_3e^{-\omega_tt}$ [by Wangsness [121, p.406, (25-3)]].
By Pontryagin [91, p.44, Theorem 4], $\omega_i = \omega_r = \omega_t$.

Remark 33. This is actually not such a surprise [Wangsness [121, p.437, l.−2]].
Proof. I. $\mathbf{H}_{trans} = (k_g/\omega\mu)\hat{\mathbf{z}}\times\mathbf{E}_{trans}$.
Proof. $(k_g/\omega\mu)\hat{\mathbf{z}}\times\mathbf{E}_{trans}$
$= (k_g/\omega\mu)\hat{\mathbf{z}}\times(E_x,E_y,0)$
$= (k_g/\omega\mu)(-E_y,E_x,0)$
$= (H_x,H_y,0)$ [Wangsness [121, p.437, (26-46), (26-48) & (26-49)]].

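II. $E_x = 0$ on y = 0 [Wangsness [121, p.437, (26-45)]].
$\mathbf{E}_{trans} = E_y\hat{\mathbf{y}}$ is normal to y = 0.
By I, $\mathbf{H}_{trans} = H_x\hat{\mathbf{x}}$ is tangential to y = 0.
Since $\mathbf{H}_{trans} = H_x\hat{\mathbf{x}} + H_y\hat{\mathbf{y}}$, $H_y = 0$ on y = 0.
Similarly, $H_y = 0$ on y = b.
III. $E_y = 0$ on x = 0 [Wangsness [121, p.437, (26-46)]].
$\mathbf{E}_{trans} = E_x\hat{\mathbf{x}}$ is normal to x = 0.
By I, $\mathbf{H}_{trans} = H_y\hat{\mathbf{y}}$ is tangential to x = 0.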

Since $\mathbf{H}_{trans} = H_x\hat{\mathbf{x}} + H_y\hat{\mathbf{y}}$, $H_x = 0$ on x = 0.
Similarly, $H_x = 0$ on x = a.
Remark. Wangsness [121, p.437, l.−2–p.438, l.3] provides the key idea of this proof without details. However, how to clarify the confusion is most important in this case.

Remark 34. Wangsness [121, p.440, Figure 26-4].
Explanation. Now that the figure is given, we should point out its main features. First of all, the tangent vector at a point on the H-field line is the field vector $\mathbf{H}$ at the point.
$H_z = 0$ divides the figure into three parts: part 1 is the left part $-\frac{\pi}{2} < k_gz-\omega t < \frac{\pi}{2}$, part 2 is the middle part $\frac{\pi}{2} < k_gz-\omega t < \frac{3\pi}{2}$, part 3 is the right part $\frac{3\pi}{2} < k_gz-\omega t < \frac{5\pi}{2}$. Here we consider only part 1.
(a). When $k_gz-\omega t = 0$, $H_x = 0$ [Wangsness [121, p.439, (26-57)]]. Consequently, the arrows on H-field lines point rightwards for $0 < x < \frac{a}{2}$ and point leftwards for $\frac{a}{2} < x < a$ [Wangsness [121, p.439, (26-58)]].
(b). $H_z \simeq 0$ near $x = \frac{a}{2}$. Therefore, vectors $\mathbf{H} \simeq H_x\hat{\mathbf{x}}$ point up when $0 < k_gz-\omega t < \frac{\pi}{2}$ and point down when $-\frac{\pi}{2} < k_gz-\omega t < 0$ [Wangsness [121, p.439, (26-57)]].
(c). Let $k_gz-\omega t = 0$. Then $H_x = 0$. Vectors $|H_z|$ increase as x increases from $\frac{a}{2}$ to a and vectors $|H_z|$ decrease as x increases from 0 to $\frac{a}{2}$ [Wangsness [121, p.439, (26-58)]]. Thus, the H-field lines near x = 0 or a are denser than those near $x = \frac{a}{2}$.
(d). Let $x = \frac{a}{2}$. Then $H_z = 0$ [Wangsness [121, p.439, (26-58)]]. Vectors $|\mathbf{H}| = |H_x|$ increase as $k_gz-\omega t$ increases from 0 to $\frac{\pi}{2}$ and vectors $|\mathbf{H}| = |H_x|$ increase as $k_gz-\omega t$ decreases from 0 to $-\frac{\pi}{2}$ [Wangsness [121, p.439, (26-57)]]. Thus, the H-field lines near $k_gz-\omega t = \pm\frac{\pi}{2}$ are denser than those near $k_gz-\omega t = 0$.
(e). According to (c) and (d), in the rectangle formed by x = 0, x = a and $k_gz-\omega t = \pm\frac{\pi}{2}$, the H-field lines near the edges are denser than those near the center. The figure is obviously incorrect for this point.

Remark 35. $C_1' = C_3' = C_5' = 0$ and $k_1C_2'+k_2C_4'+k_3C_6' = 0$ [Wangsness [121, p.446, l.10]].
Proof. Let $S = -(k_1C_2'+k_2C_4'+k_3C_6')\sin k_2y\sin k_3z + k_2C_3'\cos k_2y\sin k_3z + k_3C_5'\sin k_2y\cos k_3z$,
$T = k_1C_1'\sin k_2y\sin k_3z$.
Then Wangsness [121, p.446, (26-89)] can be written as
$S\sin k_1x + T\cos k_1x = 0$.
Let x vary. Then S = T = 0 [because $\sin k_1x$, $\cos k_1x$ are two linearly independent solutions of $\frac{d^2y}{dx^2}+k_1^2y = 0$].
$T = 0 \Rightarrow C_1' = 0$.
Similarly, $C_3' = C_5' = 0$. Consequently, $k_1C_2'+k_2C_4'+k_3C_6' = 0$.

Remark 36. By Marion–Thornton [77, p.119, l.10–l.16], the nonperiodic part of the solution that is eventually damped out is called a transient, while the periodic part that persists is called the steady state [Wangsness [121, p.453, l.−6–l.−5]].

Remark 37. $A_\mu$ is a 4-vector [Wangsness [121, p.519, l.−9]].
Proof. For tensors, all that coordinate changes involve is nothing but index manipulation [Hawkins [53, p.93, (6-1); p.94, (6-2)]].
$\Box^2A_\mu = -\mu_0J_\mu$ in $x_\mu$ coordinates [Wangsness [121, p.519, (29-130)]] (∗).
In $x'_\mu$ coordinates, we have
$\Box'^2A'_\mu = -\mu_0J'_\mu$
$= -\mu_0\sum_{\nu=1}^{4}\frac{\partial x'_\mu}{\partial x_\nu}J_\nu$ (since $J_\mu$ is a 4-vector; see Hawkins [53, p.93, (6-1)])
$= \sum_{\nu=1}^{4}\frac{\partial x'_\mu}{\partial x_\nu}(-\mu_0J_\nu)$
$= \sum_{\nu=1}^{4}\frac{\partial x'_\mu}{\partial x_\nu}\Box^2A_\nu$ (by (∗))
$= \Box^2\sum_{\nu=1}^{4}\frac{\partial x'_\mu}{\partial x_\nu}A_\nu$.
$\Box^2$ is an invariant [Wangsness [121, p.513, (29-94)]], which means that $\Box^2$ is not affected by coordinate changes. Thus, it does not participate in any index manipulation [Hawkins [53, p.95, (6-3)]], so we may eliminate this operator $\Box^2$. Then
$A'_\mu = \sum_{\nu=1}^{4}\frac{\partial x'_\mu}{\partial x_\nu}A_\nu$. By definition [Hawkins [53, p.93, (6-1)]], $A_\mu$ is a 4-vector.

Remark 38. “Bz is slow varying” [Wangsness [121, p.537, l.7]] should be corrected as “$\frac{\partial B_z}{\partial z}$ is slowly varying in $\rho$ at $\rho = 0$”.
$f(\rho)$ is slowly varying in $\rho$ at $\rho = 0$
$\Leftrightarrow \lim_{\rho\to 0}\frac{f(a\rho)}{f(\rho)} = 1$, where $a > 0$ [by definition]
$\Leftrightarrow$ as $\rho \to 0+$, $f(a\rho) \approx f(\rho)$.

Remark 39. $B_\rho(0) = 0$ [Wangsness [121, p.537, l.9]] means $B_\rho(z,0) = 0$. Namely, $\mathbf{B}(z,0)$ has a $\hat{z}$-component only.

Remark 40. $v_\rho \ll v_\varphi$ [Wangsness [121, p.537, l.13]].
Proof. $r_C = \frac{m_0v_\perp}{qB}$ [Wangsness [121, p.532, (A-12)]]. After one turn, $v_\perp$ remains the same and $B(z + |v_\parallel|T) \approx B(z)$ as $z \to +\infty$ [because $B$ is slowly varying in $z$ at $z = +\infty$].
Consequently, $v_\rho = \frac{r_C(z + |v_\parallel|T) - r_C(z)}{T} \approx 0$ as $z \to +\infty$.

Remark 41. Microscopic derivations for electromagnetic parameters [Wangsness [121, Appendix B]]:
I. The response of a medium to static fields:

The microscopic derivation of $\chi_e$

A. If $p_0 = 0$ [Wangsness [121, p.548, l.1]], then
$\langle p\rangle = e^2\left(\sum_i\frac{Z_i}{K_i} + \sum_j\frac{1}{K_j}\right)E_p = \alpha E_p$ [Wangsness [121, p.548, (B-9)]].
For a single atom, the polarizability is given by $\alpha = 4\pi\varepsilon_0a^3$ [Wangsness [121, p.548, (B-10)]].
The polarizing field $E_p$ acting on the molecule in question: $E_p = E + \frac{P}{3\varepsilon_0}$ [Wangsness [121, p.550, (B-18)]]
→ Wangsness [121, p.550, (B-19)]
→ $\chi_e = \frac{N\alpha/\varepsilon_0}{1 - N\alpha/3\varepsilon_0}$ [Wangsness [121, p.550, (B-20)]].
The Clausius–Mossotti relation: $\alpha = \frac{3\varepsilon_0(\kappa_e - 1)}{N(\kappa_e + 2)}$ [Wangsness [121, p.550, (B-21)]].
B. If $p_0 \neq 0$ [Wangsness [121, p.551, l.11–l.12]],
$\langle p_0\cos\theta\rangle = p_0L\left(\frac{p_0E_p}{kT}\right)$ [Wangsness [121, p.552, (B-28)]], where $L(y) = \coth y - \frac{1}{y} \simeq \frac{y}{3}$ ($y \ll 1$).
If $\frac{p_0E_p}{kT} \ll 1$ [Wangsness [121, p.552, l. −2]], then
$P = N\langle p_0\cos\theta\rangle = \frac{Np_0^2}{3kT}E_p$ [Wangsness [121, p.553, (B-31)]].
The corresponding polarizability is $\alpha = \frac{p_0^2}{3kT}$ [Wangsness [121, p.553, (B-32)]].
The Debye equation: $\frac{3\varepsilon_0(\kappa_e - 1)}{N(\kappa_e + 2)} = \alpha + \frac{p_0^2}{3kT}$ [Wangsness [121, p.553, (B-34)]].
C. Weiss's theory of ferroelectric material: see Wangsness [121, p.562, l.17–l. −13].

The microscopic derivation of $\chi_m$

A. The magnetizing induction $B_m$ acting on the molecule in question: $B_m = \mu_0(1 + \frac{1}{3}\chi_m)H$ [Wangsness [121, p.555, (B-38)]].
$\Delta\vec{\omega} = \frac{e}{2m_e}B_m$ [Wangsness [121, p.557, (B-44)]], where $\frac{e}{2m_e}B_m$ is called the Larmor frequency.
For an electron in a circular orbit of radius $a$: $m = -\frac{1}{2}ea^2\vec{\omega}$ [Wangsness [121, p.557, (B-45)]].
$\Delta m = -\frac{e^2a^2}{4m_e}B_m$ [Wangsness [121, p.557, (B-46)]]
→ Wangsness [121, p.558, (B-47)] (if $m_0 = 0$)
→ Wangsness [121, p.558, (B-48)]
→ $\chi_m = -\frac{\mu_0NZe^2}{6m_e}\langle r^2\rangle$ (diamagnetic) [Wangsness [121, p.558, (B-49)]].
B. If $\chi_m > 0$, then $m_0 \neq 0$ [Wangsness [121, p.558, l. −21–l. −20]].
Wangsness [121, p.558, (B-50)]
→ $m_{\mathrm{orb}} = -\frac{e}{2m_e}L$ [Wangsness [121, p.558, (B-51)]]
→ Wangsness [121, p.558, (B-52)]
→ Wangsness [121, p.559, (B-53)]
→ $m = -g\frac{e}{2m_e}J$ [Wangsness [121, p.559, (B-54)]].
$\langle m_0\cos\theta\rangle = m_0L\left(\frac{\mu_0m_0H_m}{kT}\right)$ [Wangsness [121, p.559, (B-55)]].
If $\frac{\mu_0m_0H_m}{kT} \ll 1$ [Wangsness [121, p.559, l. −21]], then
$M = N\langle m_0\cos\theta\rangle = \frac{N\mu_0m_0^2}{3kT}H_m$ [Wangsness [121, p.559, (B-56)]].
Curie’s law for paramagnetic material: $\chi_m = \frac{N\mu_0m_0^2}{3kT}$ [Wangsness [121, p.559, (B-57)]].
C. Weiss's theory of ferromagnetic material:
$M = Nm_0L(y)$ [Wangsness [121, p.560, (B-59)]].
By Wangsness [121, p.559, (B-58); p.560, (B-60)], we have
$M = \left(\frac{kT}{\lambda\mu_0m_0}\right)y - \frac{H}{\lambda}$ [Wangsness [121, p.560, (B-61)]].
The solution of the simultaneous equations given in Wangsness [121, p.560, (B-59); (B-61)] is shown in Wangsness [121, p.560, Figure B-7].
$H = 0$, $M_s \neq 0$ as shown in Wangsness [121, p.561, Figure B-8].
The Curie temperature $T_c$ is defined by $T_c = \frac{\lambda\mu_0Nm_0^2}{3k}$ [Wangsness [121, p.560, (B-62)]].
$M_s = 0$ for $T \geq T_c$ [Wangsness [121, p.560, (B-63)]].

Remark 1. For the Maxwell velocity distribution function given in Wangsness [121, p.552, l.7], see Reif [93, p.265, (7·10·3)].
Remark 2. For the derivation of $g$ in Wangsness [121, p.559, (B-54)], see the derivation of Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1058, (43)].

II. The response of a medium to time-varying fields [a plane wave]:
A. Wangsness [121, (12-36) → (12-37) → (12-38)] → $\sigma_0 = \frac{ne^2}{\xi}$ [Wangsness [121, p.213, (12-39)]].
Wangsness [121, (24-125) → (24-127) → (24-128) → (24-129)] → $\sigma = \frac{\sigma_0}{1 - i(\sigma_0m\omega/ne^2)}$ [Wangsness [121, p.399, (24-130)]].
Wangsness [121, (24-7) → (24-36) → (24-37)] → $k^2 = \omega^2\mu\left(\varepsilon + i\frac{\sigma}{\omega}\right) = \frac{\omega^2\kappa_m}{c^2}\left(\kappa_e + \frac{i\sigma}{\omega\varepsilon_0}\right)$ [Wangsness [121, p.563, (B-72)]].
By $N = \left(\frac{c}{\omega}\right)k = n + i\eta$ [Wangsness [121, p.563, (B-73)]],
$n + i\eta = \left[\kappa_m\left(\kappa_e + \frac{i\sigma}{\omega\varepsilon_0}\right)\right]^{1/2}$ [Wangsness [121, p.563, (B-74)]]
→ $n + i\eta = \sqrt{\kappa_e}$ (if $\kappa_m = 1$) [Wangsness [121, p.563, (B-75)]].
B. Wangsness [121, (B-76) → (B-77) → (B-78) → (B-79) → (B-80) → (B-81)] → $\langle p\rangle = \sum_k\frac{n_k(e^2/m_e)}{\omega_k^2 - \omega^2 - i\gamma_k\omega}E_p = \alpha E_p$ [Wangsness [121, p.565, (B-82)]].
By Wangsness [121, (B-21), (B-75)],
$\frac{(n + i\eta)^2 - 1}{(n + i\eta)^2 + 2} = \frac{N\alpha}{3\varepsilon_0}$ [Wangsness [121, p.565, (B-83)]]
⇒ the Lorenz–Lorentz law: $\frac{1}{N}\frac{n^2 - 1}{n^2 + 2} = \frac{\alpha(\omega)}{3\varepsilon_0}$ (if $\eta = 0$) [Wangsness [121, p.565, (B-84)]].
C. Suppose $n \simeq 1$ and $\eta \simeq 0$.
Wangsness [121, (B-83)(i) → (B-85) → ((B-86),(B-87)) → ((B-88),(B-80), Figure B-10)].
D. Suppose that the medium contains free electrons.
a. By Wangsness [121, (B-74), (B-75), (B-83)],
$(n + i\eta)^2 = \frac{1 + (2N\alpha/3\varepsilon_0)}{1 - (N\alpha/3\varepsilon_0)} + \frac{i\sigma}{\omega\varepsilon_0}$ [Wangsness [121, p.567, (B-90)]]
⇒ Wangsness [121, p.567, (B-93)] [by Wangsness [121, p.567, (B-91)]]
⇒ Wangsness [121, p.567, (B-94); (B-95)].
b. Suppose $\sigma_0/\omega\varepsilon_0 \gg 1$.
Note that free electrons affect only the terms containing $\sigma_0$ in Wangsness [121, p.567, (B-94); (B-95)].
Then $n^2 - \eta^2 = 1 + \delta$ ($\delta$ small) and $2n\eta \simeq \sigma_0/\omega\varepsilon_0 \gg 1$.
Thus, $n \simeq \eta \simeq (\sigma_0/2\omega\varepsilon_0)^{1/2}$.
$\beta \simeq \alpha = (\omega/c)n \simeq \left(\frac{1}{2}\mu_0\sigma_0\omega\right)^{1/2}$ [by Wangsness [121, p.563, (B-73)]].

11. Optics (Born–Wolf [8]; Kenyon [65]; Matveev [80])

(A). Born–Wolf [8]

Remark 1. Prove the compatibility between the surface charge density given in Born–Wolf [8, p.5, (17a)] and that given in Wangsness [121, p.137, (9-24)].
Proof. I. $\rho(\mathbf{x}) = q\,\delta(x)\delta(y)\delta(z)$ [Jackson [60, p.27, (1.6)]] means $\iiint\rho\,dx\,dy\,dz = q$ (integrate along an infinitesimal path on each coordinate axis in the neighborhood of the origin).
II. If $F(x,y,z) = 0$ reduces to $z = 0$, then Born–Wolf [8, p.5, (17a)] reduces to $\rho = \hat{\rho}\,\delta(z)$. This means $\hat{\rho} = \int\rho\,dz$ = the surface charge density on the xy-plane (use the same idea as in I; integrate along an infinitesimal path on the z-axis in the neighborhood of the origin). Consequently, $\hat{\rho} = \lim_{h\to 0}h\rho = \sigma$.

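Remark 2. Born–Wolf [8, p.10, l.6–l.8] should have been corrected as follows:
Since $dW = -dA$ (the decrease in electromagnetic energy supplies the work done by the fields) as we expect in Born–Wolf [8, p.8, l. −14–l. −13], this result justifies the definition of electromagnetic energy by means of (32).

Remark 3. $\delta t = \delta s\,\left|\operatorname{grad}\left(\frac{\partial g(\mathbf{r})}{\partial\omega}\right)_{\bar{\omega}}\right|$ [Born–Wolf [8, p.23, l.17]].
Proof. $\frac{\delta t}{\delta s} = \operatorname{grad}\left(\frac{\partial g(\mathbf{r})}{\partial\omega}\right)_{\bar{\omega}}\cdot\mathbf{q}$ [definition of directional derivative]
$= \operatorname{grad}\left(\frac{\partial g(\mathbf{r})}{\partial\omega}\right)_{\bar{\omega}}\cdot\frac{\operatorname{grad}\left(\frac{\partial g(\mathbf{r})}{\partial\omega}\right)_{\bar{\omega}}}{\left|\operatorname{grad}\left(\frac{\partial g(\mathbf{r})}{\partial\omega}\right)_{\bar{\omega}}\right|}$ [O’Neill [86, p.148, Lemma 3.8]]
$= \left|\operatorname{grad}\left(\frac{\partial g(\mathbf{r})}{\partial\omega}\right)_{\bar{\omega}}\right|$.

Remark 4. The ellipse is inscribed in a rectangle whose sides are parallel to the coordinate axes and whose lengths are $2a_1$ and $2a_2$ [Born–Wolf [8, p.26, l.10–l.11]].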

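Proof. By Born–Wolf [8, p.26, (15)],
$\frac{E_x}{a_1} = \frac{E_y}{a_2}\cos\delta \pm \sqrt{\frac{E_y^2}{a_2^2}\cos^2\delta - \left(\frac{E_y^2}{a_2^2} - \sin^2\delta\right)}$.
Since the quantity inside the square root is nonnegative, $\frac{E_y^2}{a_2^2} \leq 1$.

Remark 5. $s_1 = s_0\cos 2\chi\,\cos 2\psi$, $s_2 = s_0\cos 2\chi\,\sin 2\psi$ [Born–Wolf [8, p.32, (45a); (45b)]].
Proof. I. $\frac{\sin 2\psi}{\cos 2\psi} = \frac{s_2}{s_1}$ [Born–Wolf [8, p.28, l.2]].
Therefore, it suffices to prove Born–Wolf [8, p.32, (45a)].
II. $s_1^2 + s_1^2\tan^2 2\psi + s_0^2\sin^2 2\chi = s_0^2$ [Born–Wolf [8, p.32, (44), (45c), & (46)]].
$s_1^2(1 + \tan^2 2\psi) = s_0^2(1 - \sin^2 2\chi)$. Hence
$s_1 = (\pm\cos 2\psi)\,s_0\cos 2\chi$.
Since $s_0, \cos 2\chi \geq 0$, $s_1$ and $\pm\cos 2\psi$ have the same sign.
Let $s_1 \geq 0$. If we want to inscribe an ellipse inside the rectangle given in Born–Wolf [8, p.27, Fig. 1.6], then
$0 \leq \psi \leq \tan^{-1}\frac{a_2}{a_1} \leq \frac{\pi}{4}$ or $\pi - \tan^{-1}\frac{a_2}{a_1} \leq \psi \leq \pi$. Consequently, $\cos 2\psi \geq 0$. Therefore, we take the plus sign for the factor $\pm\cos 2\psi$.

Remark 6. $\mathbf{p}$ and $\mathbf{q}$ are a pair of conjugate semidiameters of the ellipse [Born–Wolf [8, p.36, l.12]].
Proof. Born–Wolf [8, p.35, (60)] can be written as
$\begin{pmatrix}\cos\varepsilon & 0 & \sin\varepsilon & 0\\ 0 & \cos\varepsilon & 0 & \sin\varepsilon\\ -\sin\varepsilon & 0 & \cos\varepsilon & 0\\ 0 & -\sin\varepsilon & 0 & \cos\varepsilon\end{pmatrix}\begin{pmatrix}p_x\\ p_y\\ q_x\\ q_y\end{pmatrix} = \begin{pmatrix}a\\ 0\\ 0\\ b\end{pmatrix}$. Then
$\begin{pmatrix}p_x\\ p_y\\ q_x\\ q_y\end{pmatrix} = \begin{pmatrix}\cos\varepsilon & 0 & -\sin\varepsilon & 0\\ 0 & \cos\varepsilon & 0 & -\sin\varepsilon\\ \sin\varepsilon & 0 & \cos\varepsilon & 0\\ 0 & \sin\varepsilon & 0 & \cos\varepsilon\end{pmatrix}\begin{pmatrix}a\\ 0\\ 0\\ b\end{pmatrix} = \begin{pmatrix}a\cos\varepsilon\\ -b\sin\varepsilon\\ a\sin\varepsilon\\ b\cos\varepsilon\end{pmatrix}$. Thus,
$\frac{p_x^2}{a^2} + \frac{p_y^2}{b^2} = 1$, $\frac{q_x^2}{a^2} + \frac{q_y^2}{b^2} = 1$, and $\frac{p_xq_x}{a^2} + \frac{p_yq_y}{b^2} = 0$. See Bell [2, p.115, (A′)&(B′)].

Remark 7. The polarization is right-handed if and only if $[\mathbf{a},\mathbf{b},\nabla\varepsilon] > 0$ [Born–Wolf [8, p.36, l.13–l.14]].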

Proof. I. The definition of right-handed polarization given in Born–Wolf [8, p.29, l.8–l.10] is based on the view from the observer: the end point of the electric vector appears to describe the ellipse in the clockwise sense.
II. After standardization, $\mathbf{a} = (a,0,0)$, $\mathbf{b} = (0,b,0)$ and $\varepsilon = (0,0,k)\cdot(x,y,z) = kz$. Thus,
$[\mathbf{a},\mathbf{b},\nabla\varepsilon] = [(a,0,0)\times(0,b,0)]\cdot(0,0,k) = abk$, where $a,b > 0$.
III. From the view toward the observer, if $k > 0$, the wave will propagate along the positive z-axis.
IV. Both views describe the same thing, but they choose different coordinate systems. For the positive z-axis, their directions are opposite.

Remark 8. The explanation “from (62)
$\sin 2\varepsilon = \frac{2\mathbf{p}\cdot\mathbf{q}}{\sqrt{(p^2 - q^2)^2 + 4(\mathbf{p}\cdot\mathbf{q})^2}}$, $\cos 2\varepsilon = \frac{p^2 - q^2}{\sqrt{(p^2 - q^2)^2 + 4(\mathbf{p}\cdot\mathbf{q})^2}}$”
given in Born–Wolf [8, p.36, l. −15–l. −14] is incorrect. However, from (62), we have the following two cases:
Case I. $\sin 2\varepsilon = \frac{2\mathbf{p}\cdot\mathbf{q}}{\sqrt{(p^2 - q^2)^2 + 4(\mathbf{p}\cdot\mathbf{q})^2}}$, $\cos 2\varepsilon = \frac{p^2 - q^2}{\sqrt{(p^2 - q^2)^2 + 4(\mathbf{p}\cdot\mathbf{q})^2}}$.
This case leads to $a^2 \geq b^2$, which satisfies the condition $|\mathbf{a}| \geq |\mathbf{b}|$ given in Born–Wolf [8, p.35, l. −5].
Case II. $\sin 2\varepsilon = \frac{-2\mathbf{p}\cdot\mathbf{q}}{\sqrt{(p^2 - q^2)^2 + 4(\mathbf{p}\cdot\mathbf{q})^2}}$, $\cos 2\varepsilon = \frac{q^2 - p^2}{\sqrt{(p^2 - q^2)^2 + 4(\mathbf{p}\cdot\mathbf{q})^2}}$.
When $(p^2 - q^2)^2 + 4(\mathbf{p}\cdot\mathbf{q})^2 > 0$, $b^2 > a^2$, which does not satisfy the condition $|\mathbf{a}| \geq |\mathbf{b}|$. Therefore, we discard this case.
When $(p^2 - q^2)^2 + 4(\mathbf{p}\cdot\mathbf{q})^2 = 0$, this case can be incorporated into Case I.

Remark 9. For the proof of Born–Wolf [8, p.38, l. −8, (1)], see Hecht [54, p.98, Figure 4.16; p.100, Figure 4.19]. $\frac{\mathbf{r}\cdot\mathbf{s}^{(r)}}{v_1}$ is the time required for the reflected wavefront to pass through AD; $\frac{\mathbf{r}\cdot\mathbf{s}^{(t)}}{v_2}$ is the time required for the refracted wavefront to pass through AD.

Remark 10. Both $\mathbf{s}^{(t)}$ and $\mathbf{s}^{(r)}$ lie in the plane of incidence [Born–Wolf [8, p.39, l.5]].
Proof. Let $\mathbf{n} = (0,0,1)$. Then $(\mathbf{s}^{(i)}\times\mathbf{s}^{(r)})\cdot\mathbf{n} = 0$, so $\mathbf{s}^{(i)}$, $\mathbf{s}^{(r)}$, and $\mathbf{n}$ lie in the same plane.
$(\mathbf{s}^{(i)}\times\mathbf{s}^{(t)})\cdot\mathbf{n} = 0$, so $\mathbf{s}^{(i)}$, $\mathbf{s}^{(t)}$, and $\mathbf{n}$ lie in the same plane.

Remark 11. A linear, homogeneous, and isotropic medium with zero conductivity is perfectly transparent [Born–Wolf [8, p.40, l.13–l.15]].
Proof. If $\sigma = 0$, then $\delta = \frac{1}{\beta} = +\infty$ [Wangsness [121, p.384, (24-54)]]. Thus, light can pass through the medium. Therefore, the medium is transparent.
Remark. For other related material, see Wangsness [121, p.383, (24-43); p.384, l. −23–l. −19; p.388, l.8–l. −12].

Remark 12. Their magnetic permeabilities will then in fact differ from unity by negligible amounts [Born–Wolf [8, p.40, l.15–l.16]].
Proof. In Gaussian units, $\mu = 1 + 4\pi\chi_m$ [§Dielectric and magnetic materials in https://en.wikipedia.org/wiki/Gaussian_units], where $|\chi_m| \ll 1$ [diamagnetic: Wangsness [121, p.558, (B-49); l.16]; paramagnetic: Wangsness [121, p.559, (B-57); l. −13]].

Remark 13. The azimuth of vibration $\alpha$ is defined by the angle between the plane of vibration and the plane of incidence [Born–Wolf [8, p.47, l. −9–l. −8]]. For incident waves, $\alpha_i$ is the angle between $\mathbf{E}^{(i)}$ and $\mathbf{E}^{(i)}_\parallel$ [Born–Wolf [8, p.40, l.11; p.44, l.1]]. However, Born–Wolf [8] fails to identify the two $\alpha$’s. The plane of vibration is generated by $\mathbf{E}^{(i)}$ and $\mathbf{S}^{(i)}$, while the plane of incidence is the xz-plane [Born–Wolf [8, p.47, Fig. 1-13]]. $\hat{y}$ is the direction of the normal of the plane of incidence, while $\mathbf{H}^{(i)}$ is parallel to the normal of the plane of vibration. Thus, $\alpha$ is the angle between $\mathbf{H}^{(i)}$ and $\hat{y}$. $\mathbf{H}^{(i)}$ and $\hat{y}$ are on the plane of vibration. $\alpha = \alpha_i$ because the two angles are on the plane of vibration and the latter is just the former after being rotated by $\pi/2$.

Remark 14. “In (48) the equality sign holds only for normal or tangential incidence ($\theta_i = \theta_t = 0$ or $\theta_i = \pi/2$)” [Born–Wolf [8, p.48, l.4–l.5]] should have been corrected as “In (48) the equality sign holds only for the following cases: $\theta_i = \theta_t = 0$, $\theta_i = \pi/2$ or $\theta_t = \pi/2$”.

Remark 15. Wangsness [121, p.391, (24-100)] also takes care of the statement within the parentheses of Born–Wolf [8, p.51, l.13–l.14]. Even so, $\langle\mathbf{S}\rangle$ can still be calculated directly by using the complex form without the necessity of first finding the real parts [Wangsness [121, p.392, (24-104)]].
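The last point of Remark 15 rests on a standard identity for two quantities with the same harmonic time dependence: the time average of $\mathrm{Re}(Ae^{-i\omega t})\,\mathrm{Re}(Be^{-i\omega t})$ over one period equals $\frac{1}{2}\mathrm{Re}(AB^*)$. The following is a minimal numerical sketch of that identity only. The function names and the sample amplitudes are illustrative assumptions; they are not taken from Wangsness [121] or Born–Wolf [8].

```python
import cmath
import math


def time_average_real_product(a, b, omega=1.0, samples=10000):
    """Average Re(a e^{-i w t}) * Re(b e^{-i w t}) over one period by direct sampling."""
    period = 2.0 * math.pi / omega
    total = 0.0
    for j in range(samples):
        t = period * j / samples
        phase = cmath.exp(-1j * omega * t)
        # Take the real parts first, then multiply (the "long way").
        total += (a * phase).real * (b * phase).real
    return total / samples


def time_average_complex_form(a, b):
    """Same average read off the complex amplitudes: (1/2) Re(a b*)."""
    return 0.5 * (a * b.conjugate()).real


# Hypothetical complex amplitudes with different phases.
a = 2.0 * cmath.exp(0.3j)
b = 1.5 * cmath.exp(-1.1j)

direct = time_average_real_product(a, b)
shortcut = time_average_complex_form(a, b)
print(direct, shortcut)
```

The two numbers agree up to rounding, because the product of the real parts splits into a constant term $\frac{1}{2}\mathrm{Re}(AB^*)$ and a term oscillating at $2\omega$ that averages to zero over a full period.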

Remark 16. If we replace U(z),V(z),W(z) in Born–Wolf [8, p.57, (17)–(23)] with U1(z),V1(z),W1(z) and call the resulting formulas (17)′–(23)′, then we can see more easily how we obtain (17)′–(23)′ from Born–Wolf [8, pp.56–57, (10)–(16)]. Proof. There are two methods of changing from Born–Wolf [8, p.1, (1)] to Born–Wolf [8, p.1, (2)], and vice versa. Method 1: Replace E with H and ε with µ. − − Method 2: Replace E with H and ε with µ. − If we use Method 1 and let U1(z)= U(z),V1(z)= V(z),W1(z)= W(z), we will obtain − (17)′–(23)′ from Born–Wolf [8, pp.56–57, (10)–(16)]. If we use Method 2 and let U1(z)= U(z),V1(z)= V(z),W1(z)= W(z), we will obtain − − (17)′–(23)′ from Born–Wolf [8, pp.56–57, (10)–(16)]. Remark. We still use U(z),V(z),W(z) instead of U1(z),V1(z),W1(z) in Born–Wolf [8, p.57, (17)–(23)] because U1(z) (resp. V1(z)) still satisfies Born–Wolf [8, p.57, (22)(resp. (23))] and the actual values of U1(z),V1(z) are determined by initial conditions only. Remark 17. φ(z)= k0nzcosθ,α = nsinθ [Born–Wolf [8, p.58, (24)]]. Proof. For a homogeneous plane wave [Born–Wolf [8, p.639, l.9]], its cophasal surface is constant = g(r) [Born–Wolf [8, p.18, (24)]] r s = ω v· [Born–Wolf [8, p.18, (23)]] = k(y,z) (sinθ,cosθ) (because θ is the angle between s and zˆ [Born–Wolf [8, p.57, l. 4– l. 3]]) · − − = k0n(ysinθ + zcosθ) [Born–Wolf [8, p.17, (21)]]. The result follows from Born–Wolf [8, p.57, l. 7]. − U ik DU µ dz Remark 18. 2 = 0 1 U2 [Born–Wolf [8, p.59, l. 2]]. 1 − R U U Proof. I. W = 1 1′ [Birkhoff–Rota [5, p.35, (9)]] U U 2 2′ U V = ik µ 1 1 [Born–Wolf [8, p.59, (25)]] 0 U V 2 2

167 = ik0µD [Born–Wolf [8, p.59, (26)]]. W(z) II. U2 = U1 2 dz [Birkhoff–Rota [5, p.36, (13)]] U1 µ(z) = ik0DU1 R 2 dz [by I]. U1 Remark. DR = constant also follows from Birkhoff–Rota [5, p.35, (11)] [Born–Wolf [8, p.59, l. 4]]. − Remark 19. For a nonabsorbing medium, ε and µ are real [Born–Wolf [8, p.63, l.9]]. Proof. I. B = µH µ = µ eiΩ [Wangsness [121, p.385, (24-61)]]. ⇒ | | II. σ = 0 Ω = 0 [Wangsness [121, p.382, (24-37); p.383, (24-47)]] ⇒ µ is real [by I]. ⇒ III. σ = 0 N is real [Wangsness [121, p.563, (B-73); (B-74)]] ⇒ √εµ = N is real [Born–Wolf [8, p.14, (14)]]. ⇒ d2R Remark 20. When H = mλ0/4cosθ2(m = 1,2, ), ( dH2 ) ≶ 0 according as m 2 2 2 2 ··· ( 1) r12r23(1 + r12r23 r12 r23) ≶ 0 [Born–Wolf [8, p.67, (70)]]. − dR 16π −2 −2 2 2 2 2 Proof. = (1 + r r23 r r )r r cosθ2FR, where dH λ0 12 12 23 12 23 −sin2β − − FR = 2 2 2 . (1+r12r23+2r12r23 cos2β) dFR d(2β) 2 2 2 dH = dH (1 + r12r23 + 2r12r23 cos2β)− DR, where 2 2 DR = 4r12r23 +(cos2β)(1 + r12r23 + 2r12r23 cos2β). (1 + r r )2 if m is odd D 12 23 . R = 2 ( (1 r12r23) if m is even − − Remark 21. By Karamcheti [64, p.94, (5.55) & (5.56)], the φ in Born–Wolf [8, p.76, (7)] exists; by Karamcheti [64, p.94, (5.57) & (5.58)], the A in Born–Wolf [8, p.76, (5)] exists; this is because it only requires solving partial differential equations. For any point r′(x′,y′,z′) outside the ball, it is permissible to differentiate φ2 under the integral sign [Born–Wolf [8, p.78, l. 11–l. 10]]. This is because any point outside the ball satisfies the hypothesis of − − Rudin [99, p.27, Theorem 1.34]. By Born–Wolf [8, p.79, l.11–l.13], φ1 can hardly make any contribution to the first term in the bracket on Griffiths [46, p.424, l. 4]. Therefore, this term − can be viewed as being contributed by φ2 alone. By the definition of δ-function, the second term in the bracket on Griffiths [46, p.424, l. 4] can only be contributed by φ1. Therefore, − according to Born’s analysis, the equality given in Griffiths [46, p.424, l. 
4] kills three birds − with one stone: it simultaneously proves that φ,φ1,φ2 all satisfy the inhomogeneous wave equations. Remark 22. The Lorentz condition can be obtained from the continuity equation [Sadiku [100, p.388, l. 4]; Born–Wolf [8, p.77, l.13–l.15]] by using Wangsness [121, p.518, l.9–p.519, l. 3]. Born–Wolf− [8, p.79, l. 19–l. 18] provides a simpler proof. − −1 − Remark 23. curl′[M]= curl′M]+ R [M˙ ] [Born–Wolf [8, p.81, (17b)]]. cR × xˆ yˆ zˆ xˆ yˆ zˆ ∂ ∂ ∂ ∂ ∂ ∂ Proof. ∂x ∂y ∂z = ∂x ∂y ∂z t =t R/c ′ ′ ′ ′ ′ ′ | ′ − M (r ,t R/c) M (r ,t R/c) M (r ,t R/c) M (r ,t ) M (r ,t ) M (r ,t ) x ′ y ′ z ′ x ′ ′ y ′ ′ z ′ ′ − − −

168 xˆ yˆ zˆ + ∂ ∂ ∂ . ∂x′ ∂y′ ∂z′ Mx(r ,t R/c) My(r ,t R/c) Mz(r ,t R/c) ′ − ′ − ′ − In the first determinant, the partial derivatives apply to both r and t R/c; in the second ′ − determinant, the partial derivatives apply to r′ only (t ′ is fixed); in the third determinant, the partial derivatives apply to t R/c only (r is fixed). − ′ Remark 24. EI = 0 [Wangsness [121, p.550, l. 10]], so the molecules inside the sphere do not produce − any resulting field at the central molecule [Born–Wolf [8, p.90, l. 17–l. 15]]. 2 − − Remark 25. At the centre of the sphere we have ∂ φ0 = 0 [Born–Wolf [8, p.91, l. 14–l. 13]]. ∂x∂y − − ∂φ0 φ0(x,y,0) φ0(x,0,0) Proof. (x,0,0)= limy 0 − ∂y → y φ0( x,y,0) φ0( x,0,0) ∂φ0 = limy 0 − −y − = ∂y ( x,0,0). 2 → − ∂ φ0 (∂φ0/∂y)(x,0,0) (∂φ0/∂y)(0,0,0) ∂x∂y (0,0,0)= limx 0 −x → 2 (∂φ0/∂y)( x,0,0) (∂φ0/∂y)(0,0,0) ∂ φ0 = limx 0 − − = (0,0,0). − → x − ∂x∂y Remark 26. Born–Wolf [8, p.92, (13)]− follows from Wangsness [121, p.548, (B-9)]. Remark 27. ρ¯1 = ρ1 [Born–Wolf [8, p.102, l.7]]. Proof. n2 1 ρ¯1 [Born–Wolf [8, p.101, (48)]] = ν¯ 2 ν2 − 1 − ρ¯1 = 2 2 [Born–Wolf [8, p.102, (50)(i)]]. ν¯1 ν ρ1/3 2 − − 12πNα n 1 = 3 4πNα [Born–Wolf [8, p.100, (45)]] − ρ1 − = 2 2 [Born–Wolf [8, p.100, (46)]]. ν¯ ν ρ1/3 1 − − 2 Remark 28. ∇ Q + nk0Q = 0 [Born–Wolf [8, p.106, (10)]]. 2 Proof. ∇2P 1 P¨ = 0 ∇2P + ω P = 0. − v2 ⇒ v2 The result follows from Born–Wolf [8, p.17, (21)]. Remark 29. By Wangsness [121, p.143, (10-7)], div Q = 0 [Born–Wolf [8, p.106, (11)]]. Remark 30. By Pontryagin [91, p.44, Theorem 4], this is only possible if each group vanishes separately [Born–Wolf [8, p.108, l.13–l.14]]. 1 sin(θi+θt )cos(θi θt ) Remark 31. A = 2 sinθ sinθ − T [Born–Wolf [8, p.114, (55b)]]. k i t k (i) (i) Proof. yˆ A0 = A s [Wangsness [121, p.415, Figure 25-12]]. (i) ×(i) k (i) (i) T0 s (s T0)= s (T0 s ) [Wangsness [121, p.11, (1-30)]]. − (i) · (i) ×(i) × (i) yˆ [s (T0 s )] = s (yˆ (T0 s )) [Wangsness [121, p.11, (1-30)]]. 
× ×(i) × (i) · × yˆ (T0 s )= s (yˆ T0) · × · × = s(i) (T s(t)) [Wangsness [121, p.415, Figure 25-12]]. · k Remark 32. A scientific textbook should be written with the audience in mind: it should contains not only results, but also the method to obtain them. If the proof is long, we should divide it into several steps so that readers may check the work step by step. 1 1 K(e,S,n)+ L(e,S,n, µ)+ 2 M(e,ε, µ)= 0 [Born–Wolf [8, p.119, (16)]]. ik0 (ik0) 2 ik S 2 2 2 2 Proof. ∇ E0 = e 0 ∇ e + 2ik0[∇S ∇]e + ik0e ∇ S +(ik0) (∇S) e . εµ ¨ 2 2 ik0S { · · } c2 E0 = k0n e e. − ik S ∇(ln µ) (∇ E0)= ∇(ln µ) (∇ e + ik0(∇S) e)e 0 [Born–Wolf [8, p.118, (9)]] × × × × × ik S = ∇(ln µ) (∇ e)+ ik0[(∇S)(∇(ln µ) e) e(∇(ln µ) (∇S))] e 0 [Wangsness [121, p.11,{ (1-30)]].× × · − · } ik S ik S ∇[E0 ∇(lnε)] = ∇[e ∇(lnε)] e 0 +(e ∇(lnε))(ik0e 0 ∇S). · { · } · 169 Remark 33. Because ∇(ln( µ)) = ∇(ln µ + iπ)= ∇(ln µ), Born–Wolf [8, p.120, (17)] can be obtained from Born–Wolf− [8, p.119, (16)] simply by using the fact that Maxwell’s equations remain unchanged when E and H and simultaneously ε and µ are interchanged [Born–Wolf [8, p.120, l.6–l.7]]. −

Remark 34. Uω (x,y,z) will satisfy the time-independent wave equation (2) [Born–Wolf [8, p.420, 12]]. − 2 1 ∂ 2V Proof. I. ∇ V = c2 ∂t2 [Born–Wolf [8, p.420, (9)]] 1 ∞ 2 iωt = 2−√ ∞ ω Uω (x,y,z)e− dω [Born–Wolf [8, p.420, (10)]]. c 2π − 2 1 ∞ 2 iω0t II. ∇ UωR0 (x,y,z)= √ ∞ ∇ V(x,y,z)e dt [Born–Wolf [8, p.420, (11)]] 2π − 1 ∞ ∞ 2 i(ω ω0)t = 2−πc2 ∞ ∞ ω Uω (x,Ry,z)e− − dtdω [by I] 2 − − ω0 = 2 RUω R[Cohen-Tannoudji–Diu–Laloe¨ [17, vol. 2, p.1473, (34); p.1469, (5)]]. − c 0 Remark 35. Born–Wolf [8, p.884, l. 12–p.885, l.7], Guo–Wang [48, p.34, l. 10–l. 1], and Watson − − − [123, p.236, l.11–l.19] all fail to state Watson’s lemma accurately and completely. The following gives the accurate and complete statement and proof of Watson’s lemma. (Watson’s lemma) Let h(µ) be a single-valued analytic function in arg µ < θ; when real dµ2 ∞ βs α | | µ +∞, h(µ)= O(e ), where d is real. ∑ csµ = µ h(µ)(ℜα < 1,β > 0) in a → s=0 neighborhood of µ = 0. Then ∞ kµ2 1 (α 1)/2 ∞ βs α+1 βs/2 π h(µ)e dµ k ∑ csΓ( − )k as k ∞ in argk δ(δ > 0). 0 − ∼ 2 − s=0 2 − → | |≤ 2 − Proof.R The proof is based on the proof of Guo–Wang [48, p.34, Watson’s lemma]. N 1 βs α βN ℜα dµ2 Fix N. M > 0: h(µ) ∑s=−0 csµ − < Mµ − e . ∞ kµ∃2 | N−1 ∞ kµ2 β| α 0 e− h(µ)dµ = ∑s=−0 cs 0 e− µ − dµ + RN N 1 c 1 k (βs α+1)/2Γ βs α 1 2 R . =R ∑s=−0 s 2 − − ((R + )/ )+ N ∞ (ℜk d)µ2 βN ℜα− RN M 0 e− − µ − dµ | |≤Γ((βs ℜα+1)/2) (βs ℜα+1)/2 = M R− (βs ℜα+1)/2 = O( k − − ) 2(ℜk d) − | | in k > d−cscδ. | | µg(z) Remark 36. limz z0 = g(z)/ 2 f ′′(z0) [Born–Wolf [8, p.887, l.1–l.4]]. → f ′(z) − 2 µ dµ 1 Proof. f (z)= f (z0) pµ =( 2 )− . − ⇒ f ′(z) − dz 1/2 dµ 1/2 µ =( f (z0) f (z)) 2 = ( f (z0) f (z)) f (z). − ⇒− dz − − − ′ f ′′(z0) 2 f (z0) f (z)= (z z0) . − − 2 − −··· dµ 1 √[ f ′′(z0)]/2(z z0) √[ f ′′(z0)]/2 limz z0 ( 2 )− = limz z0 − − − = − . → − dz → f ′(z) f ′(z0) f ′′(z0) Remark 37. Born–Wolf [8, p.889, l.12–l.15] follows− from Watson− [123, p.230, l. 12–l. 9]. 2 − − fxx fyy fxy Remark 38. K = 2 − 2 2 [Born–Wolf [8, p.891, l. 
4]] follows from O’Neill [86, p.212, Corollary (1+ fx + fy ) − 4.1]. Remark 39. In my opinion, it is improper to regard δ(x) as a quantity with certain symbolic meaning [Born–Wolf [8, p.892, l.10–l.11]]. We should treat δ(x) as a distribution [continous linear functional, Rudin [98, p.141, Definition 6.7]]. If one sees the word “distribution”, one may search for its meaning in Wikipedia. If one sees the words “symbolic meaning”, one will have no way to figure out what it means. Furthermore, Born–Wolf [8, p.894, (9)–(12)] are all identities for distributions instead of identities for functions. If one fails to understand this, one cannot prove them.

Remark 40. The xi given in Born–Wolf [8, p.894, l. 1] has to be a simple zero of f (x). See Merzbacher − [81, p.633, l. 9]. − 170 ∂ Σ Σ ∂F 1 Σ Σ Remark 41. ∂x σ FdV ′ = σ ∂x dV ′ + limδx 0 δx ( σ FdV ′ σ FdV ′) [Born–Wolf [8, p.898, (3)]]. → ′ − Proof.R Let χ(ΣR,σ)(r′) be the characteristicR functionR of the volume within Σ and outside σ. χ (r )F(T,r ) χ (r )F(P,r )= χ (r )[F(T,r ) F(P,r )]+χ (r )F(P,r ) (Σ,σ ′) ′ ′ − (Σ,σ) ′ ′ (Σ,σ ′) ′ ′ − ′ (Σ,σ ′) ′ ′ − χ(Σ,σ)(r′)F(P,r′). χ (r′)[F(T,r′) F(P,r′)] (Σ,σ′) − χ (r ) ∂F (P,r ) as δx 0. δx → (Σ,σ) ′ ∂x ′ → Remark 42. (Only through a language tool that is accurate enough may a delicate statement be described) The proof of 1 Σ Σ limδx 0 δx ( σ FdV ′ σ FdV ′)= σ FρxdS′ [Born–Wolf [8, p.899, (4)]] → ′ − − is confusing in both notation and language. The language tool that the authors use is neither R R R clear nor accurate enough to describe such a delicate result. The following proof attempts to clarify the details: Proof. I. Let B(P,a)= (x,y,z) (x,y,z) P < a . Then ∂B(P,a)= σ. { || − | } Let B(T,a)= (x,y,z) (x,y,z) T < a . Then ∂B(T,a)= σ . { || − | } ′ Let δS′ be the surface element on ∂[(B(T,a) B(P,a)) (B(P,a) B(T,a))] σ pointing \ ∪ \ ∩ away from the volume (B(T,a) B(P,a)) (B(P,a) B(T,a)). \ ∪ \ Let ρx be the x-component of the unit radial vector ~ρ pointing from P to δS′. Let A,B =(B¯(T,a) B¯(P,a)) (B¯(P,a) B¯(T,a)) (x,y,z) y = 0 [Assume P is the ori- gin],{ where} A is the upper\ point∩ and B is the\ lower point∩ { [see Born–Wolf| } [8, p.899, Fig. 9]]. Then ∠APT = ∠TPB = 90◦ (since δx is small). Thus, δS ρx is the signed projection area of δS′ onto the yz-plane. ′ × II. For the right shaded region B¯(T,a) B¯(P,a), ~ρ and δS′ [by convention, δS′ is the outward \ normal pointing away from the volume B¯(T,a) B¯(P,a)] are antiparallel, so ρx δS > 0. \ − × ′ Hence, dV = ρx δx δS (because dV > 0). ′ − × × ′ ′ III. For the left shaded region B¯(P,a) B¯(T,a), ~ρ and δS′ are parallel, so ρx δS > 0. 
\ × ′ Hence, dV ′ = ρx δx δS′, i.e., dV ′ = ρx δx δS′. Σ Σ× × − − × × IV. FdV ′ FdV ′ = FdV ′ FdV ′ [Born–Wolf [8, p.899, Fig. σ ′ − σ B(T,a) B(P,a) − B(P,a) B(T,a) 9]] \ \ R R R R δx σ FρxdS′ [by II and III]. →− 1 ∂F Remark 43. grad F R ∆H ∆D = 0 [Born–Wolf [8, p.902, (10)]]. × − c ∂t Proof. δ is a linear functional. That the authors fail to point out this important concept hidden behind this proof makes one doubt if they master distribution theory. ∇ H 1 D˙ = 0 [Born–Wolf [8, p.1, (1)]]. × − c ∇ H = U( F)∇ H(1) +U(F)∇ H(2) +δ(F)(∇F) (∆H) [Born–Wolf [8, p.902, (6)]]. × − (1×) (2) × × ∂D = U( F) ∂ε1E +U(F) ∂ε2E + δ(F) ∂F ∆D. Hence ∂t − ∂t ∂t ∂t δ(F)(∇F) (∆H)= δ(F) ∂F ∆D. Therefore, × ∂t (∇F) (∆H)= ∂F ∆D (since δ is a linear functional [Rudin [98, p.141, l. 7]]). × ∂t − (B). Kenyon [65] Remark 1. The figure in the upper left corner of http://www.amnh.org/education/resources/ rfl/web/essaybooks/cosmic/p_roemer.html is better than the one given in Kenyon [65, p.8, Fig. 1.2]. (C). Matveev [80]

Remark 1. Matveev [80, p.24, l. −6–p.25, l. −7] shows that
(Φ is a solution of Matveev [80, p.24, (2.13)]) ⇔ (Φ is of the form given in Matveev [80, p.25, (2.20)]).
In contrast, Wangsness [121, p.377, l.11–l. −1] shows only that
(ψ is of the form given in Wangsness [121, p.377, (24-11)]) ⇒ (ψ is a solution of Wangsness [121, p.376, (24-9)]).

Remark 2. A wave whose constant-phase surfaces are planes is called a plane wave [Matveev [80, p.28, l. −14–l. −13]]. This definition is based on Born–Wolf [8, p.18, (23), (24), and (28)] and is consistent with the general definition given in Matveev [80, p.25, (2.21)].

Remark 3. The harmonic functions given in Matveev [80, p.27, l. −15] are solutions of the equation of motion for the simple harmonic oscillator [Marion–Thornton [77, p.100, (3.5)]]. They should not be mistaken for the harmonic functions defined by Rudin [99, §11.3, (2)].

Remark 4. Matveev [80, p.30, (2.53) & (2.54)] follow from Wangsness [121, p.34, (1-118)]. Matveev [80, p.30, (2.55) & (2.56)] follow from Wangsness [121, p.34, (1-115)]. By Kreyszig [66, p.103, l. −13–p.104, l. −16], the set of quantities in (2.60b) constitutes a four-dimensional vector [Matveev [80, p.32, l. −7–l. −5]].

Remark 5. Matveev [80, p.33, l.7–l.9, (2.61)] follows from Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.11, (A-1)] and Landau–Lifshitz [68, p.28, (9.15)].

Remark 6. Landau–Lifshitz [68, p.76, (31.6)] provides the reason why we call $\mathbf{S}$ [Matveev [80, p.35, l.12]] the flux density of energy and $P$ [Matveev [80, p.36, l.6]] the rate of flow of energy.

Remark 7. For Matveev [80, p.37, l. −18, (3.11)], see Wangsness [121, p.361, (24-32)]. For Matveev [80, p.38, l.4, (3.12), light pressure for total absorption], see Wangsness [121, p.427, l.17]. For Matveev [80, p.38, l.5–l.7, light pressure for total reflection], see Wangsness [121, p.427, (25-99)].

Remark 8. Matveev [80, p.40, l. −7, (3.13)] follows from Wangsness [121, p.381, (21-75)], Matveev [80, p.31, (2.57)], and $(n_x,n_y,0)\cdot(E_x,E_y,0) = 0$.

Remark 9. $E_0 = E_0'(1 + \beta n_x')/\sqrt{1 - \beta^2}$ [Matveev [80, p.40, l. −2, (3.14a)]] follows from Matveev [80, p.40, l. −1, (3.14b); p.41, l.1, (3.14c)]. Matveev [80, p.40, l. −1, (3.14b); p.41, l.1, (3.14c)] follow from Wangsness [121, p.522, (29-138)].

Remark 10. Matveev [80, p.41, l. −19, (3.17)] follows from Matveev [80, p.35, (3.4)] and Landau–Lifshitz [68, p.76, (31.6)]. Matveev [80, p.41, l. −3, (3.20)] follows from Wangsness [121, p.507, (29-56)]. For the definition of $G$ given in Matveev [80, p.42, l.17, (3.26)], see Wangsness [121, p.361, (21-74)].

Remark 11. The amplitude of the resultant wave varies from $E_{01} + E_{02}$ to $|E_{01} - E_{02}|$ [Matveev [80, p.45, l.21–l.22]].

Proof. $E_{1x} = E_{01}e^{i(\omega_1t - k_1z)} = E_{01}e^{i\delta_1}$.
$E_{2x} = E_{02}e^{i(\omega_2t - k_2z)} = E_{02}e^{i\delta_2}$.
$E_{03}e^{i\delta_3} = E_{01}e^{i\delta_1} + E_{02}e^{i\delta_2}$.
$E_{03}^2 = (E_{01}e^{i\delta_1} + E_{02}e^{i\delta_2})(E_{01}e^{-i\delta_1} + E_{02}e^{-i\delta_2})$
$= E_{01}^2 + E_{02}^2 + 2E_{01}E_{02}\cos(\delta_1 - \delta_2)$, where $\delta_1 - \delta_2$ depends on $\omega_1$, $\omega_2$ and $t$.

Remark 12. The energy of a standing wave between successive nodes and antinodes remains constant with the passage of time [Matveev [80, p.48, l.4–l.6]].

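Proof. $\mathbf{S} = \frac{\mathbf{E}\times\mathbf{B}}{\mu_0}$
$= \frac{1}{\mu_0}\left[2E_0\cos\left(kz + \frac{\delta}{2}\right)\cos\left(\omega t + \frac{\delta}{2}\right)\hat{x}\right]\times\left[(2E_0/c)\sin\left(kz + \frac{\delta}{2}\right)\sin\left(\omega t + \frac{\delta}{2}\right)\hat{y}\right]$
$= \frac{E_0^2}{\mu_0c}\sin 2\left(kz + \frac{\delta}{2}\right)\sin 2\left(\omega t + \frac{\delta}{2}\right)\hat{z}$.
$\langle\mathbf{S}\rangle = \frac{E_0^2}{\mu_0c}\sin 2\left(kz + \frac{\delta}{2}\right)\left\langle\sin 2\left(\omega t + \frac{\delta}{2}\right)\right\rangle\hat{z} = 0$.
The result follows from Wangsness [121, p.393, (24-111)(i)].

Remark 13. Wangsness [121, p.408, (25-11)(i)] and Jackson [60, p.305, Figure 7.6] explain the following facts:
(i) Either the vector $\mathbf{E}$ of the incident wave or the vector $\mathbf{B}$ must reverse direction as a result of reflection [Matveev [80, p.48, (3.l.19–l.20)]].
(ii) The photographic action is due to the electric field $\mathbf{E}$ and not to the magnetic induction $\mathbf{B}$ [Matveev [80, p.48, l. −4–l. −2]].

Remark 14. The proof of $\Delta\upsilon\Delta t \simeq 1$ [Matveev [80, p.77, l.6, (8.27)]] is incorrect. It should be corrected as follows: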

Proof. According to Matveev [80, p.76, (8.24)], the first zero of $A(\omega)$ is $\omega_0 = 2\pi/\tau$. Since the spectral width is the interval $\Delta\upsilon$ of frequencies over which the spectral amplitude differs significantly from zero [Matveev [80, p.76, l.16–l.17]], we choose $\omega_1 = \omega_0/2 = \pi/\tau$ as the right frequency endpoint, at which $A(\omega_1)$ still differs significantly from zero. Taking the left frequency endpoint into account as well, we let $\Delta\omega = 2\omega_1 = 2\pi/\tau$. Then $\Delta\upsilon = 1/\tau$.

Remark 15. For the proof of Plancherel’s theorem [Matveev [80, p.79, (8.32)]], see Rudin [99, p.200, Theorem 9.13 (b)].

Remark 16. Before the Fourier transform, the radiating power $\langle P\rangle$ is defined as $-\frac{dW}{dt}$ [Matveev [80, p.79, (9.5)]]. After the Fourier transform, the emission intensity $w$ [Matveev [80, pp.81–82, (9.21), (9.25); p.83, l.1]] and the absorption intensity $\Phi(\omega)$ [Matveev [80, p.84, l. −6 & l. −1]] are defined as $\frac{dW}{d\omega}$. By Plancherel’s theorem [Matveev [80, p.79, (8.32)]], it is legitimate to do so.

Remark 17. By Lindgren [74, p.181, l.15 & l. −1], the probability that the time between two successive collisions lies in the interval between $\xi$ and $\xi + d\xi$ is given by $p(\xi)d\xi = (1/\tau_0)\exp(-\xi/\tau_0)d\xi$, where $\tau_0$ is the mean time between collisions [Matveev [80, p.87, l. −17–l. −13]].

Remark 18. $\gamma = \Delta$ [Matveev [80, p.90, l.11]].
Proof. $e^{-(\omega - \omega_0)^2/(2\sigma^2)} = 1/2$.
$(2\ln 2)\sigma^2 = (\omega - \omega_0)^2 = \left(\frac{\delta\omega}{2}\right)^2$.
$\delta\omega = 2\sqrt{2\ln 2}\,\sigma = \Delta$ [Matveev [80, p.90, (10.18)]].
The result follows from Matveev [80, p.84, (9.31)(ii)].
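The half-maximum computation in the proof of Remark 18 can be checked numerically. The sketch below locates the half-maximum point of the Gaussian profile $e^{-(\omega - \omega_0)^2/(2\sigma^2)}$ by bisection and compares the resulting full width with the closed form $2\sqrt{2\ln 2}\,\sigma$. The function names and the value of $\sigma$ are illustrative assumptions, not taken from Matveev [80].

```python
import math


def gaussian(omega, omega0, sigma):
    """Gaussian line profile normalized to 1 at the center omega0."""
    return math.exp(-(omega - omega0) ** 2 / (2.0 * sigma ** 2))


def fwhm_by_bisection(sigma, omega0=0.0, tol=1e-12):
    """Find the half-maximum point right of omega0 by bisection; return the full width."""
    lo, hi = omega0, omega0 + 10.0 * sigma  # profile is > 1/2 at lo and < 1/2 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gaussian(mid, omega0, sigma) > 0.5:
            lo = mid
        else:
            hi = mid
    half_width = 0.5 * (lo + hi) - omega0
    return 2.0 * half_width


sigma = 0.7  # hypothetical width parameter
numeric_width = fwhm_by_bisection(sigma)
closed_form = 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma
print(numeric_width, closed_form)
```

Bisection is used here only because the profile is monotonic on each side of the center; any root finder would serve equally well.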

Remark. The Gaussian distribution $F_G$ [Matveev [80, p.90, l.3]] is the normal distribution [Borovkov [9, p.33, (3)]], while the Lorentz distribution [Matveev [80, p.83, (9.29)]] is the Cauchy distribution [Borovkov [9, p.34, l.17]].

Remark 19. $\gamma = \gamma_1 + \gamma_2$ [Matveev [80, p.90, l. −3]].
Proof. Let $\omega_0 = 0$. By Borovkov [9, p.31, Theorem 1], the Lorentz distribution can be expressed as $f_L(\omega_1)$, where $L$ is a random variable. Namely,
$f_L(\omega_1)d\omega_1 = \frac{1}{\pi(1 + \omega_1^2)}d\omega_1$ [Borovkov [9, p.34, l. −16]]
$= \frac{1}{\pi(1 + (\omega/\sigma)^2)}d(\omega/\sigma)$ [let $\omega = \sigma\omega_1$]
$= f_{\sigma L}(\omega)d\omega$.
$f_{\sigma L} = F_L$ [Matveev [80, p.83, (9.29)]], where $\sigma = \gamma/2$.
$\varphi_{\sigma L}(t) = \varphi_L(\sigma t)$ [Borovkov [9, p.126, l.1]]
$= e^{-\sigma|t|}$ [Borovkov [9, p.128, l.19]].
Let $L_1$ and $L_2$ be independent random variables distributed as $\sigma_1L$ and $\sigma_2L$, respectively.
$\varphi_{L_1 + L_2}(t) = \varphi_{L_1}(t)\varphi_{L_2}(t)$ [Borovkov [9, p.126, l.6]]
$= e^{-(\sigma_1 + \sigma_2)|t|} = \varphi_{(\sigma_1 + \sigma_2)L}(t)$.
By Borovkov [9, p.131, Corollary 1], $L_1 + L_2$ has the same distribution as $(\sigma_1 + \sigma_2)L$.
Since $\sigma = \gamma/2$, the width of $f_{L_1 + L_2}$ is $\gamma_1 + \gamma_2$.

Remark 20. $\Delta = \sqrt{\Delta_1^2 + \Delta_2^2}$ [Matveev [80, p.90, l. −1]].
Proof. Let $\omega_0 = 0$. By Borovkov [9, p.31, Theorem 1], the Gaussian distribution can be expressed as $f_G(\omega_1)$, where $G$ is a random variable. Namely,
$f_G(\omega_1)d\omega_1 = \frac{1}{\sqrt{2\pi}}e^{-\omega_1^2/2}d\omega_1$ [Borovkov [9, p.33, l. −6]]
$= \frac{1}{\sqrt{2\pi}}e^{-\omega^2/(2\sigma^2)}d(\omega/\sigma)$ (let $\omega = \sigma\omega_1$)
$= f_{\sigma G}(\omega)d\omega$, where $f_{\sigma G} = F_G$ [Matveev [80, p.90, (10.17)]].
$\varphi_{\sigma G}(t) = \varphi_G(\sigma t)$ [Borovkov [9, p.126, l.1]]
$= e^{-\sigma^2t^2/2}$ [Borovkov [9, p.128, l.2]].
Let $G_1$ and $G_2$ be independent random variables distributed as $\sigma_1G$ and $\sigma_2G$, respectively.
$\varphi_{G_1 + G_2}(t) = \varphi_{G_1}(t)\varphi_{G_2}(t)$ [Borovkov [9, p.126, l.6]]
$= e^{-(\sigma_1^2 + \sigma_2^2)t^2/2} = \varphi_{\sqrt{\sigma_1^2 + \sigma_2^2}\,G}(t)$.
By Borovkov [9, p.131, Corollary 1], $G_1 + G_2$ has the same distribution as $\sqrt{\sigma_1^2 + \sigma_2^2}\,G$.
By Matveev [80, p.90, (10.18)], the width of $f_{G_1 + G_2}$ is $\sqrt{\Delta_1^2 + \Delta_2^2}$.

Remark 21. If we let $F$ be a δ-function and substitute it into Matveev [80, p.98, (12.14)], by Cohen-Tannoudji–Diu–Laloë [17, vol. 2, p.1469, (5)], we will obtain a plane wave. This is all that Matveev [80, p.86, l.1–l.15; p.97, l. −20–p.98, l.19] say.

Remark 22. $p = p(a) = [1/(\pi n)]\exp(-a^2/n)$ [Matveev [80, p.101, (13.9)]].
Proof. $\frac{\partial n}{\partial t} = D\nabla^2n$ [p.65, (5) in http://galileo.phys.virginia.edu/classes/304/brownian.pdf], where $n(x,t)$ is the number density of diffusing particles and $D$ is the diffusion constant.
Let $p(x,t)$ be the probability for the particle at $(x,t)$. By p.66, (8) in http://galileo.phys.virginia.edu/classes/304/brownian.pdf, we will obtain
$\frac{\partial p}{\partial t} = D\nabla^2p$ [p.66, (7) in http://galileo.phys.virginia.edu/classes/304/brownian.pdf].
Since $p(x,t) = (\sqrt{4\pi Dt})^{-1}\exp\left(-\frac{x^2}{4Dt}\right)$ is a solution of $\frac{\partial p}{\partial t} = D\nabla^2p$, if we let the step number $n = 2Dt$, we will have Matveev [80, p.101, (13.9)] in the complex space after normalization.

Remark. Matveev [80, p.102, (13.11)] follows from Matveev [80, p.101, (13.9)]. Intuition also leads to the same conclusion [Matveev [80, p.102, (13.11)], see http://mathworld.wolfram.com/RandomWalk2-Dimensional.html]. This is the reason why we define n = 2Dt.
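The distribution quoted in Remark 22 can be checked by simulation. The sketch below assumes a planar random walk of n unit-length steps in uniformly random directions (this step model is our assumption, not necessarily Matveev's setup). For $p(a) = [1/(\pi n)]\exp(-a^2/n)$ the mean of $a^2$ is $n$, and the probability of ending inside the circle $a^2 \le n$ is $1 - e^{-1}$; the function name `walk_statistics` is hypothetical.

```python
import math
import random


def walk_statistics(n_steps, n_walks, seed=0):
    """Simulate planar random walks with unit steps in random directions.

    Returns (mean of a^2, fraction of walks ending with a^2 <= n_steps),
    where a is the distance from the origin after n_steps steps.
    """
    rng = random.Random(seed)  # seeded so that the run is reproducible
    sum_a2 = 0.0
    inside = 0
    for _ in range(n_walks):
        x = y = 0.0
        for _ in range(n_steps):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            x += math.cos(angle)
            y += math.sin(angle)
        a2 = x * x + y * y
        sum_a2 += a2
        if a2 <= n_steps:
            inside += 1
    return sum_a2 / n_walks, inside / n_walks


if __name__ == "__main__":
    mean_a2, fraction = walk_statistics(n_steps=50, n_walks=4000)
    print("mean a^2:", mean_a2, "(Gaussian prediction: 50)")
    print("fraction with a^2 <= n:", fraction,
          "(Gaussian prediction:", 1.0 - math.exp(-1.0), ")")
```

Agreement is only statistical: the Gaussian form is the large-n limit, so small deviations at finite n are expected.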

Remark 23. $\Gamma_{11}(0) = \max_{-\infty \le \tau \le \infty}\Gamma_{11}(\tau) > 0$ [Matveev [80, p.106, l. −7]]; $\frac{d\Gamma_{11}(\tau)}{d\tau} \le 0$ as $\tau \to 0+$ [Matveev [80, p.106, l. −11–l. −10]].

Proof. I. $(f(t+\tau) - f(t))^2 = f(t+\tau)^2 + f(t)^2 - 2f(t+\tau)f(t)$.
For large $T > 0$,
$\frac{1}{T}\int_{-T/2}^{T/2} f(t)f(t+\tau)\,dt = \frac{1}{T}\int_{-T/2}^{T/2} f(t)^2\,dt - \frac{1}{2T}\int_{-T/2}^{T/2}[f(t+\tau) - f(t)]^2\,dt$.
II. Let $A(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T/2}^{T/2}[f(t+\tau) - f(t)]^2\,dt$.
$\frac{dA(\tau)}{d\tau} = \lim_{T\to\infty}\frac{1}{2T}\int_{-T/2}^{T/2} 2[f(t+\tau) - f(t)]\,f'(t+\tau)\,dt \ge 0$ as $\tau \to 0+$ because $[f(t+\tau) - f(t)]\,f'(t+\tau) \ge 0$.

Remark 24. $\omega_{\mathrm{in}}t = \omega_{\mathrm{refl}}t = \omega_{\mathrm{refr}}t$; $k_{\mathrm{in}}\cdot r = k_{\mathrm{refl}}\cdot r = k_{\mathrm{refr}}\cdot r$ [Matveev [80, p.123, (16.5); (16.6)]].

Proof. It suffices to prove Matveev [80, p.123, (16.5)]. Fix $\omega_{\mathrm{in}}, \omega_{\mathrm{refl}}, \omega_{\mathrm{refr}}, k_{\mathrm{in}}, k_{\mathrm{refl}}, k_{\mathrm{refr}}$, and $r$. Only let $t$ vary.
Let $a \ne b$. $e^{at}, e^{bt}$ are two linearly independent solutions of the ODE $(D - a)(D - b)f(t) = 0$.

Remark 25. Matveev [80, p.127, (16.24)] should have been corrected according to Wangsness [121, p.412, (25-27)].

Remark 26. For Matveev [80, p.134, l. −23–l. −3], see p.14, Fig.; p.15, Fig. in https://www.brown.edu/research/labs/mittleman/sites/brown.edu.research.labs.mittleman/files/uploads/lecture13_0.pdf. For the phase change of the magnetic field strength upon reflection [Matveev [80, p.134, l. −10–l. −9; l. −4–l. −3]], the result follows from p.1, Fig. in https://www.usna.edu/Users/physics/mungan/_files/documents/Scholarship/PhaseChange.pdf and the right-hand rule.

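Remark 27. For the proof of $k_{\mathrm{refl}\,n} = -k_{\mathrm{in}\,n}$ [Matveev [80, p.135, l. −4]], see Wangsness [121, p.408, l.5].
$k_2^2 - k_1^2\sin^2\theta_{\mathrm{in}} = k_2^2\left[1 - \left(\frac{n_1}{n_2}\right)^2\sin^2\theta_{\mathrm{in}}\right]$ [Matveev [80, p.136, l.7]] follows from Wangsness [121, p.406, (25-4)].

Remark 28. $\left(\frac{E_{\mathrm{refl}}}{E_{\mathrm{in}}}\right)_\perp = \frac{Z_2\cos\theta_{\mathrm{in}} - Z_1[1-(n_1/n_2)^2\sin^2\theta_{\mathrm{in}}]^{1/2}}{Z_2\cos\theta_{\mathrm{in}} + Z_1[1-(n_1/n_2)^2\sin^2\theta_{\mathrm{in}}]^{1/2}} = \frac{Z_2\cos\theta_{\mathrm{in}} - iZ_1(s/k_2)}{Z_2\cos\theta_{\mathrm{in}} + iZ_1(s/k_2)}$ [Matveev [80, p.138, l.5–l.6, (17.16)]].

Proof. $\left(\frac{E_{\mathrm{refl}}}{E_{\mathrm{in}}}\right)_\perp = \frac{Z_2\cos\theta_{\mathrm{in}} - Z_1\cos\theta_{\mathrm{refr}}}{Z_2\cos\theta_{\mathrm{in}} + Z_1\cos\theta_{\mathrm{refr}}}$ [Matveev [80, p.128, (16.30a)]]
$= \frac{Z_2\cos\theta_{\mathrm{in}} - Z_1[1-(n_1/n_2)^2\sin^2\theta_{\mathrm{in}}]^{1/2}}{Z_2\cos\theta_{\mathrm{in}} + Z_1[1-(n_1/n_2)^2\sin^2\theta_{\mathrm{in}}]^{1/2}}$ [Matveev [80, p.136, (17.8)]].
$k_2\cos\theta_{\mathrm{refr}} = is$ [Wangsness [121, p.418, l. −7]]. Therefore,
$\frac{Z_2\cos\theta_{\mathrm{in}} - Z_1\cos\theta_{\mathrm{refr}}}{Z_2\cos\theta_{\mathrm{in}} + Z_1\cos\theta_{\mathrm{refr}}} = \frac{Z_2\cos\theta_{\mathrm{in}} - iZ_1(s/k_2)}{Z_2\cos\theta_{\mathrm{in}} + iZ_1(s/k_2)}$.

Remark. Matveev [80, p.138, (17.17)] should have been corrected as
$\left(\frac{E_{\mathrm{refl}}}{E_{\mathrm{in}}}\right)_\parallel = \frac{Z_1\cos\theta_{\mathrm{in}} - Z_2[1-(n_1/n_2)^2\sin^2\theta_{\mathrm{in}}]^{1/2}}{Z_1\cos\theta_{\mathrm{in}} + Z_2[1-(n_1/n_2)^2\sin^2\theta_{\mathrm{in}}]^{1/2}} = \frac{Z_1\cos\theta_{\mathrm{in}} - iZ_2(s/k_2)}{Z_1\cos\theta_{\mathrm{in}} + iZ_2(s/k_2)}$.

Remark 29. $2sl = 1$ [Matveev [80, p.147, l.7]] because the intensity of the wave is proportional to $E\cdot E$. See Matveev [80, p.143, (19.12)].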
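The second form in Remark 28 is a ratio of a complex number to its complex conjugate, so its modulus is 1 beyond the critical angle. A minimal numerical sketch follows, assuming a glass-to-air interface ($n_1 = 1.5$, $n_2 = 1$) and non-magnetic media with $Z \propto 1/n$; these values and the function name `reflection_perp` are illustrative assumptions, not taken from Matveev.

```python
import cmath
import math


def reflection_perp(theta_in, n1, n2, z1, z2):
    """Amplitude ratio (E_refl/E_in)_perp in the form used in Remark 28."""
    sin2 = (n1 / n2) ** 2 * math.sin(theta_in) ** 2
    # cos(theta_refr) = [1 - (n1/n2)^2 sin^2(theta_in)]^(1/2).
    # For sin2 > 1 the principal square root is +i*sqrt(sin2 - 1), i.e. i*s/k2.
    cos_refr = cmath.sqrt(1.0 - sin2)
    numerator = z2 * math.cos(theta_in) - z1 * cos_refr
    denominator = z2 * math.cos(theta_in) + z1 * cos_refr
    return numerator / denominator


if __name__ == "__main__":
    n1, n2 = 1.5, 1.0                # assumed refractive indices
    z1, z2 = 1.0 / n1, 1.0 / n2      # assumed: impedance proportional to 1/n
    for degrees in (30.0, 60.0):
        r = reflection_perp(math.radians(degrees), n1, n2, z1, z2)
        print(degrees, "deg: |r| =", abs(r))
```

Below the critical angle (about 41.8° for these indices) the ratio is real with modulus less than 1; above it the modulus equals 1 and only the phase changes.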

12. Fluid Mechanics (Landau–Lifshitz [69])

Remark 1. A complete system of equations of fluid dynamics should be five in number [Landau–Lifshitz [69, p.5, l.7–l.8]] because five variables require five equations to get solutions.

Remark 2. ρdx = ρ0da [Landau–Lifshitz [69, p.5, l.15]] means that the mass of the fluid element at t equals the mass of the fluid element at t = t0.

Remark 3. (∂s/∂t)a = 0 [Landau–Lifshitz [69, p.5, l.22]] because dx = 0 (hence v = 0) if a is fixed.

Remark 4. $\left(\frac{\partial v}{\partial t}\right)_a = -\frac{1}{\rho_0}\left(\frac{\partial p}{\partial a}\right)_t$ [Landau–Lifshitz [69, p.5, l.20]] is Newton’s second law at $t = t_0$.

Remark 5. $d\Phi = dp/\rho \Rightarrow \nabla\Phi = \nabla p/\rho$ [Landau–Lifshitz [69, p.6, l.9–l.10]] because $\partial\Phi/\partial t = 0$ [in thermal equilibrium] and $\partial p/\partial t = 0$ [in mechanical equilibrium].

Remark 6. Landau–Lifshitz [69, p.7, (3.7)] follows from Wangsness [121, p.33, (1-101) & (1-103)].

Remark 7. “Consider a fluid element” [Landau–Lifshitz [69, p.7, l. −17]] means “consider a fluid element with fixed mass”.

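Remark 8. $V(p', s') - V(p', s) > 0$ [Landau–Lifshitz [69, p.7, l. −8]] because the density is inversely proportional to the volume.

Remark 9. $\left(\frac{\partial V}{\partial s}\right)_p = \frac{T}{c_p}\left(\frac{\partial V}{\partial T}\right)_p$ [Landau–Lifshitz [69, p.7, l. −4]] follows from Reif [93, p.167, (5·7·2)]. This is because $\partial$ in $\frac{\partial V}{\partial s}\frac{\partial s}{\partial T} = \frac{\partial V}{\partial T}$ can be replaced with $d$ if we treat $p$ as a constant.

Remark 10. Because Landau–Lifshitz [69, p.4, (2.9)] considers $dv/dt$, the acceleration $g$ must be added to the right-hand side of equation (5.1) if the flow takes place in a gravitational field [Landau–Lifshitz [69, p.9, l.7–l.8]].

Remark 11. $v\cdot(v\cdot\nabla)v = \frac{1}{2}\,v\cdot\nabla v^2$ [Landau–Lifshitz [69, p.9, l. −5]].

Proof. $(v\cdot\nabla)v$
$= \left(v_1\frac{\partial}{\partial x} + v_2\frac{\partial}{\partial y} + v_3\frac{\partial}{\partial z}\right)v$
$= v_1\frac{\partial v}{\partial x} + v_2\frac{\partial v}{\partial y} + v_3\frac{\partial v}{\partial z}$
$= \left(v_1\frac{\partial v_1}{\partial x} + v_2\frac{\partial v_1}{\partial y} + v_3\frac{\partial v_1}{\partial z},\; v_1\frac{\partial v_2}{\partial x} + v_2\frac{\partial v_2}{\partial y} + v_3\frac{\partial v_2}{\partial z},\; v_1\frac{\partial v_3}{\partial x} + v_2\frac{\partial v_3}{\partial y} + v_3\frac{\partial v_3}{\partial z}\right)$.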
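The identity in Remark 11 can be spot-checked numerically by building $(v\cdot\nabla)v$ component by component, exactly as in the proof. The sketch below uses central differences; the velocity field in `velocity` is an arbitrary smooth example chosen only for illustration, and all function names are hypothetical.

```python
def velocity(p):
    """An arbitrary smooth velocity field (illustrative assumption)."""
    x, y, z = p
    return (y * z, x * x - z, x + y * y)


def partial(field, p, axis, h=1e-5):
    """Central-difference partial derivative of a tuple-valued field."""
    hi = list(p)
    lo = list(p)
    hi[axis] += h
    lo[axis] -= h
    return tuple((a - b) / (2.0 * h) for a, b in zip(field(hi), field(lo)))


def lhs(p):
    """v . [(v . grad) v], with (v . grad) v assembled as in the proof."""
    v = velocity(p)
    derivs = [partial(velocity, p, j) for j in range(3)]  # derivs[j][i] = dv_i/dx_j
    convective = [sum(v[j] * derivs[j][i] for j in range(3)) for i in range(3)]
    return sum(v[i] * convective[i] for i in range(3))


def rhs(p):
    """(1/2) v . grad(v^2)."""
    v = velocity(p)

    def speed_squared(q):
        return (sum(c * c for c in velocity(q)),)

    grad = [partial(speed_squared, p, j)[0] for j in range(3)]
    return 0.5 * sum(v[j] * grad[j] for j in range(3))


if __name__ == "__main__":
    point = (0.3, -1.2, 0.7)
    print(lhs(point), rhs(point))
```

The two printed values should agree up to the finite-difference error.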

Remark 12. $\frac{\partial}{\partial t}\left(\frac{1}{2}\rho v^2 + \rho\varepsilon\right) = -\nabla\cdot\left[\rho v\left(\frac{1}{2}v^2 + w\right)\right]$ [Landau–Lifshitz [69, p.10, l.11]] follows from Wangsness [121, p.34, (1-114)].

Remark 13. The second term is the work done by pressure forces on the fluid within the surface [Landau–Lifshitz [69, p.10, l. −2–l. −1]].

Proof. By Reif [93, p.110, (4·4·6)], $d\bar{E} = \text{đ}Q - \bar{p}\,dV$, where $\bar{p}\,dV$ is the work done by the system.
By Landau–Lifshitz [69, p.10, (6.2)],
$-\frac{\partial}{\partial t}\int\left(\frac{1}{2}\rho v^2 + \rho\varepsilon\right)dV = \oint \rho v\left(\frac{1}{2}v^2 + \varepsilon\right)\cdot df + \oint p\,v\cdot df$,
where the second term on the right side of the equality is the work done on the system (in this case, the fluid element) by the pressure forces.

Remark 14. $\frac{dv}{dt} = -\nabla w$ [Landau–Lifshitz [69, p.13, l.7]] follows from Landau–Lifshitz [69, p.3, (2.2); p.4, (2.9)].

Remark 15. Landau–Lifshitz [69, p.14, Remark ‡; p.16, Remark †] are intended to help one’s understanding, but their wording may easily leave one under the impression that Stokes’ theorem is valid only in simply-connected regions. In fact, Stokes’ theorem is also valid in multiply-connected regions except that we must consider the entire boundary rather than a partial boundary. For example, let
$R = \{z \mid 1 < |z| < 2\}$, $C_1 = \{2e^{it} \mid t \in [0, 2\pi]\}$, $C_2 = \{e^{-it} \mid t \in [0, 2\pi]\}$.
If a potential flow is valid in $R$, then, by Stokes’ theorem, we have $\oint_{C_1+C_2} v\cdot dl = 0$ rather than $\oint_{C_1} v\cdot dl = 0$.

Remark 16. Landau–Lifshitz [69, p.19, l.5, (10.9)] follows from Courant–John [22, vol. 2, p.104, the fundamental theorem].

Remark 17. $u = \Omega \times r$ [Landau–Lifshitz [69, p.22, l. −14]] follows from Karamcheti [64, p.12, (1.5)].
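The example in Remark 15 can be made concrete. The sketch below assumes the point-vortex field $v = (-y, x)/(x^2 + y^2)$, which is irrotational in the annulus $R$ (the choice of field is ours, for illustration). Its circulation along $C_1$ alone is $2\pi$, while the circulation along the full boundary $C_1 + C_2$ vanishes.

```python
import math


def vortex(x, y):
    """Point-vortex velocity field; irrotational away from the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def outer_circle(t):
    """C1: z = 2 e^{it}, traversed counterclockwise."""
    return (2.0 * math.cos(t), 2.0 * math.sin(t))


def inner_circle(t):
    """C2: z = e^{-it}, traversed clockwise."""
    return (math.cos(-t), math.sin(-t))


def circulation(curve, n=20000):
    """Approximate the line integral of v . dl along curve(t), t in [0, 2*pi]."""
    total = 0.0
    for i in range(n):
        x0, y0 = curve(2.0 * math.pi * i / n)
        x1, y1 = curve(2.0 * math.pi * (i + 1) / n)
        # Evaluate the field at the midpoint of each chord.
        vx, vy = vortex(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += vx * (x1 - x0) + vy * (y1 - y0)
    return total


if __name__ == "__main__":
    c1 = circulation(outer_circle)
    c2 = circulation(inner_circle)
    print("C1 alone:", c1)
    print("C1 + C2 :", c1 + c2)
```

The nonzero value on $C_1$ does not contradict Stokes’ theorem, because $C_1$ is only part of the boundary of $R$.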

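Remark 18. $\frac{x}{a^2}\frac{\partial\phi}{\partial x} + \frac{y}{b^2}\frac{\partial\phi}{\partial y} + \frac{z}{c^2}\frac{\partial\phi}{\partial z} = xy\,\Omega\left(\frac{1}{b^2} - \frac{1}{a^2}\right)$ [Landau–Lifshitz [69, p.22, l. −10]] follows from O’Neill [86, p.148, Lemma 3.8].

Remark 19. $M = \frac{\Omega\rho V}{5}\,\frac{(a^2 - b^2)^2}{a^2 + b^2}$ [Landau–Lifshitz [69, p.23, l.2]].

Proof. Deform the ellipsoid to the unit ball: $(x, y, z) = (au, bv, cw)$. Let
$u = \rho\sin\theta\cos\phi$
$v = \rho\sin\theta\sin\phi$
$w = \rho\cos\theta$.
Then express the integral in terms of $\rho$, $\theta$, and $\phi$.

Remark 20. $\frac{\partial v}{\partial t} + v\frac{\partial v}{\partial r} = -\frac{1}{\rho}\frac{\partial p}{\partial r}$ [Landau–Lifshitz [69, p.24, l.4]].

Proof. By Landau–Lifshitz [69, p.3, (1.2)],
$\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\frac{1}{\rho}\nabla p$, where $\mathbf{v} = v\hat{r}$.
$(\mathbf{v}\cdot\nabla)\mathbf{v} = (v\hat{r})\cdot\left(\hat{r}\,\frac{\partial}{\partial r}\right)\mathbf{v}$ [Wangsness [121, p.33, (1-102)]]
$= v\,\frac{\partial(v\hat{r})}{\partial r}$.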

Remark 21. $V = \frac{dR}{dt} = -\sqrt{\frac{2p_0}{3\rho}\left(\frac{a^3}{R^3} - 1\right)}$ [Landau–Lifshitz [69, p.25, l.3]].

Proof. $\frac{3\,dR}{R} = -\frac{dV^2}{V^2 + \frac{2p_0}{3\rho}}$ [Landau–Lifshitz [69, p.24, l. −1, (4)]].

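Remark 22. $\int_0^a \frac{dR}{\sqrt{(a/R)^3 - 1}} = a\sqrt{\pi}\,\frac{\Gamma(5/6)}{\Gamma(1/3)}$ [Landau–Lifshitz [69, p.25, l.7]].

Proof. $\int_0^a \frac{dR}{\sqrt{(a/R)^3 - 1}} = a\int_0^1 \frac{r^{3/2}}{\sqrt{1 - r^3}}\,dr$ (let $r = R/a$)
$= \frac{a}{3}\int_0^1 u^{-1/6}(1-u)^{-1/2}\,du$ (let $u = r^3$)
$= \frac{a}{3}\,B\left(\frac{5}{6}, \frac{1}{2}\right) = \frac{a}{3}\,\frac{\Gamma(5/6)\,\Gamma(1/2)}{\Gamma(4/3)}$, which gives the stated value because $\Gamma(1/2) = \sqrt{\pi}$ and $\Gamma(4/3) = \frac{1}{3}\Gamma(1/3)$.

Remark 23. The compression of the jet is $a_1/a = \pi/(2+\pi)$ [Landau–Lifshitz [69, p.26, l. −10]].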

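Proof. $-a/2 - a/2 = \frac{a_1}{\pi}\int_{-\pi}^{0} e^{i\theta}\tan\theta\,d\theta$ [Landau–Lifshitz [69, p.26, l. −12]]
$= \frac{a_1}{\pi}\left(-\cos\theta\,\Big|_{-\pi}^{0} + i\int_{-\pi}^{0}\sec\theta\,d\theta\right)$
$= \frac{a_1}{\pi}\left[-2 + i\ln\tan\left(\frac{\theta}{2} + \frac{\pi}{4}\right)\Big|_{-\pi}^{0}\right]$.

13. Elasticity (Landau–Lifshitz [70])

Remark 1. Landau–Lifshitz [70, p.3, (1.7) & (1.8)]

Proof. Following the same argument leading to Landau–Lifshitz [70, p.3, (1.5)], we may obtain the strain tensor for curvilinear coordinates $(q_1, q_2, q_3)$:
$u_{ik} = \frac{1}{2}\left(\hat{q}_i\cdot\frac{\partial u}{h_k\,\partial q_k} + \hat{q}_k\cdot\frac{\partial u}{h_i\,\partial q_i}\right)$ (assume $\hat{q}_i\cdot\hat{q}_k = \delta_{ik}$) [Riley–Hobson–Bence [94, p.373, (10.60)]].
For spherical coordinates, let $h_1\partial q_1 = \partial r$, $h_2\partial q_2 = r\,\partial\theta$, $h_3\partial q_3 = r\sin\theta\,\partial\phi$.
For cylindrical coordinates, let $h_1\partial q_1 = \partial r$, $h_2\partial q_2 = r\,\partial\phi$, $h_3\partial q_3 = \partial z$.
Then
$2u_{\theta\phi} = \hat{\theta}\cdot\frac{\partial u}{r\sin\theta\,\partial\phi} + \hat{\phi}\cdot\frac{\partial u}{r\,\partial\theta}$
$= \hat{\theta}\cdot\frac{1}{r\sin\theta}\left(\frac{\partial u_r}{\partial\phi}\hat{r} + \frac{\partial u_\theta}{\partial\phi}\hat{\theta} + \frac{\partial u_\phi}{\partial\phi}\hat{\phi} + u_r\frac{\partial\hat{r}}{\partial\phi} + u_\theta\frac{\partial\hat{\theta}}{\partial\phi} + u_\phi\frac{\partial\hat{\phi}}{\partial\phi}\right)$
$+ \hat{\phi}\cdot\frac{1}{r}\left(\frac{\partial u_r}{\partial\theta}\hat{r} + \frac{\partial u_\theta}{\partial\theta}\hat{\theta} + \frac{\partial u_\phi}{\partial\theta}\hat{\phi} + u_r\frac{\partial\hat{r}}{\partial\theta} + u_\theta\frac{\partial\hat{\theta}}{\partial\theta} + u_\phi\frac{\partial\hat{\phi}}{\partial\theta}\right)$
$= \frac{1}{r}\left(\frac{\partial u_\phi}{\partial\theta} - u_\phi\cot\theta\right) + \frac{1}{r\sin\theta}\frac{\partial u_\theta}{\partial\phi}$ [Symon [110, p.97, (3.99)]].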
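The final expression for $2u_{\theta\phi}$ can be cross-checked against the Cartesian strain tensor projected onto $\hat{\theta}$ and $\hat{\phi}$. The sketch below is a numerical check under stated assumptions: the displacement field in `displacement` is an arbitrary smooth example, derivatives are taken by central differences, and all function names are hypothetical.

```python
import math


def displacement(x, y, z):
    """An arbitrary smooth displacement field (illustrative assumption)."""
    return (x * y, y * z + x, z * x - y * y)


def basis(theta, phi):
    """Spherical unit vectors (r_hat, theta_hat, phi_hat) in Cartesian components."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return (st * cp, st * sp, ct), (ct * cp, ct * sp, -st), (-sp, cp, 0.0)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def spherical_components(r, theta, phi):
    """(u_r, u_theta, u_phi) of the displacement at the point (r, theta, phi)."""
    r_hat, th_hat, ph_hat = basis(theta, phi)
    u = displacement(r * r_hat[0], r * r_hat[1], r * r_hat[2])
    return dot(u, r_hat), dot(u, th_hat), dot(u, ph_hat)


def two_u_theta_phi_formula(r, theta, phi, h=1e-5):
    """2 u_{theta phi} from the spherical-coordinate expression above."""
    u_ph = spherical_components(r, theta, phi)[2]
    du_ph_dth = (spherical_components(r, theta + h, phi)[2]
                 - spherical_components(r, theta - h, phi)[2]) / (2.0 * h)
    du_th_dph = (spherical_components(r, theta, phi + h)[1]
                 - spherical_components(r, theta, phi - h)[1]) / (2.0 * h)
    return (du_ph_dth - u_ph / math.tan(theta)) / r + du_th_dph / (r * math.sin(theta))


def two_u_theta_phi_cartesian(r, theta, phi, h=1e-5):
    """2 u_{theta phi} as theta_i phi_j (d_i u_j + d_j u_i) in Cartesian coordinates."""
    r_hat, th_hat, ph_hat = basis(theta, phi)
    point = [r * c for c in r_hat]
    grad = []  # grad[i][j] = d u_j / d x_i
    for i in range(3):
        hi, lo = list(point), list(point)
        hi[i] += h
        lo[i] -= h
        grad.append([(a - b) / (2.0 * h)
                     for a, b in zip(displacement(*hi), displacement(*lo))])
    return sum(th_hat[i] * ph_hat[j] * (grad[i][j] + grad[j][i])
               for i in range(3) for j in range(3))


if __name__ == "__main__":
    args = (1.3, 0.7, 0.4)
    print(two_u_theta_phi_formula(*args), two_u_theta_phi_cartesian(*args))
```

A quick sanity case is the rigid rotation $u = (-y, x, 0)$, for which $u_\phi = r\sin\theta$ and the formula gives zero strain, as it should.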

2 A physical theory’s creation, development and validity

I. Choosing postulates
1. To develop a rigorous theory, we must introduce some basic postulates that are eminently reasonable and certainly not contradicted by any of the laws of mechanics. The actual validity of a postulate is determined by experiments: we make theoretical predictions based on this postulate; if these predictions can be repeatedly verified by experiments, we can accept the validity of this postulate with increasing confidence [Reif [93, p.48, l.3–l.16]].
2. Classical thermodynamics uses thermodynamic laws [Reif [93, p.122, l.17–p.123, l.10]] as basic postulates, while statistical mechanics uses the postulate of equal a priori probabilities [Reif [93, p.54, l. −13–l. −12]] as the fundamental postulate and connects thermodynamic laws and microstates [Pathria–Beale [89, p.2, l. −12]] of a system with the relation $S = k\ln\Omega$ [Reif [93, p.99, (3·3·12)]]. We may derive the equation of state $\bar{p}V = NkT$ [Reif [93, p.125, (3·12·8)]] from either the postulate of statistical mechanics or the postulates of classical thermodynamics. However, only through statistical mechanics may we obtain a satisfactory explanation for Boltzmann’s constant $k$ [Reif [93, p.137, l.5]].
Remark. Physical theories and experiments grow together. We first obtain the equation of state $\bar{p}V = NkT$ from experiments. The attempt to figure out the theoretical meaning of the constant $k$ leads to the development of statistical mechanics.
3. Where the fundamental postulate of equal a priori probabilities is applied in statistical mechanics [Reif [93, p.60, (2·4·1); p.112, l.23–p.113, l.11]]

II. Theory’s validity
1. How to prepare an ideal gas [Reif [93, p.135, l.1–l.14]]
2. The measurement of macroscopic parameters
(a). Work and internal energy [Reif [93, §4·1; p.172, l. −4–p.175, l.4]]
(b). Heat [Reif [93, §4·2]]
(c). Temperature [Reif [93, §4·3]]
(d). Specific heats [Reif [93, §4·4; §5·7]]
(e). Entropy [Reif [93, §4·5; §5·4; p.171, l.2–p.172, l. −5]]
3. Theory’s predictions and their verification [the method: Reif [93, p.128, l. −22–l. −13]]: the experimental values agree with the theoretical values [examples: Reif [93, p.158, l. −10–l. −8]].

III. (Theory’s development) The theory of spin is created by analyzing the Stern–Gerlach experiment [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, chap. V, §A, 1.a]]. All the formulas given in Cohen-Tannoudji–Diu–Laloë [17, vol. 1, chap. V, §A, 1.b] are copied from the corresponding formulas in the theory of angular momentum.

3 The uncertainty principle

The principle is so important that almost every topic in quantum mechanics will discuss it. In order to fully understand the uncertainty principle, one must master some important concepts and follow a series of necessary steps. If one misses any part of it, then one’s understanding about the uncertainty principle is simply incorrect. One can have a hundred points or eighty points in composition, but there are only two scores in physics: a hundred points and zero.

3.1 Eigenvalues and simultaneous measurements

In quantum mechanics, physical quantities are expressed as operators. The result of measuring a physical quantity is an eigenvalue of the corresponding operator. For a finite-dimensional vector space, a normal operator is diagonalizable [Halmos [50, p.156, Theorem 1; p.159, l. −21–l. −20]; Jacobson [61, vol. 2, p.134, Theorem 7]]. In order to simultaneously measure a set of observables, one must find a common eigenstate for these observables. The way to accomplish this goal is to construct a C.S.C.O. [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, chap. II, §D, 3.b]] using Jacobson [61, vol. 2, p.134, Theorem 7]. Thus, simultaneous measurements of a set of observables require that these observables commute in pairs. Consequently, from the viewpoint of quantum mechanics, Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.150, (E-30)] shows that it is impossible to measure $R_i$ and $P_i$ simultaneously in the non-classical regime.

3.2 Proofs of the uncertainty relation

The following two proofs are essentially the same [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.286, l. −7]; Rudin [99, p.80, (2)]; p.14, l. −1, http://math.uchicago.edu/~may/REU2013/REUPapers/Hill.pdf], even though the former proof starts with observables, while the latter proof starts with wave functions:

3.2.1 Using the conjugacy relation between two observables

[Cohen-Tannoudji–Diu–Laloë [17, vol. 1, Complement CIII]]

3.2.2 Using the Fourier transform between two wave functions

[Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.23, l.18–p.26, l. −16]]
[§5 of http://math.uchicago.edu/~may/REU2013/REUPapers/Hill.pdf] provides a rigorous proof of Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.26, (C-18)]. Hill’s paper is creative, well-organized, self-contained, and reads as if written in one sitting.

Remark 1. $x_0 = -\left[\frac{d\alpha}{dk}\right]_{k=k_0}$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.25, (C-15)]].

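Proof. $\psi(x,0) = \int_{-\infty}^{+\infty} |g(k)|\,e^{i(\alpha(k)+kx)}\,dk$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.23, (C-8); p.25, (C-12)]].
To first order in $k - k_0$,
$\alpha(k) + kx = \left(\alpha(k_0) + (k-k_0)\left[\frac{d\alpha}{dk}\right]_{k=k_0}\right) + kx$
$= \alpha(k_0) + k_0 x + (k-k_0)\left(x + \left[\frac{d\alpha}{dk}\right]_{k=k_0}\right)$.
Thus, $\psi(x,0) = e^{i(\alpha(k_0)+k_0 x)}\int_{-\infty}^{+\infty} |g(k)|\,e^{i(k-k_0)\left(x+\left[\frac{d\alpha}{dk}\right]_{k=k_0}\right)}\,dk$.
In general, $|\psi(x,0)| \le \int_{-\infty}^{+\infty} |g(k)|\,dk$. In order to let $|\psi(x,0)|$ achieve the maximum $\int_{-\infty}^{+\infty} |g(k)|\,dk$, we must choose $x_0 = -\left[\frac{d\alpha}{dk}\right]_{k=k_0}$ so that $e^{i(k-k_0)\left(x+\left[\frac{d\alpha}{dk}\right]_{k=k_0}\right)} = e^0 = 1$.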
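The stationary-phase argument above can be illustrated numerically. The sketch below assumes a Gaussian profile $|g(k)| = e^{-((k-k_0)/\text{width})^2}$ and a linear phase $\alpha(k) = -x_0(k-k_0)$, so that $-d\alpha/dk = x_0$ exactly; the profile, the parameter values, and the function names are illustrative choices, not taken from Cohen-Tannoudji–Diu–Laloë.

```python
import cmath
import math


def packet_amplitude(x, k0=5.0, width=1.0, x0=2.0, n=2001, k_span=6.0):
    """|psi(x,0)| for a Gaussian |g(k)| and alpha(k) = -x0*(k - k0).

    The k-integral is approximated by a Riemann sum over [k0 - k_span, k0 + k_span].
    """
    dk = 2.0 * k_span / (n - 1)
    total = 0j
    for i in range(n):
        k = k0 - k_span + i * dk
        g = math.exp(-((k - k0) / width) ** 2)
        alpha = -x0 * (k - k0)  # chosen so that -d(alpha)/dk = x0
        total += g * cmath.exp(1j * (alpha + k * x)) * dk
    return abs(total)


def peak_position(xs):
    """Return the grid point at which |psi(x,0)| is largest."""
    return max(xs, key=packet_amplitude)


if __name__ == "__main__":
    grid = [i * 0.05 for i in range(-40, 121)]  # x from -2 to 6
    print("peak at x =", peak_position(grid))
    print("peak height:", packet_amplitude(2.0),
          " integral of |g|:", math.sqrt(math.pi))
```

At $x = x_0$ all phases align, so the peak height equals $\int |g(k)|\,dk = \sqrt{\pi}$ for this profile with width 1; away from $x_0$ the phases partially cancel.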

Remark 2. In http://math.uchicago.edu/~may/REU2013/REUPapers/Hill.pdf, note that
1. $\mathcal{F}^{-1}(g) = \check{g} = \hat{\tilde{g}}$, so $g \to \mathcal{F}^{-1}(g)$ is continuous [Theorem 4.10.(2) & Theorem 4.11].
2. $\hat{g} = e^{2\pi i x_0\xi}\hat{f}(\xi+\xi_0)$ [p.4, l.5] should have been corrected as $\hat{g} = e^{2\pi i x_0(\xi+\xi_0)}\hat{f}(\xi+\xi_0)$. $\|t\hat{g}\|_2 = \left(\int_{\mathbb{R}} |t\,e^{2\pi i x_0 t}\hat{f}(t+\xi_0)|^2\,dt\right)^{1/2}$ [p.14, l.10] should have been corrected as $\|t\hat{g}\|_2 = \left(\int_{\mathbb{R}} |t\,e^{2\pi i x_0(t+\xi_0)}\hat{f}(t+\xi_0)|^2\,dt\right)^{1/2}$.
3. If we want to apply Theorem 5.1 to quantum mechanics, $f$ should be interpreted as a wave function [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.19, l.7]] or probability amplitude [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.19, l.9]] instead of probability distribution [Hill, p.14, l.17; Borovkov [9, p.17, l.15–l.16]].

Remark 3. The above rigorous proofs require the use of observables and wave functions, so it is difficult to see the key idea of the uncertainty principle. Matveev [80, p.77, l.6, (8.27)] shows that the idea is quite simple. For details, see Remark 14.

3.3 Applications

[Cohen-Tannoudji–Diu–Laloë [17, vol. 1, Complement MIII, §2; p.494, (B-34)]]

Remark 1. $\langle T\rangle > 0$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.356, l. −2]].

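Proof. Assume that $\langle T\rangle = 0$. $\langle T\rangle = 0 \Rightarrow \left|\frac{d}{dx}\varphi(x)\right| = 0 \Rightarrow \varphi$ is a constant, which is nonzero because $\varphi$ is normalized. Then $\varphi$ is not square-integrable on $(-\infty, +\infty)$. We obtain a contradiction.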

Remark 2. Note that if $|\psi\rangle$ represents the normalized ground state, then
$\langle H\rangle_\psi$ [i.e. $E$ given in Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.357, (17)]]
$= \langle\psi|H|\psi\rangle$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.227, (C-4)]]
$= E_1$ [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.355, Figure 3]].

4 The quantum regime vs. the classical regime

Before quantum mechanics was well established, it was better to collect more examples to give more evidence to the theory, but listing examples without a summary can easily look disorganized. Modern textbooks usually follow this approach. However, now that quantum mechanics is well established, repeating the same method in different guises only obscures the essence of the theory and easily confuses readers. Therefore, we should summarize the essence first, and then provide examples.

Strictly speaking, quantum mechanics covers all the cases and contains classical mechanics. Classical mechanics is valid in the classical regime, but how do we distinguish the classical regime from the non-classical regime? The common problems that we encounter are as follows:

I. The uncertainty principle: If the length scale that we consider is much greater than the de Broglie wavelength [Reif [93, p.246, (7·4·3)]], then a simultaneous specification of coordinates and momenta is allowed (the classical regime). Examples: gases vs. electrons in a metal [Reif [93, p.247, l. −16–p.248, l.13]].

II. Bound state energies are quantized: By separation of variables [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.32, (D-2)]], Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.32, (D-1)] may become Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.32, (D-4) or p.33, (D-8); p.352, (1)]. The requirement of the square-integrability of $\varphi$ implies that the bound state energies are quantized [Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.352, l. −14–p.354, l. −9]]. See Cohen-Tannoudji–Diu–Laloë [17, vol. 1, p.355, Figure 3] or Reif [93, p.250, Fig. 7·5·1].
High temperatures belong to the classical regime [Reif [93, p.253, (7·6·13)]], while low temperatures belong to the non-classical regime [Reif [93, p.253, (7·6·15)]]. In the case of Reif [93, p.253, (7·6·13)], the difference $\hbar\omega$ between consecutive energy levels is negligibly small compared with $kT$, so energy levels in the classical regime look continuous. Examples: the thermal energy $kT$ [Reif [93, p.252, (7·6·5)]] vs. the zero-point energy $\frac{1}{2}\hbar\omega$ of the ground state [Reif [93, p.253, l.3–l.18]]; the specific heat of diamond [hard; low molecular weight] vs. those of other solids [Reif [93, p.255, l.10–p.356, l. −6]].
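The two temperature regimes can be seen in a short calculation. The sketch below assumes the standard mean energy of a single quantum harmonic oscillator, $\bar{E} = \hbar\omega\left(\frac{1}{2} + \frac{1}{e^{\hbar\omega/kT} - 1}\right)$, which we take to be the expression behind the Reif equations cited above; the function name `mean_energy` and the choice of units ($\hbar\omega = 1$) are illustrative.

```python
import math


def mean_energy(kT, hbar_omega=1.0):
    """Mean energy of one quantum harmonic oscillator at temperature kT.

    Uses E = hbar_omega * (1/2 + 1/(exp(hbar_omega/kT) - 1)); expm1 keeps
    the result accurate when hbar_omega/kT is small.
    """
    x = hbar_omega / kT
    return hbar_omega * (0.5 + 1.0 / math.expm1(x))


if __name__ == "__main__":
    for kT in (100.0, 1.0, 0.05):
        print("kT =", kT, " mean energy =", mean_energy(kT))
```

For $kT \gg \hbar\omega$ the mean energy approaches the classical value $kT$, while for $kT \ll \hbar\omega$ it approaches the zero-point energy $\frac{1}{2}\hbar\omega$.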

References

[1] Arnold, V. I.: Ordinary Differential Equations, 3rd ed., translated from the Russian by Roger Cooke, New York: Springer-Verlag, 1992.

[2] Bell, R. J. T.: An Elementary Treatise on Coordinate Geometry of Three Dimensions, 3rd ed., London: Macmillan, 1963.

[3] Bendersky, M.: The Calculus of Variations, http://math.hunter.cuny.edu/mbenders/cofv.pdf, 2008.

[4] Billingsley, P.: Convergence of Probability Measures, 2nd edition, New York: Wiley, 1999.

[5] Birkhoff, G. & Rota, G. C.: Ordinary Differential Equations, 3rd ed., New York: John Wiley & Sons, 1978.

[6] Blaga, P. A.: Lectures on the Differential Geometry of Curves and Surfaces, http://www.cs.ubbcluj.ro/~pablaga/geometrie%20III/Blaga%20P.-Lectures%20on%20the%20differential%20geometry%20of%20curves%20and%20surfaces%20(2005).pdf, 2005.

[7] Border, K. C.: http://www.hss.caltech.edu/~kcb/Notes/EulerHomogeneity.pdf, 2012.

[8] Born, M. & Wolf E.: Principles of Optics, 7th (expanded) ed., New York: Pergamon, 2005.

[9] Borovkov, A. A.: Probability Theory (translated from the Russian by O. Borovkova), Amsterdam: Gordon and Breach Science Publishers, 1998.

[10] Bourbaki, N.: General Topology, 2 parts, Reading, MA: Addison-Wesley, 1966.

[11] Bromwich, T. J. I’A: An Introduction to the Theory of Infinite Series, London: Macmillan, 1908.

[12] Carmo, M. do: Differential Geometry of Curves and Surfaces, Englewood Cliffs, NJ: Prentice-Hall, 1976.

[13] Cherkaev, A.: http://www.math.utah.edu/~cherk/teach/12calcvar/dual.pdf

[14] Choudhury, M. H.: Electromagnetism, New York: Ellis Horwood Limited, 1989.

[15] Chung, K. L.: A Course in Probability Theory, 3rd ed., San Diego: Academic Press, 2001.

[16] Coddington, E. A. & Levinson, N.: The Theory of Ordinary Differential Equations, New York: McGraw-Hill, 1955.

[17] Cohen-Tannoudji, C., Diu, B. & Laloë, F.: Quantum Mechanics, 2 vols, New York: John Wiley & Sons, 1977.

[18] Collatz, L.: Differential Equations (translated from German by E. R. Dawson), New York: John Wiley & Sons, 1986.

[19] Conway, J. B.: Functions of One Complex Variable, New York: Springer-Verlag, 1973.

[20] Corson, D. R., Lorrain, P. & Lorrain, F.: Electromagnetic Fields and Waves, 3rd ed., New York: W. H. Freeman, 1988.

[21] Courant, R. & Hilbert, D.: Methods of Mathematical Physics, 1st English ed., 2 vols., New York: Interscience Publishers, vol.1 (1953), vol.2 (1962).

[22] Courant R. & John, F: Introduction to Calculus and Analysis, 2 vols., New York: John Wiley & sons, vol.1 (1965), vol.2 (1974).

[23] Dahl, G.: http://heim.ifi.uio.no/~geird/conv.pdf

[24] Das, J.: Analytic Geometry, Kolkata, India: Academic Publishers, 2011.

[25] Deo, S.: Algebraic Topology: A Primer, 2nd ed., Singapore, Springer, 2018.

[26] Dugundji, J.: Topology, Boston: Allyn and Bacon, 1966.

[27] Eisenhart, L. P.: A Treatise on the Differential Geometry of Curves and Surfaces, Boston: Ginn and company, 1909.

[28] Ellison, W. & F.: Prime Numbers, New York: John Wiley & Sons, 1985.

[29] Van Esch, P.: http://patrick.vanesch.pagesperso-orange.fr/nrqmJJ/hamiltonjacobiv1.pdf

[30] Evans, L. C: Partial Differential Equations, American Mathematical Society, 1998.

[31] Feller, W.: An Introduction to Probability Theory and Its Applications, 2 vols., New York: Wiley, (vol. 1, 3rd ed., 1968; vol. 2, 2nd ed., 1971).

[32] Feshbach, H. & Morse P. M.: Methods of Theoretical Physics, 2 parts, New York: McGraw-Hill, 1953.

[33] Fine, H. B. & Thompson H. D.: Coordinate Geometry, New York: The Macmillan Company, 1909.

[34] Flygare, M.: http://www.ingvet.kau.se/juerfuch/kurs/amek/prst/11_nhco.pdf

[35] Fomin, S. V. & Gelfand, I. M.: Calculus of Variations, translated and edited by R. A. Silverman, Englewood Cliffs, NJ: Prentice-Hall, 1963.

[36] Fowler, M.: http://galileoandeinstein.physics.virginia.edu/lectures/michelson.html, 2008.

[37] Fritzsche, K. & Grauert H.: Several Complex Variables, New York: Springer-Verlag, 1976.

[38] Furtak, T. E. & Klein, M. V.: Optics, 2nd ed., New York: John Wiley & Sons, 1986.

[39] Geyer, C. J.: Weak Convergence in Metric Spaces, http://www.stat.umn.edu/geyer/8112/notes/metric.pdf.

[40] Gibbon, J. D.: http://www2.imperial.ac.uk/~jdg/AE2MAPDE.PDF

[41] Goldstein, H., Poole, C., & Safko, J.: Classical Mechanics, 3rd ed., New York: Addison-Wesley, 2001.

[42] González, M. O.: Classical Complex Analysis, New York: Marcel Dekker, 1992.

[43] González, M. O.: Complex Analysis (Selected Topics), New York: Marcel Dekker, 1992.

[44] Gradshteyn, I. S. & Ryzhik, I. M.: Tables of Integrals, Series, and Products, 4th edition, prepared by Yu. V. Geronimus and M. Yu. Tseytlin, corr. enlarged edition by A. Jeffrey, New York: Academic Press, 1980.

[45] Gray, R. M.: Probability, Random Processes, and Ergodic Properties, 2nd ed., New York: Springer, 2009.

[46] Griffiths, D. J.: Introduction to Electrodynamics, 3rd ed., Upper Saddle River, NJ: Prentice-Hall, 1999.

[47] Griffiths, D. J.: Instructor’s Solutions Manual (Introduction to Electrodynamics, 3rd ed.), Upper Saddle River, NJ: Prentice-Hall, 2004.

[48] Guo, D. R. & Wang, Z. X.: Special Functions, translated from the Chinese by D. R. Guo & X. J. Xio, Singapore: World Scientific, 1989.

[49] Halliday, D. & Resnick, R: Fundamentals Of Physics, 2nd ed., New York: John Wiley & Sons, Inc., 1981.

[50] Halmos, P. R.: Finite Dimensional Vector Spaces, 2nd ed., Princeton: D. Van Nostrand, 1958.

[51] Hartman, P.: Ordinary Differential Equations, 2nd ed., Boston: Birkhäuser, 1982.

[52] Hatcher, A.: Algebraic Topology, Cambridge: Cambridge University Press, 2002.

[53] Hawkins, G. A.: Multilinear Analysis for Students in Engineering and Science, New York: Wiley, 1963.

[54] Hecht, E.: Optics, 4th ed., New York: Addison Wesley, 2002.

[55] Hicks, N. J.: Notes on Differential Geometry, New York: Van Nostrand Reinhold Company, 1965.

[56] Hilbert, D. & Cohn-Vossen, S.: Geometry and the Imagination, translated by P. Nemenyi, 2nd ed., New York: Chelsea, 1990.

[57] Ho, C. W.: A Note on Proper Maps, Proceedings of the American Mathematical Society, vol. 51, no. 1, August 1975.

[58] Hobson, E. W.: A Treatise on Plane Trigonometry, 4th ed., Cambridge: Cambridge University Press, 1918.

[59] Ince, E. L.: Ordinary Differential Equations, New York: Dover, 1956.

[60] Jackson, J. D.: Classical Electrodynamics, 3rd ed., New York: John Wiley, 1999.

[61] Jacobson, N.: Lectures in Abstract Algebra, 3 vols., Princeton: Van Nostrand, vol. 1 (1951), vol. 2 (1953), vol.3 (1964).

[62] John, F.: Partial Differential Equations, 4th ed., New York: Springer-Verlag, 1982.

[63] Kaplan, W.: Advanced Calculus, 5th ed., New York: Addison-Wesley, 2002.

[64] Karamcheti, K.: Vector Analysis and Cartesian Tensors with Selected Applications, San Francisco: Holden-Day, 1967.

[65] Kenyon, I. R.: The Light Fantastic: A Modern Introduction to Classical and Quantum Optics, New York: Oxford University Press, 2008.

[66] Kreyszig, E.: Introduction to Differential Geometry and Riemannian Geometry, Toronto: University of Toronto Press, 1968.

[67] Landau, L. D. & Lifshitz, E. M.: Mechanics, 3rd ed., translated from the Russian by J. B. Sykes & J. S. Bell, New York: Pergamon, 1988.

[68] Landau, L. D. & Lifshitz, E. M.: The Classical Theory of Fields, 4th revised English ed., translated from the Russian by M. Hamermesh, Oxford: Pergamon, 1987.

[69] Landau, L. D. & Lifshitz, E. M.: Fluid Mechanics, 2nd ed., translated from the Russian by J. B. Sykes & W. H. Reid, New York: Pergamon, 1987.

[70] Landau, L. D. & Lifshitz, E. M.: Theory of Elasticity, 3rd ed., translated from the Russian by J. B. Sykes & W. H. Reid, New York: Pergamon, 1986.

[71] Lang, S.: Complex Analysis, Reading, MA: Addison-Wesley, 1977.

[72] Lebedev, N. N.: Special Functions and Their Applications, translated from the Russian by R. A. Silverman, Englewood Cliffs, N.J.: Prentice-Hall, 1965.

[73] Lee, J. M.: Introduction to Smooth Manifolds, 2nd ed., New York: Springer, 2013.

[74] Lindgren, B. W.: Statistical Theory, 3rd ed., New York: Macmillan Publishing Co., 1976.

[75] Loève, M.: Probability Theory, vol. I, 4th ed., New York: Springer-Verlag, 1977.

[76] Mann, W. R. & Taylor, A. E.: Advanced Calculus, 3rd ed., New York: John Wiley & Sons, 1983.

[77] Marion, J. B. & Thornton, S. T.: Classical Dynamics of Particles and Systems, 5th ed., Belmont, C.A.: Brooks/Cole, 2004.

[78] Mathpages: Phase, Group, and Signal Velocity, http://www.mathpages.com/home/kmath210/kmath210.htm

[79] Matveev, A. N.: Electricity and Magnetism, translated from the Russian by Ram Wadhwa and Natalia Deineko, Moscow: Mir Publishers, 1988.

[80] Matveev, A. N.: Optics, translated from the Russian by Ram Wadhwa, Moscow: Mir Publishers, 1988.

[81] Merzbacher, E.: Quantum Mechanics, 3rd ed., New York: John Wiley & Sons, 1998.

[82] Morin, D.: http://www.people.fas.harvard.edu/~djmorin/chap6.pdf

[83] Munkres, J. R.: Topology, 2nd ed., Englewood Cliffs, NJ: Prentice-Hall, 2000.

[84] Munkres J. R.: Elements of Algebraic Topology, Reading, MA: Addison-Wesley, 1984.

[85] Munkres J. R.: Elementary Differential Topology, 2nd ed., Princeton: Princeton University Press, 1968.

[86] O’Neill, B.: Elementary Differential Geometry, New York: Academic Press, 1966.

[87] Parzen, E.: Modern Probability Theory and Its Applications, New York: Wiley, (1960, 1992).

[88] Pathria, R. K.: Statistical Mechanics, 1st ed., New York: Pergamon, 1980.

[89] Pathria, R. K. & Beale, P. D.: Statistical Mechanics, 3rd ed., New York: Elsevier, 2011.

[90] Pervin, W.J.: Foundations of General Topology, New York, Academic Press, 1964.

[91] Pontryagin, L. S.: Ordinary Differential Equations, translated from the Russian by Leonas Kacinskas and Walter B. Counts, Reading: Addison-Wesley Publishing Company, 1962.

[92] Redžić, D. V.: On the Laplacian of 1/r, http://arxiv.org/pdf/1303.2567.pdf.

[93] Reif, F.: Statistical Thermal Physics, New York: McGraw-Hill, 1965.

[94] Riley, K. F., Hobson, M. P., Bence, S. J.: Mathematical Methods for Physics and Engineering (A comprehensive guide), 2nd ed., Cambridge: Cambridge University Press, 2002.

[95] Roch, S.: Uniform integrability, http://www.math.wisc.edu/~roch/275b.1.12w/lect10-web.pdf

[96] Royden, H. L.: Real Analysis, 2nd ed., New York: The Macmillan Company, 1968.

[97] Rudin, W.: Principles of Mathematical Analysis, 2nd ed., New York: McGraw-Hill, 1964.

[98] Rudin, W.: Functional Analysis, New York: McGraw-Hill, 1973.

[99] Rudin, W.: Real and Complex Analysis, 2nd ed., New York: McGraw-Hill, 1974.

[100] Sadiku, M. N. O.: Elements of Electromagnetics, 3rd ed., Oxford: Oxford University Press, 2001.

[101] Saks, S. & Zygmund, A.: Analytic Functions, translated by E. J. Scott, Monografje Matematyczne, vol. 28, 3rd ed., Warsaw, 1971.

[102] Shapiro, J. A.: http://www.physics.rutgers.edu/~shapiro/507/book3.pdf

[103] Smythe, W. R.: Static and Dynamic Electricity, 3rd ed., New York: Taylor & Francis, 1989.

[104] Sneddon, I. N.: Special Functions of Mathematical Physics and Chemistry, New York: Interscience Publishers, 1956.

[105] Sneddon, I. N.: Elements of Partial Differential Equations, New York: McGraw-Hill, 1957.

[106] Sneddon, I. N.: Mixed Boundary Value Problems in Potential Theory, New York: Wiley-Interscience, 1966.

[107] Spivak, M: Calculus on Manifolds, New York: Addison-Wesley, 1965.

[108] Spivak, M.: A Comprehensive Introduction to Differential Geometry, Vol. 1, Berkeley: Publish or Perish, 1979.

[109] Struik, D. J.: Lectures on Classical Differential Geometry, 2nd ed., New York: Dover Publications, 1961.

[110] Symon, K. R.: Mechanics, 3rd ed., Reading, MA: Addison-Wesley, 1971.

[111] Taylor, A. E. & Mann, W. R.: Advanced Calculus, 3rd ed., New York: Wiley, 1983.

[112] Thornton, S. T.: Instructor’s Manual for Classical Dynamics of Particles and Systems, 5th ed.

[113] Titchmarsh, E. C.: Introduction to the Theory of Fourier Integrals, Oxford: The Clarendon Press, 1937.

[114] Titchmarsh, E. C.: The Theory of Functions, 2nd ed., Oxford: Oxford University Press, 1939.

[115] Tkachev, V.: http://www.mai.liu.se/~vlatk48/teaching/teaching_vt2009/lectures_uu/PDE09-03.pdf

[116] Tolman, R. C.: The Principles of Statistical Mechanics, Oxford: The Clarendon Press, 1938.

[117] Tu, L. W.: Introduction to Manifolds, 2nd ed., New York: Springer, 2011.

[118] Uspensky, J.V.: The Theory of Equations, New York: McGraw–Hill, 1948.

[119] Vvedensky, D. D.: http://www.cmth.ph.ic.ac.uk/people/d.vvedensky/groups/Chapter7.pdf, 2001.

[120] van der Waerden, B. L.: Modern Algebra, 2 vols, Translated from the German by F. Blum, New York: Ungar, 1949.

[121] Wangsness, R. K.: Electromagnetic Fields, New York: John Wiley & Sons, 1986.

[122] Watson, G. N. & Whittaker E. T.: A Course of Modern Analysis, 4th ed., Cambridge: Cambridge University Press, 1963.

[123] Watson, G. N.: Theory of Bessel Functions, 2nd ed., Cambridge: Cambridge University Press, 1966.

[124] Weatherburn, C. E.: Differential Geometry of Three Dimensions, 2 vols, Cambridge: The University Press, vol. 1 (1927); vol. 2 (1930).

[125] Wellner, J. A.: Convergence in Distribution, https://www.stat.washington.edu/jaw/COURSES/520s/521/HO.522.13/ch11.pdf

[126] Willmore, T. J.: An Introduction to Differential Geometry, Oxford, Oxford University Press, 1959.

[127] Zygmund, A.: Trigonometric Series, 2 vols., 2nd ed., Cambridge: Cambridge University Press, 1959.

Mr. Li-Chung Wang is the author of the following website about the philosophy of mechanics: http://www.lcwangpress.com/physics/main.html. Address: 7th Floor, #21 Lane 267, Xi-zhou Street, Chungli, Taiwan, ROC. E-mail:[email protected]
