
Chapter 1

Describing the Physical World: Vectors & Tensors

Lecture 1

It is now well established that all matter consists of elementary particles^1 that interact through mutual attraction or repulsion. In our everyday life, however, we do not see the elemental nature of matter; we see continuous regions of material such as the air in a room, the water in a glass or the wood of a desk. Even in a gas, the number of individual molecules in macroscopic volumes is enormous, $\approx 10^{19}$ cm$^{-3}$. Hence, when describing the behaviour of matter on scales of microns or above it is simply not practical to solve equations for each individual molecule. Instead, theoretical models are derived using average quantities that represent the state of the material. In general, these average quantities will vary with position in space, but an important concept to bear in mind is that the fundamental behaviour of matter should not depend on the particular coordinate system chosen to represent the position. The consequences of this almost trivial observation are far-reaching and we shall find that it dictates the form of many of our governing equations. We shall always consider our space to be three-dimensional and Euclidean^2 and we describe position in space by a position vector, $r$, which runs from a specific point, the origin, to our chosen location. The exact origin chosen will depend on the problem under consideration; ideally it should make the problem as "easy" as possible.

1.1 Vectors

A vector is a geometric quantity that has a "magnitude and direction". A more mathematically precise, but less intuitive, definition is that a vector is an element of a vector space. Many physical quantities are naturally described in terms of vectors, e.g. position, velocity, acceleration and force. The invariance of material behaviour under changes in coordinates means that if a vector represents a physical quantity then it must not vary if we change our coordinate system. Imagine drawing a line that connects two points in a two-dimensional (Euclidean) plane: that line remains unchanged whether we describe it as "$x$ units across and $y$ units up from the origin" or "$r$ units from the origin in the $\theta$ direction". Thus, a vector is an object that exists independent of any coordinate system, but if we wish to describe it we must choose a specific coordinate system, and its representation in that coordinate system (its components) will depend on the specific coordinates chosen.

^1 The exact nature of the most basic unit is, of course, still debated, but the fundamental discrete nature of matter is not.

^2 We won't worry about relativistic effects at all.

1.1.1 Cartesian components and summation convention

The fact that we have a Euclidean space means that we can always choose a Cartesian coordinate system with fixed orthonormal base vectors, $e_1 = i$, $e_2 = j$ and $e_3 = k$. For a compact notation, it is much more convenient to use the numbered subscripts rather than different symbols to distinguish the base vectors. Any vector quantity $a$ can be written as a sum of its components in the direction of the base vectors

$$a = a^1 e_1 + a^2 e_2 + a^3 e_3; \qquad (1.1)$$

and the vector $a$ can be represented via its components $(a^1, a^2, a^3)$; and so, $e_1 = (1,0,0)$, $e_2 = (0,1,0)$ and $e_3 = (0,0,1)$. We will often represent the components of vectors using an index, i.e. $(a^1, a^2, a^3)$ is equivalent to $a^I$, where $I \in \{1,2,3\}$. In addition, we use the Einstein summation convention in which any index that appears twice represents a sum over all values of that index

$$a = a^J e_J = \sum_{J=1}^{3} a^J e_J. \qquad (1.2)$$

Note that we can change the (dummy) summation index without affecting the result

$$a^J e_J = \sum_{J=1}^{3} a^J e_J = \sum_{K=1}^{3} a^K e_K = a^K e_K.$$

The summation is ambiguous if an index appears more than twice and such terms are not allowed. For clarity later, an upper case index is used for objects in a Cartesian (or, in fact, any orthonormal) coordinate system and, in general, we will insist that summation can only occur over a raised index and a lowered index for reasons that will hopefully become clear shortly. It is important to recognise that the components of a vector $a^I$ do not actually make sense unless we know the base vectors as well. In isolation the components give you distances but not direction, which is only half the story.

1.1.2 Curvilinear coordinate systems

For a complete theoretical development, we shall consider general coordinate systems^3. Unfortunately the use of general coordinate systems introduces considerable complexity because the lines on which individual coordinates are constant are not necessarily straight lines nor are they necessarily orthogonal to one another. A consequence is that the base vectors in general coordinate systems are not orthonormal and vary as functions of position.

^3 For our purposes a coordinate system is a set of independent scalar variables that can be used to describe any position in the entire Euclidean space.

1.1.3 Tangent (covariant base) vectors

The position vector $r = x^K e_K$ is a function of the Cartesian coordinates, $x^K$, where $x^1 = x$, $x^2 = y$ and $x^3 = z$. Note that the Cartesian base vectors can be recovered by differentiating the position vector with respect to the appropriate coordinate

$$\frac{\partial r}{\partial x^K} = e_K. \qquad (1.3)$$

In other words the derivative of position with respect to a coordinate returns a vector tangent to the coordinate direction; a statement that is true for any coordinate system. For a general coordinate system, $\xi^i$, we can write the position vector as

$$r(\xi^1, \xi^2, \xi^3) = x^K(\xi^i)\, e_K, \qquad (1.4)$$

because the Cartesian base vectors are fixed. Here the notation $x^K(\xi^i)$ means that the Cartesian coordinates can be written as functions of the general coordinates, e.g. in plane polars $x(r,\theta) = r\cos\theta$, $y(r,\theta) = r\sin\theta$, see Example 1.1. Note that equation (1.4) is the first time in which we use upper and lower case indices to distinguish between the Cartesian and general coordinate systems. A tangent vector in the $\xi^1$ direction, $t_1$, is the difference between two position vectors associated with a small (infinitesimal) change in the $\xi^1$ coordinate

$$t_1(\xi^i) = r(\xi^i + d\xi^1) - r(\xi^i), \qquad (1.5)$$

where $d\xi^1$ represents the small change in the $\xi^1$ coordinate direction, see Figure 1.1.

[Figure 1.1 about here.]

Figure 1.1: Sketch illustrating the tangent vector $t_1(\xi^i)$ corresponding to a small change $d\xi^1$ in the coordinate $\xi^1$. The tangent lies along a line of constant $\xi^2$ in two dimensions, or a plane of constant $\xi^2$ and $\xi^3$ in three dimensions.

Assuming that $r$ is differentiable and Taylor expanding the first term in (1.5) demonstrates that

$$t_1 = r(\xi^i) + \frac{\partial r}{\partial \xi^1} d\xi^1 - r(\xi^i) + \mathcal{O}\!\left((d\xi^1)^2\right),$$

which yields

$$t_1 = \frac{\partial r}{\partial \xi^1} d\xi^1,$$

if we neglect the (small) quadratic and higher-order terms. Note that exactly the same argument can be applied to increments in the $\xi^2$ and $\xi^3$ directions and because the $d\xi^i$ are scalar lengths, it follows that

$$g_i = \frac{\partial r}{\partial \xi^i}$$

is also a tangent vector in the $\xi^i$ direction, as claimed. Hence, using equation (1.4), we can compute tangent vectors in the general coordinate directions via

$$g_i = \frac{\partial r}{\partial \xi^i} = \frac{\partial x^K}{\partial \xi^i} e_K. \qquad (1.6)$$

We can interpret equation (1.6) as defining a local linear transformation between the Cartesian base vectors and our new tangent vectors $g_i$. The transformation is linear because $g_i$ is a linear combination of the vectors $e_K$, which should not be a surprise because we explicitly neglected the quadratic and higher terms in the Taylor expansion. The transformation is local because, in general, the coefficients will change with position. The coefficients of the transformation can be written as entries in a matrix, $M$, in which case equation (1.6) becomes

$$\begin{pmatrix} g_1 \\ g_2 \\ g_3 \end{pmatrix} = \underbrace{\begin{pmatrix} \partial x^1/\partial\xi^1 & \partial x^2/\partial\xi^1 & \partial x^3/\partial\xi^1 \\ \partial x^1/\partial\xi^2 & \partial x^2/\partial\xi^2 & \partial x^3/\partial\xi^2 \\ \partial x^1/\partial\xi^3 & \partial x^2/\partial\xi^3 & \partial x^3/\partial\xi^3 \end{pmatrix}}_{M} \begin{pmatrix} e_1 \\ e_2 \\ e_3 \end{pmatrix}. \qquad (1.7)$$

Provided that the transformation is non-singular (the determinant of the matrix $M$ is non-zero) the tangent vectors will also be a basis of the space and they are called covariant base vectors because the transformation preserves the tangency of the vectors to the coordinates. In general, the covariant base vectors are neither orthogonal nor of unit length. It is also important to note that the covariant base vectors will usually be functions of position.

Example 1.1. Finding the covariant base vectors for plane polar coordinates

A plane polar coordinate system is defined by the two coordinates $\xi^1 = r$, $\xi^2 = \theta$ such that $x = x^1 = r\cos\theta$ and $y = x^2 = r\sin\theta$. Find the covariant base vectors.

Solution 1.1. The position vector is given by

$$r = x^1 e_1 + x^2 e_2 = r\cos\theta\, e_1 + r\sin\theta\, e_2 = \xi^1\cos\xi^2\, e_1 + \xi^1\sin\xi^2\, e_2,$$

and using the definition (1.6) gives

$$g_1 = \frac{\partial r}{\partial \xi^1} = \cos\xi^2\, e_1 + \sin\xi^2\, e_2, \quad\text{and}\quad g_2 = \frac{\partial r}{\partial \xi^2} = -\xi^1\sin\xi^2\, e_1 + \xi^1\cos\xi^2\, e_2.$$

Note that $g_1$ is a unit vector,

$$|g_1| = \sqrt{g_1 \cdot g_1} = \sqrt{\cos^2\xi^2 + \sin^2\xi^2} = 1,$$

but $g_2$ is not, $|g_2| = \xi^1$. The vectors are orthogonal, $g_1 \cdot g_2 = 0$, and are related to the standard orthonormal polar base vectors via $g_1 = e_r$ and $g_2 = r\, e_\theta$.
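As a quick cross-check of Example 1.1 (an illustrative aside, not part of the original notes, assuming sympy is available), we can build the position vector symbolically and differentiate it with respect to each coordinate:

```python
# Minimal sympy sketch of Example 1.1: covariant base vectors for plane polars.
import sympy as sp

xi1, xi2 = sp.symbols('xi1 xi2', positive=True)  # xi1 = r, xi2 = theta

# Cartesian components of the position vector r(xi1, xi2)
r = sp.Matrix([xi1 * sp.cos(xi2), xi1 * sp.sin(xi2)])

g1 = r.diff(xi1)  # tangent vector in the xi1 (radial) direction
g2 = r.diff(xi2)  # tangent vector in the xi2 (azimuthal) direction

print(sp.simplify(g1.dot(g1)))  # 1       -> |g_1| = 1
print(sp.simplify(g2.dot(g2)))  # xi1**2  -> |g_2| = xi1
print(sp.simplify(g1.dot(g2)))  # 0       -> the vectors are orthogonal
```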

1.1.4 Contravariant base vectors

Lecture 2

The fact that the covariant basis is not necessarily orthonormal makes life somewhat awkward. For orthonormal systems we are used to the fact that when $a = a^K e_K$, then unique components can be obtained via a dot product^4:

$$a \cdot e_I = a^K e_K \cdot e_I = a^I, \qquad (1.8)$$

where the last equality is a consequence of the orthonormality. In our covariant basis, we would write $a = a^k g_k$, so that

$$a \cdot g_i = a^k g_k \cdot g_i, \qquad (1.9)$$

but no further simplification can be made. Equations (1.9) are a linear system of simultaneous equations that must be solved in order to find the values of $a^k$, which is considerably more effort than using an explicit formula such as (1.8). The explicit formula (1.8) arises because the matrix $e_K \cdot e_I$ is diagonal, which means that the equations decouple and are no longer simultaneous.

^4 The dot or scalar product is an operation on two vectors that returns a unique scalar: the product of the lengths of the two vectors and the cosine of the angle between them. In the present context we only need to know that for orthogonal vectors the dot product is zero and the dot product of a unit vector with itself is one, see §1.3.1 for further discussion.

We can, however, recover most of the nice properties of an orthonormal coordinate system if we define another set of base vectors that are each orthogonal to two of the covariant base vectors and have unit length when projected in the direction of the remaining covariant base vector. In other words, the new vectors $g^i$ are defined such that

$$g^i \cdot g_j = \delta^i_j \equiv \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{otherwise,} \end{cases} \qquad (1.10)$$

where the object $\delta^i_j$ is known as the Kronecker delta. In orthonormal coordinate systems the two sets of base vectors coincide; for example, in our global Cartesian coordinates $e^I \equiv e_I$. We can decompose $g^i$ into its components in the Cartesian basis, $g^i = g^i_{\ K} e^K$, where we have used the raised index on the base vectors for consistency with our summation convention. Note that $g^i_{\ K}$ is thus defined to be the $K$-th Cartesian component of the $i$-th contravariant base vector. From the definition (1.10) and (1.6)

$$\left(g^i_{\ K} e^K\right) \cdot \left(\frac{\partial x^L}{\partial \xi^j} e_L\right) = g^i_{\ K} \frac{\partial x^L}{\partial \xi^j}\, e^K \cdot e_L = g^i_{\ K} \frac{\partial x^L}{\partial \xi^j}\, \delta^K_L = g^i_{\ L} \frac{\partial x^L}{\partial \xi^j} = \delta^i_j. \qquad (1.11)$$

Note that we have used the "index-switching" property of the Kronecker delta to write $g^i_{\ K}\delta^K_L = g^i_{\ L}$, which can be verified by writing out all terms explicitly. Multiplying both sides of equation (1.11) by $\partial\xi^j/\partial x^K$ yields

$$g^i_{\ L} \frac{\partial x^L}{\partial \xi^j} \frac{\partial \xi^j}{\partial x^K} = \delta^i_j \frac{\partial \xi^j}{\partial x^K} = \frac{\partial \xi^i}{\partial x^K};$$

and from the chain rule

$$\frac{\partial x^L}{\partial \xi^j}\frac{\partial \xi^j}{\partial x^K} = \frac{\partial x^L}{\partial x^K} = \delta^L_K,$$

because the Cartesian coordinates are independent. Hence,

$$g^i_{\ L}\, \delta^L_K = g^i_{\ K} = \frac{\partial \xi^i}{\partial x^K},$$

and so the new set of base vectors are

$$g^i = \frac{\partial \xi^i}{\partial x^K} e^K. \qquad (1.12)$$

The equation (1.12) defines a local linear transformation between the Cartesian base vectors and the vectors $g^i$. In a matrix representation, equation (1.12) is

$$\begin{pmatrix} g^1 \\ g^2 \\ g^3 \end{pmatrix} = \underbrace{\begin{pmatrix} \partial\xi^1/\partial x^1 & \partial\xi^1/\partial x^2 & \partial\xi^1/\partial x^3 \\ \partial\xi^2/\partial x^1 & \partial\xi^2/\partial x^2 & \partial\xi^2/\partial x^3 \\ \partial\xi^3/\partial x^1 & \partial\xi^3/\partial x^2 & \partial\xi^3/\partial x^3 \end{pmatrix}}_{M^{-T}} \begin{pmatrix} e^1 \\ e^2 \\ e^3 \end{pmatrix}, \qquad (1.13)$$

and we see that the new transformation is the inverse transpose^5 of the linear transformation that defines the covariant base vectors (1.6). For this reason, the vectors $g^i$ are called contravariant base vectors.

^5 That the inverse matrix is given by

$$M^{-1} = \begin{pmatrix} \partial\xi^1/\partial x^1 & \partial\xi^2/\partial x^1 & \partial\xi^3/\partial x^1 \\ \partial\xi^1/\partial x^2 & \partial\xi^2/\partial x^2 & \partial\xi^3/\partial x^2 \\ \partial\xi^1/\partial x^3 & \partial\xi^2/\partial x^3 & \partial\xi^3/\partial x^3 \end{pmatrix},$$

can be confirmed by checking that $MM^{-1} = M^{-1}M = I$, the identity matrix. Alternatively, the relationship follows directly from equation (1.11) written in matrix form.

Example 1.2. Finding the contravariant base vectors for plane polar coordinates

For the plane polar coordinate system defined in Example 1.1, find the contravariant base vectors.

Solution 1.2. The contravariant base vectors are defined by equation (1.12) and in order to use that equation directly, we must express our polar coordinates as functions of the Cartesian coordinates

$$r = \xi^1 = \sqrt{x^1 x^1 + x^2 x^2}, \quad\text{and}\quad \tan\theta = \tan\xi^2 = \frac{x^2}{x^1},$$

and then we can compute

$$\frac{\partial\xi^1}{\partial x^1} = \cos\xi^2, \quad \frac{\partial\xi^1}{\partial x^2} = \sin\xi^2, \quad \frac{\partial\xi^2}{\partial x^1} = -\frac{\sin\xi^2}{\xi^1} \quad\text{and}\quad \frac{\partial\xi^2}{\partial x^2} = \frac{\cos\xi^2}{\xi^1}.$$

Thus, using the transformation (1.13), we have

$$g^1 = \frac{\partial\xi^1}{\partial x^1} e^1 + \frac{\partial\xi^1}{\partial x^2} e^2 = \cos\xi^2\, e_1 + \sin\xi^2\, e_2 = g_1,$$

where we have used the fact that $e_I = e^I$, and also

$$g^2 = \frac{\partial\xi^2}{\partial x^1} e^1 + \frac{\partial\xi^2}{\partial x^2} e^2 = -\frac{\sin\xi^2}{\xi^1} e_1 + \frac{\cos\xi^2}{\xi^1} e_2 = \frac{1}{(\xi^1)^2} g_2.$$

We can now easily verify that $g^i \cdot g_j = \delta^i_j$. An alternative (and often easier) approach is to find the contravariant base vectors by finding the inverse of the matrix $M$ that defines the covariant base vectors and using equation (1.13).
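A sketch of this alternative route for plane polars (again an illustrative aside, assuming sympy): build $M$ from the Jacobian, invert and transpose it, and read the contravariant base vectors off the rows.

```python
# Contravariant base vectors via the inverse transpose of M, eq. (1.13).
import sympy as sp

xi1, xi2 = sp.symbols('xi1 xi2', positive=True)
x = sp.Matrix([xi1*sp.cos(xi2), xi1*sp.sin(xi2)])   # x^K(xi^i)

M = x.jacobian(sp.Matrix([xi1, xi2])).T   # rows of M are g_1, g_2 (eq. 1.7)
Minv_T = M.inv().T                        # rows are g^1, g^2 (eq. 1.13)

print(sp.simplify(Minv_T.row(1)))         # (-sin(xi2)/xi1, cos(xi2)/xi1)
print(sp.simplify(Minv_T * M.T))          # identity: g^i . g_j = delta^i_j
```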

1.1.5 Components of vectors in covariant and contravariant bases

We can find the components of a vector $a$ in the covariant basis by taking the dot product with the appropriate contravariant base vectors

$$a = a^k g_k, \quad\text{where}\quad a^i = a \cdot g^i = a^k g_k \cdot g^i = a^k \delta^i_k = a^i. \qquad (1.14)$$

Similarly components of the vector $a$ in the contravariant basis are given by taking the dot product with the appropriate covariant base vectors

$$a = a_k g^k, \quad\text{where}\quad a_i = a \cdot g_i = a_k g^k \cdot g_i = a_k \delta^k_i = a_i. \qquad (1.15)$$

In fact, we can obtain the components of a general vector in either the covariant or contravariant basis directly from the Cartesian components. If $a = a^K e_K = a_K e^K$, then the components in the covariant basis associated with the curvilinear coordinates $\xi^i$ are

$$a^i = a \cdot g^i = a^K e_K \cdot g^i = a^K \frac{\partial\xi^i}{\partial x^J}\, e_K \cdot e^J = a^K \frac{\partial\xi^i}{\partial x^J}\, \delta^J_K = a^K \frac{\partial\xi^i}{\partial x^K},$$

a contravariant transform. Similarly, the components of the vector in the contravariant basis may be obtained by a covariant transform from the Cartesian components, and so

$$a_i = \frac{\partial x^K}{\partial\xi^i} a_K \qquad (1.16a)$$

and

$$a^i = \frac{\partial\xi^i}{\partial x^K} a^K. \qquad (1.16b)$$

1.1.6 Invariance of vectors (significance of index position)

Having established the need for two different types of transformations in curvilinear coordinate systems, we are now in a position to consider the significance of the raised and lowered indices in our summation convention. We shall insist that for an index to be lowered the object must transform covariantly under a change in coordinates and for an index to be raised the object must transform contravariantly under a change in coordinates^6. An important exception to this rule are the coordinates themselves: $\xi^i$ represents the three scalar coordinates, e.g. in spherical polar coordinates $\xi^1 = r$, $\xi^2 = \theta$ and $\xi^3 = \phi$; the $\xi^i$ are not the components of a vector and do not obey contravariant transformation rules.

^6 The logic for the choice of index location is the position of the generalised coordinate in the partial derivative defining the transformation:

$$g_i = \frac{\partial x^K}{\partial\xi^i} e_K \ \text{(lowered index)}, \qquad g^i = \frac{\partial\xi^i}{\partial x^K} e_K \ \text{(raised index)}.$$

Equation (1.16a) demonstrates that the components of a vector in the contravariant basis are indeed covariant, justifying the lowered index, and equation (1.16b) provides similar justification for the contravariance of components in the covariant basis. We shall now demonstrate that these transformation properties also follow directly from the requirement that a physical vector should be independent of the coordinate system. Consider a vector $a$, which can be written in the covariant or contravariant basis

$$a = a^i g_i = a_i g^i. \qquad (1.17)$$

We now consider a change in coordinates from $\xi^i$ to another general coordinate system $\chi^{\bar i}$. It will be of vital importance later on to know which index corresponds to which coordinate system so we have chosen to add an overbar to the index to distinguish components associated with the two coordinate systems, $\xi^i$ and $\chi^{\bar i}$. The covariant base vectors associated with $\chi^{\bar i}$ are then

$$g_{\bar i} \equiv \frac{\partial r}{\partial\chi^{\bar i}} = \frac{\partial x^J}{\partial\chi^{\bar i}} e_J = \frac{\partial\xi^k}{\partial\chi^{\bar i}}\frac{\partial x^J}{\partial\xi^k} e_J = \frac{\partial\xi^k}{\partial\chi^{\bar i}} g_k; \qquad (1.18)$$

and the transformation between $g_{\bar i}$ and $g_k$ is of the same (covariant) type as that between $g_i$ and $e_K$ in equation (1.6). The transformation is covariant because the "new" coordinate is the independent

variable in the partial derivative (it appears in the denominator). In our new basis, the vector $a = a^{\bar i} g_{\bar i}$ and because $a$ must remain invariant

$$a = a^i g_i = a^{\bar i} g_{\bar i}.$$

Using the transformation of the base vectors (1.18) to replace $g_{\bar i}$ gives

$$a^{\bar i} g_{\bar i} = a^{\bar i} \frac{\partial\xi^k}{\partial\chi^{\bar i}} g_k = a^k g_k \quad\Rightarrow\quad a^k = \frac{\partial\xi^k}{\partial\chi^{\bar i}} a^{\bar i}. \qquad (1.19)$$

Hence, the components of the vector must transform contravariantly because multiplying both sides of equation (1.19) by the inverse transpose transformation $\partial\chi^{\bar j}/\partial\xi^k$ gives

$$a^{\bar i} = \frac{\partial\chi^{\bar i}}{\partial\xi^k} a^k. \qquad (1.20)$$

This transformation is contravariant because the "new" coordinate is the dependent variable in the partial derivative (it appears in the numerator). A similar approach can be used to show that the components in the contravariant basis must transform covariantly in order to ensure that the vector remains invariant. Thus, the use of our summation convention ensures that the summed quantities remain invariant under coordinate transformations, which will be essential when deriving coordinate-independent physical laws.
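The following numeric sketch (an assumed example, not from the notes; the second coordinate system $\chi^1 = r^2$, $\chi^2 = \theta$ is hypothetical) illustrates equation (1.20): the components transform contravariantly while the vector itself is unchanged.

```python
# Numeric check of (1.20): contravariant components, invariant vector.
import numpy as np

r, th = 2.0, 0.7                       # the chosen evaluation point
a = np.array([1.3, -0.4])              # a fixed vector, Cartesian components

# rows are the covariant base vectors g_1, g_2 for plane polars
G_xi = np.array([[np.cos(th), np.sin(th)],
                 [-r*np.sin(th), r*np.cos(th)]])
# chain rule (1.18): gbar_i = (d xi^k / d chi^i) g_k, with r = sqrt(chi^1)
dxi_dchi = np.array([[1.0/(2.0*r), 0.0],   # dr/dchi^1, dr/dchi^2
                     [0.0,         1.0]])  # dth/dchi^1, dth/dchi^2
G_chi = dxi_dchi @ G_xi

a_xi = np.linalg.solve(G_xi.T, a)      # components a^i in the polar basis
a_chi = np.linalg.solve(G_chi.T, a)    # components a^ibar in the chi basis

dchi_dxi = np.linalg.inv(dxi_dchi)     # d chi^i / d xi^k
print(np.allclose(a_chi, dchi_dxi @ a_xi))          # True: rule (1.20)
print(np.allclose(G_xi.T @ a_xi, G_chi.T @ a_chi))  # True: same vector a
```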

Interpretation

The fact that base vectors and vector components must transform differently for the vector to remain invariant is actually quite obvious. Consider a one-dimensional Euclidean space in which $a = a^1 g_1$. If the base vector is rescaled by a factor $\lambda$^7 so that $g_{\bar 1} = \lambda g_1$ then to compensate the component must be rescaled by the factor $1/\lambda$: $a^{\bar 1} = \frac{1}{\lambda} a^1$. Note that for a $1 \times 1$ transformation matrix with entry $\lambda$, the inverse transpose is $1/\lambda$.

^7 In one dimension all we can do is rescale the length, although the scaling can vary with position.

1.1.7 Orthonormal coordinates

If the coordinates are orthonormal then, by construction, there is no distinction between the covariant and contravariant basis, $g_i = g^i$. Using equations (1.6) and (1.12), we see that

$$g_i = \frac{\partial x^K}{\partial\xi^i} e_K = g^i = \frac{\partial\xi^i}{\partial x^K} e_K,$$

and so

$$\frac{\partial x^K}{\partial\xi^i} = \frac{\partial\xi^i}{\partial x^K}. \qquad (1.21)$$

Hence, the covariant and contravariant transformations are identical in orthonormal coordinate systems, which means that there is no need to distinguish between raised and lowered indices. This simplification is adopted in many textbooks and the convention is to use only lowered indices. When working with orthonormal coordinates we will also adopt this convention for simplicity, but we must always make sure that we know when the coordinate system is orthonormal. It is for this reason that we have adopted the convention that upper case indices are used for orthonormal coordinates.

If the coordinate system is not known to be orthonormal, we will use lower case indices and must distinguish between the covariant and contravariant transformations. Condition (1.21) implies that

$$\frac{\partial x^K}{\partial\xi^i}\frac{\partial x^K}{\partial\xi^j} = \frac{\partial\xi^i}{\partial x^K}\frac{\partial x^K}{\partial\xi^j} = \delta^i_j. \qquad (1.22)$$

In the matrix representation, equation (1.22) is

$$MM^T = I \quad\Rightarrow\quad M^T M = I,$$

where $I$ is the identity matrix. In other words the components of the transformation form an orthogonal matrix. It follows that (all) orthonormal coordinates can only be generated by an orthogonal transformation from the reference Cartesians. This should not be a big surprise: any other transform will change the angles between the base vectors or their relative lengths, which destroys orthonormality. The argument is entirely reversible: if either the covariant or contravariant transform is orthogonal then the two transforms are identical and the new coordinate system is orthonormal.

An aside

Further intuition for the reason why the covariant and contravariant transformations are identical when the coordinate transform is orthogonal can be obtained as follows. Imagine that we have a general linear transformation represented as a matrix $M$ that acts on vectors such that components in the fixed Cartesian coordinate system $p = p^K e_K$ transform as follows

$$\tilde p^K = M^K_{\ J}\, p^J.$$

Note that the index $K$ does not have an overbar because $\tilde p^K$ is a component in the fixed Cartesian coordinate system, $e_K$. The transformation can, of course, also be applied to the base vectors of the fixed Cartesian coordinate system $e_I$,

$$[\tilde e_I]^K = M^K_{\ J}\, [e_I]^J,$$

where $[\,\cdot\,]^K$ indicates the $K$-th component of the base vector. Now, $[e_I]^J = \delta^J_I$ and it follows that

K K [e˜I ] =M I , which allows us to define the operation of the matrix components on the base vectors directly because K K e˜I = [e˜I ] eK =M I eK . (1.23a) Thus the operation of the transformation on the components is the transpose of its operation on 8 e K e the base vectors . We could write the new base vectors as I˜ =M I˜ K to be consistent with our previous notation, but this will probably lead to more confusion in the current exposition. Now consider a vectora that must remain invariant under our transformation. Let the vector a˜′ be the vector with the same numerical values of its components asa but with transformed base ′ K ′ vectors, i.e. a˜ =a e˜K . Thus, the vector a˜ will be a transformed version ofa. In order to ensure that the vector remains unchanged under transformation we must apply the appropriate inverse

transformation to $\tilde a'$ relative to the new base vectors, $\tilde e_I$. In other words, the transformation of the coordinates must be

$$\tilde a^K = [M^{-1}]^K_{\ J}\, \tilde a'^J = [M^{-1}]^K_{\ J}\, a^J, \qquad (1.23b)$$

where we have used the fact that $\tilde a'^J = a^J$ by definition. Using the two transformation equations (1.23a,b) we see that

$$\tilde a^K \tilde e_K = [M^{-1}]^K_{\ J}\, a^J\, M^L_{\ K}\, e_L = [M^{-1}]^K_{\ J} M^L_{\ K}\, a^J e_L = \delta^L_J\, a^J e_L = a^J e_J,$$

as required. Thus, we have the two results: (i) a general property of linear transformations is that the matrix representation of the transformation of vector components is the transpose of the matrix representation of the transformation of base vectors; (ii) in order to remain invariant the transform of the components of the vector must actually undergo the inverse of the coordinate transformation. Thus, the transformations of the base vectors and the coordinates coincide when the inverse transform is equal to its transpose, i.e. when the transform is orthogonal. If that all seems a bit abstract, then hopefully the following specific example will help make the ideas a little more concrete.

Example 1.3. Equivalence of covariant and contravariant transformations under orthogonal transformations

Consider a two-dimensional Cartesian coordinate system with base vectors $e_I$. A new coordinate system with base vectors $e_{\bar I}$ is obtained by rotation through an angle $\theta$ in the anticlockwise direction about the origin. Derive the transformations for the base vectors and components of a general vector and show that they are the same.

Solution 1.3. The original and rotated bases are shown in Figure 1.2(a) from which we determine that the new base vectors are given by

$$e_{\bar 1} = \cos\theta\, e_1 + \sin\theta\, e_2 \quad\text{and}\quad e_{\bar 2} = -\sin\theta\, e_1 + \cos\theta\, e_2.$$

[Figure 1.2 about here.]

Figure 1.2: (a) The base vectors $e_{\bar I}$ are the Cartesian base vectors $e_I$ rotated through an angle $\theta$ about the origin. (b) If the coordinates of the position vector $p$ are unchanged it is also rotated by $\theta$ to $p'$.

Consider a position vector $p = p^I e_I$ in the original basis. If we leave the coordinates unchanged then the new vector $p' = p^I e_{\bar I}$ is the original vector rotated by $\theta$, see Figure 1.2(b). We must therefore rotate the position vector $p'$ through an angle $-\theta$ relative to the fixed basis $e_I$, but this is actually equivalent to a positive rotation of the base vectors. Hence the transforms for the components of a vector and the base vectors are the same.

$$p^{\bar 1} = \cos\theta\, p^1 + \sin\theta\, p^2 \quad\text{and}\quad p^{\bar 2} = -\sin\theta\, p^1 + \cos\theta\, p^2.$$

1.2 Tensors

Lecture 3

Tensors are geometric objects that have magnitude and zero, one or many associated directions, but are linear in character. A more mathematically precise definition is to say that a tensor is a multilinear map, or alternatively an element of a tensor product of vector spaces, which is somewhat tautological and really not helpful at this point. The order (or degree or rank) of a tensor is the number of associated directions and so a scalar is a tensor of order zero and a vector is a tensor of order one. Many physical quantities, such as strain, stress, diffusivity and conductivity, are naturally expressed as tensors of order two. We have already seen an example of a tensor in our discussion of vectors: linear transformations from one set of vectors to another, e.g. the transformation from Cartesian to covariant base vectors, are second-order tensors. If the vectors represent physical objects, then they must not depend on the coordinate representation chosen. Hence, the linear transformation must also be independent of coordinates because the same vectors must always transform in the same way. We can write our linear transformation in a coordinate-independent manner as

$$a = \mathcal{M}(b), \qquad (1.24)$$

and the transformation $\mathcal{M}$ is a tensor of order two. In order to describe $\mathcal{M}$ precisely we must pick a specific coordinate system for each vector in equation (1.24). In the global Cartesian basis, equation (1.24) becomes

$$a_I e_I = \mathcal{M}(b_J e_J) = b_J\, \mathcal{M}(e_J), \qquad (1.25)$$

because it is a linear transformation. We now take the dot product with $e_K$ to obtain

a e ·e =b e · (e ) a =b e · (e ), I K I J K M J ⇒ K J K M J where the dot product is written on the left to indicate that we are taking the dot product after the linear transformation has operated on the base vectore J . Hence, we can write the operation of the transformation on the components in the form

$$a_I = M_{IJ}\, b_J, \qquad (1.26)$$

where $M_{IJ} = e_I \cdot \mathcal{M}(e_J)$. Equation (1.26) can be written in a matrix form to aid calculation

$$\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} M_{11} & M_{12} & M_{13} \\ M_{21} & M_{22} & M_{23} \\ M_{31} & M_{32} & M_{33} \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}.$$

The quantity $M_{IJ}$ represents the component of the transformed vector in the $I$-th Cartesian direction if the original vector is of unit length in the $J$-th direction. Hence, the quantity $M_{IJ}$ is meaningless without knowing the coordinate system associated with both $I$ and $J$. In fact, there is no need to choose the same coordinate system for $I$ and $J$. If we write the vector $a$ in the covariant basis, equation (1.25) becomes

$$a^i g_i = b_J\, \mathcal{M}(e_J).$$

Taking the dot product with the appropriate contravariant base vector gives

$$a^k = b_J\, g^k \cdot \mathcal{M}(e_J) = b_J \frac{\partial\xi^k}{\partial x^K}\, e^K \cdot \mathcal{M}(e_J),$$

which means that

$$a^k = M^k_{\ J}\, b_J = \frac{\partial\xi^k}{\partial x^K} M_{KJ}\, b_J \quad\Rightarrow\quad M^k_{\ J} = \frac{\partial\xi^k}{\partial x^K} M_{KJ}.$$

In other words the components of each (column) vector corresponding to a fixed second index in a coordinate representation of $\mathcal{M}$ must obey a contravariant transformation if the associated basis undergoes a covariant transform, i.e. the behaviour is exactly the same as for the components of a vector. If we now also represent the vector $b$ in the covariant basis, equation (1.25) becomes

$$a^i g_i = b^j\, \mathcal{M}(g_j).$$

Taking the dot product with the appropriate contravariant base vector gives

$$a^k = b^j\, g^k \cdot \mathcal{M}(g_j) = b^j \frac{\partial\xi^k}{\partial x^K}\frac{\partial x^J}{\partial\xi^j}\, e^K \cdot \mathcal{M}(e_J),$$

on using the linearity of the transformation. Hence,

$$a^k = M^k_{\ j}\, b^j = \frac{\partial\xi^k}{\partial x^K} M_{KJ} \frac{\partial x^J}{\partial\xi^j}\, b^j \quad\Rightarrow\quad M^k_{\ j} = \frac{\partial\xi^k}{\partial x^K} M_{KJ} \frac{\partial x^J}{\partial\xi^j},$$

and the components of each (row) vector associated with a fixed first index in a coordinate representation of $\mathcal{M}$ undergo a covariant transformation when the associated basis undergoes a covariant transform, i.e. the "opposite behaviour" to the components of a vector. The difference in behaviour between the two indices of the components of the linear transformation arises because one index corresponds to the basis of the "input" vector, whereas the other corresponds to the basis of the "output" vector. There is a sum over the second (input) index and the components of the vector $b$, and in order for this sum to remain invariant the transform associated with the second index must be the opposite to that of the components of the vector $b$, in other words the same as the transformation of the base vectors of that vector. The obvious relationships between components can easily be deduced when we represent our vectors in the contravariant basis,

$$a^i = M^{ij} b_j, \qquad a_i = M_i^{\ j} b_j, \qquad a_i = M_{ij} b^j. \qquad (1.27)$$

Many books term $M^{ij}$ a contravariant second-order tensor, $M_{ik}$ a covariant second-order tensor and $M_i^{\ j}$ a mixed second-order tensor, but they are simply representations of the same coordinate-independent object in different bases. Another more modern notation is to say that $M^{ij}$ is a type $(2,0)$ tensor, $M_{ij}$ is type $(0,2)$ and $M^i_{\ j}$ is a type $(1,1)$ tensor, which allows the distinction between mixed tensors of orders greater than two.

1.2.1 Invariance of second-order tensors

Let us now consider a general change of coordinates from $\xi^i$ to $\chi^{\bar i}$. Given that

$$a^i = M^i_{\ j}\, b^j, \qquad (1.28a)$$

we wish to find an expression for $M^{\bar i}_{\ \bar j}$ such that^9

$$a^{\bar i} = M^{\bar i}_{\ \bar j}\, b^{\bar j}. \qquad (1.28b)$$

Using the transformation rules for the components of vectors (1.19) it follows that (1.28a) becomes

$$\frac{\partial\xi^i}{\partial\chi^{\bar n}} a^{\bar n} = M^i_{\ j} \frac{\partial\xi^j}{\partial\chi^{\bar n}} b^{\bar n}.$$

We now multiply both sides by $\partial\chi^{\bar m}/\partial\xi^i$ to obtain

$$\frac{\partial\chi^{\bar m}}{\partial\xi^i}\frac{\partial\xi^i}{\partial\chi^{\bar n}} a^{\bar n} = \delta^{\bar m}_{\bar n}\, a^{\bar n} = a^{\bar m} = \frac{\partial\chi^{\bar m}}{\partial\xi^i} M^i_{\ j} \frac{\partial\xi^j}{\partial\chi^{\bar n}} b^{\bar n}.$$

Comparing this expression to equation (1.28b) it follows that

$$M^{\bar i}_{\ \bar j} = \frac{\partial\chi^{\bar i}}{\partial\xi^n} M^n_{\ m} \frac{\partial\xi^m}{\partial\chi^{\bar j}},$$

and thus we see that covariant components must transform covariantly and contravariant components must transform contravariantly in order for the invariance properties to hold. Similarly, it can be shown that

$$M^{\bar i\bar j} = \frac{\partial\chi^{\bar i}}{\partial\xi^n}\frac{\partial\chi^{\bar j}}{\partial\xi^m} M^{nm}, \quad\text{and}\quad M_{\bar i\bar j} = \frac{\partial\xi^n}{\partial\chi^{\bar i}}\frac{\partial\xi^m}{\partial\chi^{\bar j}} M_{nm}. \qquad (1.29)$$

An alternative definition of tensors is to require that they are sets of indexed quantities (multi-dimensional arrays) that obey these transformation laws under a change of coordinates. The transformations can be expressed in matrix form, but we must distinguish between the covariant and contravariant cases. We shall write $M^\flat$ to indicate a matrix where all components transform covariantly and $M^\sharp$ for the contravariant case. We define the transformation matrix $F$ to have the components

$$F = \begin{pmatrix} \partial\chi^1/\partial\xi^1 & \partial\chi^1/\partial\xi^2 & \partial\chi^1/\partial\xi^3 \\ \partial\chi^2/\partial\xi^1 & \partial\chi^2/\partial\xi^2 & \partial\chi^2/\partial\xi^3 \\ \partial\chi^3/\partial\xi^1 & \partial\chi^3/\partial\xi^2 & \partial\chi^3/\partial\xi^3 \end{pmatrix}, \quad\text{or}\quad F^{\bar i}_{\ j} = \frac{\partial\chi^{\bar i}}{\partial\xi^j};$$

and then from the chain rule and independence of coordinates

$$F^{-1} = \begin{pmatrix} \partial\xi^1/\partial\chi^1 & \partial\xi^1/\partial\chi^2 & \partial\xi^1/\partial\chi^3 \\ \partial\xi^2/\partial\chi^1 & \partial\xi^2/\partial\chi^2 & \partial\xi^2/\partial\chi^3 \\ \partial\xi^3/\partial\chi^1 & \partial\xi^3/\partial\chi^2 & \partial\xi^3/\partial\chi^3 \end{pmatrix}, \quad\text{or}\quad [F^{-1}]^i_{\ \bar j} = \frac{\partial\xi^i}{\partial\chi^{\bar j}}.$$

If $\bar M^\sharp$ and $\bar M^\flat$ represent the matrices of transformed components then the transformation laws (1.29) become

$$\bar M^\sharp = F M^\sharp F^T, \quad\text{and}\quad \bar M^\flat = F^{-T} M^\flat F^{-1}. \qquad (1.30)$$

^9 This is a place where the use of overbars makes the notation look cluttered, but clarifies precisely which coordinate system is associated with each index. This notation also allows the representation of components in the two different coordinate systems, so-called two-point tensors, e.g. $M^{\bar i}_{\ j}$, which will be useful.

1.2.2 Cartesian tensors

If we restrict attention to orthonormal coordinate systems, then the transformation between coordinate systems must be orthogonal^10 and we do not need to distinguish between covariant and contravariant behaviour.

^10 Although the required orthogonal transformation may vary with position, as is the case in plane polar coordinates.

Consider the transformation from our Cartesian basis $e_I$ to another orthonormal basis $e_{\bar I}$. The transformation rules for components of a tensor of order two become

$$M_{\bar I\bar J} = \frac{\partial x^N}{\partial x^{\bar I}}\frac{\partial x^M}{\partial x^{\bar J}} M_{NM}.$$

The transformation between components of two vectors in the different bases are given by

$$a_{\bar I} = \frac{\partial x^{\bar I}}{\partial x^K} a_K = \frac{\partial x^K}{\partial x^{\bar I}} a_K,$$

which can be written in the form

$$a_{\bar I} = Q_{\bar I K}\, a_K, \quad\text{where}\quad Q_{\bar I K} = \frac{\partial x^{\bar I}}{\partial x^K} = \frac{\partial x^K}{\partial x^{\bar I}},$$

and the components $Q_{\bar I K}$ form an orthogonal matrix. Hence the transformation property of a (Cartesian) tensor of order two can be written as

$$M_{\bar I\bar J} = Q_{\bar I N}\, M_{NM}\, Q_{\bar J M}, \quad\text{or in matrix form}\quad \bar M = Q M Q^T. \qquad (1.31)$$

In many textbooks, equation (1.31) is defined to be the transformation rule satisfied by a (Cartesian) tensor of order two.
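A short numerical illustration of (1.31) (not from the notes; the rotation matrix and components are assumed examples): the relation $a = Mb$ is preserved when both vectors and the tensor are transformed by the same orthogonal $Q$.

```python
# Numeric check of the Cartesian tensor rule (1.31), Mbar = Q M Q^T.
import numpy as np

th = 0.3
Q = np.array([[np.cos(th),  np.sin(th), 0.0],
              [-np.sin(th), np.cos(th), 0.0],
              [0.0,         0.0,        1.0]])  # rotation about e_3

M = np.arange(9.0).reshape(3, 3)                # arbitrary components M_NM
b = np.array([-1.0, 0.5, 2.0])

Mbar = Q @ M @ Q.T                              # transformed tensor components
abar, bbar = Q @ (M @ b), Q @ b                 # transformed vector components
print(np.allclose(Mbar @ bbar, abar))           # True: a = M b in both bases
print(np.allclose(Q @ Q.T, np.eye(3)))          # True: Q is orthogonal
```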

1.2.3 Tensors vs matrices

There is a natural relationship between tensors and matrices because, as we have seen, we can write the components of a second-order tensor in a particular coordinate system as a matrix. It is often helpful to think of a tensor as a matrix when working with it, but the two concepts are distinct. A summary of all the above is that a tensor is a geometric object that does not depend on any particular coordinate system and expresses a linear relationship between other geometric objects.

1.3 Products of vectors: scalar, vector and tensor

1.3.1 Scalar product

(Required reading; not covered in lectures.)

We have already used the scalar or dot product of two vectors, and the discussion here is included only for completeness. The scalar product is the product of two vectors that returns a unique scalar: the product of the lengths of the vectors and the cosine of the angle between them. Thus far, we have only used the dot product to define orthonormal sets of vectors. If we represent two vectors $a$ and $b$ in the co- and contravariant bases, then

$$a \cdot b = (a_i g^i) \cdot (b^j g_j),$$

and so

$$a \cdot b = a_i b^j\, g^i \cdot g_j = a_i b^j\, \delta^i_j = a_i b^i.$$

An alternative decomposition demonstrates that

$$a \cdot b = a^i b_i,$$

and we note that the scalar product is invariant under coordinate transformation, as expected. In orthonormal coordinate systems, there is no distinction between co- and contravariant bases and so $a \cdot b = a_K b_K$.
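A small symbolic check of the invariance $a \cdot b = a_i b^i$ (an illustrative aside, not part of the notes), reusing the plane polar bases of Examples 1.1 and 1.2:

```python
# Check that a_i b^i reproduces the Cartesian dot product in plane polars.
import sympy as sp

xi1, xi2 = sp.symbols('xi1 xi2', positive=True)
g1 = sp.Matrix([sp.cos(xi2), sp.sin(xi2)])            # covariant g_1
g2 = sp.Matrix([-xi1*sp.sin(xi2), xi1*sp.cos(xi2)])   # covariant g_2
gu1 = sp.Matrix([sp.cos(xi2), sp.sin(xi2)])           # contravariant g^1
gu2 = sp.Matrix([-sp.sin(xi2)/xi1, sp.cos(xi2)/xi1])  # contravariant g^2

a = sp.Matrix([3, -1])   # fixed vectors, Cartesian components
b = sp.Matrix([2, 5])

a_lo = [a.dot(g1), a.dot(g2)]     # covariant components a_i = a.g_i
b_up = [b.dot(gu1), b.dot(gu2)]   # contravariant components b^i = b.g^i
print(sp.simplify(a_lo[0]*b_up[0] + a_lo[1]*b_up[1]))  # 1 = a.b
```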

1.3.2 Vector product

The vector or cross product is a product of two vectors that returns a unique vector that is orthogonal to both vectors. In orthonormal coordinate systems, the vector product is defined by

$$a \times b = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \times \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = \begin{pmatrix} a_2 b_3 - b_2 a_3 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}.$$

In order to represent the vector product in index notation it is convenient to define a quantity known as the alternating, or Levi-Civita, symbol $e_{IJK}$. In orthonormal coordinate systems the components of $e_{IJK}$ are defined by

$$e^{IJK} = e_{IJK} = \begin{cases} 0 & \text{when any two indices are equal;} \\ +1 & \text{when } I,J,K \text{ is an even permutation of } 1,2,3; \\ -1 & \text{when } I,J,K \text{ is an odd permutation of } 1,2,3; \end{cases} \qquad (1.32)$$

e.g. $e_{112} = e_{122} = 0$, $e_{123} = e_{312} = e_{231} = 1$, $e_{213} = e_{132} = e_{321} = -1$. Strictly speaking $e_{IJK}$ thus defined is not a tensor because if the handedness of the coordinate system changes then the sign of the entries in $e_{IJK}$ should change in order for it to respect the appropriate invariance properties; such objects are sometimes called pseudo-tensors. We could ensure that $e_{IJK}$ is a tensor by restricting our definition to right-handed (or left-handed) orthonormal systems, which will be the approach taken in later chapters. The vector product of two vectors $a$ and $b$ in orthonormal coordinates is

$$[a \times b]_I = e_{IJK}\, a_J b_K, \qquad (1.33)$$

which can be confirmed by writing out all the components. In addition, the relationship between the Cartesian base vectors $e_I$ can be expressed as a vector product using the alternating tensor

$$e_I \times e_J = e_{IJK}\, e_K. \qquad (1.34)$$

Let us now consider the case of general coordinates: the cross product between covariant base vectors is given by

$$g_i \times g_j = \frac{\partial x^I}{\partial\xi^i} e_I \times \frac{\partial x^J}{\partial\xi^j} e_J = \frac{\partial x^I}{\partial\xi^i}\frac{\partial x^J}{\partial\xi^j}\, e_I \times e_J = \frac{\partial x^I}{\partial\xi^i}\frac{\partial x^J}{\partial\xi^j}\, e_{IJK}\, e_K.$$

The expression on the right-hand side corresponds to the first two indices of the alternating tensor undergoing a covariant transformation so that

$$g_i \times g_j = e_{ijK}\, e_K.$$

If we now transform the third index covariantly we must transform the base vector contravariantly so that

$$g_i \times g_j = \epsilon_{ijk} \frac{\partial\xi^k}{\partial x^K} e_K = \epsilon_{ijk}\, g^k; \qquad (1.35)$$

where $\epsilon_{ijk} \equiv \frac{\partial x^I}{\partial\xi^i}\frac{\partial x^J}{\partial\xi^j}\frac{\partial x^K}{\partial\xi^k}\, e_{IJK}$. A similar argument shows that

$$g^i \times g^j = \epsilon^{ijk}\, g_k,$$

where $\epsilon^{ijk} \equiv \frac{\partial\xi^i}{\partial x^I}\frac{\partial\xi^j}{\partial x^J}\frac{\partial\xi^k}{\partial x^K}\, e_{IJK}$. If we decompose the vectors $a$ and $b$ into the contravariant basis we have

$$a \times b = (a_i g^i) \times (b_j g^j) = a_i b_j\, g^i \times g^j = a_i b_j\, \epsilon^{ijk} g_k.$$

Thus, if we decompose the vector product into the covariant basis we have the following expression for the components

$$[a \times b]^k = \epsilon^{ijk}\, a_i b_j, \quad\text{or}\quad [a \times b]^i = \epsilon^{ijk}\, a_j b_k.$$

1.3.3 Tensor product

The tensor product is a product of two vectors that returns a second-order tensor. It can be motivated by the following discussion. Recall that equation (1.24) can be written in the form

$$a_i = M_{ij}\, b^j, \quad\text{where}\quad M_{ij} = g_i \cdot \mathcal{M}(g_j).$$

The components $M_{ij}$ correspond to the representation of the tensor $\mathcal{M}$ with respect to a basis, but which basis? We shall define the basis to be that formed from the tensor product of pairs of base vectors: $g^i \otimes g^j$, where the symbol $\otimes$ is used to denote the tensor product. Hence, we can represent a tensor $\mathcal{M}$ in the different forms

$$\mathcal{M} = M_{ij}\, g^i \otimes g^j = M_{IJ}\, e_I \otimes e_J = M^i_{\ j}\, g_i \otimes g^j = \cdots,$$

$$a = a_I e_I = a_i g^i = a^i g_i.$$

Returning to equation (1.24), we have

$$a = \mathcal{M}(b) \quad\Rightarrow\quad a = (M_{ij}\, g^i \otimes g^j)(b),$$

and because $M_{ij}$ are just coefficients it follows that $g^i \otimes g^j$ are themselves tensors of second order^11.

^11 You should think carefully to convince yourself that this is true.

Decomposing $a$ and $b$ into the contravariant and covariant bases respectively gives

$$a_i g^i = (M_{ij}\, g^i \otimes g^j)(b^n g_n) = M_{ij} b^n\, (g^i \otimes g^j)(g_n), \qquad (1.36)$$

but from equation (1.27)

$$a_i = M_{ij} b^j \quad\Rightarrow\quad a_i g^i = M_{ij} b^j g^i = M_{ij} b^n\, \delta^j_n\, g^i. \qquad (1.37)$$

Equating (1.36) and (1.37), it follows that the tensor product acting on the base vectors is

$$(g^i \otimes g^j)(g_n) = \delta^j_n\, g^i = (g^j \cdot g_n)\, g^i. \qquad (1.38)$$

Equation (1.38) motivates an alternative definition of the tensor product, which in the case of the product of two vectors is also called the dyadic or outer product. The dyadic product of two vectors $a$ and $b$ is a second-order tensor that is defined through its action on an arbitrary vector $v$:

$$(a \otimes b)(v) = (b \cdot v)\, a. \qquad (1.39)$$

Consider the tensor $\mathcal{T} = a \otimes b$; the components of that tensor are given by

$$T_{ij} = g_i \cdot \mathcal{T}(g_j) = g_i \cdot (b \cdot g_j)\, a = (g_i \cdot a)(b \cdot g_j) = a_i b_j,$$

which gives a third (equivalent) definition of the tensor product. Alternatively, we can write

$$T^i_{\ j} = a^i b_j \quad\text{or}\quad T^{ij} = a^i b^j \quad\text{or}\quad T_i^{\ j} = a_i b^j. \qquad (1.40)$$

Note that we can write

$$\mathcal{T} = a \otimes b = a_i g^i \otimes b_j g^j = a_i b_j\, g^i \otimes g^j = T_{ij}\, g^i \otimes g^j,$$

again demonstrating that $T_{ij}$ are the components of the tensor $\mathcal{T}$ in the basis $g^i \otimes g^j$. The tensor product can easily be extended to tensors of arbitrary order via its component-wise definition, e.g.

$$T^{i\ jk}_{\ mn} = A^i_{\ mn}\, B^{jk}.$$
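The dyadic definition (1.39) is easy to test numerically in a Cartesian basis (an illustrative sketch, not from the notes; the vectors are arbitrary):

```python
# Check (a x b)(v) = (b.v) a with Cartesian components T_IJ = a_I b_J.
import numpy as np

a = np.array([1.0, 2.0, 0.5])
b = np.array([-1.0, 0.0, 3.0])
v = np.array([0.2, 0.7, 1.0])

T = np.outer(a, b)                            # dyadic components T_IJ = a_I b_J
print(np.allclose(T @ v, np.dot(b, v) * a))   # True: (a x b)(v) = (b.v) a
```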

1.4 The metric tensor

Lecture 4

We can reinterpret the Kronecker delta as a tensor with components in the global Cartesian coordinate system, so that

$$\delta_{IJ} = \delta^{IJ} = \delta^I_{\ J} \equiv \begin{cases} 1, & \text{if } I = J, \\ 0, & \text{otherwise.} \end{cases}$$

If we transform the Kronecker delta into the general coordinate system $\xi^i$, then we obtain the so-called metric tensors

$$g_{ij} \equiv \frac{\partial x^I}{\partial\xi^i}\frac{\partial x^J}{\partial\xi^j}\, \delta_{IJ} = \frac{\partial x^I}{\partial\xi^i}\frac{\partial x^I}{\partial\xi^j} = g_i \cdot g_j, \qquad (1.41a)$$

$$g^{ij} \equiv \frac{\partial\xi^i}{\partial x^I}\frac{\partial\xi^j}{\partial x^J}\, \delta^{IJ} = \frac{\partial\xi^i}{\partial x^I}\frac{\partial\xi^j}{\partial x^I} = g^i \cdot g^j, \qquad (1.41b)$$

$$g^i_{\ j} \equiv \frac{\partial\xi^i}{\partial x^I}\frac{\partial x^J}{\partial\xi^j}\, \delta^I_{\ J} = \frac{\partial\xi^i}{\partial x^I}\frac{\partial x^I}{\partial\xi^j} = \delta^i_j. \qquad (1.41c)$$

We note that all the metric tensors are symmetric, $g_{ij} = g_{ji}$, and that the mixed metric tensor $g^i_{\ j}$ is invariant and equal to $\delta^i_j$ in all coordinate systems. From equations (1.41a,b) we deduce that

$$g^{ij} g_{jk} = \frac{\partial\xi^i}{\partial x^I}\frac{\partial\xi^j}{\partial x^I}\frac{\partial x^J}{\partial\xi^j}\frac{\partial x^J}{\partial\xi^k} = \frac{\partial\xi^i}{\partial x^I}\, \delta^J_I\, \frac{\partial x^J}{\partial\xi^k} = \frac{\partial\xi^i}{\partial x^I}\frac{\partial x^I}{\partial\xi^k} = \delta^i_k, \qquad (1.42)$$

in other words if we represent the components $g_{ij}$ as a matrix, we can find the components $g^{ij}$ by taking the inverse of that matrix, i.e.

$$g^{ij} = \frac{D^{ij}}{g}, \qquad (1.43)$$

where $D^{ij}$ is the matrix of cofactors of the matrix of coefficients $g_{ij}$ and $g$ is the determinant of the matrix $g_{ij}$

$$g \equiv |g_{ij}| = \left|\frac{\partial x^I}{\partial\xi^j}\right|^2 \quad\text{and}\quad |g^{ij}| = \left|\frac{\partial\xi^i}{\partial x^J}\right|^2 = \frac{1}{g}, \qquad (1.44)$$

which demonstrates that the determinants are positive.
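For plane polars all of these objects can be computed in a few lines (an illustrative aside, assuming sympy):

```python
# Metric tensors for plane polars: g_ij, its inverse g^ij, and det(g_ij).
import sympy as sp

xi1, xi2 = sp.symbols('xi1 xi2', positive=True)
x = sp.Matrix([xi1*sp.cos(xi2), xi1*sp.sin(xi2)])
M = x.jacobian(sp.Matrix([xi1, xi2])).T   # rows are the covariant g_i

g_lo = sp.simplify(M * M.T)     # g_ij = g_i . g_j, eq. (1.41a)
g_up = sp.simplify(g_lo.inv())  # g^ij from the matrix inverse, eq. (1.43)

print(g_lo)                     # Matrix([[1, 0], [0, xi1**2]])
print(g_up)                     # Matrix([[1, 0], [0, xi1**(-2)]])
print(sp.simplify(g_lo.det()))  # xi1**2, so sqrt(g) = xi1
```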

1.4.1 Properties of the metric tensor

Line elements

We have already seen that an infinitesimal line element corresponding to a change $d\xi^i$ in the $\xi^i$ coordinate is given by

$$t_i = ds_i = \frac{\partial r}{\partial\xi^i}\, d\xi^i = g_i\, d\xi^i \quad (\text{not summed}).$$

Hence, a general infinitesimal line element corresponding to changes in all coordinates is given by

$$dr = g_i\, d\xi^i \quad (\text{summed over } i),$$

and its length squared is

$$ds^2 = dr \cdot dr = g_i\, d\xi^i \cdot g_j\, d\xi^j = g_{ij}\, d\xi^i d\xi^j.$$

Thus, the infinitesimal length change due to increments $d\xi^i$ in general coordinates is given by $\sqrt{g_{ij}\, d\xi^i d\xi^j}$, which explains the use of the term metric to describe the tensor $g_{ij}$.

Surface elements

Any surface on which $\xi^1$ is constant is spanned by the covariant base vectors $g_2$ and $g_3$ and so an infinitesimal area within that surface is given by

$$dS_{(1)} = |ds_2 \times ds_3| = |g_2 \times g_3|\, d\xi^2 d\xi^3,$$

and

$$|g_2 \times g_3| = |\epsilon_{231}\, g^1| = |\epsilon_{231}|\, |g^1| = \left|\frac{\partial x^I}{\partial\xi^2}\frac{\partial x^J}{\partial\xi^3}\frac{\partial x^K}{\partial\xi^1}\, e_{IJK}\right| \sqrt{g^1 \cdot g^1}. \qquad (1.45)$$

The term between the modulus signs is the determinant^12 of the matrix with components $\partial x^I/\partial\xi^j$, so from the definition of the determinant of the metric tensor (1.44),

$$dS_{(1)} = \sqrt{g\, g^{11}}\, d\xi^2 d\xi^3.$$

Hence, an element of area in the surface on which $\xi^i$ is constant is

$$dS_{(i)} = \sqrt{g\, g^{ii}}\, d\xi^j d\xi^k, \quad (i \neq j \neq k \text{ and } i \text{ not summed}).$$

^12 It is the scalar triple product of the three rows of the matrix.

Volume elements

An infinitesimal parallelepiped spanned by the vectors $ds_i$ has volume given by

$$dV = |ds_1 \cdot (ds_2 \times ds_3)| = |g_1 \cdot (g_2 \times g_3)|\, d\xi^1 d\xi^2 d\xi^3 = |g_1 \cdot \epsilon_{231}\, g^1|\, d\xi^1 d\xi^2 d\xi^3.$$

Now

$$|g_1 \cdot \epsilon_{231}\, g^1| = |\epsilon_{231}| = \sqrt{g},$$

using the same argument as in equation (1.45). Thus, the volume element is given by

$$dV = \sqrt{g}\, d\xi^1 d\xi^2 d\xi^3, \qquad (1.46)$$

which is, of course, equivalent to the standard expression for change of coordinates in multiple integrals

$$dV = \left|\frac{\partial(x^1, x^2, x^3)}{\partial(\xi^1, \xi^2, \xi^3)}\right| d\xi^1 d\xi^2 d\xi^3.$$
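As an illustrative check (spherical polar coordinates are only mentioned in passing above, so this is an assumed example), the recipe $dV = \sqrt{g}\, d\xi^1 d\xi^2 d\xi^3$ recovers the familiar spherical volume element:

```python
# Verify that sqrt(g) for spherical polars gives r**2*sin(theta).
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
x = sp.Matrix([r*sp.sin(th)*sp.cos(ph),
               r*sp.sin(th)*sp.sin(ph),
               r*sp.cos(th)])
M = x.jacobian(sp.Matrix([r, th, ph])).T  # rows are g_1, g_2, g_3

g = sp.simplify((M * M.T).det())          # determinant of g_ij
print(sp.factor(g))  # r**4*sin(theta)**2, so sqrt(g) = r**2*sin(theta)
```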

Index raising and lowering

We conclude this brief section by noting one of the most useful properties of the metric tensor. For a vector $a = a_j g^j$, we can find the components in the covariant basis by taking the dot product

$$a^i = a \cdot g^i = a_j\, g^j \cdot g^i = a_j\, g^{ji}.$$

Thus the contravariant metric tensor can be used to raise indices and similarly the covariant metric tensor can be used to lower indices

$$a_i = a^j g_{ji}.$$
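A one-line symbolic sanity check (illustrative, not from the notes) that raising an index with $g^{ij}$ and lowering it again with $g_{ij}$ returns the original components, using the plane polar metric from above:

```python
# Raising then lowering an index is the identity: g_ij g^jk a_k = a_i.
import sympy as sp

xi1 = sp.symbols('xi1', positive=True)
g_lo = sp.diag(1, xi1**2)     # plane polar g_ij from Section 1.4
g_up = g_lo.inv()             # plane polar g^ij

a_lo = sp.Matrix([sp.Symbol('a1'), sp.Symbol('a2')])  # covariant components
a_up = g_up * a_lo                                    # raise the index
print(sp.simplify(g_lo * a_up - a_lo))  # zero vector: lowering undoes raising
```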

1.5 Vector and tensor calculus

1.5.1 Covariant differentiation

We have already established that the derivative of the position vector $r$ with respect to the coordinate $\xi^i$ gives a tangent vector $g_i$. If we wish to express the rate of change of a vector field in the direction of a coordinate then we also need to be able to calculate

$$\frac{\partial a}{\partial\xi^i} \equiv a_{,i},$$

where we shall use a comma to indicate differentiation with respect to the $i$-th general coordinate. Under a change of coordinates from $\xi^i$ to $\chi^{\bar i}$, the derivative becomes

$$\frac{\partial a}{\partial\chi^{\bar i}} = \frac{\partial\xi^j}{\partial\chi^{\bar i}}\frac{\partial a}{\partial\xi^j},$$

so it transforms covariantly. If we decompose the vector into the covariant basis we have that

$$a_{,j} = \left(a^i g_i\right)_{,j} = a^i_{\ ,j}\, g_i + a^i\, g_{i,j}, \qquad (1.47)$$

by the product rule. Vector equations are often written by just writing out the components (in an assumed basis)

$$v = u \quad\Rightarrow\quad v_I e_I = u_I e_I \quad\Rightarrow\quad v_I = u_I,$$

where formally the last equation is obtained via a dot product with a suitable base vector. In Cartesian components we can simply take derivatives of the component equations with respect to the coordinates because the base vectors do not depend on the coordinates

$$u_I = v_I \quad\text{leads to}\quad u_{I,J} = v_{I,J} \quad\text{because}\quad (u_I e_I)_{,J} = u_{I,J}\, e_I.$$

In other words, the components of the differentiated vector equation are simply the derivatives of the components of the original equation. In a general coordinate system, the base vectors are not independent of the coordinates and so the second term in equation (1.47) cannot be neglected. The vector equation

$$a = b \quad\Rightarrow\quad a^i g_i = b^i g_i \quad\Rightarrow\quad a^i = b^i,$$

where the last equation is now obtained by taking the dot product with the contravariant base vectors. However, in general coordinates,

$$a^i = b^i \quad\text{does not (directly) lead to}\quad a^i_{\ ,j} = b^i_{\ ,j}.$$

Although the final equation is (obviously) true from direct differentiation of each component the statement is not coordinate-independent because it does not obey the correct transformation properties for a mixed second-order tensor. In fact, the derivatives of the base vectors are given by

$$g_{i,j} = \frac{\partial^2 r}{\partial\xi^j \partial\xi^i} = r_{,ij} = r_{,ji},$$

assuming symmetry of partial differentiation. If we decompose the position vector into the fixed Cartesian coordinates, we have

$$g_{i,j} = \frac{\partial^2 x^K}{\partial\xi^j \partial\xi^i}\, e_K,$$

which can be written in the contravariant basis as

$$g_{i,j} = \Gamma_{ijk}\, g^k, \quad\text{where}\quad \Gamma_{ijk} = g_{i,j} \cdot g_k = \frac{\partial^2 r}{\partial\xi^j \partial\xi^i} \cdot g_k = \frac{\partial^2 x^K}{\partial\xi^i \partial\xi^j}\frac{\partial x^K}{\partial\xi^k}. \qquad (1.48)$$

The symbol $\Gamma_{ijk}$ is called the Christoffel symbol of the first kind. Thus, we can write

$$a_{,j} = a^i_{\ ,j}\, g_i + a^i\, \Gamma_{ijk}\, g^k = a^i_{\ ,j}\, g_i + a^i\, \Gamma_{ijk}\, g^{kl} g_l = \left(a^l_{\ ,j} + \Gamma^l_{ij}\, a^i\right) g_l,$$

where

$$\Gamma^l_{ij} = g^{kl}\, \Gamma_{ijk} = g^{kl}\, g_k \cdot g_{i,j} = g^l \cdot g_{i,j}, \qquad (1.49)$$

and $\Gamma^l_{ij}$ is called the Christoffel symbol of the second kind. Finally, we obtain

$$a_{,j} = a^k|_j\, g_k, \quad\text{where}\quad a^k|_j = a^k_{\ ,j} + \Gamma^k_{ij}\, a^i,$$

and the expression $a^k|_j$ is the covariant derivative of the component $a^k$. In many books the covariant derivative is represented by a subscript semicolon, $a^k_{\ ;j}$, and in some it is denoted by a comma, just to confuse things. The fact that $a_{,j}$ is a covariant vector and that $g_k$ is a covariant vector means that the covariant derivative $a^k|_j$ is a (mixed) second-order tensor^13. Thus when differentiating equations expressed in components in a general coordinate system we should write

$$a^i = b^i \quad\Rightarrow\quad a^i|_j = b^i|_j.$$

^13 Thus, the covariant derivative exhibits tensor transformation properties, which the partial derivative does not because

$$\frac{\partial a^{\bar i}}{\partial\chi^{\bar j}} = \frac{\partial\xi^k}{\partial\chi^{\bar j}}\frac{\partial}{\partial\xi^k}\!\left(\frac{\partial\chi^{\bar i}}{\partial\xi^l}\, a^l\right) = \frac{\partial\xi^k}{\partial\chi^{\bar j}}\frac{\partial\chi^{\bar i}}{\partial\xi^l}\frac{\partial a^l}{\partial\xi^k} + \frac{\partial\xi^k}{\partial\chi^{\bar j}}\frac{\partial^2\chi^{\bar i}}{\partial\xi^k\partial\xi^l}\, a^l,$$

and the presence of the second term violates the tensor transformation rule.

If we had decomposed the vector $a$ into the contravariant basis then

$$a_{,j} = a_{i,j}\, g^i + a_i\, g^i_{\ ,j},$$

and from equation (1.12),

$$g^k = \frac{\partial\xi^k}{\partial x^K} e^K \quad\Rightarrow\quad \frac{\partial x^I}{\partial\xi^k} g^k = \frac{\partial x^I}{\partial\xi^k}\frac{\partial\xi^k}{\partial x^K} e^K = \delta^I_K\, e^K = e^I. \qquad (1.50)$$

We differentiate equation (1.50) with respect to $\xi^j$, remembering that $e^I$ is a constant base vector, to obtain

$$g^k_{\ ,j}\frac{\partial x^I}{\partial\xi^k} + g^k \frac{\partial^2 x^I}{\partial\xi^j\partial\xi^k} = 0,$$

$$\Rightarrow\quad g^k_{\ ,j}\frac{\partial x^I}{\partial\xi^k}\frac{\partial\xi^i}{\partial x^I} = g^i_{\ ,j} = -\frac{\partial\xi^i}{\partial x^I}\frac{\partial^2 x^I}{\partial\xi^j\partial\xi^k}\, g^k \quad\Rightarrow\quad g^i_{\ ,j} = -g_{k,j} \cdot g^i\, g^k = -\Gamma^i_{kj}\, g^k.$$

Thus the derivative of the vector $a$ when decomposed into the contravariant basis is

$$a_{,j} = a_{i,j}\, g^i - \Gamma^i_{jk}\, a_i\, g^k = \left(a_{k,j} - \Gamma^i_{jk}\, a_i\right) g^k;$$

and finally, we obtain

$$a_{,j} = a_i|_j\, g^i, \quad\text{where}\quad a_i|_j = a_{i,j} - \Gamma^k_{ji}\, a_k,$$

which gives the covariant derivative of the covariant components of a vector. In Cartesian coordinate systems, the base vectors are independent of the coordinates, so $\Gamma_{IJK} \equiv 0$ for all $I, J, K$ and the covariant derivative reduces to the partial derivative:

$$a_I|_K = a^I|_K = a_{I,K}.$$

Therefore, in most cases, when generalising equations derived in Cartesian coordinates to other coordinate systems we simply need to replace the partial derivatives with covariant derivatives. The partial derivative of a scalar already exhibits covariant transformation properties because

$$\frac{\partial\phi}{\partial\chi^{\bar i}} = \frac{\partial\xi^j}{\partial\chi^{\bar i}}\frac{\partial\phi}{\partial\xi^j};$$

and the partial derivative of a scalar coincides with the covariant derivative, $\phi_{,i} = \phi|_i$. The covariant derivative of higher-order tensors can be constructed by considering the covariant derivative of invariants and insisting that the covariant derivative obeys the product rule.

1.5.2 Vector differential operators

The gradient of a scalar field $f(x)$ is denoted by $\nabla f$, or $\mathrm{grad}\, f$, and is defined to be a vector field such that

$$f(x + dx) = f(x) + \nabla f \cdot dx + o(|dx|); \qquad (1.51)$$

i.e. the gradient is the vector that expresses the change in the scalar field with position. Letting $dx = th$, dividing by $t$ and taking the limit as $t \to 0$ gives the alternative definition

$$\nabla f \cdot h = \lim_{t\to 0} \frac{f(x + th) - f(x)}{t} = \frac{d}{dt} f(x + th)\Big|_{t=0}. \qquad (1.52)$$

Here, the derivative is the directional derivative in the direction $h$. If we decompose the vectors into components in the global Cartesian basis equation (1.52) becomes

$$[\nabla f]_K\, h_K = \frac{d}{dt} f(x^K + t h_K)\Big|_{t=0} = \frac{\partial f}{\partial x^K} h_K,$$

and so because $h$ is arbitrary, we can define

$$\nabla f = [\nabla f]_K\, e_K = \frac{\partial f}{\partial x^K} e_K, \qquad (1.53)$$

which should be a familiar expression for the gradient. Relative to the general coordinate system $\xi^i$ equation (1.53) becomes

$$\nabla f = \frac{\partial f}{\partial\xi^i}\frac{\partial\xi^i}{\partial x^K} e_K = \frac{\partial f}{\partial\xi^i} g^i = f_{,i}\, g^i, \qquad (1.54)$$

so we can write the vector differential operator $\nabla = g^i \frac{\partial}{\partial\xi^i}$. Note that because the derivative transforms covariantly and the base vectors transform contravariantly the gradient is invariant under coordinate transform.

Example 1.4. Gradient in standard coordinate systems

Find the gradient in a plane polar coordinate system.

Solution 1.4. The gradient of a scalar field $f$ is simply

$$\nabla f = g^i \frac{\partial f}{\partial\xi^i}.$$

In plane polar coordinates,

$$g^1 = \cos\xi^2\, e_1 + \sin\xi^2\, e_2, \qquad g^2 = -\frac{\sin\xi^2}{\xi^1} e_1 + \frac{\cos\xi^2}{\xi^1} e_2,$$

so

$$\nabla f = \left(\cos\xi^2\, e_1 + \sin\xi^2\, e_2\right) \frac{\partial f}{\partial\xi^1} + \left(-\frac{\sin\xi^2}{\xi^1} e_1 + \frac{\cos\xi^2}{\xi^1} e_2\right) \frac{\partial f}{\partial\xi^2},$$

which can be written in the (hopefully) familiar form

$$\nabla f = \frac{\partial f}{\partial r} e_r + \frac{1}{r}\frac{\partial f}{\partial\theta} e_\theta.$$

Gradient of a vector field

The gradient of a vector field $F(x)$ is a second-order tensor $\nabla F(x)$, also often written as $\nabla \otimes F$, that arises when the vector differential operator is applied to a vector

$$\nabla \otimes F = g^i \otimes F_{,i} = F^k|_i\, g^i \otimes g_k = F_k|_i\, g^i \otimes g^k.$$

Note that we have used the covariant derivative because we are taking the derivative of a vector decomposed into the covariant or contravariant basis.

Divergence

The divergence of a vector field is the trace or contraction of the gradient of the vector field. It is formed by taking the dot product of the vector differential operator $\nabla$ with the vector field:

$$\mathrm{div}\, F = \nabla \cdot F = g^i \cdot F_{,i} = g^i \cdot \frac{\partial}{\partial\xi^i}\left(F^j g_j\right) = g^i \cdot F^j|_i\, g_j = \delta^i_j\, F^j|_i = F^i|_i,$$

so that

$$\mathrm{div}\, F = F^i|_i = F^i_{\ ,i} + \Gamma^i_{ji} F^j = F^i_{\ ,i} + \Gamma^i_{ij} F^j. \qquad (1.55)$$

Curl

The curl of a vector field is the cross product of our vector differential operator $\nabla$ with the vector field

$$\mathrm{curl}\, F = \nabla \times F = g^i \times \frac{\partial F}{\partial\xi^i} = g^i \times \frac{\partial (F_j g^j)}{\partial\xi^i} = g^i \times F_j|_i\, g^j = \epsilon^{ijk}\, F_j|_i\, g_k.$$
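Anticipating the identity $F^i|_i = (\sqrt{g}\, F^i)_{,i}/\sqrt{g}$ proved in Section 1.6, here is an illustrative sympy sketch (not from the notes) of the divergence in plane polars; the "physical" components $F_r = F^1$ and $F_\theta = \xi^1 F^2$ relative to the orthonormal polar basis are introduced purely for the comparison.

```python
# div F in plane polars via (1/sqrt(g)) (sqrt(g) F^i)_{,i}, with sqrt(g) = xi1.
import sympy as sp

xi1, xi2 = sp.symbols('xi1 xi2', positive=True)
Fr = sp.Function('Fr')(xi1, xi2)   # physical radial component
Ft = sp.Function('Ft')(xi1, xi2)   # physical azimuthal component

sqrtg = xi1
F1, F2 = Fr, Ft/xi1                # contravariant components F^1, F^2

div = (sp.diff(sqrtg*F1, xi1) + sp.diff(sqrtg*F2, xi2))/sqrtg
familiar = sp.diff(xi1*Fr, xi1)/xi1 + sp.diff(Ft, xi2)/xi1
print(sp.simplify(div - familiar))  # 0: matches (1/r) d(r F_r)/dr + (1/r) dF_th/dth
```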

Laplacian

The Laplacian of a scalar field $f$ is defined by

$$\nabla^2 f = \Delta f = \nabla \cdot \nabla f.$$

Thus in general coordinates

$$\nabla^2 f = g^j \cdot \frac{\partial}{\partial\xi^j}\left(g^i \frac{\partial f}{\partial\xi^i}\right) = g^j \cdot f_{,i}|_j\, g^i = g^{ij}\, f_{,i}|_j = g^{ij}\, f|_{ij},$$

because the partial and covariant derivatives are the same for scalar quantities. Now $g^{ij}|_j$ is a tensor (covariant first order), which means that we know how it transforms under a change in coordinates, but in Cartesian coordinates $g^{ij}|_j$ becomes $\delta_{IJ,J} = 0$, so $g^{ij}|_j = 0$ in all coordinate systems^14. Thus,

$$\nabla^2 f = \left(g^{ij} f_{,i}\right)\Big|_j = \frac{1}{\sqrt{g}}\frac{\partial}{\partial\xi^j}\left(\sqrt{g}\, g^{ij}\frac{\partial f}{\partial\xi^i}\right).$$

^14 This is a special case of Ricci's Lemma.

The last equality is not obvious and is proved in the next section, equation (1.58).
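An illustrative sympy check of this formula in plane polars (an assumed example; $g^{ij} = \mathrm{diag}(1, 1/(\xi^1)^2)$ and $\sqrt{g} = \xi^1$ from Section 1.4):

```python
# Laplacian in plane polars via (1/sqrt(g)) d_j( sqrt(g) g^ij d_i f ).
import sympy as sp

xi1, xi2 = sp.symbols('xi1 xi2', positive=True)
f = sp.Function('f')(xi1, xi2)

sqrtg = xi1
gup = sp.diag(1, 1/xi1**2)   # contravariant metric, diagonal for polars

lap = sum(sp.diff(sqrtg*gup[i, i]*sp.diff(f, c), c)
          for i, c in enumerate((xi1, xi2)))/sqrtg
familiar = (sp.diff(f, xi1, 2) + sp.diff(f, xi1)/xi1
            + sp.diff(f, xi2, 2)/xi1**2)
print(sp.simplify(lap - familiar))   # 0: the usual polar Laplacian
```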

1.6 Divergence theorem

The divergence theorem is a generalisation of the fundamental theorem of calculus into higher dimensions. Consider the volume given by $a \le \xi^1 \le b$, $c \le \xi^2 \le d$ and $e \le \xi^3 \le f$. If we want to calculate the outward flux of a vector field $F(x)$ from this volume then we must calculate

$$\iint F \cdot n\, dS,$$

where $n$ is the outer unit normal to the surface and $dS$ is an infinitesimal element of the surface area. From section 1.4.1, we know that on the faces $\xi^1 = a, b$,

$$dS = \sqrt{g\, g^{11}}\, d\xi^2 d\xi^3,$$

and a normal to the surface is given by the contravariant base vector $g^1$, with length $|g^1| = \sqrt{g^1 \cdot g^1} = \sqrt{g^{11}}$. On the face $\xi^1 = a$ the vector $g^1$ is directed into the volume (it is oriented along the line of increasing coordinate) and on $\xi^1 = b$, $g^1$ is directed out of the volume, so the net outward flux through these faces is

$$\int_c^d\!\!\int_e^f F(a,\xi^2,\xi^3) \cdot \left(-\frac{g^1}{\sqrt{g^{11}}}\right) \sqrt{g\, g^{11}}\, d\xi^2 d\xi^3 + \int_c^d\!\!\int_e^f F(b,\xi^2,\xi^3) \cdot \left(+\frac{g^1}{\sqrt{g^{11}}}\right) \sqrt{g\, g^{11}}\, d\xi^2 d\xi^3$$

$$= \int_c^d\!\!\int_e^f \left[F^1(b,\xi^2,\xi^3)\sqrt{g(b,\xi^2,\xi^3)} - F^1(a,\xi^2,\xi^3)\sqrt{g(a,\xi^2,\xi^3)}\right] d\xi^2 d\xi^3.$$

From the fundamental theorem of calculus, the integral becomes

$$\int_c^d\!\!\int_e^f\!\!\int_a^b \left(F^1 \sqrt{g}\right)_{,1}\, d\xi^1 d\xi^2 d\xi^3,$$

and using similar arguments, the total outward flux from the volume is given by

$$\iint F \cdot n\, dS = \int_a^b\!\!\int_c^d\!\!\int_e^f \left(F^i \sqrt{g}\right)_{,i}\, d\xi^1 d\xi^2 d\xi^3.$$

From section 1.4.1 we also know that an infinitesimal volume element is given by $dV = \sqrt{g}\, d\xi^1 d\xi^2 d\xi^3$, so

$$\iint F \cdot n\, dS = \iiint \frac{1}{\sqrt{g}}\left(F^i \sqrt{g}\right)_{,i} \sqrt{g}\, d\xi^1 d\xi^2 d\xi^3 = \iiint \frac{1}{\sqrt{g}}\left(F^i \sqrt{g}\right)_{,i}\, dV.$$

The volume integrand is

$$\frac{1}{\sqrt{g}}\left(F^i \sqrt{g}\right)_{,i} = F^i_{\ ,i} + \frac{1}{\sqrt{g}}\, F^i \left(\sqrt{g}\right)_{,i},$$

but

$$\frac{1}{\sqrt{g}}\left(\sqrt{g}\right)_{,i} = \frac{1}{2g}\frac{\partial g}{\partial\xi^i} = \frac{1}{2g}\frac{\partial g}{\partial g_{jk}}\frac{\partial g_{jk}}{\partial\xi^i},$$

by the chain rule. From equations (1.42) and (1.44)

$$g_{jk}\, g^{kl} = g_{jk}\frac{D^{kl}}{g} = \delta^l_j \quad\Rightarrow\quad g_{jk}\, D^{kl} = g\, \delta^l_j \quad\Rightarrow\quad \frac{\partial g}{\partial g_{jk}} = D^{jk} = g\, g^{jk} = g\, g^{kj}, \qquad (1.56)$$

which means that

$$\frac{1}{\sqrt{g}}\left(\sqrt{g}\right)_{,i} = \frac{1}{2g}\, g\, g^{jk}\frac{\partial g_{jk}}{\partial\xi^i} = \frac{1}{2} g^{jk}\left(g_{j,i} \cdot g_k + g_j \cdot g_{k,i}\right) = \frac{1}{2}\left(g^j \cdot g_{j,i} + g^k \cdot g_{k,i}\right) = \frac{1}{2}\left(\Gamma^j_{ji} + \Gamma^k_{ki}\right) = \Gamma^j_{ji}, \qquad (1.57)$$

from the definition of the Christoffel symbol (1.49). Thus the volume integrand becomes

$$\frac{1}{\sqrt{g}}\left(F^i \sqrt{g}\right)_{,i} = F^i_{\ ,i} + F^i\, \Gamma^j_{ji} = F^i|_i, \qquad (1.58)$$

and so,

$$\iint F \cdot n\, dS = \iiint \left(F^i_{\ ,i} + F^i\, \Gamma^j_{ji}\right) dV = \iiint F^i|_i\, dV = \iiint \mathrm{div}\, F\, dV,$$

using the definition of the divergence (1.55). In fact, one should argue that the definition of the divergence is chosen so that it is the quantity that appears within the volume integral in the divergence theorem. The form of the theorem that we shall use most often is

$$\iint F^i n_i\, dS = \iint F_i n^i\, dS = \iiint F^i|_i\, dV = \iiint \frac{1}{\sqrt{g}}\left(F^i \sqrt{g}\right)_{,i}\, dV. \qquad (1.59)$$

You might argue that the proof above is only valid for simple parallelepipeds in our general coordinate system, but we can extend the result to more general volumes by breaking the volume up into regions, each of which can be described by a parallelepiped in a general coordinate system.

1.7 The Stokes Theorem

For completeness, we also state, but do not prove, the Stokes theorem, which states that for any vector $v$, the integral of its tangential component around a closed path is equal to the flux of the curl of the vector through any surface bounded by the path:

$$\oint_C v \cdot dr = \iint_A (\mathrm{curl}\, v) \cdot dS. \qquad (1.60)$$

In general tensor form, equation (1.60) becomes

$$\oint_C v \cdot g_i\, d\xi^i = \oint_C v_i\, d\xi^i = \iint_A \epsilon^{ijk}\, v_j|_i\, g_k \cdot dS = \iint_A \epsilon^{ijk}\, v_j|_i\, n_k\, dS, \qquad (1.61)$$

where the vector area is $dS = n\, dS$, and $n$ is a unit vector normal to the surface. Again we can make the argument that the definition of the curl is chosen so that this relationship is satisfied. You may be interested to know that an entire theory (differential forms) can be constructed in which the Stokes theorem is a fundamental result

$$\int_{\partial C} \omega = \int_C d\omega,$$

where $\omega$ is an $n$-form and $d\omega$ is its differential. The theory encompasses all known integral theorems because the appropriate $n$-dimensional differential forms coincide with the divergence and curl and the theory provides a rational extension to higher dimensions.