

Tensors

Lecture 7

Physics 411 Classical Mechanics II

September 12th 2007

In Electrodynamics, the implicit law governing the motion of particles is $F^\alpha = m\,\ddot{x}^\alpha$. This is also true, of course, for most of classical physics, and the details of the physical principle one is discussing are hidden in $F^\alpha$ and, potentially, its potential. That is what defines the interaction. In general relativity, the motion of particles will be described by

$$\ddot{x}^\mu + \Gamma^\mu{}_{\alpha\beta}\,\dot{x}^\alpha\,\dot{x}^\beta = 0 \qquad (7.1)$$

and this will occur in a four-dimensional space-time – but that doesn't concern us for now. The point of the above is that it lacks a potential, and can be connected in a natural way to the metric.

7.1 Introduction in Two Dimensions

We will start with some basic examples in two dimensions for concreteness. Here we will always begin in a Cartesian-parametrized plane, with basis vectors $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$.

7.1.1 Rotation

To begin, consider a simple rotation of the usual coordinate axes through an angle $\theta$ (counterclockwise) as shown in Figure 7.1.

Figure 7.1: Two sets of axes, rotated through an angle $\theta$ with respect to each other.

From the figure, we can easily relate the coordinates w.r.t. the new axes ($\bar{x}$ and $\bar{y}$) to the coordinates w.r.t. the "usual" axes ($x$ and $y$) – define $\ell = \sqrt{x^2 + y^2}$; then the invariance of length allows us to write

$$\bar{x} = \ell\cos(\psi - \theta) = \ell\cos\psi\cos\theta + \ell\sin\psi\sin\theta = x\cos\theta + y\sin\theta \qquad (7.2)$$
$$\bar{y} = \ell\sin(\psi - \theta) = \ell\sin\psi\cos\theta - \ell\cos\psi\sin\theta = y\cos\theta - x\sin\theta,$$

which can be written in matrix form as:

$$\begin{pmatrix} \bar{x} \\ \bar{y} \end{pmatrix} = \underbrace{\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}}_{\equiv R \,\dot{=}\, R^\alpha{}_\beta} \begin{pmatrix} x \\ y \end{pmatrix}. \qquad (7.3)$$

If we think of infinitesimal displacements centered at the origin, then we would write $d\bar{x}^\alpha = R^\alpha{}_\beta\,dx^\beta$. Notice that in this case, as a matrix, the inverse of $R$ is

$$R^{-1} \;\dot{=}\; \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad (7.4)$$

and this is also equal to the transpose of $R$, a defining property of "rotation" or "orthogonal" matrices: $R^{-1} = R^T$, in this case coming from the observation that one observer's clockwise rotation is another's counterclockwise, and from the symmetry properties of the trigonometric functions.
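As a quick numerical check, a few lines of Python (numpy) verify both the orthogonality property $R^{-1} = R^T$ and the invariance of length under $d\bar{x}^\alpha = R^\alpha{}_\beta\,dx^\beta$; the angle and displacement below are arbitrary test values, not anything fixed by the notes.

```python
import numpy as np

theta = 0.3  # rotation angle (radians), arbitrary test value

# Rotation matrix R from (7.3): new components in terms of old
R = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

# Defining property of orthogonal matrices: R^{-1} = R^T
assert np.allclose(np.linalg.inv(R), R.T)

# Transform a sample displacement and check that its length is invariant
dx = np.array([1.0, 2.0])
dx_bar = R @ dx
assert np.isclose(np.linalg.norm(dx_bar), np.linalg.norm(dx))
```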


Scalar

This one is easy – a scalar "does not" transform under coordinate transformations – if we have a function $\phi(x,y)$ in the first coordinate system, then we have:

$$\bar\phi(\bar x, \bar y) = \phi(x(\bar x, \bar y),\, y(\bar x, \bar y)) \qquad (7.5)$$

so that, operationally, all we do is replace the $x$ and $y$ appearing in the definition of $\phi$ with the relations $x = \bar x\cos\theta - \bar y\sin\theta$ and $y = \bar x\sin\theta + \bar y\cos\theta$.
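A short sympy sketch makes the substitution of (7.5) explicit; the scalar $\phi = x^2 + y$ below is just an arbitrary example field, not one fixed by the notes.

```python
import sympy as sp

x, y, xb, yb, theta = sp.symbols('x y xbar ybar theta')

phi = x**2 + y  # an arbitrary example scalar field in the original coordinates

# Inverse rotation: old coordinates in terms of the new ones
x_of_bar = xb*sp.cos(theta) - yb*sp.sin(theta)
y_of_bar = xb*sp.sin(theta) + yb*sp.cos(theta)

# The transformed scalar: just substitute, per (7.5)
phi_bar = phi.subs({x: x_of_bar, y: y_of_bar})
print(sp.simplify(phi_bar))
```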

Vector (Contravariant)

With the fabulous success and ease of scalar transformations, we are led to a natural definition for an object that transforms as a vector. If we write a generic vector in the original coordinate system:

$$\mathbf{f} = f^x(x,y)\,\hat{\mathbf{x}} + f^y(x,y)\,\hat{\mathbf{y}}, \qquad (7.6)$$

then we'd like the transformation to be defined as:

$$\bar{\mathbf{f}} = \bar f^x(\bar x, \bar y)\,\hat{\bar{\mathbf{x}}} + \bar f^y(\bar x, \bar y)\,\hat{\bar{\mathbf{y}}} = f^x(x,y)\,\hat{\mathbf{x}} + f^y(x,y)\,\hat{\mathbf{y}}, \qquad (7.7)$$

i.e. identical to the scalar case, but with the non-trivial involvement of the basis vectors. From Figure 7.1, we can easily see how to write the basis vectors $\hat{\bar{\mathbf{x}}}$ and $\hat{\bar{\mathbf{y}}}$ in terms of $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$:

$$\hat{\bar{\mathbf{x}}} = \cos\theta\,\hat{\mathbf{x}} + \sin\theta\,\hat{\mathbf{y}} \qquad (7.8)$$
$$\hat{\bar{\mathbf{y}}} = -\sin\theta\,\hat{\mathbf{x}} + \cos\theta\,\hat{\mathbf{y}},$$

and using this, we can write $\bar{\mathbf{f}}$ in terms of the original basis vectors; the elements in front of these will then define $f^x$ and $f^y$ according to our target:

$$\bar{\mathbf{f}} = \left(\bar f^x\cos\theta - \bar f^y\sin\theta\right)\hat{\mathbf{x}} + \left(\bar f^x\sin\theta + \bar f^y\cos\theta\right)\hat{\mathbf{y}}, \qquad (7.9)$$

so that we learn (slash demand) that

$$f^x = \bar f^x\cos\theta - \bar f^y\sin\theta \qquad (7.10)$$
$$f^y = \bar f^x\sin\theta + \bar f^y\cos\theta,$$

or, inverting, we have, now in matrix form:

$$\begin{pmatrix} \bar f^x \\ \bar f^y \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} f^x \\ f^y \end{pmatrix}. \qquad (7.11)$$


Notice that the "components" can be written as $\bar f^\alpha = R^\alpha{}_\beta\,f^\beta$ from the above, and this leads to the usual statement that "a (contravariant) vector transforms 'like' the coordinates themselves" (more appropriately, like the coordinate differentials, which are actual vectors – recall $d\bar x^\alpha = R^\alpha{}_\beta\,dx^\beta$ from above). The rotation matrix and transformation is just one example, and the form is telling – notice that (somewhat sloppily)

$$d\bar x^\alpha = R^\alpha{}_\beta\,dx^\beta \quad\longrightarrow\quad \frac{d\bar x^\alpha}{dx^\beta} = R^\alpha{}_\beta \qquad (7.12)$$

and so it is tempting to generalize the contravariant vector transformation law to include an arbitrary relation between "new" and "old" coordinates:

$$\bar f^\alpha = \frac{d\bar x^\alpha}{dx^\beta}\,f^\beta, \qquad (7.13)$$

precisely what we defined for a contravariant vector originally.
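The transformation law (7.13) is easy to exercise for a genuinely non-linear coordinate relation. As an illustrative sketch, the Python code below builds the Jacobian $d\bar x^\alpha/dx^\beta$ for 2D polar coordinates with sympy and applies it to the components of the constant field $\mathbf{f} = \hat{\mathbf{x}}$ (a choice made here purely for illustration):

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)

# "New" coordinates (2D polar) as functions of the old Cartesian ones
r   = sp.sqrt(x**2 + y**2)
phi = sp.atan2(y, x)

# Jacobian d(new)/d(old) appearing in (7.13)
J = sp.Matrix([[sp.diff(r,   x), sp.diff(r,   y)],
               [sp.diff(phi, x), sp.diff(phi, y)]])

f = sp.Matrix([1, 0])        # constant Cartesian vector field, f = x-hat
f_new = sp.simplify(J * f)   # contravariant components in (r, phi)
print(f_new)                 # -> [x/sqrt(x^2 + y^2), -y/(x^2 + y^2)]
```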

7.1.2 Non-Orthogonal Axes

We will stick with linear transformations for a moment, but now allow the axes to be skewed – this will allow us to distinguish between the above contravariant transformation law and the covariant (one-form) transformation law. Consider the two coordinate systems shown in Figure 7.2 – from the figure, we can write the relation for $dx^\alpha$ in terms of $d\bar x^\alpha$:

$$\begin{pmatrix} dx \\ dy \end{pmatrix} = \begin{pmatrix} 1 & \cos\theta \\ 0 & \sin\theta \end{pmatrix} \begin{pmatrix} d\bar x \\ d\bar y \end{pmatrix} \qquad (7.14)$$

or, we can write in the more standard form (inverting the above):

$$\begin{pmatrix} d\bar x \\ d\bar y \end{pmatrix} = \underbrace{\begin{pmatrix} 1 & -\cot\theta \\ 0 & \frac{1}{\sin\theta} \end{pmatrix}}_{\equiv T \,\dot{=}\, T^\alpha{}_\beta} \begin{pmatrix} dx \\ dy \end{pmatrix}, \qquad (7.15)$$

so that $d\bar x^\alpha = T^\alpha{}_\beta\,dx^\beta$, and the matrix $T$ is the inverse of the matrix in (7.14).


Figure 7.2: Original, orthogonal Cartesian axes, and a skewed set with coordinate differentials shown (parallel projection is used since differentials are contravariant).

Let us first check that the contravariant form $d\mathbf{x} = dx\,\hat{\mathbf{x}} + dy\,\hat{\mathbf{y}}$ transforms appropriately. We need the basis vectors in the skewed coordinate system – from the figure, these are $\hat{\bar{\mathbf{x}}} = \hat{\mathbf{x}}$ and $\hat{\bar{\mathbf{y}}} = \cos\theta\,\hat{\mathbf{x}} + \sin\theta\,\hat{\mathbf{y}}$; then

$$d\bar{\mathbf{x}} = d\bar x\,\hat{\bar{\mathbf{x}}} + d\bar y\,\hat{\bar{\mathbf{y}}} = (dx - \cot\theta\,dy)\,\hat{\mathbf{x}} + \frac{dy}{\sin\theta}\left(\cos\theta\,\hat{\mathbf{x}} + \sin\theta\,\hat{\mathbf{y}}\right) = dx\,\hat{\mathbf{x}} + dy\,\hat{\mathbf{y}} \qquad (7.16)$$

as expected. Recall that we developed the metric for this type of situation previously – if we take the scalar length $dx^2 + dy^2$ and write it in terms of the $d\bar x$ and $d\bar y$ infinitesimals, then

$$dx^2 + dy^2 = d\bar x^2 + d\bar y^2 + 2\,d\bar x\,d\bar y\,\cos\theta \;\dot{=}\; \begin{pmatrix} d\bar x & d\bar y \end{pmatrix} \underbrace{\begin{pmatrix} 1 & \cos\theta \\ \cos\theta & 1 \end{pmatrix}}_{\equiv\,\bar g_{\mu\nu}} \begin{pmatrix} d\bar x \\ d\bar y \end{pmatrix}, \qquad (7.17)$$

defining the metric. We have the relation $dx^\alpha\,g_{\alpha\beta}\,dx^\beta = d\bar x^\alpha\,\bar g_{\alpha\beta}\,d\bar x^\beta$, which tells us how the metric itself transforms. From the definition of the transformation, $d\bar x^\alpha = T^\alpha{}_\beta\,dx^\beta$, we have $\frac{d\bar x^\alpha}{dx^\beta} = T^\alpha{}_\beta$ as usual, but we can also write the inverse transformation, making use of the inverse of the matrix $T$


in (7.14). Letting $\tilde T^\alpha{}_\beta \;\dot{\equiv}\; T^{-1}$,

$$dx^\alpha = \tilde T^\alpha{}_\beta\,d\bar x^\beta \quad\longrightarrow\quad \frac{dx^\alpha}{d\bar x^\beta} = \tilde T^\alpha{}_\beta. \qquad (7.18)$$

Now going back to the invariant,

$$d\bar x^\alpha\,\bar g_{\alpha\beta}\,d\bar x^\beta = \left(T^\alpha{}_\gamma\,dx^\gamma\right)\bar g_{\alpha\beta}\left(T^\beta{}_\rho\,dx^\rho\right). \qquad (7.19)$$

For this expression to equal $dx^\alpha\,g_{\alpha\beta}\,dx^\beta$, we must have

$$\bar g_{\alpha\beta} = \tilde T^\sigma{}_\alpha\,g_{\sigma\delta}\,\tilde T^\delta{}_\beta, \qquad (7.20)$$

which we can put in as a check:

$$d\bar x^\alpha\,\bar g_{\alpha\beta}\,d\bar x^\beta = \left(T^\alpha{}_\gamma\,dx^\gamma\right)\bar g_{\alpha\beta}\left(T^\beta{}_\rho\,dx^\rho\right)$$
$$= \left(T^\alpha{}_\gamma\,dx^\gamma\right)\left(\tilde T^\sigma{}_\alpha\,g_{\sigma\delta}\,\tilde T^\delta{}_\beta\right)\left(T^\beta{}_\rho\,dx^\rho\right) \qquad (7.21)$$
$$= \delta^\sigma{}_\gamma\,\delta^\delta{}_\rho\,g_{\sigma\delta}\,dx^\gamma\,dx^\rho = dx^\sigma\,g_{\sigma\delta}\,dx^\delta.$$
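The same check can be done by machine; the sympy sketch below builds $T$ from (7.15), forms $\tilde T = T^{-1}$ (which should reproduce the matrix in (7.14)), and applies the covariant law (7.20) to the Cartesian metric, recovering the metric of (7.17).

```python
import sympy as sp

theta = sp.symbols('theta')

# T from (7.15): new differentials in terms of old
T = sp.Matrix([[1, -sp.cos(theta)/sp.sin(theta)],
               [0,  1/sp.sin(theta)]])

Ttilde = T.inv()      # reproduces the matrix in (7.14)
g = sp.eye(2)         # Cartesian metric

# Covariant transformation law (7.20): g-bar = Ttilde^T g Ttilde
g_bar = sp.simplify(Ttilde.T * g * Ttilde)
print(g_bar)          # -> [[1, cos(theta)], [cos(theta), 1]], as in (7.17)
```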

The metric transformation law (7.20) is different from that of a vector – first of all, there are two indices, but more importantly, rather than transforming with $T^\alpha{}_\beta$, the covariant indices transform with the inverse of the transformation. This type of transformation is not obvious under rotations, since rotations leave the metric itself invariant ($R^T g\,R = g$), and so we don't notice the transformation of the metric.

Covariant “vector” Transformation

If we generalize the above covariant transformation to non-linear coordinate relations, we define covariant vector transformation as:

$$\bar f_\alpha = \frac{dx^\beta}{d\bar x^\alpha}\,f_\beta. \qquad (7.22)$$

For a second rank covariant tensor, like the metric $g_{\mu\nu}$, we just introduce a copy of the transformation for each index (the same is true for the contravariant transformation law):

$$\bar g_{\mu\nu} = \frac{dx^\alpha}{d\bar x^\mu}\,\frac{dx^\beta}{d\bar x^\nu}\,g_{\alpha\beta}. \qquad (7.23)$$


Now the question becomes: for contravariant vectors, the model was $dx^\alpha$; what is the model covariant vector? The answer is "the gradient". Consider a scalar $\phi(x,y)$ – we can write its partial derivatives w.r.t. the coordinates as:

$$\frac{\partial\phi(x,y)}{\partial x^\alpha} \equiv \phi_{,\alpha}, \qquad (7.24)$$

and ask how this transforms under any coordinate transformation – the answer is provided by the chain rule. We have the usual scalar transformation $\bar\phi(\bar x, \bar y) = \phi(x(\bar x, \bar y), y(\bar x, \bar y))$, so

$$\bar\phi_{,\alpha} = \frac{\partial\bar\phi}{\partial\bar x^\alpha} = \frac{\partial\phi}{\partial x^\beta}\,\frac{\partial x^\beta}{\partial\bar x^\alpha} = \frac{\partial x^\beta}{\partial\bar x^\alpha}\,\phi_{,\beta}, \qquad (7.25)$$

precisely the rule for covariant vector transformation.

How does this compare with our usual $\nabla\phi = \frac{\partial\phi}{\partial x}\,\hat{\mathbf{x}} + \frac{\partial\phi}{\partial y}\,\hat{\mathbf{y}}$? Almost automatically, we realize that this is a contravariant vector – it is written in terms of the basis vector components just as $f^\alpha$ was, so we expect to find that $\nabla\phi$ is shorthand for $\phi^{,\alpha}$ rather than $\phi_{,\alpha}$. There are deep distinctions between covariant and contravariant vectors, and we have not addressed them in full abstraction here – for our purposes, there is a one-to-one map between the two provided by the metric, and that makes them effectively equivalent. It is important, in some settings, to keep track of the fundamental nature of an object, covariant or contravariant (as is the case with, for example, momentum), but for now, it suffices to define the relation between contravariant and covariant objects via the metric: for a contravariant vector $f^\alpha$,

$$f_\alpha \equiv g_{\alpha\beta}\,f^\beta. \qquad (7.26)$$

We further define $g^{\alpha\beta}$ to be the contravariant form of the metric, its numerical matrix inverse: $g^{\alpha\beta}\,g_{\beta\gamma} = \delta^\alpha{}_\gamma$. Now we can see that:

$$\bar\phi^{,\alpha} = \bar g^{\alpha\beta}\,\bar\phi_{,\beta} \qquad (7.27)$$

and for our skew-axis transformation, we have:

$$\bar g_{\alpha\beta} \;\dot{=}\; \begin{pmatrix} 1 & \cos\theta \\ \cos\theta & 1 \end{pmatrix} \qquad \bar g^{\alpha\beta} \;\dot{=}\; \begin{pmatrix} \frac{1}{\sin^2\theta} & -\frac{\cos\theta}{\sin^2\theta} \\ -\frac{\cos\theta}{\sin^2\theta} & \frac{1}{\sin^2\theta} \end{pmatrix} \qquad (7.28)$$

so that

$$\bar g^{\alpha\beta}\,\bar\phi_{,\beta} \;\dot{=}\; \frac{1}{\sin\theta}\left(\frac{1}{\sin\theta}\,\frac{\partial\bar\phi}{\partial\bar x} - \frac{\cos\theta}{\sin\theta}\,\frac{\partial\bar\phi}{\partial\bar y}\right)\hat{\bar{\mathbf{x}}} + \frac{1}{\sin\theta}\left(-\frac{\cos\theta}{\sin\theta}\,\frac{\partial\bar\phi}{\partial\bar x} + \frac{1}{\sin\theta}\,\frac{\partial\bar\phi}{\partial\bar y}\right)\hat{\bar{\mathbf{y}}}, \qquad (7.29)$$

and

$$\frac{\partial\bar\phi}{\partial\bar x} = \frac{\partial\phi}{\partial x}\,\frac{\partial x}{\partial\bar x} + \frac{\partial\phi}{\partial y}\,\frac{\partial y}{\partial\bar x} = \frac{\partial\phi}{\partial x} \qquad (7.30)$$
$$\frac{\partial\bar\phi}{\partial\bar y} = \frac{\partial\phi}{\partial x}\,\frac{\partial x}{\partial\bar y} + \frac{\partial\phi}{\partial y}\,\frac{\partial y}{\partial\bar y} = \frac{\partial\phi}{\partial x}\,\cos\theta + \frac{\partial\phi}{\partial y}\,\sin\theta,$$

which we can input into (7.29):

$$\bar g^{\alpha\beta}\,\bar\phi_{,\beta} \;\dot{=}\; \frac{1}{\sin\theta}\left(\frac{1}{\sin\theta}\,\frac{\partial\phi}{\partial x} - \frac{\cos^2\theta}{\sin\theta}\,\frac{\partial\phi}{\partial x} - \cos\theta\,\frac{\partial\phi}{\partial y}\right)\hat{\bar{\mathbf{x}}} + \frac{1}{\sin\theta}\left(-\frac{\cos\theta}{\sin\theta}\,\frac{\partial\phi}{\partial x} + \frac{\cos\theta}{\sin\theta}\,\frac{\partial\phi}{\partial x} + \frac{\partial\phi}{\partial y}\right)\hat{\bar{\mathbf{y}}}$$
$$= \left(\frac{\partial\phi}{\partial x} - \cot\theta\,\frac{\partial\phi}{\partial y}\right)\hat{\bar{\mathbf{x}}} + \frac{1}{\sin\theta}\,\frac{\partial\phi}{\partial y}\,\hat{\bar{\mathbf{y}}}, \qquad (7.31)$$

or, finally, replacing the unit vectors with the original set,

$$\bar g^{\alpha\beta}\,\bar\phi_{,\beta} \;\dot{=}\; \left(\frac{\partial\phi}{\partial x} - \cot\theta\,\frac{\partial\phi}{\partial y}\right)\hat{\mathbf{x}} + \frac{1}{\sin\theta}\,\frac{\partial\phi}{\partial y}\left(\cos\theta\,\hat{\mathbf{x}} + \sin\theta\,\hat{\mathbf{y}}\right) = \frac{\partial\phi}{\partial x}\,\hat{\mathbf{x}} + \frac{\partial\phi}{\partial y}\,\hat{\mathbf{y}} \qquad (7.32)$$

and we see that this is the correct expression for a contravariant vector: $\bar\nabla\bar\phi = \nabla\phi$.
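The whole loop (7.25)–(7.32) can be automated as a check; the sketch below uses an arbitrary example scalar ($\phi = x^2 + xy$, my choice, not fixed by the notes), computes the barred gradient, raises its index with $\bar g^{\alpha\beta}$, converts back to the Cartesian basis, and confirms $\bar\nabla\bar\phi = \nabla\phi$.

```python
import sympy as sp

xb, yb, theta = sp.symbols('xbar ybar theta')
x = xb + sp.cos(theta)*yb      # old coordinates in terms of the skewed ones
y = sp.sin(theta)*yb

phi = x**2 + x*y               # arbitrary example scalar, written via x(xb, yb), y(xb, yb)

# Covariant components: the barred gradient, per (7.25)
dphi = sp.Matrix([sp.diff(phi, xb), sp.diff(phi, yb)])

# Metric of (7.28); its inverse raises the index, as in (7.27)
g_bar = sp.Matrix([[1, sp.cos(theta)], [sp.cos(theta), 1]])
grad_bar = sp.simplify(g_bar.inv() * dphi)   # contravariant components

# Convert to the Cartesian basis: e_xbar = xhat, e_ybar = cos(theta) xhat + sin(theta) yhat
E = sp.Matrix([[1, sp.cos(theta)], [0, sp.sin(theta)]])  # columns are the skewed basis vectors
grad_cartesian = sp.simplify(E * grad_bar)
print(grad_cartesian)          # -> (2x + y, x) with x, y as above: the Cartesian gradient
```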

7.2 Transformation and Basis

We have discussed the transformation laws for our various tensors, but let's look in more detail: consider a vector written in the usual Cartesian coordinates

$$x^\alpha \;\dot{\equiv}\; \mathbf{x} \;\dot{=}\; x\,\hat{\mathbf{x}} + y\,\hat{\mathbf{y}} + z\,\hat{\mathbf{z}}. \qquad (7.33)$$

Our aim is to transform this to a new set of coordinates $(r, \theta, \phi)$ related to these in some manner. The transformation laws for tensors involve objects like $\frac{\partial x'^\alpha}{\partial x^\beta}$, so we must be able to form $x^\alpha(x')$ and vice versa (a statement about the non-singular nature of the transformation):

$$x^1(r,\theta,\phi) = r\sin\theta\cos\phi \qquad x^2 = r\sin\theta\sin\phi \qquad x^3 = r\cos\theta. \qquad (7.34)$$

Just by the chain rule, then, we have

$$dx^\alpha = \frac{\partial x^\alpha}{\partial x'^\beta}\,dx'^\beta. \qquad (7.35)$$


In matrix form,

$$dx^\alpha = \frac{\partial x^\alpha}{\partial x'^\beta}\,dx'^\beta \;\dot{=}\; \begin{pmatrix} \sin\theta\cos\phi & r\cos\theta\cos\phi & -r\sin\theta\sin\phi \\ \sin\theta\sin\phi & r\cos\theta\sin\phi & r\sin\theta\cos\phi \\ \cos\theta & -r\sin\theta & 0 \end{pmatrix} \begin{pmatrix} dr \\ d\theta \\ d\phi \end{pmatrix} \qquad (7.36)$$
$$= dr\,\mathbf{e}_r + d\theta\,\mathbf{e}_\theta + d\phi\,\mathbf{e}_\phi.$$

This last line is important – keep in mind that matrix-vector multiplication can be viewed as $A_{ij}\,x_j = [A_k\,x_k]$, where the right-hand side means "multiply the $k^{\rm th}$ column of $A$ by $x_k$". So the above tells us what we mean by the vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$. These are "basis vectors" as usual. They look a little funny sitting here; we are used to a slightly different type of basis vector. If you consult Griffiths, you will find that his spherical basis vectors are written, in terms of ours, as

$$\hat{\mathbf{e}}_r = \mathbf{e}_r \qquad \hat{\mathbf{e}}_\theta = \frac{1}{r}\,\mathbf{e}_\theta \qquad \hat{\mathbf{e}}_\phi = \frac{1}{r\sin\theta}\,\mathbf{e}_\phi \qquad (7.37)$$

i.e. they are normalized. So far, so good – if we want to represent $d\mathbf{x} = dx^1\,\hat{\mathbf{x}} + dx^2\,\hat{\mathbf{y}} + dx^3\,\hat{\mathbf{z}}$ in terms of $(r,\theta,\phi)$ and $(\hat{\mathbf{e}}_r, \hat{\mathbf{e}}_\theta, \hat{\mathbf{e}}_\phi)$, we know how to do it. Now the question is, if we had a vector $f^\alpha$ on the Cartesian side, so that $\mathbf{f} = f^1\,\hat{\mathbf{x}} + f^2\,\hat{\mathbf{y}} + f^3\,\hat{\mathbf{z}}$, what is this vector written in terms of the spherical coordinate system and basis? Well, we know how to write the Cartesian basis in terms of the spherical one, and we get:

$$\mathbf{f}' = \left[\left(f^1\cos\phi + f^2\sin\phi\right)\sin\theta + f^3\cos\theta\right]\hat{\mathbf{e}}_r + \left[f^1\cos\phi\cos\theta + f^2\cos\theta\sin\phi - f^3\sin\theta\right]\hat{\mathbf{e}}_\theta + \left[f^2\cos\phi - f^1\sin\phi\right]\hat{\mathbf{e}}_\phi \qquad (7.38)$$

and we can further write this in terms of the more natural basis $(\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_\phi)$:

$$\mathbf{f}' = \left[f^3\cos\theta + \left(f^1\cos\phi + f^2\sin\phi\right)\sin\theta\right]\mathbf{e}_r + \frac{1}{r}\left[f^1\cos\phi\cos\theta + f^2\cos\theta\sin\phi - f^3\sin\theta\right]\mathbf{e}_\theta + \frac{1}{r\sin\theta}\left[f^2\cos\phi - f^1\sin\phi\right]\mathbf{e}_\phi, \qquad (7.39)$$

which leads us to the point of all this – we can call the elements of the above:

$$f'^\alpha \;\dot{=}\; \begin{pmatrix} f^3\cos\theta + \left(f^1\cos\phi + f^2\sin\phi\right)\sin\theta \\ \frac{1}{r}\left(f^1\cos\phi\cos\theta + f^2\cos\theta\sin\phi - f^3\sin\theta\right) \\ \frac{1}{r\sin\theta}\left(f^2\cos\phi - f^1\sin\phi\right) \end{pmatrix} \;\dot{=}\; \begin{pmatrix} \sin\theta\cos\phi & \sin\theta\sin\phi & \cos\theta \\ \frac{1}{r}\cos\theta\cos\phi & \frac{1}{r}\cos\theta\sin\phi & -\frac{1}{r}\sin\theta \\ -\frac{1}{r\sin\theta}\sin\phi & \frac{1}{r\sin\theta}\cos\phi & 0 \end{pmatrix} \begin{pmatrix} f^1 \\ f^2 \\ f^3 \end{pmatrix}. \qquad (7.40)$$

Comparing this to (7.36) (invert the matrix you find there to verify that $\frac{\partial x^\alpha}{\partial x'^\beta}\,\frac{\partial x'^\beta}{\partial x^\gamma} = \delta^\alpha{}_\gamma$), we see that

$$f'^\alpha = \frac{\partial x'^\alpha}{\partial x^\beta}\,f^\beta. \qquad (7.41)$$
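Everything from (7.36) to (7.41) is mechanical, and a short sympy sketch reproduces it: the columns of the Jacobian are the basis vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$, and the inverse Jacobian is the matrix of (7.40).

```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)

# Cartesian coordinates in terms of spherical, eq. (7.34)
X = sp.Matrix([r*sp.sin(th)*sp.cos(ph),
               r*sp.sin(th)*sp.sin(ph),
               r*sp.cos(th)])

# Jacobian dx^alpha/dx'^beta of (7.36); its columns are e_r, e_theta, e_phi
J = X.jacobian(sp.Matrix([r, th, ph]))
e_r, e_th, e_ph = J[:, 0], J[:, 1], J[:, 2]
print(sp.sqrt(sp.simplify(e_th.dot(e_th))))   # -> r, so ehat_theta = e_theta / r, as in (7.37)

# The inverse Jacobian dx'^alpha/dx^beta is the matrix of (7.40)-(7.41)
Jinv = sp.simplify(J.inv())
print(Jinv)
```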

7.3 Derivatives

So, we see that we might write $\mathbf{f} = f^\beta\,\mathbf{e}_\beta$ as a representation of the vector in a basis (or "frame"). The "gradient" of $\mathbf{f}$ is

$$\frac{\partial\mathbf{f}}{\partial x^\alpha} = \frac{\partial f^\beta}{\partial x^\alpha}\,\mathbf{e}_\beta + f^\beta\,\frac{\partial\mathbf{e}_\beta}{\partial x^\alpha}, \qquad (7.42)$$

which is the usual sort of statement: we must take the derivatives of the function and of the basis if it depends on the coordinates. Let us be clear, and I'll do this on the tensor side entirely, since I don't like mixing it up. Think of a contravariant vector transformation:

$$f'^\alpha = \frac{\partial x'^\alpha}{\partial x^\beta}\,f^\beta, \qquad (7.43)$$

and take $\frac{\partial}{\partial x'^\gamma}$ of this; then by the chain rule:

$$\frac{\partial f'^\alpha}{\partial x'^\gamma} = \frac{\partial x'^\alpha}{\partial x^\beta}\left(\frac{\partial x^\sigma}{\partial x'^\gamma}\,\frac{\partial f^\beta}{\partial x^\sigma}\right) + \frac{\partial^2 x'^\alpha}{\partial x^\beta\,\partial x^\rho}\,\frac{\partial x^\rho}{\partial x'^\gamma}\,f^\beta. \qquad (7.44)$$

Now consider the transformation rule for a second rank, mixed tensor:

$$f'^\alpha{}_\beta = \frac{\partial x'^\alpha}{\partial x^\gamma}\,\frac{\partial x^\sigma}{\partial x'^\beta}\,f^\gamma{}_\sigma, \qquad (7.45)$$

so comparing to (7.44), we see that the second term is the problem one!

That is to say, the object $f^\alpha{}_{,\gamma}$ is not a tensor. This must be fixed immediately.
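To see the failure concretely, take the constant Cartesian field $\mathbf{f} = \hat{\mathbf{x}}$ (as in the earlier polar sketch, my illustrative choice) and move to polar coordinates: the tensor rule (7.45) would predict that the transformed partial derivatives vanish, since they vanish in Cartesian coordinates, but direct differentiation of the polar components says otherwise.

```python
import sympy as sp

r, phi = sp.symbols('r phi', positive=True)

# Cartesian components of a constant vector field: f = x-hat
f_cart = sp.Matrix([1, 0])

# Jacobian dx'^alpha/dx^beta for Cartesian -> polar, on the polar chart
X = sp.Matrix([r*sp.cos(phi), r*sp.sin(phi)])    # x(r, phi), y(r, phi)
J = X.jacobian(sp.Matrix([r, phi])).inv()        # d(r, phi)/d(x, y)

f_polar = sp.simplify(J * f_cart)                # f'^alpha = (dx'^alpha/dx^beta) f^beta
partials = sp.simplify(f_polar.jacobian(sp.Matrix([r, phi])))
print(partials)  # nonzero! but the tensor rule (7.45) predicts 0, since df/dx = 0 in Cartesian
```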


7.4 What is a Tensorial Derivative?

The usual definition of a derivative is in terms of a limit like

$$f'(x) = \lim_{\delta x\to 0}\frac{f(x+\delta x) - f(x)}{\delta x}. \qquad (7.46)$$

Suppose we expand a tensor $f^\alpha(x+\delta x)$ about the point $P$ (with coordinate $x$); this is just Taylor's theorem:

$$f^\alpha(x+\delta x) = f^\alpha(x) + \delta x^\gamma\,f^\alpha{}_{,\gamma} + O(\delta x^2), \qquad (7.47)$$

and we would say that the (non-tensorial) change of the vector is given by $\Delta f^\alpha \equiv f^\alpha(x+\delta x) - f^\alpha(x) = \delta x^\gamma\,f^\alpha{}_{,\gamma}$.

Instead of using this $\Delta f^\alpha$, which, since it involves the "usual" derivative, cannot be a tensor, we introduce a fudge factor. So we will compare $f^\alpha(x+\delta x)$ with $f^\alpha(x) + \delta f^\alpha(x)$ – that is, we will define the "covariant" derivative in terms of

$$f^\alpha{}_{;\gamma}(x) \equiv \lim_{\delta x\to 0}\frac{f^\alpha(x+\delta x) - \left(f^\alpha(x) + \delta f^\alpha(x)\right)}{\delta x^\gamma}. \qquad (7.48)$$

What form should this "extra bit" $\delta f^\alpha(x)$ take? Well, I suggest that in the limiting case $f^\alpha(x) = 0$ it should be zero, and it should also tend to zero as $\delta x \to 0$, i.e. if we don't move from the point $P$, we should recover the usual derivative. This suggests an object of the form $\sim \delta x^\rho\,f^\tau$, and we still need a contravariant index to form $\delta f^\alpha$ – so we propose:

$$\delta f^\alpha \equiv -C^\alpha{}_{\rho\tau}\,\delta x^\rho\,f^\tau \qquad (7.49)$$

(the negative sign is convention) where $C^\alpha{}_{\rho\tau}$ is just some triply-indexed object that we don't know anything about yet. Our new derivative, the "covariant derivative", will be written

$$f^\alpha{}_{;\gamma}(x) \equiv \lim_{\delta x\to 0}\frac{f^\alpha(x+\delta x) - \left(f^\alpha(x) + \delta f^\alpha(x)\right)}{\delta x^\gamma}$$
$$= \lim_{\delta x\to 0}\frac{f^\alpha(x+\delta x) - f^\alpha(x)}{\delta x^\gamma} + C^\alpha{}_{\gamma\tau}\,f^\tau \qquad (7.50)$$
$$= f^\alpha{}_{,\gamma}(x) + C^\alpha{}_{\gamma\tau}(x)\,f^\tau(x).$$

But I haven't really done anything, just suggested that we move a vector from point to point in some $C^\alpha{}_{\beta\gamma}$-dependent way rather than with the usual transport operator (the derivative). The whole motivation for doing this was to

ensure that our notion of derivative was tensorial. This imposes a constraint on $C^\alpha{}_{\beta\gamma}$ (which is arbitrary right now) – let's work it out. Using (7.44),

$$f'^\alpha{}_{;\gamma} = \frac{\partial x'^\alpha}{\partial x^\beta}\,\frac{\partial x^\sigma}{\partial x'^\gamma}\,\frac{\partial f^\beta}{\partial x^\sigma} + \frac{\partial^2 x'^\alpha}{\partial x^\beta\,\partial x^\rho}\,\frac{\partial x^\rho}{\partial x'^\gamma}\,f^\beta + C'^\alpha{}_{\beta\gamma}\,\frac{\partial x'^\beta}{\partial x^\sigma}\,f^\sigma$$
$$= \frac{\partial x'^\alpha}{\partial x^\beta}\,\frac{\partial x^\sigma}{\partial x'^\gamma}\left(f^\beta{}_{;\sigma} - C^\beta{}_{\sigma\rho}\,f^\rho\right) + \frac{\partial^2 x'^\alpha}{\partial x^\beta\,\partial x^\rho}\,\frac{\partial x^\rho}{\partial x'^\gamma}\,f^\beta + C'^\alpha{}_{\beta\gamma}\,\frac{\partial x'^\beta}{\partial x^\sigma}\,f^\sigma, \qquad (7.51)$$

where the term in parentheses comes from noting that $f^\alpha{}_{,\gamma} = f^\alpha{}_{;\gamma} - C^\alpha{}_{\beta\gamma}\,f^\beta$. Collecting everything that is "wrong" in the above, we can define the transformation rule for $C^\alpha{}_{\beta\gamma}$:

$$f'^\alpha{}_{;\gamma} = \frac{\partial x'^\alpha}{\partial x^\beta}\,\frac{\partial x^\sigma}{\partial x'^\gamma}\,f^\beta{}_{;\sigma} + \left(-\frac{\partial x'^\alpha}{\partial x^\beta}\,\frac{\partial x^\tau}{\partial x'^\gamma}\,C^\beta{}_{\tau\rho} + \frac{\partial^2 x'^\alpha}{\partial x^\rho\,\partial x^\tau}\,\frac{\partial x^\tau}{\partial x'^\gamma} + C'^\alpha{}_{\beta\gamma}\,\frac{\partial x'^\beta}{\partial x^\rho}\right)f^\rho. \qquad (7.52)$$

In order to kill the offending term in parentheses, we define

$$C'^\alpha{}_{\tau\gamma} = \frac{\partial x^\lambda}{\partial x'^\tau}\,\frac{\partial x'^\alpha}{\partial x^\rho}\,\frac{\partial x^\sigma}{\partial x'^\gamma}\,C^\rho{}_{\sigma\lambda} - \frac{\partial^2 x'^\alpha}{\partial x^\lambda\,\partial x^\rho}\,\frac{\partial x^\rho}{\partial x'^\gamma}\,\frac{\partial x^\lambda}{\partial x'^\tau} \qquad (7.53)$$

and from this we conclude that $C^\alpha{}_{\beta\gamma}$ is itself not a tensor. Anything that transforms according to (7.53) is called a "connection". What we've done in (7.50) is add two things, neither of which is a tensor, in a sum that produces a tensor. Effectively, the non-tensor parts of the two terms kill each other (by construction, of course – that's what gave us (7.53)); we just needed a little extra freedom in our definition of derivative. For better or worse, we have it! This $f^\alpha{}_{;\beta}$ is called the covariant derivative, and plays the role, in tensor equations, of the usual $f^\alpha{}_{,\beta}$.
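As a preview of where this is headed (the notes leave $C^\alpha{}_{\beta\gamma}$ general at this point): if we take the connection to be the metric Christoffel symbols $\Gamma^\alpha{}_{\beta\gamma}$, as eq. (7.1) anticipates, the covariant derivative of the polar-components field from the earlier sketch comes out identically zero, exactly matching the vanishing Cartesian partial derivatives. A sketch, assuming the standard Christoffel formula $\Gamma^a{}_{bc} = \tfrac{1}{2}g^{ad}\left(g_{db,c} + g_{dc,b} - g_{bc,d}\right)$:

```python
import sympy as sp

r, phi = sp.symbols('r phi', positive=True)
coords = [r, phi]
g = sp.Matrix([[1, 0], [0, r**2]])     # polar metric
ginv = g.inv()

# Christoffel symbols Gamma^a_{bc} = (1/2) g^{ad} (g_{db,c} + g_{dc,b} - g_{bc,d})
def Gamma(a, b, c):
    return sum(sp.Rational(1, 2)*ginv[a, d] *
               (sp.diff(g[d, b], coords[c]) + sp.diff(g[d, c], coords[b])
                - sp.diff(g[b, c], coords[d]))
               for d in range(2))

# Polar components of the constant field f = x-hat (from the earlier sketch)
f = [sp.cos(phi), -sp.sin(phi)/r]

# Covariant derivative, eq. (7.50) with C -> Gamma: f^a_{;c} = f^a_{,c} + Gamma^a_{ct} f^t
cov = sp.Matrix(2, 2, lambda a, c: sp.simplify(
    sp.diff(f[a], coords[c]) + sum(Gamma(a, c, t)*f[t] for t in range(2))))
print(cov)   # -> zero matrix: the covariant derivative transforms as a tensor should
```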
