The rigid rotor in classical and quantum mechanics

Paul E. S. Wormer
Institute of Theoretical Chemistry, University of Nijmegen, Toernooiveld, 6525 ED Nijmegen, The Netherlands
Contents

1 Introduction
2 The mathematics of rotations in R³
3 The algebra of real antisymmetric matrices
4 The kinematics of a rigid body
5 Kinetic energy of a rigid rotor
6 The Euler equations
7 Quantization
8 Rigid rotor functions
9 The quantized energy levels of rigid rotors
10 Angular momenta and Lie derivatives
   10.1 Infinitesimal rotations of functions f(r)
   10.2 Infinitesimal rotations of functions of Euler angles
1 Introduction
The following text contains notes on the classical and quantum mechanical rigid rotor. The classical part is based on the books of H. Goldstein, Classical Mechanics [Addison-Wesley, Reading, MA, 1980, 2nd ed.] and V. I. Arnold, Mathematical Methods of Classical Mechanics [Springer-Verlag, New York, 1989, 2nd ed.]. In the following pages Goldstein's exposition is condensed, whereas Arnold's terse mathematical treatment is expanded. The quantum mechanical part is in the spirit of L. C. Biedenharn and J. D. Louck, Angular Momentum in Quantum Physics: Theory and Application [Addison-Wesley, Reading, MA, 1981].
2 The mathematics of rotations in R³
Consider a real 3×3 matrix R with columns r1, r2, r3, i.e., R = (r1, r2, r3). The matrix R is orthogonal if

    r_i · r_j = δ_ij,   i, j = 1, 2, 3.
The matrix R is a proper rotation matrix, if it is orthogonal and if r1, r2 and r3 form a right-handed set, i.e.,
    r_i × r_j = Σ_{k=1}^{3} ε_ijk r_k.   (1)
Here εijk is the antisymmetric (Levi-Civita) tensor,
    ε_123 = ε_312 = ε_231 = 1
    ε_213 = ε_321 = ε_132 = −1   (2)

and ε_ijk = 0 if two or more indices are equal. The matrix R is an improper rotation matrix if its column vectors form a left-handed set, i.e.,

    r_i × r_j = − Σ_k ε_ijk r_k.   (3)

Equations (1) and (3) can be condensed into one equation
    r_i × r_j = det(R) Σ_{k=1}^{3} ε_ijk r_k   (4)

by virtue of the following lemma.
Lemma 1. The determinant of a proper rotation matrix is 1 and of an improper rotation −1.

Proof
The determinant of a 3×3 matrix (a, b, c) can be written as a · (b × c). Now, for a proper rotation, we find by Eq. (1), remembering that the r_k are orthonormal,

    r1 · (r2 × r3) = Σ_k ε_23k r1 · r_k = ε_231 = 1,

and likewise we find −1 for an improper rotation by Eq. (3).

The Levi-Civita tensor allows the following compact notation for the vector product:

    (a × b)_i = Σ_{j,k} ε_ijk a_j b_k.

For instance,

    (a × b)_2 = ε_213 a1 b3 + ε_231 a3 b1 = −a1 b3 + a3 b1.

Theorem 1
A proper rotation matrix R = (r1, r2, r3) can be factorized as

    R = Rz(ω3) Ry(ω2) Rx(ω1)   (the "xyz-parametrization")

or also

    R = Rz(α) Ry(β) Rz(γ)   (the "Euler parametrization"),

where

    Rz(ϕ) := ( cos ϕ   −sin ϕ   0 )
             ( sin ϕ    cos ϕ   0 )
             (   0        0     1 )

    Ry(ϕ) := (  cos ϕ   0   sin ϕ )
             (    0     1     0   )   (5)
             ( −sin ϕ   0   cos ϕ )

    Rx(ϕ) := ( 1      0        0    )
             ( 0    cos ϕ   −sin ϕ  )
             ( 0    sin ϕ    cos ϕ  )

Proof
We first prove the xyz-parametrization by describing an algorithm for the factorization of R. Consider to that end

    Rz(ω3) Ry(ω2) = ( cos ω3 cos ω2   −sin ω3   cos ω3 sin ω2 )
                    ( sin ω3 cos ω2    cos ω3   sin ω3 sin ω2 )  =: (a1, a2, a3).   (6)
                    (    −sin ω2         0          cos ω2    )

Note that multiplication by Rx(ω1) on the right does not affect the first column, so that r1 = a1. Solve ω2 and ω3 from the first column of R,
    r1 = ( cos ω3 cos ω2 )
         ( sin ω3 cos ω2 )
         (    −sin ω2    )

This is possible. First solve ω2, with −π/2 ≤ ω2 ≤ π/2, from

    sin ω2 = −R_31 ≡ −(r1)_3.

Then solve ω3, with 0 ≤ ω3 ≤ 2π, from

    cos ω3 = R_11 / cos ω2
    sin ω3 = R_21 / cos ω2.
This determines the vectors a2 and a3.
Since a1, a2 and a3 are the columns of a proper rotation matrix [Eq. (6)], they form an orthonormal right-handed system. The plane spanned by a2 and a3 is orthogonal to a1 ≡ r1 and hence contains r2 and r3. Thus,

    (r2, r3) = (a2, a3) ( cos ω1   −sin ω1 )
                        ( sin ω1    cos ω1 )   (7)
Since r2, a2 and a3 are known unit vectors we can compute
    a2 · r2 = cos ω1   (8)
    a3 · r2 = sin ω1.

These equations give ω1 with 0 ≤ ω1 ≤ 2π. Augment the matrix in Eq. (7) to Rx(ω1); then
    R ≡ (r1, r2, r3) = (r1, a2, a3) Rx(ω1)
                     = (a1, a2, a3) Rx(ω1) = Rz(ω3) Ry(ω2) Rx(ω1).
This concludes the proof of the xyz parametrization.
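As a numerical aside (not part of the original text), the factorization algorithm just described is easy to test in Python; all helper names below (mul, Rx, Ry, Rz) are our own:

```python
import math

def Rx(p):
    c, s = math.cos(p), math.sin(p)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def Ry(p):
    c, s = math.cos(p), math.sin(p)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def Rz(p):
    c, s = math.cos(p), math.sin(p)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Build a rotation with known angles, then recover them with the algorithm.
w1, w2, w3 = 0.3, -0.4, 1.1
R = mul(Rz(w3), mul(Ry(w2), Rx(w1)))

# Step 1: omega_2 and omega_3 from the first column of R [Eq. (6)].
o2 = math.asin(-R[2][0])                 # sin(omega_2) = -R_31
o3 = math.atan2(R[1][0] / math.cos(o2), R[0][0] / math.cos(o2))

# Step 2: omega_1 from a2.r2 = cos(omega_1), a3.r2 = sin(omega_1) [Eq. (8)].
A = mul(Rz(o3), Ry(o2))
a2 = [A[i][1] for i in range(3)]
a3 = [A[i][2] for i in range(3)]
r2 = [R[i][1] for i in range(3)]
dot = lambda u, v: sum(x*y for x, y in zip(u, v))
o1 = math.atan2(dot(a3, r2), dot(a2, r2))

print(round(o1, 6), round(o2, 6), round(o3, 6))  # recovers 0.3 -0.4 1.1
```

Note that atan2 returns angles in (−π, π]; to follow the text's convention 0 ≤ ω1, ω3 ≤ 2π one would add 2π to negative results.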
The Euler parametrization is obtained by solving ω2 and ω3 from r3 = a3 and then considering
    (r1, r2) = (a1, a2) ( cos ω1   −sin ω1 )
                        ( sin ω1    cos ω1 )   (9)

or

    a1 · r1 = cos ω1,   a2 · r1 = sin ω1.

Equation (9) can be written as
    (r1, r2, r3) = (a1, a2, r3) Rz(ω1) = Rz(ω3) Ry(ω2) Rz(ω1),

which proves the Euler parametrization.

Note. Some confusion exists about the Euler angles of an improper orthogonal matrix S. One can write S = S′R, where R is proper and has a unique Euler parametrization and S′ is another improper rotation matrix. Different
choices of S′ are possible. Some workers choose S′ = −1 (space inversion), while others choose a reflection, for instance in the xy plane:

    S′ = ( 1   0    0 )
         ( 0   1    0 )
         ( 0   0   −1 )

Since the choice of S′ is usually implicit and clouded by physical arguments, it is not always clear that the choice is only a matter of convention. In any case, one needs an extra convention, added to the Euler convention, to uniquely parametrize an improper rotation matrix.

Yet another parametrization of proper rotation matrices, the (n, ϕ) parametrization, is useful. In order to introduce it, we first prove the existence of a rotation (invariant) axis.

Theorem 2 (Euler's theorem)
A rotation matrix R has at least one invariant vector n, i.e., R n = n. If R has more than one invariant vector, R = 1 (the unit matrix) and any vector is an invariant vector.

Proof
We show that the matrix R has an eigenvalue λ = 1. Since det(R)⁻¹ = det(R⁻¹) = 1, we find, using the rules det(Aᵀ) = det(A) and det(A B) = det(A) det(B),
    det(R − 1) = det((R − 1)ᵀ) = det(Rᵀ − 1) = det(R⁻¹ − 1)
               = det(−R⁻¹(R − 1)) = det(−1) det(R⁻¹) det(R − 1)
               = −det(R − 1).

Hence det(R − 1) = −det(R − 1), so that det(R − 1) = 0, and we conclude that the secular equation det(R − λ1) = 0 has the root λ = 1. The corresponding eigenvector is n.

From linear algebra we know the general result that an m×m matrix A has m orthogonal eigenvectors if and only if it is normal, that is, if A A† = A† A. That is, a normal matrix is unitarily equivalent to a diagonal matrix. Its eigenvectors and eigenvalues may be complex. In the case at hand R, which obviously is normal, is equivalent to a matrix of the form

    ( e^{iφ}      0      0 )
    (    0     e^{−iφ}   0 )
    (    0        0      1 )

because, as we just saw, it has at least one eigenvalue 1. Further, the diagonal matrix is unitary since R is orthogonal. The diagonal elements of a diagonal unitary matrix lie on the unit circle in the complex plane. Finally, det(R) = 1 (the product of the diagonal elements), so that the two complex eigenvalues are each other's complex conjugate. The two corresponding eigenvectors are in general complex. The matrix R has real eigenvectors other than n only if φ = π or φ = 0. For φ = π the eigenvectors change sign and are not invariant. For φ = 0 we have the unit matrix. This proves Theorem 2.

Often one writes a proper rotation matrix as R(n, ϕ), where the invariant vector n is the rotation axis and ϕ is the angle of rotation around n. It is not difficult to give an explicit expression for R(n, ϕ). Consider to that end an arbitrary vector r ≠ a n in R³ (a ∈ R), and decompose it into a component parallel to the invariant unit vector n and a component x⊥ perpendicular to it:

    r = (r · n) n + x⊥   with   x⊥ = r − (r · n) n.   (10)

The vectors r, x⊥ and n are in one plane, while y⊥ ≡ n × r is perpendicular to this plane. The vectors n, x⊥ and y⊥ form a right-handed frame. The vector n has unit length by definition, and the vectors x⊥ and y⊥ both have length √(|r|² − (n · r)²) (which is not necessarily unity). When we rotate r around n its component along n is unaffected and its perpendicular component transforms as
    x⊥ → cos ϕ x⊥ + sin ϕ y⊥.   (11)

Hence,
    R(n, ϕ) r = cos ϕ [r − (r · n) n] + sin ϕ n × r + (r · n) n.   (12)

It is easily verified that for n ≡ (n1, n2, n3)

    n × r = (  0    −n3    n2 )
            (  n3     0   −n1 ) r =: N r.   (13)
            ( −n2    n1     0 )

The dyadic product n ⊗ n is a matrix with i, j element equal to n_i n_j. Evidently,

    (r · n) n = (n ⊗ n) r.   (14)

By direct calculation one shows that

    N² = n ⊗ n − 1.   (15)

By substituting (13), (14) and (15) into (12) we obtain finally
    R(n, ϕ) r = [1 + sin ϕ N + (1 − cos ϕ) N²] r.   (16)

Since r is arbitrary, we have
    R(n, ϕ) = 1 + sin ϕ N + (1 − cos ϕ) N².   (17)
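Equation (17) is easily checked numerically. The following Python sketch (the function names are ours, not the text's) builds R(n, ϕ) from N and applies it to a test vector:

```python
import math

# Sketch of Eq. (17): R(n, phi) = 1 + sin(phi) N + (1 - cos(phi)) N^2,
# where N is the antisymmetric matrix of Eq. (13).
def rotation(n, phi):
    n1, n2, n3 = n
    N = [[0, -n3, n2], [n3, 0, -n1], [-n2, n1, 0]]
    mul = lambda A, B: [[sum(A[i][k]*B[k][j] for k in range(3))
                         for j in range(3)] for i in range(3)]
    N2 = mul(N, N)
    s, c = math.sin(phi), math.cos(phi)
    return [[(1 if i == j else 0) + s*N[i][j] + (1 - c)*N2[i][j]
             for j in range(3)] for i in range(3)]

# Rotation about the z-axis by pi/2 sends e_x to e_y.
R = rotation((0, 0, 1), math.pi / 2)
v = [sum(R[i][j] * [1, 0, 0][j] for j in range(3)) for i in range(3)]
print([round(x, 6) for x in v])  # [0.0, 1.0, 0.0]
```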
Since every rotation matrix ≠ 1 has one, and only one, invariant vector, it follows that A R Aᵀ (with orthogonal A and R) has the invariant vector A n. Indeed, A n = A R n = A R Aᵀ (A n). Since furthermore Tr(R) = Tr(A R Aᵀ) = 2 cos ϕ + 1, the rotation angles of R and A R Aᵀ are equal, and we find the useful expression

    A R(n, ϕ) Aᵀ = R(A n, ϕ).   (18)
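A quick numerical sanity check of Eq. (18); rotation() implements Eq. (17) and all names are ours, not the text's:

```python
import math

def mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(n, phi):
    # Eq. (17): 1 + sin(phi) N + (1 - cos(phi)) N^2
    n1, n2, n3 = n
    N = [[0, -n3, n2], [n3, 0, -n1], [-n2, n1, 0]]
    N2 = mul(N, N)
    s, c = math.sin(phi), math.cos(phi)
    return [[(1 if i == j else 0) + s*N[i][j] + (1 - c)*N2[i][j]
             for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

n = [1/math.sqrt(3)] * 3            # a unit rotation axis
phi = 0.8
A = rotation([0, 1, 0], 0.6)        # some proper rotation A

lhs = mul(A, mul(rotation(n, phi), transpose(A)))
An = [sum(A[i][j]*n[j] for j in range(3)) for i in range(3)]
rhs = rotation(An, phi)

err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
print(err < 1e-12)  # True
```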
This expression enables us to cast the Euler parametrization into a different form. Switching back and forth between the notation Rz(ϕ) ≡ R(e_z, ϕ) and similarly Ry(ψ) ≡ R(e_y, ψ), we find

    Rz(α) Ry(β) Rz(γ) = [Rz(α) Ry(β) R(e_z, γ) (Rz(α) Ry(β))ᵀ]
                        × [Rz(α) R(e_y, β) Rz(α)ᵀ]
                        × R(e_z, α)   (19)
                      = R(e_z″, γ) R(e_y′, β) R(e_z, α),

where e_y′ := Rz(α) e_y and e_z″ := Rz(α) Ry(β) e_z, by Eq. (18).
The right-hand side gives the usual definition of the Euler angles. Consider a body with an orthonormal frame (e_x, e_y, e_z) attached to it and perform the three consecutive rotations:
1. Rotate the body around its z-axis over an angle α; this sends e_y to e_y′ ≡ Rz(α) e_y.

2. Rotate the body around the new y-axis, e_y′, over an angle β. This sends e_z to e_z″ ≡ Ry′(β) e_z, where Ry′(β) ≡ R(e_y′, β).

3. Rotate the body finally around the new z-axis, e_z″, over an angle γ.

Thus
Rz(α) Ry(β) Rz(γ)= Rz′′(γ) Ry′(β) Rz(α). (20)
The left-hand side is useful if one wants to compute the rotation matrix in the Euler parametrization, while the right-hand side corresponds to the geometric definition of the Euler angles.
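The equality (20) of the two orderings can likewise be verified numerically; rotation() below is Eq. (17) and all helper names are ours:

```python
import math

# Check Eq. (20): Rz(alpha) Ry(beta) Rz(gamma) equals the product of
# rotations about the *moved* axes, Rz''(gamma) Ry'(beta) Rz(alpha).
def mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(n, phi):
    # Eq. (17)
    n1, n2, n3 = n
    N = [[0, -n3, n2], [n3, 0, -n1], [-n2, n1, 0]]
    N2 = mul(N, N)
    s, c = math.sin(phi), math.cos(phi)
    return [[(1 if i == j else 0) + s*N[i][j] + (1 - c)*N2[i][j]
             for j in range(3)] for i in range(3)]

alpha, beta, gamma = 0.7, 0.5, -1.2
ez, ey = [0, 0, 1], [0, 1, 0]

Rz_a = rotation(ez, alpha)
lhs = mul(Rz_a, mul(rotation(ey, beta), rotation(ez, gamma)))

# Step 1 sends e_y to e_y'; step 2 rotates about e_y' and sends e_z to e_z''.
ey1 = [sum(Rz_a[i][j]*ey[j] for j in range(3)) for i in range(3)]
Ry_p = rotation(ey1, beta)
ez2 = [sum(Ry_p[i][j]*ez[j] for j in range(3)) for i in range(3)]
rhs = mul(rotation(ez2, gamma), mul(Ry_p, Rz_a))

err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
print(err < 1e-12)  # True
```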
3 The algebra of real antisymmetric matrices
The condition on the real 3×3 matrix A,

    Aᵀ = −A,

has the consequence that A is of the form

    (  0    −a3    a2 )
    (  a3     0   −a1 )
    ( −a2    a1     0 )

(The numbering and signs appearing in this matrix will become clear later.)
At this point we only note that the three real numbers a1, a2 and a3 specify the real antisymmetric matrix A uniquely. In other words, there is a one-to-one correspondence between vectors and antisymmetric 3×3 matrices,
    a = ( a1 )
        ( a2 )   ←→   A.   (21)
        ( a3 )

The set of antisymmetric matrices is a vector space (linear combinations of antisymmetric matrices are again antisymmetric; note that the zero matrix is both symmetric and antisymmetric). Its dimension is 3 and a basis is
    L1 = ( 0   0    0 )    L2 = (  0   0   1 )    L3 = ( 0   −1   0 )
         ( 0   0   −1 )         (  0   0   0 )         ( 1    0   0 )   (22)
         ( 0   1    0 )         ( −1   0   0 )         ( 0    0   0 )

    (L1)_ij = −ε_1ij,   (L2)_ij = −ε_2ij,   (L3)_ij = −ε_3ij.   (23)

The linear independence follows easily:
    a1 L1 + a2 L2 + a3 L3 = (  0    −a3    a2 )
                            (  a3     0   −a1 ) = 0   ⇐⇒   a1 = a2 = a3 = 0.
                            ( −a2    a1     0 )

The fact [Eq. (23)] that the matrix elements of L_k can be written by means of the Levi-Civita tensor follows by inspection.

The space is also an algebra with the commutator as the product. Let Aᵀ = −A and Bᵀ = −B; then it follows that [A, B]ᵀ = −[A, B], and we see that the space is closed under this product. This algebra is a Lie algebra, commonly denoted by so(3). The reason for the designation so(3) is the following: the set of all proper rotation matrices forms a group, SO(3) (the
special orthogonal group in 3 dimensions). Differentiate Rx(ϕ), Ry(ϕ) and Rz(ϕ) [see Eq. (5)]:
    d/dϕ Rx(ϕ) = ( 0      0        0    )
                 ( 0   −sin ϕ   −cos ϕ  )
                 ( 0    cos ϕ   −sin ϕ  )

               = ( 0   0    0 )
                 ( 0   0   −1 ) Rx(ϕ)
                 ( 0   1    0 )

               = Rx(ϕ) ( 0   0    0 )
                       ( 0   0   −1 )   (24)
                       ( 0   1    0 )

               = L1 Rx(ϕ) = Rx(ϕ) L1.

Likewise,

    d/dϕ Ry(ϕ) = L2 Ry(ϕ) = Ry(ϕ) L2   (25)
    d/dϕ Rz(ϕ) = L3 Rz(ϕ) = Rz(ϕ) L3.   (26)

Differentiation at ϕ = 0 brings us from SO(3) to so(3), a linear space tangent to SO(3) and spanned by the L_i.
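Both the derivative relations (24)-(26) and the closure of so(3) under the commutator can be spot-checked in a few lines of Python (helper names are ours):

```python
import math

def mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def Rz(p):
    c, s = math.cos(p), math.sin(p)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

L1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
L2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
L3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

# Central difference [Rz(phi+h) - Rz(phi-h)] / 2h should match L3 Rz(phi).
phi, h = 0.9, 1e-6
num = [[(Rz(phi + h)[i][j] - Rz(phi - h)[i][j]) / (2*h) for j in range(3)]
       for i in range(3)]
ana = mul(L3, Rz(phi))
err = max(abs(num[i][j] - ana[i][j]) for i in range(3) for j in range(3))
print(err < 1e-8)   # True

# Closure of the Lie algebra under the commutator: [L1, L2] = L3.
comm = [[mul(L1, L2)[i][j] - mul(L2, L1)[i][j] for j in range(3)]
        for i in range(3)]
print(comm == L3)   # True
```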
The matrices L_i generate "infinitesimal rotations". Consider, for instance, an infinitesimal angle ∆ϕ [i.e., ∆ϕ ≫ (∆ϕ)²]:

    Rz(ϕ0 + ∆ϕ) = Rz(ϕ0) + ∆ϕ (dRz/dϕ)|_{ϕ=ϕ0}   (two-term Taylor series)   (27)
                = Rz(ϕ0) (1 + ∆ϕ L3),

and it follows that (1 + ∆ϕ L3) represents an infinitesimal rotation of a vector in R³ around the z-axis.

We have seen that there is a 1–1 correspondence between R³ and so(3). To a certain extent this correspondence holds also for orthogonal transformations on both spaces, as is shown in the following theorem.
Theorem 3
Consider a proper or improper rotation matrix B and let

    A = Σ_{i=1}^{3} a_i L_i,   a_i real.   (28)

Then

    B A Bᵀ = |B| Σ_{i=1}^{3} (B a)_i L_i,   |B| := det(B).   (29)

In other words, if A ↔ a then B A Bᵀ ↔ |B| B a; except for the presence of |B|, a rotates as a vector.
Proof
Write B =(b1, b2, b3) and recall from Eq. (4) that
    b_a × b_b = |B| Σ_c ε_abc b_c,

where |B| = ±1 gives the handedness of the set (b1, b2, b3). Consider first

    (Bᵀ L_i B)_ab = Σ_{k,l} B_ka (L_i)_kl B_lb
                  = −Σ_{k,l} B_ka ε_ikl B_lb   [by Eq. (23)]
                  = −Σ_{k,l} ε_ikl (b_a)_k (b_b)_l
                  = −(b_a × b_b)_i = −|B| Σ_c ε_abc (b_c)_i
                  = |B| Σ_c (L_c)_ab B_ic,

so that

    Bᵀ L_i B = |B| Σ_c B_ic L_c.

This is true for any orthogonal B. Substitute B → Bᵀ = B⁻¹; then

    B L_i Bᵀ = |B| Σ_{c=1}^{3} L_c B_ci.   (30)
Since this equation holds for the basis of so(3) it holds for any element of this space.
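Theorem 3 can be spot-checked numerically for one proper and one improper B (a sketch; all helper names are ours):

```python
import math

# Check B A B^T = |B| * (antisymmetric matrix of B a)  [Eq. (29)].
def mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det(A):
    return (A[0][0]*(A[1][1]*A[2][2] - A[1][2]*A[2][1])
          - A[0][1]*(A[1][0]*A[2][2] - A[1][2]*A[2][0])
          + A[0][2]*(A[1][0]*A[2][1] - A[1][1]*A[2][0]))

def antisym(a):
    # Eq. (21): the antisymmetric matrix corresponding to a vector a
    a1, a2, a3 = a
    return [[0, -a3, a2], [a3, 0, -a1], [-a2, a1, 0]]

a = [0.3, -1.1, 0.7]
c, s = math.cos(0.4), math.sin(0.4)
proper = [[c, -s, 0], [s, c, 0], [0, 0, 1]]       # det = +1
improper = [[c, -s, 0], [s, c, 0], [0, 0, -1]]    # det = -1

for B in (proper, improper):
    lhs = mul(B, mul(antisym(a), transpose(B)))
    Ba = [sum(B[i][j]*a[j] for j in range(3)) for i in range(3)]
    rhs = [[det(B)*x for x in row] for row in antisym(Ba)]
    err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
    print(round(det(B)), err < 1e-12)
```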
Computations are often facilitated by the results in the following lemma.
Lemma 2. If a ↔ A and b ↔ B in the sense of Eq. (21), i.e.,

    A = (  0    −a3    a2 )        B = (  0    −b3    b2 )
        (  a3     0   −a1 )            (  b3     0   −b1 )
        ( −a2    a1     0 )            ( −b2    b1     0 )

then

    a × b = A b = −B a.   (31)

Proof
By Eq. (23),
    (a × b)_i = Σ_{k,l} ε_ikl a_k b_l = Σ_{k,l} a_k (L_k)_il b_l = Σ_l (A)_il b_l = (A b)_i
              = −Σ_{k,l} b_l (L_l)_ik a_k = −Σ_k (B)_ik a_k = −(B a)_i.

The proof of the well-known fact that a × b transforms as a pseudovector now follows as an easy corollary of Lemma 2. Let C be orthogonal with determinant |C|. Then

    C (a × b) = C A b = (C A Cᵀ)(C b)   (32)
              = A′ b′ = |C| a′ × b′ = |C| (C a) × (C b),

since by Theorem 3 A′ ≡ C A Cᵀ corresponds with |C| C a ≡ |C| a′, i.e., a′ = C a, and similarly b′ = C b. If |C| = 1 then simultaneous rotation of a and b gives rotation of a × b by C. If C is improper, a × b is rotated but not inverted.
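Lemma 2 and the pseudovector behavior under inversion can both be illustrated in Python; cross() implements the Levi-Civita form of the vector product and all names are ours:

```python
def eps(i, j, k):
    # Levi-Civita symbol for indices in {0, 1, 2} (0-based version of Eq. (2))
    return (j - i) * (k - j) * (k - i) // 2

def cross(a, b):
    # (a x b)_i = sum_{j,k} eps_ijk a_j b_k
    return [sum(eps(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3)) for i in range(3)]

def antisym(v):
    # Eq. (21)
    v1, v2, v3 = v
    return [[0, -v3, v2], [v3, 0, -v1], [-v2, v1, 0]]

a, b = [1, 2, 3], [4, 5, 6]
A, B = antisym(a), antisym(b)

# Lemma 2: a x b = A b = -B a
Ab = [sum(A[i][j]*b[j] for j in range(3)) for i in range(3)]
mBa = [-sum(B[i][j]*a[j] for j in range(3)) for i in range(3)]
print(cross(a, b), Ab, mBa)   # all three agree: [-3, 6, -3]

# Under the inversion C = -1 (improper, |C| = -1), a x b is *not* inverted.
print(cross([-x for x in a], [-y for y in b]) == cross(a, b))   # True
```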
Note. This is an example of a general result: an antisymmetric tensor of rank n − 1 transforms contragrediently (times determinant) to a vector under the unitary group U(n).
4 The kinematics of a rigid body
Consider a system of n point masses m_k, k = 1, ..., n, moving in three-dimensional Euclidean point space (affine space with inner product in coordinate space R³). At time t the masses are at the points P1(t), P2(t), ..., Pn(t). Choose a fixed orthonormal right-handed (laboratory) frame with origin at the point O,

    →e_O := (→e1, →e2, →e3).
Thus, the vector pointing from O to the point mass P_k is represented by p_k(t):

    →OP_k(t) = →e_O p_k(t),   k = 1, ..., n.   (33)
Clearly, the geometric quantity →OP_k(t) is independent of the orientation of the frame at O. The center of mass C(t) of the system is given by
    →OC(t) := (1/M) Σ_{k=1}^{n} m_k →OP_k(t)   with   M := Σ_{k=1}^{n} m_k.   (34)

The vector →OC(t) is represented by c(t): →OC(t) = →e_O c(t). The vector →OP_k(t) can be decomposed as
    →OP_k(t) = →OC(t) + →C(t)P_k(t).   (35)

If the inner products
    →CP_k(t) · →CP_l(t) ≡ ρ_kl,   k, l = 1, ..., n,   (36)

are time-independent (dρ_kl/dt = 0 for all k and l), the system is a rigid body. In the case of 3-dimensional rigid bodies we can attach to the system a frame →f_C with origin in the center of mass. This frame moves with the rigid body; in other words, all particles are represented by time-independent coordinate vectors with respect to this body-fixed frame. The position of the center of mass in Euclidean point space is fixed by three real parameters, and the orientation of the frame requires another three parameters. (These could for instance be the Euler angles describing the rotation of the frame →e_C, which is parallel to the space-fixed frame, to the body-fixed frame →f_C.) In the case of planar (two-dimensional) rigid bodies we can attach only two orthogonal unit vectors to the body, say →f_x and →f_y; a third unit vector can be constructed by taking the vector product of these two. In the case of linear (one-dimensional) systems only one vector can be attached to the system. The two polar angles of this vector with respect to the space-fixed frame give the orientation of the body in space.

Let us consider a 3-dimensional system. It is not difficult to define an orthonormal frame →f_C(t) [for instance by diagonalization of the inertia tensor, see Eq. (58) below] and we obtain
    →f_C(t) = →e_C F(t).   (37)
Here →e_C is the frame obtained by parallel translation of the lab frame →e_O along →OC. Since both →e_C and →f_C(t) are orthonormal frames, we have

    F(t) F(t)ᵀ = 1.   (38)
Assuming that →f_C(t) is right-handed, just as →e_C, it follows that F(t) is a proper rotation matrix. As we have seen in Theorem 1, F(t) is uniquely determined by three angles, for instance the Euler angles α(t), β(t) and γ(t).
The coordinates of any point P_k(t) of the rigid body with respect to →f_C(t) are time-independent. We assume that at t = 0 the space-fixed and body-fixed frames coincide, i.e., F(0) = 1. Introducing the coordinates r_k(0) of the point masses at t = 0, we write
    →CP_k(t) = →f_C(t) r_k(0) = →e_C F(t) r_k(0) = →e_C r_k(t),   (39)

so that the relation between space-fixed [r_k(t)] and body-fixed [r_k(0)] coordinates is
    r_k(t) = F(t) r_k(0),   k = 1, ..., n.   (40)
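As a small illustration of Eq. (40) (a sketch under our own naming), a proper rotation F(t) preserves every inner product ρ_kl of Eq. (36), which is exactly the rigidity condition:

```python
import math

def mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def Rz(p):
    c, s = math.cos(p), math.sin(p)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def Ry(p):
    c, s = math.cos(p), math.sin(p)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def apply(F, r):
    # space-fixed coordinates r_k(t) = F(t) r_k(0)  [Eq. (40)]
    return [sum(F[i][j]*r[j] for j in range(3)) for i in range(3)]

dot = lambda u, v: sum(x*y for x, y in zip(u, v))

# Three body-fixed points r_k(0) and an Euler-parametrized F(t) [Eq. (5)].
body = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.5, 0.5, 1.0]]
F = mul(Rz(0.9), mul(Ry(0.4), Rz(-0.3)))
space = [apply(F, r) for r in body]

# All inner products rho_kl of Eq. (36) are unchanged.
err = max(abs(dot(space[k], space[l]) - dot(body[k], body[l]))
          for k in range(3) for l in range(3))
print(err < 1e-12)  # True
```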
Returning to the space-fixed frame at O and defining the point P_k′ so that →OP_k′ is parallel to →CP_k, and using that in Euclidean space these parallel vectors are represented with respect to parallel frames by the same column vector, i.e.,

    →OP_k′ := →e_O r_k(t),   (41)

we find
    →OP_k = →OC + →CP_k = →OC + →OP_k′ = →e_O [c(t) + r_k(t)].   (42)