EXACT LOW-ORDER POLYNOMIAL EXPRESSIONS TO COMPUTE THE KOLMOGOROFF-NAGUMO MEAN IN THE AFFINE SYMPLECTIC GROUP OF OPTICAL TRANSFERENCE MATRICES
SIMONE FIORI∗‡ AND STILIAN PRIFTI†
Abstract. The current contribution presents exact third-order polynomial expressions of matrix functions that arise in the computation of the Kolmogoroff-Nagumo mean of a set of optical transference matrices, which belong to the affine symplectic group ASp(4).
Key words. Affine symplectic group of matrices; Exact low-order polynomial representation; Kolmogoroff- Nagumo mean; Linear optical transference.
1. Introduction. The group of real symplectic matrices Sp(2n), with n ∈ N and n ≥ 1, is an instance of a quadratic matrix group [14]. Symplectic matrices find wide applications in sciences and engineering. Noteworthy examples of applications are found in vibration analysis [3], in optimal control [2, 15], in electromagnetism [16] and in computational optics [7]. The present contribution is motivated by an application in linear optics: When a ray of light passes through a lens (such as the lens in a telescope) the ray is refracted and the change in direction may be described by a symplectic matrix. The recent contribution [8] deals with the computation of the Kolmogoroff-Nagumo mean of a set of centered linear optical systems. Each of such systems is described by a 4 × 4 real symplectic matrix, namely, by an element of the Lie group Sp(4), termed astigmatic matrix. Given a set of N astigmatic matrices S_n ∈ Sp(4), its Kolmogoroff-Nagumo mean is denoted by:
  \bar S_{KN}^{ϕ} := ϕ\Big( \frac{1}{N} \sum_{n=1}^{N} ϕ^{-1}(S_n) \Big),   (1.1)
where ϕ^{-1} : Sp(4) → sp(4), with sp(4) denoting the Lie algebra associated with the Lie group Sp(4), formed by all the 4 × 4 Hamiltonian matrices, and ϕ : sp(4) → Sp(4) denotes its inverse. (In general, the function ϕ^{-1} is defined only locally, in an open set U_ϕ ⊂ Sp(4).) The rationale of the Kolmogoroff-Nagumo mean is that, while the symplectic group Sp(4) is a curved manifold, hence it is impossible to take an arithmetic mean over such a space directly, its Lie algebra is a linear space, where an arithmetic mean is well-defined. Therefore, the symplectic matrices S_n are first mapped to the Lie algebra through a suitable map ϕ^{-1} and the result of arithmetic averaging is brought back to the Lie group through its inverse ϕ. In fact, the actual calculations are carried out in the Lie algebra. This is the approach usually adopted, for instance, in the numerical integration of ODEs on Lie groups (see, for instance, [4, 5]). In the contribution [8], the following pairs of maps were considered:
• The map ϕ^{-1} = log, namely, the matrix logarithm. Its inverse is ϕ = exp, namely, the
∗Dipartimento di Ingegneria dell'Informazione, Università Politecnica delle Marche, Via Brecce Bianche, I-60131 Ancona (Italy). †School of Mechanical Engineering, Università Politecnica delle Marche, Via Brecce Bianche, I-60131 Ancona (Italy). ‡Corresponding author. eMail: s.fi[email protected]. Cite this paper as: S. Fiori and S. Prifti, "Exact low-order polynomial expressions to compute the Kolmogoroff-Nagumo mean in the affine symplectic group of optical transference matrices", Linear and Multilinear Algebra, Vol. 65, No. 4, pp. 840–856, 2017.
matrix exponential. Such maps are defined as matrix-to-matrix series, namely:
  \exp(X) := I + \sum_{k=1}^{∞} \frac{X^k}{k!},   (1.2)

  \log(X) := − \sum_{k=1}^{∞} \frac{(I − X)^k}{k}.   (1.3)

The series for the logarithmic map converges as long as ‖X − I‖ < 1. Given a symplectic matrix S ∈ Sp(4), its logarithm log(S) is defined by the matrix-power series (1.3) and is hence cumbersome to evaluate; likewise, given a Hamiltonian matrix H ∈ sp(4), its exponential exp(H) is cumbersome to compute by the series (1.2).
• The map ϕ^{-1} = cay, namely, the matrix Cayley transform, defined as
  cay(X) := (I + X)(I − X)^{-1},   (1.4)
well-defined if X − I ∈ GL(n) := {Y ∈ R^{n×n} | det(Y) ≠ 0}. Since the Cayley transform is self-inverse, the inverse map is ϕ = cay. In particular, the contribution [8] recalled how an analytic function ϕ of a 4 × 4 real-valued matrix may be calculated exactly as a third-order polynomial, namely:
  ϕ(H) = ϕ_3 H^3 + ϕ_2 H^2 + ϕ_1 H + ϕ_0 I,   (1.5)
  ϕ^{-1}(S) = ϕ_3^{(-1)} S^3 + ϕ_2^{(-1)} S^2 + ϕ_1^{(-1)} S + ϕ_0^{(-1)} I,   (1.6)
and gave explicit formulas for the coefficients ϕ_0, ϕ_1, ϕ_2, ϕ_3 and ϕ_0^{(-1)}, ϕ_1^{(-1)}, ϕ_2^{(-1)}, ϕ_3^{(-1)} as functions of the eigenvalues of the matrices H and S, respectively, both for the exp/log maps-pair and for the cay/cay maps-pair. Since the eigenvalue structure of a Hamiltonian matrix and of a symplectic matrix presents several possible cases, several cases of interest were classified and solved for, separately. The Kolmogoroff-Nagumo mean may be calculated for every matrix Lie group and it coincides with the output of the first step, with a specific starting point (namely, the group identity), of a general averaging algorithm developed in [10]. In order to compare the Kolmogoroff-Nagumo mean with the general algorithm presented in [10], it is also necessary to identify the pair (ϕ, ϕ^{-1}) with a pseudo-retraction/pseudo-lifting maps pair. A further generalization along this line is that the pair of conjugate functions (ϕ, ϕ^{-1}) may be replaced with a pair (ϕ, ψ), such that the composition ϕ ◦ ψ behaves as an approximate identity [9], with the aim of lightening the computational burden associated with the Kolmogoroff-Nagumo mean. It is worth underlining that not every Kolmogoroff-Nagumo-like averaging rule is an instance of the Lie-group averaging scheme (1.1). This is the case, for example, of the resolvent average \bar S^{res} := ϕ\big( \sum_{n=1}^{N} ϕ^{-1}(S_n)/N \big) with ϕ(H) := H^{-1} − I introduced in [1]. Since ϕ does not map an sp(4)-matrix into an Sp(4)-matrix, such an averaging rule does not result to be an instance of the Lie-group-type Kolmogoroff-Nagumo averaging rule (1.1). The Kolmogoroff-Nagumo mean as well as taking averages over the real symplectic group are special cases of a more general problem, namely, averaging over curved manifolds (see, for instance, the study [6]). The above setting is of limited scope in the sense that it was developed for centered optical systems only. In the presence of decentered optical systems, the above setting is no longer suitable.
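The Lie-group averaging scheme (1.1) for centered systems can be sketched numerically. The following is a NumPy sketch, not the paper's own MATLAB implementation (see Appendix A); the helper name `random_hamiltonian`, the random data and the seed are ours. It averages symplectic samples through the exp/log pair and checks the basic properties of the cay/cay pair.

```python
import numpy as np
from scipy.linalg import expm, logm

# Standard symplectic form on R^4 and a helper producing random Hamiltonian
# matrices H = [[A, B], [C, -A^T]] with B, C symmetric (construction is ours).
J = np.block([[np.zeros((2, 2)), np.eye(2)],
              [-np.eye(2), np.zeros((2, 2))]])
rng = np.random.default_rng(0)

def random_hamiltonian(scale=0.1):
    A = rng.standard_normal((2, 2))
    B = rng.standard_normal((2, 2)); B = (B + B.T) / 2
    C = rng.standard_normal((2, 2)); C = (C + C.T) / 2
    return scale * np.block([[A, B], [C, -A.T]])

# Samples in Sp(4), generated through the exponential map.
S_list = [expm(random_hamiltonian()) for _ in range(5)]
for S in S_list:
    assert np.allclose(S.T @ J @ S, J)          # each sample is symplectic

# Kolmogoroff-Nagumo mean (1.1) with phi = exp, phi^{-1} = log: pull back to
# the Lie algebra, average arithmetically, push forward to the group.
H_bar = sum(logm(S).real for S in S_list) / len(S_list)
S_bar = expm(H_bar)
assert np.allclose(S_bar.T @ J @ S_bar, J)      # the mean is symplectic too

# The Cayley transform (1.4) also maps sp(4) into Sp(4); its inverse is the
# Cayley-type map S |-> (S - I)(S + I)^{-1}.
I4 = np.eye(4)
H = random_hamiltonian()
Sc = (I4 + H) @ np.linalg.inv(I4 - H)
assert np.allclose(Sc.T @ J @ Sc, J)
assert np.allclose((Sc - I4) @ np.linalg.inv(Sc + I4), H)
```

The small scaling factor keeps the samples inside the open set where the principal matrix logarithm inverts the exponential.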
In fact, a general optical system cannot be represented only in terms of an astigmatic matrix, but it is necessary to call for an augmented optical transference matrix of the form:

  T = \begin{pmatrix} S & δ \\ 0 & 1 \end{pmatrix},   (1.7)

where S ∈ Sp(4) is the astigmatic submatrix and δ ∈ R^4 represents a shift of the optical system with respect to its center. The resulting 5 × 5 matrix belongs to the affine symplectic group ASp(4), that is a Lie group under standard matrix multiplication/inverse, whose Lie algebra is denoted as asp(4). For a review of the (affine) symplectic group, see, e.g., [11]. A generic element L of the algebra asp(4) presents the following structure:
  L = \begin{pmatrix} H & v \\ 0 & 0 \end{pmatrix},   (1.8)
with H ∈ sp(4) and v ∈ R^4. The exponential-logarithm-based Kolmogoroff-Nagumo mean of a set of N optical system transference matrices T_n reads:
  \bar T_{KN}^{\exp} := \exp\Big( \frac{1}{N} \sum_{n=1}^{N} \log(T_n) \Big),   (1.9)

while the Cayley-Cayley-based Kolmogoroff-Nagumo mean of a set of N optical system transference matrices T_n reads
  \bar T_{KN}^{cay} := −cay\Big( \frac{1}{N} \sum_{n=1}^{N} cay(−T_n) \Big).   (1.10)

The reason for the change of sign in the definition (1.10) is that, from the general structure of an optical transference matrix (1.7), it follows immediately that the matrix I − T is not invertible, while the matrix I + T is surely invertible. The maps ϕ and ϕ^{-1} considered in the previous definitions may be evaluated on the basis of exponential/logarithmic/Cayley maps applied to symplectic and Hamiltonian matrices. In fact, it holds that:

  \exp(L) = \begin{pmatrix} \exp(H) & M_1(H)v \\ 0 & 1 \end{pmatrix},  \log(T) = \begin{pmatrix} \log(S) & M_2(S)δ \\ 0 & 0 \end{pmatrix},   (1.11)

  cay(L) = \begin{pmatrix} cay(H) & M_3(H)v \\ 0 & 1 \end{pmatrix},  cay(−T) = \begin{pmatrix} cay(−S) & −(cay(−S) + I)δ/2 \\ 0 & 0 \end{pmatrix},   (1.12)
where the matrix functions M1, M2 and M3 are defined as:
  M_1 : sp(4) → Sp(4),  M_1(H) := (\exp(H) − I)H^{-1},   (1.13)
  M_2 : Sp(4) → sp(4),  M_2(S) := \log(S)(S − I)^{-1},   (1.14)
  M_3 : sp(4) → Sp(4),  M_3(H) := (cay(H) − I)H^{-1} = 2H(I − H)^{-1}H^{-1}.   (1.15)
The function M_1 is well-defined only if H ∈ GL(4), the matrix function M_3 is well-defined only if H ∈ GL(4) and H − I ∈ GL(4), while the function M_2 is well-defined only if S − I ∈ GL(4). Since the matrix-functions H^{-1} and (I − H)^{-1} are analytic within the existence field of the function M_3, and since analytic matrix-functions of the same matrix-variable commute, it holds that 2H(I − H)^{-1}H^{-1} = 2HH^{-1}(I − H)^{-1} = 2(I − H)^{-1}, hence:
  M_3(H) = 2(I − H)^{-1}.   (1.16)
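The block formulas (1.11) and the simplification (1.16) are easy to confirm numerically. A NumPy sketch (variable names and the random instance are ours; the paper's reference implementation, per Appendix A, is in MATLAB):

```python
import numpy as np
from scipy.linalg import expm

# Assemble L in asp(4) from a Hamiltonian block H and an offset v, as in
# (1.8); H is invertible with probability one for random data.
rng = np.random.default_rng(1)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2)); B = (B + B.T) / 2
C = rng.standard_normal((2, 2)); C = (C + C.T) / 2
H = 0.3 * np.block([[A, B], [C, -A.T]])
v = rng.standard_normal((4, 1))
L = np.block([[H, v], [np.zeros((1, 5))]])

I4 = np.eye(4)
M1 = (expm(H) - I4) @ np.linalg.inv(H)         # definition (1.13)
M3 = 2 * np.linalg.inv(I4 - H)                 # simplified form (1.16)

# exp(L) has the block structure (1.11):
E = expm(L)
assert np.allclose(E[:4, :4], expm(H))         # top-left block
assert np.allclose(E[:4, 4:], M1 @ v)          # top-right block is M1(H)v
assert np.allclose(E[4, :], np.r_[np.zeros(4), 1.0])

# and (1.16) agrees with the defining expression (1.15):
cayH = (I4 + H) @ np.linalg.inv(I4 - H)
assert np.allclose((cayH - I4) @ np.linalg.inv(H), M3)
```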
The functions M_1, M_2 and M_3 differ from the functions exp, log and cay, hence, in the present work, we propose to compute them explicitly as third-order polynomials. For the evaluation of a term like cay(−T) no special function is needed, because the evaluation of the quantity cay(−S) was already covered in the previous contribution [8].
2. Third-order polynomials to compute 4×4-matrix functions. The method presented in the current paper to compute a non-linear matrix function by means of a low-order polynomial is a special case of a general method based on the Lagrange generalized polynomials recalled, for example, in the Appendix A of the book [12]. An analytic function of a 4 × 4 matrix may be evaluated through an exact third-order polynomial formula whose coefficients depend on the eigenvalues of the matrix. Let X denote a 4 × 4 real-valued matrix. The characteristic polynomial associated with the matrix X is p_X(z) := det(zI − X), where z ∈ C denotes a scalar complex-valued variable, which is a monic polynomial of degree 4:

  p_X(z) = z^4 + p_3 z^3 + p_2 z^2 + p_1 z + p_0,   (2.1)

where p_3, p_2, p_1, p_0 ∈ R are the scalar coefficients of the characteristic polynomial (that coincide, up to a change of sign, with the principal invariants associated to the matrix X and may be computed through Newton's identities [17]) and the roots of the polynomial (2.1) are the eigenvalues of the matrix X. Namely, denoting with λ_X an eigenvalue of the matrix X, it holds that
pX (λX ) = 0. (2.2) The Cayley-Hamilton theorem states that each matrix satisfies its own characteristic equation. In the present context, the Cayley-Hamilton theorem casts as:
  P_X(X) := X^4 + p_3 X^3 + p_2 X^2 + p_1 X + p_0 I = 0.   (2.3)

By definition, any analytic function may be expanded as a power series. Moreover, to any matrix-to-matrix function F may be associated a scalar-to-scalar function by replacing the matrix argument with a scalar argument. Hence, the matrix-to-matrix analytic function F(X) may be thought of as a polynomial f(z) in the variable z ∈ C. The polynomial f(z) may be written as:
f(z) = q(z)pX (z) + r(z), (2.4) where q(z) is the quotient polynomial and r(z) is the remainder polynomial of degree (at most) 3, namely:
  r(z) = f_3 z^3 + f_2 z^2 + f_1 z + f_0,   (2.5)

where f_3, f_2, f_1, f_0 ∈ R are the scalar coefficients of the polynomial r(z). By setting z = λ_X in the equation (2.4), thanks to the identity (2.2), it is readily seen that:
f(λX ) = r(λX ). (2.6) The corresponding relationship in the matrix variable X reads:
F (X) = Q(X)PX (X) + R(X). (2.7)
Since, by the Cayley-Hamilton theorem (2.3), it holds that PX (X) = 0, from the equation (2.7) it follows that F (X) = R(X). Therefore, the matrix function F (X) may be evaluated, exactly, as a third-order matrix polynomial, namely, as:
  F(X) = f_3 X^3 + f_2 X^2 + f_1 X + f_0 I,   (2.8)

where the scalar coefficients of the polynomial expression of the function F(X) are to be sought. The key point in the computation of the coefficients f_3, f_2, f_1, f_0 in the polynomial expression (2.8) is that they are also the coefficients of the polynomial expression of the associated scalar-to-scalar function f(z). Therefore, one may exploit the identity (2.6) to calculate such coefficients. For symmetry reasons, it is known that the eigenvalues of a Hamiltonian and of a symplectic 4 × 4 matrix may appear with multiplicity 1, 2 or 4. Given a matrix-to-matrix analytic function F (and its associated scalar counterpart f), given the four eigenvalues of the matrix X, namely λ_X^a, λ_X^b, λ_X^c, λ_X^d, and supposing that they are all distinct, the coefficients f_3, f_2, f_1, f_0 are found by solving the following linear system:
  (λ_X^a)^3 f_3 + (λ_X^a)^2 f_2 + λ_X^a f_1 + f_0 = f(λ_X^a),
  (λ_X^b)^3 f_3 + (λ_X^b)^2 f_2 + λ_X^b f_1 + f_0 = f(λ_X^b),
  (λ_X^c)^3 f_3 + (λ_X^c)^2 f_2 + λ_X^c f_1 + f_0 = f(λ_X^c),   (2.9)
  (λ_X^d)^3 f_3 + (λ_X^d)^2 f_2 + λ_X^d f_1 + f_0 = f(λ_X^d).

In matrix format, the above linear system casts as
  \begin{pmatrix} 1 & λ_X^a & (λ_X^a)^2 & (λ_X^a)^3 \\ 1 & λ_X^b & (λ_X^b)^2 & (λ_X^b)^3 \\ 1 & λ_X^c & (λ_X^c)^2 & (λ_X^c)^3 \\ 1 & λ_X^d & (λ_X^d)^2 & (λ_X^d)^3 \end{pmatrix} \begin{pmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{pmatrix} = \begin{pmatrix} f(λ_X^a) \\ f(λ_X^b) \\ f(λ_X^c) \\ f(λ_X^d) \end{pmatrix},   (2.10)

where the 4 × 4 matrix on the left-hand side is denoted by V and the right-hand side by g. The matrix V of the coefficients is square Vandermonde. The coefficients of the polynomial (2.8) are found by inverting the matrix V and then by computing V^{-1} g. The inversion of a square Vandermonde matrix is facilitated by its LU decomposition, that is given explicitly for the 4 × 4 case, e.g., in the contribution [13]. The explicit solution of the above Vandermonde-type system is:

  f_0 = \frac{f(λ_X^b) λ_X^a λ_X^c λ_X^d}{(λ_X^a − λ_X^b)(λ_X^b − λ_X^c)(λ_X^b − λ_X^d)} − \frac{f(λ_X^a) λ_X^b λ_X^c λ_X^d}{(λ_X^a − λ_X^b)(λ_X^a − λ_X^c)(λ_X^a − λ_X^d)} − \frac{f(λ_X^c) λ_X^a λ_X^b λ_X^d}{(λ_X^a − λ_X^c)(λ_X^b − λ_X^c)(λ_X^c − λ_X^d)} + \frac{f(λ_X^d) λ_X^a λ_X^b λ_X^c}{(λ_X^a − λ_X^d)(λ_X^b − λ_X^d)(λ_X^c − λ_X^d)},

  f_1 = \frac{f(λ_X^a)(λ_X^b λ_X^c + λ_X^b λ_X^d + λ_X^c λ_X^d)}{(λ_X^a − λ_X^b)(λ_X^a − λ_X^c)(λ_X^a − λ_X^d)} − \frac{f(λ_X^b)(λ_X^a λ_X^c + λ_X^a λ_X^d + λ_X^c λ_X^d)}{(λ_X^a − λ_X^b)(λ_X^b − λ_X^c)(λ_X^b − λ_X^d)} + \frac{f(λ_X^c)(λ_X^a λ_X^b + λ_X^a λ_X^d + λ_X^b λ_X^d)}{(λ_X^a − λ_X^c)(λ_X^b − λ_X^c)(λ_X^c − λ_X^d)} − \frac{f(λ_X^d)(λ_X^a λ_X^b + λ_X^a λ_X^c + λ_X^b λ_X^c)}{(λ_X^a − λ_X^d)(λ_X^b − λ_X^d)(λ_X^c − λ_X^d)},   (2.11)

  f_2 = \frac{f(λ_X^b)(λ_X^a + λ_X^c + λ_X^d)}{(λ_X^a − λ_X^b)(λ_X^b − λ_X^c)(λ_X^b − λ_X^d)} − \frac{f(λ_X^a)(λ_X^b + λ_X^c + λ_X^d)}{(λ_X^a − λ_X^b)(λ_X^a − λ_X^c)(λ_X^a − λ_X^d)} − \frac{f(λ_X^c)(λ_X^a + λ_X^b + λ_X^d)}{(λ_X^a − λ_X^c)(λ_X^b − λ_X^c)(λ_X^c − λ_X^d)} + \frac{f(λ_X^d)(λ_X^a + λ_X^b + λ_X^c)}{(λ_X^a − λ_X^d)(λ_X^b − λ_X^d)(λ_X^c − λ_X^d)},

  f_3 = \frac{f(λ_X^a)}{(λ_X^a − λ_X^b)(λ_X^a − λ_X^c)(λ_X^a − λ_X^d)} − \frac{f(λ_X^b)}{(λ_X^a − λ_X^b)(λ_X^b − λ_X^c)(λ_X^b − λ_X^d)} + \frac{f(λ_X^c)}{(λ_X^a − λ_X^c)(λ_X^b − λ_X^c)(λ_X^c − λ_X^d)} − \frac{f(λ_X^d)}{(λ_X^a − λ_X^d)(λ_X^b − λ_X^d)(λ_X^c − λ_X^d)}.

Whenever the eigenvalues of the matrix X are not distinct, repeating the equations (2.5) and (2.6) leads to an unsolvable (under-determined) linear system, of little use. In this case, it is necessary to evaluate the polynomial expression of the derivatives of the function f.
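Before turning to repeated eigenvalues, the distinct-eigenvalue procedure (2.10)-(2.11) can be checked numerically. A NumPy sketch (our variable names; f = exp for concreteness), solving the Vandermonde system directly instead of using the closed-form solution (2.11):

```python
import numpy as np
from scipy.linalg import expm

# Cubic coefficients of F = exp for a 4x4 matrix X with (almost surely)
# distinct eigenvalues, obtained by solving the Vandermonde system (2.10).
rng = np.random.default_rng(3)
X = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(X)

V = np.vander(lam, 4, increasing=True)   # rows [1, l, l^2, l^3]
g = np.exp(lam)                          # f evaluated at the eigenvalues
f0, f1, f2, f3 = np.linalg.solve(V, g)

# The third-order polynomial (2.8) reproduces the matrix exponential exactly
# (up to rounding), even though the coefficients may be complex-conjugate
# combinations for complex eigenvalue pairs.
F = f3 * X @ X @ X + f2 * X @ X + f1 * X + f0 * np.eye(4)
assert np.allclose(F.real, expm(X), atol=1e-8)
```

Note that for nearly coincident eigenvalues the Vandermonde matrix becomes ill-conditioned, which is precisely why the confluent systems below are needed.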
For example, starting over from equation (2.4) and computing the analytic derivative with respect to the variable z of both sides, gives:
  f'(z) = q'(z) p_X(z) + q(z) p_X'(z) + r'(z).   (2.12)
Upon evaluating such expression in an eigenvalue λ_X, supposed of algebraic multiplicity equal to 2, the first term on the right-hand side vanishes because p_X(λ_X) = 0 and the second term on the right-hand side vanishes because p_X'(λ_X) = 0 as well. Therefore, for an eigenvalue of multiplicity 2, it holds that
  f'(λ_X) = r'(λ_X) = 3(λ_X)^2 f_3 + 2λ_X f_2 + f_1.   (2.13)
Hence, for a matrix X with two distinct eigenvalues λ_X^a, λ_X^b, both of multiplicity 2, the resolving linear system reads:
  (λ_X^a)^3 f_3 + (λ_X^a)^2 f_2 + λ_X^a f_1 + f_0 = f(λ_X^a),
  3(λ_X^a)^2 f_3 + 2λ_X^a f_2 + f_1 = f'(λ_X^a),
  (λ_X^b)^3 f_3 + (λ_X^b)^2 f_2 + λ_X^b f_1 + f_0 = f(λ_X^b),   (2.14)
  3(λ_X^b)^2 f_3 + 2λ_X^b f_2 + f_1 = f'(λ_X^b).
In matrix format, the above linear system casts as
  \begin{pmatrix} 1 & λ_X^a & (λ_X^a)^2 & (λ_X^a)^3 \\ 0 & 1 & 2λ_X^a & 3(λ_X^a)^2 \\ 1 & λ_X^b & (λ_X^b)^2 & (λ_X^b)^3 \\ 0 & 1 & 2λ_X^b & 3(λ_X^b)^2 \end{pmatrix} \begin{pmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{pmatrix} = \begin{pmatrix} f(λ_X^a) \\ f'(λ_X^a) \\ f(λ_X^b) \\ f'(λ_X^b) \end{pmatrix}.   (2.15)
In the mixed case in which there are two distinct eigenvalues λ_X^a and λ_X^b with multiplicity 1 and an eigenvalue λ_X^c with multiplicity 2, the resolving linear system reads:
  (λ_X^a)^3 f_3 + (λ_X^a)^2 f_2 + λ_X^a f_1 + f_0 = f(λ_X^a),
  (λ_X^b)^3 f_3 + (λ_X^b)^2 f_2 + λ_X^b f_1 + f_0 = f(λ_X^b),
  (λ_X^c)^3 f_3 + (λ_X^c)^2 f_2 + λ_X^c f_1 + f_0 = f(λ_X^c),   (2.16)
  3(λ_X^c)^2 f_3 + 2λ_X^c f_2 + f_1 = f'(λ_X^c).
In matrix format, the above linear system casts as
  \begin{pmatrix} 1 & λ_X^a & (λ_X^a)^2 & (λ_X^a)^3 \\ 1 & λ_X^b & (λ_X^b)^2 & (λ_X^b)^3 \\ 1 & λ_X^c & (λ_X^c)^2 & (λ_X^c)^3 \\ 0 & 1 & 2λ_X^c & 3(λ_X^c)^2 \end{pmatrix} \begin{pmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{pmatrix} = \begin{pmatrix} f(λ_X^a) \\ f(λ_X^b) \\ f(λ_X^c) \\ f'(λ_X^c) \end{pmatrix}.   (2.17)
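The mixed-case system (2.16)-(2.17) can be exercised on a concrete matrix. A NumPy sketch (illustrative eigenvalues and the Jordan-block construction are ours), again with f = exp so that the result can be compared against a library routine:

```python
import numpy as np
from scipy.linalg import expm, block_diag

# Eigenvalues a, b simple and c with algebraic multiplicity 2, the latter
# realized by a genuine 2x2 Jordan block so the derivative row matters.
a, b, c = 0.5, -0.3, 0.2
X = block_diag(a, b, np.array([[c, 1.0], [0.0, c]]))

V = np.array([
    [1, a, a**2, a**3],
    [1, b, b**2, b**3],
    [1, c, c**2, c**3],
    [0, 1, 2*c, 3*c**2],      # derivative row for the double eigenvalue
])
g = np.array([np.exp(a), np.exp(b), np.exp(c), np.exp(c)])  # f'(c) = e^c
f0, f1, f2, f3 = np.linalg.solve(V, g)

F = f3 * X @ X @ X + f2 * X @ X + f1 * X + f0 * np.eye(4)
assert np.allclose(F, expm(X))
```

Matching the derivative at the repeated eigenvalue is what makes the cubic reproduce the exponential on the nilpotent part of the Jordan block.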
Likewise, for an eigenvalue λ_X of algebraic multiplicity 4, it holds that f''(λ_X) = r''(λ_X), namely:
  f''(λ_X) = 6λ_X f_3 + 2f_2,   (2.18)
and that f'''(λ_X) = r'''(λ_X), namely:
  f'''(λ_X) = 6f_3.   (2.19)
Hence, for a matrix X with an eigenvalue λ_X of multiplicity 4, the resolving linear system reads:
  (λ_X)^3 f_3 + (λ_X)^2 f_2 + λ_X f_1 + f_0 = f(λ_X),
  3(λ_X)^2 f_3 + 2λ_X f_2 + f_1 = f'(λ_X),
  6λ_X f_3 + 2f_2 = f''(λ_X),   (2.20)
  6f_3 = f'''(λ_X),

whose explicit solution is:

  f_0 = f(λ_X) − λ_X f'(λ_X) + \frac{1}{2} λ_X^2 f''(λ_X) − \frac{1}{6} λ_X^3 f'''(λ_X),
  f_1 = f'(λ_X) − λ_X f''(λ_X) + \frac{1}{2} λ_X^2 f'''(λ_X),   (2.21)
  f_2 = \frac{1}{2} f''(λ_X) − \frac{1}{2} λ_X f'''(λ_X),
  f_3 = \frac{1}{6} f'''(λ_X).
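The explicit solution (2.21) can be verified on the extreme case of a single 4 × 4 Jordan block, for which all four Hermite conditions are active. A NumPy sketch with f = exp and an illustrative eigenvalue of our choosing:

```python
import numpy as np
from scipy.linalg import expm

# Single Jordan block J_4(lam): one eigenvalue of algebraic multiplicity 4.
lam = 0.7
X = lam * np.eye(4) + np.diag(np.ones(3), 1)

# For f = exp, all derivatives at lam equal e^lam; plug into (2.21).
e = np.exp(lam)
f0 = e - lam*e + lam**2*e/2 - lam**3*e/6
f1 = e - lam*e + lam**2*e/2
f2 = e/2 - lam*e/2
f3 = e/6

F = f3 * X @ X @ X + f2 * X @ X + f1 * X + f0 * np.eye(4)
assert np.allclose(F, expm(X))
```

In this case the cubic (2.8) reduces to the truncated Taylor expansion of exp about the eigenvalue, which is exact because (X − λI)^4 = 0.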
The polynomial expressions to compute the special functions M1(H), M2(S) and M3(H), along with the associated scalar generating functions to compute the coefficients of the polynomials, are:
  M_1(H) = a_3 H^3 + a_2 H^2 + a_1 H + a_0 I,  m_1(z) := \frac{\exp(z) − 1}{z},   (2.22)
  M_2(S) = b_3 S^3 + b_2 S^2 + b_1 S + b_0 I,  m_2(z) := \frac{\log(z)}{z − 1},   (2.23)
  M_3(H) = c_3 H^3 + c_2 H^2 + c_1 H + c_0 I,  m_3(z) := \frac{2}{1 − z}.   (2.24)
While the functions m_1 and m_2 may be extended by continuity at their singularities, in fact:
  \lim_{z→0} \frac{\exp(z) − 1}{z} = 1,  \lim_{z→1} \frac{\log(z)}{z − 1} = 1,
the function m_3 has a pole in z = 1. The computation of the three sets of four coefficients is carried out and discussed in the following sections for some cases of interest. The Appendix A provides some details about the MATLAB© implementation of the evaluation of the polynomial expressions for the three matrix functions M_1(H), M_2(S) and M_3(H) as well as for the functions log(T), exp(L) and cay(L).
3. Computation of the coefficients of the polynomial expression of the functions M1(H) and M3(H). The matrix functions M_1 and M_3, as defined in (1.13) and (1.15), apply to a 4 × 4 Hamiltonian matrix. In Section 2, it was pointed out that the coefficients of the polynomial expression of the quantities M_1(H) and of M_3(H) depend on the eigenvalues of the matrix H in argument. The four eigenvalues of a 4 × 4 Hamiltonian matrix come in complex-valued antipodal pairs, namely λ^a, −λ^a, λ^b, −λ^b ∈ C, and are computed by the expression [8]:

  ±\frac{1}{2}\sqrt{\operatorname{tr}(H^2) ± \sqrt{\operatorname{tr}^2(H^2) − 16\det(H)}}.   (3.1)

In order to determine the coefficients of the polynomial expressions, it is necessary to distinguish between the cases that the complex-valued numbers λ^a, λ^b are distinct or coincident. In the following sections, we cover three cases, namely:
• Four distinct eigenvalues of algebraic multiplicity 1;
• Two distinct eigenvalues, case (λ_H, −λ_H, λ_H, −λ_H), with λ_H ∈ C − {0};
• Three distinct eigenvalues, case (λ_H, −λ_H, 0, 0), with λ_H ∈ C − {0}.
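The eigenvalue expression (3.1) can be cross-checked numerically by verifying that each value returned by the formula annihilates the characteristic polynomial of a random Hamiltonian matrix. A NumPy sketch (the construction of H and the variable names are ours):

```python
import numpy as np

# Random Hamiltonian matrix H = [[A, B], [C, -A^T]], B and C symmetric.
rng = np.random.default_rng(4)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2)); B = (B + B.T) / 2
C = rng.standard_normal((2, 2)); C = (C + C.T) / 2
H = np.block([[A, B], [C, -A.T]])

# Formula (3.1): +/- (1/2) sqrt( tr(H^2) +/- sqrt( tr^2(H^2) - 16 det(H) ) ).
t = np.trace(H @ H)
d = np.linalg.det(H)
inner = np.sqrt(complex(t**2 - 16 * d))
lams = [s * 0.5 * np.sqrt(t + sg * inner)
        for s in (1.0, -1.0) for sg in (1.0, -1.0)]

# Each candidate is a root of det(z I - H), i.e., an eigenvalue of H.
for lam in lams:
    assert abs(np.linalg.det(H - lam * np.eye(4))) < 1e-8
```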
3.1. Coefficients of the polynomial expression of the matrix function M1(H). The present subsection covers the computation of the coefficients a_3, a_2, a_1 and a_0 pertaining to the polynomial expression (2.22) of the function M_1(H). It pays to recall the analytic derivatives of the associated scalar function m_1(z), namely:

  m_1'(z) = \frac{e^z}{z} − \frac{e^z − 1}{z^2},   (3.2)
  m_1''(z) = \frac{z^2 e^z − 2z e^z + 2e^z − 2}{z^3},   (3.3)
  m_1'''(z) = \frac{z^3 e^z − 3z^2 e^z + 6z e^z − 6e^z + 6}{z^4}.   (3.4)

3.1.1. Case I: Distinct eigenvalues of algebraic multiplicity 1. The four distinct eigenvalues are denoted by λ_H^a, −λ_H^a, λ_H^b, −λ_H^b ∈ C. In order for such eigenvalues to be distinct from each other, it must hold that λ_H^a ≠ 0, λ_H^b ≠ 0 and (λ_H^a)^2 ≠ (λ_H^b)^2. In the current case, the resolving system is square Vandermonde and the explicit solutions (2.11) may be used to compute the coefficients a_3, a_2, a_1 and a_0. The explicit solution for the coefficients reads:

  a_0 = \frac{(λ_H^a)^3 \sinh(λ_H^b) − (λ_H^b)^3 \sinh(λ_H^a)}{λ_H^a λ_H^b ((λ_H^a)^2 − (λ_H^b)^2)},
  a_1 = \frac{(λ_H^a)^2 (\cosh(λ_H^b) − 1)}{(λ_H^b)^2 ((λ_H^a)^2 − (λ_H^b)^2)} − \frac{(λ_H^b)^2 (\cosh(λ_H^a) − 1)}{(λ_H^a)^2 ((λ_H^a)^2 − (λ_H^b)^2)},   (3.5)
  a_2 = −\frac{λ_H^a \sinh(λ_H^b) − λ_H^b \sinh(λ_H^a)}{λ_H^a λ_H^b ((λ_H^a)^2 − (λ_H^b)^2)},
  a_3 = \frac{\cosh(λ_H^a) − 1}{(λ_H^a)^2 ((λ_H^a)^2 − (λ_H^b)^2)} − \frac{\cosh(λ_H^b) − 1}{(λ_H^b)^2 ((λ_H^a)^2 − (λ_H^b)^2)}.
3.1.2. Case II: Two distinct eigenvalues, case (λ_H, −λ_H, λ_H, −λ_H), with λ_H ∈ C − {0}. The explicit solution for the coefficients of the third-order polynomial reads:

  a_0 = \frac{3\sinh(λ_H)}{2λ_H} − \frac{1}{2}\cosh(λ_H),
  a_1 = \frac{2(\cosh(λ_H) − 1)}{λ_H^2} − \frac{\sinh(λ_H)}{2λ_H},   (3.6)
  a_2 = \frac{\cosh(λ_H)}{2λ_H^2} − \frac{\sinh(λ_H)}{2λ_H^3},
  a_3 = \frac{1 − \cosh(λ_H)}{λ_H^4} + \frac{\sinh(λ_H)}{2λ_H^3}.
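The Case II coefficients (3.6) may be checked on the simplest Hamiltonian matrix realizing this eigenvalue structure; a NumPy sketch (the diagonal test matrix and the sample value of λ are ours):

```python
import numpy as np
from scipy.linalg import expm

# H = diag(l, l, -l, -l) is Hamiltonian and has eigenvalues (l, -l, l, -l),
# each with algebraic multiplicity 2 (illustrative value of l).
l = 0.8
H = np.diag([l, l, -l, -l])

# Coefficients (3.6).
a0 = 1.5 * np.sinh(l) / l - 0.5 * np.cosh(l)
a1 = 2 * (np.cosh(l) - 1) / l**2 - np.sinh(l) / (2 * l)
a2 = np.cosh(l) / (2 * l**2) - np.sinh(l) / (2 * l**3)
a3 = (1 - np.cosh(l)) / l**4 + np.sinh(l) / (2 * l**3)

# The cubic must reproduce M1(H) = (exp(H) - I) H^{-1} of (1.13).
I4 = np.eye(4)
M1 = (expm(H) - I4) @ np.linalg.inv(H)
F = a3 * H @ H @ H + a2 * H @ H + a1 * H + a0 * I4
assert np.allclose(F, M1)
```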
3.1.3. Case III: Three distinct eigenvalues, case (λ_H, −λ_H, 0, 0), with λ_H ∈ C − {0}. Whenever the matrix H possesses the eigenvalue structure (λ_H, −λ_H, 0, 0), with λ_H ∈ C − {0}, the matrix function M_1(H) is not well-defined (in fact, the function m_1(z) cannot be evaluated in z = 0).
3.2. Coefficients of the polynomial expression of the matrix function M3(H). The present subsection covers the computation of the coefficients c_3, c_2, c_1 and c_0 pertaining to the polynomial expression (2.24) of the function M_3(H). In order to complete the necessary calculations, it pays to recall the expressions of the derivatives of the auxiliary function m_3(z), namely:
  m_3'(z) = \frac{2}{(z − 1)^2},   (3.7)
  m_3''(z) = −\frac{4}{(z − 1)^3},   (3.8)
  m_3'''(z) = \frac{12}{(z − 1)^4}.   (3.9)

In the present case, the functions m_3, m_3', m_3'', m_3''' are rational, therefore, according to the expressions (2.11), the coefficients c_3, c_2, c_1 and c_0 are rational functions of the eigenvalues.
3.2.1. Case I: Distinct eigenvalues of algebraic multiplicity 1. The four distinct eigenvalues are denoted again by λ_H^a, −λ_H^a, λ_H^b, −λ_H^b ∈ C. In order for the eigenvalues to be distinct from each other, it must hold that λ_H^a ≠ 1, λ_H^b ≠ 1 and (λ_H^a)^2 ≠ (λ_H^b)^2. In this case, the explicit solutions (2.11) may be used to compute the coefficients c_3, c_2, c_1 and c_0. The coefficients of the third-order polynomial read:

  c_0 = c_1 = \frac{2(λ_H^b)^2}{((λ_H^a)^2 − 1)((λ_H^a)^2 − (λ_H^b)^2)} − \frac{2(λ_H^a)^2}{((λ_H^b)^2 − 1)((λ_H^a)^2 − (λ_H^b)^2)},   (3.10)
  c_2 = c_3 = \frac{2}{((λ_H^b)^2 − 1)((λ_H^a)^2 − (λ_H^b)^2)} − \frac{2}{((λ_H^a)^2 − 1)((λ_H^a)^2 − (λ_H^b)^2)}.
In the present case, the polynomial expression of the matrix function M_3(H) simplifies to:

  M_3(H) = (c_2 H^2 + c_0 I)(H + I),

because the coefficients c_3, c_2, c_1, c_0 result to be pairwise identical.
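The Case I coefficients (3.10) and the factorized form above admit a quick numerical check; a NumPy sketch (the diagonal test matrix and sample eigenvalues are ours):

```python
import numpy as np

# H = diag(a, b, -a, -b) is Hamiltonian with four distinct eigenvalues
# (illustrative values, chosen away from 0 and +/-1).
a, b = 0.4, 0.9
H = np.diag([a, b, -a, -b])

# Coefficients (3.10), with c0 = c1 and c2 = c3.
den = a**2 - b**2
c0 = 2 * b**2 / ((a**2 - 1) * den) - 2 * a**2 / ((b**2 - 1) * den)
c2 = 2 / ((b**2 - 1) * den) - 2 / ((a**2 - 1) * den)

# Factorized polynomial versus the closed form (1.16).
I4 = np.eye(4)
M3 = 2 * np.linalg.inv(I4 - H)
F = (c2 * H @ H + c0 * I4) @ (H + I4)
assert np.allclose(F, M3)
```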
3.2.2. Case II: Two distinct eigenvalues, case (λ_H, −λ_H, λ_H, −λ_H), with λ_H ∈ C − {0}. In order for the eigenvalues to result of multiplicity 2, it must hold that λ_H ≠ ±1. The explicit solution for the coefficients of the third-order polynomial was computed as:

  c_0 = \frac{3λ_H + 2}{2(λ_H + 1)^2} − \frac{3λ_H − 2}{2(λ_H − 1)^2},
  c_1 = −\frac{4λ_H + 3}{2λ_H(λ_H + 1)^2} − \frac{4λ_H − 3}{2λ_H(λ_H − 1)^2},   (3.11)
  c_2 = \frac{1}{2λ_H(λ_H − 1)^2} − \frac{1}{2λ_H(λ_H + 1)^2},
  c_3 = \frac{2λ_H − 1}{2λ_H^3(λ_H − 1)^2} + \frac{2λ_H + 1}{2λ_H^3(λ_H + 1)^2}.
3.2.3. Case III: Three distinct eigenvalues, case (λ_H, −λ_H, 0, 0), with λ_H ∈ C − {0}. It must hold that λ_H ≠ ±1. The explicit solution for the coefficients of the third-order polynomial was found to be:

  c_0 = c_1 = 2,
  c_2 = −\frac{2(λ_H^2 + λ_H − 1)}{λ_H^2(λ_H^2 − 1)},   (3.12)
  c_3 = −\frac{2(λ_H^3 − λ_H + 1)}{λ_H^3(λ_H^2 − 1)}.
It is worth underlining that the corresponding case, whenever using the function M_1, is undefined. Therefore, the Cayley-transform-based averaging can be applied to a wider variety of cases.
4. Computation of the coefficients of the polynomial expression of the function M2(S). Given a 4 × 4 real-valued symplectic matrix S, the coefficients of its characteristic polynomial p_S(z) (2.1) may be computed through Newton's identities:

  p_3 = p_1 = −\operatorname{tr}(S),
  p_2 = \frac{1}{2}(\operatorname{tr}^2(S) − \operatorname{tr}(S^2)),   (4.1)
  p_0 = 1,

where the noticeable property det(S) = 1 was used in the expression of p_0. Hence, the eigenvalues of a 4 × 4 symplectic matrix are given by the formula:
  \frac{ζ ± \sqrt{ζ^2 − 4}}{2},  where  ζ = \frac{−p_1 ± \sqrt{p_1^2 − 4p_2 + 8}}{2}.   (4.2)

A consequence of the above formulas is that if λ denotes an eigenvalue of a symplectic matrix, then λ̄ and λ^{-1} must be eigenvalues as well (the over-bar denotes complex conjugation). When all eigenvalues are of multiplicity 1, the following cases may occur:
1. Case of complex-valued quartet: There are four distinct complex-valued roots λ ∈ C, λ̄, λ^{-1}, λ̄^{-1}.
2. Case of a real-valued duet and a unimodular duet: There are two distinct real-valued roots λ_1 ∈ R, λ_1^{-1}, and two distinct complex-valued, unimodular roots λ_2 ∈ C, λ̄_2, with |λ_2| = 1 (the symbol |·| denotes the modulus of a complex-valued number).
3. Case of two real-valued duets: There are four distinct real-valued roots λ_1 ∈ R, λ_1^{-1}, λ_2 ∈ R, λ_2^{-1}.
4. Case of two unimodular duets: There are four distinct complex-valued roots λ_1 ∈ C, λ̄_1, with |λ_1| = 1, and λ_2 ∈ C, λ̄_2, with |λ_2| = 1.
The eigenvalues may come with multiplicity 2 and with multiplicity 4 as well. There are six possible cases that cover the occurrence of eigenvalues with multiplicity 2:
1. Case of eigenvalues (x, x^{-1}, x, x^{-1}) with x ∈ R − {0, ±1}.
2. Case of eigenvalues (x, x^{-1}, 1, 1) with x ∈ R − {0, ±1}.
3. Case of eigenvalues (x, x^{-1}, −1, −1) with x ∈ R − {0, ±1}.
4. Case of eigenvalues (e^{ιθ}, e^{−ιθ}, 1, 1) with θ ∈ R − πZ (where the symbol ι denotes the imaginary unit, namely, ι^2 = −1).
5. Case of eigenvalues (e^{ιθ}, e^{−ιθ}, −1, −1) with θ ∈ R − πZ.
6. Case of eigenvalues (1, 1, −1, −1).
There are two possible cases that cover the occurrence of eigenvalues with multiplicity 4:
1. Case of eigenvalues (1, 1, 1, 1).
2. Case of eigenvalues (−1, −1, −1, −1).
The present section covers the computation of the coefficients b_3, b_2, b_1 and b_0 pertaining to the polynomial expression (2.23) of the function M_2(S). For the convenience of the reader, the expressions of the derivatives of the function m_2(z) are given as follows:
  m_2'(z) = −\frac{z(\log(z) − 1) + 1}{z(z − 1)^2},   (4.3)
  m_2''(z) = \frac{4z + z^2(2\log(z) − 3) − 1}{z^2(z − 1)^3},   (4.4)
  m_2'''(z) = −\frac{18z^2 − 9z + z^3(6\log(z) − 11) + 2}{z^3(z − 1)^4}.   (4.5)
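The derivative expressions (4.3)-(4.5) can be cross-checked by central finite differences at a sample point; a NumPy sketch (the sample point and step size are ours):

```python
import numpy as np

# m2 and its derivatives (4.3)-(4.5).
def m2(z):   return np.log(z) / (z - 1)
def m2p(z):  return -(z * (np.log(z) - 1) + 1) / (z * (z - 1)**2)
def m2pp(z): return (4*z + z**2 * (2*np.log(z) - 3) - 1) / (z**2 * (z - 1)**3)
def m2ppp(z):
    return -(18*z**2 - 9*z + z**3 * (6*np.log(z) - 11) + 2) / (z**3 * (z - 1)**4)

# Central differences approximate each derivative to O(h^2).
z, h = 2.5, 1e-5
d1 = (m2(z + h) - m2(z - h)) / (2 * h)
d2 = (m2p(z + h) - m2p(z - h)) / (2 * h)
d3 = (m2pp(z + h) - m2pp(z - h)) / (2 * h)
assert abs(d1 - m2p(z)) < 1e-8
assert abs(d2 - m2pp(z)) < 1e-8
assert abs(d3 - m2ppp(z)) < 1e-8
```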
In particular, the following cases are given full consideration:
• Case of complex-valued quartet λ, λ̄, λ^{-1}, λ̄^{-1};
• Case of real-valued quartet (x, x^{-1}, y, y^{-1}) with x, y ∈ R − {0, ±1};
• Case of purely imaginary quartet (e^{ιθ}, e^{−ιθ}, e^{ιψ}, e^{−ιψ}) with θ, ψ ∈ (0, 2π), θ ≠ ψ, θ + ψ ≠ 2π.
4.1. Case I: Distinct eigenvalues of algebraic multiplicity 1, complex-valued quartet. The four distinct eigenvalues of a symplectic matrix S are denoted as λ_S, λ̄_S, λ_S^{-1}, λ̄_S^{-1}. The explicit solutions (2.11) may be used to compute the coefficients b_3, b_2, b_1 and b_0. The explicit solution for the coefficients of the third-order polynomial was computed as:

  b_0 = ℑ\Big\{ \frac{\log(λ_S)\,\bar λ_S(λ_S^5 − 1)}{(λ_S − 1)^2(λ_S + 1)\,ℑ\{\bar λ_S(λ_S^2 + 1)\}} \Big\},
  b_1 = ℑ\Big\{ \frac{\log(λ_S)\big((1 − λ_S^5)(\bar λ_S^2 + 1) − |λ_S|^2(λ_S^3 − 1)\big)}{(λ_S − 1)^2(λ_S + 1)\,ℑ\{\bar λ_S(λ_S^2 + 1)\}} \Big\},   (4.6)
  b_2 = ℑ\Big\{ \frac{\log(λ_S)\big(\bar λ_S(1 − \bar λ_S^5) + \bar λ_S(1 − \bar λ_S^3)(λ_S^2 + 1)\big)}{(\bar λ_S − 1)^2(\bar λ_S + 1)\,ℑ\{\bar λ_S(λ_S^2 + 1)\}} \Big\},
  b_3 = ℑ\Big\{ \frac{|λ_S|^2 \log(λ_S)(1 − λ_S^3)}{(λ_S − 1)^2(\bar λ_S + 1)\,ℑ\{\bar λ_S(λ_S^2 + 1)\}} \Big\},
where the symbol ={·} denotes the imaginary part of a complex number. 4.2. Case II: Distinct eigenvalues of algebraic multiplicity 1, purely imaginary duets (eιθ, e−ιθ, eιψ, e−ιψ) with θ, ψ ∈ (0, 2π), θ 6= ψ, θ + ψ 6= 2π. With the considered configu- ration of the four eigenvalues, the explicit solution for the coefficients of the third-order polynomial becomes: ψ sin( 5ψ ) θ sin( 5θ ) b = 2 + 2 , 0 ψ θ 2 sin( 2 )(sin(2ψ)−2 cos(θ) sin(ψ)) 2 sin( 2 )(sin(2θ)−2 cos(ψ) sin(θ)) θ(sin( 3θ )+2 sin( 5θ ) cos(ψ)) ψ(sin( 3ψ )+2 sin( 5ψ ) cos(θ)) b = − 2 2 − 2 2 , 1 2 sin( θ )(sin(2θ)−2 cos(ψ) sin(θ)) 2 sin( ψ )(sin(2ψ)−2 cos(θ) sin(ψ)) 2 2 (4.7) θ(sin( 5θ )+2 sin( 3θ ) cos(ψ)) ψ(sin( 5ψ )+2 sin( 3ψ ) cos(θ)) b = − 2 2 − 2 2 , 2 2 sin θ (sin(2θ)−2 cos(ψ) sin(θ)) 2 sin ψ (sin(2ψ)−2 cos(θ) sin(ψ)) ( 2 ) ( 2 ) ψ sin 3ψ θ sin 3θ ( 2 ) ( 2 ) b3 = − ψ − θ . 2 sin( 2 )(sin(2ψ)−2 cos(θ) sin(ψ)) 2 sin( 2 )(sin(2θ)−2 cos(ψ) sin(θ))