Future Generation Computer Systems 19 (2003) 1197–1205

On enumeration problems in Lie–Butcher theory

H. Munthe-Kaas∗, S. Krogstad

Department of Computer Science, University of Bergen, N-5020 Bergen, Norway

Abstract

The algebraic structure underlying non-commutative Lie–Butcher series is the free Lie algebra over ordered trees. In this paper we present a characterization of this algebra in terms of balanced Lyndon words over a binary alphabet. This yields a systematic manner of enumerating terms in non-commutative Lie–Butcher series. © 2003 Published by Elsevier Science B.V.

Keywords: Butcher theory; Trees; Enumeration; integrators; Runge–Kutta methods

1. Summary

Let $\mathfrak{g}$ be the free Lie algebra over ordered trees, with a grading induced by the number of nodes in the trees. In the first part of this paper we prove that $\mathfrak{g}_n$, the homogeneous component of degree $n$, has dimension

$$\dim(\mathfrak{g}_n) = \frac{1}{2n} \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) \binom{2d}{d}.$$

For $n = 1, 2, 3, \ldots$ these numbers are 1, 1, 3, 8, 25, 75, 245, 800, 2700, 9225, 32065, ... This sequence counts the number of order conditions in the Crouch–Grossman–Owren–Marthinsen methods for integrating differential equations on manifolds. It also counts the number of balanced Lyndon words over a binary alphabet.

In the last part of the paper we establish explicitly the 1–1 correspondence between balanced binary Lyndon words and Hall basis elements for the free Lie algebra of ordered trees.

2. Introduction to Lie–Butcher theory

Lie series, dating back to Sophus Lie (1842–1899), is a version of Taylor series adapted to general manifolds. Butcher series, invented by Butcher [4,5] and developed further in [8] (see also [9]), is an adaptation of Taylor series to the study of order conditions of Runge–Kutta methods. In recent papers [6,10,12,14,15,17,19], efforts have been made to extend the theory of Runge–Kutta methods from $\mathbb{R}^n$ to Lie groups and homogeneous spaces. Some very interesting recent developments establish the connection between Butcher theory and renormalization theory in physics [3]. In the light of this development it seems important to further investigate the algebraic structure of non-commutative Lie–Butcher series on manifolds.

Without going into details we will in this section briefly review some basic results, yielding the connection between Lie–Butcher theory and the free Lie algebra over ordered trees.
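The dimension formula stated in the Summary can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names `mobius`, `dim_g` and `count_balanced_lyndon` are hypothetical, and the brute-force count assumes the standard lexicographic order on binary words (the number of Lyndon words in a class of words closed under rotation does not depend on which total order is used, so this is adequate for counting).

```python
from itertools import combinations
from math import comb


def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result


def dim_g(n):
    """dim(g_n) = 1/(2n) * sum over d | n of mu(n/d) * binom(2d, d)."""
    total = sum(mobius(n // d) * comb(2 * d, d)
                for d in range(1, n + 1) if n % d == 0)
    return total // (2 * n)


def count_balanced_lyndon(n):
    """Brute-force count of binary words with n zeros and n ones that are
    strictly smaller than all of their non-trivial rotations."""
    length, count = 2 * n, 0
    for ones in combinations(range(length), n):
        letters = ["0"] * length
        for i in ones:
            letters[i] = "1"
        w = "".join(letters)
        if all(w < w[i:] + w[:i] for i in range(1, length)):
            count += 1
    return count


if __name__ == "__main__":
    print([dim_g(n) for n in range(1, 12)])
    print([count_balanced_lyndon(n) for n in range(1, 6)])
```

The brute-force check is only practical for small $n$, since it visits all $\binom{2n}{n}$ balanced words; the closed formula is what one would use in practice.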

∗ Corresponding author. E-mail addresses: [email protected] (H. Munthe-Kaas), [email protected] (S. Krogstad).

Definition 1. The set of ordered trees (OT) is defined by the following rules:

(1) The one node tree is an ordered tree, $\bullet \in \mathrm{OT}$.

0167-739X/$ – see front matter © 2003 Published by Elsevier Science B.V. doi:10.1016/S0167-739X(03)00045-1

(2) If $\tau_1, \tau_2, \ldots, \tau_k \in \mathrm{OT}$, then the ordered $k$-tuple $\tau = (\tau_1, \tau_2, \ldots, \tau_k) \in \mathrm{OT}$. We call $\tau_i$ the branches of $\tau$.

We let $|\tau|$ denote the degree of the tree (number of nodes):

$$|\bullet| = 1, \qquad |(\tau_1, \ldots, \tau_k)| = |\tau_1| + \cdots + |\tau_k| + 1. \tag{1}$$

The number of ordered trees of degree $n + 1$ is given by the Catalan numbers:

$$C(n) = \frac{(2n)!}{n!\,(n+1)!}.$$

The number of ordered trees with 1, 2, 3, ... nodes is 1, 1, 2, 5, 14, 42, 132, ...

We will briefly introduce Lie–Butcher series, related to Runge–Kutta methods on manifolds. Let $M$ be a homogeneous space acted upon transitively from the left by a Lie group $G$, $p \mapsto g \cdot p$, where $g \in G$, $p \in M$. Let $\mathfrak{g}$ be the Lie algebra of $G$ and $\exp : \mathfrak{g} \to G$ the exponential map. For $V \in \mathfrak{g}$ and $p \in M$, we let $V \cdot p \in T_pM$ denote the tangent:

$$V \cdot p = \left.\frac{\partial}{\partial s}\right|_{s=0} \exp(sV) \cdot p.$$

A differential equation on $M$ can always be written in the form

$$y'(t) = f(y(t)) \cdot y(t) \tag{2}$$

for some function $f : M \to \mathfrak{g}$.

Let $F_{\mathcal{V}}(M) = (M \to \mathcal{V})$ denote all smooth functions from $M$ to some vector space $\mathcal{V}$.¹ Any element $V \in \mathfrak{g}$ induces a (right invariant) derivation $V[\cdot] : F_{\mathcal{V}}(M) \to F_{\mathcal{V}}(M)$ as

$$V[\phi](p) = \left.\frac{\partial}{\partial s}\right|_{s=0} \phi(\exp(sV) \cdot p) \quad \text{for any } \phi \in F_{\mathcal{V}}(M).$$

¹ In this paper we consider only $\mathcal{V} = \mathbb{R}$ or $\mathcal{V} = \mathfrak{g}$.

Similarly, higher order derivations are defined by iterating this definition:

$$(V \cdot W)[\phi] = V[W[\phi]].$$

Linear combinations of derivations, and the identity (0th order derivation), are defined in the obvious way. The resulting linear space of all 0th and higher order right invariant derivations is denoted $\mathfrak{G}$, the universal enveloping algebra of $\mathfrak{g}$. We let $\exp : \mathfrak{g} \to \mathfrak{G}$ denote the formal exponential series:

$$\exp(V) = \sum_{j=0}^{\infty} \frac{V^j}{j!}.$$

Given a point $p \in M$, the Lie series development of a function $\phi \in F_{\mathcal{V}}(M)$ is given as

$$\phi(\exp(tV) \cdot p) = \exp(tV)[\phi](p).$$

In the commutative case where $G = M = \mathbb{R}^n$, acting upon itself by translations, this is just Taylor series.

In the study of numerical methods for integrating (2), it is very useful to characterize a curve $z(t) \in M$, $z(0) = p$, in terms of elementary differentials of $f$.

Definition 2. Let $p \in M$ be a point and $f : M \to \mathfrak{g}$ a smooth function. To every $\tau \in \mathrm{OT}$ there corresponds an elementary differential $F_p(\tau) \in \mathfrak{g}$, defined as follows:

$$F_p(\bullet) = f(p), \qquad F_p((\tau_1, \tau_2, \ldots, \tau_k)) = (F_p(\tau_1) \cdot F_p(\tau_2) \cdots F_p(\tau_k))[f](p).$$

There are various ways of characterizing a curve $z(t) \in M$, $z(0) = p$.

• The approach in [16] is based on a time dependent infinitesimal generator. Let a function $a : \mathrm{OT} \to \mathbb{R}$ define a curve $\nu(t, a, p) \in \mathfrak{g}$ as

$$\nu(t, a, p) = \sum_{\tau \in \mathrm{OT}} \frac{t^{|\tau|-1}\alpha(\tau)}{(|\tau|-1)!}\, a(\tau)\, F_p(\tau).$$

We characterize $z(t)$ as the solution of a differential equation:

$$z'(t) = \nu(t, a, p) \cdot z(t), \qquad z(0) = p. \tag{3}$$

We define $\alpha : \mathrm{OT} \to \mathbb{R}$ to be the function for which the analytical solution of (2) corresponds to $a(\tau) = 1$ for all $\tau$. In [19] it is shown that $\alpha(\bullet) = 1$, and for $\tau = (\tau_1, \tau_2, \ldots, \tau_k)$

$$\alpha(\tau) = \prod_{i=1}^{k} \binom{\sum_{j=1}^{i} |\tau_j| - 1}{|\tau_i| - 1}\, \alpha(\tau_i).$$

• A pullback series is the basis for the analysis of Crouch–Grossman [6] and Owren–Marthinsen [19]. Let a function $b : \mathrm{OT} \to \mathbb{R}$ define a higher order differential operator $\mu(t, b, p) \in \mathfrak{G}$ as

$$\mu(t, b, p) = \sum_{\tau = (\tau_1, \tau_2, \ldots, \tau_k) \in \mathrm{OT}} \frac{t^{|\tau|-1}\alpha(\tau)}{(|\tau|-1)!}\, b(\tau)\, (F_p(\tau_1) \cdot F_p(\tau_2) \cdots F_p(\tau_k)).$$

The curve $z(t)$ is characterized by the requirement that

$$\phi(z(t)) = \mu(t, b, p)[\phi](p) \quad \text{for any } \phi \in F_{\mathbb{R}}(M).$$

In a formal sense, the pullback series is the exponential of the infinitesimal generator. It can be shown that for a given $z(t)$ we have

$$\mu(1, b, p) = \left.\exp\!\left(\nu(s, a, p) + \frac{\partial}{\partial s}\right)\right|_{s=0},$$

where $\nu(s, a, p)$ is defined in (3). This yields

$$b((\tau_1, \tau_2, \ldots, \tau_k)) = a(\tau_1)\, a(\tau_2) \cdots a(\tau_k).$$

From this we see that the analytical solution $y(t)$ is again characterized by $b(\tau) = 1$.

• The development of $z(t)$ is given as a curve $\sigma(t) \in \mathfrak{g}$ such that $z(t) = \exp(\sigma(t)) \cdot p$. The computation of $\sigma(t)$ from the infinitesimal generator $\nu(t, a, p)$ is accomplished by Magnus series [10,13,21]. In this computation, commutators of elementary differentials also appear. If a curve $z(t)$ can be represented as (3), then there exists a function $c : \mathrm{fla}(\mathrm{OT}) \to \mathbb{R}$ such that

$$\sigma(t) = \sum_{h \in H(\mathrm{fla}(\mathrm{OT}))} \frac{t^{|h|}\alpha(h)}{|h|!}\, c(h)\, F_p(h),$$

where $H(\mathrm{fla}(\mathrm{OT}))$ denotes a Hall basis for the free Lie algebra of ordered trees, to be defined below. (We do not discuss the extension of $\alpha$ from OT to $H(\mathrm{fla}(\mathrm{OT}))$ here.) Each $h \in H(\mathrm{fla}(\mathrm{OT}))$ is a (formal) commutator of trees, $F_p(h)$ denotes the corresponding commutator of elementary differentials and $|h|$ denotes the sum of the degrees of all trees in the commutator. It should be noted that in the commutative theory all non-trivial commutators vanish

Fig. 1. Basis elements for the free Lie algebra of ordered trees up to order 5, and their corresponding binary balanced Lyndon words.

and $\sigma(t)$ reduces to the classical B-series as defined in [9].

As we see from this review, the free Lie algebra of ordered trees plays a central role in the Lie–Butcher theory. Our goal in this paper is to present a basis for this algebra (see Fig. 1 for a graphical presentation of the lowest order elements).

3. The free Lie algebra of ordered trees

Definition 3. A Lie algebra $\mathfrak{g}$ is free over the set $I$ if:

(i) For every $i \in I$ there corresponds an element $X_i \in \mathfrak{g}$.
(ii) For any Lie algebra $\mathfrak{h}$ and any function $i \mapsto Y_i \in \mathfrak{h}$, there exists a unique Lie algebra homomorphism $\pi : \mathfrak{g} \to \mathfrak{h}$ satisfying $\pi(X_i) = Y_i$ for all $i \in I$.

This is written as $\mathfrak{g} = \mathrm{fla}(I)$.

A concrete representation of $\mathfrak{g}$ as a vector space over (formal) Lie brackets of elements in $I$ is discussed in Section 5.

The degree function (1) on OT is naturally extended to $\mathfrak{g}$: for any bracket $[\tau_1, \tau_2]$ we define

$$|[\tau_1, \tau_2]| = |\tau_1| + |\tau_2|.$$

Counting theorems for such graded free Lie algebras are developed in [18]. Let $\mathfrak{g}_j$ denote the homogeneous component of degree $j$, defined as the subspace of $\mathfrak{g}$ spanned by commutators of degree $j$. The following theorem is shown in [18]:

Theorem 4. Let $\mathfrak{g}$ be the graded free Lie algebra generated by elements $i \in I$, each element with degree $|i| \in \mathbb{Z}^+$. Then

$$\dim(\mathfrak{g}_n) = \nu_n = \frac{1}{n} \sum_{d \mid n} \Big( \sum_j \lambda_j^{-d} \Big)\, \mu\!\left(\frac{n}{d}\right), \tag{4}$$

where $\mu$ is the Möbius function and $\lambda_j$ are the roots of the polynomial

$$p(x) = 1 - \sum_{i \in I} x^{|i|}. \tag{5}$$

In the case of $\mathrm{fla}(\mathrm{OT})$, there are $C(i-1)$ generators of grade $i$, thus

$$p(x) = 1 - \sum_{j=0}^{\infty} C(j)\, x^{j+1}.$$

The sum $\sum_j \lambda_j^{-d}$ should be interpreted as the trace of the $d$th power of the inverse of the companion matrix of $p(x)$, in our case given as

$$M = \begin{pmatrix} C(0) & C(1) & C(2) & \cdots \\ 1 & & & \\ & 1 & & \\ & & 1 & \\ & & & \ddots \end{pmatrix}.$$

By recursion one finds that the non-zero diagonal elements of $M^d$ are given as

$$M^d_{i,i} = \mathrm{NCT}(d-1,\, d-1-i) \quad \text{for } i = 0, \ldots, d-1,$$

where

$$\mathrm{NCT}(n, m) = \sum_{k=0}^{m} C(n-k)\, C(k)$$

is the new Catalan triangle (sequence A028364 in [11]). Thus

$$\sum_j \lambda_j^{-d} = \mathrm{trace}(M^d) = \sum_{i=0}^{d-1} \sum_{k=0}^{i} C(d-1-k)\, C(k) = \frac{1}{2} \binom{2d}{d}.$$

We arrive at the following lemma.

Lemma 5. Let $\mathfrak{g}$ be the free Lie algebra over ordered trees, and let $\mathfrak{g}_n$ be its homogeneous component of grade $n$. Then

$$\dim(\mathfrak{g}_n) = \frac{1}{2n} \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) \binom{2d}{d}.$$

For $n = 1, 2, 3, \ldots$ these numbers are 1, 1, 3, 8, 25, 75, 245, 800, 2700, 9225, 32065, ... This integer sequence occurs in a number of contexts: it is the number of order conditions for Crouch–Grossman–Owren–Marthinsen methods, it counts the number of balanced binary Lyndon words of length $2n$, and there are even interesting connections to knot theory [2] and renormalization theory (see also sequence A022553 in [11]).

In the final part of this paper we will explicitly establish the connection between $\mathrm{fla}(\mathrm{OT})$ and balanced binary Lyndon words.

4. Lyndon words

Let $A = \{0, 1\}$ denote the binary alphabet. Let $A^+$ be the collection of all non-empty words over $A$. For a word $w \in A^+$ we define the partial degree $|w|_\ell$ with respect to a letter $\ell \in A$ as the number of occurrences of $\ell$ in $w$, while the balance is given as

$$\mathrm{bal}(w) = |w|_1 - |w|_0.$$

A binary word is balanced if $\mathrm{bal}(w) = 0$.

Definition 6. Let $<$ denote the following ordering of $A^+$:

• If $\mathrm{bal}(v) < \mathrm{bal}(w)$ then $v < w$.

A balanced word $w \in A^+$ is called non-negative if all the non-trivial right factors have non-negative balance, and it is called positive if they all have a positive balance. Note that due to our ordering of $A^+$, any Lyndon word is non-negative, and that every positive balanced word is a Lyndon word.

Lemma 8. Every non-negative balanced word $w$ has a unique factorization in positive balanced words. We call this the positive factorization:

$$\mathrm{pfac}(w) = w_1 w_2 \cdots w_k,$$

where the $w_i$ are positive.

Proof. Let $w = \ell_1 \ell_2 \cdots \ell_r$ where $\ell_i \in A$. We obtain the positive factorization by splitting $w$ at all the points $j$ where $\mathrm{bal}(\ell_j \ell_{j+1} \cdots \ell_r) = 0$. □

Lemma 9. There is a 1–1 correspondence between positive balanced binary words and trees $\tau \in \mathrm{OT}$.

Proof. First we define the map $\mathrm{bin} : \mathrm{OT} \to A^+$ recursively as

$$\mathrm{bin}(\bullet) = 01, \qquad \mathrm{bin}((\tau_1, \ldots, \tau_k)) = 0\, \mathrm{bin}(\tau_1) \cdots \mathrm{bin}(\tau_k)\, 1. \tag{6}$$

By recursion we see that $\mathrm{bal}(\mathrm{bin}(\tau)) = 0$ and that all non-trivial right factors of $\mathrm{bin}(\tau)$ are non-balanced, hence $w = \mathrm{bin}(\tau)$ is always a positive balanced binary Lyndon word.

Lemma 10. Let

$$\tau = \tau_1 \tau_2 \cdots \tau_k, \qquad \tau_i \in \mathrm{OT}$$

be a word over OT. Then $\tau$ is a Lyndon word with respect to lexicographical ordering of OT if and only if the word

$$w = \mathrm{bin}(\tau_1)\, \mathrm{bin}(\tau_2) \cdots \mathrm{bin}(\tau_k)$$

is a balanced binary Lyndon word with respect to the ordering of Definition 6.
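The maps bal, bin and pfac appearing in Lemmas 8–10 can be made concrete with a short sketch. This is an illustration under stated assumptions rather than the authors' implementation: an ordered tree is represented as a nested Python tuple of its branches (so the one node tree is the empty tuple `()`), binary words are strings over `"0"` and `"1"`, and the names `bin_word` and `tree_of` are hypothetical.

```python
def bal(w):
    """Balance of a binary word: number of 1s minus number of 0s."""
    return w.count("1") - w.count("0")


def bin_word(tree):
    """The map bin of Eq. (6); a tree is a tuple of its branches.

    With the one node tree represented as (), both cases of Eq. (6)
    collapse into a single expression.
    """
    return "0" + "".join(bin_word(branch) for branch in tree) + "1"


def tree_of(w):
    """Inverse of bin_word on positive balanced words (0 opens, 1 closes)."""
    stack = [[]]
    for letter in w:
        if letter == "0":
            stack.append([])                 # start collecting branches
        else:
            node = tuple(stack.pop())        # close the current node
            stack[-1].append(node)
    (tree,) = stack[0]                       # exactly one tree expected
    return tree


def pfac(w):
    """Positive factorization of a non-negative balanced word (Lemma 8).

    Scans from the right and cuts wherever the right factor has balance 0.
    """
    factors, end, balance = [], len(w), 0
    for j in range(len(w) - 1, -1, -1):
        balance += 1 if w[j] == "1" else -1
        if balance < 0:
            raise ValueError("word is not non-negative")
        if balance == 0:
            factors.append(w[j:end])
            end = j
    if end != 0:
        raise ValueError("word is not balanced")
    return factors[::-1]


if __name__ == "__main__":
    t = ((), ((),))                          # root with branches: leaf, 2-chain
    print(bin_word(t), pfac("010011"))
```

Reading `0` as an opening and `1` as a closing parenthesis makes the correspondence of Lemma 9 visible: `bin_word` writes a tree in parenthesized form, and `pfac` splits a concatenation of such forms back into its top-level pieces.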

Proof. Consider the word $w = \mathrm{bin}(\tau_1)\, \mathrm{bin}(\tau_2) \cdots \mathrm{bin}(\tau_k)$ where $\tau_i \in \mathrm{OT}$. Since every $\mathrm{bin}(\tau_i)$ is a positive balanced word, $w$ is a non-negative balanced word, and by Lemma 8 it is represented in its unique positive factorization. Thus $w$ is a Lyndon word with respect to the ordering of Definition 6 if and only if it is lexicographically less than all its non-trivial balanced positive right factors, or equivalently $\tau = \tau_1 \tau_2 \cdots \tau_k$ is a Lyndon word with respect to the lexicographical ordering of OT. □

We have established a natural 1–1 correspondence between balanced binary Lyndon words and Lyndon words over the (infinite) alphabet OT. The construction of a basis for the free Lie algebra $\mathrm{fla}(\mathrm{OT})$ now follows from the standard theory of Hall bases, to be reviewed in the next section.

The lexicographical ordering of $A^+$ induces an ordering of $M(A)$. For $m_1, m_2 \in M(A)$ we say that $m_1 < m_2$ if $f(m_1) < f(m_2)$.

The free Lie algebra over the alphabet $A$ and the field $\mathbb{R}$ is a vector space $\mathfrak{g}$ obtained by taking all (finite) $\mathbb{R}$-linear combinations of elements in $M(A)$ and extending the bracket $[\cdot, \cdot]$ to $\mathfrak{g}$ by the following rules:

(1) Skew symmetry: $[V, W] = -[W, V]$ for all $V, W \in \mathfrak{g}$.
(2) Bilinearity: $[V, W + Z] = [V, W] + [V, Z]$ and $[V, rW] = r[V, W]$ for all $V, W, Z \in \mathfrak{g}$, $r \in \mathbb{R}$.
(3) Jacobi identity: $[V, [W, Z]] + [W, [Z, V]] + [Z, [V, W]] = 0$ for all $V, W, Z \in \mathfrak{g}$.

Obviously $M(A)$ spans $\mathfrak{g}$. To construct a basis we must systematically remove terms which are linearly dependent. This is achieved by a Hall basis, defined as a subset $H \subset M(A)$ according to

(1) $A \subset H$.
(2) For any $m = [m_1, m_2] \in H \setminus A$ one has $m_2 \in H$ and

m

any $i$, either $h_i$ is a letter, or $h_i = [h_i', h_i'']$ where $h_i'' \ge h_{i+1}, \ldots, h_n$. The sequence $\{h_1, h_2, \ldots, h_n\}$ is decreasing if $h_i \ge h_{i+1}$ for $i = 1, \ldots, n-1$, otherwise let $h_j$ be the rightmost element such that

hj

By repeatedly applying the map $\lambda$, the result becomes a decreasing standard sequence of Hall words. Let $\lambda^*$ denote iterated application of $\lambda$ until a decreasing sequence is obtained. Note that a sequence of letters is a standard Hall sequence. We define the standard bracketing of a word $w = \ell_1 \ell_2 \cdots \ell_n$, where $\ell_i \in A$ are letters, as the decreasing standard sequence:

$$b(w) = \lambda^*(\{\ell_1, \ldots, \ell_n\}). \tag{9}$$

The sequence $b(w) = \{h_1, \ldots, h_r\}$ can be uniquely characterized as the decreasing sequence of Hall elements $\{h_1, \ldots, h_r\}$ such that

$$w = f(h_1)\, f(h_2) \cdots f(h_r).$$

It can be shown that if $w \in L(A)$ is a Lyndon word, then $b(w)$ contains a single Hall element. Furthermore, the image under $b$ of $L(A)$ is a Hall basis for the free Lie algebra over $A$ (see [20]). Thus we arrive at the main theorem of this paper.

Theorem 11. There is a natural 1–1 correspondence between the set of balanced binary Lyndon words and a Hall basis $H$ for the free Lie algebra $\mathrm{fla}(\mathrm{OT})$. The correspondence is given by the following algorithm, taking a balanced binary Lyndon word $w \in \{0, 1\}^+$ to a basis element $h \in H$:

(1) Compute the positive factorization of $w$:

$$w_1 w_2 \cdots w_k = \mathrm{pfac}(w).$$

(2) Compute the standard bracketing of $\{w_1, \ldots, w_k\}$:

$$h = b(w_1, w_2, \ldots, w_k).$$

This theorem simplifies the construction of software for Lie–Butcher series, since the set of all balanced binary Lyndon words is very easy to generate and represent. In Fig. 1 we see the first terms in $\mathrm{fla}(\mathrm{OT})$, sorted according to the grade.

6. Generating balanced Lyndon words

There are numerous approaches for generating Lyndon words, say of length $\le n$, over an alphabet $A$. One option is to use the algorithm of Duval [7] (also described in [20]). Given any Lyndon word of length $\le n$, this algorithm always finds the next word w.r.t. lexicographical ordering in the desired subset of Lyndon words. This algorithm can be modified to the set of balanced Lyndon words, but it seems more efficient to make use of the following elimination property. Let $z$ denote the last letter in the alphabet $A$. Then

$$L(A) = L(A') \cup \{z\}, \tag{10}$$

where $A' = (A \setminus z)\, z^*$, and $z^* = z^+ \cup \{\emptyset\}$. This property is used in [1], where an algorithm for computing all Lyndon words of a given multidegree is presented.

We follow the lines of Section 4. First we generate the positive balanced words of degree $\le n$ (where the degree of a word $w$ is $|w|_1$), and then generate all Lyndon words of degree $\le n$ over the alphabet consisting of positive balanced words of degree $\le n$.

The positive balanced words are easily obtained by the following recursion step that starts with the word $w = 1$:

(1) if $w$ is balanced, then stop and add $w$ to the list of balanced words.
(2) if $w$ has degree $n$, then repeat the step for $0w$.
(3) otherwise, repeat the step for $0w$ and $1w$.

Let $L_n(A)$ denote the Lyndon words of degree $\le n$ over the alphabet $A$. The property (10) then applies:

$$L_n(A) = L_n(A') \cup \{z\}, \tag{11}$$

where in $A'$, we only need to consider the words of degree $\le n$. Using this property repeatedly, we can find all Lyndon words, but we can only eliminate one generator per step. To reduce the number of steps, we make the following simple observation. Let $w \in A'$.

If the sum of the degree of $w$ and the smallest degree over all words in $A'$ is greater than $n$, then

$$L_n(A') = L_n(A' \setminus w) \cup \{w\}. \tag{12}$$

Example 12. Let $a$ and $b$ denote the positive balanced words 0011 and 01 respectively, and suppose we wish to find all Lyndon words of degree $\le 5$ over $A = \{a, b\}$. Using the above two properties, we get

$$\begin{aligned}
L_5(A) &= L_5(\{a, ab, ab^2, ab^3\}) \cup \{b\} \\
&= L_5(\{a, ab\}) \cup \{ab^2, ab^3, b\} \\
&= L_5(\{a, a^2b\}) \cup \{ab, ab^2, ab^3, b\} \\
&= L_5(\{a\}) \cup \{a^2b, ab, ab^2, ab^3, b\} \\
&= \{a, a^2b, ab, ab^2, ab^3, b\}.
\end{aligned}$$

The approach in this example can easily be applied to construct fast algorithms for generating all binary balanced Lyndon words of a given maximal degree.

We note also that this procedure implicitly gives the standard bracketing of the words involved. For an alphabet $A$ with largest letter $z$, we have (analogous to (10)):

$$H(\mathrm{fla}(A)) = H(\mathrm{fla}(\{(-\mathrm{ad}_z)^k(a) \mid k \ge 0,\ a \in A \setminus z\})) \cup \{z\}.$$

Thus in Example 12, the bracketings of the words in $L_5(A) \setminus b$ are generated by the elements $a$, $[a, b]$, $[[a, b], b]$ and $[[[a, b], b], b]$, etc. We have used here the equivalence of Hall sets and Lazard sets, and refer to [20] for further reading.

Acknowledgements

This work was sponsored by the Norwegian Research Council under contract no. 127582/432 and 142955/431, through the SYNODE project.

References

[1] P. Andary, Finely homogeneous computations in free Lie algebras, Discrete Math. Theor. Comput. Sci. 1 (1997) 101–114.
[2] D.J. Broadhurst, On the enumeration of irreducible k-fold Euler sums and their roles in knot theory and field theory, J. Math. Phys., in press.
[3] C. Brouder, Runge–Kutta methods and renormalization, Eur. Phys. J. C 12 (2000) 521–534.
[4] J. Butcher, Coefficients for the study of Runge–Kutta integration processes, J. Aust. Math. Soc. 3 (1963) 185–201.
[5] J. Butcher, The Numerical Analysis of Ordinary Differential Equations, Runge–Kutta and General Linear Methods, Wiley, Chichester, 1987.
[6] P.E. Crouch, R. Grossman, Numerical integration of ordinary differential equations on manifolds, J. Nonlin. Sci. 3 (1993) 1–33.
[7] J.-P. Duval, Génération d'une section des classes de conjugaison et arbre des mots de Lyndon de longueur bornée, Theor. Comput. Sci. 60 (1988) 255–283.
[8] E. Hairer, G. Wanner, On the Butcher group and general multi-value methods, Computing 13 (1974) 1–15.
[9] E. Hairer, S.P. Nørsett, G. Wanner, Solving Ordinary Differential Equations. I. Nonstiff Problems, 2nd ed., Springer Series in Computational Mathematics, Springer, Berlin, 1993, p. 8.
[10] A. Iserles, S.P. Nørsett, On the solution of linear differential equations on Lie groups, Phil. Trans. Roy. Soc. Lond. A 357 (1999) 983–1019.
[11] The On-line Encyclopedia of Integer Sequences. http://www.research.att.com/~njas/sequences/.
[12] D. Lewis, J.C. Simo, Conserving algorithms for the dynamics of Hamiltonian systems on Lie groups, J. Nonlin. Sci. 4 (1994) 253–299.
[13] W. Magnus, On the exponential solution of differential equations for a linear operator, Comm. Pure Appl. Math. VII (1954) 649–673.
[14] H. Munthe-Kaas, Lie–Butcher theory for Runge–Kutta methods, BIT 35 (1995) 572–587.
[15] H. Munthe-Kaas, A. Zanna, Numerical integration of differential equations on homogeneous manifolds, in: F. Cucker, M. Shub (Eds.), Foundations of Computational Mathematics, Springer, Berlin, 1997, pp. 305–315.
[16] H. Munthe-Kaas, Runge–Kutta methods on Lie groups, BIT 38 (1) (1998) 92–111.
[17] H. Munthe-Kaas, High order Runge–Kutta methods on manifolds, J. Appl. Numer. Math. 29 (1999) 115–127.
[18] H. Munthe-Kaas, B. Owren, Computations in a free Lie algebra, Phil. Trans. Roy. Soc. Lond. A 357 (1999) 957–981.
[19] B. Owren, A. Marthinsen, Runge–Kutta methods adapted to manifolds and based on rigid frames, BIT 39 (1) (1999) 116–142.
[20] C. Reutenauer, Free Lie Algebras, London Mathematical Society Monographs, New Series, Academic Press, New York, 1993, p. 7.
[21] A. Zanna, Collocation and relaxed collocation for the Fer and the Magnus expansions, SIAM J. Numer. Anal. 36 (1999) 1145–1182.
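The two-stage generation described in Section 6 (first the positive balanced words, then the Lyndon words over the alphabet they form) can be sketched as follows. This is a minimal illustration and not the elimination algorithm based on (10)–(12): the Lyndon words are found by plain exhaustive search with a degree bound. All function names are hypothetical, letters of the derived alphabet are assumed to be ordered lexicographically as binary strings (so that $a = 0011 < b = 01$ as in Example 12), and a word is taken to be Lyndon when it is strictly smaller than each of its proper right factors.

```python
def positive_balanced_words(n):
    """Positive balanced binary words of degree <= n (degree = number of 1s).

    Implements the recursion step of Section 6, starting from the word "1"
    and growing words by prepending letters.
    """
    found = []

    def step(w, ones, balance):
        if balance == 0:            # rule (1): balanced, record and stop
            found.append(w)
            return
        step("0" + w, ones, balance - 1)
        if ones < n:                # rule (2): at degree n only 0w is tried
            step("1" + w, ones + 1, balance + 1)

    step("1", 1, 1)
    return sorted(found)


def is_lyndon(word):
    """True if word is strictly smaller than each proper right factor."""
    return all(word < word[i:] for i in range(1, len(word)))


def lyndon_words_bounded(degrees, n):
    """Lyndon words of total degree <= n over letters 0, 1, 2, ...

    Letter i has degree degrees[i]; letters are ordered by their index.
    """
    result = []

    def extend(word, total):
        if word and is_lyndon(word):
            result.append(tuple(word))
        for letter, degree in enumerate(degrees):
            if total + degree <= n:
                extend(word + [letter], total + degree)

    extend([], 0)
    return sorted(result)


def balanced_lyndon_words(n):
    """Balanced binary Lyndon words of degree <= n, as binary strings."""
    letters = positive_balanced_words(n)
    degrees = [w.count("1") for w in letters]
    return ["".join(letters[i] for i in word)
            for word in lyndon_words_bounded(degrees, n)]


if __name__ == "__main__":
    # Example 12: letters a (degree 2) and b (degree 1), degree bound 5.
    print(lyndon_words_bounded([2, 1], 5))
    print(balanced_lyndon_words(3))
```

With `degrees = [2, 1]` the search returns six index tuples, matching the six words $a, a^2b, ab, ab^2, ab^3, b$ of Example 12. Exhaustive search is exponential in the degree bound, which is the cost that the elimination properties (11) and (12) are meant to reduce.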

H. Munthe-Kaas received his PhD in Numerical Analysis from the Norwegian University of Science and Technology (NTNU) in 1989. In 1990 he worked as a Post. Doc. Research Scientist at CERFACS, Toulouse. Since 1991 he has been affiliated with the Department of Computer Science, University of Bergen (as Professor since 1997). He has also held an Adjunct Professorship in Mathematics, NTNU (1996–2000). In 1995 he was awarded the Carl-Erik Fröberg prize in numerical analysis for his work on Runge–Kutta methods on manifolds. In the academic year 2002–2003 Munthe-Kaas is leading a research program on geometric integration at the Centre for Advanced Study in Oslo together with Professor Brynjulf Owren, NTNU.

S. Krogstad received his Masters in Algebra from the Norwegian University of Science and Technology (NTNU) in 1997. He is currently a PhD student in Numerical Analysis at the Department of Computer Science, University of Bergen. His research interests include numerical methods for ordinary differential equations and geometric integration.