Power Series Composition and Change of

Alin Bostan Bruno Salvy Eric´ Schost Algorithms Project Algorithms Project ORCCA and CSD INRIA Rocquencourt INRIA Rocquencourt University of Western Ontario France France London, ON, Canada [email protected] [email protected] [email protected]

ABSTRACT less than n can be multiplied in M(n) operations in K. We Efficient algorithms are known for many operations on trun- impose the usual super-linearity conditions of [17, Chap. 8]. cated power series (multiplication, powering, exponential, Using Fast Fourier Transform algorithms, M(n) can be taken . . . ). Composition is a more complex task. We isolate a in O(n log(n)) over fields with suitable roots of unity, and large class of power series for which composition can be O(n log(n) log log(n)) in general [31, 14]. performed efficiently. We deduce fast algorithms for con- If g(0) = 0,√ the best known algorithm, due to Brent and verting polynomials between various bases, including Euler, Kung, uses O( n log n M(n)) operations in K [11]; in small Bernoulli, Fibonacci, and the orthogonal Laguerre, Hermite, characteristic, a quasi-linear algorithm is known [5]. There Jacobi, Krawtchouk, Meixner and Meixner-Pollaczek. are however special cases of power series g with faster algo- rithms: evaluation at g = λx takes linear time; evaluation Categories and Subject Descriptors: at g = xk requires no arithmetic operation. A non-trivial I.1.2 [Computing Methodologies]: Symbolic and Alge- example is g = x + a, which takes time O(M(n)) when braic Manipulation – Algebraic Algorithms the base field has characteristic zero or large enough [1]. General Terms: Algorithms, Theory Brent and Kung [11] also showed how to obtain a cost in Keywords: Fast algorithms, transposed algorithms, basis O(M(n) log(n)) when g is a polynomial; this was extended conversion, orthogonal polynomials. by van der Hoeven [22] to the case where g is algebraic over K(x). In §2, we prove that evaluation at g = exp(x) − 1 and at g = log(1 + x) can also be performed in O(M(n) log(n)) 1. INTRODUCTION operations over fields of characteristic zero or larger than n. Through the Fast Fourier Transform, fast polynomial mul- Using associativity of composition and the linearity of tiplication has been the key to devising efficient algorithms the map Evalm,n, we show in §2 how to use these spe- for polynomials and power series. Using techniques such cial cases as building blocks, to obtain fast evaluation al- as Newton iteration or divide-and-conquer, many problems gorithms for a large class of power series. This idea was have received satisfactory solutions: polynomial evaluation first used by Pan [28], who applied it to functions of the and interpolation, power series exponentiation, logarithm, form (ax + b)/(cx + d). Our extensions√ cover further exam- . . . can be performed in quasi-linear time. ples such as 2x/(1 + x)2 or (1 − 1 − x2)/x, for which we In this article, we discuss two questions for which such fast improve the previously known costs. algorithms are not known: power series composition and Bivariate problems. Our results on the cost of evalu- change of basis for polynomials. We isolate special cases, ation (and of the transposed operation) are applied in §3 including most common families of orthogonal polynomials, to special cases of a more general composition, reminiscent for which our algorithms reach quasi-optimal complexity. of umbral operations [30]. Given a bivariate power series Composition. Given a power series g with coefficients in P j F = j≥0 ξj (x)t , we consider the linear map a field K, we first consider the map of evaluation at g X n Evaln(., F, t):(a0, . . . , an−1) 7→ ξj (x)aj mod x . Eval (., g): A ∈ [x] 7→ A(g) mod xn ∈ [x] . m,n K m K n j

Here, K[x]m is the m-dimensional K- of polyno- For instance, with mials of degree less than m. We note Evaln for Evaln,n. 1 X j j To study this problem, as usual, we denote by M a mul- F = = g(x) t , 1 − tg(x) tiplication time function, such that polynomials of degree j≥0

this is the map Evaln(., g) seen before. For general F, the conversion takes quadratic time (one needs n2 coefficients for F). Hence, better algorithms can only been found for Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are structured cases; in §3, we isolate a large family of bivari- not made or distributed for profit or commercial advantage and that copies ate series F for which we can provide such fast algorithms. bear this notice and the full citation on the first page. To copy otherwise, to This approach follows Frumkin’s [16], which was specific to republish, to post on servers or to redistribute to lists, requires prior specific Legendre polynomials. permission and/or a fee. ISSAC’08, July 20–23, 2008, Hagenberg, Austria. Change of basis. Our framework captures in particular Copyright 2008 ACM 978-1-59593-904-3/08/07 ...$5.00. the generating series of many classical polynomial families,

269

for which it yields at once conversion algorithms between tween particular families, such as Chebyshev, Legendre and the and polynomial bases, in both directions. B´ezier[27, 4], with however a quadratic cost. Floating-point Thus, we obtain in §4 change of basis algorithms of cost algorithms are known as well, of cost O(n) for conversion only O(M(n)) for all of Jacobi, Laguerre and Hermite or- from Legendre to Chebyshev bases [2] and O(n log(n)) for thogonal polynomials, as well as Euler, Bernoulli, and Mott conversions between Gegenbauer bases [25], but the results polynomials (see Table 3). These algorithms are derived in are approximate. Approximate conversions for the Hermite a uniform manner from our composition algorithms; they basis are discussed in [26], with cost O(M(n) log(n)). improve upon the existing results, of cost O(M(n) log(n)) or Note on the base field. For the sake of simplicity, in all 2 O(M(n) log (n)) at best (see below for historical comments). that follows, the base field is supposed to have character- We also obtain O(M(n) log(n)) conversion algorithms for a istic 0. All results actually hold more generally, for fields large class of Sheffer sequences [30, Chap. 2], including actu- whose characteristic is sufficiently large with respect to the arial polynomials, Poisson-Charlier polynomials and Meix- target precision of the computation. However, completely ner polynomials (see Table 4). explicit estimates would make our statements cumbersome. Transposition. A key aspect of our results is their heavy use of transposed algorithms. Introduced under this name 2. COMPOSITION by Kaltofen and Shoup, the transposition principle is an Associativity of composition can be read both ways: in algorithmic theorem with the following content: given an the identity A(f ◦ g) = A(f) ◦ g, f is either composed on algorithm that performs an r ×s matrix-vector product b 7→ the left of g or on the right of A. In this section, we discuss Mb, one can deduce an algorithm with the same complexity, the consequences of this remark. We first isolate a class of up to O(r +s) operations, and that performs the transposed t operators f for which both left and right composition can matrix-vector product c 7→ M c. In other words, this relates be computed fast. Most results are known; we introduce the cost of computing a K-linear map f : V → W to that t ∗ ∗ two new ones, regarding exponentials and logarithms. Us- of computing the transposed map f : W → V . ing these as building blocks, we then define composition se- For the transposition principle to apply, some restrictions quences, which enable us to obtain more complex functions must be imposed on the computational model: we require by iterated compositions. We finally discuss the cost of the that only linear operations in the coefficients of b are per- map Evaln and of its inverse for such functions, showing how formed (all our algorithms satisfy this assumption). See [12] to reduce it to O(M(n)) or O(M(n) log(n)). for a precise statement, Kaltofen’s “open problem” [23] for further comments and [7] for a systematic review of some 2.1 Basic Subroutines classical algorithms from this viewpoint. We now describe a few basic subroutines that are the To make the design of transposed algorithms transparent, building blocks in the rest of this article. we choose as much as possible to describe our algorithms in a “functional” manner. Most of our questions boil down Left operations on power series. In Table 1, we list basic composition operators, defined on various subsets of to computing linear maps K[x]m → K[x]n; expressing algo- rithms as a factorization of these maps into simpler ones K[[x]]. Explicitly, any such operator o is defined on a domain makes their transposition straightforward. In particular, dom(o), given in the third column. Its action on a power series g ∈ dom(o) is given in the second column, and the this leads us to systematically indicate the dimensions of n the source (and often target) space as a subscript. cost of computing o(g) mod x is given in the last column. Previous work. The question of efficient change of ba- Operator Action Domain Cost sis has naturally attracted a lot of attention, so that fast Aa (add) a + g K[[x]] 1 algorithms are already known in many cases. Mλ (mul) λg K[[x]] n Gerhard [18] provides O(M(n) log(n)) conversion algori- k Pk (power) g K[[x]] O(log k + M(n)) thms between the falling factorial basis and the monomial 1/k k rk Rk,α,r (root) g α x (1 + xK[[x]]) O(M(n)) basis: we recover this as a special case. The general case of ∗ Inv (inverse) 1/g K + xK[[x]] O(M(n)) Newton interpolation is discussed in [6, p. 67] and developed E (exp.) exp(g) − 1 x [[x]] O(M(n)) in [9]. The algorithms have cost O(M(n) log(n)) as well. K L (log.) log(1 + g) xK[[x]] O(M(n)) More generally, if (Pi) is a sequence of polynomials satis- fying a recurrence relation of fixed order (such as an orthog- Table 1: Basic Operations on Power Series onal family), the conversion from (Pi) to the i (x ) can also be computed in O(M(n) log(n)) operations: an Some comments are in order. For addition and multipli- algorithm is given in [29], and an algorithm for the trans- ∗ cation, we take a ∈ K and λ in K . To lift indeterminacies, pose problem is in [15]. Both operate on real or complex the value of Rk,α,r(g) is defined as the unique power series arguments, but the ideas extend to more general situations. with leading term αxr whose kth power is g; observe that to Alternative algorithms, based on structured matrices tech- n n+r(k−1) compute Rk,α,r(g) mod x , we need g modulo x as niques, are given in [21]. They perform conversions in both input. Finally, we choose to subtract 1 to the exponential so 2 directions in cost O(M(n) log (n)). as to make it the inverse of the logarithm. All complexity re- The overlap with our results is only partial: not all fam- sults are known; they are obtained by Newton iteration [10]. ilies satisfying a fixed order recurrence relation fit in our Right operations on polynomials. In Table 2, we de- framework; conversely, our method applies to families which scribe a few basic linear maps on [x] (observe that the di- do not necessarily satisfy such recurrences (the work-in- K m mension m of the source is mentioned as a subscript). Their progress [8] specifically addresses conversion algorithms for action on a polynomial orthogonal polynomials). m−1 Besides, special algorithms are known for converting be- A(x) = a0 + ··· + am−1x ∈ K[x]m

270

Name Notation Action Cost To summarize, we have obtained that Powering Power A(xk) 0 m,k t m−1 Exp (A) = ∆ (MultiEval (Shift (mod (A))), 1/i!). Reversal Revm x A(1/x) 0 m,n n n −1,n m,n n Mod modm,n A mod x 0 Using fast transposed evaluation [13, 7], Expm,n(A) can thus Scale Scaleλ,m A(λx) O(m) be computed in O(M(n) log(n)) operations. Inverting these P i Diagonal ∆m(., si) aisix m n computations leads to the factorization Multiply Mulm,n(., P ) AP mod x M(max(n, m)) t Shift Shifta,m A(x + a) M(m) + O(m) Logm,n(A) = Shift1,n(Interpn(∆n(modm,n(A), i!))),

Table 2: Basic Operations on Polynomials where Interpn is interpolation at 0, . . . , n − 1. Using algo- rithms for transpose interpolation [24, 7], this operation can is described in the third column. In the case of powering, be done in time O(M(n) log(n)).  it is assumed that k ∈ N>0. Here and in what follows, we m 2.2 Associativity Rules freely identify K[x]m and K , through the isomorphism For each basic power series operation in Table 1, we now X i m aix ∈ K[x]m ↔ (a0, . . . , am−1) ∈ K . express Evalm,n(A, o(g)) in terms of simpler operations; we i0, any polynomial A Eval (A, A (g)) = Eval (Shift (A), g), (A ) in K[x] can be uniquely written as m,n a m,n a,m 2 k k k k−1 Evalm,n(A, Pk(g)) = Evalk(m−1)+1,n(Powerm,k(A), g). (A3) A(x) = A0/k(x ) + A1/k(x )x + ··· + Ak−1/k(x )x . m−1 Inspecting degrees, one sees that if A is in [x] , then A Inversion. From A(1/g) = (Revm(A))(g)/g and writ- K m i/k ing h = g1−m mod xn, we get is in K[x]mi , with ( Eval (A, Inv(g)) = Mul (Eval (Rev (A), g), h), 1 if i ≤ m mod k, m,n n,n m,n m mi = bm/kc + (1) (A4) 0 otherwise. k Root taking. For g and h in K[[x]], if g = h , one has k−1 This leads us to define the map Splitm,k : A(h) = A0/k(g) + A1/k(g)h + ··· + Ak−1/k(g)h . We de- duce the following rule, where the indices mi are defined in A ∈ [x] 7→ (A ,...,A ) ∈ [x] ×· · ·× [x] . K m 0/k k−1/k K m0 K mk−1 Equation (1). It uses no arithmetic operation. We also use linear com- h = hi mod xn for 0 ≤ i < k bination with polynomial coefficients. Given polynomials i A0,...,Ak−1 = Split (A) (A5) G0,...,Gk−1 in K[x]m, we denote by m,k k Bi = Evalmi,n(Ai, g) for 0 ≤ i < k Combm(., G0,...,Gk−1): K[x]m → K[x]m k Evalm,n(A, Rk,α,r(g)) = Combn(B0,...,Bk−1, 1, . . . , hk−1). the map sending (A0,...,Ak−1) ∈ K[x]m to m Exponential and Logarithm. A0G0 + ··· + Ak−1Gk−1 mod x ∈ K[x]m. Eval (A, E(g)) = Eval (Exp (A), g), (A ) It can be computed in O(kM(m)) operations. Finally, we ex- m,n n m,n 6 tend our set of subroutines on polynomials with the following Evalm,n(A, L(g)) = Evaln(Logm,n(A), g). (A7) new results on the evaluation at exp(x) − 1 and log(1 + x). 2.3 Composition sequences Proposition 1. The maps We now describe more complex evaluations schemes, ob- n Expm,n : A ∈ K[x]m 7→ A(exp(x) − 1) mod x ∈ K[x]n, tained by composing the former basic ones. Log : A ∈ [x] 7→ A(log(1 + x)) mod xn ∈ [x] m,n K m K n Definition 1. Let O be the set of actions from Table 1. can be computed in O(M(n) log(n)) arithmetic operations. A sequence o = (o1,..., oL) with entries in O is defined at a series g ∈ K[[x]] if g is in dom(o1), and for i ≤ L, Proof. We start by truncating A modulo xn, since oi−1(··· o1(g)) is in dom(oi). It is a composition sequence n if it is defined at x; in this case, o computes the power series Expm,n(A) = Expm,n(A mod x ). g1, . . . , gL, with g0 = x and gi = oi(gi−1); it outputs gL. After shifting by −1, we are left with the question of evalu- P i ating a polynomial in K[x]n at i

271

This shows that g is output by the composition sequence Eval aux(A, m, n, `, o, G) (Mc, Ad, Inv, Me, Af ). A more complex example is if ` = 0 return A mod xn ! 0 2x 1 „ 2 «2 ` = ` − 1 g = = 1 − 1 − , switch(o ) (1 + x)2 2 1 + x ` case(Mλ): B = Scaleλ,m(A) 0 which shows that g is output by the composition sequence return Eval aux(B, m, n, ` , o, G) case(Aa): B = Shifta,m(A) 0 (A1, Inv, M−2, A1, P2, M−1, A1, M1/2). return Eval aux(B, m, n, ` , o, G) Finally, consider g = log((1 + x)/(1 − x)). Using case(Pk): B = Powerm,k(A) return Eval aux(B, km − k + 1, n, `0, o, G) „ „ 2 «« case(Inv): B = Rev (A) g = log 1 + −2 − , m x − 1 C = Eval aux(B, m, n, `0, o, G) 1−m n return Muln,n(C, g`0 mod x ) we get the composition sequence (A−1, Inv, M−2, A−2, L). case(Rk,α,r): m0, . . . , mk−1 = FindDegrees(m, k) Computing the associated power series. Our main h0 = 1 algorithm requires truncations of the series g1, . . . , gL asso- for i = 1, . . . , k − 1 do n ciated to a composition sequence. The next lemma discusses hi = hhi−1 mod x the cost of their computation. In all complexity estimates, A0,...,Ak−1 = Splitm,k(A) the composition sequence o is fixed; hence, our estimates for i = 0, . . . , k − 1 do hide a dependency in o in their constant factors. 0 Bi = Eval aux(Ai, mi, n, ` , o, G) return Combn(B0,...,Bk−1, h0, . . . , hk−1) Lemma 1. If o = (o1,..., oL) is a composition sequence case(E): B = Expm,n(A) that computes power series g1, . . . , gL, one can compute all 0 n return Eval aux(B, n, n, ` , o, G) gi mod x in time O(M(n)). case(L): B = Logm,n(A) 0 Proof. All operators in O preserve the precision, except for return Eval aux(B, n, n, ` , o, G) root-taking, since the operator R loses r(k − 1) terms of k,α,r Eval(A, n, o) precision. For i ≤ L, define εi = r(k − 1) if oi has the form Rk,α,r, εi = 0 otherwise, and define nL = n and inductively G = ComputeG(o, n) ni−1 = ni + εi. Starting the computations with g0 = x, we return Eval aux(A, n, n, L, o, G) ni ni−1 iteratively compute gi mod x from gi−1 mod x . Inspecting the list of possible cases, one sees that com- puting gi always takes time O(M(ni−1)). For powering, this Figure 1: Algorithm Eval. estimate is valid because we disregard the dependency in o: otherwise, terms of the form log(k) would appear. For the Similarly, the degree of the argument A passed through the same reason, O(M(ni−1)) is in O(M(n)), as is the total cost, obtained by summing over all i. recursive calls may grow, but only like O(n).  Two kinds of operations contribute to the cost: precompu- Composition using composition sequences. We now 1−m n k−1 n tations of g`−1 mod x (for Inv) or of 1, g`, . . . , g` mod x study the cost of computing the map Evaln(., g), assuming for Rk,α,r, and linear operations on A: shifting, scaling, mul- that g ∈ K[[x]] is output by a composition sequence o. The tiplication . . . The former take O(M(n)), since the exponents cost depends on the operations in o. To keep simple expres- involved are in O(n). The latter operations take O(M(n)) if sions, we distinguish two cases: if o contains no operation E no Exp or Log operation is performed, and O(M(n) log(n)) or L, we let T (n) = M(n); otherwise, T (n) = M(n) log(n). o o otherwise. This concludes the proof.  Theorem 1 (Composition). Let o = (o1,..., oL) be a 2.4 Inverse map composition sequence that outputs a series g ∈ [[x]]. Given K The map Eval (., g) is invertible if and only if g0(0) 6= 0 o, one can compute the map Eval (., g) in time O(T (n)). n n o (hereafter, g0 is the derivative of g). We discuss here the Proof. We follow the algorithm of Figure 1. The main func- computation of the inverse map. tion first computes the sequence G = g , . . . , g modulo xn, 1 L Theorem 2 (Inverse). x Let o = (o1,..., oL) be a using a subroutine ComputeG(o, n) that follows Lemma 1. 0 composition sequence that outputs g ∈ K[[x]] with g (0) 6= 0. The cost O(M(n)) of this operation is in O(To(n)). Then, −1 One can compute the map Evaln (., g) in time O(To(n)). we call the auxiliary Eval aux function. P i Proof. If h is the power series h = hix , with hi 6= 0, On input A, m, n, `, o, G, this latter function computes i≥i0 0 Evalm,n(A, g`). This is done recursively, applying the appro- val(h) = i0 is the valuation of h, lc(h) = hi0 its leading i0 priate associativity rule (A1) to (A7); the pseudo-code uses coefficient and lt(h) = hi0 x its leading term. We also a C-like switch construct to find the matching case. Even if introduce an equivalence relation on power series: g ∼ h if the initial polynomial A is in K[x]n, this may not be the case g(0) = h(0) and lt(g − g(0)) = lt(h − h(0)). The proof of for the arguments passed to the next calls; hence the need the next lemma is immediate by case inspection. for the extra parameter m. For root-taking, the subroutine Lemma 2. For o in O, if h ∼ g and g is in dom(o), then FindDegrees computes the quantities mi of Eq. (1). h is in dom(o) and o(h) ∼ o(g). Since we write the complexity as a function of n, the cost analysis is simple: even if several recursive calls are gener- Series tangent to the identity. We prove the proposition ated (k for kth root-taking), their total number is still O(1). in two steps. For series of the form g = x mod x2, it suffices

272

to “reverse” step-by-step the computation sequence for g. 3.1 Main Theorem The following lemma is crucial. Let F ∈ K[[x, t]] be the bivariate power series Lemma 3. Let g be in [[x]], with g = x mod x2, and let P i j P j K F = i,j≥0 Fi,j x t = j≥0 ξj (x)t . o = (o1,..., oL) be a sequence defined at g. Then o is a composition sequence. Associated with F, we consider the map P n Proof. We have to prove that o is defined at x, i.e., that Evaln(., F, t):(a0, . . . , an−1) 7→ j

inverse in time O(T(n) + Tog (n) + Toh (n)). Lemma 4. The sequence o˜ = (o˜1,..., o˜L) is a composi- tion sequence and outputs a series g˜ such that g˜(g) = x. P k Proof. Write f = k≥0 fkz ,

Proof. One sees by induction that for all i, o˜i−1(··· o˜1(g)) k X i k X j g(x) = gk,ix and h(t) = hk,j t . is in dom(o˜ ) and o˜ (··· o˜ (g)) = g . This shows that the i i 1 L−i i≥0 j≥0 sequence o˜ is defined at g and that o˜L(··· o˜1(g)) = x. From Lemma 3, we deduce that o˜ is defined at x. Lettingg ˜ be Since g(0)h(0) = 0, we have that either hk,j = 0 for k > j, ? ? the output of o˜, the previous equality givesg ˜(g) = x, which or gk,i = 0 for k > i. Thus, the coefficient Fi,j of F is concludes the proof.  well-defined and ? X Since To = T˜o, and in view of Theorem 1, the next lemma Fi,j = fkgk,ihk,j . concludes the proof of Theorem 2 in the current case. k≤n

Lemma 5. With g and g˜ as above, the map Evaln(., g˜) is These coefficients are those of a product of three matrices, the inverse of Evaln(., g). the middle one being diagonal; we deduce the factorization ? t Proof. Let F be in K[x]n and let G = Evaln(F, g), so that Evaln(., F , t) = Evaln(., g) ◦ ∆n(., fi) ◦ Evaln(., h). F (g) = G + H, with val(H) ≥ n. Evaluating atg ˜, we get F = G(˜g) + H(˜g) = G(˜g) mod xn, since val(˜g) = 1. The assumptions on f, g and h further imply that the map  Eval (., F?, t) is invertible, of inverse General case. Lemma 5 fails when val(g) = 0. We can n −1 ? −t −1 −1 however reduce the general case to that where val(g) = 1. Evaln (., F , t) = Evaln (., h) ◦ ∆n(., fi ) ◦ Evaln (., g). Let us write g = g + g x + ··· , with g 6= 0, and define 0 1 1 By Theorems 1 and 2, as well as Theorem 4 stated below, g˜ = (g − g )/g , so thatg ˜ = x mod x2. If o is a composition 0 1 Eval (., F?, t) and its inverse can thus be evaluated in time sequence for g, then o˜ = (o, A , M ) is a composition n −g0 1/g1 O(T(n) + T (n) + T (n)). Now, from the identity F = sequence forg ˜, and we have T = T . Thus, by the previous og oh ˜o o u(x)v(t)F?, we deduce that point, we can use this composition sequence to compute the −1 ? t map Evaln (., g˜) in time O(To(n)). From the equality Evaln(., F, t) = Muln,n(., u) ◦ Evaln(., F , t) ◦ Muln,n(., v).

Evaln(A, g) = Evaln(Scaleg1,n(Shiftg0,n(A)), g˜), Our assumptions on u and v make this map invertible, and −1 t −1 ? we deduce Evaln (., F, t) = Muln,n(., b) ◦ Evaln (., F , t) ◦ Muln,n(., a), −1 −1 n n Evaln (A, g) = Shift−g0,n(Scale1/g1,n(Evaln (A, g˜))). with a(x) = 1/u mod x and b(t) = 1/v mod t . The extra costs induced by the computation of u, v, their inverses, and Since scaling and shifting induce only an extra O(M(n)) the truncated products fit in O(T(n) + M(n)). arithmetic operations, this finishes the proof of Theorem 2.  3.2 Change of Basis 3. CHANGE OF BASIS To conclude, we consider polynomials (Pi)i≥0 in K[x], This section applies our results on composition to change with deg(Pi) = i, with generating series defined in terms of basis algorithms, between the monomial basis (xi) and of series u, v, f, g, h as in Theorem 3 by various families of polynomials (P ), with deg(P ) = i, for i i X i ` ´ which we reach quasi-linear complexity. As an intermediate P = Pi(x)t = u(x) v(t) f g(x)h(t) . step, we present a bivariate evaluation algorithm. i≥0

273

t Eval aux (A, m, n, `, o, G) is dealt with by noting that the transpose of modm,n is if ` = 0 return A mod xm modn,m. To conclude, it suffices to give transposed asso- `0 = ` − 1 ciativity rules for our basic operators. The formal approach switch(o ) we use to write our algorithms pays off now, as it makes this ` transposition process automatic. case (M ): B = Eval auxt(A, m, n, `0, o, G) λ Recall that our algorithms deal with polynomials. The return Scaleλ,m(B) t 0 dual of K[x]m can be identified with K[x]m itself: to a case (Aa): B = Eval aux (A, m, n, ` , o, G) P i i t K-linear form ` over K[x]m, one associates i

Corollary 1. Under the above assumptions, one can t t t t i Evalm,n(A, Aa(g)) = Shifta,m(Evalm,n(A, g)). (A2) perform the change of basis from (x )i≥0 to (Pi)i≥0, and conversely, in time O(T(n) + Tog (n) + Toh (n)). t Powering. The dual map Powern,k maps A ∈ K[x]k(n−1)+1 A surprisingly large amount of classical polynomials fits into to A0/k ∈ K[x]n (with the notation of §2.1). We deduce: this framework (see next section). An important special t t t t case is provided by Sheffer sequences [30, Chap. 2], whose Evalm,n(A, Pk(g)) = Powerm,k(Evalk(m−1)+1,n(A, g)). (A3) exponential generating function has the form Inversion. The transposed version of the rule for Inv is X Pi(x) i xh(t) t = v(t)e . t t t 1−m i! Evalm,n(A, Inv(g)) = Revm(Evalm,n(Muln,n(A, g ), g)). i≥0 t (A4) Examples include the actuarial, Laguerre, Meixner and Pois- t Root taking. Considering its matrix, one sees that Splitm,k son-Charlier polynomials, and the Bernoulli polynomials of maps (A0,...,Ak−1) ∈ K[x]m0 × · · · × K[x]mk−1 to the second kind (see Tables 3 and 4). In this case, if h k k k k−1 is output by the composition sequence o and v(t) can be A0(x ) + A1(x )x + ··· + Ak−1(x )x ∈ K[x]m. computed modulo tn in time T(n), one can perform the i Besides, since the map Comb is the direct sum of the maps change of basis from (x )i≥0 to (Pi)i≥0, and conversely, in time O(T(n) + To(n)). Muln,n(., Gi): K[x]n → K[x]n, t 3.3 Transposed evaluation its transpose Combn(., G0,...,Gk−1) sends A ∈ K[x]n to The following completes the proof of Theorem 3. t k (Muln,n(A, Gi))0≤i≤k−1 ∈ K[x]n. Theorem 4 (Transposition). Let o = (o1,..., oL) be a composition sequence that outputs g ∈ K[[x]]. Given o, one Putting this together gives the transposed associativity rule t can compute the map Evaln(., g) in time O(To(n)). i n hi = h mod x for 0 ≤ i < k t Proof. This result follows directly from the transposition A0,...,Ak−1 = Comb (A, h0, . . . , hk−1) n t principle. However, we give an explicit construction of the t (A5) t Bi = Evalmi,n(Ai, g) for 0 ≤ i < k transposed map Evaln(., g) in Figure 2. Non-linear precom- t t putations are left unchanged. The terminal case ` = 0 Evalm,n(A, Rk,α,r(g)) = Splitm,k(B0,...,Bk−1)

274

Exponential and Logarithm. From the proof of Propo- square root [10]. Exponentials are computed using the al- sition 1, we deduce the transposed map of Expm,n, Logm,n gorithm of [20]. Powers are computed through exponential and their associativity rules and logarithm [10], except when the arguments are binomi-

t t als, when faster formulas for binomial series are used. For Expm,n(A) = modn,m(Shift−1,n(MultiEval(∆n(A, 1/i!)))), evaluation and interpolation at 0, . . . , n−1, and their trans- t t t t Evalm,n(A, E(g)) = Expm,n(Evaln,n(A, g)); (A6) poses, we use the implementation of [7]. t t We use the Jacobi and Mittag-Leffler orthogonal polyno- Logm,n(A) = modn,m(∆n(Interpn(Shift1,n(A)), i!)), mials (a special case of Meixner polynomials, with β = 0 and t t t t Evalm,n(A, L(g)) = Logm,n(Evaln,n(A, g)). (A7) c = −1), with the composition sequences of §2.3. Our algo- rithm has cost O(M(n)) for the former and O(M(n) log(n)) for the latter. We compare this to the naive approach of 4. APPLICATIONS quadratic cost in Figure 3 and 4, respectively. Timings are Many generating functions of classical families of poly- given for the conversion from orthogonal to monomial bases; nomials fit into our framework. To obtain conversion algo- those for the inverse conversion are similar. rithms, it is sufficient to find suitable composition sequences. Table 3 lists families of polynomials for which conversions can be done in time O(M(n)) with our method (see e.g. [30, 3] for more on these classical families). In Table 4, a similar list is given, leading to conversions of cost O(M(n) log n); most of these entries are actually Sheffer sequences. Many other families can be obtained as special cases (e.g., Gegen- bauer, Legendre, Chebyshev, Mittag-Leffler, etc). The entry marked by (?) is from [18]; the entries marked by (??) are orthogonal polynomials, for which one conver- sion (from the orthogonal to the monomial basis) is already mentioned with the same complexity in [15, 29]. In all cases, the pre-multiplier u(x)v(t) depends on t only and can be computed at precision n in time O(M(n)); all our Figure 3: Jacobi polynomials. functions f can be expanded at precision n in time O(n). Regarding the functions g(x) and h(t), most entries are easy to check; the only explanations needed concern some series h(t). Rational functions are covered by the first example of §2.3; the second example of §2.3 deals with Jacobi polynomi- als and Spread polynomials; the last example of §2.3 shows how to handle functions with logarithms. For Fibonacci polynomials, the function h(t) = t/(1 − t2) satisfies

“ 1 + t2 ”2 (2h)2 = − 1. 1 − t2 From this, we deduce the sequence for h: Figure 4: Mittag-Leffler polynomials. (P2, M−1, A1, Inv, M2, A−1, P2, A−1, R2,2,1, M1/2). √ Our algorithm performs better than the quadratic one. For Mott polynomials the series h(t) = (1 − 1 − t2)/t can The crossover points lie between 100 and 200; this large be rewritten as value is due to the constant hidden in our big-Oh estimates: s 2 in both cases, there is a contribution of about 20M(n), plus h = √ − 1. an additional M(n) log(n) for Mittag-Leffler. 1 + 1 − t2 This yields the composition sequence 6. DISCUSSION This article provides a flexible framework for generating (P , M , A , R , A , Inv, M , A , R ). 2 −1 1 2,1,0 1 2 −1 2,1,0 new families of conversion algorithms: it suffices to add new composition operators to Table 1 and provide the corre- 5. EXPERIMENTS sponding associativity rules. Still, several questions need We implemented the algorithms for change of basis using further investigation. Several of the composition sequences NTL [32]; the experiments are done for coefficients defined we use are non-trivial: this raises in particular the questions modulo a 40 bit prime, using the ZZ_p NTL class (our al- of characterizing what functions can be computed by a com- gorithms still work for degrees small with respect to the position sequence, and of determining such sequences algo- characteristic). All timings reported here are obtained on a rithmically. Besides, the costs of our algorithms are mea- Pentium M, 1.73 Ghz, with 1 GB memory. sured only in terms of arithmetic operations; the questions Our implementation follows directly the presentation of of numerical stability (for floating-point computations) or of the former sections. We use the transposed multiplication coefficient size (when working over Q) require further work. implementation of [7]. The Newton iteration for inverse is Acknowledgments. We thank ANR Gecko, the joint Inria- built-in in NTL; we use the standard Newton iteration for Microsoft Research Lab and NSERC for financial support.

275

polynomial generating series u(x)v(t) f(z) g(x) h(t) α P α n −1−α −1 Laguerre Ln n≥0 Ln(x)t (1 − t) exp(z) −x t(1 − t) P 1 n 2 Hermite Hn n≥0 n! Hn(x)t exp(−t ) exp(z) 2x t (α,β) P (α+β+1)n (α,β) n −α−β−1 α+β+1 α+β+2 −2 Jacobi Pn Pn (x)t (1 + t) 2F1( , ; β + 1; z) 1 + x 2t(1 + t) n≥0 (β+1)n 2 2 P n 2 −1 −1 2 −1 Fibonacci Fn n≥0 Fn(x)t (1 − t ) (1 − z) x t(1 − t ) α P 1 α n α t −α Euler En n≥0 n! En (x)t 2 (e + 1) exp(z) x t Bernoulli Bα P 1 Bα(x)tn tα(et − 1)−α exp(z) x t n n≥0 n! n √ P 1 n 2 Mott Mn n≥0 n! Mn(x)t 1 exp(z) −x (1 − 1 − t )/t Spread S P S (x)tn (1 + t)(1 − t)−1 z(1 + 4z)−1 x t(1 − t)−2 n n≥0 n √ P 1 n Bessel pn n≥0 n! pn(x)t 1 exp(z) x 1 − 1 − 2t Table 3: Polynomials with conversion in O(M(n))

polynomial generating series u(x)v(t) f(z) g(x) h(t) P 1 n Falling factorial (x)n (?) n≥0 n! (x)nt 1 exp(z) x log(1 + t) P 1 n Bell φn n≥0 n! φn(x)t 1 exp(z) x exp(t) − 1 P 1 n Bernoulli, 2nd kind bn n≥0 n! bn(x)t t/log(1 + t) exp(z) x log(1 + t) P 1 n Poisson-Charlier cn(x; a) n≥0 n! cn(x; a)t exp(−t) exp(z) x log(1 + t/a) (β) P 1 (β) n Actuarial an n≥0 n! an (x)t exp(βt) exp(z) −x exp(t) − 1 (a) P 1 (a) n a −a Narumi Nn n≥0 n! Nn (x)t t log(1 + t) exp(z) x log(1 + t) (λ,µ) P 1 (λ,µ) n λ −µ Peters Pn n≥0 n! Pn (x)t (1 + (1 + t) ) exp(z) x log(1 + t) (λ) P (λ) n 2 −λ 1−teiφ Meixner-Pollaczek Pn (x; φ)(??) n≥0 Pn (x; φ)t (1 + t − 2t cos φ) exp(z) ix log( 1−te−iφ ) P (β)n n −β 1−t/c Meixner mn(x; β, c)(??) n≥0 n! mn(x; β, c)t (1 − t) exp(z) x log( 1−t ) P `N´ n N p−(1−p)t Krawtchouk Kn(x; p, N)(??) n≥0 n Kn(x; p, N)t (1 + t) exp(z) x log( p(1+t) ) Table 4: Polynomials with conversion in O(M(n) log(n))

7. REFERENCES [16] M. Frumkin. A fast algorithm for expansion over spherical harmonics. Appl. Algebra Engrg. Comm. Comp., 6(6):333–343, [1] A. V. Aho, K. Steiglitz, and J. D. Ullman. Evaluating 1995. polynomials at fixed sets of points. SIAM J. Comp., [17] J. von zur Gathen and J. Gerhard. Modern computer algebra. 4(4):533–539, 1975. Cambridge University Press, 1999. [2] B. K. Alpert and V. Rokhlin. A fast algorithm for the [18] J. Gerhard. Modular algorithms for polynomial basis evaluation of Legendre expansions. SIAM J. Sci. Statist. conversion and greatest factorial factorization. In RWCA’00, Comp., 12(1):158–179, 1991. pages 125–141, 2000. [3] G. Andrews, R. Askey, and R. Roy. Special functions. [19] G. Hanrot, M. Quercia, and P. Zimmermann. The Middle Cambridge University Press, 1999. Product Algorithm, I. Appl. Algebra Engrg. Comm. Comp., [4] R. Barrio and J. Pe˜na.Basis conversions among univariate 14(6):415–438, 2004. polynomial representations. C. R. Math. Acad. Sci. Paris, [20] G. Hanrot and P. Zimmermann. Newton iteration revisited. 339(4):293–298, 2004. http://www.loria.fr/ zimmerma/papers, 2002. [5] D. J. Bernstein. Composing power series over a finite in [21] G. Heinig. Fast and superfast algorithms for Hankel-like essentially linear time. J. Symb. Comp., 26(3):339–341, 1998. matrices related to orthogonal polynomials. In NAA’00, [6] D. Bini and V. Y. Pan. Polynomial and matrix computations. volume 1988 of LNCS, pages 361–380. Springer-Verlag, 2001. Vol. 1. Birkh¨auserBoston Inc., 1994. [22] J. van der Hoeven. Relax, but don’t be too lazy. J. Symb. [7] A. Bostan, G. Lecerf, and E.´ Schost. Tellegen’s principle into Comput., 34(6):479–542, 2002. practice. In ISSAC’03, pages 37–44. ACM, 2003. [23] E. Kaltofen. Challenges of symbolic computation: my favorite [8] A. Bostan, B. Salvy, and E.´ Schost. Fast algorithms for open problems. J. Symb. Comp., 29(6):891–919, 2000. orthogonal polynomials. In preparation. [24] E. Kaltofen and Y. Lakshman. Improved sparse multivariate [9] A. Bostan and E.´ Schost. Polynomial evaluation and polynomial interpolation algorithms. In ISSAC’88, volume 358 interpolation on special sets of points. J. Complexity, of LNCS, pages 467–474. Springer Verlag, 1989. 21(4):420–446, 2005. [25] J. Keiner. Computing with expansions in Gegenbauer [10] R. P. Brent. Multiple-precision zero-finding methods and the polynomials. Preprint AMR07/10, U. New South Wales, 2007. complexity of elementary function evaluation. In Analytic [26] G. Leibon, D. Rockmore, and G. Chirikjian. A fast Hermite Computational Complexity, pages 151–176. Acad. Press, 1975. transform with applications to protein structure determination. [11] R. P. Brent and H. T. Kung. Fast algorithms for manipulating In SNC’07, pages 117–124, New York, NY, USA, 2007. ACM. formal power series. J. ACM, 25(4):581–595, 1978. [27] Y.-M. Li and X.-Y. Zhang. Basis conversion among B´ezier, [12] P. B¨urgisser,M. Clausen, and A. Shokrollahi. Algebraic Tchebyshev and Legendre. Comput. Aided Geom. Design, complexity theory, volume 315 of GMW. Springer–Verlag, 1997. 15(6):637–642, 1998. [13] J. Canny, E. Kaltofen, and Y. Lakshman. Solving systems of [28] V. Y. Pan. New fast algorithms for polynomial interpolation non-linear polynomial equations faster. In ISSAC’89, pages and evaluation on the Chebyshev node set. Computers and 121–128. ACM, 1989. Mathematics with Applications, 35(3):125–129, 1998. [14] D. G. Cantor and E. Kaltofen. On fast multiplication of [29] D. Potts, G. Steidl, and M. Tasche. Fast algorithms for discrete polynomials over arbitrary algebras. Acta Inform., polynomial transforms. Math. Comp., 67(224):1577–1590, 1998. 28(7):693–701, 1991. [30] S. Roman. The umbral calculus. Dover publications, 2005. [15] J. R. Driscoll, J. D. M. Healy, and D. N. Rockmore. Fast [31] A. Sch¨onhageand V. Strassen. Schnelle Multiplikation großer discrete polynomial transforms with applications to data Zahlen. Computing, 7:281–292, 1971. analysis for distance transitive graphs. SIAM J. Comp., [32] V. Shoup. A new polynomial factorization algorithm and its 26(4):1066–1099, 1997. implementation. J. Symb. Comp., 20(4):363–397, 1995.

276