arXiv:2010.10824v2 [math.NA] 4 Nov 2020

MULTIVARIATE INTERPOLATION ON UNISOLVENT NODES LIFTING THE CURSE OF DIMENSIONALITY

MICHAEL HECHT, KRZYSZTOF GONCIARZ, JANNIK MICHELFEIT, VLADIMIR SIVKIN, AND IVO F. SBALZARINI

Abstract. We present generalizations of the classic Newton and Lagrange interpolation schemes to arbitrary dimensions. The core contribution that enables this new method is the notion of unisolvent nodes, i.e., nodes on which the multivariate polynomial interpolant of a function is unique. We prove that by choosing these nodes in a proper way, the resulting interpolation schemes become generic, while approximating all continuous Sobolev functions. If in addition the function is analytical in the Trefethen domain then, by validation, we achieve the optimal exponential approximation rate given by the upper bound in Trefethen’s Theorem. The number of interpolation nodes required for computing the optimal interpolant depends sub-exponentially on the dimension, hence resisting the curse of dimensionality. Based on this, we propose an algorithm that can efficiently and numerically stably solve arbitrary-dimensional interpolation problems, and approximate non-analytical functions, with at most quadratic runtime and linear memory requirement.

1. Introduction

Polynomial interpolation goes back to Newton, Lagrange, and others [70], and its fundamental importance for mathematics and computing is undisputed. Interpolation is based on the fact that, in 1D, one and only one polynomial $Q_{f,n}$ of degree at most $n$ can interpolate a function $f : \mathbb{R} \longrightarrow \mathbb{R}$ on $n + 1$ distinct unisolvent interpolation nodes $P_n \subseteq \mathbb{R}$, i.e., $Q_{f,n}(p_i) = f(p_i)$ for all $p_i \in P_n$, $0 \le i \le n$. This makes interpolation fundamentally different from approximation. For the latter, the famous Weierstrass Theorem [93] states that any continuous function $f \in C^0(\Omega, \mathbb{R})$, $\Omega = [-1, 1]$, can be uniformly approximated by polynomials in principle [30, 93]. However, the Weierstrass Theorem does not require the polynomials to coincide with $f$ at all, i.e., it is possible that $Q_{\mathrm{Weierstrass},f,n}(x) \neq f(x)$ for all $x \in \Omega$, but still

(1.1) $Q_{\mathrm{Weierstrass},f,n} \xrightarrow[n \to \infty]{} f$ uniformly on $\Omega$.

Also, even the constructive version of the Weierstrass Theorem given by Serge Bernstein [7] only ensures a linear convergence rate. There has therefore been much research into faster approximation schemes. Here, we show that solving a unisolvent interpolation problem can provide higher (in fact, exponential) convergence rates, and yield a computationally efficient and numerically stable algorithm for computing actual instances of such polynomials, using a number of nodes that grows only sub-exponentially with the space dimension.

2020 Mathematics Subject Classification. Primary 65D15, 41A50, 41A63, 41A05; Secondary 41A25, 41A10.
Key words and phrases. Newton interpolation, Lagrange interpolation, unisolvent nodes, multivariate approximation, Runge’s phenomenon.

Already in 1D, it has long been known [37, 62, 80] that for any sequence of interpolation nodes $P_n \subseteq \Omega$, $n \in \mathbb{N}$, there exists at least one continuous function $f : \Omega \longrightarrow \mathbb{R}$ that cannot be approximated by interpolation on $P_n$, i.e.,

$Q_{f,n} \underset{n \to \infty}{\not\longrightarrow} f$, where $Q_{f,n}(q) = f(q)$, $\forall q \in P_n$.

Instead, the approximation quality of an interpolation polynomial is sensitive to the choice of the interpolation nodes $P_n \subseteq \Omega$. In other words: Interpolating $f$ with a polynomial $Q_{f,n}$ of increasing degree $n \in \mathbb{N}$ does not guarantee that the interpolant $Q_{f,n}$ converges to $f$. This fact is famously known as Runge’s phenomenon [80]. Hence, a universal interpolation scheme that approximates all continuous functions does not exist. The best one can hope for is to choose interpolation nodes such that the interpolation scheme approximates “as many” functions as possible. In 1D, for instance, interpolation on Chebyshev and Legendre nodes is known to avoid Runge’s phenomenon for a generic class of functions and to yield exponential approximation rates [91], which is much faster than what has been shown possible by Weierstrass-type approximations [7]. There is thus a certain interest in extending efficient and numerically stable interpolation schemes to higher dimensions for determining function approximations. Therefore, many approaches to extending polynomial interpolation to higher dimensions exist [9, 28, 29, 36, 40, 41, 48, 65, 76, 82, 91]. Yet, none of them answers the question of how to maintain the full power of Newton, Lagrange, or other 1D interpolation schemes in multi-dimensions (mD). That is:

Question 1. How to construct interpolation nodes $P_{A_{m,n}} \subseteq \Omega = [-1, 1]^m$, $m, n \in \mathbb{N}$, and an efficient and numerically stable interpolation algorithm such that a generic class of functions $f : \Omega \longrightarrow \mathbb{R}$ can be uniformly approximated

$Q_{f,A_{m,n}} \xrightarrow[n \to \infty]{} f$, $Q_{f,A_{m,n}}(q) = f(q)$, $\forall q \in P_{A_{m,n}}$

by the unique interpolant $Q_{f,A_{m,n}}$ with a fast (ideally exponential) convergence rate while keeping the number of interpolation nodes $|P_{A_{m,n}}|$ required small (ideally, sub-exponential)?

To the best of our knowledge, Chebyshev interpolation schemes (Chebfun) [32, 90, 91] best answer this question among all state-of-the-art approaches. However, their implementation has been limited to dimensions $m \le 3$, due to the curse of dimensionality, because the number $|P_{A_{m,n}}|$ of interpolation nodes scales exponentially with dimension, $|P_{A_{m,n}}| \in \mathcal{O}(n^m)$, in these approaches, rendering them infeasible in high dimensions.

Alternative approaches [3, 15, 17, 18, 19, 72] are available to realize mD Weierstrass-type approximations. However, the linear convergence rate of the Bernstein approximation [7] is reflected in the circumstance that these approaches are prevented from approximating a generic class of functions, but are limited to well-behaved, a-priori bounded analytical or holomorphic functions occurring, for instance, as solutions of elliptic PDEs. In these scenarios, reasonable uniform approximations of the function $f$ can be reached by sparse samples that avoid the curse of dimensionality in high dimensions $m \in \mathbb{N}$, $m \le 16$. However, when such approaches are asked to deliver approximations to machine precision, or to leave the tight class of well-behaved functions, their resistance to the curse of dimensionality disappears already for low dimensions $m \le 6$.
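The sensitivity to the node choice described above (Runge's phenomenon) can be reproduced in 1D with a few lines. The following is a minimal sketch, not code from the article: it uses classic 1D Newton interpolation via divided differences, the 1D Runge function $1/(1 + 10x^2)$, and degree $n = 30$, all chosen here purely for illustration. The helper names `newton_coefficients` and `newton_eval` are hypothetical.

```python
import numpy as np

def newton_coefficients(nodes, values):
    """Classic 1D divided differences; returns the Newton coefficients."""
    c = np.array(values, dtype=float)
    n = len(nodes)
    for k in range(1, n):
        # after step k, c[i] holds the divided difference f[x_{i-k}, ..., x_i]
        c[k:] = (c[k:] - c[k - 1:-1]) / (nodes[k:] - nodes[:-k])
    return c

def newton_eval(nodes, coeffs, x):
    """Horner-like evaluation of the Newton form at the points x."""
    x = np.asarray(x, dtype=float)
    q = np.full_like(x, coeffs[-1])
    for k in range(len(coeffs) - 2, -1, -1):
        q = q * (x - nodes[k]) + coeffs[k]
    return q

def runge(x):
    # 1D instance of the Runge function used in the text (assumed factor 10)
    return 1.0 / (1.0 + 10.0 * x**2)

n = 30                                        # polynomial degree (illustrative)
x_test = np.linspace(-1.0, 1.0, 2001)         # fine grid for the sup-norm error
equi = np.linspace(-1.0, 1.0, n + 1)          # equispaced nodes
cheb = np.cos(np.pi * np.arange(n + 1) / n)   # Chebyshev extreme points

def sup_error(nodes):
    coeffs = newton_coefficients(nodes, runge(nodes))
    return np.max(np.abs(newton_eval(nodes, coeffs, x_test) - runge(x_test)))

err_equi = sup_error(equi)
err_cheb = sup_error(cheb)
print(f"equispaced: {err_equi:.3e}, Chebyshev: {err_cheb:.3e}")
```

Both interpolants match the function at their own nodes; only the behaviour between the nodes differs, which is why the choice of nodes, and not the interpolation condition itself, decides whether the scheme converges.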

Recently, Lloyd Trefethen [90] proposed a way of overcoming these issues, by proving that for continuous functions $f : \Omega \longrightarrow \mathbb{R}$ that are only required to be analytical in the Trefethen domain $N_{m,\rho}$, an upper bound on the convergence rate applies:

(1.2) $\| f - Q_{f,A_{m,n,p}} \|_{C^0(\Omega)} \in \begin{cases} \mathcal{O}_\varepsilon(\rho^{-n/\sqrt{m}}), & p = 1 \\ \mathcal{O}_\varepsilon(\rho^{-n}), & p = 2 \\ \mathcal{O}_\varepsilon(\rho^{-n}), & p = \infty \end{cases}$

where $g \in \mathcal{O}_\varepsilon(\rho^{-n})$ iff $g \in \mathcal{O}((\rho - \varepsilon)^{-n})$, $\forall \varepsilon > 0$. The multi-index sets $A_{m,n,p} = \{\alpha \in \mathbb{N}^m : \|\alpha\|_p \le n\} \subseteq \mathbb{N}^m$ generalize the notion of polynomial degree to multi-dimensional $l_p$-degree. As numerically demonstrated in Ref. [90], the rates apply for the Runge function $f(x) = 1/(1 + 10\|x\|^2)$, which is not analytical in $\Omega$ but analytical in $N_{m,\rho}$ with $\rho \approx 1.365$. This suggests that interpolating the function with respect to the polynomial space $\Pi_{m,n,2} = \mathrm{span}\{x^\alpha\}_{\alpha \in A_{m,n,2}}$ spanned by all $l_2$-monomials reaches the same convergence rates as interpolating with respect to the $l_\infty$-degree $A_{m,n,\infty}$, while $l_1$-degree cannot reach such a fast rate.

The number of coefficients $|A_{m,n,1}| = \binom{m+n}{n} \in \mathcal{O}(m^n)$ required is of polynomial cardinality for $p = 1$, whereas $|A_{m,n,2}| \in o(n^m)$ is of sub-exponential size for $p = 2$, but $|A_{m,n,\infty}| = (n + 1)^m$ scales exponentially with dimension $m \in \mathbb{N}$ for $p = \infty$. Thus, interpolation with respect to $l_2$-degree polynomials might yield a way of answering Question 1.
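The cardinalities quoted above can be checked by brute-force enumeration for small $m$ and $n$. The sketch below is illustrative only; the function name `multi_indices` is hypothetical, and the enumeration scans the full grid $\{0, \dots, n\}^m$, so it is itself exponential in $m$ and meant for small cases.

```python
import itertools
import math

def multi_indices(m, n, p):
    """Enumerate A_{m,n,p} = {alpha in N^m : ||alpha||_p <= n} by brute force."""
    A = []
    for alpha in itertools.product(range(n + 1), repeat=m):
        if p == math.inf:
            inside = max(alpha) <= n          # l_infinity degree
        else:
            # compare ||alpha||_p^p with n^p to avoid taking p-th roots
            inside = sum(a**p for a in alpha) <= n**p
        if inside:
            A.append(alpha)
    return A

if __name__ == "__main__":
    n = 5
    for m in (2, 3, 4):
        sizes = [len(multi_indices(m, n, p)) for p in (1, 2, math.inf)]
        print(f"m={m}: |A_1|={sizes[0]}, |A_2|={sizes[1]}, |A_inf|={sizes[2]}")
```

For $p = 1$ the count agrees with $\binom{m+n}{n}$ and for $p = \infty$ with $(n+1)^m$; the $l_2$ set sits strictly between the two, and the gap to $(n+1)^m$ widens as $m$ grows.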

However, in [90] the interpolants $Q_{f,A_{m,n,1}}$, $Q_{f,A_{m,n,2}}$ were computed by regression over $P_{A_{m,n,\infty}}$, which requires evaluating $f$ on the exponentially many $|A_{m,n,\infty}| = (n+1)^m$ nodes $P_{A_{m,n,\infty}}$. In other words: There is no numerically stable and efficient algorithm known that, given a polynomial space $\Pi_A = \mathrm{span}\{x^\alpha\}_{\alpha \in A}$, $A \subseteq \mathbb{N}^m$, finds interpolation nodes $P_A$ of minimum cardinality $|P_A| = |A| = \dim \Pi_A$ that uniquely determine an interpolant $Q_{f,A}$ such that $Q_{f,A}$, $A = A_{m,n,p}$, reaches the rates stated as upper bounds in Eq. (1.2) for $p = 2$. Constructing such an algorithm would require answering the following long-standing open question:

Question 2. Let $m \in \mathbb{N}$, $f : \Omega = [-1, 1]^m \longrightarrow \mathbb{R}$ be a function, and $\Pi_A = \mathrm{span}\{x^\alpha\}_{\alpha \in A}$, $A \subseteq \mathbb{N}^m$, be a polynomial space. How to find interpolation nodes $P_A \subseteq \Omega$, $|P_A| = |A|$, such that:
C1) The interpolant $Q_{f,A} \in \Pi_A$, $Q_{f,A}(p_\alpha) = f(p_\alpha)$ for all $p_\alpha \in P_A$, $\alpha \in A$, is uniquely determined in $\Pi_A$, i.e., the $P_A$ are unisolvent.
C2) The interpolant $Q_{f,A}$ can be computed numerically stably and efficiently.
C3) The interpolant $Q_{f,A}$ reaches the convergence rate $\mathcal{O}(\rho^{-n})$, $\rho > 0$, proposed in Eq. (1.2) whenever $f$ is analytical in the Trefethen domain $N_{m,\rho}$ and $A = A_{m,n,p}$ with $p = 2$.

When aiming to answer Question 1, even an answer to Question 2 would leave a further question open:

Question 3. Assume that unisolvent interpolation nodes $P_A$ satisfying conditions (C1–C3) of Question 2 are given. What is the subspace $H(\Omega, \mathbb{R}) \subseteq C^0(\Omega, \mathbb{R})$ of all continuous functions $f : \Omega \longrightarrow \mathbb{R}$ that can be approximated by interpolation on $P_A$ regardless of the achieved rate, i.e.,

$Q_{f,A} \xrightarrow[n \to \infty]{} f$ uniformly for all $f \in H(\Omega, \mathbb{R}) \subseteq C^0(\Omega, \mathbb{R})$?

Note that while multi-dimensional versions of the Weierstrass Theorem state that any continuous function $f \in C^0(\Omega, \mathbb{R})$ can be approximated by polynomials in principle, Question 3 asks whether such polynomials can be practically computed by interpolation.

1.1. Unisolvent nodes – The core contribution of this article. Here, we address the above questions and provide answers based on a new notion of unisolvent interpolation nodes $P_A \subseteq \Omega = [-1, 1]^m$, $m \in \mathbb{N}$. In particular, the new notion allows us to generalize Newton (NI) and Lagrange (LI) interpolation to arbitrary dimensions $m \in \mathbb{N}$ such that:

P1) Computing the uniquely determined multivariate interpolant $Q_{f,A} \in \Pi_A$ requires $\mathcal{O}(|A|^2)$ or $\mathcal{O}(|A|)$ runtime for NI and LI, respectively, where $|A| = |P_A| = \dim \Pi_A$.
P2) Evaluating the interpolant $Q_{f,A}(x_0)$ at any point $x_0 \in \mathbb{R}^m$ requires $\mathcal{O}(|A|)$ or $\mathcal{O}(|A|^2)$ runtime for NI and LI, respectively.
P3) Any Sobolev function $f \in H^k(\Omega, \mathbb{R}) \subseteq C^0(\Omega, \mathbb{R})$ with $k > m/2$ can be approximated by interpolation, i.e., for all $A_{m,n,p} \subseteq \mathbb{N}^m$, $1 \le p \le \infty$, there are unisolvent interpolation nodes $P_{A_{m,n,p}}$ such that $Q_{f,A_{m,n,p}} \xrightarrow[n \to \infty]{} f$ uniformly.
P4) Numerical experiments validate that for the non-analytical Runge function $f(x) = 1/(1 + 10\|x\|^2)$ the nodes $P_A$, $A = A_{m,n,2}$, yield the optimal convergence rate $\mathcal{O}(\rho^{-n})$ stated in Eq. (1.2), with the theoretically predicted $\rho$, while requiring only sub-exponentially many interpolation nodes $|P_A| \in o(n^m)$.

Note that the space $H^k(\Omega, \mathbb{R}) \subseteq C^0(\Omega, \mathbb{R})$, $k > m/2$, is the largest Hilbert space contained in the space of continuous functions. Therefore, we generalize known results in 1D [45, 86]. Further, we propose to combine the insights from (P3) with recent results [10], in order to make progress toward a mathematical proof for these rates to apply.

In any case, (P3) makes generic the class of functions that can be approximated by interpolation and enables a broad range of applications as, for instance, indicated by the fact that PDEs are usually formulated in the weak Sobolev sense [53, 60, 64]. In summary, we answer Question 1 by establishing an efficient mD interpolation scheme that can approximate a generic class of Sobolev functions and, at least by validation, reaches the proposed exponential approximation rate for strongly varying non-analytical functions such as the Runge function $f(x) = 1/(1 + 10\|x\|^2)$ while requiring only a sub-exponential number of interpolation nodes, hence lifting the curse of dimensionality. Our results also prove wrong the commonly believed hypothesis that multivariate splines would have better approximation properties than polynomial interpolation [23, 24, 25, 26]. In fact, multivariate splines reach a polynomial approximation rate, but not an exponential rate [27]. Also the recently introduced Floater–Hormann interpolation [38], which is based on using rational functions as interpolants and has been shown to possess better approximation quality than splines, only reaches a polynomial convergence rate [16]. This, together with the numerical experiments presented here in Section 8, suggests that the novel multivariate polynomial interpolation method presented here is superior.

1.2. Notation. Let $m, n \in \mathbb{N}$, $p \ge 1$. We denote by $e_1 = (1, 0, \dots, 0), \dots, e_m = (0, \dots, 0, 1) \in \mathbb{R}^m$ the standard basis, by $\|\cdot\|$ the Euclidean norm on $\mathbb{R}^m$, and by $\|M\|_p$ the $l_p$-norm of a matrix $M \in \mathbb{R}^{m \times m}$. Further, $A_{m,n,p} \subseteq \mathbb{N}^m$ denotes all multi-indices $\alpha = (\alpha_1, \dots, \alpha_m) \in \mathbb{N}^m$ with $\|\alpha\|_p = (\alpha_1^p + \cdots + \alpha_m^p)^{1/p} \le n$.

We order a finite set $A \subseteq \mathbb{N}^m$, $m \in \mathbb{N}$, of multi-indices with respect to the lexicographical order $\le_L$ on $\mathbb{N}^m$ starting from $x_m$ to $x_1$, e.g., $(5, 3, 1) \le_L (1, 0, 3) \le_L (1, 1, 3)$. Thereby, $\alpha_{\min}$, $\alpha_{\max}$ shall denote the minimum and maximum of $A = \{\alpha_{\min}, \dots, \alpha_{\max}\}$ with respect to $\le_L$. We call $A$ complete iff there is no $\beta = (b_1, \dots, b_m) \in \mathbb{N}^m \setminus A$ with $b_i \le a_i$, $\forall i = 1, \dots, m$, for some $\alpha = (a_1, \dots, a_m) \in A$. In [19] the terminology downward closed set is used. Note that $A_{m,n,p}$ is complete for all $m, n \in \mathbb{N}$, $p \ge 1$. Given $A \subseteq \mathbb{N}^m$ complete and a matrix $R_A \in \mathbb{R}^{|A| \times |A|}$ we slightly abuse notation by denoting

(1.3) $R_A = (r_{\alpha,\beta})_{\alpha,\beta \in A} = (r_{i,j})_{1 \le i,j \le |A|}$,

with $\alpha$, $\beta$ being the $i$-th, $j$-th entry of $A$ ordered by $\le_L$, respectively.

We consider the real polynomial ring $\mathbb{R}[x_1, \dots, x_m]$ in $m$ variables and denote by $\Pi_m$ the $\mathbb{R}$-vector space of all real polynomials in $m$ variables. Further, $\Pi_A \subseteq \Pi_m$ denotes the polynomial subspace induced by $A$ and generated by the canonical basis given by the monomials $x^\alpha = x_1^{\alpha_1} \cdots x_m^{\alpha_m}$ with $\alpha \in A$. For $A = A_{m,n,p}$ we write $\Pi_{m,n,p}$ to mean $\Pi_{A_{m,n,p}}$. Given a polynomial $Q(x) = \sum_{\alpha \in A} c_\alpha x^\alpha$, $A \subseteq \mathbb{N}^m$, we call $\max_{\alpha \in A,\, c_\alpha \neq 0} \|\alpha\|_p$ the $l_p$-degree of $Q$. As it will turn out, the notion of $l_p$-degree plays a crucial role for the approximation quality of polynomial interpolation. While $A_{1,n,p} = \{0, \dots, n\}$, for $m > 1$ considering $A \subseteq \mathbb{N}^m$ generalizes the concept of polynomial degree to multi-dimensions.

Throughout this article $\Omega = [-1, 1]^m$ denotes the $m$-dimensional standard hypercube and $C^0(\Omega, \mathbb{R})$ the Banach space of continuous functions $f : \Omega \longrightarrow \mathbb{R}$ with norm $\|f\|_{C^0(\Omega)} = \sup_{x \in \Omega} |f(x)|$. Finally, we use the standard Landau symbols $\mathcal{O}(\cdot)$, $o(\cdot)$:

$f \in \mathcal{O}(g) \iff \limsup_{x \to \infty} \frac{|f(x)|}{|g(x)|} < \infty$, $\quad f \in o(g) \iff \lim_{x \to \infty} \frac{|f(x)|}{|g(x)|} = 0$.

1.3. The notion of unisolvence. We consider a complete set $A \subseteq \mathbb{N}^m$ of multi-indices, a set of interpolation nodes $P_A = \{p_{\alpha_{\min}}, \dots, p_{\alpha_{\max}}\} \subseteq \mathbb{R}^m$, and a function $f : \mathbb{R}^m \longrightarrow \mathbb{R}$. Let the multivariate Vandermonde matrix be given by

(1.4) $V(P) = \left(p_\alpha^\beta\right)_{\alpha,\beta \in A}$.

If $V(P)$ is (numerically) invertible then one can interpolate $f$ by solving the linear system of equations

$V(P)\,C = F$, $\quad C = (c_{\alpha_{\min}}, \dots, c_{\alpha_{\max}})$, $\quad F = (f(p_{\alpha_{\min}}), \dots, f(p_{\alpha_{\max}}))$

using $\mathcal{O}(|A|^3)$ operations. Indeed,

(1.5) $Q_{f,A}(x) = \sum_{\alpha \in A} c_\alpha x^\alpha \in \Pi_A$

yields the unique interpolant of $f$ on $P_A$, i.e., $Q_{f,A}(p) = f(p)$ for all $p \in P_A$. We call a set of nodes $P_A$ unisolvent if and only if $V(P)$ is invertible, i.e., if and only if $\ker V(P) = 0$.

While for a set of nodes to be unisolvent in 1D it suffices for $P$ to be of cardinality $|A|$, in mD the condition $\ker V(P) = 0$ is equivalent to requiring that there exists no hypersurface $H = Q^{-1}(0)$ generated by a polynomial $0 \neq Q \in \Pi_A$ with $P \subseteq H$. Indeed, the coefficients $C$ of such a polynomial would be a non-trivial solution of $V(P)\,C = 0$.
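The Vandermonde formulation above can be sketched directly with NumPy. This is a minimal illustration under stated assumptions: the set $A = A_{2,2,1}$, the generating values `g`, the test polynomial `f`, and the helper name `vandermonde` are all chosen here for the example and do not come from the article. The second part places six nodes on the unit circle, which is the zero set of $x_1^2 + x_2^2 - 1 \in \Pi_A$, so those nodes are not unisolvent.

```python
import numpy as np

def vandermonde(nodes, A):
    """V[i, j] = prod_k nodes[i, k] ** A[j][k] (rows: nodes, columns: monomials)."""
    nodes = np.asarray(nodes, dtype=float)
    exps = np.asarray(A, dtype=int)
    return np.prod(nodes[:, None, :] ** exps[None, :, :], axis=2)

# A_{2,2,1}, listed in the lexicographic order that compares the last entry first
A = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (0, 2)]

# Nodes p_alpha = (g[alpha_1], g[alpha_2]) built from 1D values g (illustrative)
g = np.array([-1.0, 0.0, 1.0])
P = np.array([[g[a1], g[a2]] for a1, a2 in A])

def f(x, y):
    # a polynomial that lies in Pi_A, so interpolation must reproduce it exactly
    return 1.0 + 2.0 * x - x * y + 3.0 * y**2

V = vandermonde(P, A)
F = f(P[:, 0], P[:, 1])
C = np.linalg.solve(V, F)          # O(|A|^3) dense solve of V C = F
print("coefficients:", np.round(C, 12))

# Six nodes on the unit circle: x^2 + y^2 - 1 vanishes on all of them,
# so V has a non-trivial kernel and the nodes are not unisolvent.
theta = np.pi * np.arange(6) / 3.0
P_circle = np.column_stack([np.cos(theta), np.sin(theta)])
rank_circle = np.linalg.matrix_rank(vandermonde(P_circle, A), tol=1e-10)
print("rank on circle nodes:", rank_circle, "of", len(A))
```

The dense solve is only practical for small $|A|$; it also inherits the conditioning of the monomial basis, which is one motivation for the Newton and Lagrange forms developed later.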

We use this fact in Section 2 to establish a characterization of unisolvence in terms of a splitting statement, see Theorem 2.6. This allows constructing unisolvent nodes by recursively applying that splitting. In a less general form, this idea was already proposed in [48]. Here, we use this recursive splitting to define multivariate Newton and Lagrange polynomials and to extend the corresponding interpolation schemes to arbitrary dimensions, see Sections 3 and 4. This not only overcomes problems of numerical instability when numerically inverting the high-dimensional Vandermonde matrix, but it also establishes mD interpolation algorithms of $\mathcal{O}(|A|^2)$ runtime complexity and $\mathcal{O}(|A|)$ memory space complexity.

1.4. The dual notion of unisolvence. Rather than constructing unisolvent nodes de novo, Carl de Boor and Amos Ron [28, 29] considered the dual problem of unisolvence. That is: for a given polynomial space $\Pi$ and a set of nodes $P \subseteq \mathbb{R}^m$ that is not unisolvent with respect to $\Pi$, find a subset $P_0 \subseteq P$ and a polynomial subspace $\Pi_{P_0} \subseteq \Pi$, such that $P_0$ is unisolvent with respect to $\Pi_{P_0} \subseteq \Pi$ [28, 29]. We suggest rephrasing the problem in more abstract mathematical terms by considering the map

(1.6) $\Gamma_k : \mathcal{P}_k \longrightarrow \mathrm{Gr}(k, X)$, $\quad X = \Pi_{m,k,\infty}$,

where $\mathcal{P}_k = \{P \subseteq \mathbb{R}^m : |P| = k\}$ is the set of all finite subsets of $\mathbb{R}^m$ of cardinality $k$, and $\mathrm{Gr}(k, X)$ is the Grassmann manifold, i.e., the smooth manifold that consists of all $k$-dimensional subspaces of the vector space $X$ [73]. In particular, $\mathrm{Gr}(1, \mathbb{R}^m) = \mathbb{RP}^{m-1}$, $\mathrm{Gr}(1, \mathbb{C}^m) = \mathbb{CP}^{m-1}$ are the real and complex projective spaces, respectively [31, 51]. In Section 6 we show that for every set of nodes $P \in \mathcal{P}_k$ there exists one and only one polynomial subspace $Y(P) \subseteq X$ such that $P$ is unisolvent with respect to $Y(P)$. Thereby, setting $\Gamma_k(P) = Y(P)$ yields a well-defined and smooth map. Revisiting the work of Carl de Boor and Amos Ron [28, 29] becomes thereby possible by considering the subspace $\tilde{Y}(P) = \Gamma_k(P) \cap \Pi$. The set $P_0 \subseteq P$ and a basis of $\tilde{Y}(P)$ can be derived by using Gaussian elimination [92] on the associated Vandermonde matrix $V(P)$. We note that properties of the map $P \mapsto \tilde{Y}(P)$ discussed in [28, 29], such as continuity and differentiability, can be naturally deduced by considering $\Gamma_k$ from Eq. (1.6). In Section 6 we use our notion of unisolvent nodes and the derived Newton and Lagrange interpolation schemes to reformulate and simplify the approach of Carl de Boor and Amos Ron [28, 29]. In Section 8 we present a numerical experiment that demonstrates how the reformulated approach can be used for interpolation on non-planar geometries, such as Riemannian surfaces given as affine algebraic varieties [50], e.g., the torus $\mathbb{T}^2$.

1.5. Approximation power of multivariate splines. A prominent and well established alternative to the already discussed Weierstrass-type approximation schemes [3, 15, 17, 18, 19, 72] is the multivariate spline interpolation developed by Carl de Boor et al. [23, 24, 25, 26]. The approximation result [27] for bivariate splines due to de Boor is given as:

Theorem 1.1 (Carl de Boor). Let $f : \Omega \subseteq \mathbb{R}^2 \longrightarrow \mathbb{R}$ be an $(n+1)$-times continuously differentiable bivariate function, $\Delta$ be a triangulation, and $S_{f,n,\Delta} = \{g \in C^\rho(\Omega, \mathbb{R}) : g|_\delta \in \Pi_{m,n,1},\ \forall \delta \in \Delta\}$, $n > 3\rho + 1$, be the space of piecewise polynomial functions of degree $n$. Then there exists $c(\Delta) > 0$ such that

(1.7) $\mathrm{dist}(f, S_{f,n,\Delta}) \le c(\Delta)\, \|D^{n+1} f\|_{C^0(\Omega)}\, |\Delta|^{n+1}$,

where $|\Delta| = \max_{T \in \Delta} |T|$ is the mesh size.

This result states that any “sufficiently smooth” function $f$ can be approximated by piecewise polynomial functions, which allows approximating $f$ by Hermite or spline interpolation. Generalizations of this result rely on this fact and are formulated in a similar manner [23, 24, 26]. Despite its fundamental importance in linking interpolation and approximation, the above result and the resulting interpolation algorithms have some weak points:
A) The strong regularity assumption, i.e., the $(n+1)$-fold differentiability of $f$;
B) The error bound in Eq. (1.7) only guarantees a polynomial convergence rate, but no exponential convergence;
C) The approach is sensitive to the curse of dimensionality, i.e., the number of polynomial coefficients $C_{m,n}$ scales exponentially with dimension: $|C_{m,n}| \in \mathcal{O}(n^m)$.

Especially for non-analytical functions, such as the Runge function $f(x) = 1/(1 + 10\|x\|^2)$, (B) hampers spline interpolation. Several improvements have been presented, including Floater–Hormann interpolation [16, 38], which reaches better approximation quality than splines. However, all of them share the above weaknesses (A,B,C), as we demonstrate in the numerical experiments of Section 8. Both the numerical experiments and the aspects introduced in Section 1.1 suggest that multivariate polynomial interpolation overcomes the above weaknesses (A,B,C) of spline interpolation.

2. Unisolvent nodes

While the pioneering work of Guenther and Roetman [48] proposes a partial answer to (C1) of Question 2, it does not answer (C2–C3). Independent of these works, we provide a notion of unisolvent nodes for multivariate polynomial interpolation with respect to arbitrary polynomial spaces $\Pi_A$, $A \subseteq \mathbb{N}^m$. Preliminary versions were already presented in our own previous work [52, 54, 55]. Here, we fully develop the concept and extend it beyond these former versions. We start by stating the definitions on which our concept rests:

Definition 2.1 (Transformations). An affine transformation $\tau : \mathbb{R}^m \longrightarrow \mathbb{R}^m$, $m \in \mathbb{N}$, is a map $\tau(x) = Bx + b$, where $B \in \mathbb{R}^{m \times m}$ is an invertible matrix and $b \in \mathbb{R}^m$. An affine translation is an affine transformation with $B = I$ the identity matrix. A linear transformation is an affine transformation with $b = 0$.

The following fact is straightforward to prove:

Lemma 2.2. Any affine transformation $\tau : \mathbb{R}^m \longrightarrow \mathbb{R}^m$, $m \in \mathbb{N}$, induces a ring isomorphism $\tau^* : \mathbb{R}[x_1, \dots, x_m] \longrightarrow \mathbb{R}[x_1, \dots, x_m]$, i.e., the induced transformation

$\tau^* : \Pi_m \longrightarrow \Pi_m$ given by $\tau^*(Q)(x) = Q(\tau(x))$, $\forall x \in \mathbb{R}^m$,

is a linear transformation, such that $\tau^*(Q_1 Q_2) = \tau^*(Q_1)\,\tau^*(Q_2)$ for all $Q_1, Q_2 \in \Pi_m$, and $\tau^*(1) = 1$.
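For completeness, the verification behind Lemma 2.2 can be sketched in three lines. This is a routine check added for the reader, not a proof taken from the article; it only uses the definition $\tau^*(Q)(x) = Q(\tau(x))$ and the fact that $\tau^{-1}(x) = B^{-1}(x - b)$ is again affine.

```latex
\begin{align*}
\tau^*(Q_1 Q_2)(x) &= (Q_1 Q_2)(\tau(x)) = Q_1(\tau(x))\, Q_2(\tau(x))
                    = \tau^*(Q_1)(x)\, \tau^*(Q_2)(x), \\
\tau^*(\lambda Q_1 + Q_2)(x) &= \lambda\, Q_1(\tau(x)) + Q_2(\tau(x))
                    = \lambda\, \tau^*(Q_1)(x) + \tau^*(Q_2)(x), \qquad \lambda \in \mathbb{R}, \\
\big((\tau^{-1})^* \circ \tau^*\big)(Q)(x) &= Q\big(\tau(\tau^{-1}(x))\big) = Q(x).
\end{align*}
```

The first two lines give multiplicativity and linearity, and the third shows that $(\tau^{-1})^*$ inverts $\tau^*$, so $\tau^*$ is bijective. Since each component of $\tau$ is a polynomial of degree one, $Q \circ \tau$ is again a polynomial.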

Definition 2.3. If $\Pi \subseteq \Pi_m$, $m \in \mathbb{N}$, is a polynomial subspace, then we call $\tau : \mathbb{R}^m \longrightarrow \mathbb{R}^m$ a canonical transformation with respect to $\Pi$ if and only if $\tau$ is an affine transformation such that the induced transformation $\tau^* : \Pi \longrightarrow \Pi_m$ maps onto $\Pi$, i.e., $\tau^*(\Pi) \subseteq \Pi$.

Remark 2.4. Note that any affine translation and any affine transformation with $B$ being a diagonal matrix is a canonical transformation with respect to $\Pi_{m,n,p}$, $m, n \in \mathbb{N}$, $1 \le p \in \mathbb{R} \cup \{\infty\}$.

Using these definitions, we formalize the concept of unisolvent nodes as:

Definition 2.5 (Unisolvent nodes). Let $m \in \mathbb{N}$ and $\Pi \subseteq \Pi_m$ be a polynomial subspace. We call a finite non-empty set $\emptyset \neq P \subseteq \mathbb{R}^m$ unisolvent with respect to $\Pi$ if and only if there exists no non-zero polynomial

$Q \in \Pi \setminus \{0\}$ with $Q(p) = 0$, $\forall p \in P$.

Let further $H \subseteq \mathbb{R}^m$ be a hyperplane defined by a linear polynomial $Q_H \in \Pi_{m,1,1} \setminus \{0\}$, i.e., $H = Q_H^{-1}(0)$, such that any affine transformation $\tau_H : \mathbb{R}^m \longrightarrow \mathbb{R}^m$ with $\tau_H(H) = \mathbb{R}^{m-1} \times \{0\}$ is canonical with respect to $\Pi$. We consider

(2.1) $\Pi|_H = \{Q \in \Pi : \tau_H^*(Q) \in \Pi \cap (\Pi_{m-1} \times \{0\})\}$, $\quad \Pi|_{\bar H} = \{Q \in \Pi_m : Q_H Q \in \Pi\}$.

We call $P$ unisolvent with respect to $(\Pi, H)$ if and only if

i) There is no polynomial $Q \in \Pi|_H$ with $\tau_H^*(Q) \neq 0$ and $Q(P \cap H) = 0$.
ii) There is no polynomial $Q \in \Pi|_{\bar H} \setminus \{0\}$ with $Q(P \setminus H) = 0$.

Theorem 2.6. Let $m \in \mathbb{N}$, $\Pi \subseteq \Pi_m$ a polynomial subspace, $P \subseteq \mathbb{R}^m$ a finite set, and $H = Q_H^{-1}(0)$ be a hyperplane of co-dimension 1 defined by a polynomial $Q_H \in \Pi_{m,1,1}$ such that $P$ is unisolvent with respect to $(\Pi, H)$. Then $P$ is unisolvent with respect to $\Pi$.

Proof. Let $Q \in \Pi$ with $Q(P) = 0$. We consider the affine transformation $\tau_H : \mathbb{R}^m \longrightarrow \mathbb{R}^m$ with $\tau_H(H) = \mathbb{R}^{m-1} \times \{0\}$ and the projection $\pi_{m-1} : \Pi_m \longrightarrow \Pi_{m-1} \times \{0\}$. Set $Q_1 = \tau_H^{*-1} \pi_{m-1} \tau_H^*(Q)$ and $Q_2 = (Q - Q_1)/Q_H$, which is a well-defined function on $\mathbb{R}^m \setminus H$. Further, $Q_0 = \tau_H^*(Q) - \pi_{m-1}\tau_H^*(Q)$ is a polynomial consisting of monomials all sharing the variable $x_m$, i.e., $Q_0 = x_m q$ with $q \in \Pi_m$. Observe that $\tau_H^*(Q_H) = \lambda x_m$ for some $\lambda \in \mathbb{R} \setminus \{0\}$. Thus, by Lemma 2.2 we have $\tau_H^{*-1}(Q_0) = \tau_H^{*-1}(\lambda x_m)\, \tau_H^{*-1}(q/\lambda) = Q_H Q_2 \in \Pi$, which shows that $Q_2 = \tau_H^{*-1}(q/\lambda) \in \Pi|_{\bar H}$. Since $Q(p) = Q_1(p)$ for all $p \in P \cap H$, by assumption, we have $Q_1 = 0$ and thereby $Q_H Q_2(p) = 0$ for all $p \in P \setminus H$. Since $Q_H(p) \neq 0$ for all $p \in P \setminus H$ we get $Q_2 = 0$, which yields that $Q = 0$ is the zero polynomial. Thus, $P$ is unisolvent with respect to $\Pi$.

Theorem 2.7. Let the assumptions of Theorem 2.6 be fulfilled and $f : \mathbb{R}^m \longrightarrow \mathbb{R}$ be a function. Assume that there are polynomials $Q_1 \in \Pi|_H$, $Q_2 \in \Pi|_{\bar H}$ with $\Pi|_H$, $\Pi|_{\bar H}$ from Eq. (2.1), such that:

i) $Q_1(p) = f(p)$, $\forall p \in P \cap H$
ii) $Q_2(p) = (f(p) - Q_1(p))/Q_H(p)$, $\forall p \in P \setminus H$.

Then $Q = Q_1 + Q_H Q_2 \in \Pi$ is the unique polynomial in $\Pi$ that interpolates $f$ on $P$, i.e., $Q(p) = f(p)$, $\forall p \in P$.

Proof. Indeed, $Q_H \neq 0$ on $\mathbb{R}^m \setminus H$ implies that $Q(p) = f(p)$, $\forall p \in P$. Thus, $Q$ interpolates $f$ on $P$. To show the uniqueness of $Q$ let $Q' \in \Pi$ interpolate $f$ on $P$. Then $Q - Q' \in \Pi$ and $(Q - Q')(p) = 0$, $\forall p \in P$. Due to Theorem 2.6 we have that $P$ is unisolvent with respect to $\Pi$. Thus, $Q' - Q$ has to be the zero polynomial, proving that $Q$ is uniquely determined in $\Pi$.

Corollary 2.8. Let $m \in \mathbb{N}$, $A \subseteq \mathbb{N}^m$ be a complete set of multi-indices, and $\Pi_A \subseteq \Pi_m$ be the polynomial subspace induced by $A$. We consider the generating nodes given by the grid

(2.2) $GP = \oplus_{i=1}^m P_i$, $\quad P_i = \{p_{0,i}, \dots, p_{n_i,i}\} \subseteq \mathbb{R}$, $\quad n_i = \max_{\alpha \in A}(\alpha_i)$,

where the $P_i$ are arbitrary finite sets. Then, the node set

$P_A = \{(p_{\alpha_1,1}, \dots, p_{\alpha_m,m}) : \alpha \in A\}$

is unisolvent with respect to $\Pi_A$.

Proof. We argue by induction on $m$ and $|A|$. If $m = 1$ then the claim follows from the fact that $\dim \Pi_A = |A|$ and no non-zero polynomial $Q \in \Pi_A$ can vanish on $|A|$ distinct nodes $P_A = \{p_{\alpha_1,1} : \alpha \in A\}$. The claim becomes trivial for $|A| = 1$. Now assume that $m > 1$ and $|A| > 1$. We consider $A_1 = \{\alpha \in A : \alpha_m = 0\}$, $A_2 = A \setminus A_1$. By decreasing $m$ if necessary and w.l.o.g., we can assume that $A_2 \neq \emptyset$. Consider the hyperplane $H = \{(x_1, \dots, x_{m-1}, p_{0,m}) : (x_1, \dots, x_{m-1}) \in \mathbb{R}^{m-1}\}$ and $Q_H \in \Pi_{m,1,1}$ with $Q_H(x) = x_m - p_{0,m}$. By induction we have that $P_A$ is unisolvent with respect to $(\Pi_A, H)$. Thus, we finish the proof by Theorem 2.6 and induction.
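To make the splitting of Theorem 2.7 concrete, the following worked example applies it to the smallest non-trivial case $A = A_{2,1,1} = \{(0,0), (1,0), (0,1)\}$, i.e., $\Pi_A = \mathrm{span}\{1, x_1, x_2\}$, with nodes built as in Corollary 2.8. The example is an illustration added here, and the labels $p_{(0,0)}$, $p_{(1,0)}$, $p_{(0,1)}$ are shorthand introduced only for it.

```latex
\begin{align*}
p_{(0,0)} &= (p_{0,1}, p_{0,2}), \qquad p_{(1,0)} = (p_{1,1}, p_{0,2}), \qquad
p_{(0,1)} = (p_{0,1}, p_{1,2}), \\
H &= \{x \in \mathbb{R}^2 : x_2 = p_{0,2}\}, \qquad Q_H(x) = x_2 - p_{0,2}, \\
Q_1(x) &= f(p_{(0,0)}) + \frac{f(p_{(1,0)}) - f(p_{(0,0)})}{p_{1,1} - p_{0,1}}\,(x_1 - p_{0,1}), \\
Q_2(x) &= \frac{f(p_{(0,1)}) - Q_1(p_{(0,1)})}{Q_H(p_{(0,1)})}
        = \frac{f(p_{(0,1)}) - f(p_{(0,0)})}{p_{1,2} - p_{0,2}}, \\
Q(x) &= Q_1(x) + Q_H(x)\, Q_2(x).
\end{align*}
```

Here $Q_1$ interpolates $f$ on the two nodes lying in $H$, and $Q_2$ is a constant because $\Pi|_{\bar H}$ contains only constants for this $A$. One checks directly that $Q(p_{(0,1)}) = f(p_{(0,0)}) + (p_{1,2} - p_{0,2})\, Q_2 = f(p_{(0,1)})$, while $Q = Q_1$ on $H$. Both fractions are ordinary 1D divided differences, which is the pattern that the multivariate Newton scheme of Section 3 generalizes.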

Definition 2.9 (Essential assumptions). We say that the essential assumptions hold with respect to $A \subseteq \mathbb{N}^m$ and $P_A \subseteq \mathbb{R}^m$, where $m \in \mathbb{N}$ and $A$ is a complete set of multi-indices, if and only if there exist generating nodes

(2.3) $GP = \oplus_{i=1}^m P_i$, $\quad P_i = \{p_{0,i}, \dots, p_{n_i,i}\} \subseteq \mathbb{R}$, $\quad n_i = \max_{\alpha \in A}(\alpha_i)$,

and the unisolvent nodes $P_A$ are given by

$P_A = \{(p_{\alpha_1,1}, \dots, p_{\alpha_m,m}) : \alpha \in A\}$.

Unless further specified, the generating nodes $GP$ are arbitrary.

In Figure 1, we illustrate examples of unisolvent nodes in two and three dimensions for the generating nodes $GP = \oplus_{i=1}^m \mathrm{Cheb}_n^{2nd}$, where the Chebyshev nodes of second kind $\mathrm{Cheb}_n^{2nd}$ are defined in Eq. (7.1). For better visualization, all nodes belonging to the same line/plane are colored equally.
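The node construction of Definition 2.9 can be sketched as follows. Eq. (7.1) is not part of this section, so the sketch assumes the common form $\cos(k\pi/n)$, $k = 0, \dots, n$, for the Chebyshev nodes of second kind; the function names are hypothetical. The rank check confirms unisolvence numerically for one small case and is not a substitute for Corollary 2.8.

```python
import itertools
import numpy as np

def cheb_2nd(n):
    """Chebyshev extreme points cos(k*pi/n), k = 0..n (assumed form of Cheb_n^{2nd})."""
    return np.cos(np.pi * np.arange(n + 1) / n)

def l2_indices(m, n):
    """A_{m,n,2}, sorted lexicographically with the last component compared first."""
    A = [a for a in itertools.product(range(n + 1), repeat=m)
         if sum(k * k for k in a) <= n * n]
    return sorted(A, key=lambda a: a[::-1])

def unisolvent_nodes(A, gen):
    """P_A = {(p_{alpha_1,1}, ..., p_{alpha_m,m}) : alpha in A}; gen[i] lists P_i."""
    return np.array([[gen[i][a] for i, a in enumerate(alpha)] for alpha in A])

def vandermonde(nodes, A):
    """V[i, j] = prod_k nodes[i, k] ** A[j][k]."""
    exps = np.asarray(A, dtype=int)
    return np.prod(nodes[:, None, :] ** exps[None, :, :], axis=2)

m, n = 2, 5
A = l2_indices(m, n)
P = unisolvent_nodes(A, [cheb_2nd(n)] * m)   # same 1D nodes in every direction
V = vandermonde(P, A)
rank = np.linalg.matrix_rank(V)
print(f"|A| = {len(A)}, nodes = {P.shape[0]}, rank V(P) = {rank}")
```

Only $|A|$ of the $(n+1)^m$ grid points are kept, one per multi-index, which is where the saving over the full tensor grid comes from.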

Figure 1. Unisolvent nodes $P_A$ in 2D (left) and 3D (right) with respect to $A_{m,n,p}$ for dimensions $m = 2, 3$, $n = 5$, $p = 2$, and generating nodes $GP = \oplus_{i=1}^m \mathrm{Cheb}_n^{2nd}$. Nodes belonging to the same line/plane are colored equally.

3. Multivariate Newton interpolation

We use the above concept of unisolvence to provide a natural extension of the classic Newton interpolation scheme to arbitrary dimensions. The extension presented here relies on recursively applying Theorem 2.7 and Corollary 2.8. We start by defining:

Definition 3.1 (Multivariate Newton polynomials). Let the essential assumptions (Definition 2.9) be fulfilled with respect to $A \subseteq \mathbb{N}^m$ and $P_A \subseteq \mathbb{R}^m$. Then, we define the multivariate Newton polynomials by

(3.1) $N_\alpha(x) = \prod_{i=1}^m \prod_{j=0}^{\alpha_i - 1} (x_i - p_{j,i})$, $\quad \alpha \in A$.

Indeed, in dimension $m = 1$ this reduces to the classic definition of Newton polynomials [45, 86, 91].

Definition 3.2 (Multivariate divided differences). Let the essential assumptions (Definition 2.9) be fulfilled with respect to $A \subseteq \mathbb{N}^m$ and $P_A \subseteq \mathbb{R}^m$. Further let $f : \mathbb{R}^m \longrightarrow \mathbb{R}$ be a function. Then, we recursively define the multivariate divided differences:

$p_{\alpha,0,j} = p_\alpha$, $\quad p_{\alpha,i,j} = p_\beta$, $\quad \beta_j = i$, $\quad \beta_k = \alpha_k$, $\forall k \neq j$,

$F_{\alpha,0,m} = f(p_\alpha)$, $\quad F_{\alpha,0,j} = F_{\alpha,\alpha_{j+1},j+1}$, for $1 \le j < m$,

and

$F_{\alpha,i,j} = \dfrac{F_{\alpha,i-1,j}(p_\alpha) - F_{\alpha,i-1,j}(p_{\alpha,i-1,j})}{(p_\alpha - p_{\alpha,i-1,j})}$, for $i \le \alpha_j$.

We call $F_{\alpha,0,0} = F_{\alpha,\alpha_1,1}$ the Newton coefficients of $Q_{f,A} \in \Pi_A$.

In dimension $m = 1$, this definition recovers the classic divided difference scheme of 1D Newton interpolation [45, 86, 91].

Using these definitions we state the main result of this section, generalizing Newton interpolation to mD:

Theorem 3.3. Let the essential assumptions (Definition 2.9) be fulfilled with respect to $A \subseteq \mathbb{N}^m$ and $P_A \subseteq \mathbb{R}^m$, and let $f : \mathbb{R}^m \longrightarrow \mathbb{R}$ be a function. Then, the unique polynomial $Q_{f,A} \in \Pi_A$ interpolating $f$ on $P_A$, i.e., $Q_{f,A}(p) = f(p)$, $\forall p \in P_A$, can be determined in $\mathcal{O}(|A|^2)$ operations requiring $\mathcal{O}(|A|)$ storage and is given by

(3.2) $Q_{f,A}(x) = \sum_{\alpha \in A} c_\alpha N_\alpha(x)$,

where $c_\alpha = F_{\alpha,0,0}$ are the Newton coefficients of $Q_{f,A} \in \Pi_A$.

Proof. We argue by induction on $|A|$. If $|A| = 1$ then the claim follows immediately. For $|A| > 1$ we consider $A_1 = \{\alpha \in A : \alpha_m = 0\}$, $A_2 = A \setminus A_1$. By decreasing $m$ if necessary, and w.l.o.g., we can assume that $A_2 \neq \emptyset$. Let $H = \{(x_1, \dots, x_{m-1}, p_{0,m}) : (x_1, \dots, x_{m-1}) \in \mathbb{R}^{m-1}\}$ and $\tau_H : \mathbb{R}^m \longrightarrow \mathbb{R}^m$ with $\tau_H(x) = (x_1, \dots, x_{m-1}, x_m - p_{0,m})$ such that $\tau_H(H) = \mathbb{R}^{m-1} \times \{0\}$. By assumption, we have that $\tau_H$ is a canonical transformation with respect to $\Pi_A$. Let $\pi_{m-1} : \mathbb{R}^m \longrightarrow \mathbb{R}^{m-1}$, $\pi_{m-1}(x_1, \dots, x_m) = (x_1, \dots, x_{m-1})$ be the natural projection.

Step 1: We reduce the interpolation to $H$. We set $P_1 = \pi_{m-1}(\tau_H(P_A \cap H))$ and consider $f_0 : \mathbb{R}^{m-1} \longrightarrow \mathbb{R}$ with

(3.3) $f_0(x_1, \dots, x_{m-1}) = f\big(\tau_H^{-1}(x_1, \dots, x_{m-1}, 0)\big) = f(x_1, \dots, x_{m-1}, p_{0,m})$.

Let $M_\alpha(x)$, $\alpha \in A_1$, be the Newton polynomials with respect to $A_1$, $P_1$. Then induction yields that the coefficients $d_\alpha \in \mathbb{R}$ of the unique polynomial

$Q_{f_0,A_1}(x_1, \dots, x_{m-1}) = \sum_{\alpha \in A_1} d_\alpha M_\alpha(x_1, \dots, x_{m-1})$

interpolating $f_0$ on $P_1$ can be determined in less than $D_0 |A_1|^2$ operations, $D_0 \in \mathbb{R}^+$, while requiring a linear amount of storage. Consider the natural embedding $i^*_{m-1} : \Pi_{m-1} \hookrightarrow \Pi_m$. Then

$\bar{M}_\alpha(x_1, \dots, x_m) = i^*_{m-1}(M_\alpha)(x_1, \dots, x_{m-1}) = M_\alpha(x_1, \dots, x_{m-1})$ and
$Q_1(x_1, \dots, x_m) = i^*_{m-1}(Q_{f_0,A_1})(x_1, \dots, x_{m-1}) = Q_{f_0,A_1}(x_1, \dots, x_{m-1})$

yields $\bar{M}_\alpha(x) = N_\alpha(x)$, for all $\alpha \in A_1$, $x \in \mathbb{R}^m$. Further, $Q_1 \in \Pi_A$ is given by

$Q_1(x_1, \dots, x_m) = \sum_{\alpha \in A_1} d_\alpha N_\alpha(x_1, \dots, x_m)$

and satisfies $Q_1(p) = f(p)$ for all $p \in P_A \cap H$.

Step 2: We interpolate on $\mathbb{R}^m \setminus H$. Observe that $Q_1$ is constant in direction $x_m$, i.e., $Q_1(x_1, \dots, x_{m-1}, x_m) = Q_{f_0,A_1}(x_1, \dots, x_{m-1})$. Thus, by Eq. (3.3), $Q_1(q_1, \dots, q_{m-1}, q_m) = f(q_1, \dots, q_{m-1}, p_{0,m})$ for all $(q_1, \dots, q_{m-1}, q_m) \in P_A$. In light of this fact, and for $f_1(x) = (f(x) - Q_1(x))/Q_H(x)$, it requires $D_1 |A_2|$, $D_1 \in \mathbb{R}^+$, operations to compute

(3.4) $F_{\alpha,1,m} = \dfrac{f(p_\alpha) - f(p_{\alpha,0,m})}{p_\alpha - p_{\alpha,0,m}} = \dfrac{f(p_\alpha) - Q_1(p_\alpha)}{Q_H(p_\alpha)} = f_1(p_\alpha)$

for all $p_\alpha \in P_2 = P_A \setminus H$, $\alpha \in A_2$. Denote by $K_\alpha(x)$ the Newton polynomial with respect to $\Pi_{A_2}$, $P_2$; then induction yields that the coefficients $e_\alpha \in \mathbb{R}$, $\alpha \in A_2$, of the unique polynomial

Qf1,A2 (x1, . . . , xm) = eαKα(x1, . . . , xm) . α A X∈ 2 2 + interpolating f1 on P2 can be determined in less than D2 A2 , D2 R , operations | | ∈ while requiring linear storage. Due to Eq. (3.1) we observe that QH (x)Kα(x) = N (x) for all α A . By Corollary 2.8 we have that P is unisolvent and therefore α ∈ 2 A 12 M. HECHT, K. GONCIARZ, J. MICHELFEIT, V. SIVKIN, AND I.F. SBALZARINI

Theorem 2.7 implies that the unique polynomial $Q \in \Pi_A$ interpolating $f$ on $P_A$ is given by:

(3.5) $Q_{f,A}(x) = Q_1(x) + Q_H(x) Q_2(x) = \sum_{\alpha \in A_1} d_\alpha N_\alpha(x) + \sum_{\alpha \in A_2} e_\alpha N_\alpha(x)$.

Following Definition 3.2 and using Eq. (3.4), one readily observes that $d_\alpha = c_\alpha$, $\forall \alpha \in A_1$, and that $e_\alpha = c_\alpha$, $\forall \alpha \in A_2$. Thus, we have proven Eq. (3.2). In total, the computation can be done in less than $D_0 |A_1|^2 + D_1 |A_2| + D_2 |A_2|^2 \le \max\{D_0, D_1/2, D_2\} (|A_1| + |A_2|)^2 \in \mathcal{O}(|A|^2)$ operations and requires $\mathcal{O}(|A_1| + |A_2|) = \mathcal{O}(|A|)$ storage.

Corollary 3.4 (Newton basis). Let the essential assumptions (Definition 2.9) be fulfilled with respect to $A \subseteq \mathbb{N}^m$ and $P_A \subseteq \mathbb{R}^m$. Then the set of Newton polynomials $\{N_\alpha\}_{\alpha \in A} \subseteq \Pi_A$ is a basis of $\Pi_A$.

Proof. Due to Theorem 3.3, every polynomial $Q \in \Pi_A$ can be uniquely expanded as $Q = \sum_{\alpha \in A} c_\alpha N_\alpha$, proving the statement.

Corollary 3.5 (Evaluation in Newton form). Let the essential assumptions (Definition 2.9) be fulfilled with respect to $A \subseteq \mathbb{N}^m$ and $P_A \subseteq \mathbb{R}^m$. Further, let $Q(x) = \sum_{\alpha \in A} c_\alpha N_\alpha(x)$, $c_\alpha \in \mathbb{R}$, be a polynomial in Newton form. Then, it requires $\mathcal{O}(|A|)$ operations to evaluate $Q$ at $x_0 \in \mathbb{R}^m$.

Proof. By following the proof of Theorem 3.3 and using an induction argument over the number of coefficients, Eq. (3.5) yields that $Q_1, Q_2$ can be evaluated in linear time. Since the evaluation of $Q_H$ requires constant time, the claim follows.

Remark 3.6. Recursively applying the splitting Q = Q1 +QH Q2 recovers the classic Neville-Aitken evaluation algorithm [12, 75, 74] for dimension m = 1. Evaluation is eﬃciently done by applying a multivariate version of the Horner scheme, see for instance [45, 86, 71].
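For dimension $m = 1$, the Horner-type evaluation mentioned in Remark 3.6 can be sketched as follows. This is only an illustration of the classic 1D divided difference scheme and the nested evaluation of the Newton form; the function names `divided_differences` and `newton_horner` are our own and do not appear in the paper, and the multivariate recursion of Theorem 3.3 is not implemented here.

```python
def divided_differences(nodes, values):
    """Newton coefficients c_0..c_n of the 1D interpolant (classic scheme)."""
    coeffs = list(values)
    n = len(nodes)
    for level in range(1, n):
        # Sweep from the back so that lower-order differences are still intact.
        for i in range(n - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (nodes[i] - nodes[i - level])
    return coeffs


def newton_horner(nodes, coeffs, x):
    """Evaluate sum_k c_k * prod_{j<k} (x - p_j) with one pass over the coefficients."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - nodes[k]) + coeffs[k]
    return result


if __name__ == "__main__":
    f = lambda t: t**3 - 2.0 * t + 1.0
    nodes = [-1.0, -0.5, 0.5, 1.0]
    coeffs = divided_differences(nodes, [f(p) for p in nodes])
    print(newton_horner(nodes, coeffs, 0.3), f(0.3))
```

The evaluation touches each coefficient once, which matches the linear cost stated in Corollary 3.5 for the 1D case.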

4. Multivariate Lagrange interpolation Similar to the Newton case, our notion of unisolvent nodes also allows us to generalize the concept of Lagrange interpolation to multi–dimensions. For this, we deﬁne:

Definition 4.1 (Lagrange polynomials). Let $m \in \mathbb{N}$, $A \subseteq \mathbb{N}^m$ be a complete set of multi-indices, and $P_A = \{p_\alpha\}_{\alpha \in A}$ be a unisolvent set of nodes with respect to a polynomial subspace $\Pi_{P_A} \subseteq \Pi_m$. Then, we define the multivariate Lagrange polynomials

(4.1) $L_\alpha \in \Pi_{P_A}$ with $L_\alpha(p_\beta) = \delta_{\alpha,\beta}$, $\alpha, \beta \in A$,

where $\delta_{\cdot,\cdot}$ is the Kronecker delta.

For $A = A_{m,n,\infty}$ and a regular grid $P_A$, this definition recovers the notion of tensorial mD Lagrange interpolation [40, 81], where

(4.2) $L_\alpha(x) = \prod_{i=1}^{m} l_{\alpha_i}(x)$, $\quad l_{\alpha_i}(x) = \prod_{j=0,\, j \neq \alpha_i}^{n} \dfrac{x_i - p_{j,i}}{p_{\alpha_i,i} - p_{j,i}}$, $\quad \alpha \in A$.

The following corollary then generalizes the classic facts known for 1D Lagrange interpolation [8, 91] and for tensorial mD Lagrange interpolation.

Corollary 4.2 (Lagrange basis). Let the assumptions of Definition 4.1 be fulfilled. Then:
i) The Lagrange polynomials $L_\alpha \in \Pi_A$ are a basis of $\Pi_A$.
ii) The polynomial $Q_{f,A}(x) = \sum_{\alpha \in A} f(p_\alpha) L_\alpha(x) \in \Pi_A$ is the unique polynomial interpolating $f$ on $P_A$ and can be determined in $\mathcal{O}(|A|)$ operations.

Proof. To show i), observe that there are $|A|$ Lagrange polynomials. Due to Corollary 3.4 we deduce $\dim \Pi_A = |A|$. Given $c_\alpha \in \mathbb{R}$, $\alpha \in A$, such that $\sum_{\alpha \in A} c_\alpha L_\alpha = 0$, the unisolvence of $P_A$ implies that the polynomial $Q(x) = \sum_{\alpha \in A} c_\alpha L_\alpha$ vanishes on $P_A$ and, therefore, has to be the zero polynomial. Hence, $c_\alpha = 0$ for all $\alpha \in A$, implying that the $L_\alpha \in \Pi_A$ are linearly independent and thus yield a basis of $\Pi_A$. The claimed uniqueness in ii) follows from i), and the remaining statement holds for trivial reasons.

Theorem 4.3 (LU-decomposition). Let the essential assumptions from Definition 2.9 be fulfilled, let $f : \mathbb{R}^m \to \mathbb{R}$ be a function, and let $F = (f(p_{\alpha_{\min}}), \dots, f(p_{\alpha_{\max}})) \in \mathbb{R}^{|A|}$. Then:
i) A lower triangular matrix $NL_A \in \mathbb{R}^{|A| \times |A|}$ can be computed in $\mathcal{O}(|A|^2)$ operations, such that
$NL_A \cdot C_{\mathrm{Newt}} = C_{\mathrm{Lag}} = F$, where $C_{\mathrm{Newt}} = (c_{\alpha_{\min}}, \dots, c_{\alpha_{\max}}) \in \mathbb{R}^{|A|}$
denote the uniquely determined Newton coefficients of $Q_{f,A} = \sum_{\alpha \in A} c_\alpha N_\alpha$ according to Theorem 3.3.
ii) A lower triangular matrix $LN_A \in \mathbb{R}^{|A| \times |A|}$ can be computed in $\mathcal{O}(|A|^2)$ operations, such that
$LN_A \cdot C_{\mathrm{Lag}} = C_{\mathrm{Newt}}$, where $C_{\mathrm{Lag}} = (c_{\alpha_{\min}}, \dots, c_{\alpha_{\max}}) = F \in \mathbb{R}^{|A|}$
denote the uniquely determined Lagrange coefficients of $Q_{f,A} = \sum_{\alpha \in A} c_\alpha L_\alpha$ according to Corollary 4.2. In particular, $NL_A^{-1} = LN_A$.
iii) An upper triangular matrix $CN_A \in \mathbb{R}^{|A| \times |A|}$ can be computed in $\mathcal{O}(|A|^3)$ operations, such that
$CN_A \cdot C_{\mathrm{can}} = C_{\mathrm{Newt}}$, where $C_{\mathrm{can}} = (d_{\alpha_{\min}}, \dots, d_{\alpha_{\max}}) \in \mathbb{R}^{|A|}$
denote the canonical coefficients of $Q_{f,A}(x) = \sum_{\alpha \in A} d_\alpha x^\alpha$.
iv) An LU-decomposition of the multivariate Vandermonde matrix $V(P_A) = (p_\alpha^\beta)_{\alpha,\beta \in A}$ can be computed in $\mathcal{O}(|A|^3)$ operations and is given by
$V(P_A) = NL_A \cdot CN_A$.

The notation indicates that $NL_A$ and $LN_A$ transform from Newton basis to Lagrange basis and vice versa, and that $CN_A$ and $NC_A = CN_A^{-1}$ transform from canonical basis to Newton basis and vice versa, respectively.

Proof. We start by showing i). Due to Corollary 4.2, every Newton polynomial Nβ, β A, can be uniquely written as ∈ Nβ(x) = Nβ(pα)Lα(x) α A X∈ with L , α A, the Lagrange polynomials. Furthermore, due the the definition α ∈ in Eq. (3.1), we observe that Nβ(pα) = 0 for all α L β, where L denotes the lexicographical order of A. Hence, the matrix ≤ ≤ A A NLA = Nβ(pα) α,β A R| |×| | ∈ ∈ is of lower triangular form. Moreover, due to Corollary 3.5, the matrix NLA can be 2 determined in ( A ) operations. Nβ β A and Lα α A are bases of ΠA, due to O | | { } ∈ { } ∈ Corollaries 3.4 and 4.2. Thus, the matrix NLA represents the basis transformation from Nβ β A to Lα α A, so the identities of i) hold. To{ prove} ∈ii) we{ apply} ∈ Theorem 3.3 and observe that every Lagrange polynomial L , β A can be interpolated in ( A 2) operations into Newton form β ∈ O | | β Lβ(x) = cαNα(x) . α A X∈ Hence, the matrix β A A LNA = cα α,β A R| |×| | ∈ ∈ represents the basis transformation from Lα α A to Nβ β A. Consequently, 1 { } ∈ { } ∈ NLA− = LNA and thereby LNA is of lower triangular form, proving ii). We show iii) in a similar way. Indeed, due to Theorem 3.3, every canonical basis function xβ Π can be uniquely written as ∈ A β β x = c Nα(x) , cα R , α ∈ α A X∈ β 2 where the Newton coefficients cα, α A, can be computed in ( A ) operations for every β A. Moreover, due∈ to Corollary 3.4, we haveO | that| xβ ∈ β ∈ span Nα αmin Lα Lβ. Thus, cα = 0 for α L β. Hence, the matrix { } ≤ ≤ ≥ β A A (4.3) CNA = cα α,β A R| |×| | ∈ ∈ is of upper triangular form and represents the basis transformation from canonical basis to Newton basis, proving iii). Statement iv) is a direct consequence of i) and iii). Remark 4.4. Theorem 4.3 links the interpolation schemes in all three bases (New- ton, Lagrange, and canonical) and allows efficient interpolation and evaluation by using the transformation matrices NLA and CNA. 
If $P_A$ is fixed, the matrices $NL_A$, $CN_A$ and their inverses $LN_A = NL_A^{-1}$, $NC_A = CN_A^{-1}$ can be precomputed. For a given function $f : \Omega \to \mathbb{R}$ we set $F = (f(p_{\alpha_{\min}}), \dots, f(p_{\alpha_{\max}})) \in \mathbb{R}^{|A|}$ and observe

(4.4) $V(P_A) C = F \iff NL_A\, CN_A\, C = F \iff CN_A\, C = LN_A F$.

Thus, the Newton coefficients $C_{\mathrm{Newt}} = LN_A F$ can be computed efficiently. Moreover, due to Corollary 3.5, the Newton representation of the interpolant $Q_{f,A}$ can be evaluated anywhere with linear runtime. In Section 8, we demonstrate that this approach yields an efficient and numerically stable interpolation algorithm.
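To make the role of the lower triangular matrix $NL_A$ concrete, the following sketch treats only the 1D case, where $NL_A$ has entries $N_\beta(p_\alpha) = \prod_{j<\beta}(p_\alpha - p_j)$, and obtains the Newton coefficients by forward substitution on $NL_A \cdot C_{\mathrm{Newt}} = F$. The helper names `newton_basis_matrix` and `forward_substitution` are hypothetical and chosen for illustration; this is not the paper's implementation, and the multivariate ordering of $A$ is not covered.

```python
def newton_basis_matrix(nodes):
    """NL[a][b] = N_b(p_a) with N_b(x) = prod_{j<b} (x - p_j); lower triangular."""
    n = len(nodes)
    NL = [[0.0] * n for _ in range(n)]
    for a, p in enumerate(nodes):
        value = 1.0
        for b in range(a + 1):  # entries with b > a vanish because N_b(p_a) = 0
            NL[a][b] = value
            value *= p - nodes[b]
    return NL


def forward_substitution(L, rhs):
    """Solve L c = rhs for a lower triangular L with nonzero diagonal."""
    c = []
    for a, row in enumerate(L):
        partial = sum(row[b] * c[b] for b in range(a))
        c.append((rhs[a] - partial) / row[a])
    return c


if __name__ == "__main__":
    nodes = [-1.0, 0.0, 1.0]
    F = [p * p for p in nodes]          # samples of f(x) = x^2
    NL = newton_basis_matrix(nodes)
    print(forward_substitution(NL, F))  # Newton coefficients of x^2 on these nodes
```

Forward substitution costs quadratically many operations in the number of nodes, in line with the $\mathcal{O}(|A|^2)$ bound of Theorem 4.3i); explicitly forming $LN_A = NL_A^{-1}$ is only worthwhile when many functions are interpolated on the same nodes.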

Remark 4.5. The complexity of the interpolation depends on the choice of A. As introduced, the cases Am,n,1, Am,n,2, and Am,n, are of special interest. While ∞ m + n m + n n m m Am,n,1 = = (m ) (n ) , Am,n, = (n + 1) , | | n m ∈ O ∩ O | ∞| explicit such formulas for Am,n,p , 1 < p < , are unknown. An approximation for p = 2 is given in [90] as| | ∞ (π/4)m/2 (n + 1)m πe m/2 A (n + 1)mvol(Bm) = (n + 1)m , | m,n,2| ≈ l2 (m/2)! ≈ √πm 2m m where B denotes the l2-ball in dimension m. Thus, for fixed degree n N l2 ∈ Lagrange interpolation is of polynomial complexity (mn) for p = 1, of sub- exponential complexity o(nm) for p = 2, and of exponentialO complexity (nm) for p = . O ∞ 5. Interpolation on given nodes We address the question of how to interpolate a function f C0(Ω, R) if the interpolation nodes P can not be chosen according to Definition∈ 2.9, but are arbitrarily given and fixed. We start by noting that if the locations of the nodes are distributed uniformly at random, then the nodes are unisolvent with probability 1 [54]. The next statement deals with that situation. Corollary 5.1 (Interpolation on given nodes). Let the essential assumptions (Def- m m m inition 2.9) be fulfilled with respect to A N and PA R , let f : R R be m⊆ ⊆ −→ a function, and let P A = p¯α α A R be any given node set that is unisolvent { } ∈ ⊆ A with respect to ΠA. Denote with F = (f(¯pαmin ), . . . , f(¯pαmax )) R| |. Then, a ma- A A 3 ∈ trix SA R| |×| | can be computed in ( A ) operations, such that the Lagrange ∈ O | | coefficients CLag = (cαmin , . . . , cαmax ) of Qf,A are determined by (5.1) Q (x) = c L (x) Π ,C = S F f,A α α ∈ A Lag A · α A X∈ Proof. Since P is assumed to be unisolvent, we have that P = A = dim Π A | A| | | A and the assumed enumeration of P A can be chosen as required. By Theorem 4.3i), there are uniquely determined Lagrange polynomials L , α A, such that α ∈ L (¯p ) = δ for allp ¯ P , α A. 
Denote by L the Lagrange polynomials α α α,β α ∈ A ∈ α with respect to PA and consider the matrix A A (5.2) RA = (rα,β) R| |×| | with rα,β = Lβ(¯pα) . ∈ 3 Due to Theorem 4.3iv), we can compute RA in ( A ) operations. Further, SA = 1 O | | RA− transforms the basis Lα of ΠA into the Lagrange basis Lα, α A, of ΠA. 3 ∈ Thus, SA exists and can be computed in ( A ) time or faster [88], proving the statement. O | | Remark 5.2. Corollary 5.1 provides no benefit in terms of runtime complexity over to the na¨ıve algorithm given by inverting the associated Vandermonde ma- A A trix V (P A) R| |×| |. However, the Corollary provides that SA measures the ∈ k k∞ approximation quality of the scattered nodes P A, as we further discuss in Section 7 and demonstrate in Section 8. 16 M. HECHT, K. GONCIARZ, J. MICHELFEIT, V. SIVKIN, AND I.F. SBALZARINI
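The cardinalities compared in Remark 4.5 can be checked by brute-force enumeration for small $m$ and $n$. The sketch below assumes that $A_{m,n,p}$ consists of all multi-indices $\alpha \in \mathbb{N}^m$ with $\|\alpha\|_p \le n$; the function name `multi_index_set` is ours, and the enumeration itself is exponential in $m$, so it is meant for sanity checks only.

```python
from itertools import product
from math import comb


def multi_index_set(m, n, p):
    """All alpha in N^m with l_p-norm at most n; use p = float('inf') for the max norm."""
    indices = []
    for alpha in product(range(n + 1), repeat=m):
        if p == float("inf"):
            inside = max(alpha) <= n
        else:
            # Compare p-th powers instead of taking roots; tiny slack for float rounding.
            inside = sum(a**p for a in alpha) <= n**p * (1 + 1e-12)
        if inside:
            indices.append(alpha)
    return indices


if __name__ == "__main__":
    m, n = 3, 4
    for p in (1, 2, float("inf")):
        print(p, len(multi_index_set(m, n, p)))
    print("binomial(m + n, n) =", comb(m + n, n), " (n + 1)^m =", (n + 1) ** m)
```

For $p = 1$ and $p = \infty$ the counts reproduce the closed formulas of Remark 4.5, while the $p = 2$ count lies strictly between them.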

6. The dual notion of unisolvence

We discuss how to handle the situation when a given set of nodes P A is not unisolvent with respect to a speciﬁed polynomial space Π. This extends earlier works [28, 29], as introduced in Section 1.4, on the dual notion of unisolvence: given a set of nodes P , what is the largest possible polynomial subspace Π Π with A P ⊆ m respect to which P A is unisolvent? We start by stating: m Theorem 6.1. Let m, k N, k = P R P = k be the set of all subsets m ∈ P ⊆ | | of R with cardinality k, and X = Πm,k, be the space of all polynomials with ∞ l -degree at most k. Then, there is one and only one polynomial subspace ΠP X ∞ ⊆ such that P is unisolvent with respect ΠP . In particular, the map Γ : Gr(k, X) , Γ (P ) = Π , k Pk −→ k P with Gr(k, X) denoting the Grassmann manifold [73], is well deﬁned and smooth.

Remark 6.2. Note that for $m = 1$ we have $\Pi_{m,k,\infty} = \Pi_{1,k,1}$ and $\mathrm{Gr}(k, X) = X = \Pi_{1,n,1}$. Since $n + 1$ nodes are unisolvent in dimension 1, Theorem 6.1 becomes trivial, and $\Gamma_k(P) \equiv X$ is constant in that case.

Proof. Let PAm,k,∞ be a set of unisolvent nodes with respect to X = Πm,k, , generated according to the essential assumptions in Deﬁnition 2.9. Denote by L ∞ α ∈ X, α Am,k the corresponding Lagrange polynomials. Fix an ordering P = ∈ ∞ k K m p0, . . . , pk and consider the matrix R = (ri,α) R × , K = Am,k, = (k +1) , deﬁned{ by } ∈ | ∞|

ri,α = Lα(pi) , i = 1, . . . , k, αmin L α L αmax , α Am,k, . ≤ ≤ ∈ ∞ µ k µ − k k Let µ = rank(R) be the rank of R and D = diag(1,..., 1, 0,..., 0) R × the diagonal matrix with the first µ entries equal to 1 and all others equal∈ to 0. Let K k z }| { z }| { further C R × be a solution of RC = D and bi = (cαmin,i, . . . , cαmax,i) be the rows of C.∈ Then B (x) = c L (x) X , i = 1, . . . , k i α,i α ∈ α A ∞ ∈ Xm,k, is the maximal set of linearly independent polynomials with B (p ) = δ , 1 i j i,j ≤ j µ, where δ , denotes the Kronecker delta. The Bi are the interpolants of the ≤ · ·m functions di : R R, di(x) = δx,pi . Since PAm,k,∞ is unisolvent, the Bi are −→ uniquely determined and linearly independent. Hence, µ = k, and the Bi are a basis of the uniquely determined polynomial subspace ΠP = span(Bi)i=1,...,k for which P becomes unisolvent. Since Gr(k, X) is a smooth manifold [73], and the Bi depend smoothly on P , this shows that Γk is a well-defined smooth map. The fact that dim(X) = (k + 1)m implies that Theorem 6.1 is of mostly theoretical interest, re-addressing questions already discussed in Refs. [28, 29]. Only if Γ (P ) Y is known to be located near a relatively low-dimensional subspace k ≈ Y X, e.g. Y = Πm,n,2 with m, n small, then Theorem 6.1 might be of practical relevance,⊆ as the following consequence states. Theorem 6.3. Let the essential assumptions (Definition 2.9) be fulfilled with re- m m m spect to A N and PA R . Let further P R with P A be any node ⊆ ⊆ m ⊆ | | ≤ | | set, Γk be as in Theorem 6.1, and f : R R be a function. Then: −→ MULTIVARIATE INTERPOLATION ON UNISOLVENT NODES 17

i) There is exactly one set P0 P of maximal cardinality k = P0 such that Γ (P ) Π . P can be determined⊆ in ( A 3) operations. | | k 0 ⊆ A 0 O | | ii) A basis (ρ1, . . . , ρk) of Γk(P0) ΠA and a basis (µ1, . . . , µ A k) of the ∩ | |− quotient space Π /Γ(P ) can be computed in ( A 3) operations. A 0 O | | iii) The Lagrange coefficients cα R, α A, of the uniquely determined polynomial Q = c L (x∈) Γ(P∈ ) interpolating f on P can be com- P0,f α A α α ∈ 0 0 puted in ( A 3) operations.∈ O | | P iv) If Γ(P0) = ΠA, then the Lagrange coefficients dα R of a polynomial 0 = Q =6 d L (x) Π satisfying Q (p) = 0∈for all p P can be 6 0 α A α α ∈ A 0 ∈ 0 computed in ∈( A 3) operations. PO | | Proof. All statements follow from the following observation: We order the nodes P = p1, . . . , pl , l = P . Let Lα, α A, be the Lagrange polynomials with respect { } | | ∈ l A to A, PA. Consider the matrix RA = (ri,α) R ×| | given by ∈ r = L (p ) , α α α , 1 i l . i,α α i 0 ≤ ≤ max ≤ ≤ By using Gaussian elimination with partial pivoting (GEPP) [92], we can find an A A LU-decomposition of RA. That is, there exist a permutation matrix Q R| |×| |, A A ∈ a unitary lower triangular matrix L R| |×| |, and an upper triangular matrix A A ∈ U R| |×| | such that ∈ ¯ RA A A (6.1) QRA = LU , where RA = R| |×| | . 0 ∈ ¯ Consequently, rank(RA) = rank(RA) = rank(U) = k N. By applying GEPP T ¯ A A ∈ again on U , we find a permutation matrix Q R| |×| |, a unitary upper triangular ¯ A A ∈ ¯ A A matrix L R| |×| |, and a lower triangular matrix U R| |×| | such that ∈ ∈ U¯ L¯ U¯ L¯ L¯ L¯ U¯ 0 UQ¯ = U¯L¯ = 1 1 1 2 , with L¯ = 1 2 , U¯ = 1 , 0 0 0 L¯ 0 0 3 k k ¯ ¯ ¯ where L1,U1 R × with rank(L1) = rank(U1) = k. Thus, QRAQ = LUL with ∈ L 0 U¯ L¯ U¯ L¯ QR Q¯ = 1 1 1 1 2 A L L 0 0 2 3 L U¯ L¯ L U¯ L¯ R R = 1 1 1 1 1 2 = 1 2 . L U¯ L¯ L U¯ L¯ R R 2 1 1 2 1 2 4 3 Let P0 = p¯1,..., p¯k be the first k nodes of the node set P¯ obtained by reordering P according{ to Q¯.} Moreover, let B = β , . . . 
, β be the multi-indices { min max} induced by A and reordered according to Q. Denote by Lβmin ,...,Lβk−1 the first k Lagrange polynomials. Since the upper-left (k k) block R of QR Q¯ is of full × 1 A rank, the Lagrange polynomials L¯i, i = 1, . . . , k, with L¯i(¯pj) = δij for allp ¯j P0, i, j = 1, . . . , k, are uniquely determined by ∈ k 1 ¯ − 1 Li(x) = cijLβj (x) ,Ci = (ci1, . . . , cik) = R1− ei , j=0 X k where ei is the i-th standard basis vector of R . Since rank(RA) = k, the set P0 is the uniquely determined maximal subset of P with that property. While all matrices above can be determined by GEPP or matrix inversion/multiplication in ( A 3) operations, this shows i). O | | 18 M. HECHT, K. GONCIARZ, J. MICHELFEIT, V. SIVKIN, AND I.F. SBALZARINI

Further, by setting ρ1 = L¯1, . . . , ρk = L¯k ΠA we find a basis of Γk(P0) ΠA. ∈ k ∩ The coefficients Di = (di, ei), with di1, . . . , dik R and ei the i-th standard basis A k ∈ vector of R| |− , solving U¯ L¯ d = U¯ L¯ e 1 1 i − 1 2 i are uniquely determined and define polynomials

(6.2) µi(x) = dijLβj−1 (x) + Lβk−1+i (x) , j=1 X with µi(p) = 0 for all p P . Hence, µ1, . . . , µ A k are a basis of Q ∈ { | |− } { ∈ ΠA Q(P ) = 0 ∼= ΠA/Γk(P0). Thus, we have proven ii). Finally, iii) and iv) are direct consequences} of i) and ii).

One of the practical applications of Theorem 6.3 can be stated as follows:

Corollary 6.4. Let the essential assumptions (Definition 2.9) be fulfilled with re- m m spect to A N and PA R . Let further QM Πm be a polynomial and 1 ⊆ ⊆ ∈ M = QM− (0) be the affine algebraic variety given by the zero-level set of QM . De- note by

(6.3) ΠM = Q M Q ΠA Πm { | ∈ } ⊆ the polynomial subspace of all restrictions Q M of polynomials Q ΠA to M. | ∈ i) If P M and P = k is maximal, i.e., there is no P 0 M with P 0 = 0 ⊆ | 0| 0 ⊆ | 0| k0 > k and Γ 0 (P 0) Π , then Γ (P ) = Π and dim(Π ) = k. k 0 ⊆ A k 0 ∼ M M ii) If P0 is as in i), then by replacing Γk(P0) with ΠM the statements ii), iii), and iv) of Theorem 6.3 apply.

Proof. Observe that ΠM can be understood as the quotient of ΠA by all polynomials mM := Q ΠA Q M 0 ) vanishing on M, i.e., { ∈ | ≡ }

ΠM = ΠA/mM , dim(ΠM ) = dim(ΠA) dim(mM ) = l N . ∼ − ∈

According to Deﬁnition 4.1, for any set of unisolvent nodes p1, . . . , pl M there are Lagrange polynomials L ,...,L with L (p ) = δ . Hence, Π = span⊆ L ,...,L . 1 l i j ij M { 1 l} By setting P0 = p1, . . . , pl we realize that Γk(P0) ∼= ΠM , with k = l. Due to the maximality of P {we therefore} have P = k for any P that satisﬁes the assump- 0 | 0| 0 tions in i). Since Γk(P ) ΠA ΠM for any P M, P = k, hence proving i). Statement ii) is obvious. ∩ ⊆ ⊆ | |

Remark 6.5. Since unisolvence is a generic property [54, 85], and due to Theorem 6.3i), randomly sampling $|A|$ nodes on $M$ yields a set $P_0$ as required in i) with probability 1. Consequently, any restriction $Q|_M \in \Pi_M$ of a polynomial $Q \in \Pi_A$ to $M$ can be interpolated as in Theorem 6.3iii). We demonstrate the numerical stability of such an approach in Section 8. But before that, we discuss the approximation power of polynomial interpolation in mD.

7. Approximation theory We address the fundamental question of how well polynomial interpolation can approximate continuous functions in mD. We derive several statements that enable control over the approximation error

f Q 0 . k − f kC (Ω) To do so, let α Nm and f be a suﬃciently smooth function. Then, we denote α α1∈ αm by ∂ f(x) = ∂x1 . . . ∂xm f(x) the partial derivative of f with respect to the multi- index α, evaluated at point x Ω. Further, we denote by C0(Ω, R) the R-vector ∈ space of continuous functions on Ω with norm f C0(Ω) = supx Ω f(x) . For a set k k ∈ | | of multi-indices A Nm, m N, we consider ⊆ ∈ 0 α 0 CA(Ω, R) = f C (Ω, R) ∂ f C (Ω, R) , α A , ∈ ∈ ∀ ∈ α f = ∂ f 0 . k kCA(Ω) k kC (Ω) α A X∈

For A = Am,n,1 the normed vector space (CA(Ω, R), CA(Ω)) coincides with the n k · k classic deﬁnition of the of the Banach space (C (Ω, R), Cn(Ω)) [2]. In light of k · k this fact, one can easily deduce that (CA(Ω, R), CA(Ω)) is a Banach space for all k · k A Nm. ⊆An important goal of approximation is to avoid the Runge’s phenomenon, as discussed in Section 1.5. Therefore, the Chebyshev nodes of ﬁrst and second kind 2k 1 Cheb1st = cos − π 1 k n + 1 , n 2(n + 1) ≤ ≤ kπ (7.1) Cheb2nd = cos 0 k n n n ≤ ≤ 1st in dimension m = 1 play a central role. This is because the nodes Chebn are minimizers of the product MPn (x) = p P x p , Pn = n + 1, i.e., ∈ n | − | | | 1 1st 0 Q (7.2) min MPn C (Ω) = n for Pn = Chebn . Pn Ω k k 2 ⊆ 2nd The nodes Cheb are the extrema of M 1st , with values oscillating between n Chebn 2nd M 1st (q) 1, 1 , q Cheb . Therefore, the term Chebyshev extreme nodes Chebn ∈ {− } ∈ n is also often used [45, 91]. For convenience in the following, and w.l.o.g., we assume 1st 2nd that the Chebn and Chebn are Leja-ordered [66], i.e., that for P = p0, . . . , pn the following holds: { } j 1 j 1 − − (7.3) p0 = max p , pj pi = max pk pi , 1 j n . | | p P | | | − | j k m | − | ≤ ≤ ∈ i=0 ≤ ≤ i=0 Y Y This ordering is known to minimize numerical rounding errors for 1D Newton interpolation [11]. In particular, p , p = 1, 1 holds for Cheb2nd, n 1. { 0 1} {− } n ≥ 7.1. Error bound in multiple dimensions. We generalize the classic 1D approximation-error bound for polynomial interpolation to arbitrary dimensions m N. ∈ For a given complete set of multi-indices A Nm, we denote by ⊆ ∂A = α A α + e A for all 1 i m ∈ i 6∈ ≤ ≤ m ∂A = β N β = α + ei for some α ∂A, 1 i m ∈ ∈ ≤ ≤

20 M. HECHT, K. GONCIARZ, J. MICHELFEIT, V. SIVKIN, AND I.F. SBALZARINI the discrete inner and outer boundaries of A and define A¯ = A ∂A to be the ∪ closure of A in this sense. If m = 1 then for any n N, A = 0, . . . , n , ∂A = n , ∈ { } { } ∂A = n + 1 , and A¯ = 0, . . . , n + 1 . With this we can state: { } { } Theorem 7.1 (Approximation error). Let the essential assumptions (Definition 2.9) m m be fulfilled with respect to A N and PA R . Further, let f CA¯(Ω, R). Then, ⊆ ⊆ ∈ for any x Ω, there are ξ Ω, β ∂A, such that: ∈ x,β ∈ ∈ ∂βf(ξ ) m (7.4) f(x) Q (x) = x,β N (x) , β! := β ! . − f,A β! β i β ∂A i=1 X∈ Y Proof. We argue by induction on A . For A = 1 we have A = 0 and Qf,A(x) = f(p ) for all x Ω. Thus, by the| Mean| Value| | Theorem, we have{ } 0 ∈ m f(x) Q (x) = f(x) f(p ) = ∂ f(ξ )(x p ) − f,A − 0 xi x,i i − 0,i i=1 X for some ξ Ω, yielding Eq. (7.4) in this case. For A > 1, we consider x,i ∈ | | the subsets A1 = α A αm = 0 , A2 = A A1, and the hyperplane H = ∈ m 1 \ (x1, . . . , xm 1, p0,m) (x1, . . . , xm 1) R − defined by the polynomial QH = { − − ∈ } (xm p0,m) Πm,1,1. As in Eq. (3.5), we use the splitting Qf,A(x) = Q1(x) + − ∈ m QH (x)Q2(x), x R . We denote by xH = (x1, . . . , xm 1, p0,m) the projec- ∈ m − tion of x = (x1, . . . , xm) R onto H and recall that Theorem 2.7 guarantees ∈ Q1(x) = Q1(xH ). By decreasing m if necessary and w.l.o.g., we can assume that A = . We denote by K (x) the multivariate Newton polynomials with respect to 2 6 ∅ β A2 and recursively compute f(x) f(x ) (7.5) f(x) Q(x) = f(x ) Q (x ) + Q (x) − H Q (x) − H − 1 H H x p − 2 m − 0,m ∂βf(ξ ) = xH ,β N (x ) + ∂ f(η ) Q (x) Q (x) β! β H xm x − 2 H β ∂A ∈X 1 ∂βf(ξ ) ∂β∂ f(ξ ) = xH ,β N (x) + xm x,β Q (x)K (x) β! β β! H β β ∂A β ∂A ∈X 1 ∈X 2 ∂βf(ξ ) (7.6) = x,β N (x) , β! β β ∂A X∈ where we used the Mean Value Theorem for the second term in Eq. (7.5) and the fact that Q (x)K (x) = N for all β ∂A to yield Eq. (7.6). H β β(x) ∈ 2 Remark 7.2. Note that for m = 1, Eq. 
(7.4) reduces to the classic 1D result f (n+1)(ξ ) n f(x) Q (x) = x (x p ) − f,A (n + 1)! − i i=0 Y with PA = p0, . . . , pn , A = n + 1. This yields the known approximation-error bound in 1D{ [45]: } | | (n+1) (n+1) f (ξ ) f C0(Ω) (7.7) f(x) Q (x) | x | k k ,P = Cheb1st . | − f,A | ≤ 2n(n + 1)! ≤ 2n(n + 1)! A n MULTIVARIATE INTERPOLATION ON UNISOLVENT NODES 21

Our result in Eq. (7.4) yields a similar bound on the approximation error in mD whenever the $k$-th derivatives of $f$ are known or bounded. Equation (7.7) implies that any smooth function $f$ whose derivative norms $\|f^{(n+1)}\|_{C^0(\Omega)}$, $n \in \mathbb{N}$, grow slower than $2^n (n+1)!$ can be approximated by interpolation on $\mathrm{Cheb}^{1st}_n$. However, in mD, the extrema of $N_\beta(x)$ cannot be estimated as easily as in 1D. We show below how the class of functions that can be approximated extends to multiple dimensions.
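The factor $2^n$ in Eq. (7.7) comes from the minimal sup-norm of the node polynomial in Eq. (7.2). A minimal numerical check, assuming $\Omega = [-1, 1]$ and an equispaced sampling grid (both the grid and the function names are our choices for illustration), is:

```python
import math


def cheb_first_kind(n):
    """The n + 1 Chebyshev nodes of first kind from Eq. (7.1)."""
    return [math.cos((2 * k - 1) * math.pi / (2 * (n + 1))) for k in range(1, n + 2)]


def node_polynomial_sup(nodes, samples=2001):
    """Approximate the sup over [-1, 1] of prod_p |x - p| on an equispaced grid."""
    best = 0.0
    for s in range(samples):
        x = -1.0 + 2.0 * s / (samples - 1)
        best = max(best, math.prod(abs(x - p) for p in nodes))
    return best


if __name__ == "__main__":
    n = 6
    equispaced = [-1.0 + 2.0 * k / n for k in range(n + 1)]
    print("Chebyshev :", node_polynomial_sup(cheb_first_kind(n)), "vs 2^-n =", 2.0**-n)
    print("equispaced:", node_polynomial_sup(equispaced))
```

The grid contains the endpoints $\pm 1$, where the Chebyshev node polynomial attains its maximum, so the sampled value agrees with $2^{-n}$ up to rounding; for other node sets the grid value is only a lower estimate of the true supremum.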

7.2. The space of functions that can be approximated by interpolation.
Even in 1D, the question of which functions can be approximated by interpolation on Chebyshev points of first or second kind is not completely answered. Usually, additional assumptions such as Lipschitz continuity or differentiability are used. Here, we address this question from the perspective of functional analysis. That is, we consider the interpolation operator

$I_{P_A} : C^0(\Omega, \mathbb{R}) \to \Pi_A \subseteq C^0(\Omega, \mathbb{R})$, $\quad f \mapsto Q_{f,A}$, $\quad A \subseteq \mathbb{N}^m$.

Indeed, $I_{P_A}$ is a linear operator whenever $P_A$ is unisolvent. For $A = A_{m,n,p}$, $m, n \in \mathbb{N}$, $p \ge 1$, the question of approximation can be stated as: What is the largest subspace $H(\Omega, \mathbb{R}) \subseteq C^0(\Omega, \mathbb{R})$ such that $I_{P_{A_{m,n,p}}}$ converges to the identity operator on $H(\Omega, \mathbb{R})$, i.e.,

(7.8) $I_{P_{A_{m,n,p}}} \xrightarrow[n \to \infty]{} \mathrm{Id}_{H(\Omega, \mathbb{R})}$ .

In Appendix A we give a brief introduction to Sobolev theory, which allows us to state the analytical setup required for answering that question as follows: We consider the Hilbert space $L^2(\Omega, R)$, $R = \mathbb{R}, \mathbb{C}$, of all real- or complex-valued Lebesgue square-integrable functions [2] with associated scalar product and norm

1 2 f, g 2 = f(x)g(x) dx , f 2 = f, f 2 . h iL (Ω) Ω k kL (Ω) h iL (Ω) | | ZΩ For k N, we consider the Sobolev space of all L2 functions with existing and integrable∈ weak derivatives Hk(Ω,R) = f L2(Ω,R) ∂αf L2(Ω,R) for all α A , ∈ ∈ ∈ m,k,1 α α 2 f, g k = ∂ f, ∂ g 2 , f k = f, f k ,R = R, C . h iH (Ω) h iL (Ω) k kH (Ω) h iH (Ω) α A X∈ Indeed, Hk(Ω, R), , is a Hilbert space and by the Sobolev embedding Theo- h· ·iHA rem for k > m/2, we have that Hk(Ω,R) C0(Ω,R) [2]. Therefore, Hk(Ω,R) with k > m/2 is the largest Hilbert space contained⊆ in the space of continuous functions. We now show that this is also the space of functions that can be approximated by polynomial interpolation, i.e., Eq. (7.8) holds for Hk(Ω,R), k > m/2. Definition 7.3 (Lebesgue function). Let the assumptions of Definition 4.1 be 0 fulfilled and f C (Ω, R) denote by Qf,A(x) = α A f(pα)Lα(x) the interpolant of f in Lagrange∈ form. Then we define the Lebesgue∈ function as the operator norm 0 P0 of the interpolation operator IPA : C (Ω, R) C (Ω, R), f Qf,A with respect −→ 7→ to PA, i.e.,

$\Lambda(P_A) := \|I_{P_A}\| = \sup_{f \in C^0(\Omega,\mathbb{R}),\ \|f\|_{C^0(\Omega)} \le 1} \|Q_{f,A}\|_{C^0(\Omega)} = \Bigl\| \sum_{\alpha \in A} |L_\alpha| \Bigr\|_{C^0(\Omega)}$ .
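In 1D, the last expression of Definition 7.3 can be evaluated directly by sampling $\sum_i |l_i(x)|$. The sketch below does this for the Chebyshev nodes of second kind from Eq. (7.1) and compares the result with the leading term of the asymptotic estimate stated in Eq. (7.10). The sampling grid, the function names, and the hard-coded value of the Euler-Mascheroni constant are our assumptions for illustration; a grid maximum is only a lower estimate of the true supremum.

```python
import math


def cheb_second_kind(n):
    """The n + 1 Chebyshev nodes of second kind from Eq. (7.1)."""
    return [math.cos(k * math.pi / n) for k in range(n + 1)]


def lebesgue_constant(nodes, samples=2001):
    """Approximate sup over [-1, 1] of sum_i |l_i(x)| on an equispaced grid."""
    best = 0.0
    for s in range(samples):
        x = -1.0 + 2.0 * s / (samples - 1)
        total = 0.0
        for i, p_i in enumerate(nodes):
            term = 1.0
            for j, p_j in enumerate(nodes):
                if j != i:
                    term *= (x - p_j) / (p_i - p_j)
            total += abs(term)
        best = max(best, total)
    return best


def leading_term_estimate(n):
    """(2 / pi) * (log n + gamma + log(8 / pi)), i.e. Eq. (7.10) without the O(1/n^2) term."""
    gamma = 0.5772156649015329
    return 2.0 / math.pi * (math.log(n) + gamma + math.log(8.0 / math.pi))


if __name__ == "__main__":
    for n in (3, 7, 15):
        print(n, lebesgue_constant(cheb_second_kind(n)), leading_term_estimate(n))
```

Running the same routine on equispaced nodes shows the rapid growth of the Lebesgue constant that the Chebyshev nodes avoid.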

22 M. HECHT, K. GONCIARZ, J. MICHELFEIT, V. SIVKIN, AND I.F. SBALZARINI

Since $P_A$ is unisolvent, $I_{P_A}$ is linear and exact on $\Pi_A$, i.e., $I_{P_A}(Q) = Q$ for all $Q \in \Pi_A$. The Lebesgue function provides a relative measure of the approximation quality of $P_A$ in the following sense: Let $Q_f^* \in \Pi_A$ be an optimal polynomial approximation of $f$; then:

$\|f - Q_{f,A}\|_{C^0(\Omega)} \le \|f - Q_f^*\|_{C^0(\Omega)} + \|Q_{f,A} - Q_f^*\|_{C^0(\Omega)}$
$\qquad \le \|f - Q_f^*\|_{C^0(\Omega)} + \|I_{P_A}(f - Q_f^*)\|_{C^0(\Omega)}$
(7.9) $\qquad \le (1 + \Lambda(P_A))\, \|f - Q_f^*\|_{C^0(\Omega)}$ .

In dimension $m = 1$, it is known that for any arbitrary sequence of interpolation nodes $P_n \subseteq \Omega$, $\Lambda(P_n)$ is unbounded. However, by choosing $P_n = \mathrm{Cheb}^{1st}_n$ or $P_n = \mathrm{Cheb}^{2nd}_n$, the following estimate applies [13, 14, 34, 57, 59, 69, 78, 77]:

(7.10) $\Lambda(P_n) = \dfrac{2}{\pi}\bigl(\log(n) + \gamma + \log(8/\pi)\bigr) + \mathcal{O}(1/n^2)$ ,

where $\gamma \approx 0.5772$ is the Euler-Mascheroni constant. Due to Eq. (7.9), this implies that all functions $f$ for which the optimal approximation error $\|f - Q_{n,f}^*\|_{C^0(\Omega)}$ decreases faster than $1/(1 + \log(n))$ can be approximated by polynomial interpolation on $\mathrm{Cheb}^{1st}_n$ or $\mathrm{Cheb}^{2nd}_n$. Based on this observation, we state for the multidimensional case:

m 2nd Lemma 7.4. Let m, n N, p 1, A = Am,n,p, GP = i=1Chebn , and PA be generated according to the∈ essential≥ assumptions in Deﬁnition⊕ 2.9. Then Λ(P ) Λ(Cheb2nd)m (log(n)m) . A ≤ n ∈ O Proof. Let B = Am,n,p0 with p0 p. We claim that Λ(PB) Λ(PA). Indeed, let 0 ≤ ≤ f C (Ω, R) and Qf,B = IPB (f) be its interpolant w.r.t. B. Consider a function ∈ f C0(Ω, R) such that ∈ f(p) if p P f(p) = B . e Q (p) if p ∈ P P f,B ∈ A \ B 0 We denote with Ff,Be C (Ω, R) the set of all such functions. We use that ⊆ IPA (IPB (f)) = IPB (f) to deduce Qf,B = IPB (f) = IPA (IPB (f)) = IPA (f) = Qf,Ae e for all f Ff,B. Thus, we estimate ∈ e 0 Qf,Ae C (Ω) eΛ(PBe) = sup Qf,B C0(Ω) = sup k k 0 k k 0 f C (Ω,R), f C0(Ω) 1 fe Fef,B 0 f C (Ω) ∈ k k ≤ ∈ \{ } k k (7.11) sup Qf,A C0(Ω) = Λ(PA) . 0 ≤ f C (Ω, ), f 0 1 k k e ∈ R k kC (Ω)≤

For A = Am,n, the Lagrange polynomials are given by Eq. (4.2). We set li,j(x) = n ∞ xi pk,i li(xj) = − , pj,i GP, 1 i m, 0 j n, and use Eq. (7.10) to k=0,k=j pj,i pk,i ∈ ≤ ≤ ≤ ≤ bound 6 − Q Λ(PA) = sup Qf,A C0(Ω) Lα C0(Ω) 0 k k ≤ k k f C (Ω), f C0(Ω) 1 α A ∈ k k ≤ ∈ m m m X

l 0 l 0 ≤ k αi kC (Ω) ≤ k i,jkC (Ω) α A i=1 i=1 j=1 X∈ Y Y X (7.12) Λ(Cheb2nd)m (log(n)m) . ≤ n ∈ O MULTIVARIATE INTERPOLATION ON UNISOLVENT NODES 23

Combining Eq. (7.11) with Eq. (7.12) completes the proof. This provides all the necessary ingredients to characterize the space of func- 2nd tions that can be approximated by polynomial interpolation on Chebn in m N dimensions: ∈ Theorem 7.5 (Approximation of Sobolov functions). Let the essential assumptions (Definition 2.9) be fulfilled with respect to Am,n,p, m, n N, p 1, and nodes ∈ ≥ PAm,n,p generated by GP = m Cheb2nd . ⊕i=1 n k 0 Assume that f H (Ω, R), k > m/2. Then f Qf,Am,n,p C (Ω) 0. n ∈ k − k −−−−→→∞ Proof. We first prove the statement for p = 1 and then generalize. Step 1: We argue by induction over dimension m N. For m = 0 the Theorem is trivial. Now assume that m > 0 and let ∈ m ∂Ω = Ω , Ω = x = (x , . . . , x ) Ω x = ( 1)j i,j i,j 1 m ∈ i − i=1 j=0,1 [ [ be the boundary of Ω decomposed into the faces Ωi,j of the hypercube. Denote by f ∂Ω and Qf,A Ω the restrictions of f and of Qf,A to these faces, respectively. | i,j | i,j Furthermore, set Ai,j = α A αi = j . Then, Ai,j = Am 1,n j,1 generates ∈ ∼ − − P P according to Eq. (2.9). Since Cheb2nd = p , . . . , p is assumed to Ai,j ⊆ A { 0 n} be Leja ordered (see Eq. (7.3)), we have p0, p1 = 1, 1 . Denote by τi,j the { } {− } m 1 affine transformations identifying the faces Ω with [ 1, 1] − in the obvious i,j − way. Then, τi,j(PAi,j ) satisfies the assumptions of the Theorem with respect to Am 1,n j,1 for all 1 i m, j = 0, 1. Further, as explained in Appendix A, − − ≤ ≤ k 1/2 Eq. (A.3), we have f Ωi,j H − (Ωi,j, R), k 1/2 > (m 1)/2. Thus, by | ∈ − − induction f Ω Qf,A Ω C0(Ω) 0 for all 1 i m, j = 0, 1. k | i,j − | i,j k −−−−→n ≤ ≤ Step 2a: We split A = →∞A , A = A A and 0 i=1,j=0,1 i,j 1 \ 0 S Qf,A(x) = cαLα(x) + cαLα(x) = Qf,A0 + Qf,A1 . α A α A X∈ 0 X∈ 1

We want to show that Qf,A f Qf,A = f uniformly. Indeed, Qf,A (pα) = 0 1 n 0 0 −−−−→→∞ − for all pα PA1 , α A1 and Qf,A1 (pα) = 0 for all pα PA0 , α A0. Thus, by ∈ ∈ e ∈ ∈ Step 1, it suﬃces to show that Qf,A f uniformly for any periodic f satisfying n −−−−→→∞ f ∂Ω = 0. | Step 2b: We use the facts summarized in Appendix A. According to Eq. (A.1), we can write πi β,x f(x) = cβe h i , cβ C ∈ β m X∈N almost everywhere and consider the projections from Eq. (A.4) onto the spaces of ﬁnite Fourier series and polynomials k m 0 m θn : H (T , R) Θm,n,1 C (T , R) , −→ ⊆ k m 0 πn : H (T , R) Πm,n,1 C (Ω, R) , −→ ⊆ 24 M. HECHT, K. GONCIARZ, J. MICHELFEIT, V. SIVKIN, AND I.F. SBALZARINI with complementary projections θn⊥ = I θn, and πn⊥ = I πn, respectively. We k − − denote by IPA : H (Ω, R) Πm,n,1, f Qf,A the interpolation operator. Let −→ 7→ q N, use Lemma A.1i) to estimate C0(Ω Hk(Ω, and derive ∈ k · k ≤ k · k f Q 0 θ (f) I (θ (f)) 0 + θ⊥(f) I (θ⊥(f)) 0 k − f,AkC (Ω) ≤ k q − PA q kC (Ω) k q − PA q kC (Ω) (7.13) π (θ (f)) I (π θ (f)) 0 ≤ k n q − PA n q kC (Ω) (7.14) + π⊥(θ (f)) I (π⊥θ (f)) 0 k n q − PA n q kC (Ω) (7.15) + θ⊥(f) 0 + Λ(P ) θ⊥(f) 0 , k q kC (Ω) A k q kC (Ω) where we used Eq. (7.9) to yield the term (7.15). Since πn(θq(f)) ΠA is a polynomial term (7.13) vanishes, and it remains to estimate terms (7.14)∈ and Eq. (7.15). We choose n = n(q) such that

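\[ \mathcal{O}(n) = \mathcal{O}\big(e^{(q/m)^{1/2}}\big) \iff \mathcal{O}(q) = \mathcal{O}\big(m \log(n)^2\big) . \tag{7.16} \]
Then, Eq. (7.10) in combination with Lemma 7.4 ensures the existence of a constant $C \in \mathbb{R}^+$ such that
\begin{align*}
(1 + \Lambda(P_A)) \| \theta_q^\perp(f) \|_{C^0(\Omega)} &\le (1 + C \log(n)^m) \| \theta_q^\perp(f) \|_{C^0(\Omega)} \\
&\le (1 + C (q/m)^k) \| \theta_q^\perp(f) \|_{C^0(\Omega)} .
\end{align*}
By Lemma A.1 ii), we have $(q/m)^k \| \theta_q^\perp(f) \|_{C^0(\Omega)} \xrightarrow[q \to \infty]{} 0$. Thus, term (7.15) converges to zero as $q \longrightarrow \infty$. According to Lemma A.1 iii), there exists $D \in \mathbb{R}^+$ such that we can bound the remaining term by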

\begin{align*}
\| \pi_n^\perp(\theta_q(f)) - I_{P_A}(\pi_n^\perp \theta_q(f)) \|_{C^0(\Omega)} &\le (1 + \Lambda(P_A)) \| \pi_n^\perp(\theta_q(f)) \|_{C^0(\Omega)} \\
&\le D (q/m)^k N(m,q)^{1/2} \frac{(\pi^2 q)^{n+1}}{(n+1)!} \| f \|_{H^k(\Omega)} .
\end{align*}
Due to the identities $N(m,q) = \binom{m+q}{m} \in \mathcal{O}(q^{m+1})$ and $2k = m + l$, $l \ge 1$, we can find constants $E, F, H \in \mathbb{R}^+$ (independent of $n$) such that, by using Stirling's formula [1, 79] for $n \ge n_0 \in \mathbb{N}$ large enough, we find:
\begin{align*}
D (q/m)^k N(m,q)^{1/2} \frac{(\pi^2 q)^{n+1}}{(n+1)!} &\le E \frac{q^{m+l+n+2}}{(n+1)!} \le F \frac{(m \log(n)^2)^{m+l+n+2}}{(n+1)!} \\
&\le H \frac{(m \log(n)^2)^{m+l+n+2}}{\sqrt{2\pi n}\, (n/e)^n} \tag{7.17} \\
&\le \frac{H}{\sqrt{2\pi n}} \left( \frac{e\, (m \log(n)^2)^{m+l+2}}{n} \right)^{n} \xrightarrow[n \to \infty]{} 0 .
\end{align*}
Hence, term (7.14) also converges to zero. Taken together, $\| f - Q_{f,A_{m,n,1}} \|_{C^0(\Omega)}$ converges to zero as $n \longrightarrow \infty$.

Step 3: Finally, for $p \ge 1$, we consider $A_{m,n,1}$, $A_{m,n,p}$ and denote by $P_{A_{m,n,1}}$, $I_{P_{A_{m,n,1}}}(f) = Q_{f,A_{m,n,1}}$ and $P_{A_{m,n,p}}$, $I_{P_{A_{m,n,p}}}(f) = Q_{f,A_{m,n,p}}$ the interpolation nodes, interpolation operators, and interpolants, respectively. Since $Q_{f,A_{m,n,p}}(q) = f(q)$ for all $q \in P_{A_{m,n,1}}$, we realize that $I_{P_{A_{m,n,1}}}(I_{P_{A_{m,n,p}}}(f)) = I_{P_{A_{m,n,1}}}(f)$. Thus,
\[ f = \lim_{n \to \infty} I_{P_{A_{m,n,1}}}(f) = \lim_{n \to \infty} I_{P_{A_{m,n,1}}}(I_{P_{A_{m,n,p}}}(f)) , \tag{7.18} \]
where all limits are understood uniformly. $I_{P_{A_{m,n,p}}}(f) \in \Pi_{m,n,p} \subseteq H^k(\Omega, \mathbb{R})$ is a sequence of Sobolev functions. We claim that for any sequence $f_n \in H^k(\Omega, \mathbb{R})$ with $k > m/2$, there holds
\[ f = \lim_{n \to \infty} I_{P_{A_{m,n,1}}}(f_n) \ \text{in } C^0(\Omega, \mathbb{R}) \iff f = \lim_{n \to \infty} f_n \ \text{in } H^k(\Omega, \mathbb{R}) . \]
Indeed, since the Theorem holds for $p = 1$, we obtain $f = \lim_{n \to \infty} I_{P_{A_{m,n,1}}}(f) = \lim_{n \to \infty} I_{P_{A_{m,n,1}}}(f_n)$. Since $H^k(\Omega, \mathbb{R}) \subseteq C^0(\Omega, \mathbb{R})$ for $k > m/2$, the "$\Longrightarrow$" direction follows. To show the reverse direction, we use the continuity of $I_{P_{A_{m,n,1}}}$ to compute

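\begin{align*}
\lim_{h \to \infty} \lim_{n \to \infty} I_{P_{A_{m,n,1}}}(f_h) - \lim_{n \to \infty} \lim_{h \to \infty} I_{P_{A_{m,n,1}}}(f_h) &= \lim_{h \to \infty} f_h - \lim_{n \to \infty} I_{P_{A_{m,n,1}}}\big(\lim_{h \to \infty} f_h\big) \\
&= \lim_{h \to \infty} f_h - \lim_{n \to \infty} I_{P_{A_{m,n,1}}}(f) = f - f = 0 .
\end{align*}
Thus, $f = \lim_{n \to \infty} I_{P_{A_{m,n,1}}}(f_n)$ uniformly and thereby the "$\Longleftarrow$" direction follows. Hence, by Eq. (7.18), the limit $\lim_{n \to \infty} I_{P_{A_{m,n,p}}}(f) = f \in H^k(\Omega, \mathbb{R})$ exists. Since $H^k(\Omega, \mathbb{R}) \subseteq C^0(\Omega, \mathbb{R})$ for $k > m/2$, this proves the Theorem for all $p \ge 1$.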

Corollary 7.6. Let the assumptions of Theorem 7.5 be fulﬁlled, f Hk(Ω, R), and K ∈ K N. Then f Qf,A C0(Ω) o(log(n)− ), i.e., ∈ k − m,n,p k ∈ mK log(n) f Qf,A C0(Ω) 0 . m,n,p n k − k −−−−→→∞ Proof. The proof follows along the same lines as the proof of Theorem 7.5. Indeed, Steps 1 & 2a above apply again if we replace Eq. (7.16) by

1/2K (n) = (e(q/m) ) (q) = (m log(n)2K ) . O O ⇐⇒ O O Then, term (7.15) can be bounded by

θ⊥(f) 0 + Λ(P ) θ⊥(f) 0 (1 + Λ(P )) θ⊥(f) 0 k q kC (Ω) A k q kC (Ω) ≤ A k q kC (Ω) m (1 + C log(n) ) θ⊥(f) 0 ≤ k q kC (Ω) 1 + C(q/m)k θ⊥(f) 0 . ≤ log(n)mK k q kC (Ω) o(log(n)mK ) ∈ Further, the bound of term (7.14) becomes (π2q)n+1 qm+l+n+2 (m log(n)2K )m+l+n+2 D(q/m)kN(m, q)1/2 E F (n + 1)! ≤ (n + 1)! ≤ (n + 1)! (m log(n)2K )m+l+n+2 H ≤ √ n n 2πn( e ) H e(m log(n)2K )m+l+2 n ≤ √2πn n o(log(n)K ) . ∈ Step 3 remains unchanged, proving the corollary. Corollary 7.6 establishes convergence rates for arbitrary Sobolev functions f ∈ Hk(Ω, R) with k > m/2. However, one hopes that faster rates exist when additional assumptions on f can be made. The next section discusses this question. 26 M. HECHT, K. GONCIARZ, J. MICHELFEIT, V. SIVKIN, AND I.F. SBALZARINI

7.3. Approximation rate in multiple dimensions. In practice, the question of how fast the interpolant $Q_{f,A}$ converges to $f$ is of particular interest. Lloyd N. Trefethen recently used the famous result of Bernstein's prize-winning memoir of 1914 [7] to derive upper bounds on the convergence rates [90]. He assumed the function $f$ to be analytical over generalized versions of Hooke and Newton ellipses [5]. Numerical experiments suggested that these rates are also lower bounds, and first steps toward a mathematical proof of this expectation have already been taken [10]. Here we revisit these results and adapt them to our problem.

Definition 7.7. Let $E_\rho$ be the Hooke ellipse of ratio $\rho$ with foci $-1$ and $1$ and topmost point $ih$. Hence, $\rho$ and $h$ are related by
\[ h = (\rho - \rho^{-1})/2 , \qquad \rho = h + \sqrt{1 + h^2} . \]
Further denote by $E^2_{h^2}$ the Newton ellipse with foci $0$ and $1$ and leftmost point $-h^2$. For $m \in \mathbb{N}$ and $h \in [0,1]$ we then call the open region
\[ N_{m,\rho} = \big\{ (x_1, \dots, x_m) \in \mathbb{C}^m \ \big|\ (x_1^2 + \cdots + x_m^2)/m \in E^2_{h^2} \big\} \]
the Trefethen domain [90].

Lloyd N. Trefethen's statement [90] of the convergence rates was formulated for Chebyshev polynomials. However, since the approximation rate is independent of the representation of a polynomial interpolant, the statement also applies to our setting. We restate it here in adapted form to match our notation. We call a function $f : \mathbb{C}^m \longrightarrow \mathbb{R}$ analytical in a domain $D \subseteq \mathbb{C}^m \cong \mathbb{R}^{2m}$ if and only if $f$ possesses a Taylor series that converges absolutely in $D$.

Theorem 7.8 (Lloyd N. Trefethen). Let the essential assumptions (Definition 2.9) be fulfilled with respect to $A_{m,n,p}$ and $GP = \oplus_{i=1}^m \mathrm{Cheb}_n^{2\mathrm{nd}}$. Further, assume that $f \in C^0(\Omega, \mathbb{R})$ is analytical in the Trefethen domain $N_{m,\rho}$. Then
\[ \| f - Q_{f,A_{m,n,p}} \|_{C^0(\Omega)} \in \begin{cases} \mathcal{O}_\varepsilon(\rho^{-n/\sqrt{m}}) , & p = 1 \\ \mathcal{O}_\varepsilon(\rho^{-n}) , & p = 2 \\ \mathcal{O}_\varepsilon(\rho^{-n}) , & p = \infty . \end{cases} \]
Thereby, $g(n) \in \mathcal{O}_\varepsilon(\rho^{-n})$ if and only if $g(n) \in \mathcal{O}((\rho - \varepsilon)^{-n})$ for all $\varepsilon > 0$. Thus, the above are upper bounds on the approximation rates.
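To get a feeling for what the rates in Theorem 7.8 imply, the following minimal sketch estimates the smallest degree $n$ for which the rate term alone drops below a tolerance. It ignores the constants hidden in the $\mathcal{O}_\varepsilon$ notation, so the numbers are only indicative; the function name `degree_for_tolerance` is ours and purely illustrative.

```python
import math


def degree_for_tolerance(rho, m, p, tol):
    """Smallest n such that the rate term of Theorem 7.8 is <= tol.

    The rate is rho**(-n / sqrt(m)) for p = 1 and rho**(-n) for p = 2 or
    p = inf. Constants hidden in the O-notation are ignored (assumption).
    """
    if rho <= 1.0:
        raise ValueError("rho must be larger than 1")
    # Exponent scaling: 1/sqrt(m) for l1-degree, 1 for l2- and l_inf-degree.
    scale = 1.0 / math.sqrt(m) if p == 1 else 1.0
    return math.ceil(math.log(1.0 / tol) / (scale * math.log(rho)))


if __name__ == "__main__":
    for p in (1, 2, float("inf")):
        print(p, degree_for_tolerance(rho=1.365, m=4, p=p, tol=1e-12))
```

Under this simplification, the $l_1$-degree needs roughly $\sqrt{m}$ times the polynomial degree of the $l_2$- or $l_\infty$-degree to reach the same tolerance.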

Remark 7.9. The above statement indicates that the notion of $l_2$-degree suffices for the approximation to be as good as when using the entire $P_{A_{m,n,\infty}}$ grid. In light of Remark 4.5, the interpolation can therefore reach any approximation accuracy with sub-exponential complexity $\mathcal{O}(|A_{m,n,2}|)$, $|A_{m,n,2}| \in o(n^m)$, in any dimension $m$ whenever the rate for $p = 2$ applies. This can be observed in the numerical experiments presented in Section 8.

Next, we generalize Theorem 7.8 to approximation on given, arbitrarily scattered interpolation nodes.

Corollary 7.10 (Approximation on scattered data). Let the essential assumptions (Definition 2.9) be fulfilled with respect to $A_{m,n,p}$, $GP = \oplus_{i=1}^m \mathrm{Cheb}_n^{2\mathrm{nd}}$. Let further $\bar P_{A_{m,n,p}} = \{\bar p_\alpha\}_{\alpha \in A_{m,n,p}} \subseteq \mathbb{R}^m$ be any scattered node set that is unisolvent with respect to $\Pi_{m,n,p}$. Assume further that the matrix $S_{A_{m,n,p}}$ from Corollary 5.1 is given. Let $f \in C^0(\Omega, \mathbb{R})$ be analytical in the Trefethen domain $N_{m,\rho}$ and denote by $Q_{f,A_{m,n,p}} = I_{P_{A_{m,n,p}}}(f)$ and $\bar Q_{f,A_{m,n,p}} = I_{\bar P_{A_{m,n,p}}}(f)$ the interpolants of $f$ on $P_{A_{m,n,p}}$ and on $\bar P_{A_{m,n,p}}$, respectively. Then:

i) The Lebesgue functions on $\bar P_{A_{m,n,p}}$ and $P_{A_{m,n,p}}$ can be related by
\[ \Lambda(\bar P_A) \le \| S_A \|_\infty \Lambda(P_A) , \qquad \Lambda(P_A) \le \| S_A^{-1} \|_\infty \Lambda(\bar P_A) , \qquad A = A_{m,n,p} . \]

ii) The approximation error on $\bar P_A$ is bounded by
\[ \| f - \bar Q_{f,A} \|_{C^0(\Omega)} \le (1 + \Lambda(\bar P_A)) \| f - Q_{f,A} \|_{C^0(\Omega)} , \qquad A = A_{m,n,p} . \]

iii) For $s_n = 1 + \| S_{A_{m,n,p}} \|_\infty \Lambda(P_{A_{m,n,p}})$, we have
\[ \| f - \bar Q_{f,A_{m,n,p}} \|_{C^0(\Omega)} \in \begin{cases} \mathcal{O}_\varepsilon(s_n \rho^{-n/\sqrt{m}}) , & p = 1 \\ \mathcal{O}_\varepsilon(s_n \rho^{-n}) , & p = 2 \\ \mathcal{O}_\varepsilon(s_n \rho^{-n}) , & p = \infty . \end{cases} \]

iv) For bounded $S_{A_{m,n,p}}$, i.e., $\| S_{A_{m,n,p}} \|_\infty \le s_0 \in \mathbb{R}^+$ for $n > n_0 \in \mathbb{N}$ large enough, and for $f \in H^k(\Omega, \mathbb{R})$, $k > m/2$, we have
\[ \| f - \bar Q_{f,A_{m,n,p}} \|_{C^0(\Omega)} \xrightarrow[n \to \infty]{} 0 . \]

Proof. To show i) we use Definition 7.3 and Corollary 5.1 and denote by $L_\alpha, \bar L_\alpha \in \Pi_A$ the Lagrange polynomials on $P_A$ and $\bar P_A$, respectively, with $A = A_{m,n,p}$. Using

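$(d_{\alpha_{\min}}, \dots, d_{\alpha_{\max}}) = S_A F$ with $F = \big(f(\bar p_{\alpha_{\min}}), \dots, f(\bar p_{\alpha_{\max}})\big)$, we compute
\begin{align*}
\Lambda(\bar P_A) &= \sup_{f \in C^0(\Omega, \mathbb{R}),\ \|f\|_{C^0(\Omega)} \le 1} \| \bar Q_{f,A} \|_{C^0(\Omega)} = \sup_{f \in C^0(\Omega, \mathbb{R}),\ \|f\|_{C^0(\Omega)} \le 1} \Big\| \sum_{\alpha \in A} f(\bar p_\alpha) \bar L_\alpha \Big\|_{C^0(\Omega)} \\
&= \sup_{f \in C^0(\Omega, \mathbb{R}),\ \|f\|_{C^0(\Omega)} \le 1} \Big\| \sum_{\alpha \in A} d_\alpha L_\alpha \Big\|_{C^0(\Omega)} \le \| S_A \|_\infty \| F \|_\infty \Big\| \sum_{\alpha \in A} | L_\alpha | \Big\|_{C^0(\Omega)} \\
&\le \| S_A \|_\infty \Lambda(P_A) .
\end{align*}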

Swapping the roles of $P_A$ and $\bar P_A$ yields the converse estimate. Statement ii) can be proven by adapting Eq. (7.9) as follows:
\begin{align*}
\| f - \bar Q_{f,A} \|_{C^0(\Omega)} &\le \| f - Q_{f,A} \|_{C^0(\Omega)} + \| Q_{f,A} - \bar Q_{f,A} \|_{C^0(\Omega)} \\
&\le \| f - Q_{f,A} \|_{C^0(\Omega)} + \| I_{\bar P_A}(f - Q_{f,A}) \|_{C^0(\Omega)} \\
&\le (1 + \Lambda(\bar P_A)) \| f - Q_{f,A} \|_{C^0(\Omega)} \quad \text{for } A = A_{m,n,p} .
\end{align*}
Statement iii) is a direct consequence of ii) in conjunction with Theorem 7.8. Finally, due to Lemma 7.4 and Corollary 7.6, there exists $s_1 \in \mathbb{R}^+$ such that for $n > n_0$ we can bound:
\begin{align*}
\| f - \bar Q_{f,A_{m,n,p}} \|_{C^0(\Omega)} &\le (1 + s_0 \Lambda(P_A)) \| f - Q_{f,A_{m,n,p}} \|_{C^0(\Omega)} \\
&\le (1 + s_1 \log(n)^m) \| f - Q_{f,A_{m,n,p}} \|_{C^0(\Omega)} \xrightarrow[n \to \infty]{} 0 ,
\end{align*}
proving iv).

In the following section, we illustrate the practical relevance of the statements proven so far in numerical experiments.
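The logarithmic growth of the Lebesgue constant that enters the bound in iv) can be checked numerically in one dimension. The sketch below is an illustration under our own assumptions, not part of the MIP prototype: it evaluates the Lebesgue function of the Chebyshev nodes of 2nd kind on a fine sample grid, using the barycentric weights from [8], and compares its maximum with the estimate $\frac{2}{\pi}(\log(n) + \gamma + \log(8/\pi))$. The sample count and function names are illustrative choices.

```python
import numpy as np


def lebesgue_constant_cheb2(n, samples=20001):
    """Approximate the 1D Lebesgue constant of n+1 Chebyshev nodes of 2nd kind.

    The supremum is approximated by the maximum over an equispaced sample
    grid, so the returned value is a slight underestimate (assumption).
    """
    k = np.arange(n + 1)
    nodes = np.cos(np.pi * k / n)
    # Barycentric weights for Chebyshev nodes of 2nd kind, see [8].
    w = (-1.0) ** k
    w[0] *= 0.5
    w[-1] *= 0.5

    x = np.linspace(-1.0, 1.0, samples)
    diff = x[:, None] - nodes[None, :]
    on_node = np.any(diff == 0.0, axis=1)
    diff[diff == 0.0] = 1.0  # placeholder, these rows are overwritten below
    terms = w / diff
    # Lebesgue function: sum_j |l_j(x)| written in barycentric form.
    lam = np.abs(terms).sum(axis=1) / np.abs(terms.sum(axis=1))
    lam[on_node] = 1.0  # at a node exactly one Lagrange polynomial equals 1
    return float(lam.max())


def lebesgue_estimate(n):
    """Logarithmic estimate (2/pi) * (log(n) + gamma + log(8/pi))."""
    return 2.0 / np.pi * (np.log(n) + np.euler_gamma + np.log(8.0 / np.pi))


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, lebesgue_constant_cheb2(n), lebesgue_estimate(n))
```

In $m$ dimensions, the proof above uses the bound $\log(n)^m$ for tensorial grids; the sketch only covers the one-dimensional factor.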

8. Numerical experiments

We implemented a prototype of our multivariate interpolation solver, named MIP, in MATLAB. The code implements the multivariate divided difference scheme of Definition 3.2 for interpolation nodes $P_A$, $A = A_{m,n,p}$, generated by Leja-ordered Chebyshev nodes of 2nd kind, i.e., $GP = \oplus_{i=1}^m \mathrm{Cheb}_n^{2\mathrm{nd}}$ according to the essential assumptions in Definition 2.9. We compare our solver with the following alternative methods:
(1) Chebfun from the corresponding MATLAB package [32];
(2) Cubic splines and 5th-order splines from the MATLAB Curve Fitting Toolbox;
(3) Floater-Hormann interpolation [38] from the R package chebpol [42];
(4) Multi-linear (piecewise linear) interpolation from chebpol [42];
(5) Chebyshev interpolation of 1st kind from chebpol [42];
(6) Uniform (grid) interpolation by Chebyshev polynomials from chebpol [42];
(7) Vandermonde interpolation on $P_{A_{m,n,p}}$ in MATLAB.

Note that apart from MIP and Vandermonde, all other schemes use regular grids $P_{A_{m,n,\infty}}$ as interpolation nodes. Therefore, Chebfun and Chebyshev only deliver $l_\infty$-degree interpolations.

All implementations were benchmarked using MATLAB version R2019b, Chebfun package version 5.7.0, and R versions 3.2.3/Linux and 3.6.2/macOS with chebpol package version 2.1.2 on a standard personal computer (Intel(R) Xeon(R) CPU E5-2660 v3 @2.60GHz, 128GB RAM).

The code and all benchmark data sets are freely available from: https://git.mpi-cbg.de/mosaic/polyapprox. The implementation of MIP is provided as a prototype, which can be used to reproduce the results presented here. In the future, we are going to optimize the code from a software engineering perspective and include further algorithmic improvements, such as the multivariate barycentric Lagrange interpolation [8] discussed in Section 9.1.

8.1. Approximation on the hypercube. In the first set of experiments, we illustrate the statements of Theorems 7.5 and 7.8. To this end, we consider the Runge function
\[ f_R(x) = \frac{1}{1 + 10 \| x \|^2} . \]
Although the derivatives diverge, $\| \partial^\alpha f_R \|_{C^0(\Omega)} \longrightarrow \infty$ for $\| \alpha \|_1 \longrightarrow \infty$, they are bounded for $\| \alpha \|_1 \le k$, $k > m/2$. Thus, $f_R \in H^k(\Omega, \mathbb{R})$ for $k > m/2$ and the Runge function can therefore be approximated according to Theorem 7.5. Observe that for $z = x + iy$, $\bar z = x - iy \in \mathbb{C}$ we can rewrite
\[ f_R(z) = \frac{1}{1 + 10 z \bar z} = \frac{1}{(1 + i\sqrt{10} z)(1 - i\sqrt{10} \bar z)} . \]
Thus, $f_R$ has poles in $z = \pm i/\sqrt{10} \in \mathbb{C}$. In light of this fact, one can show that $f_R$ is analytical in the Trefethen domain $N_{m,\rho}$ with $h = 1/\sqrt{10} \approx 0.316$ and $\rho = h + \sqrt{1 + h^2} \approx 1.365$, thereby fulfilling the requirements of Theorem 7.8.
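The relation between the pole location and the ellipse parameter can be reproduced with a few lines. The sketch below only evaluates the formulas of Definition 7.7 for the Runge function $1/(1 + \mathrm{RF} \cdot \|x\|^2)$, where the poles lie at $\pm i/\sqrt{\mathrm{RF}}$; the helper names are illustrative.

```python
import math


def rho_from_h(h):
    """Ellipse parameter rho for topmost point i*h (Definition 7.7)."""
    return h + math.sqrt(1.0 + h * h)


def h_from_rho(rho):
    """Inverse relation h = (rho - 1/rho) / 2 (Definition 7.7)."""
    return (rho - 1.0 / rho) / 2.0


def runge_rho(runge_factor):
    """rho for the Runge function 1 / (1 + RF * |x|^2) with poles at +- i / sqrt(RF)."""
    return rho_from_h(1.0 / math.sqrt(runge_factor))


if __name__ == "__main__":
    print("RF = 10:", runge_rho(10.0))
    print("RF = 1: ", runge_rho(1.0))
```

For RF = 1 the same formula gives $\rho = 1 + \sqrt{2}$, which is the value the fitted rates are compared against later in this section.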

Figure 2. Approximation errors for the benchmarked methods interpolating the Runge function in dimension m = 2.

Experiment 1. We measure the approximation errors of the interpolants computed by the mentioned methods. To do so, we sample $|Q| = 100$ random nodes $Q$, independently generated for each degree but identical for all methods, and determine $\max_{q \in Q} | f(q) - Q_f(q) | \approx \| f - Q_f \|_{C^0(\Omega)}$.

Figure 2 shows the results of this experiment in dimension $m = 2$. We observe that Chebyshev, Chebfun, and MIP are the only methods that converge down to machine precision (64-bit double-precision arithmetic). The convergence rate is as stated in Theorem 7.8 and reproduces earlier results by Lloyd N. Trefethen [90] as introduced in Section 1. However, we only use the $P_A$, $A = A_{m,n,p}$, $p = 1, 2$, unisolvent nodes to determine the interpolants, whereas Trefethen computed the rates for the $l_1$- and $l_2$-degree approximations by regression over the whole $l_\infty$-grid. This detail might be the reason for the slight advantage of MIP over Chebfun and Chebyshev for high degrees.

Further, we recognize that the Vandermonde approach is inaccurate and even becomes numerically unstable (rising errors) for higher degrees. It is therefore inappropriate for approximating strongly varying functions, such as the Runge function. As expected, (Chebyshev) polynomial interpolation on uniform grids (uniform) and multi-linear interpolation also do not converge.

Finally, we observe that Floater-Hormann interpolation performs better than multivariate cubic splines. It is comparable to 5th-order splines, but reaches an accuracy of $10^{-7}$ faster than any other approach.

Figure 3 shows the results of the same experiment in dimension $m = 3$, leaving out the infeasible methods. The observations made in 2D remain valid. However, Floater-Hormann becomes indistinguishable from 5th-order splines. Further, when considering the number of coefficients/nodes required to determine the interpolant, plotted in the right panel (with logarithmic scales on both axes), the polynomial convergence rates of Floater-Hormann and all spline-type approaches become visible. MIP requires $122^3/899028 \approx 2$ times fewer coefficients/nodes than Chebyshev or Chebfun to approximate $f$ to machine precision for $n = 121$.

Figure 4 shows the results for dimension $m = 4$. Spline interpolation was not able to scale to high degrees due to computer memory requirements. To simulate the behaviour for higher degrees we rescale the hypercube to $\frac{1}{\sqrt{10}} \Omega = \big[ -\frac{1}{\sqrt{10}}, \frac{1}{\sqrt{10}} \big]^m$. That is, we approximate the Runge function on two scales: once for Runge factor RF = 10 and once for RF = 1. The results for RF = 10 show a similar situation

Figure 3. Approximation errors for the benchmarked methods interpolating the Runge function in dimension m = 3.

Figure 4. Approximation errors for the benchmarked methods interpolating the Runge function in dimension m = 4.

as the results in 3D. However, Figure 3 suggests that again degree $n \approx 75$ is the crossover point, where MIP and Chebyshev become the superior approaches. Indeed, for RF = 1, only Chebyshev and MIP converge down to machine precision. But MIP reaches that goal earlier ($n = 40/47$) than Chebyshev, and with fewer interpolation nodes, $|C_{\mathrm{Chebyshev}}| / |C_{\mathrm{MIP}}| = 5308416/858463 \approx 6$.

The same is true in dimension $m = 5$, as Fig. 5 illustrates. Especially when considering the right plot (with logarithmic scales on both axes), we observe that MIP best resists the curse of dimensionality by yielding 2 orders of magnitude better accuracy than Chebyshev for $n = 40$ ($3.0 \cdot 10^{-14}$ vs. $2.1 \cdot 10^{-12}$) with fewer interpolation nodes, $|C_{\mathrm{Chebyshev}}| / |C_{\mathrm{MIP}}| = 115856201/18920038 \approx 6$.

To assess the convergence rates, we fit the data for MIP with the model $y = c \rho^{-n}$ with an R-squared of 0.99 or better, indicated by the dashed lines in the corresponding figures. This yields the exponential decays reported in Table 1 for the approximation errors in the corresponding ranges. Since $\rho_{\max} = 1 + \sqrt{2} \approx 2.41$ for RF = 1, the upper bounds in Theorem 7.8 are almost achieved by the MIP method. In contrast, Chebyshev just reaches convergence rates $\rho_{RF=1} \approx 1.9$ in dimension $m = 4, 5$.

Figure 5. Approximation errors for the benchmarked methods interpolating the Runge function in dimension m = 5.

dim   fitting range   ρ (RF=10)   c (RF=10)   fitting range   ρ (RF=1)   c (RF=1)
2     2 ∼ 121         1.35        4.30
3     2 ∼ 121         1.34        4.41
4     2 ∼ 80          1.32        4.42        2 ∼ 40          2.33       5.40
5                                             2 ∼ 40          2.35       13.37
Table 1. Fitted convergence rates of MIP.

In summary, Experiment 1 confirms that MIP converges as expected from Theorem 7.8. Compared to the other methods tested, MIP is efficient in reaching machine precision. MIP also seems to resist the curse of dimensionality best, which becomes increasingly visible in higher dimensions, thus supporting the prediction of Remark 7.9.

8.2. Interpolation of scattered data. We design the following experiment to verify Corollary 7.10:

Experiment 2. Consider the Chebyshev grid $P_A \subseteq \Omega = [-1, 1]^2$, $A = A_{2,n,2}$, generated with respect to $GP = \oplus_{i=1}^m \mathrm{Cheb}_n^{2\mathrm{nd}}$. We realize perturbations $\tilde P_A$ of $P_A$ as follows. Since the points located at the boundary $\partial\Omega$ play a crucial role for the approximation error, the perturbation is done relative to the distance from the boundary, i.e.,
\[ d_x(p_\alpha) = \mathrm{dist}(\{-1, 1\}, p_{\alpha,x}) , \qquad d_y(p_\alpha) = \mathrm{dist}(\{-1, 1\}, p_{\alpha,y}) , \qquad p_\alpha \in P_A , \ \alpha \in A . \]
A perturbed point $\tilde p$ is given by $\tilde p_x = p_x + \rho_x d_x(p)$, $\tilde p_y = p_y + \rho_y d_y(p)$ with $\rho_x, \rho_y \in [-\nu, \nu]$ uniformly random, where $\nu \in [0, 1]$ is the perturbation amplitude. Thus, perturbations $\tilde p_\alpha \in \partial\Omega$ of boundary nodes $p_\alpha \in \partial\Omega$ remain within the boundary, while inner nodes can leave their original position freely with radius $\nu$ and could even overlap in the worst case with other perturbed nodes $\tilde p_\beta$. We use Corollary 5.1 to approximate a rescaled version of the Runge function $f(x) = 1/(1 + \| x \|^2)$ on the perturbed nodes $\tilde P_{A,\nu}$.

Figure 6 shows the approximation errors AP-$\nu$ $= \| f - Q_{f,\tilde P_{A,\nu}} \|_{C^0(\Omega)}$ measured on 200 randomly selected nodes, for perturbation radii corresponding to $\nu = 0\%, 5\%, 10\%, 25\%, 50\%, 100\%$ of noise. The dashed lines are the error estimates


Figure 6. Approximation error convergence on scattered data obtained from randomly perturbed grids (perturbation amplitude in percent of grid spacing) in dimension m = 2.

from Corollary 7.10. For this, we set $s_n = 1 + \| S_{A,\nu} \|_\infty \Lambda(P_A)$ with $S_{A,\nu}$ the occurring transformation matrix. According to Eq. (7.10), we estimate $\Lambda(P_A) \approx \frac{2}{\pi} \big( \log(n) + \gamma + \log(8/\pi) \big)$, and we plot the estimated errors EST-$\nu$ $= s_n \| f(x) - Q_{f,\tilde P_{A,0}} \|$, with $Q_{f,P_A}$ the approximation of $f$ on the unperturbed grid, i.e., for $\nu = 0$.

The results show that even for 50% perturbation the convergence rates remain reasonable. The theoretical upper bounds are 2 to 4 orders of magnitude larger than the actually measured errors in this case. For 100% noise, convergence disappears as expected, and the approximation error estimation becomes useless.

In summary, Experiment 2 shows that the error bounds stated in Corollary 7.10 are not only of theoretical nature, but yield useful indicators of the approximation quality on scattered data. They might help choose alternative nodes $\tilde P_A$ of high approximability. For instance, $\tilde P_A$ might be generated by $GP = \oplus_{i=1}^m P_i$ with $P_i$ given by Legendre nodes instead of $\mathrm{Cheb}_n^{1\mathrm{st}}$, $\mathrm{Cheb}_n^{2\mathrm{nd}}$ [91] or a mixture/(Leja)-reordering. Minimizing $\| S_A \|_\infty$ for such alternatives might provide well-approximating nodes.

8.3. Regression on curved manifolds. We realize the algorithm of Carl de Boor and Amos Ron [28, 29] in terms of Corollary 6.4 in case of the torus $M = \mathbb{T}^2_{R,r}$. That is, we consider
\[ Q_{\mathbb{T}^2_{R,r}}(x, y, z) = \big( x^2 + y^2 + z^2 + R^2 - r^2 \big)^2 - 4 R^2 (x^2 + y^2) \]
with $R = 0.7$ and $r = 0.3$. $\mathbb{T}^2_{R,r} = Q^{-1}_{\mathbb{T}^2_{R,r}}(0)$ is an algebraic hypersurface of degree 4. Given a function $f : \Omega \longrightarrow \mathbb{R}$, we aim to interpolate the restriction $f|_{\mathbb{T}^2_{R,r}}$.

Experiment 3. We choose $A = A_{m,n,p}$ with $m = 3$, $p = 2$, and sample $S = \lfloor 1.5 \cdot |A| \rfloor \in \mathbb{N}$ uniformly random nodes $P_{\mathbb{T}^2_{R,r}}$ on the torus $\mathbb{T}^2_{R,r}$, as illustrated in Figure 7 (left). Further, we generate $P_A$ with respect to
\[ GP = \mathrm{Cheb}_n^{2\mathrm{nd}} \oplus \mathrm{Cheb}_n^{2\mathrm{nd}} \oplus r \cdot \mathrm{Cheb}_n^{2\mathrm{nd}} , \]
yielding a feasible grid near the sampled nodes $P_{\mathbb{T}^2_{R,r}}$. Since $\mathbb{T}^2_{R,r}$ is a hypersurface of degree 4, the nodes $p \in P_{\mathbb{T}^2_{R,r}}$ are not unisolvent for $\Pi_{A_{m,n,p}}$ with $n \ge 4$. Thus, due to Corollary 6.4,
\[ m_M := \{ Q \in \Pi_A \mid Q|_M \equiv 0 \} \ne \{ 0 \} , \qquad M = \mathbb{T}^2_{R,r} . \]
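The level-set description of the torus can be verified directly. The following sketch samples points through the standard angle parametrization and checks that the degree-4 polynomial above vanishes on them. Note the assumption: drawing both angles uniformly is not uniform with respect to surface area, so this is not the sampling used in the experiment, only a consistency check of the formula.

```python
import numpy as np


def torus_level_set(x, y, z, R=0.7, r=0.3):
    """Degree-4 polynomial whose zero set is the torus with radii R and r."""
    return (x**2 + y**2 + z**2 + R**2 - r**2) ** 2 - 4.0 * R**2 * (x**2 + y**2)


def sample_torus(count, R=0.7, r=0.3, seed=0):
    """Sample points on the torus via two independent uniform angles.

    Assumption: uniform angles, which is NOT uniform in surface area.
    """
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 2.0 * np.pi, count)  # angle around the z-axis
    v = rng.uniform(0.0, 2.0 * np.pi, count)  # angle around the tube
    x = (R + r * np.cos(v)) * np.cos(u)
    y = (R + r * np.cos(v)) * np.sin(u)
    z = r * np.sin(v)
    return np.column_stack([x, y, z])


if __name__ == "__main__":
    points = sample_torus(1000)
    residual = np.max(np.abs(torus_level_set(*points.T)))
    print("max |Q| on sampled torus points:", residual)
```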

function                          approximation error (MIP)
$Q_{\mathbb{T}^2_{R,r}}$          6.4358e-15
$\nabla Q_{\mathbb{T}^2_{R,r}}$   5.3956e-13

Figure 7. Uniformly random nodes on the torus $\mathbb{T}^2_{R,r}$ with $R = 0.7$ and $r = 0.3$ (left), approximation errors of the level-set function $Q_{\mathbb{T}^2_{R,r}}$ and its gradient $\nabla Q_{\mathbb{T}^2_{R,r}}$ (table), and approximation error for the restriction $f|_{\mathbb{T}^2_{R,r}}$ of the Runge function to the torus (right).

However, by Remark 6.5 and Theorem 6.3 the splitting
\[ \Pi_A \cong \Pi_A / m_M + m_M =: \Pi_{A,\mathbb{T}^2} + \Pi^\perp_{A,\mathbb{T}^2} \]
holds with probability 1. For $A = A_{3,4,2}$, we have $\dim \Pi^\perp_{A,\mathbb{T}^2} = 1$, and by computing a basis $\mu \in \Pi_A$ of $\Pi^\perp_{A,\mathbb{T}^2}$ as in Eq. (6.2) we have determined a level-set function $Q_{\mathbb{T}^2_{R,r}} = \mu$, i.e., $\mathbb{T}^2_{R,r} = Q^{-1}_{\mathbb{T}^2_{R,r}}(0)$.

Further, considering $R_A = (L_\alpha(p_i))_{(i,\alpha) \in S \times A}$, where $L_\alpha$, $\alpha \in A$, denote the Lagrange polynomials w.r.t. $P_A$, we find the Lagrange coefficients $C_{\mathrm{Lag}}$ of the interpolant $Q_{f,A}|_{\mathbb{T}^2_{R,r}}$ of $f|_{\mathbb{T}^2_{R,r}}$ for any $f : \Omega \longrightarrow \mathbb{R}$ by solving
\[ R_A C_{\mathrm{Lag}} \approx F , \qquad F = (f(p_1), \dots, f(p_S)) \in \mathbb{R}^S , \quad p_i \in P_{\mathbb{T}^2_{R,r}} , \ 1 \le i \le S , \]
using standard MATLAB regression.

The results of this experiment are shown in Fig. 7. All errors are measured on the $S$ nodes $P_{\mathbb{T}^2_{R,r}}$, plus 200 additional uniformly random points $P \subseteq \mathbb{T}^2_{R,r}$, for 10 independent repetitions in each case. The inset table in Fig. 7 shows that the approximation errors for the level-set function $Q_{\mathbb{T}^2_{R,r}}$ and its gradient $\nabla Q_{\mathbb{T}^2_{R,r}}$, corresponding to the surface normal, are within machine precision. The right panel of Fig. 7 shows the mean approximation errors with min-max error bars for interpolating the restriction $f_R|_{\mathbb{T}^2_{R,r}}$ of the Runge function $f(x) = 1/(1 + 10 \| x \|^2)$ to the torus, measured analogously.

The results suggest that the notion of the matrix $R_A$ defined with respect to the unisolvent grid $P_A$ is the main reason for the fast approximation rate on the Runge function. In particular, we used the pre-computation approach mentioned in Remark 4.4 to execute the interpolations in just seconds. Implicitly, we thereby also validate the numerical stability of the pre-computation approach.
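The idea of recovering a level-set polynomial from nodes that are not unisolvent can be illustrated on a much smaller example. The sketch below is a simplified stand-in and not the paper's construction from Eq. (6.2): it uses the unit circle in 2D instead of the torus, a monomial basis instead of the Lagrange basis on $P_A$, and the singular value decomposition to find the polynomial of degree at most 2 that vanishes on the sampled points. All names are illustrative.

```python
import numpy as np

# Monomial exponents (a, b) of x^a * y^b for total degree <= 2 (assumed basis).
EXPONENTS = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]


def circle_points(count, seed=0):
    """Random points on the unit circle."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, 2.0 * np.pi, count)
    return np.column_stack([np.cos(t), np.sin(t)])


def vanishing_polynomial(points, exponents):
    """Coefficients of a polynomial vanishing on the points, plus singular values.

    The evaluation matrix V has entries V[i, j] = monomial_j(point_i). A
    polynomial vanishes on all points iff its coefficient vector lies in the
    null space of V, which is spanned by the right singular vector belonging
    to the (numerically) zero singular value.
    """
    V = np.stack(
        [np.prod(points ** np.array(e), axis=1) for e in exponents], axis=1
    )
    _, singular_values, vt = np.linalg.svd(V)
    return vt[-1], singular_values


if __name__ == "__main__":
    coeffs, sing = vanishing_polynomial(circle_points(20), EXPONENTS)
    # Normalize so that the x^2 coefficient equals 1.
    print(coeffs / coeffs[3])
    print("smallest singular values:", sing[-2:])
```

After normalization, the coefficient vector corresponds to $x^2 + y^2 - 1$, the analogue of the level-set function $\mu$ above. The rank deficiency of the evaluation matrix is the numerical signature of nodes that are not unisolvent.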

We are not aware of any other approach that can handle such an interpolation task without requiring a triangulation or parametrization of the manifold $\mathbb{T}^2_{R,r}$. Therefore, these results demonstrate the flexibility of MIP and suggest its use in numerical methods on curved surfaces, including quadrature schemes, ODE & PDE solvers, and optimization algorithms.

9. Conclusion

We have introduced a notion of unisolvent nodes for polynomial interpolation in arbitrary dimensions with respect to a generalized concept of polynomial degree, i.e., $A = A_{m,n,p}$. This allowed us to generalize the classic 1D Newton and Lagrange interpolation methods to multivariate schemes in a numerically stable and efficient way, resulting in the algorithm MIP, which possesses $\mathcal{O}(|A|^2)$ runtime complexity and $\mathcal{O}(|A|)$ storage complexity. We also provided the theory for generalizing the approach to interpolation of scattered data and on curved manifolds, e.g., the torus $\mathbb{T}^2$.

By characterizing the space of functions that can be approximated by polynomial interpolation to be the space of Sobolev functions $H^k(\Omega, \mathbb{R})$ with $k > m/2$, we have proven the genericity of the approach. Further, we validated that the resulting algorithm MIP reaches the optimal approximation rate given by Lloyd N. Trefethen's Theorem [90]. In contrast to previous approaches, such as Chebfun [32], multivariate splines [26], and Floater-Hormann interpolation [38], MIP achieves exponential approximation rates for the Runge function using only sub-exponentially many interpolation nodes. This suggests that we have found an efficient approximation scheme that overcomes the curse of dimensionality for a generic class of functions.

In closing, we discuss related concepts and possible further developments.
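To make the node counts behind this claim tangible, the following sketch enumerates a multi-index set by brute force and compares its size for $p = 1, 2, \infty$. It assumes the reading $A_{m,n,p} = \{ \alpha \in \mathbb{N}^m : \| \alpha \|_p \le n \}$ of the definition given earlier in the paper; if the paper's convention differs in detail, the counts will differ accordingly. The enumeration costs $(n+1)^m$ steps and is only meant for small examples.

```python
import itertools


def multi_index_set(m, n, p):
    """All alpha in N^m with l_p-norm <= n (assumed definition of A_{m,n,p}).

    p must be a positive integer or float('inf'). Integer arithmetic is used
    for finite p, so membership is decided exactly.
    """
    indices = []
    for alpha in itertools.product(range(n + 1), repeat=m):
        if p == float("inf"):
            inside = max(alpha) <= n  # always true on this grid
        else:
            inside = sum(a**p for a in alpha) <= n**p
        if inside:
            indices.append(alpha)
    return indices


if __name__ == "__main__":
    for m in (2, 3, 4):
        sizes = {p: len(multi_index_set(m, 10, p)) for p in (1, 2, float("inf"))}
        print(m, sizes)
```

The $l_\infty$ count equals $(n+1)^m$, while the $l_1$ and $l_2$ counts grow more slowly with the dimension, which is the effect exploited by the $\mathcal{O}(|A|^2)$ runtime.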

9.1. Barycentric Lagrange interpolation. In 1D, barycentric Lagrange interpolation is the most efficient interpolation scheme [8] for fixed nodes. Both determining the interpolant $Q_{f,n}$ and evaluating $Q_{f,n}$ at any $x \in \mathbb{R}$ require linear time $\mathcal{O}(n)$. This is achieved by precomputing the constant barycentric weights that only depend on the locations of the nodes, but not on the function $f$. We have already established preliminary theoretical results toward generalizing this approach to $m$D for the case of $l_1$-degree [84]. Our generalization is based on the observation that the transformation matrices $NL_A$, $LN_A$, $CN_A$, $NC_A$ from Theorem 4.3 are sparse, but structured. This structure has not been exploited so far. It is known that some structured matrices can be inverted and multiplied much faster than in the general case [35, 46]. This suggests that the current complexity $\mathcal{O}(|A|^2)$ of interpolation and evaluation for the case of multi-indices $A = A_{m,n,p}$, $p > 1$, can still be reduced significantly. In practice, efficient interpolation in dimensions $m \ge 6$ might become possible, as is required, for instance, when considering phase spaces of 3D dynamical systems.
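For reference, the 1D scheme of [8] can be sketched in a few lines. The example below uses the Chebyshev nodes of 2nd kind, for which the barycentric weights are known in closed form (alternating signs, halved at the two end points), and applies the barycentric formula to the 1D Runge function. It is a 1D illustration only and not the multivariate generalization discussed above; the function names are ours.

```python
import numpy as np


def cheb2_nodes(n):
    """Chebyshev nodes of 2nd kind x_k = cos(k*pi/n), k = 0..n."""
    return np.cos(np.pi * np.arange(n + 1) / n)


def cheb2_weights(n):
    """Barycentric weights for the Chebyshev nodes of 2nd kind, see [8]."""
    w = (-1.0) ** np.arange(n + 1)
    w[0] *= 0.5
    w[-1] *= 0.5
    return w


def barycentric_eval(nodes, weights, values, x):
    """Evaluate the interpolant at the points x with the barycentric formula."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    diff = x[:, None] - nodes[None, :]
    exact = diff == 0.0
    diff[exact] = 1.0  # avoid division by zero, fixed up below
    terms = weights / diff
    result = (terms @ values) / terms.sum(axis=1)
    rows, cols = np.nonzero(exact)
    result[rows] = values[cols]  # x coincides with a node: return the datum
    return result


if __name__ == "__main__":
    runge = lambda t: 1.0 / (1.0 + 10.0 * t**2)
    x_test = np.linspace(-1.0, 1.0, 1001)
    for n in (20, 60, 120):
        nodes = cheb2_nodes(n)
        approx = barycentric_eval(nodes, cheb2_weights(n), runge(nodes), x_test)
        print(n, np.max(np.abs(approx - runge(x_test))))
```

Each evaluation point costs $\mathcal{O}(n)$ operations once the weights are known, which is the property one would like to carry over to the multivariate setting.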

9.2. Multivariate polynomial regression. For a given function f :Ω R and −→ set of nodes P Ω = [ 1, 1]m, m N we consider the graph G = f(P ) P Rm+1. Corollary 6.4 and⊆ Experiment− 3∈ allow identifying the hypersurface M× ⊆G that n+1 dim M⊇ 1 contains G as a level set of polynomials QM,i Πm+1, M = i=1 − QM,i− (0) ∈ m+1 ∩ and to ﬁt the restriction g M of any function g : R R to M. We consider both aspects as crucial steps| toward developing a multivariate−→ polynomial regression MULTIVARIATE INTERPOLATION ON UNISOLVENT NODES 35 scheme, including applications in multivariate analysis & statistics, and topological analysis [4, 33, 39, 68, 67, 83, 94].

9.3. Trigonometric interpolation. In 1D, trigonometric Clairaut-Lagrange-Gauss interpolation can be used to compute the discrete Fourier transform (DFT) of a periodic function f [56]. This fact was for example used in the development of the famous Cooley-Tukey algorithm [20], which re-invented an algorithm by Carl F. Gauss [44] to yield a modern realization of the Fast Fourier Transform. These concepts are also closely related to the invention of wavelets [87]. Revisiting these aspects from the perspective of multivariate Lagrange interpolation might be worthwhile to make progress in open problems, such as Fourier transformation of non-periodic, highly oscillating signals or fast Fourier transform on scattered nodes [6, 47, 55].

9.4. Numerical integration. Until today, the classic Gauss quadrature formula is the best approach to approximating integrals $I_{\mathrm{Gauss}}(f) \approx \int_\Omega f(x)\, dx$ in one variable [43, 63]. Many contributions toward extending this approach to higher dimensions have been made [21, 22, 49, 89]. This list is by no means exhaustive, and research in this direction is actively ongoing. The present notion of unisolvent nodes might be helpful in this endeavor.
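As a point of reference for the 1D case, Gauss-Legendre quadrature on $[-1, 1]$ is available in NumPy through `numpy.polynomial.legendre.leggauss`. The sketch below applies it to the 1D Runge function, whose integral is known in closed form; the wrapper name `gauss_legendre` is ours.

```python
import numpy as np


def gauss_legendre(f, n):
    """Approximate the integral of f over [-1, 1] with n Gauss-Legendre nodes."""
    nodes, weights = np.polynomial.legendre.leggauss(n)
    return float(np.dot(weights, f(nodes)))


if __name__ == "__main__":
    runge = lambda t: 1.0 / (1.0 + 10.0 * t**2)
    # Closed form: integral of 1/(1 + 10 t^2) over [-1, 1].
    exact = 2.0 / np.sqrt(10.0) * np.arctan(np.sqrt(10.0))
    for n in (5, 10, 20, 40):
        print(n, abs(gauss_legendre(runge, n) - exact))
```

An $n$-point Gauss-Legendre rule integrates polynomials up to degree $2n - 1$ exactly, which explains the fast error decay for analytic integrands.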

Acknowledgements

Christian L. Mueller, Leslie Greengard, Alex Barnett, Manas Rachh, Uwe Hernandez Acosta, and Nico Hoffmann are gratefully acknowledged for their inspiring hints and helpful discussions. Further, we are grateful to Michael Bussmann and thank the whole CASUS institute (Görlitz, Germany) for hosting stimulating workshops on the subject.

Appendix A. Sobolev theory for periodic functions

This brief summary of Sobolev theory is a rewritten and corrected version of aspects discussed in our previous work [54]. We recommend [2] as an excellent overview of Sobolev theory and [53, 61] for specific aspects deduced here. Let $\Omega = [-1, 1]^m$, $k, m \in \mathbb{N}$, and the spaces $C^k(\Omega, R)$, $L^2(\Omega, R)$, $H^k(\Omega, R)$, $R = \mathbb{R}, \mathbb{C}$, with associated norms $\| \cdot \|_{C^k(\Omega)}$, $\| \cdot \|_{L^2(\Omega)}$, $\| \cdot \|_{H^k(\Omega)}$ and inner products $\langle \cdot, \cdot \rangle_{L^2(\Omega)}$, $\langle \cdot, \cdot \rangle_{H^k(\Omega)}$ be as introduced in Section 7.2, respectively.

We denote by $\mathbb{T}^m = \mathbb{R}^m / 2\mathbb{Z}^m$ the torus with fundamental domain $\Omega$ and call a function $f : \mathbb{R}^m \longrightarrow R$ periodic if and only if $f(x + 2\mathbb{Z}^m) = f(x)$, i.e., $f : \mathbb{T}^m \longrightarrow R$ is well defined. Since $C^\infty(\Omega, R) \subseteq H^k(\Omega, R)$ is dense [2], $H^k(\mathbb{T}^m, R)$ can be defined as the completion of the space of all smooth periodic functions
\[ H^k(\mathbb{T}^m, R) = \overline{C^\infty_{\mathrm{period}}(\Omega, R)}^{\| \cdot \|_{H^k(\Omega)}} . \]
Thus, $H^k(\mathbb{T}^m, R) \subseteq H^{k'}(\mathbb{T}^m, R)$ for all $k \ge k' \ge 0$ with $H^0(\mathbb{T}^m, R) = L^2(\mathbb{T}^m, R)$. Consequently, every $f \in H^k(\mathbb{T}^m, R)$ can be expanded in a Fourier series
\[ f(x) = \sum_{\beta \in \mathbb{N}^m} c_\beta e^{\pi i \langle \beta, x \rangle} , \quad c_\beta \in \mathbb{C} \tag{A.1} \]
almost everywhere, i.e., Eq. (A.1) is violated only on a set $\Omega' \subseteq \Omega$ of Lebesgue measure zero. Since the $e^{\pi i \langle \beta, x \rangle}$ are an orthogonal $L^2$-basis, the norm of $f$ is given by

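\[ \| f \|^2_{H^k(\Omega)} = \sum_{\alpha \in A_{m,k,1}} \langle \partial^\alpha f, \partial^\alpha f \rangle_{L^2(\Omega)} = \sum_{\alpha \in A_{m,k,1}} \sum_{\beta \in \mathbb{N}^m} \pi^{\| \alpha \|_1} \beta^\alpha | c_\beta |^2 , \tag{A.2} \]
where $\beta^\alpha = \beta_1^{\alpha_1} \cdots \beta_m^{\alpha_m}$. Due to the Sobolev and Rellich-Kondrachov Embedding Theorem [2], we have that whenever $k > m/2$ then $H^k(\mathbb{T}^m, \mathbb{R}) \subseteq C^0(\mathbb{T}^m, \mathbb{R})$ and the embedding
\[ i : H^k(\mathbb{T}^m, \mathbb{R}) \hookrightarrow C^0(\mathbb{T}^m, \mathbb{R}) \]
is well defined, continuous, and compact. Thus, for $m \in \mathbb{N}$ there exists a constant $c = c(m, \Omega) \in \mathbb{R}^+$ such that
\[ \| f \|_{C^0(\Omega)} \le c \| f \|_{H^k(\Omega)} . \]
By the Trace Theorem [2], we observe furthermore that whenever $H \subseteq \mathbb{R}^m$ is a hyperplane of co-dimension 1, then the induced restriction
\[ \varrho : H^k(\Omega, \mathbb{R}) \longrightarrow H^{k-1/2}(\Omega \cap H, \mathbb{R}) \tag{A.3} \]
is continuous, i.e., $\| f|_{\Omega \cap H} \|_{H^{k-1/2}(\Omega \cap H)} \le d \| f \|_{H^k(\Omega)}$ for some $d = d(m, \Omega) \in \mathbb{R}^+$. We consider
\[ \Theta_{m,n,1} = \Big\{ f \in H^k(\mathbb{T}^m, \mathbb{R}) \ \Big|\ f(x) = \sum_{\alpha \in A_{m,n,1}} c_\alpha e^{\pi i \langle \alpha, x \rangle} , \ c_\alpha \in \mathbb{C} \Big\} , \]
the space of all finite Fourier series of bounded frequencies, and denote by
\[ \theta_n : H^k(\mathbb{T}^m, \mathbb{R}) \longrightarrow \Theta_{m,n,1} \subseteq C^0(\mathbb{T}^m, \mathbb{R}) , \qquad \tau_n : H^k(\mathbb{T}^m, \mathbb{R}) \longrightarrow \Pi_{m,n,1} \subseteq C^0(\Omega, \mathbb{R}) \tag{A.4} \]
the corresponding projections onto $\Theta_{m,n,1}$ and onto the space of polynomials $\Pi_{m,n,1}$ of $l_1$-degree bounded by $n$. Further, we denote by $\theta_n^\perp = I - \theta_n$, $\tau_n^\perp = I - \tau_n$ the complementary projections.

Lemma A.1. Let $k, m \in \mathbb{N}$, $k > m/2$, and $f \in H^k(\mathbb{T}^m, \mathbb{R})$. Then:
i) $\| f \|_{C^0(\Omega)} \le \| f \|_{H^k(\Omega)}$ for all $f \in H^k(\mathbb{T}^m, \mathbb{R})$.
ii) $\| \theta_n^\perp(f) \|_{C^0(\Omega)} \in o\big( (m/n)^k \big)$. More precisely:
\[ (\pi n / m)^k \| \theta_n^\perp(f) \|_{C^0(\Omega)} \xrightarrow[n \to \infty]{} 0 \quad \text{for every } m \in \mathbb{N} . \]
iii) For $q, n \in \mathbb{N}$, the operator norm of $\tau_n^\perp \theta_q : H^k(\mathbb{T}^m, \mathbb{R}) \longrightarrow C^0(\mathbb{T}^m, \mathbb{R})$ is bounded by
\[ \| \tau_n^\perp \theta_q \| \le N(m, q)^{1/2} \frac{(\pi^2 q)^{n+1}}{(n+1)!} , \qquad N(m, q) = \binom{m+q}{q} . \]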

Proof. To show i), we approximate $f$ by a finite Fourier series, i.e., we assume that $f(x) = \theta_n(f) = \sum_{\beta \in A_{m,n,1}} c_\beta e^{\pi i \langle \beta, x \rangle}$. Due to Eq. (A.2) we compute
\[ \| f \|^2_{C^0(\Omega)} = \sup_{x \in \Omega} \Big| \sum_{\beta \in A_{m,n,1}} c_\beta e^{\pi i \langle \beta, x \rangle} \Big|^2 \le \sum_{\beta \in A_{m,n,1}} | c_\beta |^2 = \| f \|^2_{L^2(\Omega)} \le \| f \|^2_{H^k(\Omega)} . \]
Since $\theta_n(f) \xrightarrow[n \to \infty]{} f$ and the norm $\| \cdot \|_{H^k(\Omega)}$ is continuous, we have proven i).

To show ii) we assume

πi β,x θ⊥(f)(x) = cβe h i , cβ C n ∈ β >n k Xk1 and consider again Eq. (A.2). Observe that for α Am,n,1 with α 1 = k and m α α k ∈ k k β N with β 1 n, we have π β (nπ/m) . By i) and the continuity of the ∈ k k ≥ ≥ norms k , 0 we can estimate k · kH (Ω) k · kC (Ω) k (πn/m) τn⊥(f) C0(Ω) τn⊥(f) Hk(Ω) 0 , n || || ≤ || || −−−−→→∞ which shows ii). To show iii), we assume that θq(f) is given by πi β,x θq(f)(x) = cβe h i , cβ C ∈ β A ∈Xm,q,1 iπ β,x and that f Hk(Ω) 1. We expand e h i in a Taylor series around 0. Thus, for || m|| ≤ every x R there exists ξx,β Ω such that ∈ ∈ n h h n+1 iπ β,ξx,β iπ β,x ∞ (iπ β, x ) (iπ β, x ) ∂v e h i n+1 e h i = h i = h i + (iπ β, x ) , h! h! (n + 1)! h i hX=0 hX=0 m where ∂v denotes the partial derivative in direction v = x/ x R and the second term is the Lagrange remainder, see for instance [58]. Hencek k ∈

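\[
\tau_n^\perp(\theta_q(f))(x) = \sum_{\beta \in A_{m,q,1}} c_\beta \, \frac{\partial_v^{n+1} e^{i\pi \langle \beta, \xi_{x,\beta} \rangle}}{(n+1)!} \, (i\pi \langle \beta, x \rangle)^{n+1} . \tag{A.5}
\]
We further observe that
\[
\partial_v^{n+1} e^{i\pi \langle \beta, \xi_{x,\beta} \rangle} = \big\langle \nabla \partial_v^{n} e^{i\pi \langle \beta, \xi_{x,\beta} \rangle} , v \big\rangle = i\pi \langle \beta, v \rangle \, \partial_v^{n} e^{i\pi \langle \beta, \xi_{x,\beta} \rangle} = (i\pi \langle \beta, v \rangle)^{n+1} e^{i\pi \langle \beta, \xi_{x,\beta} \rangle} .
\]
Since $|\langle \beta, v \rangle|, |\langle \beta, x \rangle| \le \|\beta\|_1$ we estimate
\[
\left| \frac{\partial_v^{n+1} e^{i\pi \langle \beta, \xi_{x,\beta} \rangle}}{(n+1)!} \, (i\pi \langle \beta, x \rangle)^{n+1} \right| \le \frac{\big( \pi^2 \, |\langle \beta, v \rangle| \, |\langle \beta, x \rangle| \big)^{n+1}}{(n+1)!} \le \pi^{2n+2} \, \frac{\|\beta\|_1^{n+1}}{(n+1)!} .
\]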

Combining this bound with Eq. (A.5) yields

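\begin{align*}
\|\tau_n^\perp \theta_q(f)\|^2_{C^0(\Omega)} &\le \pi^{4n+4} \sum_{\beta \in A_{m,q,1}} |c_\beta|^2 \left( \frac{\|\beta\|_1^{n+1}}{(n+1)!} \right)^2 \\
&\le \pi^{4n+4} \sum_{\beta \in A_{m,q,1}} |c_\beta|^2 \sum_{\beta \in A_{m,q,1}} \left( \frac{q^{n+1}}{(n+1)!} \right)^2 \\
&\le \pi^{4n+4} \sum_{\beta \in A_{m,q,1}} \left( \frac{q^{n+1}}{(n+1)!} \right)^2 ,
\end{align*}
where we used $\sum_{\beta \in A_{m,q,1}} |c_\beta|^2 = \|\theta_q(f)\|^2_{L^2(\Omega)} \le \|\theta_q(f)\|^2_{H^k(\Omega)} \le \|f\|^2_{H^k(\Omega)} \le 1$ for the last step. Thus,
\[
\|\tau_n^\perp \theta_q(f)\|_{C^0(\Omega)} \le \pi^{2n+2} \, |A_{m,q,1}|^{1/2} \, \frac{q^{n+1}}{(n+1)!} = \frac{(\pi^2 q)^{n+1}}{(n+1)!} \, N(m,q)^{1/2} ,
\]
proving iii).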
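The Taylor step of the proof can be sanity-checked numerically. The Python sketch below tests the classical Lagrange remainder estimate $|e^{it} - \sum_{h=0}^{n} (it)^h/h!| \le |t|^{n+1}/(n+1)!$ for real $t$, applied with $t = \pi \langle \beta, x \rangle$ and $|t| \le \pi \|\beta\|_1$ on sample points of $[-1,1]^m$ (assumed here to be the domain $\Omega$). The function names and the random sampling are hypothetical illustrations; the check concerns only the expansion of the exponential, not the constant $\pi^{2n+2}$ used in the proof.

```python
import cmath
import math
import random


def taylor_remainder(t, n):
    """|exp(i t) - sum_{h=0}^{n} (i t)^h / h!| for real t."""
    partial = sum((1j * t) ** h / math.factorial(h) for h in range(n + 1))
    return abs(cmath.exp(1j * t) - partial)


def worst_remainder(beta, n, trials=500, seed=0):
    """Largest sampled remainder over x in [-1, 1]^m, and the classical bound."""
    rng = random.Random(seed)
    l1 = sum(abs(b) for b in beta)
    bound = (math.pi * l1) ** (n + 1) / math.factorial(n + 1)
    worst = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in beta]
        t = math.pi * sum(b * xi for b, xi in zip(beta, x))
        worst = max(worst, taylor_remainder(t, n))
    return worst, bound


# Sampled remainder versus bound for beta = (2, 1) and increasing degree n.
for n in (5, 15, 30):
    print(n, worst_remainder((2, 1), n))
```

Random sampling only gives a lower estimate of the true supremum, so the comparison can confirm the bound on the sampled points but cannot show that it is sharp.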

References

[1] Milton Abramowitz, Irene A Stegun, and Robert H Romer. Handbook of mathematical functions with formulas, graphs, and mathematical tables, 1988.
[2] Robert A Adams and John JF Fournier. Sobolev spaces, volume 140. Academic Press, 2003.
[3] Ben Adcock, Simone Brugiapaglia, and Clayton G Webster. Compressed sensing approaches for polynomial approximation of high-dimensional functions. In Compressed Sensing and its Applications, pages 93–124. Springer, 2017.
[4] Theodore W Anderson. An introduction to multivariate statistical analysis. Inc., New York, 54, 1958.
[5] Vladimir I Arnold. Huygens and Barrow, Newton and Hooke: Pioneers in mathematical analysis and catastrophe theory from evolvents to quasicrystals. Springer Science & Business Media, 1990.
[6] Alexander H Barnett, Jeremy Magland, and Ludvig af Klinteberg. A parallel nonuniform fast Fourier transform library based on an "exponential of semicircle" kernel. SIAM Journal on Scientific Computing, 41(5):C479–C504, 2019.
[7] Serge Bernstein. Sur l'ordre de la meilleure approximation des fonctions continues par des polynômes de degré donné, volume 4. Hayez, imprimeur des académies royales, 1912.
[8] Jean-Paul Berrut and Lloyd N Trefethen. Barycentric Lagrange interpolation. SIAM Review, 46(3):501–517, 2004.
[9] L. Bos, S. De Marchi, and M. Vianello. Polynomial approximation on Lissajous curves in the d-cube. arXiv:1502.04114, 2015.
[10] Len Bos and Norm Levenberg. Bernstein–Walsh theory associated to convex bodies and applications to multivariate approximation theory. Computational Methods and Function Theory, 18(2):361–388, 2018.
[11] Michael Breuß, Friedemann Kemm, and Oliver Vogel. A numerical study of Newton interpolation with extremely high degrees. arXiv preprint arXiv:1609.08839, 2016.
[12] Cl Brezinski. The Mühlbach-Neville-Aitken algorithm and some extensions. BIT Numerical Mathematics, 20(4):443–451, 1980.
[13] L Brutman. On the Lebesgue function for polynomial interpolation. SIAM Journal on Numerical Analysis, 15(4):694–704, 1978.
[14] Lev Brutman. Lebesgue functions for polynomial interpolation – a survey. Annals of Numerical Mathematics, 4:111–128, 1996.
[15] Abdellah Chkifa, Albert Cohen, Giovanni Migliorati, Fabio Nobile, and Raul Tempone. Discrete least squares polynomial approximation with random evaluations – application to parametric and stochastic elliptic PDE's. ESAIM: Mathematical Modelling and Numerical Analysis, 49(3):815–837, 2015.
[16] Emiliano Cirillo, Kai Hormann, and Jean Sidon. Convergence rates of derivatives of Floater–Hormann interpolants for well-spaced nodes. Applied Numerical Mathematics, 116:108–118, 2017.
[17] Albert Cohen and Abdellah Chkifa. On the stability of polynomial interpolation using hierarchical sampling. In Sampling theory, a renaissance, pages 437–458. Springer, 2015.
[18] Albert Cohen, Ronald Devore, and Christoph Schwab. Analytic regularity and polynomial approximation of parametric and stochastic elliptic PDE's. Analysis and Applications, 9(01):11–47, 2011.
[19] Albert Cohen and Giovanni Migliorati. Multivariate approximation in downward closed polynomial spaces. In Contemporary Computational Mathematics-A celebration of the 80th birthday of Ian Sloan, pages 233–282. Springer, 2018.
[20] James W Cooley and John W Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19(90):297–301, 1965.
[21] Ronald Cools. Advances in multidimensional integration. Journal of Computational and Applied Mathematics, 149(1):1–12, 2002.
[22] Ronald Cools and Philip Rabinowitz. Monomial cubature rules since "Stroud": a compilation. Journal of Computational and Applied Mathematics, 48(3):309–326, 1993.
[23] Carl De Boor. On calculating with B-splines. Journal of Approximation Theory, 6(1):50–62, 1972.
[24] Carl De Boor. Efficient computer manipulation of tensor products. Technical report, Wisconsin Univ. Madison Mathematics Research Center, 1977.

[25] Carl De Boor. On wings of splines. In Creative Minds, Charmed Lives: Interviews at Institute for Mathematical Sciences, National University of Singapore, pages 50–57. World Scientific, 2010.
[26] Carl De Boor. A practical guide to splines, volume 27. Springer-Verlag New York, 1978.
[27] Carl De Boor and Klaus Höllig. Approximation power of smooth bivariate pp functions. Mathematische Zeitschrift, 197(3):343–363, 1988.
[28] Carl De Boor and Amos Ron. On multivariate polynomial interpolation. Constructive Approximation, 6(3):287–302, 1990.
[29] Carl De Boor and Amos Ron. Computational aspects of polynomial interpolation in several variables. Mathematics of Computation, 58(198):705–727, 1992.
[30] Louis De Branges. The Stone-Weierstrass Theorem. Proceedings of the American Mathematical Society, 10(5):822–824, 1959.
[31] Jean Dieudonné and Alexandre Grothendieck. Éléments de géométrie algébrique. 1971.
[32] Tobin A Driscoll, Nicholas Hale, and Lloyd N Trefethen. Chebfun guide, 2014.
[33] Herbert Edelsbrunner and John Harer. Persistent homology – a survey. Contemporary Mathematics, 453:257–282, 2008.
[34] H Ehlich and K Zeller. Auswertung der Normen von Interpolationsoperatoren. Mathematische Annalen, 164(2):105–112, 1966.
[35] Y Eidelman and I Gohberg. Linear complexity inversion algorithms for a class of structured matrices. Integral Equations and Operator Theory, 35(1):28–52, 1999.
[36] W. Erb, C. Kaethner, P. Denker, and M. Ahlborg. A survey on bivariate Lagrange interpolation on Lissajous nodes. Dolomites Research Notes on Approximation, 8:23–36.
[37] Georg Faber. Über die interpolatorische Darstellung stetiger Funktionen. Jber. Deutsch. Math. Verein, 23:192–210, 1914.
[38] Michael S Floater and Kai Hormann. Barycentric rational interpolation with no poles and high rates of approximation. Numerische Mathematik, 107(2):315–331, 2007.
[39] Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning, volume 1. Springer series in statistics New York, 2001.
[40] M. Gasca and J. I. Maeztu. On Lagrange and Hermite interpolation in $\mathbb{R}^k$. Numerische Mathematik, 39(1):1–14, 1982.
[41] Mariano Gasca and Thomas Sauer. Polynomial interpolation in several variables. Advances in Computational Mathematics, 12(4):377, 2000.
[42] Simen Gaure. Usage notes for package chebpol, https://cran.r-project.org/package=chebpol. 2018.
[43] Carl Friedrich Gauss. Methodus nova integralium valores per approximationem inveniendi, (1814). Werke, 3:165–196.
[44] Carl Friedrich Gauss. Nachlass: Theoria interpolationis methodo nova tractata. Carl Friedrich Gauss Werke, 3:265–327, 1866.
[45] Walter Gautschi. Numerical analysis. Springer Science & Business Media, 2011.
[46] I Gohberg and V Olshevsky. Complexity of multiplication with vectors for structured matrices. Linear Algebra and its Applications, 202:163–192, 1994.
[47] Leslie Greengard and June-Yub Lee. Accelerating the nonuniform fast Fourier transform. SIAM Review, 46(3):443–454, 2004.
[48] R. B. Guenther and E. L. Roetman. Some observations on interpolation in higher dimensions. Math. Comp., 24:517–522, 1970.
[49] Preston C Hammer and Arthur H Stroud. Numerical integration over simplexes. Mathematical Tables and Other Aids to Computation, 10(55):137–139, 1956.
[50] Robin Hartshorne. Algebraic geometry, volume 52. Springer Science & Business Media, 2013.
[51] Allen Hatcher. Vector bundles and K-theory. http://www.math.cornell.edu/~hatcher, 2003.
[52] M Hecht, Bevan L. Cheeseman, Karl B. Hoffmann, and Ivo F. Sbalzarini. A quadratic-time algorithm for general multivariate polynomial interpolation. arXiv preprint arXiv:1710.10846, 2017.
[53] Michael Hecht. Isomorphic chain complexes of Hamiltonian dynamics on tori. Journal of Fixed Point Theory and Applications, 14(1):165–221, 2013.
[54] Michael Hecht, Karl B. Hoffmann, Bevan L Cheeseman, and Ivo F Sbalzarini. Multivariate Newton interpolation. arXiv preprint arXiv:1812.04256, 2018.

[55] Michael Hecht and Ivo F. Sbalzarini. Fast interpolation and Fourier transform in high-dimensional spaces. In K. Arai, S. Kapoor, and R. Bhatia, editors, Intelligent Computing. Proc. 2018 IEEE Computing Conf., Vol. 2, volume 857 of Advances in Intelligent Systems and Computing, pages 53–75, London, UK, 2018. Springer Nature.
[56] Michael T Heideman, Don H Johnson, and C Sidney Burrus. Gauss and the history of the fast Fourier transform. Archive for History of Exact Sciences, pages 265–277, 1985.
[57] Jan S Hesthaven. From electrostatics to almost optimal nodal sets for polynomial interpolation in a simplex. SIAM Journal on Numerical Analysis, 35(2):655–676, 1998.
[58] Harro Heuser. Lehrbuch der Analysis. Springer-Verlag, 2013.
[59] N Hoang. On node distributions for interpolation and spectral methods. Mathematics of Computation, 85(298):667–692, 2016.
[60] Helmut Hofer and Eduard Zehnder. Symplectic invariants and Hamiltonian dynamics. In The Floer memorial volume, pages 525–544. Springer, 1995.
[61] Helmut Hofer and Eduard Zehnder. Symplectic invariants and Hamiltonian dynamics. Birkhäuser, 2012.
[62] Dunham Jackson. On the accuracy of trigonometric interpolation. Transactions of the American Mathematical Society, 14(4):453–461, 1913.
[63] Carl Gustav Jakob Jacobi. Über Gauss neue Methode, die Werte der Integrale näherungsweise zu finden. Journal für die reine und angewandte Mathematik, 1826(1):301–308, 1826.
[64] J. Jost. Partial Differential Equations. New York: Springer-Verlag, 2002.
[65] A. Le Méhauté. On some aspects of multivariate polynomial interpolation. Advances in Computational Mathematics, 12(4):311–333, 2000.
[66] Franciszek Leja. Sur certaines suites liées aux ensembles plans et leur application à la représentation conforme. In Annales Polonici Mathematici, volume 1, pages 8–13, 1957.
[67] KV Mardia, JT Kent, and JM Bibby. Multivariate Statistics. Academic Press Inc, London LTD (1979).
[68] KV Mardia, JT Kent, and JM Bibby. Multivariate Analysis, volume 15. Academic Press Inc, London LTD (1979), 1979.
[69] John H McCabe and George M Phillips. On a certain class of Lebesgue constants. BIT Numerical Mathematics, 13(4):434–442, 1973.
[70] E. Meijering. A chronology of interpolation: From ancient astronomy to modern signal and image processing. Proceedings of the IEEE, 90(3):319–342, March 2002.
[71] Jannik Michelfeit. multivar_horner: A Python package for computing Horner factorisations of multivariate polynomials. Journal of Open Source Software, 5(54):2392, 2020.
[72] Giovanni Migliorati. Adaptive polynomial approximation by means of random discrete least squares. In Numerical Mathematics and Advanced Applications-ENUMATH 2013, pages 547–554. Springer, 2015.
[73] John Milnor and James D. Stasheff. Characteristic classes. Ann. of Math. Studies, 76, 1974.
[74] G Mühlbach et al. Neville-Aitken algorithms for interpolation by functions of Chebychev-systems in the sense of Newton and in a generalized sense of Hermite. Theor. Approximation Appl. Conf. Proc.; Calgary; New York; Academic Press; pp. 200-212; Bibl. 4 Ref., 1976.
[75] Günter Mühlbach. The general Neville-Aitken-algorithm and some applications. Numerische Mathematik, 31(1):97–110, 1978.
[76] Peter J Olver. On multivariate interpolation. Studies in Applied Mathematics, 116(2):201–240, 2006.
[77] Theodore J Rivlin. The Chebyshev polynomials. Wiley-Interscience, New York, 1974.
[78] TJ Rivlin. The Lebesgue constants for polynomial interpolation. In Functional Analysis and its Applications, pages 422–437. Springer, 1974.
[79] Dan Romik. Stirling's approximation for n!: The ultimate short proof? The American Mathematical Monthly, 107(6):556–557, 2000.
[80] Carl Runge. Über empirische Funktionen und die Interpolation zwischen äquidistanten Ordinaten. Zeitschrift für Mathematik und Physik, 46(224-243):20, 1901.
[81] Thomas Sauer. Lagrange interpolation on subgrids of tensor product grids. Mathematics of Computation, 73(245):181–190, 2004.
[82] Thomas Sauer and Yuan Xu. On multivariate Lagrange interpolation. Mathematics of Computation, 64(211):1147–1170, 1995.

[83] Yuri K Shestopaloff and Alexander Y Shestopaloff. New reconstruction and data processing methods for regression and interpolation analysis of multidimensional big data. arXiv preprint arXiv:1703.07009, 2017.
[84] Vladimir Sivkin. Multivariate Lagrange interpolation. Technical report, MOSAIC Group, TU Dresden, 2019.
[85] S. Smale. An infinite dimensional version of Sard's theorem. Amer. J. Math., 87:861–866, 1965.
[86] Josef Stoer, Roland Bulirsch, Richard H. Bartels, Walter Gautschi, and Christoph Witzgall. Introduction to numerical analysis. Texts in applied mathematics. Springer, New York, 2002.
[87] Gilbert Strang. Wavelets. American Scientist, 82(3):250–255, 1994.
[88] Volker Strassen. Gaussian elimination is not optimal. Numerische Mathematik, 13(4):354–356, 1969.
[89] Arthur H Stroud. Approximate calculation of multiple integrals. 1971.
[90] Lloyd N Trefethen. Multivariate polynomial approximation in the hypercube. Proceedings of the American Mathematical Society, 145(11):4837–4844, 2017.
[91] Lloyd N Trefethen. Approximation theory and approximation practice, volume 164. SIAM, 2019.
[92] Lloyd N Trefethen and David Bau III. Numerical linear algebra, volume 50. SIAM, 1997.
[93] Karl Weierstrass. Über die analytische Darstellbarkeit sogenannter willkürlicher Funktionen einer reellen Veränderlichen. Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften zu Berlin, 2:633–639, 1885.
[94] Afra Zomorodian and Gunnar Carlsson. Computing persistent homology. Discrete & Computational Geometry, 33(2):249–274, 2005.

MOSAIC Group, Chair of Scientific Computing for Systems Biology, Faculty of Computer Science, TU Dresden, Dresden, Germany & Center for Systems Biology Dresden, Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
Current address: Center for Systems Biology Dresden, Pfotenhauerstraße 108, 01307 Dresden, Germany
Email address: [email protected], [email protected]