Integer points close to a transcendental curve and correctly-rounded evaluation of a function

Nicolas Brisebarre, Guillaume Hanrot

To cite this version:

Nicolas Brisebarre, Guillaume Hanrot. Integer points close to a transcendental curve and correctly-rounded evaluation of a function. 2021. hal-03240179v2

HAL Id: hal-03240179 https://hal.archives-ouvertes.fr/hal-03240179v2 Preprint submitted on 21 Sep 2021

INTEGER POINTS CLOSE TO A TRANSCENDENTAL CURVE AND CORRECTLY-ROUNDED EVALUATION OF A FUNCTION

NICOLAS BRISEBARRE AND GUILLAUME HANROT

Abstract. Despite several significant advances over the last 30 years, guaranteeing the correctly rounded evaluation of elementary functions, such as cos, exp, ∛· for instance, is still a difficult issue. This can be formulated as a Diophantine approximation problem, called the Table Maker’s Dilemma, which consists in determining points with integer coordinates that are close to a curve. In this article, we propose two algorithmic approaches to tackle this problem, closely related to a celebrated work by Bombieri and Pila and to the so-called Coppersmith’s method. We establish the underlying theoretical foundations, prove the algorithms, study their complexity and present practical experiments; we also compare our approach with previously existing ones. In particular, our results show that the development of a correctly rounded mathematical library for the binary128 format is now possible at a much smaller cost than with previously existing approaches.

Contents
1. Introduction
1.1. Arithmetic framework
1.2. Correct rounding, Table Maker’s Dilemma
1.3. Fast and cheap correctly-rounded function evaluation in binary64
1.4. Goal and outline of the paper
2. Formalization of the problem
3. State of the art
3.1. Diophantine approximation results: the exponential and the logarithm functions
3.2. Algorithmic approaches
4. A quick overview of uniform approximation and lattice basis reduction
4.1. Relation between uniform approximation and interpolation
4.2. A quick reminder on Euclidean lattices and the LLL algorithm
5. The one-variable method, à la Liouville
5.1. Volume estimates for rigorous interpolants at the Chebyshev nodes
5.2. Statement of the algorithms
5.3. Practical remarks
5.4. Proof of correctness
5.5. Complexity analysis
6. The two-variable method

2020 Mathematics Subject Classification. Primary 11Y99, 41A05, 65G50; Secondary 11D75, 11J25. This work was partly supported by the TaMaDi, FastRelax and NuSCAP projects of the French Agence Nationale de la Recherche.

6.1. Volume estimates for rigorous interpolants at the Chebyshev nodes
6.2. Statement of the algorithms
6.3. Practical remarks
6.4. Proof of correctness
6.5. Complexity analysis
7. Comparison with previous work
7.1. Univariate method and Bombieri and Pila’s approach
7.2. Bivariate method vs. Stehlé’s approach
8. Experimental results
8.1. Algorithms 1 and 2 in action: the TMD for the gamma function in binary128
8.2. Using 휔0 in Algorithms 1 and 2: the TMD for the in binary128 - experimental validation of Figure 1
8.3. Algorithms 3 and 4 in action: the TMD for the exponential function in binary128
9. Conclusion
Acknowledgements
References
Appendix A. Proofs of facts regarding Chebyshev polynomials
Appendix B. Proof of Theorem 5.27
Appendix C. Precision required, 1D case
Appendix D. Lemmata on 휙, 휓
Appendix E. Proofs of Theorems 6.4 and 6.12

1. Introduction

Modelling real numbers on a computer is by no means a trivial task. Until the mid-80s, processor manufacturers developed their own representations and conventions, leading to a difficult era – a time of weird, unexpected and dangerous behaviours [33]. This motivated the publication in 1984 of the IEEE-754 standard [17, 3], since then revised in 2008 [29, 48] and 2019 [28], for binary floating-point (FP) arithmetic¹, which remains the best trade-off for representing real numbers on a computer [51]. This put an end to this dangerous era of “numerical insecurity”. In particular, the IEEE-754 standard clearly specifies the formats of the FP representations of numbers, and the behaviour of the four arithmetic operations and the square root. And yet, as of today, the standard still does not rule the behaviour of usual functions, such as the ones contained in the C mathematical library (libm), as precisely as it does for the four arithmetic operations and the square root. The issue that we address in this paper is the problem of correctly-rounded evaluation of a one-variable function. Usually, when one wants to evaluate a function such as the cube root or the exponential, one actually evaluates a very good approximation of it (such as a polynomial, for instance). This raises the problem of correct rounding: how can one guarantee that the rounding of the value of the function coincides with the rounding of the value of the approximation? This issue is related to a problem called the Table Maker’s Dilemma (TMD).

¹ and the radix-independent IEEE-854 standard [16, 4] that followed

This paper presents two heuristic approaches to address the TMD. Both mix ingredients from approximation theory and algorithmic number theory (actually, Euclidean lattice basis reduction). The first approach can be viewed as an effective variant of Bombieri and Pila’s approach developed in [6]. The second one is an improvement over the algorithmic approaches developed in [62, 60]. Rather than reducing the problem for 푓 to the same problem for an approximation (Taylor) polynomial for 푓, as is done in [62, 60], we work with the function 푓 itself as long as possible. The difference may seem subtle, but it raises significant difficulties, while bringing two major improvements: smaller matrices and the prereduction trick (see Section 8). In particular, we give the first significant non-trivial results for the binary128 format, and this work paves the way for the first development of an efficient correctly rounded mathematical library in the three fundamental formats binary32, binary64 and binary128. As of today, the library CRlibm² [38] offers correctly rounded evaluation of the binary64 precision C99 standard elementary functions. We believe that our results are interesting in themselves. In particular, beyond their application to the Table Maker’s Dilemma, for which we improve on some of the theoretical and practical results of [39, 41, 40, 62, 60], they offer a practical means to compute integer points in a strip around a transcendental analytic curve³. Note that we restrict ourselves in the present paper to transcendental functions. Our methods, as we describe them, are bound to fail for algebraic functions of small degree. They may however be adapted to this case (similarly to Bombieri and Pila’s adaptation in the algebraic case). We intend to come back to this in a sequel to this paper.

1.1. Arithmetic framework. We first recall the definition of a FP number.

Definition 1.1. Let 훽, 푝, 퐸min, 퐸max ∈ Z, 훽, 푝 ≥ 2, 퐸min < 0 < 퐸max. A (normal) radix-훽 FP number in precision 푝 with exponent range [퐸min, 퐸max] is a number of the form

푥 = (−1)^푠 · (푀/훽^(푝−1)) · 훽^퐸,

where:
∙ the exponent 퐸 ∈ Z is such that 퐸min ≤ 퐸 ≤ 퐸max,
∙ the integral significand 푀 ∈ N, represented in radix 훽, satisfies 훽^(푝−1) ≤ 푀 ≤ 훽^푝 − 1,
∙ 푠 ∈ {0, 1} is the sign bit of 푥.

In the sequel, we shall leave the exponent range implicit unless it is explicitly required, and simply talk about “radix-훽 FP numbers in precision 푝”.

Remark 1.2. For the sake of clarity, we chose not to mention subnormal FP numbers since they will not appear in the text. One can find the complete definition in [51, Chap. 2.1]. The number zero is a special case, cf. [51, Chap. 3], that we add to the set of radix-훽 and precision-푝 FP numbers. This yields a set denoted ℱ훽,푝.
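As an illustration of Definition 1.1 in radix 2, the triple (푠, 푀, 퐸) of a double-precision float can be recovered with standard library calls. The helper names below (`decompose`, `recompose`) are ours, and, as in Remark 1.2, subnormals and zero are left aside. A minimal sketch:

```python
import math

def decompose(x, p=53):
    """Write a nonzero binary float as (s, M, E) following Definition 1.1,
    i.e. x = (-1)^s * M / 2^(p-1) * 2^E with 2^(p-1) <= M <= 2^p - 1."""
    s = 0 if x > 0 else 1
    m, e = math.frexp(abs(x))   # |x| = m * 2^e with 0.5 <= m < 1
    M = round(m * 2 ** p)       # integral significand on p bits
    E = e - 1
    assert 2 ** (p - 1) <= M <= 2 ** p - 1
    return s, M, E

def recompose(s, M, E, p=53):
    """Inverse of decompose: (-1)^s * M / 2^(p-1) * 2^E."""
    return (-1) ** s * M * 2.0 ** (E - p + 1)
```

For every binary64 value the round trip is exact, e.g. `recompose(*decompose(math.pi))` returns `math.pi`.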

² https://gforge.inria.fr/scm/browser.php?group_id=5929&extra=crlibm
³ We mean here a representative curve of a transcendental analytic function.

format      precision 푝   minimal exponent 퐸min   maximal exponent 퐸max
binary32        24              −126                     127
binary64        53             −1022                    1023
binary128      113            −16382                   16383

Table 1. Main parameters of the three basic binary formats (up to 128 bits) specified by the standard [28].

Remark 1.3. In this paper, we use radix 2 for the sake of clarity but our approach remains valid for any radix, in particular radix 10, the importance of which grows at a steady pace. The set ℱ2,푝 will thus be denoted ℱ푝. Table 1 gives the main parameters of the three basic binary formats specified by IEEE 754-2019. The result of an arithmetic operation whose input values belong to ℱ푝 may not belong to ℱ푝 (in general, it does not). Hence that result must be rounded. The IEEE standard defines 5 different rounding functions; in the sequel, 푥 is any real number to be rounded:

∙ round toward +∞, or upwards: ∘푢(푥) is the smallest element of ℱ푝 that is greater than or equal to 푥;
∙ round toward −∞, or downwards: ∘푑(푥) is the largest element of ℱ푝 that is less than or equal to 푥;
∙ round toward 0: ∘푧(푥) is equal to ∘푢(푥) if 푥 < 0, and to ∘푑(푥) otherwise;
∙ round to nearest ties to even, denoted ∘푛푒(푥), and round to nearest ties to away, denoted ∘푛푎(푥). If 푥 is exactly halfway between two consecutive elements of ℱ푝, ∘푛푒(푥) is the one for which the integral significand 푀 is an even number and ∘푛푎(푥) is the one for which the integral significand 푀 is largest. Otherwise, both return the element of ℱ푝 that is the closest to 푥.

The first three rounding functions are called directed rounding functions. The following real numbers will play a key role in the problem that we address.

Definition 1.4. A rounding breakpoint (or simply, a breakpoint) is a point where the rounding function changes (namely, a discontinuity point). For round-to-nearest functions, the rounding breakpoints are the exact middles of consecutive floating-point numbers. For the other rounding functions, they are the floating-point numbers themselves.
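The five rounding functions and the breakpoints of Definition 1.4 can be made concrete on a toy format. The sketch below (naming is ours; it assumes positive inputs confined to the binade [1, 2), where ℱ푝 consists of the values 푀/2^(푝−1) with 2^(푝−1) ≤ 푀 ≤ 2^푝 − 1) works on exact rationals:

```python
from fractions import Fraction

def round_fp(x, mode, p):
    """The five rounding functions, for a positive rational x in [1, 2)."""
    t = Fraction(x) * 2 ** (p - 1)      # rounding x to F_p <=> rounding t to an integer M
    lo = t.numerator // t.denominator   # floor
    hi = lo if t == lo else lo + 1      # ceil
    if mode == 'u':                     # toward +oo
        M = hi
    elif mode in ('d', 'z'):            # toward -oo; toward 0 coincides for x > 0
        M = lo
    elif t - lo == Fraction(1, 2):      # halfway case of the two round-to-nearest modes
        M = lo if (mode == 'ne' and lo % 2 == 0) else hi
    else:
        M = lo if t - lo < Fraction(1, 2) else hi
    return Fraction(M, 2 ** (p - 1))

def dist_to_breakpoint(y, p, nearest=False):
    """Distance from y in [1, 2) to the closest breakpoint (Definition 1.4):
    the FP numbers themselves for directed roundings, their middles for
    round-to-nearest."""
    ulp = Fraction(1, 2 ** (p - 1))     # spacing of F_p in the binade [1, 2)
    t = (Fraction(y) - (ulp / 2 if nearest else 0)) / ulp
    frac = t - t.numerator // t.denominator
    return min(frac, 1 - frac) * ulp
```

For instance, with 푝 = 4 the value 21/16 is exactly halfway between 10/8 and 11/8: ties-to-even returns 10/8, ties-to-away returns 11/8, and its distance to the nearest round-to-nearest breakpoint is 0.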

1.2. Correct rounding, Table Maker’s Dilemma. The standard requires that the user should be able to choose one rounding function among these, called the active rounding function. An active rounding function being chosen, when performing one of the 4 arithmetic operations, or when computing square roots, the obtained rounded result should be equal to the rounding of the exact result: this requirement on the quality of the computation is called correct rounding. Being able to provide correctly rounded functions is of utmost interest:
∙ it greatly improves the portability of software;
∙ it allows one to design algorithms that use this requirement;
∙ this requirement can be used for designing formal proofs of pieces of software;

∙ one can easily implement interval arithmetic, or more generally one can get certain lower or upper bounds on the exact result of a sequence of arithmetic operations.

While the IEEE 754-1985 and 854-1987 standards required correctly rounded arithmetic operations and square root, they did not do so for the most common mathematical functions, such as simple algebraic⁴ functions like 1/√·, ∛·, . . . and also a few transcendental⁵ functions like sine, cosine, exponentials and logarithms of radices 푒, 2, and 10, etc. More generally, a natural target is the whole class of elementary functions⁶. A subset of these functions is usually available from the libms delivered with compilers or operating systems. This lack of requirement is mainly due to a difficult problem known as the Table Maker’s Dilemma (TMD), a term coined by Kahan. When evaluating most elementary functions, one has to compute an approximation to the exact result, using an intermediate precision somewhat larger than the “target” precision 푝. The TMD is the problem of determining, given a function 푓, what this intermediate precision should be in order to make sure that rounding that approximation yields the same result as rounding the exact result. Ideally, we aim at getting the minimal such precision htr푓(푝), that we call hardness to round of 푓 (see Definition 2.3). If we have 푁 FP numbers in the domain being considered, it is expected that htr푓(푝) is of the order of 푝 + log2(푁) (hence 2푝 for most usual functions and binades⁷). This is supported by a probabilistic heuristic approach that is presented in detail in [51, 50]. It has been studied in [10], where O. Robert and the authors of the present paper gave, under some mild hypotheses on 푓′′, solid theoretical foundations to some instances of this probabilistic heuristic, targeting in particular the cases that the CRlibm library uses in practice.
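The probabilistic heuristic htr푓(푝) ≈ 푝 + log2(푁) can be observed experimentally at a toy precision. The parameters below are our own choices: 푝 = 10, with exp evaluated on [1/4, 1/2), whose image lies in the single binade [1, 2) (so 푒2 = 0); for each input we measure how close exp(푥) comes to the FP grid and convert the distance into a number of bits. A sketch:

```python
import math

p = 10
e1 = -2                                   # inputs in [1/4, 1/2)
ulp_out = 2.0 ** (-p + 1)                 # spacing of F_p in the output binade [1, 2)

hardness = []
for X in range(2 ** (p - 1), 2 ** p):     # integral significands of the inputs
    x = X * 2.0 ** (e1 - p + 1)           # the FP numbers of [1/4, 1/2)
    t = math.exp(x) / ulp_out
    frac = abs(t - round(t))              # distance to the grid, in output ulps
    if frac == 0:                         # would be an exact hit; none occur here
        continue
    # number of bits needed: smallest m with distance >= 2^(-m)
    hardness.append(math.ceil(-math.log2(frac * ulp_out)))

worst = max(hardness)
```

With 푁 = 2^9 inputs the heuristic predicts a worst case around 푝 + log2(푁) ≈ 2푝 bits, far above the 푝 bits that every input needs at least.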

1.3. Fast and cheap correctly-rounded function evaluation in binary64. Diophantine approximation-type methods yield – not fully satisfactory – upper bounds for htr푓(푝) for algebraic functions: the precision to which the computations must be performed is, in general, grossly overestimated [30, 37, 11]. On the other hand, regarding transcendental functions, either no theoretical statement exists or the existing ones provide results that are off by such a margin that they cannot be used in practical computations [52, 35, 34]. Therefore, algorithmic approaches to the TMD [39, 41, 40, 62, 27] had to be developed. They allowed for solving the TMD for the IEEE binary64 format (also known as “double precision”). As a consequence, the revised IEEE-754 standard now recommends (yet does not require, due to the lack of results in the case of binary128) that the following functions should be correctly rounded: 푒^푥, 푒^푥 − 1, 2^푥, 2^푥 − 1, 10^푥, 10^푥 − 1, ln(푥), log2(푥), log10(푥), ln(1 + 푥), log2(1 + 푥), log10(1 + 푥), √(푥² + 푦²), 1/√푥, (1 + 푥)^푛, 푥^푛, 푥^(1/푛) (푛 is an integer), sin(푥), cos(푥), tan(푥), arcsin(푥), arccos(푥), arctan(푥), arctan(푦/푥),

⁴ We say that a function 휙 is algebraic if there exists 푃 ∈ Z[푥, 푦] ∖ {0} such that, for all 푥 such that 휙(푥) is defined, 푃(푥, 휙(푥)) = 0.
⁵ A function is transcendental if it is not algebraic.
⁶ An elementary function is a function of one variable which is the composition of a finite number of arithmetic operations (+, −, ×, /), exponentials, logarithms, constants, and solutions of algebraic equations [12, Def. 5.1.4].
⁷ A binade is an interval of the form [2^푘, 2^(푘+1)) or (−2^(푘+1), −2^푘] for 푘 ∈ Z.

sin(휋푥), cos(휋푥), tan(휋푥), arcsin(푥)/휋, arccos(푥)/휋, arctan(푥)/휋, arctan(푦/푥)/휋, sinh(푥), cosh(푥), tanh(푥), sinh⁻¹(푥), cosh⁻¹(푥), tanh⁻¹(푥).

Thanks to these results, it is now possible to obtain correct rounding in binary64 in two steps only (inspired by a strategy developed by A. Ziv [71] and implemented in the libultim library⁸), which one may then optimize separately. This is the approach used in CRlibm:
∙ the first quick step is as fast as a current libm, and provides a relative accuracy of 2^(−52−푘) (푘 = 11 for the exponential function, for instance), which is sufficient to round correctly to the 53 bits of binary64 in most cases;
∙ the second accurate step is dedicated to challenging cases. It is slower but has a reasonable bounded execution time, being tightly targeted at the hardest-to-round cases computed by Lefèvre et al. [42, 41, 61, 62, 60]. In particular, there is no need for arbitrary multiple precision anymore.

This approach [21, 22] leads to correctly-rounded function evaluation routines that are fast and have a reasonable memory consumption. Unfortunately, the lack of useful information about the TMD in binary128 has so far prevented the development of an extension of CRlibm to this format.

1.4. Goal and outline of the paper. In this paper, we present two new algorithmic approaches to tackle the TMD. For both, we follow the standard strategy of subdividing the interval under study into subintervals; but instead of approximating the function 푓 by a polynomial using a Taylor expansion at the center of such a tiny interval 퐼, as was done in [39, 41, 40, 62, 60], we approximate 푓 by an algebraic function using uniform approximation: if we assume for instance 푓 : [1/2, 1) → [1/2, 1) (hence every involved FP number has denominator 2^푝), we search for 푃0 and 푃1 ∈ Z[푋, 푌] that are small on the “weighted” curve (2^푝푥, 2^푝푓(푥)) (first approach) or in a strip around this “weighted” curve (second approach). This smallness implies that the bad cases for rounding are common roots of 푃0 and 푃1. Then, we use a heuristic argument of coprimality of 푃0 and 푃1, analogous to the one used in [7, 62, 60], to obtain these bad cases. In order to compute 푃0 and 푃1, we use ideas and techniques developed by the first author and S. Chevillard [9]. Very roughly speaking, if we still assume 푓 : [1/2, 1) → [1/2, 1), the key idea is to find 푃 ∈ Z[푋1, 푋2] that is small at some points (2^푝푥푖, 2^푝푓(푥푖)) of the “weighted” curve (first approach) or (2^푝푥푖, 2^푝(푓(푥푖) + 푦푖)) of a strip around the “weighted” curve (second approach). If the points 푥푖 (resp. the pairs (푥푖, 푦푖)) are carefully chosen, these discrete smallness constraints imply uniform smallness over the curve, resp. the strip around the curve, cf. Section 4.1. The discrete constraints can be reformulated as the fact that the values of 푃 at the (2^푝푥푖, 2^푝푓(푥푖)) (resp. the (2^푝푥푖, 2^푝(푓(푥푖) + 푦푖))) are the coordinates of a certain short vector in a Euclidean lattice. The celebrated LLL algorithm, cf. Section 4.2, then allows for computing a reasonable candidate for 푃.
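The notion of a short vector in a Euclidean lattice can be illustrated in the smallest possible setting: in dimension 2, LLL specializes to the classical Lagrange–Gauss reduction, which provably returns a shortest nonzero lattice vector. This is only a toy analogue of the large reductions performed later in the paper, not the algorithm actually used:

```python
def gauss_reduce(u, v):
    """Lagrange-Gauss reduction: return a basis (u, v) of the planar integer
    lattice spanned by u and v, with u a shortest nonzero lattice vector."""
    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1]
    if dot(u, u) > dot(v, v):
        u, v = v, u
    while True:
        q = round(dot(u, v) / dot(u, u))        # nearest-integer Gram quotient
        v = (v[0] - q * u[0], v[1] - q * u[1])  # size-reduce v against u
        if dot(v, v) >= dot(u, u):              # no further improvement: done
            return u, v
        u, v = v, u
```

For instance, the basis (13, 8), (21, 13) has determinant 1, so it spans Z²; the reduction recovers a basis of two vectors of norm 1.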
The first approach, which favours smallness on the curve (2^푝푥, 2^푝푓(푥)), 푥 ∈ 퐼, is somehow akin to [6], whereas the second one, which forces smallness on a strip around this curve, is somehow analogous to [62, 60]. As our reader will see, our work does not lead, with respect to the previous algorithmic approaches, to an improvement for the determination of worst cases for

⁸ libultim was released by IBM.

rounding and optimal values of htr푓(푝). On the other hand, for certain elementary or special functions evaluated in binary128, we are able to provide:

∙ upper bounds for htr푓(푝) that are useful in practice. For instance, for the exponential function, which plays a central role for correctly-rounded evaluation of the elementary functions of the C mathematical library, we provide a roadmap to reach, in practice⁹, the upper bound htr푓(푝) ≤ 12푝 for 푝 = 113, which corresponds to the binary128 format;
∙ an effective determination of the FP values whose evaluation by 푓 is exactly an FP number or the middle of two consecutive FP numbers. This is a key issue in the development of correctly-rounded evaluation routines. An exhaustive evaluation is possible in binary32, theoretical results [31] yield lists of these values for some restricted classes of functions, and algorithmic approaches [42, 41, 61, 62, 60] make it possible to address this problem in binary64. However, before the present paper, the only practical means to tackle the binary128 format was S. Torres’ implementation of [60] in his PhD thesis [65], and we will show in Section 8 that our work significantly improves the situation. As an example, we address the case of the Euler function Γ, a function of the C mathematical library, for which theoretical results from Transcendental Number Theory are almost nonexistent.

This hopefully paves the way to an extension of CRlibm to the binary128 format, provided that we adopt the following three-step strategy for this format:
∙ a first quick step identical to the one mentioned in the previous subsection;
∙ a second step, slower but with a reasonable bounded execution time, where the evaluation is performed using a precision of 260 = 2푝 + 34 bits, say. Heuristically, this should cover all the hardest-to-round cases;
∙ a third step, where the evaluation is performed using a precision of 12푝 = 1356 bits. Heuristically, this step should never be called, so it is important to write routines simple enough to be formally proved in order to guarantee its validity.

We will formalize the problem we address in Section 2.
We then give a state of the art in Section 3. The theoretical results are presented in Section 3.1, including applications of [35, 34] that, to the best of our knowledge, are reviewed for the first time and offer theoretical upper bounds for htr푓(푝) in the binary64 and binary128 cases which greatly improve upon the existing ones. The existing algorithmic approaches are sketched in Section 3.2. Our approach relies on tools from Approximation Theory and Euclidean lattice basis reduction and an idea presented in [9, 15]. We recall them in Section 4. Our first approach is presented in Section 5 and our second one in Section 6. We present a comparison with previous work in Section 7 and we conclude with experimental results in Section 8.

2. Formalization of the problem Assume we wish to correctly round a real-valued function 휙. Note that if 푥 is a bad case for 휙 (i.e., 휙(푥) is difficult to round), then it is also a bad case for −휙 and −푥 is a bad case for 푡 ↦→ 휙(−푡) and 푡 ↦→ −휙(−푡). Hence we can assume that 푥 > 0 and 휙(푥) > 0.

⁹ A few days for a binade, see Section 8.

We consider that all input values are elements of ℱ푝 ∩ [2^푒1, 2^(푒1+1)). The method must be applied for each possible integer value of 푒1. If the values of 휙(푥), for 푥 ∈ [2^푒1, 2^(푒1+1)), are not all included in the binade [2^푒2, 2^(푒2+1)), we split the input interval into subintervals such that, for each subinterval, there is a rational integer 푒2 such that the values 휙(푥), for 푥 in the subinterval, are in [2^푒2, 2^(푒2+1)). We now restrict to one of those subintervals 퐼 included in [2^푒1, 2^(푒1+1)).

For directed rounding functions, the problem to be solved is the following:

Problem 2.1 (TMD, directed rounding functions). What is the minimum 휇(푝) ∈ Z such that, for 2^(푝−1) ≤ 푋 ≤ 2^푝 − 1 (and, possibly, the restrictions implied by 푋/2^(−푒1+푝−1) ∈ 퐼) such that 휙(푋 2^(푒1−푝+1)) ∉ ℱ푝, and for 2^(푝−1) ≤ 푌 ≤ 2^푝 − 1, we have

|2^(−푒2) 휙(푋/2^(−푒1+푝−1)) − 푌/2^(푝−1)| > 1/2^휇(푝).

For rounding to nearest functions, the problem to be solved is the following:

Problem 2.2 (TMD, rounding to nearest functions). What is the minimum 휇(푝) ∈ Z such that, for 2^(푝−1) ≤ 푋 ≤ 2^푝 − 1 (and, possibly, the restrictions implied by 푋/2^(−푒1+푝−1) ∈ 퐼) such that 휙(푋 2^(푒1−푝+1)) is not the middle of two consecutive elements of ℱ푝, and for 2^(푝−1) ≤ 푌 ≤ 2^푝 − 1, we have

|2^(−푒2) 휙(푋/2^(−푒1+푝−1)) − (2푌 + 1)/2^푝| > 1/2^휇(푝).

These statements lead to the following definition.

Definition 2.3 (hardness to round). Let a precision 푝 be given, ∘ be a rounding function and 휙 be a real-valued function. Let 푥 be a FP number in precision 푝 and 푒2 ∈ Z be the unique integer such that 휙(푥) ∈ [2^푒2, 2^(푒2+1)) (here again, we assume 푥 and 휙(푥) > 0, since the extension to the other cases is straightforward). The hardness to round 휙(푥), denoted htr휙,{푥},∘(푝), is equal to:
∙ −∞ if 휙(푥) is a breakpoint;
∙ the smallest integer 푚 such that the distance of 휙(푥) to the nearest breakpoint is larger than or equal to 2^(−푚−푒2).

The hardness to round 휙 over an interval 퐼, denoted htr휙,퐼,∘(푝), is then the maximum of the hardness to round 휙(푥) for all FP 푥 ∈ ℱ푝 ∩ 퐼, while the hardness to round 휙 is the hardness to round 휙 over R, simply denoted htr휙,∘(푝). When there is no ambiguity over the rounding function, we get rid of the symbol ∘.

Remark 2.4. Note that Problem 2.2 for precision 푝 is a subproblem of Problem 2.1 for precision 푝 + 1.

Remark 2.5. If we assume that 휙 admits an inverse 휙⁻¹ and is differentiable over 퐼, and that we have a precise control over the image of 휙′ over 퐼, it follows from the mean value theorem that addressing Problems 2.1 and 2.2 for 휙 over 퐼 is analogous to addressing Problems 2.1 and 2.2 for 휙⁻¹ over 휙(퐼). For instance, one can think of exp and log, or 푥 ↦→ ∛푥 and 푥 ↦→ 푥³.

The problem that we actually tackle in this paper is the following.

Problem 2.6. Let 푎, 푏 ∈ R, 푎 < 푏, 푓 : [푎, 푏] → R be a transcendental function analytic in a (complex) neighbourhood of [푎, 푏]. Let 푢, 푣, 푤 ∈ N ∖ {0}; determine the integers 푋, 푎 ≤ 푋/푢 ≤ 푏, for which there exists 푌 ∈ Z satisfying

(2.1) |푓(푋/푢) − 푌/푣| < 1/푤.

This problem encompasses the TMD: consider 푎 = 2^푒1, 푏 = 2^(푒1+1) − 1, 푢 = 2^(푝−푒1−1), 푣 = 2^(푝−푒2−1) and 푓 = 휙 (for directed rounding functions) or 푓 = 휙 − 1/(2푣) (for rounding to nearest functions). It also includes a generalization of the question addressed in [6], which corresponds to the case 푎 = 0, 푏 = 1, 푢 = 푣.

Remark 2.7. Note that the status of 푤 in Problem 2.6 may vary. In Section 5, we will consider 푤 as an output of the algorithm: on input 푢, 푣, 푎, 푏, Algorithm 2 heuristically returns a value of 푤 and a set of solutions (푋, 푌) of (2.1). This comes from the fact that Algorithm 2 is primarily devoted to finding solutions of 푌/푣 = 푓(푋/푢), and it happens that, from the work done on this curve, one can deduce information close to the curve. A parameter 휔0 gives some influence on 푤, but no complete control. On the other hand, in Section 6, 푤 will be an input of the problem: on input 푢, 푣, 푤, 푎, 푏, Algorithm 4 heuristically returns the set of solutions (푋, 푌) of (2.1).
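For very small parameters, Problem 2.6 can be solved by exhaustion, which is useful as a reference point (the whole point of the paper being to avoid this exponential cost). A naive sketch, with function and parameter choices of our own:

```python
import math

def integer_points_near_curve(f, a, b, u, v, w):
    """Naive reference solver for Problem 2.6 (tiny parameters only):
    all integers X with a <= X/u <= b admitting Y in Z with
    |f(X/u) - Y/v| < 1/w."""
    sols = []
    for X in range(math.ceil(a * u), math.floor(b * u) + 1):
        t = v * f(X / u)
        Y = round(t)                 # the best possible Y for this X
        if abs(t - Y) < v / w:       # |f(X/u) - Y/v| < 1/w, scaled by v
            sols.append((X, Y))
    return sols

# demo: points of the grid (X/64, Y/64) within 1/2048 of the curve y = exp(x)
found = integer_points_near_curve(math.exp, 0, 1, 64, 64, 2048)
```

Since exp(0) = 1 exactly, the pair (푋, 푌) = (0, 64) is always among the solutions; it corresponds to an exact hit on the curve.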

3. State of the art

In this section, we review the state of the art on the TMD; we shall start by discussing results which can be derived from previous estimates in Diophantine approximation, then give an account of the algorithmic approaches which have been developed since the late 90s. One can find a more complete (but slightly outdated, since the results of [35, 34] on the exponential function are not considered) state of the art in [51, Chap. 12].

3.1. Diophantine approximation results: the exponential and the logarithm functions. In this section, we use the conventions and notations that we introduced in Section 2. The exponential function is central in the study of correctly-rounded evaluation of the elementary functions of libms: a relevant information on its hardness to round yields relevant information as well on trigonometric and hyperbolic functions and their respective reciprocals, and on the logarithm function (see [51, §12.4.4] and Remark 2.5). Following the works [52] and [35], Khémira and Voutier proved in [34] a lower bound (called a transcendence measure) for the expression |푒^훽 − 훼|, where 훼 and 훽 are algebraic numbers, 훽 ≠ 0. When specialized to FP numbers, their result provides interesting upper bounds for htrexp(푝). Let 푚 and 푛 ∈ N; we put

푑푛 = l.c.m.(1, . . . , 푛) and 퐷푚,푛 = 푚! / ∏_{푞 ≤ 푛, 푞 prime} 푞^(푣푞(푚!)),

where 푣푞(푚!) is the 푞-adic valuation of 푚!. We can now state Khémira and Voutier’s theorem in the particular case where 훽 is a FP number and 훼 is a FP number (directed rounding functions) or the middle of two consecutive FP numbers (round-to-nearest functions). As mentioned above, we can get a similar result for the logarithm function.

Theorem 3.1 (Khémira and Voutier [34], specialized here to FP numbers, directed rounding functions). Let a precision 푝 be given, let 푥 ≠ 0 and 푦 ∈ ℱ푝 be such that 푦 and 푒^푥 are in the same binade. We denote by 푒푥, resp. 푒푦, the exponent of 푥, resp. 푦. We have 푒푦 = ⌊log2(exp(푥))⌋ = ⌊푥/log(2)⌋. Let 퐾 and 퐿 ∈ N ∖ {0}, 퐾 ≥ 2, 퐿 ≥ 2,¹⁰ and 퐸 ∈ (1, +∞) which satisfy

(3.1) 퐾퐿 log 퐸 ≥ 퐾퐿 log 2 + (퐾 − 1)(1 + log(√3 퐿 푑퐿−1)) + log(퐷퐾−1,퐿−1)
+ (1 + 2 log 2)(퐿 − 1) + log(min(푑퐿−2^(퐾−1), (퐿 − 2)!)) + log((퐾 − 1)!)
+ (퐾 − 1) max(푝 − 2 − 푒푥, −1) log 2 + 퐿퐸|푥| + 퐿 log 퐸
+ (퐿 − 1) max(푝 − 2 − 푒푦, −1) log 2.

Then we have |푒^푥 − 푦| > 퐸^(−퐾퐿).

Remark 3.2. For the round-to-nearest functions, we assume that 푦 is the middle of two consecutive FP numbers and that the numbers 푦 and 푒^푥 are in the same binade. If we denote again 푒푦 = ⌊푥/log(2)⌋, the conclusion of the theorem remains valid if we replace max(푝 − 2 − 푒푦, −1) with max(푝 − 1 − 푒푦, −1) in the last line of (3.1).

For instance, using the following sets of parameters, we are able to compute the following upper bounds for the hardness to round exp on [1/4, 1/2):
∙ In binary64 (푝 = 53), the triple (퐾, 퐿, 퐸) = (61, 29, 81.29...) yields htrexp,[1/4,1/2)(53) ≤ 11225 ∼ 211푝.
∙ In double extended precision (푝 = 64), the triple (퐾, 퐿, 퐸) = (62, 37, 82.62...) yields htrexp,[1/4,1/2)(64) ≤ 14610 ∼ 228푝.
∙ In binary128 (푝 = 113), the triple (퐾, 퐿, 퐸) = (84, 59, 109.44...) yields htrexp,[1/4,1/2)(113) ≤ 33573 ∼ 297푝.

3.2. Algorithmic approaches. In view of the lack of practicality of the fundamental results discussed in the previous subsection (or in order to improve on it), algorithmic approaches have been developed and used in an extensive way since the late 90s. A first straightforward idea consists in testing all possible FP values 푥; for each value of 푥, one computes a sufficiently accurate interval approximation to 푓(푥) and determines the hardness to round 푓(푥). The cost of the approach is obviously proportional to the number of different FP numbers of the format under study, i.e., 2^(푝+퐸max−퐸min), which makes it basically tractable for binary32 [59]. In [23], the exhaustive evaluations are performed on an FPGA using a tabulated difference approach, which makes it possible to address the binary64 case. Currently, the double extended or binary128 formats seem completely out of reach of such approaches.
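Returning to Theorem 3.1: the quantities 푑푛 and 퐷푚,푛 are straightforward to compute, and the figures quoted for exp can be sanity-checked. Since the conclusion is |푒^푥 − 푦| > 퐸^(−퐾퐿), the implied precision bound is essentially ⌈퐾퐿 log2 퐸⌉ (up to the exponent normalization), which reproduces the three stated values to within a unit or two, the quoted values of 퐸 being truncated. A sketch (the helper names are ours):

```python
import math

def d(n):
    """d_n = l.c.m.(1, ..., n)."""
    r = 1
    for k in range(2, n + 1):
        r = r * k // math.gcd(r, k)
    return r

def D(m, n):
    """D_{m,n} = m! divided by q^{v_q(m!)} for every prime q <= n."""
    r = math.factorial(m)
    for q in range(2, n + 1):
        if all(q % j for j in range(2, q)):     # trial-division primality (n is tiny)
            while r % q == 0:
                r //= q
    return r

# implied hardness-to-round bounds K*L*log2(E) for the three quoted triples
bounds = [K * L * math.log2(E)
          for (K, L, E) in [(61, 29, 81.29), (62, 37, 82.62), (84, 59, 109.44)]]
```

Comparing `bounds` with the stated 11225, 14610 and 33573 shows agreement to about one unit in each case.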
More subtle ideas proceed by splitting the domain into subintervals and replacing the function (assumed to be sufficiently smooth) by a polynomial, in practice a Taylor approximation, over the interval under study; one is then reduced to studying the problem in the polynomial case. Lefèvre, together with Muller [39, 41, 40], studied the degree-1 case; in this case, the remaining Diophantine problem is to find two integers 푥, 푦, |푥| ≤ 푋, |푦| ≤ 푌, such that |훼푥 + 훽 − 푦| is minimal, which is solved by elementary Diophantine arguments, either the three-distance theorem or continued fractions (see e.g., [5]). These ideas

¹⁰ The condition 퐾 ≥ 2 is not stated in [34] but it is actually necessary to have Inequality (3.1) satisfied here.

lead to an algorithm of complexity 푂̃(2^(2푝/3)) for floating-point numbers of precision 푝, as 푝 → ∞, which computes all worst cases for rounding in the domains under consideration. Highly optimized and parallel implementations of this method have proved invaluable to find optimal values of htr푓(53) for several functions of the standard libm. This was a key step towards the development of CRlibm. Higher-degree approximations give rise to more complicated Diophantine problems. Stehlé, Lefèvre and Zimmermann [62], further refined by Stehlé [60], make use of a technique due to Coppersmith [19, 20], based on lattice basis reduction, to solve them. We recall Corollaries 4 & 5 of [60], adapted to our context.

Theorem 3.3 (Stehlé [60]). For all 휀 > 0, there exists a heuristic algorithm of complexity 풫(푝)2^(푝(1+휀)/2), where 풫 is a polynomial, which, given a function 푓, returns all FP numbers 푥 ∈ [1/2, 1) of precision 푝 such that the hardness to round 푓(푥) is ≥ 2푝. There exists a polynomial-time heuristic algorithm which returns all FP numbers 푥 ∈ [1/2, 1) of precision 푝 such that the hardness to round 푓(푥) is ≥ 4푝²; the latter works by reducing a lattice of dimension 푂(푝²) of R^푚 for some 푚 = 푂(푝⁴).

The heuristic character of the algorithm is rather mild (i.e., the algorithm works in practice as expected on almost all inputs). We use a somewhat different approach: the algorithmic content of our method remains based on lattice basis reduction, but rather than reducing the problem to a polynomial problem, we keep the problem linked to the function, which we shall make possible thanks to rigorous uniform approximation techniques based on Chebyshev interpolation. In order to develop our approach, we thus now need to give a short survey of Chebyshev interpolation/approximation and lattice basis reduction.
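The degree-1 subproblem of Lefèvre and Muller mentioned above is easy to state in code. The naive search below is exponentially slower than their three-distance or continued-fraction methods and is only meant to make the problem concrete; 훼, 훽 and the bound 푋 are arbitrary choices of ours:

```python
import math

def hardest_linear(alpha, beta, X):
    """Exhaustive version of the degree-1 subproblem: the x in [1, X]
    minimizing the distance of alpha*x + beta to the nearest integer y.
    (Lefevre's algorithms find this far faster, via the three-distance
    theorem or continued fractions.)"""
    best_x, best_d = 0, 1.0
    for x in range(1, X + 1):
        t = alpha * x + beta
        dist = abs(t - round(t))
        if dist < best_d:
            best_x, best_d = x, dist
    return best_x, best_d

x0, d0 = hardest_linear(math.sqrt(2), 0.3, 1000)
```

Heuristically, for a "generic" 훼 the minimum over 푥 ≤ 푋 is of order 1/푋, which mirrors the probabilistic heuristic of Section 1.2 in the degree-1 setting.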

4. A quick overview of uniform approximation and lattice basis reduction

We shall require some tools from uniform approximation theory [24, 58, 57, 14, 8, 49, 66] and algorithmic geometry of numbers [47, 26, 18, 13, 54, 68].

4.1. Relation between uniform approximation and interpolation. Let $n \in \mathbb{N}$; the $n$-th Chebyshev polynomial of the first kind is the polynomial $T_n \in \mathbb{R}_n[x]$ defined by $T_n(\cos t) = \cos(nt)$ for all $t \in [0, \pi]$. The $T_n$'s can also be defined by

\[T_0(x) = 1, \quad T_1(x) = x, \quad T_{n+2}(x) = 2xT_{n+1}(x) - T_n(x), \quad \forall n \in \mathbb{N}.\]
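As a quick sanity check (not part of the development of the paper), the recurrence can be evaluated directly and compared against the defining identity $T_n(\cos t) = \cos(nt)$:

```python
import math

def cheb_T(n, x):
    # Three-term recurrence T_0 = 1, T_1 = x, T_{n+2} = 2x T_{n+1} - T_n
    t0, t1 = 1.0, x
    if n == 0:
        return t0
    for _ in range(n - 1):
        t0, t1 = t1, 2.0 * x * t1 - t0
    return t1

# Defining identity T_n(cos t) = cos(n t)
for n in range(8):
    for t in (0.1, 0.7, 2.3, 3.0):
        assert abs(cheb_T(n, math.cos(t)) - math.cos(n * t)) < 1e-10
print("recurrence consistent with T_n(cos t) = cos(nt)")
```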

4.1.1. Interpolation at the Chebyshev nodes. The zeros of $T_{n+1}$ are
\[\mu_{k,n} = \cos\left(\frac{(n-k+1/2)\pi}{n+1}\right), \quad k = 0, \dots, n.\]
They are called the $(n+1)$-Chebyshev nodes of the first kind. Polynomials interpolating functions at this family give rise to very good uniform approximations of these functions over $[-1, 1]$ [49, 66]. To be able to work on an interval $[a,b]$, we will need scaled versions of the Chebyshev polynomials and nodes. We then define, for $n \in \mathbb{N}$,
\[(4.1)\quad T_{n,[a,b]} := T_n\left(\frac{2x-b-a}{b-a}\right), \quad \mu_{k,n,[a,b]} := \frac{(b-a)\mu_{k,n} + a + b}{2}, \quad k = 0, \dots, n.\]

N. BRISEBARRE AND G. HANROT

Here again, when there is no ambiguity, we denote the nodes as $\mu_{k,[a,b]}$. Note that $T_{n,[a,b]}(\mu_{k,n,[a,b]}) = T_n(\mu_{k,n})$. Let $N \ge 1$ and let $f$ be a function defined over $[a,b]$. If we interpolate $f$ by a polynomial in $\mathbb{R}_{N-1}[x]$ at the scaled Chebyshev nodes of the first kind, we have the following expressions for the interpolation polynomial $P$ [49, Chap. 6]:
\[P(x) = {\sum_{0 \le k \le N-1}}' c_k T_{k,[a,b]}(x) \in \mathbb{R}_{N-1}[x], \quad\text{with}\]
\[c_k = \frac{2}{N} \sum_{0 \le \ell \le N-1} f(\mu_{\ell,N-1,[a,b]})\,T_{k,[a,b]}(\mu_{\ell,N-1,[a,b]}) = \frac{2}{N} \sum_{0 \le \ell \le N-1} f(\mu_{\ell,N-1,[a,b]})\,T_k(\mu_{\ell,N-1}), \quad k = 0, \dots, N-1.\]
The symbol $\sum'$ means that the first coefficient has to be halved. Note that, if we introduce $\hat{f} : z \in [-1,1] \mapsto f\left(z\,\frac{b-a}{2} + \frac{a+b}{2}\right)$, the coefficients $c_k$ are also the coefficients of the interpolation polynomial in $\mathbb{R}_{N-1}[x]$ of $\hat{f}$ at the Chebyshev nodes of the first kind.

4.1.2. Uniform approximation using interpolation polynomials. Let $\rho > 1$ and let $a < b$ be two real numbers. We define the ellipse
\[\mathcal{E}_{\rho,a,b} = \left\{\frac{b-a}{2}\,\frac{\rho e^{i\theta} + \rho^{-1}e^{-i\theta}}{2} + \frac{a+b}{2},\ \theta \in [0, 2\pi]\right\}\]
and let $E_{\rho,a,b}$ be the closed region bounded by the ellipse $\mathcal{E}_{\rho,a,b}$. For $f$ a function analytic in a neighbourhood of $E_{\rho,a,b}$, we define $M_{\rho,a,b}(f) = \max_{z \in \mathcal{E}_{\rho,a,b}} |f(z)|$. Let $N \in \mathbb{N}$, $N \ge 1$; we also define
\[\eta_{\rho,0} = 1 \quad\text{and}\quad \eta_{\rho,k} = \frac{\rho^2+1}{\rho^2-1} \quad\text{for } k = 1, \dots, N-1.\]
The following two propositions establish Cauchy-type inequalities for interpolation polynomials at (scaled) Chebyshev nodes.

Proposition 4.1. Let $\rho > 1$, $a < b$, $N \in \mathbb{N}$, $N \ge 1$, and let $f$ be a function analytic in a neighbourhood of $E_{\rho,a,b}$. The coefficients $c_k$, $k = 0, \dots, N-1$, of the interpolation polynomial $p_{N-1}$ of $f$ at the (scaled) Chebyshev nodes of the first kind over $[a,b]$, $(\mu_{k,N-1,[a,b]})_{0 \le k \le N-1}$, satisfy
\[|c_k| \le 2\,\frac{M_{\rho,a,b}(f)}{\rho^k}\,\eta_{\rho,k} \quad\text{for } k = 0, \dots, N-1.\]
Moreover, we have
\[\|f - p_{N-1}\|_{\infty,[a,b]} \le \frac{4M_{\rho,a,b}(f)}{\rho^{N-1}(\rho-1)}.\]
Proof. See Appendix A. □
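The interpolation formulas of Section 4.1.1 and the remainder bound of Proposition 4.1 are straightforward to check in floating point. The sketch below uses $f = \exp$ on the (arbitrary) interval $[1/2, 1]$; for $\exp$, the maximum on the ellipse is attained at its rightmost point, which makes $M_{\rho,a,b}(f)$ explicit.

```python
import math

a, b, N = 0.5, 1.0, 8
f = math.exp

# Scaled Chebyshev nodes mu_{k,N-1,[a,b]} (zeros of T_N mapped to [a,b])
nodes = [((b - a) * math.cos((N - 1 - k + 0.5) * math.pi / N) + a + b) / 2
         for k in range(N)]

def T(k, t):  # T_k on [-1,1]
    return math.cos(k * math.acos(max(-1.0, min(1.0, t))))

# Coefficients c_k = (2/N) sum_l f(mu_l) T_k(t_l), t_l = unscaled nodes
ts = [(2 * x - b - a) / (b - a) for x in nodes]
c = [2.0 / N * sum(f(x) * T(k, t) for x, t in zip(nodes, ts)) for k in range(N)]

def p(x):  # sum' c_k T_k, first coefficient halved
    t = (2 * x - b - a) / (b - a)
    return c[0] / 2 + sum(c[k] * T(k, t) for k in range(1, N))

# p interpolates f at the nodes...
assert all(abs(p(x) - f(x)) < 1e-9 for x in nodes)
# ...and the uniform error obeys the bound of Proposition 4.1 (rho = 3 here;
# for f = exp, M_{rho,a,b}(f) is attained at the rightmost ellipse point).
rho = 3.0
M = math.exp((b - a) / 2 * (rho + 1 / rho) / 2 + (a + b) / 2)
bound = 4 * M / (rho ** (N - 1) * (rho - 1))
assert all(abs(p(a + i * (b - a) / 100) - f(a + i * (b - a) / 100)) <= bound
           for i in range(101))
print("interpolation and Proposition 4.1 bound verified")
```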

Let $\rho_1, \rho_2 > 1$, $a_1 < b_1$, $a_2 < b_2$; we define $\mathcal{E}_{\rho_1,a_1,b_1,\rho_2,a_2,b_2} = \mathcal{E}_{\rho_1,a_1,b_1} \times \mathcal{E}_{\rho_2,a_2,b_2}$ and $E_{\rho_1,a_1,b_1,\rho_2,a_2,b_2} = E_{\rho_1,a_1,b_1} \times E_{\rho_2,a_2,b_2}$.

Proposition 4.2. Let $\rho_1, \rho_2 > 1$, $a_1 < b_1$, $a_2 < b_2$, let $M_1, M_2 \in \mathbb{N}$, $M_1, M_2 \ge 2$, and let $f$ be a function analytic in a neighbourhood of $E_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}$. The coefficients $c_{k_1,k_2}$, $k_1 = 0, \dots, M_1-1$, $k_2 = 0, \dots, M_2-1$, of the interpolation polynomial $P_{M_1-1,M_2-1}$ of $f$ at pairs of Chebyshev nodes of the first kind satisfy, for $k_1 = 0, \dots, M_1-1$, $k_2 = 0, \dots, M_2-1$,
\[|c_{k_1,k_2}| \le 4\,\frac{M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f)}{\rho_1^{k_1}\rho_2^{k_2}}\,\eta_{\rho_1,k_1}\,\eta_{\rho_2,k_2},\]
where $M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f) = \max_{z \in \mathcal{E}_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}} |f(z)|$. Moreover, writing $M_{\rho_1,\rho_2}(f)$ for short, we have
\[\|f - P_{M_1-1,M_2-1}\|_{\infty,[a_1,b_1]\times[a_2,b_2]} \le \frac{16\rho_1\rho_2\,M_{\rho_1,\rho_2}(f)}{(\rho_1-1)(\rho_2-1)}\left(\frac{1}{\rho_1^{M_1}} + \frac{1}{\rho_2^{M_2}}\right).\]
Proof. See Appendix A. □

4.2. A quick reminder on Euclidean lattices and the LLL algorithm. In this subsection, we briefly review basic facts concerning lattices and lattice basis reduction algorithms.

Definition 4.3. Let $M \in \mathbb{N}$, $M \ge 1$; a lattice of $\mathbb{R}^M$ is a discrete subgroup of $\mathbb{R}^M$; equivalently, a lattice $L \subset \mathbb{R}^M$ is the set of integer linear combinations of a family $(b_1, \dots, b_N)$ of $\mathbb{R}$-linearly independent vectors of $\mathbb{R}^M$. We shall then say that $(b_i)_{1 \le i \le N}$ is a basis of $L$, and that $N \le M$ is the dimension (or the rank) of $L$.

Proposition 4.4. The sets $B = (b_i)_{1 \le i \le N}$ and $C = (c_i)_{1 \le i \le N}$, given in (row) matrix form, are two bases of the same lattice if and only if there exists $U \in \mathcal{M}_N(\mathbb{Z})$, $\det U \in \{\pm 1\}$, such that $C = UB$. As a consequence, the quantity $(\det CC^t)^{1/2} = (\det BB^t)^{1/2}$ is independent of the basis and is attached to the lattice itself; we shall call it the volume of the lattice and denote it by $\operatorname{vol} L$.

Given a basis $(b_1, \dots, b_N)$ of $L$ as input, finding a shortest nonzero vector in $L$ is called the shortest vector problem. The decision version of this problem has been shown [1] to be hard under randomized reductions; in practice, one thus has to content oneself with approximation algorithms, such as the LLL algorithm [43]:

Theorem 4.5 (Lenstra, Lenstra, Lovász, 1982). The LLL algorithm, given $N$ $\mathbb{R}$-linearly independent vectors $(b_1, \dots, b_N) \in \mathbb{Z}^M$, returns a basis $(c_1, \dots, c_N)$ of the lattice $L$ they generate such that $\|c_1\|_2 \le 2^{(N-1)/4}(\operatorname{vol} L)^{1/N}$ and $\|c_1\|_2 \le 2^{(N-1)/2}\min_{x \in L \setminus \{0\}}\|x\|_2$. One also has $\|c_2\|_2 \le 2^{(N-1)/4}(\operatorname{vol} L)^{1/(N-1)}$. The time complexity of the LLL algorithm is polynomial in the maximal bit-length of the coefficients of the $b_i$'s, the lattice rank $N$, and the space dimension $M$.

Proof. See Theorems 9 and 10 from [54, Chap. 2], except for the last inequality, on $\|c_2\|_2$, which is a consequence of the proof of Fact 3.3 in [7]. □

The constant $2$ in the inequalities of the theorem is arbitrary and could be replaced by any real number $> 4/3$. We now briefly discuss an improvement to the LLL algorithm, due to Akhavi & Stehlé [2], for the case where $N$ is much smaller than $M$, the dimension of the ambient space. Let $A$ be an $N \times M$ matrix the rows of which generate the lattice $L$; the idea is to reduce a smaller $N \times N$ matrix obtained by a random projection (i.e., by multiplying $A$ on the right by a random $M \times N$ matrix), and to apply the same transformation to the original matrix.

Theorem 4.6 (Akhavi & Stehlé, 2008). For all $N$, there is an $n_0(N)$ such that for $M \ge n_0(N)$, if $P$ is an $M \times N$ matrix whose columns are independent random vectors picked uniformly and independently inside the $M$-dimensional unit ball, and $A' = \mathrm{LLL}(A \cdot P)$, then, with probability $\ge 1 - 2^{-N}$, the first row vector of the matrix $A'(A \cdot P)^{-1}A$ is a vector of $L$ of norm $\le 2^{4N}(\operatorname{vol} L)^{1/N}$.

S. Torres [65] showed that this idea indeed improves the practical outcome of [60]. As for us (cf. Section 8), we noticed that this idea works even with simpler models for the random matrix $P$, such as random uniform $\{0, \pm 1\}$ coefficients, which give equally good results.
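In dimension $N = 2$, LLL essentially reduces to Lagrange-Gauss reduction, which makes the guarantees of Theorem 4.5 easy to observe on a toy lattice; the skewed basis below is an arbitrary illustrative example, not one of the lattices built in this paper.

```python
import math

def gauss_reduce(u, v):
    # Lagrange-Gauss reduction: the 2-dimensional analogue of LLL.
    # Returns a basis (u, v) of the same lattice, u a shortest nonzero vector.
    def n2(w):
        return w[0] * w[0] + w[1] * w[1]
    if n2(u) > n2(v):
        u, v = v, u
    while True:
        m = round((u[0] * v[0] + u[1] * v[1]) / n2(u))  # size-reduce v against u
        v = (v[0] - m * u[0], v[1] - m * u[1])
        if n2(v) >= n2(u):
            return u, v
        u, v = v, u

b1, b2 = (1, 10000), (0, 10007)  # an arbitrary, very skewed basis
u, v = gauss_reduce(b1, b2)
vol = abs(b1[0] * b2[1] - b1[1] * b2[0])
# The first vector satisfies the guarantee ||c_1|| <= 2^{(N-1)/4} (vol L)^{1/N}
assert math.hypot(*u) <= 2 ** 0.25 * math.sqrt(vol)
# The transformation is unimodular: the volume is preserved up to sign.
assert abs(u[0] * v[1] - u[1] * v[0]) == vol
print(u, v)
```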

5. The one-variable method, à la Liouville

Let $u, v \in \mathbb{N} \setminus \{0\}$, let $a < b$ be two real numbers and let $f : [a,b] \to \mathbb{R}$. The starting point of this approach to Problem 2.6 is a simple (but fundamental!) idea due to J. Liouville [44, 45, 46], which we now recall. For any $P \in \mathbb{Z}[X_1, X_2]$, for any $x \in [a,b]$, $y \in [f(x) - 1/v, f(x) + 1/v]$, we know, from the mean value theorem, that there exists $z$ between $vf(x)$ and $vy$ (hence $z \in [vf(x)-1, vf(x)+1]$) such that
\[(5.1)\quad P(ux, vf(x)) - P(ux, vy) = v(f(x) - y)\,\frac{\partial P}{\partial y}(ux, z).\]
Then we compute, by combining Chebyshev interpolation and lattice reduction, two polynomials $P_0, P_1 \in \mathbb{Z}[X_1, X_2]$ such that, for $i = 0, 1$, $|P_i(ux, vf(x))| < 1/2$, while $\left|\frac{\partial P_i}{\partial y}(ux, z_i)\right|$ is bounded by "not too large" an $M$. Let $x_0 = X/u \in [a,b]$, $X \in \mathbb{Z}$, and $y_0 = Y/v$, $Y \in \mathbb{Z}$; we have $P_i(ux_0, vy_0) = P_i(X, Y) \in \mathbb{Z}$. If, for instance, $P_0(X,Y)$ is nonzero, then it follows from (5.1) that
\[\underbrace{|P_0(X,Y)|}_{\ge 1} - \underbrace{|P_0(X, vf(X/u))|}_{< 1/2} \le v\,|f(X/u) - Y/v|\,\underbrace{\left|\frac{\partial P_0}{\partial y}(ux_0, z_0)\right|}_{\le M},\]
hence $|vf(X/u) - Y| > 1/(2M)$. Otherwise, we have $P_0(X,Y) = P_1(X,Y) = 0$. We now use our heuristic assumption, namely that $P_0$ and $P_1$ have no nonconstant common factor: we then perform elimination of one of the variables and retrieve the list of all the bad cases, i.e., the $X$ such that $|vf(X/u) - Y| \le 1/(2M)$.

In the sequel of this section, we give all the details of this approach: we first give estimates of the determinants of the lattices that we use, then we present our algorithm, prove its correctness and analyse its complexity.

5.1. Volume estimates for rigorous interpolants at the Chebyshev nodes. Let $N \ge 2$; for $i = 0, \dots, N-1$, let $f_i$ be a function defined over $[a,b]$ and let $Q_i$ be its interpolation polynomial in $\mathbb{R}_{N-1}[x]$ at the Chebyshev nodes of the first kind. Let DCT-II denote the discrete cosine transform of type 2:
\[\mathrm{DCT\text{-}II} : \mathbb{R}^N \to \mathbb{R}^N, \quad (x_0, \dots, x_{N-1}) \mapsto (X_0, \dots, X_{N-1}), \quad\text{with}\quad X_k = \sum_{0 \le \ell \le N-1} x_\ell \cos\left(\frac{k(\ell+1/2)\pi}{N}\right), \quad k = 0, \dots, N-1.\]
This function is often introduced with slightly different normalizations [63, 56] and we can take advantage of fast algorithms that make it possible to compute it in at most $\mathcal{O}(N \log N)$ operations. Recall, cf. Section 4.1.1, that for $i = 0, \dots, N-1$,
\[Q_i(x) = {\sum_{0 \le k \le N-1}}' c_{k,i}\,T_{k,[a,b]}(x) \in \mathbb{R}_{N-1}[x], \quad\text{with}\]
\[(5.2)\quad (c_{0,i}, \dots, c_{N-1,i}) = \frac{2}{N}\,\mathrm{DCT\text{-}II}(f_i(\mu_{N-1,[a,b]}), \dots, f_i(\mu_{0,[a,b]})).\]

Let us introduce two real parameters $\rho > 1$ and $\omega_0 \ge 0$ (to be chosen later on). We now assume that all the $f_i$'s are analytic in a neighbourhood of $E_{\rho,a,b}$. For $i = 0, \dots, N-1$, let $R_i = 4M_{\rho,a,b}(f_i)/(\rho^{N-1}(\rho-1))$. We have, by Proposition 4.1, $\|f_i - Q_i\|_{\infty,[a,b]} \le R_i$. If $\delta_{ij}$ denotes the Kronecker delta, we introduce the $N \times 2N$ matrix $A = (A_1\,A_2)$, where
\[(5.3)\quad A_1 = \left(c_{j,i}/2^{\delta_{j0}}\right)_{0 \le i,j \le N-1}, \quad A_2 = \left(\delta_{ij}\,\rho^{\omega_0} R_i\right)_{0 \le i,j \le N-1}.\]
Its rows generate the lattice that will be reduced in our algorithm. The (diagonal) right half of the matrix consists of weights that will be used for two tasks:

∙ (remainders) controlling that $P_0(ux, vf(x))$ and $P_1(ux, vf(x))$ are uniformly small, where $P_0$, $P_1$ are two polynomials, with integer coefficients, output by the lattice reduction process. Matrix $A_1$ helps us secure uniform smallness of the interpolation polynomial of $P_0$, resp. $P_1$, and matrix $A_2$ helps us secure smallness of the corresponding approximation remainders (this accounts for the presence of the $R_i$ term);
∙ (coefficients) controlling the size of the coefficients of the polynomial, hence the quality of the lower bound deduced from the output of Algorithm 2; this accounts for the presence of the $\rho^{\omega_0}$ term. We shall assume later on (in Section 5.2) that $4\rho^{\omega_0} v M_{\rho,a,b}(f)/(\rho^{N-1}(\rho-1)) < 1$. This is a necessary condition for the success of the method; otherwise, most of the $\rho^{\omega_0} R_i$ are too large, and the method is bound to fail. Note that this assumption can be made without loss of generality on $f$, since for fixed $f, a, b, \omega_0, \rho$ it holds for $N$ large enough.

We now establish a slightly improved version of [60, Theorem 2]¹¹. Consider an $N \times M$ matrix $B$ whose rows span a lattice $L$. We assume that the entries of $B$ satisfy $|B_{i,j}| \le rr_i \cdot cc_j$ for some $rr_i$'s and $cc_j$'s, $0 \le i \le N-1$ and $0 \le j \le M-1$. As mentioned in [60], this is typical for Coppersmith-type lattice bases.

Theorem 5.1. Let $B$ be an $N \times M$ matrix (with $M \ge N$), the entries of which are bounded by the products of some quantities $rr_i$'s and $cc_j$'s as described above. Let $L$ be the lattice spanned by the rows of the matrix $B$, and let $\mathbf{P}$ be the product of the $N$ largest $cc_j$'s. We have:
\[\operatorname{vol} L = (\det BB^t)^{1/2} \le \binom{M}{N}^{1/2} N^{N/2} \left(\prod_{i=0}^{N-1} rr_i\right)\mathbf{P}.\]

¹¹ Note that there is an inaccuracy in the statement of [60, Theorem 2]: $\sqrt{n}$ should be replaced with $\binom{n}{d}^{1/2} d!^{1/2}$.

Proof. Let us denote by $\mathbf{C}_0, \dots, \mathbf{C}_{M-1}$ the columns of $B$. The classical Lagrange identity, which is a particular case of the Cauchy-Binet formula [25, 36], states that
\[(5.4)\quad \det BB^t = \sum_{0 \le j_1 < \dots < j_N \le M-1} \det(\mathbf{C}_{j_1}, \dots, \mathbf{C}_{j_N})^2 \le \binom{M}{N} \max_{0 \le j_1 < \dots < j_N \le M-1} \det(\mathbf{C}_{j_1}, \dots, \mathbf{C}_{j_N})^2.\]
We can assume $\prod_{i=0}^{N-1} rr_i \ne 0$: otherwise, at least one row of $B$ is identically $0$, hence $\operatorname{vol} L = 0$. Now, for a given $0 \le j_1 < \dots < j_N \le M-1$, if one of the $cc_{j_k}$ is zero, then at least one column of $(\mathbf{C}_{j_1}, \dots, \mathbf{C}_{j_N})$ is zero, hence $\det(\mathbf{C}_{j_1}, \dots, \mathbf{C}_{j_N}) = 0$. Otherwise, we consider the matrix $(\mathbf{C}'_{j_1} \dots \mathbf{C}'_{j_N})$ obtained from $(\mathbf{C}_{j_1} \dots \mathbf{C}_{j_N})$ by dividing the $i$-th row by $rr_i$ for all $i = 0, \dots, N-1$ and the $j_k$-th column by $cc_{j_k}$ for all $k = 1, \dots, N$. All the coefficients of $(\mathbf{C}'_{j_1} \dots \mathbf{C}'_{j_N})$ have absolute value less than or equal to $1$. Hadamard's inequality then implies $\det(\mathbf{C}'_{j_1}, \dots, \mathbf{C}'_{j_N})^2 \le N^N$. It follows that
\[\det(\mathbf{C}_{j_1}, \dots, \mathbf{C}_{j_N})^2 = \left(\prod_{i=0}^{N-1} rr_i\right)^2 \left(\prod_{k=1}^{N} cc_{j_k}\right)^2 \det(\mathbf{C}'_{j_1}, \dots, \mathbf{C}'_{j_N})^2 \le \left(\prod_{i=0}^{N-1} rr_i\right)^2 \mathbf{P}^2 N^N.\]
We conclude by combining the last inequality with (5.4). □

Then, we upper bound $\det AA^t$, where $A$ is the matrix defined by (5.3).
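Both the identity (5.4) and the bound of Theorem 5.1 can be verified exactly on a toy matrix using rational arithmetic; the matrix and the $rr_i$'s, $cc_j$'s below are arbitrary illustrative values, not a lattice arising from the method.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# A 2x4 matrix whose entries satisfy |B[i][j]| <= rr[i] * cc[j].
rr = [Fraction(2), Fraction(3)]
cc = [Fraction(1), Fraction(1, 2), Fraction(4), Fraction(1, 8)]
B = [[Fraction(2), Fraction(-1), Fraction(5), Fraction(1, 4)],
     [Fraction(-3), Fraction(1), Fraction(12), Fraction(-1, 4)]]
assert all(abs(B[i][j]) <= rr[i] * cc[j] for i in range(2) for j in range(4))

def det2(cols):
    (a, c), (b, d) = cols  # two columns, each given as (row 0, row 1)
    return a * d - b * c

# det(B B^t) via the Cauchy-Binet / Lagrange identity (5.4)
cols = list(zip(*B))
cauchy_binet = sum(det2([cols[j1], cols[j2]]) ** 2
                   for j1, j2 in combinations(range(4), 2))
BBt = [[sum(B[i][k] * B[j][k] for k in range(4)) for j in range(2)]
       for i in range(2)]
assert cauchy_binet == BBt[0][0] * BBt[1][1] - BBt[0][1] * BBt[1][0]

# Theorem 5.1: vol^2 <= C(M,N) * N^N * (prod rr_i)^2 * P^2,
# with P the product of the N largest cc_j (here 4 * 1).
P = Fraction(4) * Fraction(1)
bound_sq = comb(4, 2) * 2 ** 2 * (rr[0] * rr[1]) ** 2 * P ** 2
assert cauchy_binet <= bound_sq
print("Cauchy-Binet identity and Theorem 5.1 bound verified")
```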

Theorem 5.2. Let $\rho > 1$, $\omega_0 \ge 0$, $a < b$, $N \ge 2$, and let $f_0, \dots, f_{N-1}$ be functions analytic in a neighbourhood of $E_{\rho,a,b}$. We have
\[(\det AA^t)^{1/2} \le (64N)^{N/2} \left(\frac{\rho}{\rho-1}\right)^N \frac{\prod_{i=0}^{N-1} M_{\rho,a,b}(f_i)}{\rho^{N(N-1)/2 + \lfloor\omega_0\rfloor(\lfloor\omega_0\rfloor - 2\omega_0 + 1)/2}}.\]

Proof. For $0 \le i, j \le N-1$, we have, from Proposition 4.1 and Lemma A.6,
\[|A_{1,i,j}| \le \left|\frac{c_{j,i}}{2^{\delta_{j0}}}\right| \le 2M_{\rho,a,b}(f_i)\,\frac{1}{\rho^j}\,\frac{\rho^2+1}{\rho^2-1} \le 2\,\frac{M_{\rho,a,b}(f_i)}{\rho^j}\,\frac{\rho}{\rho-1}.\]
Now, let us write, for $0 \le i, j \le N-1$,
\[|A_{2,i,j}| \le \rho^{\omega_0} R_i = 4\rho^{\omega_0}\,\frac{M_{\rho,a,b}(f_i)}{\rho^{N-1}(\rho-1)} = 4\rho^{\omega_0}\,\frac{\rho}{\rho-1}\,\frac{M_{\rho,a,b}(f_i)}{\rho^N}.\]

In view of these estimates, we can apply Theorem 5.1 with $rr_i = 2\rho/(\rho-1)\,M_{\rho,a,b}(f_i)$ for $i = 0, \dots, N-1$, and
\[cc_j = \begin{cases} \rho^{-j}, & 0 \le j \le N-1, \\ 2\rho^{\omega_0 - N}, & N \le j \le 2N-1. \end{cases}\]
Hence, $\mathbf{P}$ is the maximum, for $s = 0, \dots, N$, of
\[\prod_{k=1}^{s} \frac{1}{\rho^{k-1}} \prod_{k=s+1}^{N} \frac{2\rho^{\omega_0}}{\rho^N} = \frac{(2\rho^{\omega_0})^{N-s}}{\rho^{s(s-1)/2}\,\rho^{N(N-s)}}.\]

Finally, Theorem 5.1 yields
\[(\det AA^t)^{1/2} \le \binom{2N}{N}^{1/2} N^{N/2}\,2^N \left(\frac{\rho}{\rho-1}\right)^N \max_{0 \le s \le N} \frac{(2\rho^{\omega_0})^{N-s}}{\rho^{s(s-1)/2+N(N-s)}} \prod_{i=0}^{N-1} M_{\rho,a,b}(f_i) \le (64N)^{N/2} \left(\frac{\rho}{\rho-1}\right)^N \max_{0 \le s \le N} \frac{\rho^{\omega_0(N-s)}}{\rho^{s(s-1)/2+N(N-s)}} \prod_{i=0}^{N-1} M_{\rho,a,b}(f_i).\]
Then, if $P_s = \rho^{\omega_0(N-s)}/\rho^{s(s-1)/2+N(N-s)}$, we have $P_{s+1}/P_s = \rho^{N-s-\omega_0}$, hence $P_s$ is maximal for $s = N - \lfloor\omega_0\rfloor$, which completes the proof of the Theorem. □

We now specialize the previous estimate to our situation, where we shall use the ordered list of functions $[f_i,\ 0 \le i \le (d+1)(d+2)/2 - 1] = [x \mapsto u^k x^k v^\ell f(x)^\ell,\ \ell = 0, \dots, d,\ k = 0, \dots, d-\ell]$. Let the $c_{j,i}$'s be defined by (5.2), using the same ordering for the functions, and let $[R_{2,i},\ i = 0, \dots, N-1]$ be the ordered list
\[(5.5)\quad \left[4\,\frac{u^k M_{\rho,a,b}(x)^k\,v^\ell M_{\rho,a,b}(f)^\ell}{\rho^{N-1}(\rho-1)};\ \ell = 0, \dots, d,\ k = 0, \dots, d-\ell\right].\]
We denote again by $A = (A_1\,A_2)$ the $N \times 2N$ matrix defined by¹²

\[(5.6)\quad A_1 = \left(c_{j,i}/2^{\delta_{j0}}\right)_{0 \le i,j \le N-1}, \quad A_2 = \left(\delta_{ij}\,\rho^{\omega_0} R_{2,i}\right)_{0 \le i,j \le N-1}.\]

Corollary 5.3. Let $\rho > 1$, $a < b$, and let $f$ be a function analytic in a neighbourhood of $E_{\rho,a,b}$. Let $d \ge 1$, $N = (d+1)(d+2)/2$, $\omega_0 \ge 0$, $u, v \in \mathbb{N} \setminus \{0\}$. Define $f_{k,\ell}(x) = u^k x^k v^\ell f(x)^\ell$ for $0 \le \ell \le d$, $0 \le k \le d-\ell$, the matrices $A_1, A_2, A = (A_1 A_2)$ as in (5.6), and the quantity $\Delta_{N,[a,b],\omega_0} := (\det AA^t)^{1/2}$. We have
\[(5.7)\quad \Delta_{N,[a,b],\omega_0}^{1/(N-1)} \le 30\sqrt{N}\left(\frac{\rho}{\rho-1}\right)^{N/(N-1)} \frac{(uv)^{2N/(3(d+3))}}{\rho^{N/2 + \lfloor\omega_0\rfloor(\lfloor\omega_0\rfloor - 2\omega_0 + 1)/(2(N-1))}} \left(\left(\frac{b-a}{2}\,\frac{\rho+\rho^{-1}}{2} + \left|\frac{b+a}{2}\right|\right) M_{\rho,a,b}(f)\right)^{2N/(3(d+3))}.\]
Proof. This follows from Theorem 5.2, the fact that $M_{\rho,a,b}(x^k f(x)^\ell) \le M_{\rho,a,b}(x)^k M_{\rho,a,b}(f)^\ell$, the inequality
\[M_{\rho,a,b}(x) \le \frac{b-a}{2}\,\frac{\rho+\rho^{-1}}{2} + \left|\frac{b+a}{2}\right|,\]
and finally the fact that $(8\sqrt{N})^{N/(N-1)} \le 30\sqrt{N}$ for $N \ge 3$. □

5.2. Statement of the algorithms. Our main routine is Algorithm 2. It comes together with Algorithm 1, which mainly constructs the lattice to be reduced in Algorithm 2. We shall make use of the following notation: for any $x \in \mathbb{R}$, $[x]_0 = \lfloor x \rfloor$ if $x \ge 0$ and $\lceil x \rceil$ otherwise. Before stating the algorithms, we explain how we turn our problem, which leads to the reduction of a sublattice of $\mathbb{R}^{2N}$, into a problem leading to the reduction of a sublattice of $\mathbb{Z}^{2N}$. From a mathematical point of view, the lattice that we would ideally work with is the one generated by the rows of $A$ defined in (5.6), the volume of which is estimated in Corollary 5.3. And yet, in order to perform lattice reduction computations, it is safer to work with lattices given by vectors defined over $\mathbb{Z}$, hence the introduction of $\hat{A} = (\hat{A}_1\,\hat{A}_2)$:
\[\hat{A}_1 = \left(\left[2^{\mathrm{tprec}} A_1[i,j]\right]_0 / 2^{\mathrm{tprec}}\right)_{0 \le i,j \le N-1}, \quad \hat{A}_2 = \left(\left\lfloor 2^{\mathrm{tprec}} A_2[i,j]\right\rfloor / 2^{\mathrm{tprec}}\right)_{0 \le i,j \le N-1},\]
where $\mathrm{tprec} = \lceil -\log_2(\min_{0 \le i \le N-1} A_2[i,i]) + \log_2(N) \rceil + 5$. The integer tprec corresponds to a truncation precision that will allow us to work over $\mathbb{Z}^N$ while keeping enough information from $A$.

Remark 5.4. By construction, $|\hat{A}[i,j]| \le |A[i,j]|$ for all $i, j$. Hence, Theorem 5.2 and its corollaries, which proceed by upper bounding the absolute values of the coefficients of $A$ and applying Theorem 5.1, also hold for $(\det \hat{A}\hat{A}^t)^{1/2}$.

¹² This is the same definition as (5.3), where we have specialized the $f_i$.

Note that the matrices $M_c$ and $M_r$ computed in Algorithm 1 correspond to the scaled matrices $2^{\mathrm{tprec}}\hat{A}_1$ and $2^{\mathrm{tprec}}\hat{A}_2$. The rows of $\hat{A}$ generate the lattice that will be reduced in our algorithm.

Lemma 5.5. The $\mathbb{Z}$-module generated by the rows of $\hat{A}$ is a lattice of rank $N$.

Proof. Recall that $\mathrm{tprec} = \lceil -\log_2(\min_{0 \le i \le N-1} A_2[i,i]) + \log_2(N) \rceil + 5$. Thus, for $i = 0, \dots, N-1$, $2^{\mathrm{tprec}} A_2[i,i] \ge 2^5 N$, hence $\hat{A}_2[i,i] \ge 2^{5-\mathrm{tprec}} N > 0$. This shows that the matrix $\hat{A}_2$ is an invertible diagonal matrix, so that the matrix $\hat{A}$ has full rank $N$. □

We now give an equivalent but more convenient form for tprec. In order to do that, we henceforth assume that the set $u[a,b]$, resp. $vf([a,b])$, contains at least one nonzero integer $n_x$, resp. $n_f$; note that this assumption entails no loss of generality with respect to our problem, since if it does not hold the problem is trivial.

Lemma 5.6. We have

\[\mathrm{tprec} = \lceil -\log_2(\rho^{\omega_0} R_{2,0}) + \log_2(N) \rceil + 5 = \lceil (N - \omega_0 - 1)\log_2(\rho) + \log_2(\rho-1) + \log_2(N) \rceil + 3\]
and
\[8N\rho^{N-\omega_0-1}(\rho-1) \le 2^{\mathrm{tprec}} \le 16N\rho^{N-\omega_0-1}(\rho-1).\]

Proof. From our assumption, we have $uM_{\rho,a,b}(x) \ge |n_x| \ge 1$ and $vM_{\rho,a,b}(f) \ge |n_f| \ge 1$. It then follows that $R_{2,i} \ge 4/(\rho^{N-1}(\rho-1)) = R_{2,0}$ for all $i$. Therefore, we get $\mathrm{tprec} = \lceil -\log_2(\rho^{\omega_0} R_{2,0}) + \log_2(N) \rceil + 5$. □
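Both expressions of Lemma 5.6, and the enclosing powers of two, are straightforward to check numerically; the parameter values below ($\rho = 1.3$, $\omega_0 = 0.7$, $d = 3$) are arbitrary sample choices:

```python
import math

rho, omega0, d = 1.3, 0.7, 3
N = (d + 1) * (d + 2) // 2
R20 = 4 / (rho ** (N - 1) * (rho - 1))  # R_{2,0}, cf. (5.5) with k = l = 0
# First expression of tprec (the definition)...
tprec = math.ceil(-math.log2(rho ** omega0 * R20) + math.log2(N)) + 5
# ...and the equivalent closed form of Lemma 5.6.
tprec_bis = math.ceil((N - omega0 - 1) * math.log2(rho)
                      + math.log2(rho - 1) + math.log2(N)) + 3
assert tprec == tprec_bis
# 8 N rho^{N - omega0 - 1} (rho - 1) <= 2^tprec <= 16 N rho^{N - omega0 - 1} (rho - 1)
X = N * rho ** (N - omega0 - 1) * (rho - 1)
assert 8 * X <= 2 ** tprec <= 16 * X
print(tprec)
```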

5.2.1. Heuristic character of Algorithm 2. We will see in the sequel of this section that, for a suitable choice of parameters, the condition stated at Step 4 can always be satisfied. On the other hand, we do not know how to simultaneously guarantee both this condition and the condition stated at Step 8. It is even likely that this is not possible when $f$ is close to an algebraic function of small height.

5.3. Practical remarks. Algorithm 1 has been written with readability in mind. We now add some practical clarifications. We will discuss some experiments in Section 8.

Algorithm 1 Computation of the lattice to be reduced (1D approach)

Input: Two real numbers $a < b$; $f$ a transcendental function analytic in a complex neighbourhood of $[a,b]$; three positive integers $d, u, v$; two real numbers $\rho > 1$, $\omega_0 \ge 0$ such that $4\rho^{\omega_0} v M_{\rho,a,b}(f) < \rho^{N-1}(\rho-1)$, where $N = (d+1)(d+2)/2$.
Output: Two matrices $M_c, M_r \in \mathcal{M}_N(\mathbb{Z})$, respectively storing scaled values of the coefficients and of the remainders, and an integer tprec, which is the truncation precision.
1: $R_{\omega_0} \leftarrow \frac{4\rho^{\omega_0}}{\rho^{N-1}(\rho-1)}$; $\mathrm{tprec} \leftarrow \lceil -\log_2(R_{\omega_0}) + \log_2(N) \rceil + 5$
2: $L_{cheb} \leftarrow \left[\frac{b-a}{2}\cos\left((j+1/2)\frac{\pi}{N}\right) + \frac{a+b}{2}\right]_{0 \le j \le N-1}$ // Computation of the Chebyshev nodes, listed in reverse order
3: $M_c \leftarrow [0]_{N \times N}$; $M_r \leftarrow [0]_{N \times N}$; $i \leftarrow 0$ // The row index $i$ is incremented at Step 15.
4: $B_x \leftarrow \left|\frac{a+b}{2}\right| + \frac{b-a}{4}(\rho + \rho^{-1})$
5: $g \leftarrow \left(x \mapsto \left|f\left(\frac{a+b}{2} + \frac{b-a}{4}(\rho\exp(ix) + \rho^{-1}\exp(-ix))\right)\right|\right)$
6: $B_f \leftarrow \max(g([0, 2\pi]))$
7: for $\ell = 0$ to $d$ do
8:  for $k = 0$ to $d - \ell$ do
9:   $\varphi \leftarrow (x \mapsto (ux)^k (vf(x))^\ell)$ // We compute the coefficient matrix: for each function, we compute its values at the points of $L_{cheb}$, apply the DCT and scale.
10:   $U \leftarrow [\varphi(L_{cheb}[0]), \dots, \varphi(L_{cheb}[N-1])]$
11:   $L_{DCT} \leftarrow \frac{2}{N}\,\mathrm{DCT\text{-}II}(U)$; $L_{DCT}[0] \leftarrow \frac{1}{2}L_{DCT}[0]$
12:   for $j = 0$ to $N-1$ do
13:    $M_c[i,j] \leftarrow [2^{\mathrm{tprec}} L_{DCT}[j]]_0$ // Scaling of the coefficient matrix
14:   end for
15:   $M_r[i,i] \leftarrow \lfloor 2^{\mathrm{tprec}} R_{\omega_0} (uB_x)^k (vB_f)^\ell \rfloor$; $i \leftarrow i+1$ // We compute the scaled remainder matrix.
16:  end for
17: end for
18: Return $M_c, M_r, \mathrm{tprec}$

5.3.1. Efficiency. The number of DCT calls (see Step 11 in Algorithm 1) can be reduced from $O(N)$ to $O(d)$ by noticing that the DCT $\delta'$ of the vector $u'$ associated to $x\varphi(x)$ can be deduced from the DCT $\delta$ of the vector $u$ associated to $\varphi(x)$ via the following formulas, which are easily deduced from the recurrence relation $2xT_n(x) = T_{n+1}(x) + T_{n-1}(x)$:
∙ $\delta'[0] = (b-a)\delta[1]/2 + (b+a)\delta[0]/2$,
∙ $\delta'[k] = (b-a)(\delta[k-1] + \delta[k+1])/4 + (b+a)\delta[k]/2$, for $1 \le k \le n-2$,
∙ $\delta'[n-1] = (b-a)\delta[n-2]/4 + (b+a)\delta[n-1]/2$.
This gives, in practice, a significant speedup in the construction of the matrix for large $d$. Note that these formulas must be applied to $L_{DCT}$ before the renormalization instruction $L_{DCT}[0] \leftarrow \frac{1}{2}L_{DCT}[0]$. A similar strategy applies to the computation of the remainder matrix $M_r$.
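Such an update rule for the raw (pre-renormalization) DCT-II output can be verified by brute force against a direct transform of the products $x_j\varphi(x_j)$; the interval and the sampled function below are arbitrary:

```python
import math

def dct2(u):
    # Naive O(n^2) DCT-II, with the normalization used in the text
    n = len(u)
    return [sum(u[l] * math.cos(k * (l + 0.5) * math.pi / n) for l in range(n))
            for k in range(n)]

a, b, n = 0.25, 1.5, 9
nodes = [(b - a) / 2 * math.cos((j + 0.5) * math.pi / n) + (a + b) / 2
         for j in range(n)]
phi = [math.exp(x) for x in nodes]  # any sampled function

d = dct2(phi)
# Update rule derived from 2x T_n(x) = T_{n+1}(x) + T_{n-1}(x):
dp = [0.0] * n
dp[0] = (b - a) * d[1] / 2 + (b + a) * d[0] / 2
for k in range(1, n - 1):
    dp[k] = (b - a) * (d[k - 1] + d[k + 1]) / 4 + (b + a) * d[k] / 2
dp[n - 1] = (b - a) * d[n - 2] / 4 + (b + a) * d[n - 1] / 2

direct = dct2([x * p for x, p in zip(nodes, phi)])
assert all(abs(s - t) < 1e-9 for s, t in zip(dp, direct))
print("update rule verified")
```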

Algorithm 2 1D approach to Problem 2.6

Input: Two real numbers $a < b$; $f$ a transcendental function analytic in a complex neighbourhood of $[a,b]$; three positive integers $d, u, v$; two real numbers $\rho > 1$, $\omega_0 \ge 0$ such that $4\rho^{\omega_0} v M_{\rho,a,b}(f) < \rho^{N-1}(\rho-1)$, where $N = (d+1)(d+2)/2$.
Output: If successful, return $K \in \mathbb{R}_{>0}$ and a list $\mathcal{L}$ of integers, $\#\mathcal{L} \le d^2$, such that for all integers $X$ with $a \le X/u \le b$ and $X \notin \mathcal{L}$, and for all integers $Y$, we have $\left|f\left(\frac{X}{u}\right) - \frac{Y}{v}\right| \ge 1/K$. The bound $K$ is guaranteed to be at most $\frac{d\rho^{N-\omega_0-1}(\rho-1)}{2M_{\rho,a,b}(f)}$.
1: $(M_c, M_r, \mathrm{tprec}) \leftarrow$ Algorithm 1($a, b, f, d, u, v, \rho, \omega_0$)
2: $M_{LLL} \leftarrow$ LLL-reduce the rows of $(M_c \mid M_r)$
3: $U \leftarrow M_{LLL,r}\,M_r^{-1}$ // This is the LLL change-of-basis matrix; $M_{LLL,r}$ is the right part of the matrix $M_{LLL}$. Note that $M_r$ is diagonal.
4: if $\max(\|(M_{LLL}[0,j])_{0 \le j \le 2N-1}\|_2, \|(M_{LLL}[1,j])_{0 \le j \le 2N-1}\|_2) \le 2^{\mathrm{tprec}}/(2N)$ then
5:  $[L_m[j],\ j = 0, \dots, N-1] \leftarrow [X_1^k X_2^\ell$ for $k = 0$ to $d-\ell$ for $\ell = 0$ to $d]$ // List of monomials, ordered in a way compatible with Algorithm 1, Steps 7-9.
6:  $P_0 \leftarrow \sum_{j=0}^{N-1} U[0,j]\,L_m[j]$; $P_1 \leftarrow \sum_{j=0}^{N-1} U[1,j]\,L_m[j]$
7:  $R(X_1) \leftarrow \mathrm{Res}_{X_2}(P_0(X_1, X_2), P_1(X_1, X_2))$
8:  if $R(X_1) \ne 0$ then
9:   $\mathcal{L} \leftarrow \{t \in \mathbb{Z};\ R(t) = 0\}$
10:   $(B_0, B_1) \leftarrow (0, 0)$ // Monomials in $X_1$ only do not contribute to the $B_i$'s; given the ordering of $L_m$, this is accounted for by starting the loop at $k = d+1$.
11:   for $k = d+1$ to $N-1$ do
12:    $J \leftarrow \frac{dL_m[k]}{dX_2}(X_1 = u\max(|a|,|b|),\ X_2 = v\max|f|([a,b]))$
13:    $B_0 \leftarrow B_0 + |U[0,k]|\,J$; $B_1 \leftarrow B_1 + |U[1,k]|\,J$
14:   end for
15:   return $K = 4v\max(B_0, B_1, d/2)$, $\mathcal{L}$
16:  else
17:   return "FAIL"
18:  end if
19: else
20:  return "FAIL"
21: end if

5.3.2. Overestimation issues. We now discuss the instruction $B_f \leftarrow \max(g([0, 2\pi]))$ at Step 6 of Algorithm 1. For some functions, such as $\exp$ or $\Gamma$, we can take advantage of a closed expression for this maximum. Otherwise, we either develop a dedicated routine to derive a tight estimate of this value, or we use interval or ball arithmetic [67, 32] to quickly obtain an upper bound, which may then raise overestimation issues. So far, our experiments, which use Arb¹³, resp. MPFI¹⁴, for every ball, resp. interval, arithmetic based computation, did not show any problematic overestimation; we thus did not have to develop dedicated routines. This means that we may overestimate $B_f$, and thus $M_r$, in implementations of the Algorithm. Still, if the condition stated at Step 4 of Algorithm 2 is satisfied with these (possibly overestimated) computed values, then this condition is also satisfied for the actual values, and Theorem 5.8 and Corollary 5.11 apply.

¹³ https://arblib.org/
¹⁴ https://gitlab.inria.fr/mpfi

5.3.3. Rounding issues. In Algorithm 1, the computation of tprec at Step 1, as well as the computations performed at Steps 13 and 15, as written, require correct rounding, and may raise issues such as those this paper aims at solving. In this context, we can however quite easily avoid them, using the classical remark that if we let $\tilde{x}$ be an underapproximation of $x$ with $\tilde{x} \le x \le \tilde{x} + 1/2$, then we have $\lceil x \rceil \in \{\lceil \tilde{x} \rceil, \lceil \tilde{x} \rceil + 1\}$. Similarly, if $\tilde{x}$ is an approximation of a nonzero $x$ such that $|\tilde{x}| - 1/2 \le |x| \le |\tilde{x}|$, we get $[x]_0 \in \{[\tilde{x}]_0, [\tilde{x}]_0 - \mathrm{sgn}(x)\}$. It is well known that such approximations are easy to compute, either by using floating-point arithmetic with sufficient intermediate precision, ensuring that we work with over/under-approximations by choosing a suitable rounding mode for each operation, or by using ball arithmetic as provided, for instance, by Arb. In the sequel, we denote by $\mathrm{tprec}_{comp}$, $M_{c,comp}$, $M_{r,comp}$ the quantities computed in this way. Note that $M_c$ and $M_r$ are then defined with $\mathrm{tprec}_{comp}$ instead of tprec. We have $\mathrm{tprec}_{comp} \in \{\mathrm{tprec}, \mathrm{tprec}+1\}$, $M_c[i,j] - M_{c,comp}[i,j] \in \{0, \mathrm{sgn}(M_c[i,j])\}$, and $M_r[i,j] - M_{r,comp}[i,j] \in \{0, 1\}$ for $i, j = 0, \dots, N-1$. The question of the required intermediate precision will be addressed in Appendix C. Note that in this setting, Remark 5.4 must be replaced by:
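The two-candidate trick of Section 5.3.3 can be illustrated with exact rationals standing in for the computed approximations (the specific values below are arbitrary):

```python
import math
from fractions import Fraction

def ceil_candidates(x_tilde):
    # If x_tilde <= x <= x_tilde + 1/2, then ceil(x) lies in this set.
    return {math.ceil(x_tilde), math.ceil(x_tilde) + 1}

x = Fraction(73, 9)          # exact value, unknown to the algorithm
x_tilde = Fraction(161, 20)  # underapproximation: x_tilde <= x <= x_tilde + 1/2
assert x_tilde <= x <= x_tilde + Fraction(1, 2)
assert math.ceil(x) in ceil_candidates(x_tilde)
print(math.ceil(x), ceil_candidates(x_tilde))
```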

−tpreccomp Remark 5.7. Let 퐴^comp = 2 (푀푐,comp 푀푟,comp) be the actual computed matrix, ⃒ ⃒ ⃒ ^ ⃒ we notice that since ⃒퐴comp[푖, 푗]⃒ 6 |퐴[푖, 푗]| for all 푖, 푗, the same argument as in Remark 5.4 applies: Theorem 5.2 and its corollaries hold for 퐴^comp. 푘 푘 5.3.4. Newton polynomials. In practice, we replace the monomial functions {푢 푥 }06푘6푑 with Newton polynomial functions {푢푥(푢푥 − 1) ··· (푢푥 − 푘 + 1)/푘!}06푘6푑. In both 푘 cases, the substitution 푥 = 푋/푢 yields integer values - respectively {푋 }06푘6푑 and {푋(푋 − 1) ··· (푋 − 푘 + 1)/푘!}06푘6푑. Likewise, we replace the “monomials” 푘 푘 {푣 푓(푥) }06푘6푑 with “Newton polynomials” {푣푓(푥)(푣푓(푥) − 1) ··· (푣푓(푥) − 푘 + 1)/푘!}06푘6푑. Hence, the changes to operate are: ∙ Step9, Algorithm1, (︁ (︁∏︀푘 )︁∏︀ (︁ ℓ )︁)︁ 휙 ← 푥 ↦→ 푗=1(푢푥 − 푗 + 1)/푗 푗=1(푣푓(푥) − 푗 + 1)/푗 , ∙ Step 15, Algorithm1, ⌊︁ ⌋︁ 푀 [푖, 푖] ← 2tprec 4 ∏︀푘 푢퐵푥+푗−1 ∏︀ℓ 푣퐵푓 +푗−1 , 푟 휌푁−휔0−1(휌−1) 푗=1 푗 푗=1 푗 ∙ Step5, Algorithm2, [︁ ]︁ ∏︀푘 푋1−푗+1 ∏︀ℓ 푋2−푗+1 퐿푚 ← 푗=1 푗 푗=1 푗 for 푘 = 0 to 푑 − ℓ for ℓ = 0 to 푑 . The use of Newton polynomials leads to smaller uniform norms, hence makes it possible to tackle larger intervals for a same 푑. This improvement is asymptotically negligible (it contributes to a lower order term) but is quite significant in practice. Note that the iterations described in Section 5.3.1 can easily be adapted to the case of Newton polynomials. 5.4. Proof of correctness. In this section, we shall prove the correctness of Algorithm2. This is done in two steps: first, in Subsection 5.4.1 we prove that if the Algorithm does not return FAIL, then the output is indeed as specified; second, in Subsection 5.4.2, we prove that for suitable choices of parameters, Algorithm2 may not return FAIL at Step 20. Recall 푁 = (푑 + 1)(푑 + 2)/2. 5.4.1. Proof of correctness of the output in case of success. This part is devoted to the proof of the following. 22 N. BRISEBARRE AND G. HANROT

Theorem 5.8. Let $d, u, v$ be nonzero integers, $N = (d+1)(d+2)/2$, and let $(f_j)_{0 \le j \le N-1} = (u^k x^k v^\ell f(x)^\ell)_{0 \le \ell \le d,\ 0 \le k \le d-\ell}$. Let $\omega_0 \ge 0$ and $\rho > 1$ be such that there exists $\Lambda = (\lambda_{k,\ell})_{0 \le \ell \le d,\ 0 \le k \le d-\ell} \in \mathbb{Z}^N$ with $\|\Lambda\hat{A}\|_2 \le 1/(2N)$, and let $P(X_1, X_2) = \sum_{0 \le k+\ell \le d} \lambda_{k,\ell} X_1^k X_2^\ell$. Then, we have
(1) $\max_{x \in [a,b]} |P(ux, vf(x))| < 1/2$;
(2) $\max\limits_{\substack{x \in [a,b],\ z \in f([a,b]) \\ 0 \le |y-z| \le |z|/(2d)}} \left|\frac{P(ux,vz) - P(ux,vy)}{z-y}\right| < \frac{B}{2} < \frac{d\rho^{N-\omega_0-1}(\rho-1)}{4M_{\rho,a,b}(f)}$, where
\[B = 4v \sum_{\substack{1 \le \ell \le d \\ 0 \le k \le d-\ell}} \ell\,|\lambda_{k,\ell}|\,u^k \max(|a|,|b|)^k\,v^{\ell-1}\|f\|_{\infty,[a,b]}^{\ell-1}.\]

Proof. The assumption $\|\Lambda\hat{A}\|_2 \le 1/(2N)$ implies in particular, since $\hat{A}_2$ is diagonal,
\[1/(2N) \ge |\lambda_{k,\ell}| \min_i \hat{A}_2[i,i] \ge 2^{5-\mathrm{tprec}} N\,|\lambda_{k,\ell}| \quad\text{for all } k, \ell,\]
cf. the proof of Lemma 5.5, hence $|\lambda_{k,\ell}| \le 2^{\mathrm{tprec}-6}/N^2$. For $j = 0, \dots, 2N-1$, we have $|(\Lambda A)[j] - (\Lambda\hat{A})[j]| \le \sum_{0 \le k+\ell \le d} |\lambda_{k,\ell}|\,2^{-\mathrm{tprec}} \le 2^{-6}/N$, from which follows $\|\Lambda A\|_1 \le \|\Lambda\hat{A}\|_1 + 2^{-5}$.

Let $P$ be as in the statement of the Theorem, and let $Q(x) = \sum_{j=0}^{N-1} q_j T_{j,[a,b]}(x)$ be the interpolation polynomial of $P(ux, vf(x))$ at the order-$N$ Chebyshev nodes of the first kind. Then the coordinates of $\Lambda A_1$ are exactly the $q_j$, $0 \le j \le N-1$. Proposition 4.1 shows that
\[\max_{x \in [a,b]} |Q(x) - P(ux, vf(x))| \le 4\,\frac{M_{\rho,a,b}(P)}{\rho^{N-1}(\rho-1)} \le 4 \sum_{0 \le k+\ell \le d} |\lambda_{k,\ell}|\,\frac{u^k M_{\rho,a,b}(x)^k\,v^\ell M_{\rho,a,b}(f)^\ell}{\rho^{N-1}(\rho-1)},\]
hence
\[\max_{x \in [a,b]} |P(ux, vf(x))| \le \max_{x \in [a,b]} |Q(x)| + 4 \sum_{0 \le k+\ell \le d} |\lambda_{k,\ell}|\,\frac{u^k M_{\rho,a,b}(x)^k\,v^\ell M_{\rho,a,b}(f)^\ell}{\rho^{N-1}(\rho-1)}.\]
As $\max_{x \in [a,b]} |T_{k,[a,b]}(x)| = 1$ for all $k$, we have $\max_{x \in [a,b]} |Q(x)| \le \sum_{0 \le j \le N-1} |q_j|$, so that:
\[(5.8)\quad \max_{x \in [a,b]} |P(ux, vf(x))| \le \sum_{0 \le j \le N-1} |q_j| + 4\rho^{\omega_0} \sum_{0 \le k+\ell \le d} |\lambda_{k,\ell}|\,\frac{u^k M_{\rho,a,b}(x)^k\,v^\ell M_{\rho,a,b}(f)^\ell}{\rho^{N-1}(\rho-1)} = \|\Lambda A\|_1 \le \|\Lambda\hat{A}\|_1 + 2^{-5}\]
\[(5.9)\quad \le 2^{-5} + \sqrt{2N}\,\|\Lambda\hat{A}\|_2 \le 2^{-5} + 1/\sqrt{2N} < 1/2 \quad\text{since } N \ge 3,\]
thanks to the Cauchy-Schwarz inequality.

Finally, let $x \in [a,b]$, $z \in f([a,b])$ and $y$ be such that $0 \le |y-z| \le |z|/(2d)$. Notice that the quantity $\frac{P(ux,vy) - P(ux,vz)}{y-z}$ is actually a polynomial, so it is well defined for $y = z$. First, we note that $\max(|a|,|b|) \le M_{\rho,a,b}(x)$ and $\|f\|_{\infty,[a,b]} \le M_{\rho,a,b}(f)$ from the maximum modulus principle. Then, we have
\[\left|\frac{P(ux,vy) - P(ux,vz)}{y-z}\right| \le \sum_{\substack{1 \le \ell \le d \\ 0 \le k \le d-\ell}} \ell\,|\lambda_{k,\ell}|\,u^k|x|^k v^\ell \max(|z|,|y|)^{\ell-1} \le \sum_{\substack{1 \le \ell \le d \\ 0 \le k \le d-\ell}} \ell\,|\lambda_{k,\ell}|\,u^k|x|^k v^\ell |z|^{\ell-1}\underbrace{\left(1 + \frac{1}{2d}\right)^{\ell-1}}_{< 2}\]
\[(5.10)\quad < 2 \sum_{\substack{1 \le \ell \le d \\ 0 \le k \le d-\ell}} \ell\,|\lambda_{k,\ell}|\,u^k \max(|a|,|b|)^k v^\ell \|f\|_{\infty,[a,b]}^{\ell-1} =: B/2 \le 2d \sum_{\substack{1 \le \ell \le d \\ 0 \le k \le d-\ell}} |\lambda_{k,\ell}|\,u^k M_{\rho,a,b}(x)^k v^\ell M_{\rho,a,b}(f)^{\ell-1}\]
\[(5.11)\quad < \frac{d\rho^{N-\omega_0-1}(\rho-1)}{4M_{\rho,a,b}(f)}.\]
The last inequality follows from the comparison of (5.8) to (5.9). We take the supremum over the compact set $x \in [a,b]$, $z \in f([a,b])$, $0 \le |y-z| \le |z|/(2d)$; as, again, the quantity under study is actually a polynomial, this supremum is actually a maximum, which concludes the proof. □

Remark 5.9. Note that $\|\Lambda A\|_1 \ge 4 \sum_{0 \le k+\ell \le d} |\lambda_{k,\ell}|\,\frac{u^k M_{\rho,a,b}(x)^k\,v^\ell M_{\rho,a,b}(f)^\ell}{\rho^{N-1}(\rho-1)}$ in particular. Since the constraint $\|\Lambda\hat{A}\|_2 \le 1/(2N)$ implies $\|\Lambda A\|_1 < 1/2$, it follows that, for any $k, \ell$, either $\lambda_{k,\ell} = 0$ or $4\,\frac{u^k M_{\rho,a,b}(x)^k\,v^\ell M_{\rho,a,b}(f)^\ell}{\rho^{N-1}(\rho-1)} < 1$. Also, the proof of Lemma 5.6 shows in particular that $R_{2,i} \ge R_{2,d+2} = 4vM_{\rho,a,b}(f)/(\rho^{N-1}(\rho-1))$ for all $i \ge d+2$. Hence, if $\rho^{\omega_0} R_{2,d+2} \ge 1$, we thus have $\rho^{\omega_0} R_{2,i} \ge 1$ for all these $i$, and $\lambda_{k,\ell} = 0$ for any $1 \le \ell \le d$, $0 \le k \le d-\ell$: the only functions taken into account are the $u^k x^k$'s, and the method fails as claimed at the beginning of this section. This explains the condition $4v\rho^{\omega_0} M_{\rho,a,b}(f) < \rho^{N-1}(\rho-1)$.

Remark 5.10. The proof should be slightly adapted if the procedure of Subsection 5.3.3 is used. In this case, we have $|\lambda_{k,\ell}| \le 2^{\mathrm{tprec}_{comp}-5}/N^2$ for any $k, \ell$. Recalling that $\hat{A}_{comp} = 2^{-\mathrm{tprec}_{comp}}(M_{c,comp}\ M_{r,comp})$, we obtain, for $j = 0, \dots, 2N-1$,
\[|(\Lambda A)[j] - (\Lambda\hat{A}_{comp})[j]| \le \sum_{0 \le k+\ell \le d} |\lambda_{k,\ell}|\,2^{1-\mathrm{tprec}_{comp}} \le 2^{-4}/N,\]
from which follows $\|\Lambda A\|_1 \le \|\Lambda\hat{A}_{comp}\|_1 + 2^{-4}$. The upper bound in Inequality (5.9) becomes $2^{-4} + 1/\sqrt{2N} < 1/2$ since $N \ge 3$.

From Theorem 5.8, we can deduce a lower bound for $|Y/v - f(X/u)|$ in the following way:

Corollary 5.11. With the notations and assumptions of Theorem 5.8, we have, for all $X \in \mathbb{Z}$ with $a \le X/u \le b$ and all $Y \in \mathbb{Z}$, either
\[P(X,Y) = 0 \quad\text{or}\quad \left|\frac{Y}{v} - f\left(\frac{X}{u}\right)\right| \ge \frac{1}{\max(B, 2dv)} \ge \frac{2M_{\rho,a,b}(f)}{d\rho^{N-\omega_0-1}(\rho-1)}.\]

Proof. The inequality 1 > 2푀휌,푎,푏(푓) follows from the second point of Theo- 퐵 푑휌푁−휔0−1(휌−1) rem 5.8. We have, for any 푥 ∈ [푎, 푏], 푦 ∈ R, 푃 (푢푥, 푣푦) − 푃 (푢푥, 푣푓(푥)) (5.12) 푃 (푢푥, 푣푦) = 푃 (푢푥, 푣푓(푥)) + (푦 − 푓(푥)) . 푦 − 푓(푥)

As $P \in \mathbb{Z}[X_1, X_2]$ and $X, Y \in \mathbb{Z}$, we must have $P(X,Y) \in \mathbb{Z}$, so that either $P(X,Y) = 0$ or $|P(X,Y)| \geqslant 1$. In the former case, there is nothing to prove. In the latter case, we plug $x = X/u$ and $y = Y/v$ into (5.12) and obtain
\[
(5.13)\qquad \left|\frac{Y}{v} - f\!\left(\frac{X}{u}\right)\right| \cdot \left|\frac{P(X,Y) - P(X, vf(X/u))}{Y/v - f(X/u)}\right| \geqslant |P(X,Y)| - |P(X, vf(X/u))| \geqslant \frac{1}{2}
\]
from the first point of Theorem 5.8. If we assume $|Y/v - f(X/u)| \leqslant |f(X/u)|/(2d)$, we then derive the expected result from the second point of Theorem 5.8. We now assume $|Y/v - f(X/u)| > |f(X/u)|/(2d)$.

∙ If $|f(X/u)| < 1/v$, then either $Y = -1, 0, 1$ or $|Y/v - f(X/u)| \geqslant 1/v$. For $Y = -1, 0, 1$, we have
\[
\left|\frac{P(X,Y) - P(X, vf(X/u))}{Y/v - f(X/u)}\right| \leqslant \sum_{\substack{1\leqslant \ell \leqslant d\\ 0 \leqslant k \leqslant d-\ell}} |\lambda_{k,\ell}|\, |X|^k v \sum_{0 \leqslant j \leqslant \ell-1} \underbrace{|Y|^j}_{\leqslant 1}\, \underbrace{|vf(X/u)|^{\ell-1-j}}_{\leqslant 1}
\leqslant \sum_{\substack{1\leqslant \ell \leqslant d\\ 0 \leqslant k \leqslant d-\ell}} \ell\, |\lambda_{k,\ell}|\, u^k |X/u|^k v
\leqslant \sum_{\substack{1\leqslant \ell \leqslant d\\ 0 \leqslant k \leqslant d-\ell}} \ell\, |\lambda_{k,\ell}|\, u^k \max(|a|,|b|)^k\, v\, \underbrace{(v\|f\|_{\infty,[a,b]})^{\ell-1}}_{\geqslant\, n_f^{\ell-1}\, \geqslant\, 1},
\]
where $n_f$ was introduced just before Lemma 5.6. The conclusion follows by combining this upper bound with (5.13). If $|Y/v - f(X/u)| \geqslant 1/v$, the conclusion holds since not all $\lambda_{k,\ell}$ are zero and
\[
B = 4v \sum_{\substack{1\leqslant \ell\leqslant d\\ 0\leqslant k\leqslant d-\ell}} \underbrace{\ell}_{\in\mathbb{N}}\, |\lambda_{k,\ell}|\, \underbrace{u^k \max(|a|,|b|)^k}_{\geqslant\, n_x^k\, \geqslant\, 1}\, \underbrace{v^{\ell-1}\|f\|_{\infty,[a,b]}^{\ell-1}}_{\geqslant\, n_f^{\ell-1}\, \geqslant\, 1} \geqslant 4v.
\]
∙ Likewise, if $|f(X/u)| \geqslant 1/v$, we have

\[
\left|\frac{Y}{v} - f\!\left(\frac{X}{u}\right)\right| \geqslant \frac{|f(X/u)|}{2d} \geqslant \frac{1}{2dv} \geqslant \frac{2M_{\rho,a,b}(f)}{d\rho^{N-\omega_0-1}(\rho-1)};
\]
since $1 \geqslant \frac{4M_{\rho,a,b}(vf)}{\rho^{N-\omega_0-1}(\rho-1)}$ thanks to Remark 5.9, the conclusion holds again. ∎

We now have all the elements to prove Algorithm 2. First, recall that the matrices $M_c$ and $M_r$ computed in Algorithm 1 correspond to the scaled matrices $2^{\mathrm{tprec}}\hat{A}_1$ and $2^{\mathrm{tprec}}\hat{A}_2$. If the condition stated at Step 4 of Algorithm 2 is satisfied, then Theorem 5.8 and Corollary 5.11 prove that, given an integer pair $(X,Y)$, either $P_0(X,Y) = P_1(X,Y) = 0$, or the lower bound of Corollary 5.11 holds. If the test at Step 8 succeeded, $P_0$ and $P_1$ are coprime and, since the total degree of each $P_i$ is at most $d$, we deduce from Bézout's theorem that the cardinality of $\mathcal{L}$, the list of integer roots of $P_0$ and $P_1$, is at most $d^2$.

Steps 7 and 9 of Algorithm 2 deal with the former case, by trying to solve the polynomial system $P_0(x,y) = P_1(x,y) = 0$. If $P_0$ and $P_1$ are not coprime, however, this will fail; this is what makes our algorithm (and actually all bivariate versions of Coppersmith's method) heuristic. When this situation occurs, one convenient solution is to subdivide the interval of interest into two subintervals and to apply the algorithm again to each of them. When $X \notin \mathcal{L}$, at least one of the values $P_0(X,Y)$, $P_1(X,Y)$ is non-zero. We then deduce from Corollary 5.11 that $|f(X/u) - Y/v| \geqslant 1/K \geqslant \frac{2M_{\rho,a,b}(f)}{d\rho^{N-\omega_0-1}(\rho-1)}$, where $K$ is the value computed at Step 15 of Algorithm 2.

5.4.2. Proof of success of Algorithm 2. If we apply the LLL lattice basis reduction algorithm to $\hat{A}$, we obtain:

Corollary 5.12. Assume that $\det(\hat{A}\hat{A}^t)^{1/(2(N-1))} \leqslant 2^{-(N+3)/4-\mathrm{tprec}/(N-1)}/N$; then Theorem 5.8 applies with $\Lambda$ equal to any of the first two vectors of an LLL-reduced basis of the lattice generated by the rows of $\hat{A}$.

Proof. Let $\tilde{A} = 2^{\mathrm{tprec}}\hat{A} \in \mathcal{M}_n(\mathbb{Z})$; we know from Theorem 4.5 that if $w_1$ and $w_2$ denote these first two vectors, we have
\[
\|2^{\mathrm{tprec}} w_i\|_2 \leqslant 2^{(N-1)/4} \max\big(\det(\tilde{A}\tilde{A}^t)^{1/(2(N-1))}, \det(\tilde{A}\tilde{A}^t)^{1/(2N)}\big).
\]
As $\tilde{A}$ is an integer matrix, its determinant is an integer, so that $\det(\tilde{A}\tilde{A}^t)^{1/(2(N-1))} \geqslant \det(\tilde{A}\tilde{A}^t)^{1/(2N)}$; we thus have
\[
2^{\mathrm{tprec}}\|w_i\|_2 \leqslant 2^{(N-1)/4}\, 2^{N\mathrm{tprec}/(N-1)} \det(\hat{A}\hat{A}^t)^{1/(2(N-1))},
\]
hence
\[
\|w_i\|_2 \leqslant 2^{(N-1)/4}\, 2^{\mathrm{tprec}/(N-1)} \det(\hat{A}\hat{A}^t)^{1/(2(N-1))} \leqslant \frac{1}{2N}. \qquad\blacksquare
\]
We shall base our analysis on Inequality (5.7). The important term in the analysis is $(uv)^{2N/(3(d+3))} M_{\rho,a,b}(f)^{2N/(3(d+3))}/\rho^{N/2+\cdots}$. The quality of the bound thus depends on the size of $\rho$ and the growth of $f$. We shall start by giving a general result, which will then be turned into more readable versions, under various sets of assumptions, in Theorems 5.18, 5.26 and 5.27.
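To make the reduction step concrete, here is a minimal, textbook LLL implementation in exact rational arithmetic with parameter $\delta = 3/4$. This is an illustration only: the names and structure are ours, and it is not the optimized floating-point reduction used in practice, but it satisfies the classical guarantee $\|w_1\|_2 \leqslant 2^{(n-1)/4}|\det|^{1/n}$ invoked above.

```python
from fractions import Fraction

def lll(basis, delta=Fraction(3, 4)):
    """Textbook LLL reduction of integer row vectors (exact arithmetic)."""
    b = [list(map(Fraction, row)) for row in basis]
    n = len(b)

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def gso():
        # Gram-Schmidt orthogonalization: squared norms of b*_i and mu_{i,j}
        bstar, norms = [], []
        mu = [[Fraction(0)] * n for _ in range(n)]
        for i in range(n):
            v = b[i][:]
            for j in range(i):
                mu[i][j] = dot(b[i], bstar[j]) / norms[j]
                v = [x - mu[i][j] * y for x, y in zip(v, bstar[j])]
            bstar.append(v)
            norms.append(dot(v, v))
        return norms, mu

    k = 1
    while k < n:
        norms, mu = gso()
        # size-reduce b_k against b_{k-1}, ..., b_0
        for j in range(k - 1, -1, -1):
            q = round(mu[k][j])
            if q:
                b[k] = [x - q * y for x, y in zip(b[k], b[j])]
                norms, mu = gso()
        # Lovasz condition; swap and backtrack when it fails
        if norms[k] >= (delta - mu[k][k - 1] ** 2) * norms[k - 1]:
            k += 1
        else:
            b[k], b[k - 1] = b[k - 1], b[k]
            k = max(k - 1, 1)
    return [[int(x) for x in row] for row in b]
```

Recomputing the Gram–Schmidt data at every step makes this sketch quadratically slower than production implementations, but keeps the logic transparent.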

Proposition 5.13. Let $f$ be analytic in a neighbourhood of the closed disc $\mathcal{D}_{a,b,K} = \{z \in \mathbb{C} : |z - (a+b)/2| \leqslant K/2\}$, let $d \geqslant 1$ be an integer, $N = (d+1)(d+2)/2$, and let $\omega_0 \geqslant 0$ and $\rho = K/(b-a) \geqslant 2$ be two real parameters. Let $M_{\mathcal{D}_{a,b,K}}(f) := \max_{z \in \mathcal{D}_{a,b,K}} |f(z)|$ and recall that $\Delta_{N,[a,b],\omega_0} = (\det AA^t)^{1/2}$. Then, if
\[
b - a < K \left( 2^{6(d+3)}\, uv\,(|a+b|+K)\, M_{\mathcal{D}_{a,b,K}}(f) \right)^{-\frac{2dN}{3(N(N-3)+2\omega_0+\lfloor\omega_0\rfloor(\lfloor\omega_0\rfloor-2\omega_0+1))}},
\]

we have $\Delta_{N,[a,b],\omega_0}^{1/(N-1)} < \frac{2^{-(N+3)/4-\mathrm{tprec}/(N-1)}}{N}$.

Proof. We first notice that under our assumption $\rho = K/(b-a) \geqslant 2$, we have $E_{\rho,a,b} \subset \mathcal{D}_{a,b,K}$. Thanks to Corollary 5.3, in view of $(\rho/(\rho-1))^{N/(N-1)} \leqslant 2^{3/2}$, we have
\[
\Delta_{N,[a,b],\omega_0}^{1/(N-1)} \leqslant 60\sqrt{2N}\; \frac{(uv(|a+b|+K)/2)^{2N/(3(d+3))}\, M_{\mathcal{D}_{a,b,K}}(f)^{2N/(3(d+3))}}{\rho^{N/2+\lfloor\omega_0\rfloor(\lfloor\omega_0\rfloor-2\omega_0+1)/(2(N-1))}}.
\]

Thus, for $\Delta_{N,[a,b],\omega_0}^{1/(N-1)} < 2^{-(N+3)/4}\,2^{-\mathrm{tprec}/(N-1)}/N$, it suffices, using Lemma 5.6, that
\[
\rho \geqslant \left( 60 \cdot 2^{(N+5)/4-2N/(3(d+3))+3/(N-1)}\, N^{3/2} \left( uv(|a+b|+K)\, M_{\mathcal{D}_{a,b,K}}(f) \right)^{2N/(3(d+3))} \right)^{\frac{2(N-1)}{N(N-3)+2\omega_0+\lfloor\omega_0\rfloor(\lfloor\omega_0\rfloor-2\omega_0+1)}}.
\]
We observe that $60 \cdot 2^{(N+5)/4-2N/(3(d+3))+3/(N-1)}\, N^{3/2} < 2^{4N}$ for $d \geqslant 1$, from which the proposition follows. ∎

Corollary 5.14. Under the assumptions of Proposition 5.13, Algorithm 2 over $[a,b]$ produces at Step 6 two polynomials $P_0, P_1$ such that $\max_{x\in[a,b]} |P_i(ux, vf(x))| < 1/2$ for $i \in \{0,1\}$. In particular, Algorithm 2 never reaches Step 20 and its output is valid.

Proof. It suffices to apply Proposition 5.13, Corollary 5.12, Theorem 5.8 and Corollary 5.11 (in order to get the estimate on $|f(X/u) - Y/v|$ which is part of the output of Algorithm 2). ∎

Note that the algorithm may still return "FAIL" at Step 17, precisely in the case where $P_0$ and $P_1$ are not coprime; this makes the algorithm heuristic.

5.5. Complexity analysis. In this subsection, we deduce estimates for the complexity of our algorithm applied to a fixed interval $[\alpha, \beta)$. This actually requires several things:
∙ Evaluating the complexity of the basic blocks, namely Algorithms 1 and 2 (Subsection 5.5.1), and the precision required for all intermediate computations.
∙ Evaluating, thanks to Corollary 5.14, the size of a subinterval $[a,b]$ which can be treated at once by those algorithms; the general case will be treated in Subsection 5.5.2, whereas the case where $f$ is entire allows for an asymptotic improvement in the estimates by letting $\rho$ tend to infinity with $d$; we shall discuss this in Subsection 5.5.4.
∙ Investigating the interplay between $d$ and $\omega_0$, two parameters which have an impact on both the complexity and the quality of the final bound on $1/w$ (Subsection 5.5.3).
We start by giving complexity estimates for Algorithms 1 and 2.

5.5.1. Complexity of Algorithms 1 and 2. In this subsection, we denote by $\mathsf{M}(n)$ the complexity of multiplying two $n$-bit integers (or two precision-$n$ floating-point numbers). We assume that interval evaluation at precision $p$ of a function $f$ uses $O(1)$ evaluations of $f$ at precision $p$.
In our implementation, we used the Arb library [32] and the MPFI library.

Lemma 5.15. Put $\mathcal{M} = \max(u, v, |a|, |b|, \rho, B_f, \max_{[a,b]} |f'(x)|)$. The computations of Algorithm 1 on input $a, b, f, d, u, v, \rho, \omega_0$ can be made in floating-point precision $\mathsf{p} = \mathrm{tprec} + O(d \log \mathcal{M})$.

In particular, for fixed $a, b, \rho, \omega_0, f$, for $u, v = 2^p$ with $p \geqslant d$, the required precision is $O(dp)$.

Proof. It is a corollary of Theorem C.9; see Appendix C. ∎

Proposition 5.16. On input $a, b, f, d, u, v, \rho, \omega_0$, assuming that evaluating $f$ in precision $P$ costs $C_{f,P}$, and that a DCT of size $n$ in precision $q$ has cost $\tilde{O}(n\mathsf{M}(q))$, Algorithm 1 has complexity
\[
\tilde{O}(d^4 \mathsf{M}(\mathsf{p}) + d^2 C_{f,\mathsf{p}}),
\]
where $\mathsf{p}$ is as in Lemma 5.15.

Proof. The most costly steps of Algorithm 1 appear in the loop of Steps 7–17, which can be performed using $O(d^2)$ evaluations of $f$ at precision $O(\mathrm{tprec})$ and $O(d^4)$ multiplications of real numbers in precision $O(\mathrm{tprec})$, plus $O(d^2)$ DCTs of size $N$ in precision $\mathrm{tprec}$. ∎

In particular, for fixed $a, b, \rho$, for $u = v = 2^p$ with $p \geqslant d$, $C_{f,P} = \tilde{O}(\mathsf{M}(P))$, $\mathsf{M}(n) = \tilde{O}(n^\kappa)$, and ignoring the $\omega_0 \log \rho$ term in $\mathrm{tprec}$, we obtain a complexity of $\tilde{O}(d^{4+\kappa} p^\kappa)$.

We now turn to the analysis of Algorithm 2. We shall limit ourselves to the analysis of Steps 1–6, which compute the two auxiliary polynomials. This is, in any case, the core of the algorithm, but also the choice made in previous papers, and thus allows for a better comparison.

Proposition 5.17. On input $a, b, f, d, u, v, \rho, \omega_0$, under the assumption $C_{f,P} = O(P^2)$, Steps 1–6 of Algorithm 2 have complexity $O(d^6 \mathsf{M}(d^2)(d^2 + \mathsf{p})\mathsf{p})$.

Proof. The main steps of Algorithm 2 are:
∙ a call to Algorithm 1;
∙ a call to LLL on a lattice of dimension $N$ with entries of size $O(\mathsf{p})$.
For the second part, we use the $L^2$ algorithm [53] on a lattice of dimension $N = O(d^2)$, embedded into $\mathbb{R}^{2N}$; we thus have complexity $O(d^6\mathsf{M}(d^2)(d^2+\mathsf{p})\mathsf{p})$. This cost dominates the cost of Algorithm 2. ∎

Note that in typical situations (for instance, either $u, v = 2^p$ with $p \geqslant d$, or $\omega_0 \neq N - o(N)$, $\rho \geqslant 2$) we have $d^2 = O(\mathsf{p})$ and the complexity simplifies to $O(d^6\mathsf{M}(d^2)\mathsf{p}^2)$.

5.5.2. Number of subintervals for fixed $d$. Thanks to the results of Subsection 5.4.2, we can estimate the maximum size of an interval $[a,b] \subset [\alpha,\beta]$, with $\alpha, \beta$ fixed, for which Algorithm 2 succeeds (in the sense of Corollary 5.14). This follows from Proposition 5.13, and yields at the same time the number of subintervals to be considered if one wants to deal with a full interval $[\alpha,\beta]$.

Theorem 5.18. Given a fixed $f$, a fixed parameter $d$ and two fixed real numbers $\alpha$ and $\beta$, Problem 2.6 can heuristically be solved for $u, v \to \infty$ over $[\alpha,\beta]$ using

\[
(5.14)\qquad O\left( (\beta-\alpha)(uv)^{\frac{2Nd}{3(N(N-3)+2\omega_0+\lfloor\omega_0\rfloor(\lfloor\omega_0\rfloor-2\omega_0+1))}} \right)
\]
calls to Algorithm 2 with parameter $d$. We then obtain a value

\[
(5.15)\qquad w = O\left( (uv)^{\frac{2N(N-\omega_0)d}{3(N(N-3)+2\omega_0+\lfloor\omega_0\rfloor(\lfloor\omega_0\rfloor-2\omega_0+1))}} \right).
\]

When $d \to \infty$, both statements remain valid if $N - \omega_0 = \Theta(N)$, or if one replaces $uv$ by $uv\,2^{O(d)}$.

Proof. This is a direct consequence of Proposition 5.13 and Corollary 5.14, where we choose $K = 2$. Recall that the heuristic nature of this result comes from the possibility that the two polynomials obtained in Algorithm 2 are not coprime, in which case one cannot recover the solutions $X, Y$ from those two polynomials. Finally, thanks to Corollary 5.11, the upper bound on $w$ is $O(\rho^{N-\omega_0})$, from which the second part of the result follows.

For $d \to \infty$, we need to take into account the term $2^{6(d+3)} = 2^{O(d)}$ of Proposition 5.13. However, if $N - \omega_0 = \Theta(N)$, the global exponent in Proposition 5.13 is $O(1/d)$ and, overall, this term is absorbed by the $O$ notation. ∎

Remark 5.19. If one is only interested in the smallest possible complexity, it should be noted that the exponent in (5.14) is minimal for $\omega_0 \in [1,2]$, and equal in this case to $8N/(3(d+3)(d^2+3d-2))$; for $\omega_0 = 2$ we then get $w = O\big((uv)^{4N/(3(d+3))}\big)$.

Remark 5.20. In the case where $\omega_0$ is an integer, the bounds take the nicer form
\[
O\left( (\beta-\alpha)(uv)^{\frac{2Nd}{3(N-\omega_0)(N+\omega_0-3)}} \right) \quad\text{and}\quad w = O\left( (uv)^{\frac{2Nd}{3(N+\omega_0-3)}} \right).
\]
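The closed forms in Remarks 5.19 and 5.20 follow from straightforward polynomial identities in $N = (d+1)(d+2)/2$ and $\omega_0$; a quick script (our own illustration, with arbitrary parameter ranges) confirms them:

```python
from fractions import Fraction as F

def N_of(d):
    # N = (d+1)(d+2)/2
    return (d + 1) * (d + 2) // 2

# Remark 5.20: for integer omega_0 the generic denominator
#   N(N-3) + 2*w0 + w0*(w0 - 2*w0 + 1)
# factors as (N - w0)*(N + w0 - 3).
for d in range(2, 40):
    N = N_of(d)
    for w0 in range(0, N):
        assert N * (N - 3) + 2 * w0 + w0 * (w0 - 2 * w0 + 1) \
            == (N - w0) * (N + w0 - 3)

# Remark 5.19: at omega_0 = 1 the exponent of (5.14) equals
# 8N/(3(d+3)(d^2+3d-2)); at omega_0 = 2 the w-exponent of (5.15)
# equals 4N/(3(d+3)).
for d in range(2, 40):
    N = N_of(d)
    assert F(2 * N * d, 3 * (N - 1) * (N - 2)) \
        == F(8 * N, 3 * (d + 3) * (d ** 2 + 3 * d - 2))
    assert F(2 * N * d, 3 * (N + 2 - 3)) == F(4 * N, 3 * (d + 3))
```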

In particular, this shows the limits of the approach: for $\omega_0$ close to $N$, we decrease the exponent of the bound on $w$ by a factor of 2, but can expect nothing better. We shall see in the next section how to go beyond this limitation.

We now discuss the case where we let $d \to \infty$. This has two goals:
∙ See for what value of $d$ we can expect to treat a whole interval $[\alpha,\beta]$ at once; notice that better results will be obtained later on (Subsection 5.5.4) if $f$ is entire and we have control on its growth at infinity;
∙ Give a simplified form of the estimates of Theorem 5.18, which, because of the technical parameter $\omega_0$, are rather unpleasant and unintuitive.

Corollary 5.21. Let again $\alpha, \beta$ be fixed real numbers. For $d \to \infty$ and $\omega_0 = \lambda N(1+o(1))$ with $\lambda \in [0,1)$, Problem 2.6 can heuristically be solved for $u, v \to \infty$ over $[\alpha,\beta]$ using

\[
(5.16)\qquad O\left( (\beta-\alpha)(uv)^{\frac{4(1+o(1))}{3(1-\lambda^2)d}} \right)
\]
calls to Algorithm 2 with parameter $d$, giving a bound
\[
(5.17)\qquad w = O\left( (uv)^{\frac{2d(1+o(1))}{3(1+\lambda)}} \right).
\]

Remark 5.22. Note that if we assume $d = \Theta(\log(uv))$, Corollary 5.21 states that one call is enough to address an interval of size $\beta - \alpha = O(1)$, and we then obtain $w = e^{O(\log^2(uv))}$. We will improve this result in Section 5.5.4 under additional assumptions on the growth of $f$ at infinity.

Remark 5.23. In all this section, our complexity estimates should be considered as slightly pessimistic for fixed $d$, at least for usual transcendental functions. Indeed, we base our complexity estimates on estimates of the size of the second vector of an LLL-reduced basis, which can only be obtained under a (trivial, thus pessimistic in practice) lower bound on the size of the first vector. For a "classical" function such as $\exp$ or $\Gamma$, we notice in practice that most of the time the second vector has a size similar to that of the first one. This yields the slightly better bound
\[
O\left( (\beta-\alpha)(uv)^{\frac{2dN}{3(N(N-3)+2\omega_0+N\lfloor\omega_0\rfloor(\lfloor\omega_0\rfloor-2\omega_0+1))}} \right).
\]

Note that it is easy to build examples where this latter bound does not hold, by taking a function which has a very good algebraic approximation (which gives a very short first vector) over the interval under study.

Remark 5.24. The first part of the Corollary, when $\lambda = 0$, is akin, asymptotically, to Bombieri and Pila's result [6] on the number of real algebraic curves of degree $\leqslant d$ containing all integer points on a given transcendental curve. The only reasons why we do not get the exact same result as theirs are: the fact that we use a bound on the second vector (see the previous remark), and the fact that, in order to get a practical algorithm, we truncate our matrix to get an integer matrix; this has a slight effect, asymptotically negligible, on the final bound.

5.5.3. Tuning $d$ and $\omega_0$. Again, in order to ease this very technical discussion, we shall focus on the situation of fixed $f, a, b$ for $uv \to \infty$.

Let us start by pointing out that a tedious, but not difficult, computation shows that for $d \geqslant 2$ the exponent in (5.15) is decreasing for $\omega_0 \in [0, N-1)$; the maximal value of this exponent, for $\omega_0 = 0$, is $4dN/(3(d-1)(d+4)) \approx 2d/3$, whereas its minimal value, for $\omega_0$ close to $N-1$, is $dN/(3N-6) \approx d/3$. We thus have a wall-type phenomenon: if we want to get access to a good complexity (see (5.14)), we need to increase $d$; but then $\omega_0$ fails to prevent the degradation of the estimate on $w$, at least in a significant way. In practice, if we target a sharp bound and let $\omega_0$ grow with this purpose in mind, we observe that the lattice basis reduction step decreases $d$ to some $\delta$ on its own, simply by not using monomials of degree $> \delta$ for the first vectors. We shall however see that setting $\omega_0$ to a non-zero value still allows one to get a better complexity/$w$ compromise, and we shall study a different method giving complete control on $w$ in the next section.

From now on, we thus fix a value $w = (uv)^\mu$ and try to find a pair $(d, \omega_0)$ which minimizes the complexity required to achieve this value of $w$. We start with the asymptotic situation, i.e., $d \to \infty$.

Proposition 5.25. Let $d \to \infty$, and $\omega_0 = \lambda N(1+o(1))$. The value of $d$ such that the exponent of $uv$ in (5.17) is $\mu$, while minimizing the exponent in (5.16), is $d = 2\mu(1+o(1))$, obtained for $\lambda = 1/3\,(1+o(1))$. This gives a number of subintervals

\[
O\left( (\beta-\alpha)(uv)^{\frac{3}{4\mu}(1+o(1))} \right).
\]

Proof. Elementary calculus. 
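The elementary calculus behind Proposition 5.25 can be checked numerically: imposing the $w$-exponent $2d/(3(1+\lambda)) = \mu$ from (5.17) and minimizing the subinterval exponent $4/(3(1-\lambda^2)d)$ from (5.16) over $\lambda$ should return $\lambda = 1/3$, $d = 2\mu$ and the value $3/(4\mu)$. The following sketch (our own illustration, with an arbitrary value of $\mu$) does exactly that by grid search:

```python
def complexity_exponent(mu, lam):
    # impose the w-exponent 2d/(3(1+lam)) = mu, i.e. d = 3*mu*(1+lam)/2,
    # and return the resulting exponent 4/(3(1-lam^2)d) of (5.16)
    d = 3 * mu * (1 + lam) / 2
    return 4 / (3 * (1 - lam ** 2) * d)

mu = 5.0                                   # illustrative target exponent
lams = [i / 10000 for i in range(1, 9999)]
best = min(lams, key=lambda l: complexity_exponent(mu, l))

assert abs(best - 1 / 3) < 1e-3            # optimal lambda is 1/3
assert abs(complexity_exponent(mu, best) - 3 / (4 * mu)) < 1e-6
assert abs(3 * mu * (1 + best) / 2 - 2 * mu) < 1e-2   # optimal d is 2*mu
```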

Led by this asymptotic statement, we have computed experimentally, for small values of $\mu$, the value of $d$ giving the best estimate for the complexity in (5.14); it turns out that in all our computations the optimal value was $d = \lfloor 2\mu \rfloor$, except when $\mu = r + 1/2$ is a half-integer, in which case the optimal $d$ is $2r$. Working out a closed form for the exponent of the complexity estimate as a function of $\mu$ thus seems possible, but would be moderately enlightening; it seems preferable to give a plot of the corresponding function.

Figure 1 gives three curves. The dashed curve corresponds to the best exponent in Theorem 5.18 as a function of the exponent of $uv$ in the bound on $w$. The dotted curve represents a similar function, but using a version of our bounds controlling only the first vector of the lattice. Finally, the plain curve is the asymptotic bound $3/(4\mu)$.


Figure 1. Exponent estimates

5.5.4. The case $\rho(b-a) \to \infty$. In this subsection, we shall now let $K$ depend on $d$, namely we shall let it tend to $\infty$ with $d$. We shall thus need the function under study to be an entire function. Recall that if $f : \mathbb{C} \to \mathbb{C}$ is an entire function, and if $\theta = \limsup_{\rho\to\infty} \log\log\max_{|z|\leqslant\rho}|f(z)|/\log\rho$ is finite, the function $f$ is said to have finite order $\theta$.

The presence of a term $M_{\rho,a,b}(f)$, depending on the growth of $f$ at infinity, shows that it is difficult to give a single ready-to-use result. We thus split the discussion into two parts: the case of entire functions of finite order, such as $\exp$ for instance, in which the value of the order gives sufficiently precise information on the growth at infinity to obtain a general result; and then two examples, of a function of order zero and of a function of infinite order, in which cases case-by-case estimates of the growth at infinity are required.

Theorem 5.26. Let $a < b$ be two fixed real numbers, let $\theta$ be a positive real number, let $f$ be an entire function of finite order $\leqslant \theta$, let $d \geqslant 2$ be an integer, $N = (d+1)(d+2)/2$, and let $\omega_0 = \lambda N + o(N)$, for some fixed $\lambda \in [0,1)$. For $uv \to \infty$, for any constant $\nu > \frac{4\theta}{3(1-\lambda^2)}$, for
\[
d = \nu \frac{\log(uv)}{\log\log(uv)}(1+o(1)),
\]
Algorithm 2 succeeds (in the sense of Corollary 5.14) over $[a,b]$ and yields a bound on $w$ of the form
\[
w \leqslant (uv)^{\frac{\nu^2(1-\lambda)}{2\theta}\frac{\log(uv)}{\log\log(uv)}(1+o(1))}.
\]

Proof. Let $\theta' > \theta$ be a parameter which will be fixed later on. We choose $K = d^{1/\theta'}/2$, which is $\geqslant 2(b-a)$ for $d$ large enough. The disc $\mathcal{D}_{a,b,K}$ (see Proposition 5.13) is, for our choice of $\rho$, included in the ball $B(0, d^{1/\theta'})$ for $d$ large enough; hence, the assumption that $f$ has order $\leqslant \theta$ shows that there exist constants $C_{f,\theta'}, \sigma_{f,\theta'} \in \mathbb{R}$ such that
\[
M_{\mathcal{D}_{a,b,K}}(f) \leqslant C_{f,\theta'} \exp(\sigma_{f,\theta'} d) = 2^{O(d)}.
\]

Proposition 5.13 then implies that a sufficient condition for the conclusion of the theorem to hold is

\[
b - a < \frac{d^{1/\theta'}}{2} \Big( \underbrace{2^{6(d+3)} \big( |a+b| + d^{1/\theta'}/2 \big)\, M_{\mathcal{D}_{a,b,K}}(f)}_{=:A_d} \Big)^{-4/(3d(1-\lambda^2))(1+o(1))} (uv)^{-4/(3d(1-\lambda^2))(1+o(1))},
\]
or equivalently

\[
d^{1/\theta'} (uv)^{-4/(3d(1-\lambda^2))(1+o(1))} \geqslant 2 A_d^{4/(3d(1-\lambda^2))(1+o(1))} (b-a).
\]
As $a, b$ are fixed and $A_d = 2^{O(d)}$ when $d \to \infty$, the right-hand side is bounded and a sufficient condition for this to hold is simply

\[
d^{1/\theta'} (uv)^{-4/(3d(1-\lambda^2))(1+o(1))} \to \infty,
\]
for which it suffices that, for some $\varepsilon' > 0$,
\[
d \log d \geqslant \left( \frac{4\theta'}{3(1-\lambda^2)} + \varepsilon' \right) \log(uv),
\]
which obviously holds under the assumption on $d$ made in the theorem for $uv$ large enough and $\theta' < 3(1-\lambda^2)\nu/4$.

As for the last part, the bound on $w$ is $\leqslant \rho^{N(1-\lambda)} = K^{N(1-\lambda)(1+o(1))}$, namely we have
\[
\log(w) = \frac{(1-\lambda)d^2\log d}{2\theta'}(1+o(1)) \leqslant \frac{(1-\lambda)\nu^2\log^2(uv)}{2\theta\log\log(uv)}(1+o(1)),
\]
as claimed. ∎

In particular, for the TMD over $[1/4, 1/2)$ for the exponential function ($\theta = 1$), hence for $a = 1/4$, $b = 1/2$, $u = 2^{p+1}$ and $v = 2^{p-1}$, for $p \to \infty$, we obtain the condition $d \geqslant \left( \frac{8\log 2}{3(1-\lambda^2)} + \varepsilon \right) \frac{p}{\log p}$ for the full interval $[1/4, 1/2)$, with a bound $w \leqslant 2^{\left( \frac{32\log 2}{9(1-\lambda^2)(1+\lambda)} + \varepsilon \right) \frac{p^2}{\log p}}$.

Two other examples. We illustrate, more generally, the fact that we get asymptotic results depending on the rate of growth of $f$ at infinity: the slower the growth of $f$, the better the performance.

Theorem 5.27. Let $f = \exp(\exp(z))$. For $uv$ large enough, for any constant $\nu > \frac{4}{3(1-\lambda^2)}$, for $d = \lceil \nu \frac{\log(uv)}{\log\log\log(uv)} \rceil$, Algorithm 2 succeeds in the sense of Corollary 5.14 and we obtain
\[
w = (uv)^{\frac{\nu^2(1-\lambda)\log(uv)\log\log(uv)}{(\log\log\log(uv))^2}(1+o(1))}.
\]
Let $g(z) = \sum_{n\geqslant 0} \exp(-n^2) z^n$. For $uv$ large enough, for any constant $\nu > \frac{4}{3(1-\lambda^2)}$, for $d = \lceil \nu \sqrt{\log(uv)} \rceil$, Algorithm 2 succeeds in the sense of Corollary 5.14 and we obtain $w = (uv)^{\frac{3}{2}\sqrt{\nu^2(1-\lambda^2)(1-\lambda)\log(uv)}\,(1+o(1))}$.

Proof. The proof is similar to the proof of the previous theorem, with $K = \log d$ for $f$ and $K = \exp(3(1-\lambda^2)d/2)$ for $g$. See Appendix B. ∎

6. The two-variable method

We consider 푢, 푣 ∈ N ∖ {0}, 푎1 < 푏1 and 푎2 < 푏2, and 푓 :[푎1, 푏1] → R a function that is analytic in a neighbourhood of [푎1, 푏1]. In this section, we develop a heuristic algorithmic approach to determine the integers 푋, 푌 such that

(6.1)\qquad $X/u \in [a_1,b_1]$ and $a_2 < Y/v - f(X/u) < b_2$.

Note that this is a mere reformulation of Problem 2.6: let $w \in \mathbb{N}\setminus\{0\}$, we set $[a,b] = [a_1,b_1]$ and $b_2 = -a_2 = 1/w$. Actually, without loss of generality, we can assume $b_2 = -a_2$ by replacing $f$ with $f + (a_2+b_2)/2$. We prefer to keep $a_2$ and $b_2$ arbitrary in the sequel to put more emphasis on the symmetry between $(a_1,b_1)$ and $(a_2,b_2)$ in the formulas and statements.

Our approach aims at building a trap for these pairs $(X,Y)$. We compute, by combining two-dimensional Chebyshev interpolation and lattice reduction, two polynomials $P_0, P_1 \in \mathbb{Z}[X_1,X_2]$ such that, for $i = 0, 1$ and for all $x \in [a_1,b_1]$, $t \in [a_2,b_2]$, we have $|P_i(ux, v(f(x)+t))| < 1$. Let $X \in \mathbb{Z}$ be such that $X/u = x_0 \in [a_1,b_1]$ and $Y \in \mathbb{Z}$ be such that $Y/v = f(x_0) + t_0$ with $t_0 \in [a_2,b_2]$; then $P_i(ux_0, v(f(x_0)+t_0)) = P_i(X,Y) \in \mathbb{Z} \cap (-1,1) = \{0\}$, that is to say, $(X,Y)$ is a common root of $P_0$ and $P_1$. As in Section 5, we use our heuristic assumption: $P_0$ and $P_1$ are supposed to have no nonconstant common factor. We eliminate one of the variables and get the list of all the integers $X, Y$ that satisfy (6.1).

In the sequel of this section, we start with estimates of the determinants of the lattices that we use, then we present our algorithm, the proof of its correctness, and an analysis of its complexity. When the proofs of the statements are similar to the ones presented in Section 5, we postpone them to Appendix E.

Throughout this section, $N_1, N_2 \geqslant 2$ and $N \geqslant 2$ will be three integers. In order to avoid degenerate situations and trivial output, we shall always assume $N_1 N_2 \geqslant N$.

6.1. Volume estimates for rigorous interpolants at the Chebyshev nodes. We start by introducing the two-dimensional extension of the DCT-II:
\[
\text{2D-DCT-II} : \mathbb{R}^{N_1} \times \mathbb{R}^{N_2} \to \mathbb{R}^{N_1} \times \mathbb{R}^{N_2}
\]

\[
(x_{\ell_1,\ell_2})_{\substack{0\leqslant\ell_1\leqslant N_1-1\\ 0\leqslant\ell_2\leqslant N_2-1}} \mapsto (X_{k_1,k_2})_{\substack{0\leqslant k_1\leqslant N_1-1\\ 0\leqslant k_2\leqslant N_2-1}}
\]
with
\[
X_{k_1,k_2} = \sum_{0\leqslant\ell_1\leqslant N_1-1}\ \sum_{0\leqslant\ell_2\leqslant N_2-1} x_{\ell_1,\ell_2} \cos\left(\frac{k_1(\ell_1+1/2)\pi}{N_1}\right) \cos\left(\frac{k_2(\ell_2+1/2)\pi}{N_2}\right),
\]
for $k_1 = 0, \ldots, N_1-1$, $k_2 = 0, \ldots, N_2-1$.

Let $N \in \mathbb{N}$, $N \geqslant 2$, and, for $i = 0,\ldots,N-1$, let $f_i$ be a function defined over $[a_1,b_1]\times[a_2,b_2]$. If we interpolate $f_i$ by polynomials in $\mathbb{R}_{N_1-1,N_2-1}[x,t]$^{15} at the pairs of Chebyshev nodes $\mu_{k_1,k_2} = (\mu_{k_1,N_1-1,[a_1,b_1]}, \mu_{k_2,N_2-1,[a_2,b_2]})_{\substack{0\leqslant k_1\leqslant N_1-1\\ 0\leqslant k_2\leqslant N_2-1}}$, cf. Section 4.1, we have the following expressions for the interpolation polynomials (the proof is identical to the one-variable case [49, Chap. 6]), for $i = 0,\ldots,N-1$:

\[
Q_i(x,t) = \sum_{k_1=0}^{N_1-1}{}' \ \sum_{k_2=0}^{N_2-1}{}' \ c_{k_1,k_2,i}\, T_{k_1,[a_1,b_1]}(x)\, T_{k_2,[a_2,b_2]}(t) \ \in \mathbb{R}_{N_1-1,N_2-1}[x,t]
\]

^{15}\,This denotes the set of polynomials in two indeterminates $x$ and $t$ with real coefficients, degree in $x$ less than $N_1$ and degree in $t$ less than $N_2$.

with

\[
(6.2)\qquad (c_{k_1,k_2,i})_{\substack{0\leqslant k_1\leqslant N_1-1\\ 0\leqslant k_2\leqslant N_2-1}} = \frac{4}{N_1 N_2}\, \text{2D-DCT-II}\left( \big( f_i(\mu_{N_1-1-\ell_1,\, N_2-1-\ell_2}) \big)_{\substack{0\leqslant\ell_1\leqslant N_1-1\\ 0\leqslant\ell_2\leqslant N_2-1}} \right).
\]
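As a sanity check of (6.2) and the primed sums above, the whole interpolation scheme can be exercised with a naive quadratic-time transform. The sketch below is our own plain-Python illustration (not the FFT-based DCT assumed in the complexity analysis); it samples directly at the nodes $\cos((\ell+1/2)\pi/N)$, listed in decreasing order, which matches the index reversal in (6.2), and reproduces any bivariate polynomial of coordinatewise degree $< (N_1, N_2)$ exactly.

```python
import math

def dct2_naive(x, N1, N2):
    # 2D-DCT-II exactly as defined in the text (O((N1*N2)^2) operations)
    return [[sum(x[l1][l2]
                 * math.cos(k1 * (l1 + 0.5) * math.pi / N1)
                 * math.cos(k2 * (l2 + 0.5) * math.pi / N2)
                 for l1 in range(N1) for l2 in range(N2))
             for k2 in range(N2)] for k1 in range(N1)]

def cheb_T(k, y):
    # Chebyshev polynomial T_k on [-1, 1] via the three-term recurrence
    t0, t1 = 1.0, y
    for _ in range(k):
        t0, t1 = t1, 2 * y * t1 - t0
    return t0

def interpolate_2d(f, a1, b1, a2, b2, N1, N2):
    """Return Q(x, t), the Chebyshev interpolant of f on the N1 x N2 node grid."""
    xs = [(b1 - a1) / 2 * math.cos((l + 0.5) * math.pi / N1) + (a1 + b1) / 2
          for l in range(N1)]
    ts = [(b2 - a2) / 2 * math.cos((l + 0.5) * math.pi / N2) + (a2 + b2) / 2
          for l in range(N2)]
    samples = [[f(x, t) for t in ts] for x in xs]
    c = [[4 / (N1 * N2) * v for v in row]
         for row in dct2_naive(samples, N1, N2)]

    def Q(x, t):
        y1 = (2 * x - a1 - b1) / (b1 - a1)   # rescale to [-1, 1]
        y2 = (2 * t - a2 - b2) / (b2 - a2)
        s = 0.0
        for k1 in range(N1):
            for k2 in range(N2):
                # primed sums: halve the k1 = 0 and k2 = 0 coefficients
                w = c[k1][k2] / 2 ** ((k1 == 0) + (k2 == 0))
                s += w * cheb_T(k1, y1) * cheb_T(k2, y2)
        return s
    return Q
```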

Let $\rho_1, \rho_2 > 1$, $a_1 < b_1$, $a_2 < b_2$; we recall from Section 4.1.2 that $\mathcal{E}_{\rho_1,a_1,b_1,\rho_2,a_2,b_2} = \mathcal{E}_{\rho_1,a_1,b_1} \times \mathcal{E}_{\rho_2,a_2,b_2}$ and $E_{\rho_1,a_1,b_1,\rho_2,a_2,b_2} = E_{\rho_1,a_1,b_1} \times E_{\rho_2,a_2,b_2}$. We also recall that $M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(g) := \max_{z\in\mathcal{E}_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}} |g(z)|$ if $g$ is analytic in a neighbourhood of $E_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}$.

Let $N_1, N_2 \geqslant 2$, and $N \leqslant N_1 N_2$. Let $f_0, \ldots, f_{N-1}$ be functions analytic in a neighbourhood of $E_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}$. We introduce the $N \times (N_1N_2 + N)$ matrix $A = (A_1\ A_2)$, defined by
\[
(6.3)\qquad (A_1)_{i,(k_1,k_2)} = \frac{c_{k_1,k_2,i}}{2^{\delta_{0k_1}+\delta_{0k_2}}},\quad 0\leqslant i\leqslant N-1,\ 0\leqslant k_1\leqslant N_1-1,\ 0\leqslant k_2\leqslant N_2-1,
\]
\[
(A_2)_{i,j} = \delta_{ij}\, \frac{16\rho_1\rho_2\, M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f_i)}{(\rho_1-1)(\rho_2-1)} \left( \frac{1}{\rho_1^{N_1}} + \frac{1}{\rho_2^{N_2}} \right),\quad 0\leqslant i,j\leqslant N-1.
\]
Recall from Proposition 4.2 that $\|f_i - Q_i\|_\infty \leqslant A_2[i,i]$, $i = 0,\ldots,N-1$. Once again, the diagonal right part $A_2$ of the matrix will be used for controlling that the functions $P_0(ux, v(f(x)+t))$ and $P_1(ux, v(f(x)+t))$, output by the lattice basis reduction process, are uniformly small; this accounts for the presence of the remainder term $\frac{16\rho_1\rho_2 M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f_i)}{(\rho_1-1)(\rho_2-1)}\left(\frac{1}{\rho_1^{N_1}}+\frac{1}{\rho_2^{N_2}}\right)$ for the approximation of $P_i(ux, v(f(x)+t))$ by its interpolation polynomial at the 2D Chebyshev points.

We start with a convenient combinatorial lemma.

Lemma 6.1. Let $\gamma \in \mathbb{R}$, $\gamma \geqslant 1$, and let $N, N_1, N_2$ be positive integers. Consider the multiset
\[
S = \{k + \gamma k',\ (k,k') \in [0,N_1-1]\times[0,N_2-1]\} + \underbrace{\{N_1,\ldots,N_1\}}_{N\ \text{times}}{}^{16},
\]
and order the elements of $S$ as $\sigma_0 \leqslant \ldots \leqslant \sigma_{\operatorname{card} S - 1}$. Define

\[
\Omega_\gamma(N_1,N_2,N) = \sigma_0 + \cdots + \sigma_{N-1},
\]
the sum of the $N$ smallest elements of $S$. Let now $s \leqslant N_1$ be a real number, and let $\mathcal{K}_s = \{(i,j) \in [0,N_1-1]\times[0,N_2-1] : i + \gamma j \leqslant s\}$. Then,
\[
\Omega_\gamma(N_1,N_2,N) \geqslant s(N - \operatorname{card}\mathcal{K}_s) + \sum_{(i,j)\in\mathcal{K}_s} (i + \gamma j).
\]

Proof. Put $\mathcal{M}_s = \{i + j\gamma,\ (i,j) \in \mathcal{K}_s\}$. If $\operatorname{card}\mathcal{K}_s \leqslant N$, then $\mathcal{M}_s \subset \{\sigma_0,\ldots,\sigma_{N-1}\}$, and any element in $\{\sigma_0,\ldots,\sigma_{N-1}\}\setminus\mathcal{M}_s$ is at least equal to $s$. Otherwise, we have $\{\sigma_0,\ldots,\sigma_{N-1}\} \subset \mathcal{M}_s$, and any element in $\mathcal{M}_s\setminus\{\sigma_0,\ldots,\sigma_{N-1}\}$ is at most equal to $s$. In both cases the claimed lower bound follows. ∎
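The quantity $\Omega_\gamma$ is straightforward to compute from its definition, which makes Lemma 6.1 easy to exercise numerically. The following sketch (our own illustration, with arbitrary parameter values satisfying $N \leqslant N_1N_2$) computes $\Omega_\gamma$ directly and checks the lower bound for a few values of $s$:

```python
def omega(gamma, N1, N2, N):
    # multiset S = {k + gamma*k' : 0 <= k < N1, 0 <= k' < N2} plus N copies of N1
    S = sorted([k + gamma * kp for k in range(N1) for kp in range(N2)]
               + [N1] * N)
    return sum(S[:N])  # sum of the N smallest elements

def omega_lower_bound(gamma, N1, N2, N, s):
    # bound of Lemma 6.1: s*(N - card K_s) + sum over K_s of (i + gamma*j)
    K = [(i, j) for i in range(N1) for j in range(N2) if i + gamma * j <= s]
    return s * (N - len(K)) + sum(i + gamma * j for i, j in K)

gamma, N1, N2, N = 3.0, 8, 4, 21     # illustrative values, N <= N1*N2
for s in [2.0, 4.5, 7.0]:            # any real s <= N1 gives a valid bound
    assert omega_lower_bound(gamma, N1, N2, N, s) <= omega(gamma, N1, N2, N)
```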

Remark 6.2. The quantity $\Omega_\gamma(N_1,N_2,N)$ plays a key role in the analysis of our bivariate method, as $\rho_1^{\Omega_\gamma(N_1,N_2,N)}$ turns out to play the role that $\rho^{N(N-1)/2}$ played in the univariate case. For fixed values of $N, N_1, N_2, \gamma$, it is easy to compute explicit values of $\Omega_\gamma(N,N_1,N_2)$. We thus focus in the sequel on the asymptotic (for $N \to \infty$)

^{16}\,By sum of multisets, we mean that the multiplicity of an element of the union is the sum of its multiplicities in the multisets.

behaviour of $\Omega_\gamma(N,N_1,N_2)$, and shall hence mostly study the asymptotic behaviour of this bivariate method – even though the analysis itself is not asymptotic by nature (see e.g. Theorem 6.4).

We now give explicit expressions for $\operatorname{card}\mathcal{K}_s$ and $\sum_{(i,j)\in\mathcal{K}_s}(i+j\gamma)$.

Lemma 6.3. Let $s \in \mathbb{R}$, $s < N_1$ and $\gamma \geqslant N_1/N_2 \geqslant 1$. We have
\[
(6.4)\qquad \operatorname{card}\mathcal{K}_s = \sum_{j=0}^{\lfloor s/\gamma\rfloor} (1 + \lfloor s - j\gamma\rfloor) = (1 + \lfloor s/\gamma\rfloor) + \sum_{j=0}^{\lfloor s/\gamma\rfloor} \lfloor s - j\gamma\rfloor
\]
and
\[
(6.5)\qquad \sum_{(i,j)\in\mathcal{K}_s} (i + j\gamma) = \sum_{j=0}^{\lfloor s/\gamma\rfloor} (1 + \lfloor s - j\gamma\rfloor)\left( j\gamma + \frac{\lfloor s - j\gamma\rfloor}{2} \right).
\]
This implies

\[
(6.6)\qquad (1+\lfloor s/\gamma\rfloor)(s - \gamma\lfloor s/\gamma\rfloor/2) \leqslant \operatorname{card}\mathcal{K}_s \leqslant (1+\lfloor s/\gamma\rfloor)(1 + s - \gamma\lfloor s/\gamma\rfloor/2)
\]
and
\[
(6.7)\qquad (1+\lfloor s/\gamma\rfloor)\,\frac{6s(s-1)+\gamma\lfloor s/\gamma\rfloor(3-\gamma-2\gamma\lfloor s/\gamma\rfloor)}{12} \leqslant \sum_{(i,j)\in\mathcal{K}_s} (i+j\gamma) \leqslant (1+\lfloor s/\gamma\rfloor)\,\frac{6s(s+1)+\gamma\lfloor s/\gamma\rfloor(3-\gamma-2\gamma\lfloor s/\gamma\rfloor)}{12}.
\]

Proof. First, note that $s/\gamma < N_1/\gamma \leqslant N_1/(N_1/N_2) = N_2$, hence $\lfloor s/\gamma\rfloor \leqslant N_2 - 1$. Let $(i,j) \in \mathcal{K}_s$; the largest possible value of $j$ corresponds to the case $i = 0$: we then have $j\gamma \leqslant s$, that is to say $j \leqslant \lfloor s/\gamma\rfloor$. Now, in order to count the elements of $\mathcal{K}_s$, we enumerate, for $j = 0,\ldots,\lfloor s/\gamma\rfloor$, the elements of each slice $\{i + j\gamma \leqslant s,\ i \in [0,N_1-1]\}$: there are $1 + \lfloor s - j\gamma\rfloor$ such elements, which proves (6.4).

Now, for $j = 0,\ldots,\lfloor s/\gamma\rfloor$, we sum the values $i + j\gamma$ for $i$ in the slice $\{i + j\gamma \leqslant s,\ i \in [0,N_1-1]\}$, i.e., $i \in [0, \lfloor s - j\gamma\rfloor]$. Hence, for $j = 0,\ldots,\lfloor s/\gamma\rfloor$, we sum the values $\lfloor s-j\gamma\rfloor(1+\lfloor s-j\gamma\rfloor)/2 + (1+\lfloor s-j\gamma\rfloor)j\gamma$, from which (6.5) follows.

We use $s - j\gamma - 1 \leqslant \lfloor s - j\gamma\rfloor \leqslant s - j\gamma$ for $j = 0,\ldots,\lfloor s/\gamma\rfloor$ to derive (6.6) from (6.4). Likewise, we derive from (6.5)

\[
\sum_{j=0}^{\lfloor s/\gamma\rfloor} \frac{(s-j\gamma)(s+j\gamma-1)}{2} \leqslant \sum_{(i,j)\in\mathcal{K}_s} (i+j\gamma) \leqslant \sum_{j=0}^{\lfloor s/\gamma\rfloor} \frac{(1+s-j\gamma)(s+j\gamma)}{2},
\]
which yields (6.7). ∎

The next result gives an upper bound for the volume of the lattice generated by the rows of $A$:

Theorem 6.4. Let $\rho_1, \rho_2 > 1$, $a_1 < b_1$, $a_2 < b_2$. We further assume that $\rho_1^{N_1} \leqslant \rho_2^{N_2}$, and define $\gamma = \log\rho_2/\log\rho_1$. Let $f_1,\ldots,f_N$ be functions analytic in a neighbourhood of $E_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}$. Then, we have

\[
(\det AA^t)^{1/2} \leqslant \left( 32\sqrt{N}\cdot 2N_1N_2 \right)^N \left( \frac{\rho_1\rho_2}{(\rho_1-1)(\rho_2-1)} \right)^N \frac{\prod_{i=1}^N M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f_i)}{\rho_1^{\Omega_\gamma(N,N_1,N_2)}}.
\]

Proof. See AppendixE. 

We now give two statements on the behaviour of $\Omega_\gamma(N,N_1,N_2)$ when $N \to \infty$, for a fixed (essentially optimal) choice of $N_1, N_2$. The proofs are elementary but long, and we postpone them to Appendix D.

Proposition 6.5. Let $\phi$ be the function from $[1,+\infty)$ to $[1,+\infty)$ defined by $\phi(x) = (1+\lfloor x\rfloor)(x - \lfloor x\rfloor/2)$. Then $\phi$ is invertible. We further define $\psi$ by
\[
\psi(x) = \frac{1+\lfloor\phi^{-1}(x)\rfloor}{12x}\left( 6\phi^{-1}(x)^2 - \lfloor\phi^{-1}(x)\rfloor - 2\lfloor\phi^{-1}(x)\rfloor^2 \right);
\]
we then have, for any $y \in [1,+\infty)$,
\[
\psi^{-1}(y) = \frac{k+1}{2}\left( 2y - k + \sqrt{4y(y-k) + 2k(2k+1)/3} \right),
\]
where $k = \lfloor 3y/2 + 1/4\rfloor$. Further, when $x \to \infty$, $\phi^{-1}(x) = \sqrt{2x} + O(1)$ and $\psi(x) = 2\sqrt{2x}/3 + O(1)$.

Proof. See Corollary D.3. ∎

Note that for $x \geqslant 1$, we prove in Lemma D.4 the inequalities $\psi(x) - 2\sqrt{2x}/3 \in [-5/6, 0]$ and observe numerically that $\psi(x) - 2\sqrt{2x}/3 \in [-1/2, -0.44]$, meaning that for our purposes $\psi(x)$ is very well approximated by $2\sqrt{2x}/3 - 1/2$.

Proposition 6.6. Let $\gamma \in \mathbb{R}$ be such that $3 \leqslant \gamma \leqslant N$. Put $N_1 = \lfloor\sqrt{2N\gamma}\rfloor$ and $N_2 = \lceil\sqrt{2N/\gamma}\rceil$. Then, we have $\gamma \geqslant N_1/N_2$ and

\[
\Omega_\gamma(N,N_1,N_2) = \psi(N/\gamma)N\gamma + O(N).
\]
In particular, for $\gamma = o(N)$, we have
\[
\Omega_\gamma(N,N_1,N_2) = \frac{2\sqrt{2}}{3}\, N^{3/2}\gamma^{1/2} + O(N\gamma).
\]
Proof. See Appendix D. ∎

Remark 6.7. We can obtain a similar result for $1 < \gamma < 3$ if we set $N_1 = 1 + \lfloor\sqrt{2N\gamma}\rfloor$ and $N_2 = 1 + \lceil\sqrt{2N/\gamma}\rceil$.

This allows us to give asymptotic versions of Theorem 6.4, which will be more convenient in the sequel.
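The functions $\phi$ and $\psi$ of Proposition 6.5 are easy to implement: on $[n, n+1)$, $\phi(x) = (n+1)(x - n/2)$ is affine and increasing, and $\phi$ maps $[n, n+1)$ onto $[n(n+1)/2, (n+1)(n+2)/2)$, so its inverse can be computed branch by branch. The sketch below (our own illustration) also checks the approximation $\psi(x) \approx 2\sqrt{2x}/3$ discussed after Proposition 6.5:

```python
import math

def phi(x):
    n = math.floor(x)
    return (1 + n) * (x - n / 2)

def phi_inv(y):
    # phi maps [n, n+1) onto [n(n+1)/2, (n+1)(n+2)/2): locate the branch first
    n = 0
    while (n + 1) * (n + 2) / 2 <= y:
        n += 1
    return y / (n + 1) + n / 2

def psi(x):
    t = phi_inv(x)
    m = math.floor(t)
    return (1 + m) * (6 * t * t - m - 2 * m * m) / (12 * x)

for x in [1, 2.5, 7, 40.0, 1000.0]:
    assert abs(phi(phi_inv(x)) - x) < 1e-9
    # bounds of Lemma D.4: psi(x) - (2/3)*sqrt(2x) lies in [-5/6, 0]
    gap = psi(x) - 2 * math.sqrt(2 * x) / 3
    assert -5 / 6 <= gap <= 0
```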

Corollary 6.8. Let $\rho_1, \rho_2 > 1$ be such that $\gamma = \log\rho_2/\log\rho_1 \in [3,N]$. Let $a_1 < b_1$, $a_2 < b_2$, $s = \gamma\phi^{-1}(N/\gamma)$, $N_1 = \lfloor\sqrt{2N\gamma}\rfloor$ and $N_2 = \lceil\sqrt{2N/\gamma}\rceil$. Let $f_1,\ldots,f_N$ be functions analytic in a neighbourhood of $E_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}$. Assuming that $N \to \infty$, we obtain
\[
(\det AA^t)^{1/2} \leqslant 2^{O(N\log N)} \left( \frac{\rho_1\rho_2}{(\rho_1-1)(\rho_2-1)} \right)^N \frac{\prod_{i=1}^N M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f_i)}{\rho_1^{\psi(N/\gamma)N\gamma + O(N)}}.
\]

Again, we specialize this statement to the case of the functions 푓푘,ℓ.

Corollary 6.9. Let $\rho_1, \rho_2 > 1$ be such that $\gamma = \log\rho_2/\log\rho_1 \in [3,N]$. Let $a_1 < b_1$, $a_2 < b_2$, $s = \gamma\phi^{-1}(N/\gamma)$, $N_1 = \lfloor\sqrt{2N\gamma}\rfloor$ and $N_2 = \lceil\sqrt{2N/\gamma}\rceil$. Let $f$ be a function analytic in a neighbourhood of $E_{\rho,a_1,b_1}$. Define $f_{k,\ell}(x,t) = u^k x^k v^\ell (f(x)+t)^\ell$ for

$0 \leqslant \ell \leqslant d$, $0 \leqslant k \leqslant d-\ell$, the matrices $A_1, A_2, A = (A_1\ A_2)$ as in (6.3), and the quantity $\Delta_{N,N_1,N_2,[a_1,b_1],[a_2,b_2],\rho_1,\rho_2} := (\det AA^t)^{1/2}$. We have, as $d \to +\infty$,
\[
(6.8)\qquad \Delta_{N,N_1,N_2,[a_1,b_1],[a_2,b_2],\rho_1,\rho_2}^{1/(N-1)} \leqslant 2^{O(1)} N \left( \frac{\sqrt{\rho_1\rho_2}}{(\rho_1-1)(\rho_2-1)} \right)^{1+o(1)} \frac{(uv)^{d/3+O(1)}}{\rho_1^{\psi(N/\gamma)\gamma+O(1)}}
\]
\[
\times \left( \frac{b_1-a_1}{2}\cdot\frac{\rho_1+\rho_1^{-1}}{2} + \left|\frac{b_1+a_1}{2}\right| \right)^{d/3+O(1)} \left( M_{\rho_1,a_1,b_1}(f) + \frac{b_2-a_2}{2}\cdot\frac{\rho_2+\rho_2^{-1}}{2} + \left|\frac{b_2+a_2}{2}\right| \right)^{d/3+O(1)}.
\]

Proof. Note that each $f_{k,\ell}$ is analytic in a neighbourhood of $E_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}$. Also, since $\sum_{0\leqslant k+\ell\leqslant d} k = \sum_{0\leqslant k+\ell\leqslant d} \ell = dN/3$, the exponent of $uv$, of $\frac{b_1-a_1}{2}\cdot\frac{\rho_1+\rho_1^{-1}}{2} + \left|\frac{b_1+a_1}{2}\right|$ and of $M_{\rho_1,a_1,b_1}(f) + \frac{b_2-a_2}{2}\cdot\frac{\rho_2+\rho_2^{-1}}{2} + \left|\frac{b_2+a_2}{2}\right|$ is $\frac{dN}{3(N-1)} = \frac{d}{3} + O(1)$. ∎

6.2. Statement of the algorithms. Our main routine is Algorithm 4. It comes together with Algorithm 3, which mainly constructs the lattice to be reduced in Algorithm 4. Let $[R_i,\ i = 0,\ldots,N-1]$ be the ordered list
\[
\left[ \frac{16\rho_1\rho_2\, u^k M_{\rho_1,a_1,b_1}(x)^k\, v^\ell M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f(x)+t)^\ell}{(\rho_1-1)(\rho_2-1)} \left( \frac{1}{\rho_1^{N_1}} + \frac{1}{\rho_2^{N_2}} \right);\ \ell = 0,\ldots,d,\ k = 0,\ldots,d-\ell \right].
\]
Let $A = (A_1\ A_2)$ and $\hat{A} = (\hat{A}_1\ \hat{A}_2)$ be the $N \times (N_1N_2+N)$ matrices defined by
\[
A_1 = \left( \frac{c_{k_1,k_2,i}}{2^{\delta_{0k_1}+\delta_{0k_2}}} \right)_{\substack{0\leqslant i\leqslant N-1\\ 0\leqslant k_1\leqslant N_1-1,\ 0\leqslant k_2\leqslant N_2-1}},\qquad A_2 = (\delta_{ij}R_i)_{0\leqslant i,j\leqslant N-1},
\]
\[
\hat{A}_1 = \left( \left[ 2^{\mathrm{tprec}} A_1[i,j] \right] / 2^{\mathrm{tprec}} \right)_{\substack{0\leqslant i\leqslant N-1\\ 0\leqslant j\leqslant N_1N_2-1}},\qquad \hat{A}_2 = \left( \lfloor 2^{\mathrm{tprec}} A_2[i,j] \rfloor / 2^{\mathrm{tprec}} \right)_{0\leqslant i,j\leqslant N-1},
\]
where $\mathrm{tprec} = \lceil -\log_2(\min_{0\leqslant i\leqslant N-1} A_2[i,i]) + \log_2(N)\rceil + 1$.^{17} The reasons for introducing $\hat{A}$ and $\mathrm{tprec}$ are the same as in Section 5.

By construction, $|\hat{A}[i,j]| \leqslant |A[i,j]|$ for all $i, j$. Hence, Theorem 6.4 and its corollaries, which proceed by upper bounding the coefficients of $A$ and applying Theorem 5.1, also hold for $(\det\hat{A}\hat{A}^t)^{1/2}$. The rows of $\hat{A}$ generate the lattice that will be reduced in our algorithm.

Lemma 6.10. The $\mathbb{Z}$-module generated by the rows of $\hat{A}$ is a lattice of rank $N$.

Proof. Identical to the proof of Lemma 5.5. ∎

The matrices $M_c$ and $M_r$ computed in Algorithm 3 correspond to the scaled matrices $2^{\mathrm{tprec}}\hat{A}_1$ and $2^{\mathrm{tprec}}\hat{A}_2$. We now derive an explicit expression for $\mathrm{tprec}$ in this bivariate context. In order to do so, we assume that the set $u[a_1,b_1]$, resp. $v(f([a_1,b_1])+[a_2,b_2])$, contains at least one nonzero integer $n_x$, resp. $n_f$. Again, this assumption is made without loss of generality with respect to our problem, since if the assumption does not hold the problem is trivial.

^{17}\,We shall prove in Lemma 6.11 that this value coincides with the definition of $\mathrm{tprec}$ at Step 1 of Algorithm 3.

Lemma 6.11. We have

\[
\mathrm{tprec} = \lceil -\log_2(R_0) + \log_2(N)\rceil + 1 = \left\lceil \log_2(1-1/\rho_1) + \log_2(1-1/\rho_2) - \log_2(\rho_1^{-N_1} + \rho_2^{-N_2}) + \log_2(N) \right\rceil - 3
\]
and
\[
\frac{N(\rho_1-1)(\rho_2-1)\rho_1^{N_1-1}\rho_2^{N_2-1}}{8(\rho_1^{N_1}+\rho_2^{N_2})} \leqslant 2^{\mathrm{tprec}} \leqslant \frac{N(\rho_1-1)(\rho_2-1)\rho_1^{N_1-1}\rho_2^{N_2-1}}{4(\rho_1^{N_1}+\rho_2^{N_2})}.
\]

Proof. Under our assumption, we have $v\, M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f(x)+t) \geqslant n_f \geqslant 1$ and $u\, M_{\rho_1,a_1,b_1}(x) \geqslant n_x \geqslant 1$. It then follows that $R_i \geqslant 16\rho_1\rho_2(\rho_1^{-N_1}+\rho_2^{-N_2})/((\rho_1-1)(\rho_2-1)) = R_0$ for all $i$. Therefore, we get $\mathrm{tprec} = \lceil -\log_2(R_0) + \log_2(N)\rceil + 1$. ∎
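The two expressions for $\mathrm{tprec}$ in Lemma 6.11, and the bracketing of $2^{\mathrm{tprec}}$, can be checked numerically; the sketch below is our own illustration with arbitrary parameter values satisfying $\rho_1, \rho_2 > 1$ and $N \leqslant N_1N_2$.

```python
import math

def tprec_values(rho1, rho2, N1, N2, N):
    # R_0 is the smallest diagonal entry of A_2 (the k = l = 0 term)
    R0 = 16 * rho1 * rho2 * (rho1 ** -N1 + rho2 ** -N2) \
        / ((rho1 - 1) * (rho2 - 1))
    t1 = math.ceil(-math.log2(R0) + math.log2(N)) + 1
    # the equivalent closed form of Lemma 6.11
    t2 = math.ceil(math.log2(1 - 1 / rho1) + math.log2(1 - 1 / rho2)
                   - math.log2(rho1 ** -N1 + rho2 ** -N2)
                   + math.log2(N)) - 3
    return R0, t1, t2

rho1, rho2, N1, N2, N = 1.5, 2.0, 7, 5, 21   # illustrative parameters
R0, t1, t2 = tprec_values(rho1, rho2, N1, N2, N)
assert t1 == t2
lo = N * (rho1 - 1) * (rho2 - 1) * rho1 ** (N1 - 1) * rho2 ** (N2 - 1) \
    / (8 * (rho1 ** N1 + rho2 ** N2))
assert lo <= 2 ** t1 <= 2 * lo
```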

6.3. Practical remarks. All practical details and optimizations mentioned in Section 5.3 apply mutatis mutandis to Algorithms 3 and 4: optimization of the construction of the matrix using properties of the DCT (Section 5.3.1), overestimation issues (Section 5.3.2), rounding issues (Section 5.3.3), use of Newton polynomials (Section 5.3.4).

6.4. Proof of correctness. We shall now prove the correctness of Algorithm 4.

6.4.1. Uniformly small polynomials in the vicinity of a transcendental analytic curve. We now state a key result for the proof of Algorithm 4.

Theorem 6.12. Let $d \geqslant 1$, $m \geqslant 2$ be two integers, $N = (d+1)(d+2)/2$, $u, v > 0$ and $(f_j)_{1\leqslant j\leqslant N} = (u^k x^k v^\ell (f(x)+t)^\ell)_{0\leqslant k+\ell\leqslant d}$. Let $\rho_1, \rho_2 > 1$, $a_1 < b_1$, $a_2 < b_2$, $N_1, N_2 \geqslant 2$, and $N \leqslant N_1N_2$. Let $\Lambda = (\lambda_{k,\ell})_{0\leqslant k+\ell\leqslant d} \in \mathbb{Z}^N$ be such that $\|\Lambda\hat{A}\|_2 \leqslant 1/(N+N_1N_2)$, and let $P(X,Y) = \sum_{0\leqslant k+\ell\leqslant d} \lambda_{k,\ell} X^k Y^\ell$; we have
\[
\max_{\substack{x\in[a_1,b_1]\\ t\in[a_2,b_2]}} |P(ux, v(f(x)+t))| < 1.
\]
Proof. See Appendix E. ∎

Remark 6.13. The proof of Theorem 6.12 yields in particular that

$$\|\Lambda \hat A\|_1 \ge \sum_{0 \le k+\ell \le d} |\lambda_{k,\ell}| \underbrace{\frac{16\, u^k M_{\rho_1,a_1,b_1}(x)^k\, v^\ell M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f(x)+t)^\ell\, \rho_1\rho_2}{(\rho_1-1)(\rho_2-1)} \left(\frac{1}{\rho_1^{N_1}} + \frac{1}{\rho_2^{N_2}}\right)}_{=:\, Q_{k,\ell}}.$$

Since the constraint $\|\Lambda \hat A\|_2 \le 1/(N + N_1 N_2)$ implies $\|\Lambda \hat A\|_1 < 1$, cf. Appendix E, it follows that either $\lambda_{k,\ell} = 0$ or $Q_{k,\ell} < 1$ for any $k, \ell$. Also, the proof of Lemma 6.11 shows in particular that
$$R_i \ge R_{d+2} = \frac{16\, v\, M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f(x)+t)\, \rho_1\rho_2}{(\rho_1-1)(\rho_2-1)} \left(\frac{1}{\rho_1^{N_1}} + \frac{1}{\rho_2^{N_2}}\right)$$
for all $i \ge d+2$. Hence, if $R_{d+2} > 1$, we have $R_i > 1$ for all $i \ge d+2$ and $\lambda_{k,\ell} = 0$ for any $1 \le \ell \le d$, $0 \le k \le d-\ell$: the only functions taken into account are the $u^k x^k$'s and the method fails. This explains the condition
$$16\rho_1\rho_2\left(\rho_1^{-N_1} + \rho_2^{-N_2}\right) M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f(x)+t) < (\rho_1-1)(\rho_2-1)$$
in the input of Algorithms 3 and 4.

38 N. BRISEBARRE AND G. HANROT

Algorithm 3 Computation of the lattice to be reduced (2D approach)

Input: Four real numbers a1 < b1, a2 < b2, f a transcendental function analytic in a complex neighbourhood of [a1, b1], five positive integers d, N1, N2, u, v, two real numbers ρ1, ρ2 > 1 such that N1, N2 ≥ 2, N1N2 ≥ N := (d+1)(d+2)/2 and 16ρ1ρ2(ρ1^(−N1) + ρ2^(−N2)) M_{ρ1,a1,b1,ρ2,a2,b2}(f(x)+t) < (ρ1−1)(ρ2−1).
Output: Two matrices M_c ∈ M_{N,N1N2}(Z), M_r ∈ M_N(Z), where N = (d+1)(d+2)/2, respectively storing scaled values of the coefficients and the remainders, and an integer tprec which is the truncation precision.
1: R0 ← 16(ρ1^(1−N1)ρ2 + ρ1ρ2^(1−N2)) / ((ρ1−1)(ρ2−1)), tprec ← ⌈−log2(R0) + log2(N)⌉ + 1
   // Computation of the Chebyshev nodes, listed in reverse order
2: L_{cheb,x} ← [ ((b1−a1)/2) cos((j+1/2)π/N1) + (a1+b1)/2 ]_{0≤j≤N1−1}
3: L_{cheb,t} ← [ ((b2−a2)/2) cos((j+1/2)π/N2) + (a2+b2)/2 ]_{0≤j≤N2−1}
4: M_c ← [0]_{N×N1N2}; M_r ← [0]_{N×N}
5: B_x ← |(a1+b1)/2| + ((b1−a1)/4)(ρ1 + ρ1^(−1)), B_t ← ρ2 max(|a2|, |b2|)
6: g ← (x ↦ |f((a1+b1)/2 + ((b1−a1)/4)(ρ1 exp(ix) + ρ1^(−1) exp(−ix)))|)
7: B_f ← max(g([0, 2π])), i ← 0
8: for ℓ = 0 to d do
9:   for k = 0 to d − ℓ do
10:     φ ← ((x, t) ↦ (ux)^k (v(f(x)+t))^ℓ)
        // We compute the coefficient matrix: for each function, we compute its values at the points of L_{cheb,x} × L_{cheb,t}, apply a DCT and scale.
11:     U ← (4/(N1N2)) · 2D-DCT-II( (φ(L_{cheb,x}[ℓ1], L_{cheb,t}[ℓ2]))_{0≤ℓ1≤N1−1, 0≤ℓ2≤N2−1} )
12:     for k1 = 0 to N1 − 1 do
13:       for k2 = 0 to N2 − 1 do
14:         M_c[i, k2 + k1N2] ← U[k1, k2], M_c[i, k2] ← (1/2) M_c[i, k2]
15:       end for
16:       M_c[i, k1N2] ← (1/2) M_c[i, k1N2]
17:     end for
18:     for j = 0 to N1N2 − 1 do
19:       M_c[i, j] ← [2^tprec M_c[i, j]]_0
20:     end for
        // We compute the scaled remainder matrix.
21:     M_r[i, i] ← ⌊2^tprec R0 (uB_x)^k (v(B_f + B_t))^ℓ⌋, i ← i + 1
22:   end for
23: end for
24: Return M_c, M_r, tprec
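To illustrate the Chebyshev-node/DCT computation of Steps 2 and 10-11 in one variable, the following Python sketch (our own illustration; a real implementation would use a fast DCT, as Algorithm 3 does) recovers the Chebyshev interpolation coefficients of a function on $[a, b]$ from its values at the Chebyshev nodes, with the usual DCT-II scaling and the halving of the constant coefficient:

```python
import math

def chebyshev_coeffs(f, a, b, n):
    # Values of f at the n Chebyshev nodes of [a, b] (cf. Step 2), then a
    # direct O(n^2) DCT-II with the 2/n scaling; the constant coefficient
    # is halved, mirroring the halvings in Algorithm 3.
    nodes = [(b - a) / 2 * math.cos((j + 0.5) * math.pi / n) + (a + b) / 2
             for j in range(n)]
    vals = [f(x) for x in nodes]
    c = [(2.0 / n) * sum(vals[j] * math.cos(k * (j + 0.5) * math.pi / n)
                         for j in range(n))
         for k in range(n)]
    c[0] /= 2
    return c

def chebyshev_eval(c, a, b, x):
    # Evaluate sum_k c[k] * T_k(t), with t the affine image of x in [-1, 1].
    t = max(-1.0, min(1.0, (2 * x - a - b) / (b - a)))
    return sum(ck * math.cos(k * math.acos(t)) for k, ck in enumerate(c))

c = chebyshev_coeffs(math.exp, 0.25, 0.5, 16)
err = max(abs(chebyshev_eval(c, 0.25, 0.5, 0.25 + 0.25 * i / 100)
              - math.exp(0.25 + 0.25 * i / 100)) for i in range(101))
assert err < 1e-12  # a degree-15 interpolant of exp is accurate to rounding here
```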

We deduce the following corollary.

Corollary 6.14. Under the assumptions and notations of the previous theorem, for all $x, y$ such that $ux, vy \in \mathbb{Z}$, we have either $P(ux, vy) = 0$ or $y \notin [f(x) + a_2, f(x) + b_2]$.

6.4.2. Proof of success of Algorithm 4. If we apply the LLL lattice basis reduction algorithm to $\hat A$, we obtain:

Algorithm 4 2D approach to Problem 2.6

Input: Four real numbers a1 < b1, a2 < b2, f a transcendental function analytic in a complex neighbourhood of [a1, b1], five positive integers d, N1, N2, u, v, two real numbers ρ1, ρ2 > 1 such that N1, N2 ≥ 2, N1N2 ≥ N := (d+1)(d+2)/2 and 16ρ1ρ2(ρ1^(−N1) + ρ2^(−N2)) M_{ρ1,a1,b1,ρ2,a2,b2}(f(x)+t) < (ρ1−1)(ρ2−1).
Output: If successful, a list L such that L ⊃ {X ∈ Z : a1 ≤ X/u ≤ b1 and there exists Y ∈ Z with Y/v ∈ [f(X/u) + a2, f(X/u) + b2]}.
1: (M_c, M_r, tprec) ← Algorithm 3(a1, b1, a2, b2, f, d, N1, N2, u, v, ρ1, ρ2)
2: M_LLL, U ← LLL-reduce the rows of (M_c | M_r) // We assume that our LLL call returns both the reduced basis and the change-of-basis matrix, denoted by U.
3: if max(‖(M_LLL[0, j])_{0≤j≤N+N1N2−1}‖2, ‖(M_LLL[1, j])_{0≤j≤N+N1N2−1}‖2) ≤ 2^tprec/(N + N1N2) then
4:   L_m ← [X1^k X2^ℓ for k = 0 to d − ℓ for ℓ = 0 to d] // List of monomials, ordered in a way compatible with Algorithm 3, Steps 8-10.
5:   P0 ← Σ_{j=0}^{N−1} U[0, j] L_m[j], P1 ← Σ_{j=0}^{N−1} U[1, j] L_m[j]

6: R(X1) ← Res_{X2}(P0(X1, X2), P1(X1, X2))
7: if R(X1) ≠ 0 then
8:   L ← {t ∈ Z ; R(t) = 0}
9:   return L
10: else
11:   return "FAIL"
12: end if
13: else
14:   return "FAIL"
15: end if
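Steps 6-8 eliminate $X_2$ via a resultant and keep the integer roots of $R$. As a self-contained illustration (ours, not the paper's implementation, which handles general bivariate polynomials), the following Python sketch treats the special case where $P_0$ and $P_1$ have degree 1 in $X_2$, so that the Sylvester determinant is $2 \times 2$; integer roots are then located by a direct search over a candidate range:

```python
# Polynomials in X1 are lists of integer coefficients, low degree first.
def poly_mul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def poly_sub(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) - (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_eval(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def resultant_deg1(A0, B0, A1, B1):
    # Res_{X2}(A0(X1)*X2 + B0(X1), A1(X1)*X2 + B1(X1)) = A0*B1 - A1*B0
    return poly_sub(poly_mul(A0, B1), poly_mul(A1, B0))

# Toy instance: P0 = X2 - X1^2, P1 = X2 - 2*X1 (coprime, small coefficients)
R = resultant_deg1([1], [0, 0, -1], [1], [0, -2])  # R(X1) = X1^2 - 2*X1

# Step 8: keep the integer roots of R within the search range for X = u*x
roots = [x for x in range(-10, 11) if poly_eval(R, x) == 0]
assert roots == [0, 2]  # at X1 = 0 and 2, P0 and P1 share a root in X2
```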

Corollary 6.15. Assume that $\det(\hat A \hat A^t)^{1/(2(N-1))} \le \frac{2^{-(N-1)/4 - t_{\mathrm{prec}}/(N-1)}}{N + N_1N_2}$; then Theorem 6.12 applies with $\Lambda$ any of the first two vectors of an LLL-reduced basis of the lattice generated by the rows of $\hat A$.

Proof. Identical to the proof of Corollary 5.12. 

We now study the case $\rho_1 = K_1/(b_1 - a_1)$ and, similarly, $\rho_2 = K_2/(b_2 - a_2)$, where $K_1 \ge 2(b_1 - a_1)$ and $K_2 \ge 2(b_2 - a_2)$ are fixed real numbers (note that $\rho_1, \rho_2 \ge 2$); we further assume $\rho_1^{N_1} \le \rho_2^{N_2}$.

Proposition 6.16. Let $f$ be analytic in a neighbourhood of the closed disc $\mathcal{D}_{a_1,b_1,K_1} = \{z \in \mathbb{C} : |z - (a_1+b_1)/2| \le K_1/2\}$, $d$ be an integer $\ge 2$, $N = (d+1)(d+2)/2$, $\rho_1 = K_1/(b_1-a_1) \ge 2$, $\rho_2 = K_2/(b_2-a_2) \ge 2$, $\gamma = \log\rho_2/\log\rho_1 \in [3, N]$, and $N_1 = \lfloor\sqrt{2\gamma N}\rfloor$, $N_2 = \lceil\sqrt{2N/\gamma}\rceil$ two integers. Let $M_{\mathcal{D}_{a_1,b_1,K_1}}(f) := \max_{z \in \mathcal{D}_{a_1,b_1,K_1}} |f(z)|$. Then, for $d \to \infty$, if

$$b_1 - a_1 < K_1\, 2^{-O\left(\frac{N}{\psi(N/\gamma)\gamma}\right)} \left(\frac{uv}{2}(|a_1+b_1| + K_1)\left(M_{\mathcal{D}_{a_1,b_1,K_1}}(f) + \frac{K_2 + |a_2+b_2|}{2}\right)\right)^{-\frac{d}{3\psi(N/\gamma)\gamma}(1 + O(1/d))},$$

we have $\Delta^{1/(N-1)}_{N,N_1,N_2,[a_1,b_1],[a_2,b_2],\rho_1,\rho_2} < \frac{2^{-(N-1)/4 - t_{\mathrm{prec}}/(N-1)}}{N + N_1N_2}$.

Proof. Since $\rho_1 = K_1/(b_1-a_1) \ge 2$, we have $E_{\rho_1,a_1,b_1} \subset \mathcal{D}_{a_1,b_1,K_1}$. Thanks to Corollary 6.9, in view of $(\rho_i/(\rho_i-1))^{N/(N-1)} \le 2^{3/2}$ for $i \in \{1,2\}$, we have
$$\Delta^{1/(N-1)}_{N,N_1,N_2,[a_1,b_1],[a_2,b_2],\rho_1,\rho_2} \le 2^{O(1)} N^{1/2+o(1)} \frac{(uv(|a_1+b_1|+K_1)/2)^{d/3+O(1)}}{\rho_1^{\psi(N/\gamma)\gamma + O(1)}} \left(M_{\mathcal{D}_{a_1,b_1,K_1}}(f) + \frac{K_2+|a_2+b_2|}{2}\right)^{d/3+O(1)}.$$

Note that, using Lemma 6.11, as $\rho_1^{N_1} \le \rho_2^{N_2}$,
$$2^{-t_{\mathrm{prec}}} \ge \frac{4}{N}\left(\rho_1^{-N_1} + \rho_2^{-N_2}\right) \ge \frac{8}{N}\rho_2^{-N_2} = \frac{8}{N}\rho_1^{-\gamma N_2} \ge 2^{-o(N)}\rho_1^{-O(N)},$$
as $\gamma N_2 < \sqrt{2N\gamma} + \gamma \le N(1+\sqrt{2})$. Thus, for $\Delta^{1/(N-1)}_{N,N_1,N_2,[a_1,b_1],[a_2,b_2],\rho_1,\rho_2} < 2^{-(N-1)/4 - t_{\mathrm{prec}}/(N-1)}/(N + N_1N_2)$, it suffices that $\Delta^{1/(N-1)}_{N,N_1,N_2,[a_1,b_1],[a_2,b_2],\rho_1,\rho_2} < 2^{-O(N)}\rho_1^{-O(1)}$, or again that

$$\rho_1 \ge 2^{O\left(\frac{N}{\psi(N/\gamma)\gamma}\right)} \left(\frac{uv}{2}(|a_1+b_1| + K_1)\left(M_{\mathcal{D}_{a_1,b_1,K_1}}(f) + \frac{K_2+|a_2+b_2|}{2}\right)\right)^{\frac{d}{3\psi(N/\gamma)\gamma}(1+O(1/d))}. \qquad \square$$

Corollary 6.17. Under the assumptions of Proposition 6.16, Algorithm 4 over $[a_1, b_1]$ and $[a_2, b_2]$ produces at Step 5 two polynomials $P_0, P_1$ such that

$$\max_{x \in [a_1,b_1],\, t \in [a_2,b_2]} |P_i(ux, v(f(x)+t))| < 1 \quad \text{for } i \in \{0,1\}.$$
In particular, Algorithm 4 never executes Step 14 and its output is valid.

Proof. It suffices to apply Proposition 6.16, Corollary 6.15, and Theorem 6.12. 

Note again that 푃0 and 푃1 may not be coprime, in which case the algorithm returns “FAIL” at Step 11. This is what makes the algorithm heuristic.

6.5. Complexity analysis. In this subsection, we derive estimates for the complexity of our algorithm applied to a fixed interval $[\alpha, \beta)$. As in the univariate case, this actually requires several things:
∙ an evaluation of the complexity of the basic blocks, namely Algorithms 3 and 4;
∙ the use of Corollary 6.17 to evaluate the size of a subinterval $[a_1, b_1]$ which can be treated at once by those algorithms.
We start by giving complexity estimates for Algorithms 3 and 4.

6.5.1. Complexity of Algorithms 3 and 4. In this subsection, we keep the notations and assumptions of Section 5.5.1.

Proposition 6.18. On input $a_1, b_1, a_2, b_2, f, d, N_1, N_2, u, v, \rho_1, \rho_2$, if
$$\mathcal{M} := \max\left(u, v, |a_1|, |b_1|, \rho_1, |a_2|, |b_2|, \rho_2, B_f, \max_{[a_1,b_1]} |f'(x)|\right),$$

under the assumption $C_{f,\mathsf{p}} = O(\mathsf{p}^2)$, the computations of Algorithm 3 can be made in floating-point precision $\mathsf{p} = t_{\mathrm{prec}} + O(\max(d\log\mathcal{M}, |\log((\rho_1-1)(\rho_2-1))|))$. Hence, Steps 1-5 of Algorithm 4 have complexity $O(d^6 M(d^2)(d^2 + \mathsf{p})\mathsf{p})$ using the $L^2$ algorithm.

Proof. Similar to Propositions 5.16 and 5.17. □

6.5.2. Number of subintervals for fixed $d$. Thanks to the results of the previous subsection, given a value $\gamma$, we can estimate the maximum size of an interval $[a_1, b_1] \subset [\alpha, \beta]$, with $\alpha, \beta$ fixed, for which Algorithm 4 succeeds (in the sense of Corollary 6.17) and yields an upper bound of the order of magnitude $w = O(|b_1 - a_1|^{-\gamma})$. This follows from Proposition 6.16, and yields at the same time the number of subintervals to be considered if one wants to deal with a full interval $[\alpha, \beta]$.

Theorem 6.19. Given a fixed $f$ and two fixed real numbers $\alpha, \beta$, Problem 2.6 can heuristically be solved for $u, v \to \infty$, $d \to \infty$, $\gamma \in [3, N]$, over $[\alpha, \beta]$ using

$$\text{(6.9)} \qquad (\beta - \alpha)\, 2^{O\left(\frac{N}{\gamma\psi(N/\gamma)}\right)} (uv)^{\frac{d}{3\psi(N/\gamma)\gamma}(1+O(1/d))}$$
calls to Algorithm 4 with parameter $d$. We then obtain a value
$$\text{(6.10)} \qquad w = 2^{O(N/\psi(N/\gamma))} (uv)^{\frac{d}{3\psi(N/\gamma)}(1+O(1/d))}.$$

Proof. This is a direct consequence of Corollary 6.17, where we note that $a_1, a_2, b_1, b_2$ are bounded, and we choose $K_1 = 2(b_1 - a_1)$, $K_2 = 1$ and $\rho_2 = \rho_1^\gamma$. The heuristic nature of this result comes from the possibility that the two polynomials obtained in Algorithm 4 are not coprime, in which case one cannot recover the solutions $X, Y$ from those two polynomials.
Finally, we can take $1/w = b_2 - a_2$, thus the upper bound on $w$ is $O((b_1 - a_1)^{-\gamma})$, from which the second part of the result follows. □

Remark 6.20. To get a better feeling for this result we should distinguish two cases:
∙ We let $\gamma$ tend to infinity as $N/\kappa$, $\kappa > 1$. Then, we obtain an upper bound for the number of intervals $O\left((\beta-\alpha)(uv)^{\frac{2\kappa}{3d\psi(\kappa)}(1+O(1/d))}\right)$, with $w = (uv)^{\frac{d}{3\psi(\kappa)}(1+O(1/d))}$.
∙ If, on the other hand, $\gamma = o(N)$, then $N/\gamma$ tends to infinity and we can use the asymptotic estimate (see Proposition 6.5) $\psi(N/\gamma) = \frac{2}{3}\sqrt{\frac{2N}{\gamma}} + O(1)$ to get a bound on the number $n_I$ of intervals and on $w$ of the respective forms
$$n_I = (\beta - \alpha)\, 2^{O(d/\sqrt{\gamma})} (uv)^{\frac{1}{2\sqrt{\gamma}}(1+O(\sqrt{\gamma}/d))}, \qquad w = 2^{O(d\sqrt{\gamma})} (uv)^{\frac{\sqrt{\gamma}}{2}(1+O(\sqrt{\gamma}/d))}.$$
The first case resembles the results obtained in the previous section with the univariate algorithm (but with a different constant), whereas the second is unattainable using the methods of the previous section. For $u = v = 2^p$, $\gamma = 4 + o(1)$, $d = o(p)$, we recover Stehlé's result [60], namely the fact that we can solve the TMD (i.e., get the bound $1/w = 2^{-2p}$) in time $2^{(p/2)(1+o(1))}$.

Remark 6.21. One can adapt Remark 5.22 and Theorems 5.26 and 5.27 to the case of this bivariate method. However, for the last two results, the region where the discussion makes sense is restricted to $\gamma \gg N/(\log N)^2$, as otherwise the impact of choosing an optimal $\rho_1$ occurs only in second-order terms.

Further, this leads to a somewhat delicate discussion and, in the end, yields the same result up to some improvement in the constants. We thus chose not to include the corresponding theorems.
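As a quick sanity check of the last claim of Remark 6.20 (our own arithmetic, taking the asymptotic exponents at face value): for $\gamma = 4$, the exponent of $uv$ in $w$ is $\sqrt{\gamma}/2 = 1$, so with $u = v = 2^p$ we get $1/w = (uv)^{-1} = 2^{-2p}$, while the number of intervals grows like $(uv)^{1/(2\sqrt{\gamma})} = 2^{p/2}$, matching Stehlé's result [60]:

```python
import math

gamma = 4
w_exp = math.sqrt(gamma) / 2        # exponent of uv in w (Remark 6.20)
n_exp = 1 / (2 * math.sqrt(gamma))  # exponent of uv in the number of intervals

# With u = v = 2^p we have uv = 2^(2p), hence:
#   1/w = (uv)^(-w_exp) = 2^(-2p)   (the TMD bound)
#   n_I ~ (uv)^(n_exp)  = 2^(p/2)   (Stehle's running time)
assert w_exp == 1.0
assert n_exp == 0.25
```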

7. Comparison with previous work

The following table summarizes the main results of the paper and compares them to [60]. The columns "Complexity" and "Bound on $w$" should be understood as the exponent of $uv$ in the corresponding values. For the sake of readability,

∙ in the first row, we restrict to $\omega_0 \in \mathbb{Z}_{\ge 0}$ just in order to get a more compact bound;
∙ in the fourth row, we assume that $d = o(\log(uv))$;
∙ we omit all factors $(1 + o(1))$ in the asymptotic results.
Rows labelled "1D" refer to Section 5 (Algorithms 1 and 2) whereas rows labelled "2D" refer to Section 6 (Algorithms 3 and 4).

Method | References | Parameters to be chosen | Matrix dimensions | Complexity (exponent of uv) | Bound on w (exponent of uv)
1D | Rk. 5.20 | ω0 ∈ [0, N) ∩ Z≥0 | N × 2N | 2Nd/(3(N−ω0)(N+ω0−3)) | 2Nd/(3(N+ω0−3))
1D, d→∞ | Cor. 5.21 | λ = ω0/N ∈ [0, 1) | N × 2N | 4/(3(1−λ²)d) | 2d/(3(1+λ))
2D, d→∞ | Rk. 6.20 | γ = N/κ, κ > 1 | ≈ N × 3N | 2κ/(3dψ(κ)) | d/(3ψ(κ))
2D, d→∞ | Rk. 6.20 | γ = o(N) | ≈ N × 3N | 1/(2√γ) | √γ/2
S, α→∞ | [60, Thm. 3] | ξ ≥ 2 | ((α+1)(α+2)/2) × O(ξα²) | 1/(2√ξ) | √ξ/2
Table 2. Comparison of the main methods of this paper and [60]

Concerning [60], we have introduced a parameter $\xi$ for a clearer comparison; namely, we have put (with Stehlé's notations) $n_1 - t = \log_2(uv)/(2\sqrt{\xi})$, which gives $n_2 + m = \sqrt{\xi}\log_2(uv)/2$. Note that Stehlé's $\alpha$ corresponds to our $d$, while his $d$ is a technical parameter with a completely different meaning from our $d$; finally, Stehlé's $t$ corresponds, in our notations, to $p - 1 - \log_2(b_1 - a_1)$. To avoid any confusion, we will denote them $d_{\mathrm{Ste}}$, $\alpha_{\mathrm{Ste}}$ and $t_{\mathrm{Ste}}$ in the sequel.

Remark 7.1. Table 2 only estimates the exponential part of the complexity; it would remain to estimate the polynomial part, which is dominated by the cost of lattice basis reduction. The Akhavi-Stehlé trick greatly reduces the influence of the dimension, but the sizes of the integers involved in the different methods differ. Roughly speaking, and ignoring the dependency on $f, a, b$, which is similar for the two methods:
∙ in the case of the univariate method, the size of the integers involved is $\approx t_{\mathrm{prec}} + d\log(\max(u,v)) \approx d\log\max(u,v) + N\log\rho$, which is of the order of $O(d\log\max(u,v) + \log w)$. As, for this method, $\log(w) = O(d\log(uv))$, the size of the integers is $O(d\log\max(u,v))$;
∙ in the case of the bivariate method, the size of the integers involved is $\approx d\log(\max(u,v)) + t_{\mathrm{prec}} \approx d\log\max(u,v) + \log\max(\rho_1^{N_1}, \rho_2^{N_2})$, whereas

the bound on $w$ is of the order of $\rho_2$; as we expect, for optimal choices of parameters, that $N_1\log\rho_1$ and $N_2\log\rho_2$ have the same order of magnitude, this size is thus $O(d\log\max(u,v) + N_2\log\rho_2) = O(d\log\max(u,v) + N_2\log w)$. As $N_2 \asymp d/\sqrt{\gamma}$, this is $O\left(d\left(\log\max(u,v) + \log w/\sqrt{\gamma}\right)\right)$;
∙ in Stehlé's paper, the integers involved are of the order of $(MN_2N_1)^{d_{\mathrm{Ste}}\alpha_{\mathrm{Ste}}}$, which, in our notations, means that their size is of the order of $O(d(\log w + \log\max(u,v)))$.
This difference in the size of the integers involved may, at least partially, explain the fact that lattice basis reduction performs somewhat better in our method than in the BaCSeL implementation of Stehlé's method (see Section 8).

7.1. Univariate method and Bombieri and Pila's approach. Our univariate method bears a strong resemblance to Bombieri and Pila's approach [6] to bounding the number of integer points on an analytic curve. The method that we develop is effective, and yields a way to control integer points not only on the curve, but close to the curve.
Note that we (asymptotically) recover Bombieri and Pila's estimate [6, Main Lemma] under the form $(uv)^{\frac{4}{3(d+3)}}$, which is the number of intervals required. We can thus recover, following Bombieri and Pila's arguments based on [6, Thm. 5] or [55], their bound for the number of points on the curve (without any heuristic). Our method also allows for the explicit determination of those points, but is on this point only (mildly) heuristic.
Our variation with the $\omega_0$, on the other hand, is new; it worsens the quality of Bombieri and Pila's bound, but improves the distance around the curve in which we are able to detect points with denominator dividing $v$.

7.2. Bivariate method vs. Stehlé’s approach. On the other hand, our bivariate method bears a strong resemblance with Stehlé’s work [60]. Our approach mostly differs by the use of approximation-related techniques (Chebyshev interpolants) rather than a computer-algebra oriented vision of functions using Taylor expansions. We have a dense representation of our auxiliary polynomials, which leads us to manipulate almost square matrices, whereas Stehlé’s matrices are inherently more rectangular, because he has to represent all coefficients of the bivariate polynomials he manipulates. Table5 shows that our approach is better in practice. We propose an analysis of this favourable situation in Section 8.3. Note two further facts : ∙ our analysis is sharper in the case of entire functions, as we are able to take into account the growth of the function at infinity. The same results could probably be derived in Stehlé’s paper using Cauchy inequalities to estimate Taylor coefficients. This sharper analysis allows us to a obtain, in Problem 2.6, a lower −푂(푝2/ log 푝) bound 1/푤 > (푢푣) in the case where we want to solve the problem without any subdivision. This improves at the same time on Stehlé’s method and, heuristically, Nesterenko-Waldschmidt paper which, though with different goals and through a purely theoretical (vs. algorithmic) −푂(푝2) method, both obtain the upper bound 푤 6 (푢푣) . 44 N. BRISEBARRE AND G. HANROT

∙ Our analysis is also sharper in the more practical domain where we choose 훾 = 푁/휅 for some constant 휅. We obtain better constants in the exponents for both the overall complexity of the method and the bound on 푤.

8. Experimental results

We have implemented TMD-oriented versions of our algorithms in SageMath18. Our code is available from https://perso.ens-lyon.fr/nicolas.brisebarre/tmd.html. The tests hereafter were executed on an Intel Xeon E5620 2.40GHz CPU with a 64-bit Linux-based system.
In the two examples that we address, we cut the binades under consideration into subintervals of the same size and we apply the algorithms to each subinterval. For Algorithms 1 and 2, the subinterval corresponds to the interval $[a, b]$ considered in these algorithms. For Algorithms 3 and 4, the subinterval corresponds to the interval $[a_1, b_1]$, while $[a_2, b_2]$ is equal to $[-1/w, 1/w]$, cf. Problem 2.6.
Currently, the most expensive part of our algorithms is the LLL reduction. In our implementations, the following two optimizations led to a significant speedup of the LLL reduction part:
(1) we use a random projection trick inspired from [2], cf. Theorem 4.6;

(2) if we consider two contiguous subintervals $I_0$ and $I_1$, the matrices $M_{c,I_0}$ and $M_{r,I_0}$, $M_{c,I_1}$ and $M_{r,I_1}$, output by Algorithm 1 (resp. Algorithm 3) applied to $I_0$ and $I_1$, will be close by construction. Hence, our optimization consists in:
∙ retrieving the change-of-basis matrix $U_{I_0}$ computed at Step 2 of Algorithm 2 (resp. Step 2 of Algorithm 4) applied to $I_0$,
∙ then left-multiplying $M_{c,I_1}$ and $M_{r,I_1}$ by $U_{I_0}$, which operates in practice as a significant prereduction of the lattice built from $M_{c,I_1}$ and $M_{r,I_1}$,
∙ eventually applying LLL to these prereduced matrices.
Another optimization comes from the use of Newton polynomials instead of monomials (as pointed out in Section 5.3.4): for given values of $d, N, N_1, N_2$, it makes it possible to process larger subintervals.
The timings and the values of $\log_2(w)$ presented with a * are estimates: we performed our computations on a subinterval and then extrapolated the timing to the whole binade, and the corresponding value of $\log_2(w)$.
We chose to limit the evaluation of our algorithms to feasible computations in binary128, namely computations that could be performed in real life, possibly using a large cluster. In terms of the bound on $w$, we have thus excluded the optimal case of the TMD, namely $w \approx 2^{2p}$, and have started at $w \approx 2^{6p}$.
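The prereduction trick can be sketched in a few lines of Python (our own toy illustration: a textbook rational-arithmetic LLL on a tiny matrix, not the fplll-based implementation). We reduce the matrix of a first interval $I_0$, keep the change-of-basis matrix $U_{I_0}$, and apply it to the nearby matrix of the contiguous interval $I_1$; the entries collapse before the final LLL call:

```python
from fractions import Fraction

def gso(B):
    # Rational Gram-Schmidt orthogonalisation of the rows of B.
    n, m = len(B), len(B[0])
    Bs = [[Fraction(x) for x in row] for row in B]
    mu = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            mu[i][j] = (sum(Fraction(B[i][t]) * Bs[j][t] for t in range(m))
                        / sum(x * x for x in Bs[j]))
            for t in range(m):
                Bs[i][t] -= mu[i][j] * Bs[j][t]
    return Bs, mu

def lll(B, delta=Fraction(3, 4)):
    # Textbook LLL; returns (reduced basis M, unimodular U) with U * B == M.
    n, m = len(B), len(B[0])
    B = [row[:] for row in B]
    U = [[int(i == j) for j in range(n)] for i in range(n)]
    k = 1
    while k < n:
        Bs, mu = gso(B)
        for j in range(k - 1, -1, -1):
            q = round(mu[k][j])                     # size reduction
            if q:
                B[k] = [B[k][t] - q * B[j][t] for t in range(m)]
                U[k] = [U[k][t] - q * U[j][t] for t in range(n)]
                Bs, mu = gso(B)
        if (sum(x * x for x in Bs[k])
                >= (delta - mu[k][k - 1] ** 2) * sum(x * x for x in Bs[k - 1])):
            k += 1                                  # Lovász condition holds
        else:
            B[k], B[k - 1] = B[k - 1], B[k]
            U[k], U[k - 1] = U[k - 1], U[k]
            k = max(k - 1, 1)
    return B, U

def matmul(U, B):
    return [[sum(U[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(U))]

B0 = [[1, 1000], [0, 1999]]    # toy matrix for interval I0
M0, U0 = lll(B0)
assert matmul(U0, B0) == M0    # U0 is the change-of-basis matrix

B1 = [[1, 1001], [0, 1999]]    # nearby matrix for the contiguous interval I1
P = matmul(U0, B1)             # prereduction: reuse U0 before calling LLL
assert (max(abs(x) for r in P for x in r)
        < max(abs(x) for r in B1 for x in r))  # entries already much smaller
M1, _ = lll(P)                 # cheap final LLL call on the prereduced basis
```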

8.1. Algorithms 1 and 2 in action: the TMD for the gamma function in binary128. Euler's gamma function [64, Chap. 3] is one of the functions of the C mathematical library. Very little is known about the Diophantine properties of its values at rational numbers: we have $\Gamma(k+1) = k!$ for any $k \in \mathbb{N}$, and the transcendence of the numbers $\Gamma(1/2)$, $\Gamma(1/3)$, $\Gamma(1/4)$, $\Gamma(1/6)$, $\Gamma(2/3)$, $\Gamma(3/4)$, $\Gamma(5/6)$ [69]. We used our implementation of Algorithms 1 and 2 to address the TMD over $[1, 2)$, for directed rounding functions and for the precision $p = 113$. More precisely, we address the

18 https://www.sagemath.org/

following question, for various values of the parameters $d$ and $\omega_0$: compute $w > 0$ and all the integers $X$, $1 \le X/2^{p-1} < 2$, for which there exists $Y \in \mathbb{Z}$ satisfying
$$\left|\Gamma\left(\frac{X}{2^{p-1}}\right) - \frac{Y}{2^p}\right| < \frac{1}{w}.$$
We report our results in Table 3. We first set $p = 113$, $u = 2^{p-1}$, $v = 2^p$, then the integer $d$ and the real number $\omega_0$, and finally we choose $\rho$ in order to maximize the width of the subinterval $[a, b]$ of $[1, 2)$ to which we apply the algorithms. The column $\|M\|_\infty$ stands for the size (in bits) of the largest coefficient of the integer matrix $M = (M_c \mid M_r)$ which is LLL-reduced during the algorithm.

d | N | ρ | b − a | ω0 | ‖M‖∞ | log2(w) | Timing
6 | 28 | 2^32 | 2^(−31) | 0 | ≈ 1575 bits | 7.86p* | 35.53* years
8 | 45 | 2^26 | 21/2^29 | 0 | ≈ 2070 bits | 10.11p* | 510* days
10 | 66 | 2^21 | 13/2^24 | 0 | ≈ 2510 bits | 12.24p | 102 days
12 | 91 | 2^19 | 25/2^23 | 14 | ≈ 2810 bits | 12.92p | 84 days
12 | 91 | 2^19 | 21/2^22 | 8 | ≈ 2920 bits | 13.79p | 57 days
12 | 91 | 2^18 | 15/2^21 | 0 | ≈ 2985 bits | 14.44p | 46 days
Table 3. Algorithms 1 and 2: the gamma function over the binade [1, 2)
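At a toy precision, the question above can be answered by sheer enumeration, which gives a concrete feeling for the problem; the point of Algorithms 1 and 2 is precisely to avoid this $2^{p-1}$-point search at $p = 113$. A minimal Python sketch (ours; the precision $p = 12$ and the acceptance threshold are illustrative, and the exact point $X = 2^{p-1}$, for which $\Gamma(1) = 1$, is excluded):

```python
import math

# Toy version of the question of Section 8.1: at precision p = 12 instead of
# p = 113, find the input X in (1, 2) whose image under gamma lies closest
# to a rounding breakpoint Y/2^p.
p = 12
worst_dist, worst_X = 1.0, None
for X in range(2**(p - 1) + 1, 2**p):      # 1 < X/2^(p-1) < 2; X = 2^(p-1)
    g = math.gamma(X / 2**(p - 1))         # is excluded since Gamma(1) = 1
    frac = g * 2**p - round(g * 2**p)      # signed distance to nearest Y/2^p
    if abs(frac) < worst_dist:
        worst_dist, worst_X = abs(frac), X
assert worst_X is not None and worst_dist < 0.05
```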

8.2. Using $\omega_0$ in Algorithms 1 and 2: the TMD for the exponential function in binary128 - experimental validation of Figure 1. Using the exp function over $[1/4, 1/2)$, we have also studied the influence of $\omega_0$, in order to build the experimental equivalent of the theoretical curves of Figure 1. The exponential function is also part of the C mathematical library. We recall that Section 3.1 presents theoretical results, up to date to the best of our knowledge, regarding the TMD in the case of the exponential function.
Here, we address the TMD for exp over $[1/4, 1/2)$, for directed rounding functions and for the precision $p = 113$: we compute $w > 0$ and all the integers $X$, $1/4 \le X/2^{p+1} < 1/2$, for which there exists $Y \in \mathbb{Z}$ satisfying
$$\left|\exp\left(\frac{X}{2^{p+1}}\right) - \frac{Y}{2^{p-1}}\right| < \frac{1}{w}.$$
Here, we set $u = 2^{p+1}$ and $v = 2^{p-1}$. For each value of the bound on $w$, we have tried to find, for various values of $d$, the $\omega_0$ allowing us to attain this bound in minimal time (in the case of exp, we use $\rho = d/(b-a)$, cf. Theorem 5.26). Figure 2 represents the $\log_2$ of the time needed to treat a binade ($y$-axis) as a function of $\log_2(w)/p$ ($x$-axis); this is indeed the experimental equivalent of Figure 1, up to a rescaling of the $x$-axis (by a factor of 2, as Figure 1 is expressed in terms of powers of $uv = 2^{2p}$) and of the $y$-axis.

8.3. Algorithms 3 and 4 in action: the TMD for the exponential function in binary128. We used our implementation of Algorithms 3 and 4 to address the TMD over $[1/4, 1/2)$, for directed rounding functions and for the precision $p = 113$. More precisely, we address the following question, for various values of the parameter

[Figure 2. Exponent estimates: $\log_2$ of the time to treat a binade as a function of $\log_2(w)/p$, for $d = 6, 7, \ldots, 12$.]

$w$: determine the integers $X$, $1/4 \le X/2^{p+1} < 1/2$, for which there exists $Y \in \mathbb{Z}$ satisfying
$$\left|\exp\left(\frac{X}{2^{p+1}}\right) - \frac{Y}{2^{p-1}}\right| < \frac{1}{w}.$$
We report our results in Table 4. We first set $p = 113$, $u = 2^{p+1}$, $v = 2^{p-1}$, $b_2 = -a_2 = 1/w$ and the value of $d$. The choice of the parameters $N_1, N_2, \rho_1, a_1$ and $b_1$ is then made in order to maximize the width of the subinterval $[a_1, b_1]$ of $[1/4, 1/2)$.19 We finally fix $\rho_2 = \min(\rho_1^{N_1/N_2}, 1/b_2)$. The column $\|M\|_\infty$ stands for the largest coefficient of the integer matrix $M = (M_c \mid M_r)$ which is to be LLL-reduced.

log2(w) | d | N | N1 | N2 | ρ1 | b1 − a1 | ‖M‖∞ | Timing
6p | 6 | 28 | 24 | 2 | 2^36 | 7/2^35 | ≈ 1540 bits | 22.63* years
6p | 12 | 91 | 60 | 3 | 2^29 | 59/2^30 | ≈ 3080 bits | 3.41* years
8p | 8 | 45 | 39 | 2 | 2^30 | 15/2^29 | ≈ 2065 bits | 203* days
8p | 12 | 91 | 59 | 3 | 2^26 | 2^(−21) | ≈ 2870 bits | 137 days
10p | 10 | 66 | 59 | 2 | 2^25 | 7/2^23 | ≈ 2590 bits | 25.8 days
10p | 12 | 91 | 61 | 3 | 2^24 | 7/2^22 | ≈ 3260 bits | 37.7 days
12p | 12 | 91 | 80 | 2 | 2^21 | 5/2^19 | ≈ 3020 bits | 8.7 days
Table 4. Algorithms 3 and 4: exp over the binade [1/4, 1/2)

For the sake of practical comparison, we have attempted a comparison with BaCSeL-4.0 (20), which implements [60]. For the latter, we used BaCSeL-4.0 only for the generation of the corresponding matrix, and simply measured the cost of the LLL step (which dominates the total cost anyway), using the same implementation of fplll as in our code.
For the comparison to be fair, we have included the Akhavi-Stehlé trick (cf. Theorem 4.6) in BaCSeL. We have also tried to include the prereduction trick, but

19 Actually, it is the value of $b_1 - a_1$ which matters, and not the values of $a_1$ and $b_1$.
20 https://gitlab.inria.fr/zimmerma/bacsel

the latter, in the setting of [60], seems to make the reduction more costly. In this case, the complete timings are merely estimates of the cost of treating a whole binade. The results are reported in Table 5. Again, the column $\|M\|_\infty$ stands for the largest coefficient of the integer matrix which is to be LLL-reduced.

log2(w) | α_Ste | d_Ste | t_Ste | ‖M‖∞ | Timing | ‖M‖∞ (ratio vs. this paper) | Timing (ratio vs. this paper)
6p | 6 | 16 | 78.7 | ≈ 3870 bits | 169* years | ×2.5 | ×7.5
6p | 12 | 20 | 86.2 | ≈ 7780 bits | 334* years | ×2.5 | ×98
8p | 8 | 30 | 86.3 | ≈ 7220 bits | 16.82* years | ×3.5 | ×30
8p | 12 | 30 | 90.1 | ≈ 10230 bits | 35.01* years | ×3.5 | ×93
10p | 10 | 40 | 90.8 | ≈ 11150 bits | 7.37* years | ×4.3 | ×104
10p | 12 | 55 | 93.2 | ≈ 13545 bits | 10.53* years | ×4.2 | ×102
12p | 12 | 60 | 94.6 | ≈ 16255 bits | 3.81* years | ×5.4 | ×160
Table 5. Stehlé's BaCSeL parameters and timings for the exponential function over the binade [1/4, 1/2)

We observe that we gain a significant constant factor, increasing with the value of $\alpha_{\mathrm{Ste}}$ ($= d$); our method allows for slightly larger intervals (by a factor around 2, which seems to decrease with $d$), but the main factors explaining the difference are the fact that our lattices seem somewhat easier to reduce and that the "prereduction trick" also plays an important role. These three terms each account for a factor of roughly 2 to 5 (depending on the cases), explaining overall the factors of $\approx$ 9-100 that we observe above.
It should finally be recalled that the comparison is biased in favour of [60]: we are comparing optimized C code to an algorithm implemented in an interpreted language. The comparison remains rather fair as long as the lattice basis reduction dominates (which is the case for $\alpha_{\mathrm{Ste}} = 10, 12$), but a low-level implementation of our algorithms is required in order to get reliable results when our parameter $d \le 8$, where the gap is probably larger than suggested by Table 5.

Remark 8.1. Note that, regarding the targets $\log_2(w) = 6p, 8p$, a more relevant comparison between Table 4 and Table 5 would probably be obtained by comparing
∙ Row 2 of Table 4 with Row 1 of Table 5, which corresponds to the best choice of parameters for the problem with $\log_2(w) = 6p$; with this criterion, the ratio is $\approx 50$;
∙ Row 4 of Table 4 with Row 3 of Table 5, which corresponds to the best choice of parameters for the problem with $\log_2(w) = 8p$; with this criterion, the ratio is $\approx 45$.

9. Conclusion

We expect this work to be used in practice to address the TMD for the binary128 format, but we also hope that it will be of practical use for the determination of integer points close to a transcendental curve. Moreover, we believe that the tools that we have developed are of more general interest in the context of practical applications of Coppersmith's method, and allow easier analysis of some "rectangular" variants.
Regarding future work, we plan, first, to improve our current implementations and then to study the case of algebraic functions, for which we have some preliminary results.

Acknowledgements. We wish to thank Jean-Michel Muller for so many invaluable discussions, Martin Albrecht for his kind and effective help about fplll and Paul Zimmermann for his thorough and extremely helpful rereading of this paper.

References

[1] M. Ajtai. The Shortest Vector Problem in L2 is NP-hard for Randomized Reductions (Extended Abstract). In Proceedings of the 30th ACM Symposium on Theory of Computing (STOC), pages 10-19, 1998.
[2] A. Akhavi and D. Stehlé. Speeding-up Lattice Reduction with Random Projections. In E. S. Laber, C. F. Bornstein, L. T. Nogueira, and L. Faria, editors, LATIN 2008: Theoretical Informatics, 8th Latin American Symposium, Búzios, Brazil, April 7-11, 2008, Proceedings, volume 4957 of Lecture Notes in Computer Science, pages 293-305. Springer, 2008.
[3] American National Standards Institute and Institute of Electrical and Electronic Engineers. IEEE Standard for Binary Floating-Point Arithmetic. ANSI/IEEE Standard 754-1985, 1985.
[4] American National Standards Institute and Institute of Electrical and Electronic Engineers. IEEE Standard for Radix Independent Floating-Point Arithmetic. ANSI/IEEE Standard 854-1987, 1987.
[5] V. Berthé and L. Imbert. Diophantine approximation, Ostrowski numeration and the double-base number system. Discrete Math. Theor. Comput. Sci., 11(1):153-172, 2009.
[6] E. Bombieri and J. Pila. The number of integral points on arcs and ovals. Duke Math. J., 59(2):337-357, 1989.
[7] D. Boneh and G. Durfee. Cryptanalysis of RSA with private key d less than N^0.292. IEEE Trans. Inform. Theory, 46(4):1339-1349, 2000.
[8] J. P. Boyd. Chebyshev and Fourier spectral methods. Dover Publications Inc., Mineola, NY, second edition, 2001. Available from http://www-personal.umich.edu/~jpboyd/BOOK_Spectral2000.html.
[9] N. Brisebarre and S. Chevillard. Efficient polynomial L∞ approximations. In ARITH '07: Proceedings of the 18th IEEE Symposium on Computer Arithmetic, pages 169-176, Washington, DC, 2007. IEEE Computer Society.
[10] N. Brisebarre, G. Hanrot, and O. Robert. Exponential sums and correctly-rounded functions. IEEE Trans. Comput., 66(12):2044-2057, 2017.
[11] N. Brisebarre and J.-M. Muller. Correct rounding of algebraic functions. Theor. Inform.
Appl., 41(1):71-83, 2007.
[12] M. Bronstein. Symbolic integration. I, volume 1 of Algorithms and Computation in Mathematics. Springer-Verlag, Berlin, second edition, 2005. Transcendental functions, with a foreword by B. F. Caviness.
[13] J. W. S. Cassels. An introduction to the geometry of numbers. Classics in Mathematics. Springer-Verlag, Berlin, 1997. Corrected reprint of the 1971 edition.
[14] E. W. Cheney. Introduction to approximation theory. AMS Chelsea Publishing, Providence, RI, 1998. Reprint of the second (1982) edition.
[15] S. Chevillard. Évaluation efficace de fonctions numériques - Outils et exemples. PhD thesis, École normale supérieure de Lyon - Université de Lyon, 2009. In French.
[16] W. J. Cody. A proposed radix and word length independent standard for floating-point arithmetic. ACM SIGNUM Newsletter, 20:37-51, Jan. 1985.
[17] W. J. Cody, J. T. Coonen, D. M. Gay, K. Hanson, D. Hough, W. Kahan, R. Karpinski, J. Palmer, F. N. Ris, and D. Stevenson. A proposed radix-and-word-length-independent standard for floating-point arithmetic. IEEE MICRO, 4(4):86-100, Aug. 1984.

[18] H. Cohen. A course in computational algebraic number theory, volume 138 of Graduate Texts in Mathematics. Springer-Verlag, Berlin, 1993.
[19] D. Coppersmith. Small solutions to polynomial equations, and low exponent RSA vulnerabilities. J. Cryptology, 10(4):233-260, 1997.
[20] D. Coppersmith. Finding small solutions to small degree polynomials. In J. H. Silverman, editor, Proceedings of Cryptography and Lattices (CaLC), volume 2146 of Lecture Notes in Computer Science, pages 20-31. Springer-Verlag, Berlin, 2001.
[21] F. de Dinechin, A. V. Ershov, and N. Gast. Towards the post-ultimate libm. In Proceedings of the 17th IEEE Symposium on Computer Arithmetic, ARITH '05, pages 288-295, Washington, DC, USA, 2005. IEEE Computer Society.
[22] F. de Dinechin, C. Lauter, and J.-M. Muller. Fast and correctly rounded logarithms in double-precision. Theor. Inform. Appl., 41(1):85-102, 2007.
[23] F. De Dinechin, J.-M. Muller, B. Pasca, and A. Plesco. An FPGA architecture for solving the Table Maker's Dilemma. In Application-Specific Systems, Architectures and Processors (ASAP), 2011 IEEE International Conference on, pages 187-194, Santa Monica, United States, Sept. 2011. IEEE Computer Society.
[24] L. Fox and I. B. Parker. Chebyshev polynomials in numerical analysis. Oxford University Press, London-New York-Toronto, Ont., 1968.
[25] F. Gramain. Sur le lemme de Siegel (d'après E. Bombieri et J. Vaaler). Publications mathématiques de l'Univ. P. et M. Curie, 64:fascicule 1, 1983-1984.
[26] P. M. Gruber and C. G. Lekkerkerker. Geometry of numbers, volume 37 of North-Holland Mathematical Library. North-Holland Publishing Co., Amsterdam, second edition, 1987.
[27] G. Hanrot, V. Lefèvre, D. Stehlé, and P. Zimmermann. Worst Cases of a Periodic Function for Large Arguments. In P. Kornerup and J.-M. Muller, editors, 18th IEEE Symposium in Computer Arithmetic, pages 133-140, Montpellier, France, June 2007. IEEE.
[28] IEEE.
IEEE Standard for Floating-Point Arithmetic (IEEE Std 754-2019). July 2019. Available at https://ieeexplore.ieee.org/servlet/opac?punumber=8766227.
[29] IEEE Computer Society. IEEE Standard for Floating-Point Arithmetic. IEEE Standard 754-2008, Aug. 2008. Available at https://ieeexplore.ieee.org/servlet/opac?punumber=4610933.
[30] C. Iordache and D. W. Matula. On infinitely precise rounding for division, square root, reciprocal and square root reciprocal. In Koren and Kornerup, editors, Proceedings of the 14th IEEE Symposium on Computer Arithmetic (Adelaide, Australia), pages 233-240. IEEE Computer Society Press, Los Alamitos, CA, Apr. 1999.
[31] C.-P. Jeannerod, N. Louvet, J.-M. Muller, and A. Panhaleux. Midpoints and exact points of some algebraic functions in floating-point arithmetic. IEEE Trans. Comput., 60(2):228-241, 2011.
[32] F. Johansson. Arb: efficient arbitrary-precision midpoint-radius interval arithmetic. IEEE Trans. Comput., 66(8):1281-1292, 2017.
[33] W. Kahan. Why do we need a floating-point standard? Technical report, Computer Science, UC Berkeley, 1981. Available at https://www.cs.berkeley.edu/~wkahan/ieee754status/why-ieee.pdf.
[34] S. Khémira and P. Voutier. Approximation diophantienne et approximants de Hermite-Padé de type I de fonctions exponentielles. Ann. Sci. Math. Québec, 35(1):85-116, 2011.
[35] S. Khémira. Approximants de Hermite-Padé, déterminants d'interpolation et approximation diophantienne. PhD thesis, Université Paris 6, Paris, France, 2005.
[36] O. Knill. Cauchy-Binet for pseudo-determinants. Linear Algebra Appl., 459:522-547, 2014.
[37] T. Lang and J.-M. Muller. Bound on run of zeros and ones for algebraic functions. In N. Burgess and L. Ciminiera, editors, Proceedings of the 15th IEEE Symposium on Computer Arithmetic (ARITH-16), pages 13-20, June 2001.
[38] C. Q. Lauter. Arrondi Correct de Fonctions Mathématiques. PhD thesis, École Normale Supérieure de Lyon, Lyon, France, Oct. 2008.
[39] V. Lefèvre.
Moyens Arithmétiques Pour un Calcul Fiable. PhD thesis, École Normale Supérieure de Lyon, Lyon, France, 2000.
[40] V. Lefèvre. New results on the distance between a segment and Z^2. Application to the exact rounding. In Proceedings of the 17th IEEE Symposium on Computer Arithmetic (ARITH-17), pages 68–75. IEEE Computer Society Press, Los Alamitos, CA, June 2005.
50 N. BRISEBARRE AND G. HANROT

[41] V. Lefèvre and J.-M. Muller. Worst cases for correct rounding of the elementary functions in double precision. In N. Burgess and L. Ciminiera, editors, Proceedings of the 15th IEEE Symposium on Computer Arithmetic (ARITH-16), pages 111–118, Vail, CO, June 2001.
[42] V. Lefèvre, J.-M. Muller, and A. Tisserand. Towards correctly rounded transcendentals. In Proceedings of the 13th IEEE Symposium on Computer Arithmetic, pages 132–137. IEEE Computer Society Press, Los Alamitos, CA, 1997.
[43] A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovász. Factoring polynomials with rational coefficients. Math. Ann., 261(4):515–534, 1982.
[44] J. Liouville. Nouvelle démonstration d'un théorème sur les irrationnelles algébriques. C. R. Acad. Sci. Paris, 18:910–911, 1844.
[45] J. Liouville. Remarques relatives à des classes très-étendues de quantités dont la valeur n'est ni algébrique ni même réductible à des irrationnelles algébriques. C. R. Acad. Sci. Paris, 18:883–885, 1844.
[46] J. Liouville. Sur des classes très étendues de quantités dont la valeur n'est ni algébrique ni même réductible à des irrationnelles algébriques. J. Math. Pures Appl., 16:133–142, 1851.
[47] L. Lovász. An algorithmic theory of numbers, graphs and convexity, volume 50 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1986.
[48] P. W. Markstein. The New IEEE-754 Standard for Floating-Point Arithmetic. In Numerical Validation in Current Hardware Architectures, number 08021 in Dagstuhl Seminar Proceedings. Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany, 2008.
[49] J. C. Mason and D. C. Handscomb. Chebyshev polynomials. CRC Press, 2002.
[50] J.-M. Muller. Elementary Functions, Algorithms and Implementation. Birkhäuser, Boston, 3rd edition, 2016.
[51] J.-M. Muller, N. Brisebarre, F. de Dinechin, C.-P. Jeannerod, V. Lefèvre, G. Melquiond, N. Revol, D. Stehlé, and S. Torres.
Handbook of Floating-Point Arithmetic. Birkhäuser, 2010.
[52] Y. V. Nesterenko and M. Waldschmidt. On the approximation of the values of exponential function and logarithm by algebraic numbers (in Russian). Mat. Zapiski, 2:23–42, 1996. Available in English at http://www.math.jussieu.fr/~miw/articles/ps/Nesterenko.ps.
[53] P. Q. Nguyen and D. Stehlé. An LLL algorithm with quadratic complexity. SIAM J. Comput., 39(3):874–903, 2009.
[54] P. Q. Nguyen and B. Vallée, editors. The LLL Algorithm - Survey and Applications. Information Security and Cryptography. Springer, 2010.
[55] J. Pila. Density of integer points on plane algebraic curves. Internat. Math. Res. Notices, (18):903–912, 1996.
[56] G. Plonka, D. Potts, G. Steidl, and M. Tasche. Numerical Fourier analysis. Applied and Numerical Harmonic Analysis. Birkhäuser/Springer, Cham, 2018.
[57] M. J. D. Powell. Approximation theory and methods. Cambridge University Press, 1981.
[58] T. J. Rivlin. The Chebyshev polynomials. Wiley-Interscience [John Wiley & Sons], New York-London-Sydney, 1974. Pure and Applied Mathematics.
[59] M. J. Schulte and E. E. Swartzlander. Exact rounding of certain elementary functions. In E. E. Swartzlander, M. J. Irwin, and G. Jullien, editors, Proceedings of the 11th IEEE Symposium on Computer Arithmetic, pages 138–145. IEEE Computer Society Press, Los Alamitos, CA, June 1993.
[60] D. Stehlé. On the randomness of bits generated by sufficiently smooth functions. In F. Hess, S. Pauli, and M. E. Pohst, editors, Proceedings of the 7th Algorithmic Number Theory Symposium, ANTS VII, volume 4076 of Lecture Notes in Computer Science, pages 257–274. Springer-Verlag, Berlin, 2006.
[61] D. Stehlé, V. Lefèvre, and P. Zimmermann. Worst Cases and Lattice Reduction. In J.-C. Bajard and M. J. Schulte, editors, Proceedings of the 16th IEEE Symposium on Computer Arithmetic (ARITH-16), pages 142–147. IEEE Computer Society Press, Los Alamitos, CA, June 2003.
[62] D. Stehlé, V. Lefèvre, and P. Zimmermann.
Searching Worst Cases of a One-Variable Function Using Lattice Reduction. IEEE Trans. Comput., 54(3):340–346, Mar. 2005.
[63] G. Strang. The discrete cosine transform. SIAM Rev., 41(1):135–147, 1999.
[64] N. M. Temme. Special functions. An introduction to the classical functions of mathematical physics. A Wiley-Interscience Publication. John Wiley & Sons, Inc., New York, 1996.
NEW RESULTS ON THE TABLE MAKER'S DILEMMA 51

[65] S. Torres. Tools for the Design of Reliable and Efficient Functions Evaluation Libraries. PhD thesis, École normale supérieure de Lyon – Université de Lyon, Lyon, France, 2016.
[66] L. N. Trefethen. Approximation Theory and Approximation Practice. SIAM, 2013.
[67] W. Tucker. Validated numerics. A short introduction to rigorous computations. Princeton University Press, Princeton, NJ, 2011.
[68] J. von zur Gathen and J. Gerhard. Modern computer algebra. Cambridge University Press, third edition, 2013.
[69] M. Waldschmidt. Transcendence of periods: the state of the art. Pure Appl. Math. Q., 2(2, Special Issue: In honor of John H. Coates. Part 2):435–463, 2006.
[70] K. Xu. The Chebyshev points of the first kind. Appl. Numer. Math., 102:17–30, 2016.
[71] A. Ziv. Fast evaluation of elementary mathematical functions with correctly rounded last bit. ACM Trans. Math. Software, 17(3):410–423, Sept. 1991.
[72] A. Zygmund. Trigonometric series. Vol. I, II. Cambridge Mathematical Library. Cambridge University Press, Cambridge, third edition, 2002.

Appendix A. Proofs of facts regarding Chebyshev polynomials

First, we recall the aliasing phenomenon in the case of Chebyshev nodes of the first kind.

Proposition A.1. For all $N \geqslant 1$,
∙ for $k = 0, \dots, N-1$, the polynomials $T_k$, $-T_{2N-k}$, $-T_{2N+k}$, $T_{4N-k}$, $T_{4N+k}, \dots$ take the same values at the $\mu_{j,N-1}$, $j = 0, \dots, N-1$;
∙ for $j \geqslant 0$, let
\[ m = |((j + N - 1) \bmod 2N) - N + 1| \quad\text{and}\quad p = \left\lfloor \frac{N+j}{2N} \right\rfloor; \]
the polynomials $T_j$ and $(-1)^p T_m$ take the same values at the $\mu_{j,N-1}$, $j = 0, \dots, N-1$.

Proof. These are Theorems 1 and 2 of [70]. □

Let $f$ be Lipschitz continuous over $[-1,1]$; we know [72, Chap. VI] that $f$ admits a series expansion $\sum_{n \geqslant 0}^{\prime} a_n T_n(x)$ in $L_2\big([-1,1], (1-x^2)^{-1/2}\big)$ which converges uniformly to $f$. We now state a straightforward consequence of Proposition A.1.

Corollary A.2. Let $N \in \mathbb{N}$, $N \geqslant 1$, and let $f$ be Lipschitz continuous over $[-1,1]$. Its Chebyshev coefficients $(a_k)_{k \geqslant 0}$ and the coefficients $c_k$, $k = 0, \dots, N-1$, of the interpolation polynomial $p_{N-1}$ of $f$ at the Chebyshev nodes of the first kind satisfy, for $k = 0, \dots, N-1$,

\begin{equation}
c_k = a_k - a_{2N-k} - a_{2N+k} + a_{4N-k} + a_{4N+k} - \cdots = \sum_{j \geqslant 0} (-1)^j a_{2jN+k} + \sum_{j \geqslant 1} (-1)^j a_{2jN-k}. \tag{A.1}
\end{equation}

Let $N \in \mathbb{N}$, $N \geqslant 1$; we also define
\[ \gamma_{\rho,0,N-1} = 1 \quad\text{and}\quad \gamma_{\rho,k,N-1} = \frac{1}{1-\rho^{-2N}} \left( 1 + \frac{1}{\rho^{2(N-k)}} \right) \quad\text{for } k = 1, \dots, N-1. \]

Proposition A.3. Let $\rho > 1$, let $N \in \mathbb{N}$, $N \geqslant 1$, and let $f$ be a function analytic in a neighbourhood of $E_\rho$. The coefficients $(c_k)_{k=0,\dots,N-1}$ of the interpolation polynomial of $f$ at the Chebyshev nodes of the first kind satisfy
\[ |c_k| \leqslant 2 \frac{M_\rho(f)}{\rho^k} \gamma_{\rho,k,N-1}, \quad k = 0, \dots, N-1, \]
where $M_\rho(f) = \max_{z \in \mathcal{E}_\rho} |f(z)|$. Moreover, we have
\[ \| f - p_{N-1} \|_{\infty,[-1,1]} \leqslant \frac{4 M_\rho(f)}{\rho^{N-1}(\rho-1)}. \]

Proof. First, we use the following consequence of [66, Thm 8.1]: the Chebyshev coefficients satisfy
\begin{equation}
|a_0| \leqslant M_\rho(f) \quad\text{and}\quad |a_k| \leqslant 2 \frac{M_\rho(f)}{\rho^k}. \tag{A.2}
\end{equation}

Then, we combine Equation (A.1) and Inequalities (A.2) to obtain, for $k = 1, \dots, N-1$,
\begin{align*}
|c_k| &\leqslant |a_k| + |a_{2N-k}| + |a_{2N+k}| + |a_{4N-k}| + |a_{4N+k}| + \cdots \\
&\leqslant 2\frac{M_\rho(f)}{\rho^k} + 2\frac{M_\rho(f)}{\rho^{2N-k}} + 2\frac{M_\rho(f)}{\rho^{2N+k}} + 2\frac{M_\rho(f)}{\rho^{4N-k}} + 2\frac{M_\rho(f)}{\rho^{4N+k}} + \cdots \\
&\leqslant 2\frac{M_\rho(f)}{\rho^k} \left( 1 + \frac{1}{\rho^{2N-2k}} + \frac{1}{\rho^{2N}} + \frac{1}{\rho^{4N-2k}} + \frac{1}{\rho^{4N}} + \cdots \right) \\
&\leqslant 2\frac{M_\rho(f)}{\rho^k} \frac{1}{1-\rho^{-2N}} \left( 1 + \frac{1}{\rho^{2(N-k)}} \right).
\end{align*}

Moreover, recall that $c_0 = \frac{2}{N} \sum_{1 \leqslant \ell \leqslant N} f(\mu_\ell)$, hence $|c_0| \leqslant 2 \max_{x\in[-1,1]} |f(x)| \leqslant 2 M_\rho(f)$ by the maximum principle.

Now, we turn to the estimate on the remainder. Corollary A.2 yields, for any $x \in [-1,1]$,
\[ f(x) - p_{N-1}(x) = \sum_{k \geqslant N} a_k \big( T_k(x) - (-1)^p T_m(x) \big), \]
where $m$ and $p$ are defined as in Proposition A.1. Hence, we have, for any $x \in [-1,1]$,
\[ |f(x) - p_{N-1}(x)| \leqslant \sum_{k \geqslant N} |a_k| \, |T_k(x) - (-1)^p T_m(x)| \leqslant 2 \sum_{k \geqslant N} |a_k| \leqslant 4 M_\rho(f) \sum_{k \geqslant N} \rho^{-k} = \frac{4 M_\rho(f)}{\rho^{N-1}(\rho-1)}. \] □
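The aliasing identity of Proposition A.1 and the bounds of Proposition A.3 lend themselves to a direct numerical sanity check. The sketch below is illustrative only (plain Python; the test function exp and the parameters N = 16, ρ = 2 are our own choices, not taken from the paper): it interpolates at Chebyshev nodes of the first kind, checks that T_5 and -T_3 agree at the four nodes for N = 4 (the case j = 5, m = 3, p = 1 of Proposition A.1), and verifies the coefficient and remainder bounds of Proposition A.3, using the elementary fact that exp attains its maximum on the Bernstein ellipse E_ρ at the real point (ρ + 1/ρ)/2.

```python
import math

def cheb_nodes(N):
    # Chebyshev nodes of the first kind, mu_j = cos((j + 1/2) pi / N)
    return [math.cos((j + 0.5) * math.pi / N) for j in range(N)]

def cheb_T(k, x):
    # T_k(x) = cos(k arccos x) for x in [-1, 1]
    return math.cos(k * math.acos(max(-1.0, min(1.0, x))))

def interp_coeffs(f, N):
    # c_k = (2/N) sum_j f(mu_j) T_k(mu_j); the interpolant is
    # p_{N-1}(x) = c_0/2 + sum_{k>=1} c_k T_k(x)
    mu = cheb_nodes(N)
    return [2.0 / N * sum(f(t) * cheb_T(k, t) for t in mu) for k in range(N)]

def interp_eval(c, x):
    return c[0] / 2.0 + sum(ck * cheb_T(k, x) for k, ck in enumerate(c[1:], 1))

# (i) Aliasing (Proposition A.1): for N = 4, T_5 and -T_3 coincide at the nodes.
for t in cheb_nodes(4):
    assert abs(cheb_T(5, t) + cheb_T(3, t)) < 1e-12

# (ii) Coefficient decay and remainder (Proposition A.3) for f = exp, rho = 2.
N, rho = 16, 2.0
M_rho = math.exp((rho + 1.0 / rho) / 2.0)   # max of |exp| on the ellipse E_rho
c = interp_coeffs(math.exp, N)
for k in range(N):
    g = 1.0 if k == 0 else (1.0 + rho ** (-2 * (N - k))) / (1.0 - rho ** (-2 * N))
    assert abs(c[k]) <= 2.0 * M_rho * rho ** (-k) * g + 1e-12
rem = max(abs(interp_eval(c, i / 50.0) - math.exp(i / 50.0)) for i in range(-50, 51))
assert rem <= 4.0 * M_rho / (rho ** (N - 1) * (rho - 1))
```

In practice the observed remainder is far below the bound, as expected for an entire function, for which ρ can be taken arbitrarily large.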



Regarding the two-variable case, we start by establishing results analogous to [66, Thms 8.1 and 8.2]. Let $f$ be in $L_2\big([-1,1] \times [-1,1], (1-x^2)^{-1/2}(1-y^2)^{-1/2}\big)$; we denote by $\sum_{n_1 \geqslant 0}^{\prime} \sum_{n_2 \geqslant 0}^{\prime} a_{n_1,n_2} T_{n_1}(x) T_{n_2}(y)$ its series expansion.

Proposition A.4. Let $\rho_1, \rho_2 > 1$ and let $f$ be a function analytic in a neighbourhood of $E_{\rho_1,\rho_2}$. The coefficients $a_{n_1,n_2}$, $n_1, n_2 \geqslant 0$, of the Chebyshev series of $f$ satisfy, for all $n_1$ and $n_2 \in \mathbb{N}$,
\begin{equation}
|a_{n_1,n_2}| \leqslant 4 \frac{M_{\rho_1,\rho_2}(f)}{\rho_1^{n_1} \rho_2^{n_2}}, \tag{A.3}
\end{equation}
where $M_{\rho_1,\rho_2}(f) = \max_{z \in \mathcal{E}_{\rho_1,\rho_2}} |f(z)|$. Moreover, we have, for all $n_1$ and $n_2 \in \mathbb{N}$,
\[ \left\| f(x,y) - \sum_{k_1=0}^{n_1}{}^{\prime} \sum_{k_2=0}^{n_2}{}^{\prime} a_{k_1,k_2} T_{k_1}(x) T_{k_2}(y) \right\|_{\infty,[-1,1]\times[-1,1]} \leqslant \frac{4 \rho_1 \rho_2 M_{\rho_1,\rho_2}(f)}{(\rho_1-1)(\rho_2-1)} \left( \frac{1}{\rho_1^{n_1+1}} + \frac{1}{\rho_2^{n_2+1}} \right). \]

Proof. For $\rho > 0$, we define $\mathcal{C}_\rho = \{ z \in \mathbb{C}, |z| = \rho \}$. Extending what is done in the proof of [66, Thm. 8.1], we now introduce the change of variables $x = (z_1 + z_1^{-1})/2$, $y = (z_2 + z_2^{-1})/2$, where $z_1, z_2 \in \mathcal{C}_1$, and the function
\[ f(x,y) = F(z_1,z_2) = \sum_{n_1 \geqslant 0}{}^{\prime} \sum_{n_2 \geqslant 0}{}^{\prime} a_{n_1,n_2} \, \frac{z_1^{n_1} + z_1^{-n_1}}{2} \, \frac{z_2^{n_2} + z_2^{-n_2}}{2}. \]
Now we use Cauchy's integral formula in two variables: for all $n_1, n_2 \in \mathbb{N}$,
\[ \frac{1}{(2i\pi)^2} \int_{\mathcal{C}_1 \times \mathcal{C}_1} \frac{F(z_1,z_2)}{z_1^{n_1+1} z_2^{n_2+1}} \, \mathrm{d}z_1 \mathrm{d}z_2 = \frac{2^{\delta_{0 n_1} + \delta_{0 n_2}}}{4} \, a_{n_1,n_2}. \]
If $g : (z_1,z_2) \mapsto \big((z_1 + z_1^{-1})/2, (z_2 + z_2^{-1})/2\big)$, the domain $E_{\rho_1,\rho_2}$ is the image of $\mathcal{R}_{\rho_1} \times \mathcal{R}_{\rho_2}$, where $\mathcal{R}_\rho = \{ z \in \mathbb{C}, \rho^{-1} < |z| < \rho \}$, via the map $g$. Note that, since $F = f \circ g$, $F$ is analytic in a neighbourhood of $\mathcal{R}_{\rho_1} \times \mathcal{R}_{\rho_2}$, as it is the composition of two analytic functions; hence, for all $n_1, n_2 \in \mathbb{N}$,
\[ a_{n_1,n_2} = \frac{1}{(i\pi)^2} \int_{\mathcal{C}_{\rho_1} \times \mathcal{C}_{\rho_2}} \frac{F(z_1,z_2)}{z_1^{n_1+1} z_2^{n_2+1}} \, \mathrm{d}z_1 \mathrm{d}z_2, \]
from which follows
\[ |a_{n_1,n_2}| \leqslant \frac{1}{\pi^2} \, \frac{(2\pi)^2 \rho_1 \rho_2 \max_{(z_1,z_2) \in \mathcal{C}_{\rho_1} \times \mathcal{C}_{\rho_2}} |F(z_1,z_2)|}{\rho_1^{n_1+1} \rho_2^{n_2+1}} = \frac{4 M_{\rho_1,\rho_2}(f)}{\rho_1^{n_1} \rho_2^{n_2}} \quad\text{for all } n_1, n_2 \in \mathbb{N}. \]

As for the remainder, for all $x, y \in [-1,1]$ and all $n_1$ and $n_2 \in \mathbb{N}$, we have
\[ f(x,y) - \sum_{k_1=0}^{n_1}{}^{\prime} \sum_{k_2=0}^{n_2}{}^{\prime} a_{k_1,k_2} T_{k_1}(x) T_{k_2}(y) = \sum_{k_1 \geqslant n_1+1} \sum_{k_2 \geqslant 0}{}^{\prime} a_{k_1,k_2} T_{k_1}(x) T_{k_2}(y) + \sum_{k_1=0}^{n_1}{}^{\prime} \sum_{k_2 \geqslant n_2+1} a_{k_1,k_2} T_{k_1}(x) T_{k_2}(y), \]
hence,
\[ \left\| f(x,y) - \sum_{k_1=0}^{n_1}{}^{\prime} \sum_{k_2=0}^{n_2}{}^{\prime} a_{k_1,k_2} T_{k_1}(x) T_{k_2}(y) \right\|_{\infty,[-1,1]\times[-1,1]} \leqslant \sum_{k_1 \geqslant n_1+1} \sum_{k_2 \geqslant 0}{}^{\prime} |a_{k_1,k_2}| + \sum_{k_1=0}^{n_1}{}^{\prime} \sum_{k_2 \geqslant n_2+1} |a_{k_1,k_2}| \leqslant 4 M_{\rho_1,\rho_2}(f) \left( \frac{1}{\rho_1^{n_1+1}} + \frac{1}{\rho_2^{n_2+1}} \right) \frac{\rho_1 \rho_2}{(\rho_1-1)(\rho_2-1)}. \] □

Now we can prove:

Proposition A.5. Let $\rho_1, \rho_2 > 1$, let $M_1, M_2 \in \mathbb{N}$, $M_1, M_2 \geqslant 2$, and let $f$ be a function analytic in a neighbourhood of $E_{\rho_1,\rho_2}$. The coefficients $c_{k_1,k_2}$, $k_1 = 0, \dots, M_1-1$, $k_2 = 0, \dots, M_2-1$, of the interpolation polynomial $P_{M_1-1,M_2-1}$ of $f$ at pairs of Chebyshev nodes of the first kind satisfy, for $k_1 = 1, \dots, M_1-1$, $k_2 = 1, \dots, M_2-1$,
\[ |c_{k_1,k_2}| \leqslant 4 \frac{M_{\rho_1,\rho_2}(f)}{\rho_1^{k_1} \rho_2^{k_2}} \, \gamma_{\rho_1,k_1,M_1-1} \, \gamma_{\rho_2,k_2,M_2-1}, \]
where $M_{\rho_1,\rho_2}(f) = \max_{z \in \mathcal{E}_{\rho_1,\rho_2}} |f(z)|$. Moreover, we have
\[ \| f - P_{M_1-1,M_2-1} \|_{\infty,[-1,1]\times[-1,1]} \leqslant \frac{16 \rho_1 \rho_2 M_{\rho_1,\rho_2}(f)}{(\rho_1-1)(\rho_2-1)} \left( \frac{1}{\rho_1^{M_1}} + \frac{1}{\rho_2^{M_2}} \right). \]

Proof. Let $\sum_{n_1 \geqslant 0}^{\prime} \sum_{n_2 \geqslant 0}^{\prime} a_{n_1,n_2} T_{n_1}(x) T_{n_2}(y)$ be the series expansion of $f$; the aliasing phenomenon presented above still exists: for $k_1 = 0, \dots, M_1-1$, $k_2 = 0, \dots, M_2-1$,
\begin{align}
c_{k_1,k_2} = {} & \sum_{p_1=0}^{+\infty} \sum_{p_2=0}^{+\infty} (-1)^{p_1+p_2} a_{2p_1 M_1 + k_1,\, 2p_2 M_2 + k_2} + \sum_{p_1=0}^{+\infty} \sum_{p_2=1}^{+\infty} (-1)^{p_1+p_2} a_{2p_1 M_1 + k_1,\, 2p_2 M_2 - k_2} \tag{A.4} \\
& + \sum_{p_1=1}^{+\infty} \sum_{p_2=0}^{+\infty} (-1)^{p_1+p_2} a_{2p_1 M_1 - k_1,\, 2p_2 M_2 + k_2} + \sum_{p_1=1}^{+\infty} \sum_{p_2=1}^{+\infty} (-1)^{p_1+p_2} a_{2p_1 M_1 - k_1,\, 2p_2 M_2 - k_2}. \notag
\end{align}

We now combine Equation (A.4) and Inequalities (A.3) to obtain, for $k_1 = 1, \dots, M_1-1$, $k_2 = 1, \dots, M_2-1$,
\begin{align*}
|c_{k_1,k_2}| &\leqslant \sum_{p_1=0}^{+\infty}\sum_{p_2=0}^{+\infty} |a_{2p_1M_1+k_1,2p_2M_2+k_2}| + \sum_{p_1=0}^{+\infty}\sum_{p_2=1}^{+\infty} |a_{2p_1M_1+k_1,2p_2M_2-k_2}| \\
&\quad + \sum_{p_1=1}^{+\infty}\sum_{p_2=0}^{+\infty} |a_{2p_1M_1-k_1,2p_2M_2+k_2}| + \sum_{p_1=1}^{+\infty}\sum_{p_2=1}^{+\infty} |a_{2p_1M_1-k_1,2p_2M_2-k_2}| \\
&\leqslant 4\sum_{p_1=0}^{+\infty}\sum_{p_2=0}^{+\infty} \frac{M_{\rho_1,\rho_2}(f)}{\rho_1^{2p_1M_1+k_1}\rho_2^{2p_2M_2+k_2}} + 4\sum_{p_1=0}^{+\infty}\sum_{p_2=1}^{+\infty} \frac{M_{\rho_1,\rho_2}(f)}{\rho_1^{2p_1M_1+k_1}\rho_2^{2p_2M_2-k_2}} \\
&\quad + 4\sum_{p_1=1}^{+\infty}\sum_{p_2=0}^{+\infty} \frac{M_{\rho_1,\rho_2}(f)}{\rho_1^{2p_1M_1-k_1}\rho_2^{2p_2M_2+k_2}} + 4\sum_{p_1=1}^{+\infty}\sum_{p_2=1}^{+\infty} \frac{M_{\rho_1,\rho_2}(f)}{\rho_1^{2p_1M_1-k_1}\rho_2^{2p_2M_2-k_2}} \\
&\leqslant 4 \frac{M_{\rho_1,\rho_2}(f)}{\rho_1^{k_1}\rho_2^{k_2}} \, \frac{1}{1-\rho_1^{-2M_1}} \, \frac{1}{1-\rho_2^{-2M_2}} \left( 1 + \frac{1}{\rho_1^{2(M_1-k_1)}} + \frac{1}{\rho_2^{2(M_2-k_2)}} + \frac{1}{\rho_1^{2(M_1-k_1)}\rho_2^{2(M_2-k_2)}} \right).
\end{align*}

If we use Equation (6.2), we get $|c_{0,0}| \leqslant 4 M_{\rho_1,\rho_2}(f)$ thanks to the maximum principle, and
\begin{align*}
|c_{k_1,0}| &\leqslant 4 \frac{M_{\rho_1,\rho_2}(f)}{\rho_1^{k_1}} \, \frac{1}{1-\rho_1^{-2M_1}} \left( 1 + \frac{1}{\rho_1^{2(M_1-k_1)}} \right) \quad\text{for } k_1 = 1, \dots, M_1-1, \\
|c_{0,k_2}| &\leqslant 4 \frac{M_{\rho_1,\rho_2}(f)}{\rho_2^{k_2}} \, \frac{1}{1-\rho_2^{-2M_2}} \left( 1 + \frac{1}{\rho_2^{2(M_2-k_2)}} \right) \quad\text{for } k_2 = 1, \dots, M_2-1.
\end{align*}
The last two inequalities are consequences of Proposition A.3.

As for the remainder, for all $x, y \in [-1,1]$, we have, thanks to the aliasing phenomenon,
\[ f(x,y) - \sum_{k_1=0}^{M_1-1}{}^{\prime} \sum_{k_2=0}^{M_2-1}{}^{\prime} c_{k_1,k_2} T_{k_1}(x) T_{k_2}(y) = \sum_{k_1 \geqslant M_1} \sum_{k_2 \geqslant 0}{}^{\prime} a_{k_1,k_2} \big(T_{k_1}(x) - (-1)^{p_1} T_{m_1}(x)\big)\big(T_{k_2}(y) - (-1)^{p_2} T_{m_2}(y)\big) + \sum_{k_1=0}^{M_1-1}{}^{\prime} \sum_{k_2 \geqslant M_2} a_{k_1,k_2} \big(T_{k_1}(x) - (-1)^{p_1} T_{m_1}(x)\big)\big(T_{k_2}(y) - (-1)^{p_2} T_{m_2}(y)\big), \]
where $m_1, m_2$ and $p_1, p_2$ are defined as in Proposition A.1. Hence,
\[ \| f - P_{M_1-1,M_2-1} \|_{\infty,[-1,1]\times[-1,1]} \leqslant \sum_{k_1 \geqslant M_1} \sum_{k_2 \geqslant 0}{}^{\prime} 4|a_{k_1,k_2}| + \sum_{k_1=0}^{M_1-1}{}^{\prime} \sum_{k_2 \geqslant M_2} 4|a_{k_1,k_2}| \leqslant 16 M_{\rho_1,\rho_2}(f) \left( \frac{1}{\rho_1^{M_1}} + \frac{1}{\rho_2^{M_2}} \right) \frac{\rho_1\rho_2}{(\rho_1-1)(\rho_2-1)}, \]
thanks to Proposition A.4. □

The next lemma eases the computations. Recall that we introduced in Section 4.1.2
\[ \eta_{\rho,0} = 1 \quad\text{and}\quad \eta_{\rho,k} = \frac{\rho^2+1}{\rho^2-1} \quad\text{for } k = 1, \dots, N-1. \]

Lemma A.6. Let $\rho > 1$ and $N \geqslant 2$; for $k = 0, \dots, N-1$, we have $\gamma_{\rho,k,N-1} \leqslant \eta_{\rho,k}$. In particular, if $\rho \geqslant 2$, $\eta_{\rho,k} \leqslant 2$ for $k = 1, \dots, N-1$.

Proof. The case $k = 0$ is straightforward. For $k = 1, \dots, N-1$, we have
\[ \gamma_{\rho,k,N-1} = \frac{1}{1-\rho^{-2N}}\left( 1 + \frac{1}{\rho^{2(N-k)}} \right) = \frac{\rho^{2N} + \rho^{2k}}{\rho^{2N} - 1} \leqslant \frac{\rho^{2N}}{\rho^{2N}-1}\big(1 + \rho^{-2}\big). \]
The function $u \mapsto u/(u-1)$ is strictly decreasing over $(1,+\infty)$, hence
\[ \gamma_{\rho,k,N-1} \leqslant \frac{\rho^{2N}}{\rho^{2N}-1}\big(1+\rho^{-2}\big) \leqslant \frac{\rho^2}{\rho^2-1}\big(1+\rho^{-2}\big) = \frac{\rho^2+1}{\rho^2-1}. \]
The last statement is obvious. □

Proof of Proposition 4.1. We introduce $\widehat{f} : \widehat{f}(z) = f\big(z\,\frac{b-a}{2} + \frac{a+b}{2}\big)$ for any $z$ in a suitable neighbourhood of $E_\rho$. The coefficients $c_k$ are also the coefficients of the interpolation polynomial in $\mathbb{R}_{N-1}[x]$ of $\widehat{f}$ at the Chebyshev nodes of the first kind. Therefore, we obtain Proposition 4.1 by applying Proposition A.3 to $\widehat{f}$ and Lemma A.6. □

Proof of Proposition 4.2. It is identical to the previous one: it suffices to introduce $\widehat{f} : \widehat{f}(z_1,z_2) = f\big(z_1\,\frac{b_1-a_1}{2} + \frac{a_1+b_1}{2},\, z_2\,\frac{b_2-a_2}{2} + \frac{a_2+b_2}{2}\big)$ for any $(z_1,z_2)$ in a suitable neighbourhood of $E_{\rho_1,\rho_2}$, and then to apply Proposition A.5 to $\widehat{f}$ and Lemma A.6. □
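Lemma A.6 admits a direct numerical sanity check. The sketch below (plain Python; the parameter ranges are illustrative, not from the paper) evaluates γ_{ρ,k,N−1} and η_{ρ,k} and verifies both inequalities of the lemma, including η_{ρ,k} ≤ 2 for ρ ≥ 2.

```python
def gamma(rho, k, N):
    # gamma_{rho,0,N-1} = 1; otherwise (1 + rho^(-2(N-k))) / (1 - rho^(-2N))
    if k == 0:
        return 1.0
    return (1.0 + rho ** (-2 * (N - k))) / (1.0 - rho ** (-2 * N))

def eta(rho, k):
    # eta_{rho,0} = 1; otherwise (rho^2 + 1)/(rho^2 - 1)
    return 1.0 if k == 0 else (rho ** 2 + 1.0) / (rho ** 2 - 1.0)

for rho in (1.1, 1.5, 2.0, 3.0, 10.0):
    for N in range(2, 30):
        for k in range(N):
            assert gamma(rho, k, N) <= eta(rho, k) + 1e-12
            if rho >= 2 and k >= 1:
                assert eta(rho, k) <= 2.0
```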

Appendix B. Proof of Theorem 5.27

For $f(z) = \exp(\exp(z))$, we take $K = \log d$. A sufficient condition for success of Algorithm 2 is, in view of Proposition 5.13 (for $d$ large enough),
\[ b - a < \frac{1}{4}\big( (|a+b| + \log d)\, M_{\mathcal{D}_{a,b,K}}(f) \big)^{-4/(3(1-\lambda^2)d)(1+o(1))} (uv)^{-4/(3(1-\lambda^2)d)(1+o(1))} \log d. \]
We have $M_{\mathcal{D}_{a,b,K}}(f) = 2^{O(d)}$; hence, as in the proof of Theorem 5.26, a sufficient condition is
\[ (uv)^{-4/(3(1-\lambda^2)d)} \log d \to \infty, \]
for which it suffices that
\[ d \log\log d \geqslant \left( \frac{4}{3(1-\lambda^2)} + \varepsilon' \right) \log(uv) \]
for some $\varepsilon' > 0$, which follows from the assumption of the Theorem. The statement on $w$ follows from simple calculus using $w = O(KN^{1-\lambda})$.

For $g(z) = \sum_{n \geqslant 0} \exp(-n^2) z^n$, we have
\[ \max_{|z|=\rho} |g(z)| = g(\rho) = \exp(\log^2\rho/4) \sum_{n \geqslant 0} \exp\big(-(n - \log\rho/2)^2\big) \leqslant \exp(\log^2\rho/4) \sum_{n \in \mathbb{Z}} \exp\big(-(n - \log\rho/2)^2\big) \leqslant 2\exp(\log^2\rho/4). \]
We choose $K = \exp(3(1-\lambda^2)d/2)$. In this case, we have $M_{\mathcal{D}_{a,b,K}}(f) = \exp(9(1-\lambda^2)^2 d^2/16 + O(d))$. A sufficient success condition for Algorithm 2 is thus, as $d \to \infty$,
\[ b - a < \frac{1}{4}\big(|a+b| + \exp(3(1-\lambda^2)d/2)\big)^{-4/(3(1-\lambda^2)d)(1+o(1))} \exp(3(1-\lambda^2)d/2)\, \big(uv\, M_{\mathcal{D}_{a,b,K}}(f)\big)^{-4/(3(1-\lambda^2)d)(1+o(1))}. \]
This condition follows from
\[ \frac{3(1-\lambda^2)d}{2} - \frac{4(1+o(1))}{3(1-\lambda^2)d}\left( \log(uv) + \frac{9(1-\lambda^2)^2 d^2}{16} \right) \to \infty, \]
for which it suffices to have
\[ d^2 \geqslant \left( \frac{16}{9(1-\lambda^2)^2} + \varepsilon' \right) \log(uv) \]
for some $\varepsilon' > 0$. Again, the statement on $w$ follows from simple calculus using $w = O(KN^{1-\lambda})$.

Appendix C. Precision required, 1D case

We now estimate the precision required for the computations performed in Algorithm 1. We shall use a computation model where our real numbers are represented by fixed-point numbers, with $p \geqslant 1$ binary digits following the binary point. This means that for each elementary operation, the result differs from the ideal mathematical result by at most $2^{-p}$ (such a result is usually called a faithful rounding in precision $p$). The notations used hereafter correspond to those of Algorithm 1.

This somewhat artificial model is a simplification of the natural model, which is a floating-point model where the total precision is $p + P$, with $P$ the size of the largest real number encountered during the computation. It allows for a simpler, though probably slightly rougher, analysis; as a consequence, Theorem C.9 is valid for floating-point computations in precision $p + P$.

The following lemma summarizes basic facts on this model:

Lemma C.1. Let $x, y$ be real numbers and $X, Y$ be fixed-point numbers in precision $p$ which are approximations of those, such that $|X - x| \leqslant \varepsilon_x$, $|Y - y| \leqslant \varepsilon_y$, with $\max(\varepsilon_x, \varepsilon_y) < 1/2$. Then, if $\oplus$ and $\otimes$ are the arithmetic operations of our computational model, we have
\begin{align}
|(X \oplus Y) - (x+y)| &\leqslant \varepsilon_x + \varepsilon_y, \tag{C.1} \\
|X \otimes Y - x \cdot y| &\leqslant \varepsilon_x |Y| + |x|\varepsilon_y + 2^{-p} \leqslant \varepsilon_x |y| + |x|\varepsilon_y + \varepsilon_x\varepsilon_y + 2^{-p}. \tag{C.2}
\end{align}
Further, if $Z_1$ is the fixed-point result of the operation $\exp(X)$, and if $g$ is a $C^1$ function over $[a,b]$ and $Z_2$ the fixed-point result of the operation $g(X)$, we have
\begin{equation}
|Z_1 - \exp(x)| \leqslant 2^{-p} + 2\varepsilon_x \exp(x), \qquad |Z_2 - g(x)| \leqslant 2^{-p} + \varepsilon_x \max_{[x-\varepsilon_x,\, x+\varepsilon_x]} |g'|. \tag{C.3}
\end{equation}

As a consequence, if $x_1, \dots, x_n$ are real numbers and $X_1, \dots, X_n$ are fixed-point numbers in precision $p$ which are approximations of those such that $\max_{1\leqslant i\leqslant n} |x_i - X_i| \leqslant \varepsilon$, the error on the sum $x_1 + \cdots + x_n$ (which can be evaluated in any order) is at most $n\varepsilon$.

In the sequel, we shall put
\[ C = \max(1, u, v)\, \max(1, B_x, B_f)\, \max(1, \max_{[a,b]}|f'|). \]
The error analysis of the DCT follows:
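The fixed-point model is easy to simulate: round every elementary result to a multiple of 2^(-p). The minimal sketch below (our own illustration, with arbitrary random inputs; round-to-nearest is used, which satisfies the same |error| ≤ 2^(-p) guarantee as faithful rounding) implements ⊕ and ⊗ in this way and checks the bounds (C.1) and (C.2) empirically.

```python
import random

P = 20                        # fractional bits of the fixed-point format
ULP = 2.0 ** -P

def fx(t):
    # Round t to the nearest multiple of 2^-P, so |fx(t) - t| <= 2^-P.
    return round(t * 2 ** P) / 2 ** P

def fadd(X, Y):
    # The model's (+): exact on fixed-point inputs (the sum of two
    # multiples of 2^-P is again a multiple of 2^-P).
    return fx(X + Y)

def fmul(X, Y):
    # The model's (x): one rounding of the exact product.
    return fx(X * Y)

random.seed(0)
for _ in range(1000):
    x, y = random.uniform(-4, 4), random.uniform(-4, 4)
    X, Y = fx(x), fx(y)
    ex, ey = abs(X - x), abs(Y - y)
    # (C.1): fixed-point addition introduces no new rounding error
    assert abs(fadd(X, Y) - (x + y)) <= ex + ey + 1e-15
    # (C.2): |X (x) Y - xy| <= ex |Y| + |x| ey + 2^-P
    assert abs(fmul(X, Y) - x * y) <= ex * abs(Y) + abs(x) * ey + ULP + 1e-15
```

The tiny slacks (1e-15) only absorb the double-precision noise of the simulation itself.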

Proposition C.2. Assume that each $\phi(L_{cheb}[i])$ is given by an approximation with error $\varepsilon < 1$, with $\varepsilon \geqslant 2^{-p}$, and that the cosines involved in the DCT definition are given by approximations of absolute value $\leqslant 1$ with error $\leqslant 2^{-p}$. Assume that we compute each coefficient of the DCT by computing first the $N$ products, then the sum. Then, we obtain an approximation $\Delta$ of DCT-II$(U)$ at Step 11 of Algorithm 1 such that
\[ \|\Delta - \text{DCT-II}(U)\|_\infty \leqslant N\big(\varepsilon + 2^{-p}(1 + C^d)\big). \]

Proof. We deduce from (C.2) that each product $\phi(L_{cheb}[i]) \cos(k(i+1/2)\pi/N)$ incurs an error at most $\varepsilon + 2^{-p}|\phi(L_{cheb}[i])| + 2^{-p} \leqslant \varepsilon + 2^{-p}(1 + C^d)$. The sum of those terms then, cf. (C.1), incurs an error $\leqslant N(\varepsilon + 2^{-p}(1 + C^d))$, from which the result follows. □

We now turn to the computation of $\alpha^k$; our practical application cases have $\alpha \gg 1$, and we need all the values $\alpha, \dots, \alpha^k$, so we compute $\alpha^k$ by the recurrence $\alpha^k = \alpha^{k-1} \cdot \alpha$.

Proposition C.3. Let $x$ be a nonnegative real number, and $X$ a fixed-point number in precision $p$ approximating $x$, so that $|X - x| \leqslant \varepsilon$, with $1 > \varepsilon \geqslant 2^{-p}$. If $k$ is an integer, and if we define $X_1 = X$ and $X_k = X \otimes X_{k-1}$, we have, for $k \geqslant 1$,
\[ |X_k - x^k| \leqslant k\varepsilon(x+1)^{k-1}. \]

Proof. Induction on $k$; the claim is clear for $k = 1$. We let $\varepsilon_k = |X_k - x^k|$. Then, we have
\[ |X_{k+1} - x^{k+1}| \leqslant |X_{k+1} - X \cdot X_k| + |X \cdot (X_k - x^k)| + |X - x|\, x^k \leqslant 2^{-p} + (x+\varepsilon)\varepsilon_k + x^k\varepsilon, \]
from (C.2) and the induction hypothesis, so that
\[ \varepsilon_{k+1} \leqslant (x+\varepsilon)\varepsilon_k + x^k\varepsilon + 2^{-p} \leqslant (x+\varepsilon)\varepsilon_k + (x^k+1)\varepsilon \leqslant (x+1)\varepsilon_k + (x+1)^k\varepsilon, \]
from which the result follows by induction. □
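In the same simulated rounding model, the bound of Proposition C.3 can be watched in action. The sketch below is illustrative (the choice x = π and p = 24 are ours); it computes the powers by the recurrence X_k = X ⊗ X_{k-1} and compares against the stated bound, with a tiny slack for the double-precision carrier of the simulation.

```python
P = 24

def fx(t):
    # nearest multiple of 2^-P, so |fx(t) - t| <= 2^-P
    return round(t * 2 ** P) / 2 ** P

x = 3.141592653589793            # a nonnegative real whose powers we need
X = fx(x)
eps = 2.0 ** -P                  # |X - x| <= eps, and eps >= 2^-P as assumed

Xk = X
for k in range(2, 16):
    Xk = fx(X * Xk)              # X_k = X (x) X_{k-1}
    # bound of Proposition C.3: |X_k - x^k| <= k eps (x+1)^(k-1)
    assert abs(Xk - x ** k) <= k * eps * (x + 1) ** (k - 1) + 1e-9
```

The observed error grows roughly like k·2^(-P)·x^(k-1), well inside the proved bound, which replaces x by x + 1 to absorb the rounding of each step.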

Corollary C.4. Assume that $\varepsilon \leqslant 1/d$. Let $\alpha$ be a real number with $|\alpha| \leqslant B_x$, and $\beta$ a real number with $|\beta| \leqslant B_f$. If $X$ is a fixed-point approximation of $u\alpha$ with error $\leqslant \varepsilon$ and $Y$ a fixed-point approximation of $v\beta$ with error $\leqslant \varepsilon$, and if $X_k$ and $Y_\ell$ are defined as in Proposition C.3, we define $Z = X_k \otimes Y_\ell$. If $k + \ell \leqslant d$, we have
\[ |Z - (u\alpha)^k(v\beta)^\ell| \leqslant 2^{-p} + 2d\varepsilon(C+1)^{d-1}. \]

Proof. By Proposition C.3, the error on $X_k$ compared to $(u\alpha)^k$ is $\leqslant k\varepsilon(C+1)^{k-1}$; the error on $Y_\ell$ compared to $(v\beta)^\ell$ is $\leqslant \ell\varepsilon(C+1)^{\ell-1}$. Finally, we have
\begin{align*}
|Z - (u\alpha)^k(v\beta)^\ell| &\leqslant |Z - X_k \cdot Y_\ell| + |X_k - (u\alpha)^k||Y_\ell| + |u\alpha|^k|Y_\ell - (v\beta)^\ell| \\
&\leqslant 2^{-p} + k\varepsilon(C+1)^{k-1}\big(C^\ell + \ell\varepsilon(C+1)^{\ell-1}\big) + \ell\varepsilon(C+1)^{\ell-1}C^k \\
&\leqslant 2^{-p} + 2d\varepsilon(C+1)^{d-1},
\end{align*}
using (C.2) and $\ell\varepsilon \leqslant d\varepsilon \leqslant 1$. □

We can now combine the previous results to get an estimate of the precision required for $M_c[i,j]$; for the sake of simplicity, we assume that approximations of the $\cos((k+1/2)\pi/N)$'s to the precision $2^{-p}$ are known and that all are at most $1$ in absolute value.

Theorem C.5. Assume that $a, b, u, v$ are exactly representable in our computation model. Then, the error on the values $uL_{cheb}[i]$ and $vf(L_{cheb}[i])$ is at most $2^{3-p}C$, and the error on the vector $L_{DCT}$ is at most $d(C+1)^d 2^{4-p}$.

Proof. The computations of $(b-a)/2$ and $(b+a)/2$ each incur an error $\leqslant 2^{-p}$. Hence, we deduce from Lemma C.1 that $L_{cheb}[i]$ is computed with an error
\[ \leqslant 2^{-p} + 1 \cdot 2^{-p} + \underbrace{(b-a)/2 \cdot 2^{-p}}_{\text{error on } (b-a)/2 \,\cdot\, \cos((j+1/2)\pi/N)} + 2^{-p} = \big(3 + (b-a)/2\big) \cdot 2^{-p}. \]
Hence, $f(L_{cheb}[i])$ incurs an error $\leqslant 2^{-p}\big(1 + (3 + (b-a)/2)\max_{[a,b]}|f'|\big)$, as we know that $L_{cheb}[i]$ is in $[a,b]$ and can always ensure that the approximation of $L_{cheb}[i]$ is also in $[a,b]$, up to replacing it by $\min(b, \max(a, L_{cheb}[i]))$. Thus, the error incurred on $ux$ and $vf(x)$ for $x = L_{cheb}[i]$ is at most, cf. (C.3),
\[ 2^{-p} + 2^{-p}\max\Big(u\cdot(3 + (b-a)/2),\; v\big(1 + (3 + (b-a)/2)\max_{[a,b]}|f'|\big)\Big) \leqslant 2^{-p}(1 + 5C) \leqslant 6 \cdot 2^{-p}C. \]

Corollary C.4 finally bounds the error on $U$ by $2^{-p}\big(1 + 12d(C+1)^{d-1}C\big) \leqslant 13d(C+1)^d 2^{-p}$. Hence, thanks to Proposition C.2, the overall error on DCT-II$(U)$ is at most
\[ N\big(13d(C+1)^d 2^{-p} + 2^{-p}(1 + C^d)\big) \leqslant N(13d+1)(C+1)^d 2^{-p}. \]
As $N$ is exactly representable, after multiplication by $2/N$ (or $1/N$ for the zero-th coefficient), we obtain, from (C.2), an error on $L_{DCT}$ of at most
\[ (13d+1)(C+1)^d 2^{-p} + 2^{-p} \leqslant d(C+1)^d 2^{4-p}. \] □

Remark C.6. This theorem can be read as a proof, in this case, of the rule of thumb (valid in this computational model) that one should use as working precision "the final precision required, plus the size of the largest element encountered in the computation, plus a few guard bits".

We now turn to similar estimates for the remainders. For the sake of simplicity again, we shall assume that $\rho$, $\rho^{-1}$, $\omega$ are exactly representable in our computational model; in practice, this is a very mild restriction. As $N$ is an integer and $p > 0$, $N$ and $N-1$ are also exactly representable in our computational model.

We compute $R_{\omega_0}$ as $\exp\big(-(N-1-\omega_0)\log\rho\big)/(\rho-1)/4$.

Proposition C.7. Define $C' = \max(1, (N-\omega_0)/(\rho-1))$. Then, the quantity $R_{\omega_0}$ can be computed with error at most $7\cdot 2^{-p}C'$, and the quantity $-\log_2(R_{\omega_0}) + \log_2(N)$ with error at most $5\cdot 2^{-p}(\rho-1)C'$.

Proof. The error on $(N-1-\omega_0)\log\rho$ is at most, cf. (C.2), $2^{-p}(\log\rho + N-1-\omega_0+1) \leqslant 2C'(\rho-1)2^{-p}$. The error on $\rho^{-(N-1-\omega_0)}$ is upper bounded by $2^{-p}\big(1 + 4C'(\rho-1)\big)\rho^{-(N-1-\omega_0)} \leqslant 5\cdot 2^{-p}C'(\rho-1)\rho^{-(N-1-\omega_0)}$. Then, we deduce from (C.3) that the error on $\rho^{-(N-1-\omega_0)}/(\rho-1)$ is $\leqslant 2^{-p}\big(1 + 5C'\rho^{-(N-1-\omega_0)}\big) \leqslant 6C'2^{-p}$, and the division by $4$ incurs a precision loss of $2^{-p}$, so the total error on $R_{\omega_0}$ is $\leqslant 7\cdot 2^{-p}C'$.

Similarly, the error on $(N-1-\omega_0)\log_2\rho$ is at most $2^{-p}(\log_2\rho + N-1-\omega_0+1) \leqslant 3C'(\rho-1)2^{-p}$, while the errors on $\log_2(\rho-1)$ and $\log_2(N)$ are each at most $2^{-p}$. Hence, the total error on $-\log_2(R_{\omega_0}) + \log_2(N)$ is at most $5C'(\rho-1)2^{-p}$. □

As we inherently have to allow for an overestimation of the quantity $B_f$, as has been pointed out, up to rounding this overestimate upwards we shall assume that $B_f$ is exactly representable.

Corollary C.8. The error on $R_{\omega_0}(uB_x)^k(vB_f)^\ell$ is at most $2^{-p}C'(C + 4u\rho)(C+1)^{d-1}(24d+8)$.

Proof. The error on $B_x$ is at most, cf. (C.1), the sum of the error on $(a+b)/2$, which is $\leqslant 2^{-p}$, and the error on the product $(b-a)(\rho+\rho^{-1})/4$, which is $\leqslant 2^{-p}\big((b-a)/4 + \rho + \rho^{-1} + 2^{-p}\big) + 2^{-p} \leqslant \big((b-a)/4 + 4\rho\big)2^{-p}$, cf. (C.2). Hence, the error on $uB_x$ is at most $(C + 4u\rho)2^{-p}$ and the error on $vB_f$ at most $C\cdot 2^{-p}$; thus, Corollary C.4 bounds the error on $(uB_x)^k(vB_f)^\ell$ by $2^{-p} + 2d(C+4u\rho)(C+1)^{d-1}2^{-p} \leqslant 3d(C+4u\rho)(C+1)^{d-1}2^{-p}$.

Further, note that $\max\big((uB_x)^k, (vB_f)^\ell\big) \leqslant C^d$ and $R_{\omega_0} \leqslant 1/(\rho-1)$, and recall that the error on $R_{\omega_0}$ is at most $7\cdot 2^{-p}C'$ (cf. Proposition C.7). Hence, finally, the error on the product is at most
\[ 7\cdot 2^{-p}C'\big(C^d + 3d(C+4u\rho)(C+1)^{d-1}2^{-p}\big) + 3d(C+4u\rho)\frac{(C+1)^{d-1}}{\rho-1}2^{-p} + 2^{-p}, \]
which is in turn upper bounded by $2^{-p}C'(C+4u\rho)(C+1)^{d-1}(24d+8)$, as claimed. □

Now we can state:

Theorem C.9. Put $\mathcal{M} = \max(u, v, |a|, |b|, \rho, B_f, \max_{[a,b]}|f'|)$. For $p \geqslant \mathrm{tprec} + O\big(\max(d\log\mathcal{M}, |\log(\rho-1)|)\big)$, if the faithful rounding mode is set at each step, the computation in a fixed-precision model with $p$ bits after the binary point allows for the computation of $\mathrm{tprec}_{\mathrm{comp}}$, $M_{c,\mathrm{comp}}$, $M_{r,\mathrm{comp}}$ with the property that $\mathrm{tprec}_{\mathrm{comp}} \in \{\mathrm{tprec}, \mathrm{tprec}+1\}$, $M_c[i,j] - M_{c,\mathrm{comp}}[i,j] \in \{0, \mathrm{sgn}(M_c[i,j])\}$, and $M_r[i,j] - M_{r,\mathrm{comp}}[i,j] \in \{0, 1\}$ for $i, j = 0, \dots, N-1$.

Proof. Follows from Proposition C.7, Theorem C.5, Corollary C.8 and the discussion in Subsection 5.3.3. □

Note, as a conclusion, that since the largest real number encountered in this computation has size $O\big(\max(d\log\mathcal{M}, |\log(\rho-1)|)\big)$, the result stated in the theorem remains valid in a floating-point model with precision $\mathrm{tprec} + O\big(\max(d\log\mathcal{M}, |\log(\rho-1)|)\big)$.

Appendix D. Lemmata on $\phi$, $\psi$

In this appendix, we group the facts concerning the function $\psi$ of Section 6.

Lemma D.1. Let $\phi$ be the function from $[1,+\infty)$ to $[1,+\infty)$ defined by $\phi(x) = (1+\lfloor x\rfloor)(x - \lfloor x\rfloor/2)$. Then $\phi$ is continuous and strictly increasing, and defines a bijection from $[1,+\infty)$ to $[1,+\infty)$. For any $x \geqslant 1$, we have $x(x+1)/2 \leqslant \phi(x) \leqslant (x+1/2)^2/2$ and $\sqrt{2x} - 1/2 \leqslant \phi^{-1}(x) \leqslant \sqrt{2x+1/4} - 1/2$.

Proof. For $x \notin \mathbb{Z}$, it is clear that $\phi$ is $\mathcal{C}^1$ in a neighbourhood of $x$ and that $\phi'(x) \geqslant 1$. For $x \in \mathbb{Z}$, we have $\phi(x) = \lim_{t\to x^+}\phi(t) = (1+x)x/2$, whereas $\lim_{t\to x^-}\phi(t) = x(x - (x-1)/2) = x(x+1)/2$. This proves continuity, and the remaining assertions follow.

Let $k \in \mathbb{N}$, $a \in \mathbb{R}$ and $g_a(x) = (x+1-a)(x+a)/2$. We denote by $\phi_k$ the restriction of $\phi$ to $[k, k+1]$. For all $x \in [k, k+1]$, $(\phi_k - g_a)'(x) = k + 1/2 - x$: the function $\phi_k - g_a$ is increasing over $[k, k+1/2]$ and decreasing over $[k+1/2, k+1]$. Now, we remark that $\phi_k(k) = g_0(k)$ and $\phi_k(k+1) = g_0(k+1)$, which yields $\phi_k(x) \geqslant g_0(x) = x(x+1)/2$ for all $x \in [k,k+1]$, and that $\phi_k(k+1/2) = g_{1/2}(k+1/2)$, which yields $\phi_k(x) \leqslant g_{1/2}(x) = (x+1/2)^2/2$ for all $x \in [k,k+1]$. The proof of the remaining inequalities is straightforward. □

Lemma D.2. Let $3 \leqslant \gamma \leqslant N$, $s = \gamma\phi^{-1}(N/\gamma)$, $N_1 = \lfloor\sqrt{2N\gamma}\rfloor$ and $N_2 = \lceil\sqrt{2N/\gamma}\rceil$. We have $\gamma \geqslant N_1/N_2 \geqslant 1$, $s < N_1$ and $\sqrt{2N}\big(\sqrt{2N} - \sqrt{3}/3\big) \leqslant N_1N_2 \leqslant (2+\sqrt{2})N$. Assume that $N \to \infty$; then $\mathrm{card}\,\mathcal{K}_s \leqslant N + O(s/\gamma)$.

Proof. We have
\[ \sqrt{2N\gamma} - \sqrt{2N/\gamma} = \sqrt{2N\gamma}\,(1 - 1/\gamma) \geqslant 1, \]
since $N \geqslant \gamma \geqslant 3$. It follows that $N_1 = \lfloor\sqrt{2N\gamma}\rfloor \geqslant \lceil\sqrt{2N/\gamma}\rceil = N_2$, hence $N_1/N_2 \geqslant 1$. Moreover, for any $\gamma \geqslant 3$,
\[ N_1 = \lfloor\sqrt{2N\gamma}\rfloor \leqslant \gamma\sqrt{2N/\gamma} \leqslant \gamma N_2. \]
Also,
\[ \big(\sqrt{2N\gamma} - 1\big)\sqrt{2N/\gamma} \leqslant N_1N_2 \leqslant \sqrt{2N\gamma}\big(1 + \sqrt{2N/\gamma}\big), \]
hence
\[ \sqrt{2N}\big(\sqrt{2N} - \sqrt{3}/3\big) \leqslant N_1N_2 \leqslant (2+\sqrt{2})N \]
since $N \geqslant \gamma \geqslant 2$.

Finally, from Lemma D.1 and the fact that $(\sqrt{8x+1}-1)/2 < \sqrt{2x} - 3/8$ for all $x \geqslant 1$, we have $s = \gamma\phi^{-1}(N/\gamma) < \gamma\big(\sqrt{2N/\gamma} - 3/8\big) < \sqrt{2N\gamma} - 1 \leqslant N_1$ since $N \geqslant \gamma \geqslant 3$. Therefore, we can apply Lemma 6.3: we have, from (6.6),
\[ \mathrm{card}\,\mathcal{K}_s = \gamma(1+\lfloor s/\gamma\rfloor)(s/\gamma - \lfloor s/\gamma\rfloor/2) + O(s/\gamma) \leqslant \gamma\phi(s/\gamma) + O(s/\gamma) \leqslant N + O(s/\gamma). \] □

Corollary D.3. With the assumptions of Lemma D.2, put $\lambda = N/\gamma$. Then, we have $\Omega_\gamma(N, N_1, N_2) = \psi(\lambda)N\gamma + O(N)$, where
\[ \psi(\lambda) = \frac{1 + \lfloor\phi^{-1}(\lambda)\rfloor}{12\lambda}\Big(6\phi^{-1}(\lambda)^2 - \lfloor\phi^{-1}(\lambda)\rfloor - 2\lfloor\phi^{-1}(\lambda)\rfloor^2\Big). \]
Note that if $\gamma = o(N)$, we have $\lambda \to \infty$ and $\phi^{-1}(\lambda) = \sqrt{2\lambda} + O(1)$, so that $\psi(\lambda) = 2\sqrt{2\lambda}/3 + O(1)$ and
\[ \Omega_\gamma(N, N_1, N_2) = \frac{2\sqrt{2}}{3}N^{3/2}\gamma^{1/2} + O(N\gamma). \]

Proof. Lemma D.2 shows us that the assumptions of Lemma 6.3 are satisfied; we thus get
\[ \sum_{(i,j)\in\mathcal{K}_s} (i + j\gamma) = \psi(\lambda)N\gamma + O(N). \]
Further, note that for our value of $s$, the term $s(N - \mathrm{card}\,\mathcal{K}_s)$ from Lemma 6.1 is $O(s^2/\gamma)$, which is $O\big(\gamma\phi^{-1}(\lambda)^2\big) = O\big(N\phi^{-1}(\lambda)^2/\lambda\big) = O(N)$, thanks to Lemma D.1. The result follows. □

Lemma D.4. For $x \in [1,+\infty)$, we have
\[ -5/6 < \psi(x) - \frac{2\sqrt{2x}}{3} < 0. \]

Proof. We have
\[ \psi(\phi(x)) = \frac{6x^2 - \lfloor x\rfloor - 2\lfloor x\rfloor^2}{12x - 6\lfloor x\rfloor}. \]
For $v \leqslant u < v+1$, define
\[ F(u,v) = \frac{6u^2 - v - 2v^2}{12u - 6v} - 2u/3, \]
so that $\psi(\phi(x)) - 2x/3 = F(x, \lfloor x\rfloor)$.

We maximize $F(u,v)$ for fixed $v$, hence we compute
\[ \frac{\partial F}{\partial u}(u,v) = \frac{-2u^2 + 2uv + v}{3(4u^2 - 4uv + v^2)}. \]
By evaluating $2u^2 - 2uv - v$ at $u = v$ and $u = v+1$, one checks that for fixed $v$ there is a unique $u_0 \in [v, v+1)$ such that $F(u,v)$ increases over $[v, u_0]$ and decreases over $[u_0, v+1)$. Hence, for $u \in [v, v+1)$,
\[ -1/6 = \min\big(F(v,v), F(v+1,v)\big) \leqslant F(u,v) \leqslant F(u_0,v). \]
Finally, we find
\[ F(u_0,v) = (v - u_0)/3 \leqslant 0 \]
(note that the optimal bound for the latter is actually $(1-\sqrt{3})/6$, obtained for $v = 1$, $u_0 = (1+\sqrt{3})/2$). Hence, we have
\[ 2\phi^{-1}(x)/3 - 1/6 \leqslant \psi(x) \leqslant 2\phi^{-1}(x)/3, \]
which, in view of $\sqrt{2x} - 1 < \phi^{-1}(x) < \sqrt{2x}$, gives
\[ -5/6 < \psi(x) - 2\sqrt{2x}/3 < 0. \]

□

Remark D.5. Simple numerical experiments suggest that actually $\psi(x) - 2\sqrt{2x}/3 \in [-1/2, -0.44]$ for $x \geqslant 1$, so that the asymptotic expansion $\psi(x) = 2\sqrt{2x}/3 - 1/2 + o(1)$, once truncated, actually gives an excellent approximation for all $x \geqslant 1$.

The following two lemmas yield useful information on the function $\psi$: invertibility, and the inverse function.

Lemma D.6. The function $x \mapsto \psi(x)$ is continuous and increasing over $[1,\infty)$.

Proof. For the first part, since $\phi$ is continuous and increasing, it suffices to prove that $x \mapsto \psi(\phi(x))$ is continuous; as $12\,\phi(x)\,\psi(\phi(x)) = F(x)$ with $F : x \mapsto (1+\lfloor x\rfloor)(6x^2 - \lfloor x\rfloor - 2\lfloor x\rfloor^2)$, and $\phi$ is continuous and positive, it suffices to prove that $F$ is continuous. Obviously, $F$ is continuous on $[1,\infty)\setminus\mathbb{Z}_{>0}$. If $n$ is an integer, we check that $F(n) = (1+n)(4n^2-n)$, whereas $\lim_{x\to n^-}F(x) = n(4n^2+3n-1) = n(n+1)(4n-1) = F(n)$.

To prove that $x \mapsto \psi(x)$ is increasing, as $\phi$ is increasing it suffices to study $x \mapsto \psi(\phi(x))$. As the latter function is continuous, it suffices to prove that it increases over each interval $(n, n+1)$ for integer $n$. Over such an interval, we have $\psi(\phi(x)) = Ax^2/\phi(x) - B/\phi(x)$ for some nonnegative constants $A$, $B$; it thus suffices to prove that $x^2/\phi(x)$ is increasing, or that $\phi(x)/x^2$ is decreasing. Over $(n, n+1)$, we have $\phi(x)/x^2 = (1+n)(x - n/2)/x^2$, whose derivative $(1+n)(n-x)/x^3$ is negative for $x > n$, so it is indeed decreasing. □

Proposition D.7. We have, for any $\mu \in [1,+\infty)$,
\[ \psi^{-1}(\mu) = \frac{k+1}{2}\left( 2\mu - k + \sqrt{4\mu(\mu-k) + 2k(2k+1)/3} \right), \]
where $k = \lfloor 3\mu/2 + 1/4 \rfloor$. For $\mu \to \infty$, we have $\psi^{-1}(\mu) = 9\mu^2/8 + O(\mu)$.

Proof. We start by noticing that for all integers $\ell$, $\psi(\ell(\ell+1)/2) = (4\ell-1)/6$, which follows easily from $\phi^{-1}(\ell(\ell+1)/2) = \ell$. Let now $k = \lfloor 3\mu/2 + 1/4\rfloor$; then
\[ \psi(k(k+1)/2) \leqslant \mu < \psi((k+1)(k+2)/2), \]
so that
\[ k(k+1)/2 \leqslant \psi^{-1}(\mu) < (k+1)(k+2)/2, \]
and $\lfloor\phi^{-1}(\psi^{-1}(\mu))\rfloor = k$. The asymptotic expansion follows from these inequalities.

As a consequence, $\psi^{-1}(\mu) = \phi\big(\phi^{-1}(\psi^{-1}(\mu))\big) = (k+1)\big(\phi^{-1}(\psi^{-1}(\mu)) - k/2\big)$, so that $\phi^{-1}(\psi^{-1}(\mu)) = \frac{\psi^{-1}(\mu)}{k+1} + k/2$. Hence,
\[ \mu = \psi(\psi^{-1}(\mu)) = \frac{1+k}{12\psi^{-1}(\mu)}\Big( 6\phi^{-1}(\psi^{-1}(\mu))^2 - k - 2k^2 \Big), \]
so that
\[ \frac{12}{1+k}\,\mu\,\psi^{-1}(\mu) = 6\left( \frac{\psi^{-1}(\mu)}{k+1} + k/2 \right)^2 - k - 2k^2. \]
Finally, $\psi^{-1}(\mu)/(k+1)$ is the positive root of the equation
\[ X^2 + (k - 2\mu)X - \frac{k(k+2)}{12} = 0, \]
which gives the explicit form
\[ \psi^{-1}(\mu) = \frac{k+1}{2}\left( 2\mu - k + \sqrt{4\mu(\mu-k) + 2k(2k+1)/3} \right). \] □
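The closed formulas of this appendix are easy to cross-check numerically. The sketch below (plain Python; the grids are illustrative, not from the paper) implements φ, its inverse on the affine pieces, ψ through the identity ψ(φ(x)) = (6x² − ⌊x⌋ − 2⌊x⌋²)/(12x − 6⌊x⌋), and the explicit ψ⁻¹ of Proposition D.7, then verifies the bounds of Lemmas D.1 and D.4 and the inversion identity.

```python
import math

def phi(x):
    # phi(x) = (1 + floor(x)) (x - floor(x)/2), a bijection of [1, oo)
    fl = math.floor(x)
    return (1 + fl) * (x - fl / 2.0)

def phi_inv(y):
    # invert phi on [1, oo): on [k, k+1), phi is affine with slope k+1
    k = 1
    while phi(k + 1) <= y:          # find k with phi(k) <= y < phi(k+1)
        k += 1
    return y / (k + 1) + k / 2.0

def psi(lam):
    # psi via psi(phi(x)) = (6x^2 - fl - 2 fl^2) / (12x - 6 fl), x = phi^{-1}(lam)
    x = phi_inv(lam)
    fl = math.floor(x)
    return (6 * x * x - fl - 2 * fl * fl) / (12 * x - 6 * fl)

def psi_inv(mu):
    # explicit inverse of Proposition D.7
    k = math.floor(3 * mu / 2 + 0.25)
    return (k + 1) / 2.0 * (2 * mu - k
                            + math.sqrt(4 * mu * (mu - k) + 2 * k * (2 * k + 1) / 3.0))

# Lemma D.1: x(x+1)/2 <= phi(x) <= (x + 1/2)^2 / 2
for i in range(100, 1000, 7):
    x = i / 100.0
    assert x * (x + 1) / 2 - 1e-9 <= phi(x) <= (x + 0.5) ** 2 / 2 + 1e-9

# Lemma D.4: -5/6 < psi(x) - 2 sqrt(2x)/3 < 0 for x >= 1
for i in range(100, 2000, 13):
    x = i / 100.0
    d = psi(x) - 2 * math.sqrt(2 * x) / 3
    assert -5.0 / 6.0 < d < 0

# Proposition D.7: psi_inv really inverts psi on [1, oo)
for i in range(100, 500, 11):
    mu = i / 100.0
    assert abs(psi(psi_inv(mu)) - mu) < 1e-9
```

The observed values of ψ(x) − 2√(2x)/3 stay in the narrow band of Remark D.5, well inside the proved (−5/6, 0).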

Appendix E. Proofs of Theorems 6.4 and 6.12

Proof of Theorem 6.4. For 0 6 푖, 푗 6 푁 − 1, 0 6 푘1 6 푁1 − 1, 0 6 푘2 6 푁2 − 1, we have from Proposition 4.2, ⃒ 푐 ⃒ ⃒(퐴 ) ⃒ ⃒ 푘1,푘2,푖 ⃒ ⃒ 1 푖,(푘1,푘2)⃒ 6 훿 +훿 6 ⃒2 0푘1 0푘2 ⃒ 푀 (푓 ) 휌2 + 1 휌2 + 1 4휌 휌 푀 (푓 ) 1 4 휌1,푎1,푏1,휌2,푎2,푏2 푖 1 2 1 2 휌1,푎1,푏1,휌2,푎2,푏2 푖 . 푘1 푘2 2 2 6 푘1 푘2 휌1 휌2 휌1 − 1 휌2 − 1 (휌1 − 1)(휌2 − 1) 휌1 휌2

Moreover, from the definition of $A_2$ and the assumption $\rho_1^{N_1} \le \rho_2^{N_2}$, we know that, for $0 \le i, j \le N-1$,
$$|A_{2,i,j}| \le \frac{16\rho_1\rho_2\,M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f_i)}{(\rho_1-1)(\rho_2-1)}\,\frac{1}{\rho_1^{N_1}}.$$

We now apply Theorem 5.1. We put
$$rr_i = \frac{32\rho_1\rho_2\,M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f_i)}{(\rho_1-1)(\rho_2-1)} \quad\text{for } i = 0, \dots, N-1.$$
Then, notice that the product of the $N$ largest elements among the $cc_j$'s
$$1, \dots, \frac{1}{\rho_1^{k_1}\rho_2^{k_2}}, \dots, \frac{1}{\rho_1^{N_1-1}\rho_2^{N_2-1}}, \underbrace{\frac{1}{\rho_1^{N_1}}, \dots, \frac{1}{\rho_1^{N_1}}}_{N \text{ times}}$$

is upper bounded by $\rho_1^{-\Omega_\gamma(N_1,N_2,N)}$ thanks to the assumption $\rho_1^{\gamma} = \rho_2$, or equivalently, $\rho_1^{-i}\rho_2^{-j} = \rho_1^{-i-\gamma j}$.

64 N. BRISEBARRE AND G. HANROT

We can now apply Theorem 5.1 to obtain the bound

$$\left(\det AA^t\right)^{1/2} \le \left(32\sqrt{N_1N_2+N}\right)^{N}\left(\frac{\rho_1\rho_2}{(\rho_1-1)(\rho_2-1)}\right)^{N}\frac{\prod_{i=1}^{N} M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f_i)}{\rho_1^{\Omega_\gamma(N,N_1,N_2)}},$$
from which the Theorem follows, in view of the preliminary assumption that $N_1N_2 \ge N$. $\square$

Proof of Theorem 6.12. The assumption $\|\Lambda\hat{A}\|_2 \le 1/(N + N_1N_2)$ implies in particular, for all $k, \ell$,

$$\frac{1}{N+N_1N_2} \ge |\lambda_{k,\ell}| \min_i \hat{A}_2[i,i] \ge 2^{1-\mathrm{tprec}}\, N\, |\lambda_{k,\ell}|,$$

cf. proof of Lemma 6.10, hence $|\lambda_{k,\ell}| \le 2^{\mathrm{tprec}-1}/(N(N+N_1N_2))$. For $j = 0, \dots, N+N_1N_2-1$, we have
$$\left|(\Lambda A)[j] - (\Lambda\hat{A})[j]\right| \le \sum_{0\le k+\ell\le d} |\lambda_{k,\ell}|\, 2^{-\mathrm{tprec}} \le \frac{1}{2(N+N_1N_2)},$$
from which follows $\|\Lambda A\|_1 \le \|\Lambda\hat{A}\|_1 + 2^{-1}$. Let $P$ be as in the statement of the Theorem, and
$$Q(x,t) = \sum_{\substack{0\le j_1\le N_1-1\\ 0\le j_2\le N_2-1}} q_{j_1,j_2}\, T_{j_1,[a_1,b_1]}(x)\, T_{j_2,[a_2,b_2]}(t)$$
be the interpolation polynomial for $P(ux, v(f(x)+t))$ at the order-$(N_1, N_2)$ pairs of

Chebyshev nodes of the first kind. Then, the coordinates of $\Lambda A_1$ are exactly the $q_{j_1,j_2}$, $0 \le j_1 \le N_1-1$, $0 \le j_2 \le N_2-1$. Proposition 4.2 shows that
$$\max_{\substack{x\in[a_1,b_1]\\ t\in[a_2,b_2]}} |Q(x,t) - P(ux, v(f(x)+t))| \le 16\,\frac{M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(P)\,\rho_1\rho_2}{(\rho_1-1)(\rho_2-1)}\left(\frac{1}{\rho_1^{N_1}} + \frac{1}{\rho_2^{N_2}}\right)$$
$$\le 16 \sum_{0\le k+\ell\le d} |\lambda_{k,\ell}|\, u^k M_{\rho_1,a_1,b_1}(x)^k\, v^\ell M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f(x)+t)^\ell\, \frac{\rho_1\rho_2}{(\rho_1-1)(\rho_2-1)}\left(\frac{1}{\rho_1^{N_1}} + \frac{1}{\rho_2^{N_2}}\right),$$
hence

$$\max_{\substack{x\in[a_1,b_1]\\ t\in[a_2,b_2]}} |P(ux, v(f(x)+t))| \le \max_{\substack{x\in[a_1,b_1]\\ t\in[a_2,b_2]}} |Q(x,t)| + 16 \sum_{0\le k+\ell\le d} |\lambda_{k,\ell}|\, u^k M_{\rho_1,a_1,b_1}(x)^k\, v^\ell M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f(x)+t)^\ell\, \frac{\rho_1\rho_2}{(\rho_1-1)(\rho_2-1)}\left(\frac{1}{\rho_1^{N_1}} + \frac{1}{\rho_2^{N_2}}\right)$$
$$\le \sum_{\substack{0\le j_1\le N_1-1\\ 0\le j_2\le N_2-1}} |q_{j_1,j_2}| + 16 \sum_{0\le k+\ell\le d} |\lambda_{k,\ell}|\, u^k M_{\rho_1,a_1,b_1}(x)^k\, v^\ell M_{\rho_1,a_1,b_1,\rho_2,a_2,b_2}(f(x)+t)^\ell\, \frac{\rho_1\rho_2}{(\rho_1-1)(\rho_2-1)}\left(\frac{1}{\rho_1^{N_1}} + \frac{1}{\rho_2^{N_2}}\right)$$
(recall that $\max_{x\in[a_i,b_i]} |T_{k,[a_i,b_i]}(x)| = 1$ for all $k$)
$$= \|\Lambda A\|_1 \le \|\Lambda\hat{A}\|_1 + 2^{-1} \le 2^{-1} + \sqrt{N+N_1N_2}\,\|\Lambda\hat{A}\|_2 \quad\text{thanks to the Cauchy--Schwarz inequality}$$
$$\le 2^{-1} + 1/\sqrt{N+N_1N_2} < 1 \quad\text{since } N \ge 3,\ N_1, N_2 \ge 2. \qquad\square$$
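The final two inequalities rest on the elementary Cauchy--Schwarz bound $\|v\|_1 \le \sqrt{n}\,\|v\|_2$ for $v \in \mathbb{R}^n$, here with $n = N + N_1N_2$. A minimal numerical sketch of that step follows; the vector `v` and the dimension `n = 50` are arbitrary stand-ins (not data from the paper), `v` playing the role of $\Lambda\hat{A}$.

```python
import math
import random

def l1(v):
    """l1 norm: sum of absolute values."""
    return sum(abs(x) for x in v)

def l2(v):
    """Euclidean (l2) norm."""
    return math.sqrt(sum(x * x for x in v))

random.seed(0)
n = 50  # plays the role of N + N1*N2
v = [random.uniform(-1.0, 1.0) for _ in range(n)]

# Cauchy-Schwarz with the all-ones vector: ||v||_1 <= sqrt(n) * ||v||_2
assert l1(v) <= math.sqrt(n) * l2(v) + 1e-12

# If ||w||_2 <= 1/n (as assumed for Lambda*A_hat in Theorem 6.12), then
# ||w||_1 <= sqrt(n)/n = 1/sqrt(n), which is < 1/2 as soon as n >= 5.
w = [x / (n * l2(v)) for x in v]  # rescaled so that ||w||_2 = 1/n
assert l1(w) <= 1 / math.sqrt(n) + 1e-12
```

This is exactly why the proof concludes with $2^{-1} + 1/\sqrt{N+N_1N_2} < 1$: the hypotheses $N \ge 3$, $N_1, N_2 \ge 2$ force $N + N_1N_2 \ge 7$, hence $1/\sqrt{N+N_1N_2} < 1/2$.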

Université de Lyon, CNRS, ENS de Lyon, Inria, Université Claude-Bernard Lyon 1, Laboratoire LIP (UMR 5668), Lyon, France.
Email address: [email protected]

Université de Lyon, CNRS, ENS de Lyon, Inria, Université Claude-Bernard Lyon 1, Laboratoire LIP (UMR 5668), Lyon, France. Email address: [email protected]