Integer and Rational Arithmetic


Mathematics and Algorithms for Computer Algebra, Part 1
© 1992 Dr Francis J. Wright – CBPF, Rio de Janeiro
July 9, 2003

3: Integer and rational arithmetic

I now want to apply the basic notions of computational representations and abstract algebra that I have developed so far to concrete algorithms, and briefly to consider their complexity. As I explained earlier, underlying all of computer algebra are the integers, so I will begin by considering algorithms for performing arithmetic on integers of unlimited magnitude, which I will call long integers. It is then a very small step to extend this to arbitrary rational numbers. I will not consider any approximate representations, such as rounded representations of the reals, because they are not fundamental to CA and they lie more within the domain of numerical analysis. Given algorithms to manipulate long integers it is a small extension to manipulate rounded real representations of unlimited magnitude and precision. I will not explicitly consider complex numbers because they are a trivial extension of the real numbers, and similarly arithmetic on the integers mod m, which as we have seen is very important, is essentially a trivial extension of integer arithmetic.

The definitive presentation of the material discussed in this set of notes is in Chapter 4 "Arithmetic" in Vol. 2 "Seminumerical Algorithms" of the classic book The Art of Computer Programming by Donald E. Knuth (Addison-Wesley, 1981), which henceforth I will refer to simply as "Knuth"; most of it is in Section 4.3.1 "Multiple-precision arithmetic: The classical algorithms".

1 Representation of long integers

We have already seen how to do this in practice, by choosing a base $B$ that conveniently fits into a word of memory, and then linking a sufficient number of words together, either as a linked list or an array. There is a question about exactly what the base should be, to which we will return later.

Any integer can be represented by its magnitude together with its sign. I will assume that the sign, which requires only one bit of information, is stored somewhere and can be accessed as required, and I will focus on the representation of non-negative integers. The manipulation of long integers essentially just formalizes the conventional techniques for performing integer arithmetic by hand using "long multiplication", "long division", etc. In order to understand the algorithms it is often useful to imagine the base $B$ to be 10, even though the real intention is that the base will be more like $10^5$ or $10^{10}$.

The mathematics of this representation requires the lexicographic ordering once again, but now in a purely numerical context. Each digit (component) of a long (unlimited magnitude) integer is itself a (small) integer, so a (long) integer is represented by a set of digits (small integers), but this set has an order corresponding to the positional representation used when writing integers conventionally. Hence we need to formalize the ordering on such sets of digits.

1.1 Lexicographic order

Let $E_1, \ldots, E_r$ be ordered sets (e.g. each $\{0, 1, 2, \ldots, B-1\}$), and define the relation $<$ on the Cartesian product set $E = E_1 \times \cdots \times E_r$ of $r$-tuples of the form $x = (x_1, \ldots, x_r) \in E$ by $x < y$ if $x_i < y_i$ for the first index $i$ such that $x_i \neq y_i$. This is called the lexicographic order on $E$. For example, taking $B = 10$ as in conventional decimal representation, and $r = 5$ to allow integers with up to 5 digits, this ordering would give $(1, 2, 3, ?, ?) < (1, 2, 4, ?, ?)$, where $?$ represents any digit, or expressed more conventionally and in words: 123 hundred and anything is less than 124 hundred and anything.

[Note that if each $E_i$ is an ordered set of variable identifiers together with 1, ordered so that $x_i > 1$ for all $i$, and $(x_1, x_2, \ldots, x_r)$ denotes the product $x_1 \times x_2 \times \cdots \times x_r$, then $E$ is the set of all monomials of total degree at most $r$, and this lexicographic ordering is exactly the same as that which I described less formally in the first set of notes.]

The set $E$ together with the lexicographic order relation is called the lexicographic product of the sets $E_i$. If each $E_i$ is totally ordered then so is $E$.
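As a minimal illustration of this definition (my own sketch in Python, not part of the original notes; the function name and tuple lengths are chosen only for the example), lexicographic comparison scans the tuples from the most significant position and decides at the first place where they differ:

```python
def lex_less(x, y):
    """Lexicographic order on r-tuples: x < y iff x_i < y_i at the
    first index i where the tuples differ."""
    assert len(x) == len(y), "tuples must have the same length r"
    for xi, yi in zip(x, y):
        if xi != yi:
            return xi < yi
    return False  # equal tuples are not strictly less

# The example from the text, with B = 10 and r = 5:
# "123 hundred and anything" is less than "124 hundred and anything".
print(lex_less((1, 2, 3, 9, 9), (1, 2, 4, 0, 0)))  # True
```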
1.2 Existence and uniqueness of the representation

Theorem 1. Let $B$ be an integer $> 1$, and for $k$ a strictly positive integer let $B^k$ be the lexicographic product of $k$ copies of the set $B = \{0, 1, \ldots, B-1\}$. Then the map $f_k : B^k \to \{0, 1, \ldots, B^k - 1\}$ defined by
$$f_k : (d_0, d_1, \ldots, d_{k-1}) \mapsto \sum_{i=0}^{k-1} d_i B^{k-i-1}$$
is an isomorphism.

Proof is by induction on $k$. □

This shows how to go from a representation to the integer that it represents. To go the other way we need the observation that for any integer $a > 0$ there is a smallest integer $k > 0$ such that $a < B^k$. Then according to the above theorem $a$ has a unique representation of the form
$$a = (d_0, d_1, \ldots, d_{k-1})_B, \quad d_0 \neq 0.$$
The (small) integers $d_i$ are called the digits of $a$ in base $B$; the base is normally only indicated (as a subscript) when necessary to avoid ambiguity.

Finally, the following bounds will be useful later.

Proposition 2. If the positive integer $a$ admits the following representation in base $B$:
$$a = \sum_{i=0}^{k} a_i B^i, \quad \text{with } a_k \neq 0,$$
then
$$B^k \le a_k B^k \le a \le B^{k+1} - 1.$$
[Beware of the change of notation here!]

Proof. The only non-trivial inequality is the last one, which follows from the observation that
$$a_i \le B - 1 \;\Rightarrow\; a \le (B-1)(1 + B + \cdots + B^k) = B^{k+1} - 1. \qquad □$$

1.3 Number bases

The most commonly used bases are 10, giving the decimal or denary system; 2, giving the binary system; 8, giving the octal system; and 16, giving the hexadecimal system. Modern computers normally use binary internally, although because binary representation generates rather long strings of digits it is common for computer programmers to use hexadecimal or octal notation. The smallest addressable amount of memory is usually 8 bits, called a byte, and one hexadecimal digit corresponds to 4 bits, which is sometimes called a nibble (or nybble) (because "nibble" means a small "bite" in the more conventional context of eating food). Therefore one byte is conveniently represented by two hexadecimal digits. However, we only have conventional symbols for 10 digits, whereas hexadecimal representation needs 16 digits, so these are represented as $\{0, 1, 2, \ldots, 9, a, b, \ldots, f\}$. Octal representation has the advantage that it requires only (the first 8) conventional digit symbols.

(In fact, many computers can also work in decimal internally using a representation called binary coded decimal (BCD), in which nybbles are used to store only decimal digits. If not much arithmetic is performed then BCD avoids the input-output overhead of converting from decimal to binary and back again, and may therefore be more efficient, but there is no floating-point analogue of BCD and it is used in commerce rather than in science.)

When a representation requires more memory than is available it is said to overflow, which can easily happen in the conventional representation of numbers using a fixed number of words. In our flexible multi-word representation it would correspond to completely running out of memory, which is certainly possible. However, it is important to avoid conventional overflow during arithmetic calculations with multi-word integers. Addition (and hence subtraction) of two integers in any base can produce at most one extra digit, and so a choice of base that leaves at least one binary digit (bit) free in each word of the representation is necessary. Thus if the word size of a computer is $n$ bits the base should use at most $n - 1$ of them, and so we require $B \le 2^{n-1}$. This implies a maximum digit size of $2^{n-1} - 1$, hence we can add two maximal digits to get $2^n - 2$, which is less than the maximum (small) integer that can be stored in an $n$-bit word, namely $2^n - 1$.

This choice of base is the most storage-efficient among reasonable choices, but it causes difficulty in multiplication, because the product of two $(n-1)$-bit digits can have up to $2n - 2$ bits! Using only half of the available bits to avoid this problem does not make efficient use of memory, so the solution is to take care when designing the multiplication algorithm, to which we will return below.
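To see why the choice $B = 2^{n-1}$ keeps addition within an $n$-bit word, here is a small sketch of my own (in Python, whose native integers never overflow, so the word-size bound is only documented in the comments, not enforced): every intermediate sum of two digits plus an incoming carry is at most $2(B-1)+1 = 2^n - 1$.

```python
def add_digits(a, b, n_bits=32):
    """Add two non-negative long integers given as lists of digits in
    base B = 2**(n_bits - 1), most significant digit first.
    Every intermediate value fits in an n_bits-bit word, since
    (B-1) + (B-1) + 1 = 2**n_bits - 1."""
    B = 1 << (n_bits - 1)
    a, b = a[::-1], b[::-1]          # work least significant digit first
    if len(a) < len(b):
        a, b = b, a
    b = b + [0] * (len(a) - len(b))  # pad the shorter operand with zeros
    result, carry = [], 0
    for da, db in zip(a, b):
        s = da + db + carry          # at most 2**n_bits - 1: no word overflow
        result.append(s % B)
        carry = s // B               # carry is 0 or 1
    if carry:
        result.append(carry)         # at most one extra digit overall
    return result[::-1]

# Example in base B = 2**31: (B + (B-1)) + 1 = 2B, i.e. digits [2, 0].
print(add_digits([1, 2**31 - 1], [0, 1]))  # [2, 0]
```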
1.4 Converting between representation and long integer

To convert from a representation $a = (a_0, a_1, \ldots, a_{k-1})$ to the integer that it represents is simply a matter of evaluating the sum
$$a = \sum_{i=0}^{k-1} a_i B^{k-i-1}.$$
Note that this is the same as the standard numerical task of evaluating a (univariate) polynomial, and the efficient way to do it is exactly the same. For example, to numerically evaluate the polynomial
$$a_0 x^3 + a_1 x^2 + a_2 x + a_3$$
one writes it in Horner's nested form as
$$((a_0 x + a_1)x + a_2)x + a_3,$$
which (generally) involves the minimum number of multiplications, and (as always) one evaluates from the inside of the nest of parentheses outwards. In the general case, this evaluation scheme is expressed by the following algorithm. I will write algorithms in an algorithmic pseudo-language that is similar to Pascal and REDUCE, and not unlike Maple, C, etc.
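The pseudo-language algorithm itself is not included in this excerpt of the notes. As a stand-in, here is a minimal sketch of my own in Python: Horner evaluation converts a digit list (most significant digit first, as in the representation above) to the integer it denotes with one multiplication per digit, and repeated division by $B$ gives the inverse conversion, so the round trip can be checked.

```python
def digits_to_int(digits, B):
    """Horner evaluation: ((d0*B + d1)*B + d2)*B + ...,
    one multiplication per digit."""
    a = 0
    for d in digits:
        a = a * B + d
    return a

def int_to_digits(a, B):
    """Inverse conversion: repeatedly divide by B and collect the
    remainders as digits, most significant digit first."""
    if a == 0:
        return [0]
    digits = []
    while a > 0:
        a, d = divmod(a, B)
        digits.append(d)
    return digits[::-1]

B = 10 ** 5                      # a base of the size intended in the notes
n = 123456789012345678901234567890
assert digits_to_int(int_to_digits(n, B), B) == n
print(int_to_digits(n, B))       # [12345, 67890, 12345, 67890, 12345, 67890]
```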