Posit Arithmetic
John L. Gustafson


Copyright © 2017 John L. Gustafson

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: This copyright and permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Posit Arithmetic
John L. Gustafson
10 October 2017

1 Overview

Unums are for expressing real numbers and ranges of real numbers. There are two modes of operation, selectable by the user: posit mode and valid mode. In posit mode, a unum behaves much like a floating-point number of fixed size, rounding to the nearest expressible value if the result of a calculation is not expressible exactly; however, the posit representation offers more accuracy and a larger dynamic range than floats with the same number of bits, as well as many other advantages. We can refer to these simply as posits for short, just as we refer to IEEE 754 Standard floating-point numbers as floats. In valid mode, a unum represents a range of real numbers and can be used to rigorously bound answers much like interval arithmetic does, but with a number of improvements over traditional interval arithmetic.
Valids will only be mentioned in passing here, and described in detail in a separate document. This document focuses on the details of posit mode, which can be thought of as "beating floats at their own game." It introduces a type of unum, Type III, that combines many of the advantages of Type I and Type II unums, but is less radical and is designed as a "drop-in" replacement for an IEEE 754 Standard float, with no changes needed to source code at the application level. The following table may help clarify the terminology.

Type I (introduced March 2015). IEEE 754 compatibility: yes; perfect superset. Advantages: most bit-efficient rigorous-bound representation. Disadvantages: variable width management needed; inherits IEEE 754 disadvantages, such as redundant representations.

Type II (introduced January 2016). IEEE 754 compatibility: no; complete redesign. Advantages: maximum information per bit (can customize to a particular workload); perfect reciprocals (+ − × ÷ equally easy); extremely fast via ROM table lookup; allows decimal representations. Disadvantages: table lookup limits precision to ~20 bits or less; exact dot product is usually expensive and impractical.

Type III (introduced February 2017). IEEE 754 compatibility: similar; conversion possible. Advantages: hardware-friendly; posit form is a drop-in replacement for IEEE floats (less radical change); faster, more accurate, lower cost than floats. Disadvantages: too new to have vendor support yet; perfect reciprocals only for 2^n, 0, and ±∞.

Type III posits use a fixed number of bits, though that number is very flexible, from as few as two bits up to many thousands. They are designed for simple hardware and software implementation. They make use of the same type of low-level circuit constructs that IEEE 754 floats use (integer adds, integer multiplies, shifts, etc.), and take less chip area since they are simpler than IEEE floats in many ways. Early FPGA experiments show markedly reduced latency for posits, compared to floats of the same precision.
As with all unum types, there is an h-layer of human-readable values, a u-layer that represents values with unums, and a g-layer that performs mathematical operations exactly and temporarily, for return to the u-layer. Here, the h-layer looks very much like standard decimal input and output, though we are more careful about communicating the difference between exact and rounded values. The u-layer is posit-format numbers, which look very much like floats to a programmer. The g-layer represents temporary (scratch) values in a quire, which does exact operations that obey the distributive and associative laws of algebra and allow posits to achieve astonishingly high accuracy with fewer bits than floats. It's a lot to explain at once, but let's start with the u-layer since it is the closest cousin to floating-point arithmetic.
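To make the g-layer's promise concrete, here is a minimal sketch (not from the original document) of the guarantee a quire provides, using Python's exact rationals as a stand-in for the quire's fixed-point register: every product and sum is accumulated with no intermediate rounding, and rounding happens exactly once, at the end. The function name `quire_dot` and the sample data are illustrative choices.

```python
from fractions import Fraction

def quire_dot(xs, ys):
    """Model of a quire-style exact dot product: accumulate products
    with no intermediate rounding, round once at the very end."""
    acc = Fraction(0)
    for x, y in zip(xs, ys):
        acc += Fraction(x) * Fraction(y)   # exact product, exact sum
    return float(acc)                      # the single rounding step

# Ordinary float accumulation rounds after every step and can lose terms:
xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
naive = sum(x * y for x, y in zip(xs, ys))   # the 1.0 is absorbed: 0.0
exact = quire_dot(xs, ys)                    # 1.0
```

The float-only sum returns 0.0 because 1e16 + 1.0 rounds back to 1e16, while the exact accumulator keeps the 1.0 and returns the correct answer.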
1.1 Brief description of the format for posit mode, Type III unums

A posit for Type III unums is designed to be hardware-friendly, and looks like the following:

    sign bit:      s
    regime bits:   r r r r ⋯ r̄
    exponent bits: e1 e2 e3 ⋯ e_es   (if any)
    fraction bits: f1 f2 f3 f4 f5 f6 ⋯   (if any)
    total width:   nbits

The boundary between regions is only shown after the sign bit, because the other boundaries vary with the size of the regime. The regime is the sequence of identical bits r, terminated by the opposite bit r̄ or by the end of the posit. This way of representing integers is sometimes referred to as "unary arithmetic," but it is actually a bit more sophisticated. The most primitive way to record integers is with marks, where the number of marks is the number represented. Think of Roman numerals: 1 = I, 2 = II, 3 = III, but then the system breaks down and uses a different scheme for 4 = IV. Similarly, in Chinese and Japanese numerals, 1 = 一, 2 = 二, 3 = 三, but then the system breaks down and uses a different scheme for 4 = 四. The one place we see this kind of counting go beyond 3 is tally marks, the way countless cartoons depict prisoners tracking how many days they have been in prison. But how can tally marks record positive, zero, and negative integers? In a posit we have two kinds of "tally mark" available, 0 and 1, so we can express both signs of integer by choosing which bit is repeated: negative integers by repeating 0 bits, and zero or positive integers by repeating 1 bits.
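The run-length rule just described can be sketched in a few lines (the function name `regime_to_k` is my own, not from the document): a run of m identical 1 bits encodes k = m − 1, a run of m identical 0 bits encodes k = −m, and the run ends either at the first opposite bit or at the end of the posit.

```python
def regime_to_k(regime: str) -> int:
    """Map a regime bit string to the integer k it encodes.
    A run of m ones gives k = m - 1; a run of m zeros gives k = -m.
    The run ends at the first opposite bit or at the end of the string."""
    r0 = regime[0]
    opposite = '0' if r0 == '1' else '1'
    m = regime.index(opposite) if opposite in regime else len(regime)
    return m - 1 if r0 == '1' else -m

# A few regimes and the k values they encode:
# '10' -> 0, '110' -> 1, '1110' -> 2, '01' -> -1, '001' -> -2
```

Note that '10' and '01' are distinct: the repeated bit carries the sign of k, and the terminating bit is not part of the run.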
The number of exponent bits is es, but the number of those bits can be less than es if they are crowded off the end of the unum by the regime bits. For people used to IEEE floats, this is usually the point where confusion sets in: "Wait, exactly where are the exponent bits, and how many are there?" They do not work the same way as the exponent field of a float. They move around, and they can be clipped off. For now, think of the regime bits and exponent bits as together serving to represent an integer k, and the scaling of the number is 2^k. The fraction bits are whatever bit locations are not used by the sign, regime, and exponent bits. They work exactly like the fraction bits for normalized floats. That's the main thing that makes Type III posits hardware-friendly. Just as with normalized (binary) floats, the representation boils down to

    (−1)^sign × 2^(some integer power) × (1 plus a binary fraction).

Hardware designers will be happy to hear that there are no "subnormal" posits like there are with floats, that there is only one rounding mode, and that there are only two exception values (0 and ±∞) that do not follow the formula above. The reason for introducing regime bits is that they automatically and economically create tapered accuracy, where values with small exponents have more accuracy and very large or very small numbers have less accuracy. (IEEE floats have a crude form of tapered accuracy, called gradual underflow, for small-magnitude numbers, but they have no equivalent for large-magnitude numbers.) The idea of tapered accuracy was first proposed by Morris in 1971, and a full implementation was created in 1989, but it used a separate field to indicate the number of bits in the exponent.
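Putting the pieces together, here is a hedged sketch of a complete Type III decoder (the function name `decode_posit` and the defaults nbits = 8, es = 1 are illustrative choices, not mandated by the document). It follows the standard posit scaling, in which the regime value k and the exponent bits e combine into the single integer power k·2^es + e, negative posits are the two's complement of their positive counterparts, and exponent bits clipped off the end of the word read as zero.

```python
def decode_posit(bits: int, nbits: int = 8, es: int = 1) -> float:
    """Decode the unsigned integer `bits`, holding an nbits-wide posit
    with es exponent bits, into a Python float."""
    mask = (1 << nbits) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (nbits - 1):         # 100...0 is the single +-infinity pattern
        return float('inf')
    sign = bits >> (nbits - 1)
    if sign:
        bits = (-bits) & mask            # negative posits are two's complements
    # Bits after the sign bit, most significant first.
    rest = [(bits >> i) & 1 for i in range(nbits - 2, -1, -1)]
    # Regime: a run of m identical bits -> k = m - 1 (ones) or k = -m (zeros).
    m = 1
    while m < len(rest) and rest[m] == rest[0]:
        m += 1
    k = m - 1 if rest[0] == 1 else -m
    body = rest[m + 1:]                  # skip the bit that terminated the run
    # Exponent: up to es bits, implicitly zero if clipped off the end.
    e = 0
    for b in body[:es] + [0] * (es - len(body[:es])):
        e = (e << 1) | b
    # Fraction: everything that remains, read as a binary fraction.
    f = sum(b / 2.0 ** (i + 1) for i, b in enumerate(body[es:]))
    return (-1.0) ** sign * 2.0 ** (k * 2 ** es + e) * (1.0 + f)
```

For example, with nbits = 8 and es = 1, the pattern 01000000 decodes to 1.0, 01001000 to 1.5, and 01100000 to 4.0; the all-regime patterns 01111111 and 00000001 give the extremes 4096 and 1/4096.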