Slide Set 14 for ENCM 369 Winter 2014 Lecture Section 01

Steve Norman, PhD, PEng
Electrical & Computer Engineering
Schulich School of Engineering
University of Calgary
Winter Term, 2014

ENCM 369 W14 Section 01 Slide Set 14 slide 2/66

Contents

Introduction to Floating-Point Numbers
MIPS Formats for F-P Numbers
IEEE Floating-Point Standards
MIPS Floating-Point Registers
Coprocessor 1
Translating C F-P Code to MIPS A.L.
Quick Overview of F-P Algorithms and Hardware
Some Data and Remarks about Speed of Arithmetic

Introduction to floating-point numbers

We've finished ENCM 369 coverage of integer representations and arithmetic. We're moving on to floating-point numbers and arithmetic. (Section 5.3.2 in the textbook for concepts; 6.7.4 for a very brief introduction to MIPS floating-point registers and instructions.)

Floating-point is the generic name given to the kinds of numbers you've seen in C and C++ with types double and float.

Scientific Notation

This is a format that engineering students should be very familiar with!

Example: $6.02214179 \times 10^{23}\ \text{mol}^{-1}$
Example: $-1.60217656 \times 10^{-19}\ \text{C}$

Floating-point representation has the same structure as scientific notation, but floating-point typically uses base two, not base ten.

Introductory floating-point example

A programmer gives a value to a constant in some C code:

    const double electron_charge = -1.60217656e-19;

The C compiler will use the base ten constant in the C code to create a base two constant a computer can work with. When the program runs, the number the computer uses is

$-1.0111101001001101101000010110101110011100011110101101 \times \text{two}^{-111111}$,

which is very close to but not exactly equal to $-1.60217656 \times 10^{-19}$.

Names for parts of a non-zero floating-point number

$-1.01001100011010111010111 \times \text{two}^{00001011}$

sign: −
significand: 1.01001100011010111010111
fraction: .01001100011010111010111
exponent: 00001011

The significand includes bits from both sides of the binary point. Another name for significand is mantissa. (Note: This is not base ten, so we should not use the term decimal point!)

The fraction is the part of the significand that is to the right of the binary point. So the fraction represents some number that is ≥ 0 but < 1.

Normalized non-zero floating-point numbers

In normalized form, an f-p number must have a single 1 bit immediately to the left of the binary point, and no other 1 bits left of the binary point.

Therefore, the significand of a normalized number must be $\ge 1.0$ and must also be $< 10.0_{\text{two}}$. (In English: greater than or equal to one, strictly less than two.)

Normalized non-zero f-p numbers: examples

Which of the following are in normalized form?

A. $-1.00000000 \times \text{two}^{00000101}$
B. $+10.0000000 \times \text{two}^{00100101}$
C. $+1.10001011 \times \text{two}^{00010111}$
D. $-0.11101100 \times \text{two}^{00001100}$
E. $+101.111011 \times \text{two}^{01001100}$

Example conversion from base ten to base-two floating-point

What is $9.375_{\text{ten}}$ expressed as a normalized f-p number? What are the sign, significand, fraction, and exponent of this normalized f-p number?
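
The parts named above (sign, significand, fraction, exponent) can be pulled out of a C double with the standard library function frexp. What follows is a minimal sketch and not part of the original slides: the type fp_parts and the function fp_decompose are hypothetical names chosen for illustration, and the code assumes a finite, non-zero input.

```c
#include <math.h>

/* Hypothetical helper type: the parts of a normalized non-zero f-p number. */
struct fp_parts {
    int sign;            /* 0 for positive, 1 for negative */
    double significand;  /* >= 1.0 and < 2.0 */
    double fraction;     /* significand - 1.0, so >= 0.0 and < 1.0 */
    int exponent;        /* actual (unbiased) power of two */
};

/* Decompose a finite, non-zero x so that
   x == (sign ? -1 : +1) * significand * 2^exponent. */
struct fp_parts fp_decompose(double x)
{
    struct fp_parts p;
    int e;
    /* frexp returns m with 0.5 <= m < 1 and fabs(x) == m * 2^e. */
    double m = frexp(fabs(x), &e);

    p.sign = signbit(x) ? 1 : 0;
    p.significand = 2.0 * m;          /* move into the range [1, 2) */
    p.fraction = p.significand - 1.0; /* bits right of the binary point */
    p.exponent = e - 1;               /* compensate for the doubling */
    return p;
}
```

For example, fp_decompose(9.375) gives sign 0, exponent 3, and significand 1.171875, which is 1.001011 in base two. Doubling the value returned by frexp is exact, so the sketch introduces no rounding of its own.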

Standard organizations for bits of floating-point numbers

For computer hardware to work with f-p numbers there must be precise rules about how to encode these numbers.

The most usual overall sizes for f-p numbers are 32 bits or 64 bits, but other sizes (e.g., 16, 80, or 128 bits) are possible.

We need one bit for the sign and some number of bits for information about the exponent; the remaining bits can be used for information about the significand.

Sign information for non-zero f-p numbers

This requires a single bit. A sign bit of 0 is used for positive numbers. A sign bit of 1 is used for negative numbers.

Exponent information for non-zero f-p numbers

Exponents in f-p numbers are signed integers! F-p numbers with small magnitudes will have negative exponents.

So of course two's complement is used for exponents, right...? WRONG!

In fact, an alternate system for signed integers, called biased notation, is used for exponents in f-p numbers. (This fact explains why many introductions to two's-complement systems state that two's complement is almost always used for signed integers in modern digital hardware.)

How does biased notation work?

The biased exponent is equal to the actual exponent plus some number called a bias. The bias is chosen so that roughly half the allowable actual exponents are negative, and roughly half are positive.

Example: The bias for an 8-bit exponent is $127_{\text{ten}}$, or $0111\_1111_{\text{two}}$. If the actual exponent is $3_{\text{ten}}$, what is the biased exponent in base ten and base two?

Why is biased notation used for exponents in f-p numbers?

It turns out that biased notation helps with the design of relatively small, speedy circuits to decide whether one f-p number is less than another f-p number.
(We won't study the details of that in ENCM 369.)

Also, it's useful that the bit pattern for an actual exponent of zero is not a sequence of zero bits; a sequence of zero bits can then have a different, special meaning.

Significand information for a non-zero, normalized f-p number

Significand: 1.XXX···XXX
The bit left of the binary point: We know this bit will be a 1.
The bits right of the binary point: Any pattern of 1's and 0's is possible here.

There is no need to encode the entire significand. Instead we can record only the bits of the fraction. Leaving out the 1 bit from the left of the binary point allows more precision in the fraction.

MIPS formats for 32-bit and 64-bit f-p numbers

32-bit format:
bit 31      bits 30-23        bits 22-0
sign bit    biased exponent   fraction

64-bit format:
bit 63      bits 62-52        bits 51-0
sign bit    biased exponent   fraction

Exponent bias for 32-bit format: $127_{\text{ten}} = 0111\_1111_{\text{two}}$.
Exponent bias for 64-bit format: $1023_{\text{ten}} = 011\_1111\_1111_{\text{two}}$.

The 32-bit format is called single precision. The 64-bit format is called double precision.

We'll see later that MIPS instruction mnemonics for single-precision operations end in .s, as in mov.s, while the mnemonics for double-precision operations end in .d, as in add.d.

Example: How is $9.375_{\text{ten}}$ encoded in 32-bit and 64-bit formats?

From previous work:

$9.375 = 9 + \frac{1}{4} + \frac{1}{8} = 1001.011_{\text{two}} = 1.001011 \times \text{two}^{\text{three}}$ (normalized)

For each of the 32-bit and 64-bit formats, what are the bit patterns for the biased exponents?
What are the complete bit patterns for the f-p numbers?

More examples

How would $-9.375_{\text{ten}}$ be encoded in the 32-bit format?

How would $0.125_{\text{ten}}$ be encoded in the 32-bit format?

What base ten number does the 32-bit pattern 1_0111_1110_11_[21 zeros] represent?

How to represent zero in f-p formats

A special rule says that if all exponent and fraction bits are zero, the number being represented is 0.0.

So, what are the representations of 0.0 in 32-bit and 64-bit formats?

IEEE standards for floating-point numbers and arithmetic (1)

IEEE: Institute of Electrical and Electronics Engineers

"IEEE 754" and "IEEE floating-point" are informal names for both the original IEEE 754-1985 standard and the revised IEEE 754-2008 standard.
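
The 32-bit format described earlier (one sign bit, an 8-bit biased exponent with bias 127, and a 23-bit fraction) can be explored from C on a machine where the type float uses that format. That is an assumption: it holds on most current hardware, but the C standard does not guarantee it. The sketch below is not from the original slides, and the names SP_BIAS, sp_bits, sp_pack, sp_sign, sp_biased_exponent, and sp_fraction are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SP_BIAS 127  /* exponent bias for the 32-bit format */

/* Copy the bits of a float into a 32-bit unsigned integer.
   Assumes float is 32 bits wide and uses the format described above. */
uint32_t sp_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Build a 32-bit pattern from a sign (0 or 1), an actual exponent,
   and a 23-bit fraction field. Assumes a normalized number whose
   biased exponent fits in the 8-bit exponent field. */
uint32_t sp_pack(unsigned sign, int exponent, uint32_t fraction)
{
    uint32_t biased = (uint32_t)(exponent + SP_BIAS);
    return ((uint32_t)sign << 31) | (biased << 23) | (fraction & 0x7FFFFFu);
}

/* Extract the three fields from a 32-bit pattern. */
unsigned sp_sign(uint32_t bits)            { return bits >> 31; }
unsigned sp_biased_exponent(uint32_t bits) { return (bits >> 23) & 0xFFu; }
uint32_t sp_fraction(uint32_t bits)        { return bits & 0x7FFFFFu; }
```

Under that assumption, sp_bits(9.375f) yields 0x41160000, which matches sp_pack(0, 3, 0x160000): the biased exponent is 3 + 127 = 130, or 1000_0010 in base two, and the fraction field is 001011 followed by 17 zero bits. Using memcpy rather than a pointer cast keeps the bit inspection within defined behaviour in C.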