What Every Scientist Should Know About Floating-Point Arithmetic
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Nios II Custom Instruction User Guide
Nios II Custom Instruction User Guide Subscribe UG-20286 | 2020.04.27 Send Feedback Latest document on the web: PDF | HTML Contents Contents 1. Nios II Custom Instruction Overview..............................................................................4 1.1. Custom Instruction Implementation......................................................................... 4 1.1.1. Custom Instruction Hardware Implementation............................................... 5 1.1.2. Custom Instruction Software Implementation................................................ 6 2. Custom Instruction Hardware Interface......................................................................... 7 2.1. Custom Instruction Types....................................................................................... 7 2.1.1. Combinational Custom Instructions.............................................................. 8 2.1.2. Multicycle Custom Instructions...................................................................10 2.1.3. Extended Custom Instructions................................................................... 11 2.1.4. Internal Register File Custom Instructions................................................... 13 2.1.5. External Interface Custom Instructions....................................................... 15 3. Custom Instruction Software Interface.........................................................................16 3.1. Custom Instruction Software Examples................................................................... 16 -
Bit Nove Signal Interface Processor
www.audison.eu bit Nove Signal Interface Processor POWER SUPPLY CONNECTION Voltage 10.8 ÷ 15 VDC From / To Personal Computer 1 x Micro USB Operating power supply voltage 7.5 ÷ 14.4 VDC To Audison DRC AB / DRC MP 1 x AC Link Idling current 0.53 A Optical 2 sel Optical In 2 wire control +12 V enable Switched off without DRC 1 mA Mem D sel Memory D wire control GND enable Switched off with DRC 4.5 mA CROSSOVER Remote IN voltage 4 ÷ 15 VDC (1mA) Filter type Full / Hi pass / Low Pass / Band Pass Remote OUT voltage 10 ÷ 15 VDC (130 mA) Linkwitz @ 12/24 dB - Butterworth @ Filter mode and slope ART (Automatic Remote Turn ON) 2 ÷ 7 VDC 6/12/18/24 dB Crossover Frequency 68 steps @ 20 ÷ 20k Hz SIGNAL STAGE Phase control 0° / 180° Distortion - THD @ 1 kHz, 1 VRMS Output 0.005% EQUALIZER (20 ÷ 20K Hz) Bandwidth @ -3 dB 10 Hz ÷ 22 kHz S/N ratio @ A weighted Analog Input Equalizer Automatic De-Equalization N.9 Parametrics Equalizers: ±12 dB;10 pole; Master Input 102 dBA Output Equalizer 20 ÷ 20k Hz AUX Input 101.5 dBA OPTICAL IN1 / IN2 Inputs 110 dBA TIME ALIGNMENT Channel Separation @ 1 kHz 85 dBA Distance 0 ÷ 510 cm / 0 ÷ 200.8 inches Input sensitivity Pre Master 1.2 ÷ 8 VRMS Delay 0 ÷ 15 ms Input sensitivity Speaker Master 3 ÷ 20 VRMS Step 0,08 ms; 2,8 cm / 1.1 inch Input sensitivity AUX Master 0.3 ÷ 5 VRMS Fine SET 0,02 ms; 0,7 cm / 0.27 inch Input impedance Pre In / Speaker In / AUX 15 kΩ / 12 Ω / 15 kΩ GENERAL REQUIREMENTS Max Output Level (RMS) @ 0.1% THD 4 V PC connections USB 1.1 / 2.0 / 3.0 Compatible Microsoft Windows (32/64 bit): Vista, INPUT STAGE Software/PC requirements Windows 7, Windows 8, Windows 10 Low level (Pre) Ch1 ÷ Ch6; AUX L/R Video Resolution with screen resize min. -
Scilab Textbook Companion for Numerical Methods for Engineers by S
Scilab Textbook Companion for Numerical Methods For Engineers by S. C. Chapra And R. P. Canale1 Created by Meghana Sundaresan Numerical Methods Chemical Engineering VNIT College Teacher Kailas L. Wasewar Cross-Checked by July 31, 2019 1Funded by a grant from the National Mission on Education through ICT, http://spoken-tutorial.org/NMEICT-Intro. This Textbook Companion and Scilab codes written in it can be downloaded from the "Textbook Companion Project" section at the website http://scilab.in Book Description Title: Numerical Methods For Engineers Author: S. C. Chapra And R. P. Canale Publisher: McGraw Hill, New York Edition: 5 Year: 2006 ISBN: 0071244298 1 Scilab numbering policy used in this document and the relation to the above book. Exa Example (Solved example) Eqn Equation (Particular equation of the above book) AP Appendix to Example(Scilab Code that is an Appednix to a particular Example of the above book) For example, Exa 3.51 means solved example 3.51 of this book. Sec 2.3 means a scilab code whose theory is explained in Section 2.3 of the book. 2 Contents List of Scilab Codes4 1 Mathematical Modelling and Engineering Problem Solving6 2 Programming and Software8 3 Approximations and Round off Errors 10 4 Truncation Errors and the Taylor Series 15 5 Bracketing Methods 25 6 Open Methods 34 7 Roots of Polynomials 44 9 Gauss Elimination 51 10 LU Decomposition and matrix inverse 61 11 Special Matrices and gauss seidel 65 13 One dimensional unconstrained optimization 69 14 Multidimensional Unconstrainted Optimization 72 3 15 Constrained -
The MPFR Team [email protected] This Manual Documents How to Install and Use the Multiple Precision Floating-Point Reliable Library, Version 2.2.0
MPFR The Multiple Precision Floating-Point Reliable Library Edition 2.2.0 September 2005 The MPFR team [email protected] This manual documents how to install and use the Multiple Precision Floating-Point Reliable Library, version 2.2.0. Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with the Front-Cover Texts being “A GNU Manual”, and with the Back-Cover Texts being “You have freedom to copy and modify this GNU Manual, like GNU software”. A copy of the license is included in Appendix A [GNU Free Documentation License], page 30. i Table of Contents MPFR Copying Conditions ................................ 1 1 Introduction to MPFR ................................. 2 1.1 How to use this Manual ........................................................ 2 2 Installing MPFR ....................................... 3 2.1 How to install ................................................................. 3 2.2 Other make targets ............................................................ 3 2.3 Known Build Problems ........................................................ 3 2.4 Getting the Latest Version of MPFR ............................................ 4 3 Reporting Bugs ........................................ 5 4 MPFR Basics ......................................... -
Extended Precision Floating Point Arithmetic
Extended Precision Floating Point Numbers for Ill-Conditioned Problems Daniel Davis, Advisor: Dr. Scott Sarra Department of Mathematics, Marshall University Floating Point Number Systems Precision Extended Precision Patriot Missile Failure Floating point representation is based on scientific An increasing number of problems exist for which Patriot missile defense modules have been used by • The precision, p, of a floating point number notation, where a nonzero real decimal number, x, IEEE double is insufficient. These include modeling the U.S. Army since the mid-1960s. On February 21, system is the number of bits in the significand. is expressed as x = ±S × 10E, where 1 ≤ S < 10. of dynamical systems such as our solar system or the 1991, a Patriot protecting an Army barracks in Dha- This means that any normalized floating point The values of S and E are known as the significand • climate, and numerical cryptography. Several arbi- ran, Afghanistan failed to intercept a SCUD missile, number with precision p can be written as: and exponent, respectively. When discussing float- trary precision libraries have been developed, but leading to the death of 28 Americans. This failure E ing points, we are interested in the computer repre- x = ±(1.b1b2...bp−2bp−1)2 × 2 are too slow to be practical for many complex ap- was caused by floating point rounding error. The sentation of numbers, so we must consider base 2, or • The smallest x such that x > 1 is then: plications. David Bailey’s QD library may be used system measured time in tenths of seconds, using binary, rather than base 10. -
Computer Architectures
Computer Architectures Motorola 68000, 683xx a ColdFire – CISC CPU Principles Demonstrated Czech Technical University in Prague, Faculty of Electrical Engineering AE0B36APO Computer Architectures Ver.1.10 1 Original Desktop/Workstation 680X0 Feature 68000 'EC000 68010 68020 68030 68040 68060 Data bus 16 8/16 16 8/16/32 8/16/32 32 32 Addr bus 23 23 23 32 32 32 32 Misaligned Addr - - - Yes Yes Yes Yes Virtual memory - - Yes Yes Yes Yes Yes Instruct Cache - - 3 256 256 4096 8192 Data Cache - - - - 256 4096 8192 Memory manager 68451 or 68851 68851 Yes Yes Yes ATC entries - - - - 22 64/64 64/64 FPU interface - - - 68881 or 68882 Internal FPU built-in FPU - - - - - Yes Yes Burst Memory - - - - Yes Yes Yes Bus Cycle type asynchronous both synchronous Data Bus Sizing - - - Yes Yes use 68150 Power (watts) 1.2 0.13-0.26 0.13 1.75 2.6 4-6 3.9-4.9 at frequency of 8.0 8-16 8 16-25 16-50 25-40 50-66 MIPS/kDhryst. 1.2/2.1 2.5/4.3 6.5/11 14/23 35/60 100/300 Transistors 68k 84k 190k 273k 1,170k 2,500k Introduction 1979 1982 1984 1987 1991 1994 AE0B36APO Computer Architectures 2 M68xxx/CPU32/ColdFire – Basic Registers Set 31 16 15 8 7 0 User programming D0 D1 model registers D2 D3 DATA REGISTERS D4 D5 D6 D7 16 15 0 A0 A1 A2 A3 ADDRESS REGISTERS A4 A5 A6 16 15 0 A7 (USP) USER STACK POINTER 0 PC PROGRAM COUNTER 15 8 7 0 0 CCR CONDITION CODE REGISTER 31 16 15 0 A7# (SSP) SUPERVISOR STACK Supervisor/system POINTER 15 8 7 0 programing model (CCR) SR STATUS REGISTER 31 0 basic registers VBR VECTOR BASE REGISTER 31 3 2 0 SFC ALTERNATE FUNCTION DFC CODE REGISTERS AE0B36APO Computer Architectures 3 Status Register – Conditional Code Part USER BYTE SYSTEM BYTE (CONDITION CODE REGISTER) 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 T1 T0 S 0 0 I2 I1 I0 0 0 0 X N Z V C TRACE INTERRUPT EXTEND ENABLE PRIORITY MASK NEGATIVE SUPERVISOR/USER ZERO STATE OVERFLOW CARRY ● N – negative .. -
RTEMS CPU Supplement Documentation Release 4.11.3 ©Copyright 2016, RTEMS Project (Built 15Th February 2018)
RTEMS CPU Supplement Documentation Release 4.11.3 ©Copyright 2016, RTEMS Project (built 15th February 2018) CONTENTS I RTEMS CPU Architecture Supplement1 1 Preface 5 2 Port Specific Information7 2.1 CPU Model Dependent Features...........................8 2.1.1 CPU Model Name...............................8 2.1.2 Floating Point Unit..............................8 2.2 Multilibs........................................9 2.3 Calling Conventions.................................. 10 2.3.1 Calling Mechanism.............................. 10 2.3.2 Register Usage................................. 10 2.3.3 Parameter Passing............................... 10 2.3.4 User-Provided Routines............................ 10 2.4 Memory Model..................................... 11 2.4.1 Flat Memory Model.............................. 11 2.5 Interrupt Processing.................................. 12 2.5.1 Vectoring of an Interrupt Handler...................... 12 2.5.2 Interrupt Levels................................ 12 2.5.3 Disabling of Interrupts by RTEMS...................... 12 2.6 Default Fatal Error Processing............................. 14 2.7 Symmetric Multiprocessing.............................. 15 2.8 Thread-Local Storage................................. 16 2.9 CPU counter...................................... 17 2.10 Interrupt Profiling................................... 18 2.11 Board Support Packages................................ 19 2.11.1 System Reset................................. 19 3 ARM Specific Information 21 3.1 CPU Model Dependent Features.......................... -
IEEE Std 754-1985 Revision of Reaffirmed1990 IEEE Standard for Binary Floating-Point Arithmetic
Recognized as an American National Standard (ANSI) IEEE Std 754-1985 An American National Standard IEEE Standard for Binary Floating-Point Arithmetic Sponsor Standards Committee of the IEEE Computer Society Approved March 21, 1985 Reaffirmed December 6, 1990 IEEE Standards Board Approved July 26, 1985 Reaffirmed May 21, 1991 American National Standards Institute © Copyright 1985 by The Institute of Electrical and Electronics Engineers, Inc 345 East 47th Street, New York, NY 10017, USA No part of this publication may be reproduced in any form, in an electronic retrieval system or otherwise, without the prior written permission of the publisher. IEEE Standards documents are developed within the Technical Committees of the IEEE Societies and the Standards Coordinating Committees of the IEEE Standards Board. Members of the committees serve voluntarily and without compensation. They are not necessarily members of the Institute. The standards developed within IEEE represent a consensus of the broad expertise on the subject within the Institute as well as those activities outside of IEEE which have expressed an interest in participating in the development of the standard. Use of an IEEE Standard is wholly voluntary. The existence of an IEEE Standard does not imply that there are no other ways to produce, test, measure, purchase, market, or provide other goods and services related to the scope of the IEEE Standard. Furthermore, the viewpoint expressed at the time a standard is approved and issued is subject to change brought about through developments in the state of the art and comments received from users of the standard. Every IEEE Standard is subjected to review at least once every five years for revision or reaffirmation. -
Signedness-Agnostic Program Analysis: Precise Integer Bounds for Low-Level Code
Signedness-Agnostic Program Analysis: Precise Integer Bounds for Low-Level Code Jorge A. Navas, Peter Schachte, Harald Søndergaard, and Peter J. Stuckey Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia Abstract. Many compilers target common back-ends, thereby avoid- ing the need to implement the same analyses for many different source languages. This has led to interest in static analysis of LLVM code. In LLVM (and similar languages) most signedness information associated with variables has been compiled away. Current analyses of LLVM code tend to assume that either all values are signed or all are unsigned (except where the code specifies the signedness). We show how program analysis can simultaneously consider each bit-string to be both signed and un- signed, thus improving precision, and we implement the idea for the spe- cific case of integer bounds analysis. Experimental evaluation shows that this provides higher precision at little extra cost. Our approach turns out to be beneficial even when all signedness information is available, such as when analysing C or Java code. 1 Introduction The “Low Level Virtual Machine” LLVM is rapidly gaining popularity as a target for compilers for a range of programming languages. As a result, the literature on static analysis of LLVM code is growing (for example, see [2, 7, 9, 11, 12]). LLVM IR (Intermediate Representation) carefully specifies the bit- width of all integer values, but in most cases does not specify whether values are signed or unsigned. This is because, for most operations, two’s complement arithmetic (treating the inputs as signed numbers) produces the same bit-vectors as unsigned arithmetic. -
Introduction & Floating Point NMMC §1.4.1, Atkinson §1.2 1 Floating
Introduction & Floating Point NMMC x1.4.1, Atkinson x1.2 1 Floating-point representation, IEEE 754. In binary, 11:01 = 1 × 21 + 1 × 20 + 0 × 2−1 + 1 × 2−2: Double-precision floating-point format has 64 bits: ±; a1; : : : ; a11; b1; : : : ; b52 The associated number is a1···a11−1023 # = ±1:b1 ··· b52 × 2 : If all the a are zero, then the number is `subnormal' and the representation is −1022 # = ±0:b1 ··· b52 × 2 If the a are all 1 and the b are all 0 then it's ±∞ (± 1/0). If the a are all 1 and the b are not all 0 then it's NaN (±0=0). Machine-epsilon is the difference between 1 and the smallest representable number larger than 1, i.e. = 2−52 ≈ 2 × 10−16. The smallest and largest positive numbers are ≈ 10±308. Numbers larger than the max are set to 1. 2 Rounding & Arithmetic. Round-toward-nearest rounds a number towards the nearest floating-point num- ber. For nonzero numbers x within the range of max/min, the relative errors due to rounding are jround(x) − xj ≤ 2−53 ≈ 10−16 = u jxj which is half of machine epsilon. Floating-point arithmetic: add, subtract, multiply, divide. IEEE 754: If you start with 2 exactly- represented floating-point numbers x and y and compute any arithmetic operation `op', the magnitude of the relative error in the result is less than half machine precision. I.e. there is some δ with jδj ≤u such that fl(x op y) = (x op y)(1 + δ): The exception is underflow/overflow, where the result is either larger or smaller than can be represented. -
Metaclasses: Generative C++
Metaclasses: Generative C++ Document Number: P0707 R3 Date: 2018-02-11 Reply-to: Herb Sutter ([email protected]) Audience: SG7, EWG Contents 1 Overview .............................................................................................................................................................2 2 Language: Metaclasses .......................................................................................................................................7 3 Library: Example metaclasses .......................................................................................................................... 18 4 Applying metaclasses: Qt moc and C++/WinRT .............................................................................................. 35 5 Alternatives for sourcedefinition transform syntax .................................................................................... 41 6 Alternatives for applying the transform .......................................................................................................... 43 7 FAQs ................................................................................................................................................................. 46 8 Revision history ............................................................................................................................................... 51 Major changes in R3: Switched to function-style declaration syntax per SG7 direction in Albuquerque (old: $class M new: constexpr void M(meta::type target, -
1.2 Round-Off Errors and Computer Arithmetic
1.2 Round-off Errors and Computer Arithmetic 1 • In a computer model, a memory storage unit – word is used to store a number. • A word has only a finite number of bits. • These facts imply: 1. Only a small set of real numbers (rational numbers) can be accurately represented on computers. 2. (Rounding) errors are inevitable when computer memory is used to represent real, infinite precision numbers. 3. Small rounding errors can be amplified with careless treatment. So, do not be surprised that (9.4)10= (1001.0110)2 can not be represented exactly on computers. • Round-off error: error that is produced when a computer is used to perform real number calculations. 2 Binary numbers and decimal numbers • Binary number system: A method of representing numbers that has 2 as its base and uses only the digits 0 and 1. Each successive digit represents a power of 2. (… . … ) where 0 1, for each = 2,1,0, 1, 2 … 3210 −1−2−3 2 • Binary≤ to decimal:≤ ⋯ − − (… . … ) (… 2 + 2 + 2 + 2 = 3210 −1−2−3 2 + 2 3 + 22 + 1 2 …0) 3 2 1 0 −1 −2 −3 −1 −2 −3 10 3 Binary machine numbers • IEEE (Institute for Electrical and Electronic Engineers) – Standards for binary and decimal floating point numbers • For example, “double” type in the “C” programming language uses a 64-bit (binary digit) representation – 1 sign bit (s), – 11 exponent bits – characteristic (c), – 52 binary fraction bits – mantissa (f) 1. 0 2 1 = 2047 11 4 ≤ ≤ − This 64-bit binary number gives a decimal floating-point number (Normalized IEEE floating point number): 1 2 (1 + ) where 1023 is called exponent − 1023bias.