Floating Point Case Study
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Fujitsu SPARC64™ X+/X Software on Chip Overview for Developers]
White paper [Fujitsu SPARC64™ X+/X Software on Chip Overview for Developers] White paper Fujitsu SPARC64™ X+/X Software on Chip Overview for Developers Page 1 of 13 www.fujitsu.com/sparc White paper [Fujitsu SPARC64™ X+/X Software on Chip Overview for Developers] Table of Contents Table of Contents 1 Software on Chip Innovative Technology 3 2 SPARC64™ X+/X SIMD Vector Processing 4 2.1 How Oracle Databases take advantage of SPARC64™ X+/X SIMD vector processing 4 2.2 How to use of SPARC64™ X+/X SIMD instructions in user applications 5 2.3 How to check if the system and operating system is capable of SIMD execution? 6 2.4 How to check if SPARC64™ X+/X SIMD instructions indeed have been generated upon compilation of a user source code? 6 2.5 Is SPARC64™ X+/X SIMD implementation compatible with Oracle’s SPARC SIMD? 6 3 SPARC64™ X+/X Decimal Floating Point Processing 8 3.1 How Oracle Databases take advantage of SPARC64™ X+/X Decimal Floating-Point 8 3.2 Decimal Floating-Point processing in user applications 8 4 SPARC64™ X+/X Extended Floating-Point Registers 9 4.1 How Oracle Databases take advantage of SPARC64™ X+/X Decimal Floating-Point 9 5 SPARC64™ X+/X On-Chip Cryptographic Processing Capabilities 10 5.1 How to use the On-Chip Cryptographic Processing Capabilities 10 5.2 How to use the On-Chip Cryptographic Processing Capabilities in user applications 10 6 Conclusions 12 Page 2 of 13 www.fujitsu.com/sparc White paper [Fujitsu SPARC64™ X+/X Software on Chip Overview for Developers] Software on Chip Innovative Technology 1 Software on Chip Innovative Technology Fujitsu brings together innovations in supercomputing, business computing, and mainframe computing in the Fujitsu M10 enterprise server family to help organizations meet their business challenges. -
Intel® IA-64 Architecture Software Developer's Manual
Intel® IA-64 Architecture Software Developer’s Manual Volume 1: IA-64 Application Architecture Revision 1.1 July 2000 Document Number: 245317-002 THIS DOCUMENT IS PROVIDED “AS IS” WITH NO WARRANTIES WHATSOEVER, INCLUDING ANY WARRANTY OF MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR ANY PARTICULAR PURPOSE, OR ANY WARRANTY OTHERWISE ARISING OUT OF ANY PROPOSAL, SPECIFICATION OR SAMPLE. Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. Intel® IA-64 processors may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800- 548-4725, or by visiting Intel’s website at http://developer.intel.com/design/litcentr. -
Decimal Layouts for IEEE 754 Strawman3
IEEE 754-2008 ECMA TC39/TG1 – 25 September 2008 Mike Cowlishaw IBM Fellow Overview • Summary of the new standard • Binary and decimal specifics • Support in hardware, standards, etc. • Questions? 2 Copyright © IBM Corporation 2008. All rights reserved. IEEE 754 revision • Started in December 2000 – 7.7 years – 5.8 years in committee (92 participants + e-mail) – 1.9 years in ballot (101 voters, 8 ballots, 1200+ comments) • Removes many ambiguities from 754-1985 • Incorporates IEEE 854 (radix-independent) • Recommends or requires more operations (functions) and more language support 3 Formats • Separates sets of floating-point numbers (and the arithmetic on them) from their encodings (‘interchange formats’) • Includes the 754-1985 basic formats: – binary32, 24 bits (‘single’) – binary64, 53 bits (‘double’) • Adds three new basic formats: – binary128, 113 bits (‘quad’) – decimal64, 16-digit (‘double’) – decimal128, 34-digit (‘quad’) 4 Why decimal? A web page… • Parking at Manchester airport… • £4.20 per day … … for 10 days … … calculated on-page using ECMAScript Answer shown: 5 Why decimal? A web page… • Parking at Manchester airport… • £4.20 per day … … for 10 days … … calculated on-page using ECMAScript Answer shown: £41.99 (Programmer must have truncated to two places.) 6 Where it costs real money… • Add 5% sales tax to a $ 0.70 telephone call, rounded to the nearest cent • 1.05 x 0.70 using binary double is exactly 0.734999999999999986677323704 49812151491641998291015625 (should have been 0.735) • rounds to $ 0.73, instead of $ 0.74 -
Chapter 2 DIRECT METHODS—PART II
Chapter 2 DIRECT METHODS—PART II If we use finite precision arithmetic, the results obtained by the use of direct methods are contaminated with roundoff error, which is not always negligible. 2.1 Finite Precision Computation 2.2 Residual vs. Error 2.3 Pivoting 2.4 Scaling 2.5 Iterative Improvement 2.1 Finite Precision Computation 2.1.1 Floating-point numbers decimal floating-point numbers The essence of floating-point numbers is best illustrated by an example, such as that of a 3-digit floating-point calculator which accepts numbers like the following: 123. 50.4 −0.62 −0.02 7.00 Any such number can be expressed as e ±d1.d2d3 × 10 where e ∈{0, 1, 2}. (2.1) Here 40 t := precision = 3, [L : U] := exponent range = [0 : 2]. The exponent range is rather limited. If the calculator display accommodates scientific notation, e g., 3.46 3 −1.56 −3 then we might use [L : U]=[−9 : 9]. Some numbers have multiple representations in form (2.1), e.g., 2.00 × 101 =0.20 × 102. Hence, there is a normalization: • choose smallest possible exponent, • choose + sign for zero, e.g., 0.52 × 102 → 5.20 × 101, 0.08 × 10−8 → 0.80 × 10−9, −0.00 × 100 → 0.00 × 10−9. −9 Nonzero numbers of form ±0.d2d3 × 10 are denormalized. But for large-scale scientific computation base 2 is preferred. binary floating-point numbers This is an important matter because numbers like 0.2 do not have finite representations in base 2: 0.2=(0.001100110011 ···)2. -
Decimal Floating Point for Future Processors Hossam A
1 Decimal Floating Point for future processors Hossam A. H. Fahmy Electronics and Communications Department, Cairo University, Egypt Email: [email protected] Tarek ElDeeb, Mahmoud Yousef Hassan, Yasmin Farouk, Ramy Raafat Eissa SilMinds, LLC Email: [email protected] F Abstract—Many new designs for Decimal Floating Point (DFP) hard- Simple decimal fractions such as 1=10 which might ware units have been proposed in the last few years. To date, only represent a tax amount or a sales discount yield an the IBM POWER6 and POWER7 processors include internal units for infinitely recurring number if converted to a binary decimal floating point processing. We have designed and tested several representation. Hence, a binary number system with a DFP units including an adder, multiplier, divider, square root, and fused- multiply-add compliant with the IEEE 754-2008 standard. This paper finite number of bits cannot accurately represent such presents the results of using our units as part of a vector co-processor fractions. When an approximated representation is used and the anticipated gains once the units are moved on chip with the in a series of computations, the final result may devi- main processor. ate from the correct result expected by a human and required by the law [3], [4]. One study [5] shows that in a large billing application such an error may be up to 1 WHY DECIMAL HARDWARE? $5 million per year. Ten is the natural number base or radix for humans Banking, billing, and other financial applications use resulting in a decimal number system while a binary decimal extensively. -
The Hexadecimal Number System and Memory Addressing
C5537_App C_1107_03/16/2005 APPENDIX C The Hexadecimal Number System and Memory Addressing nderstanding the number system and the coding system that computers use to U store data and communicate with each other is fundamental to understanding how computers work. Early attempts to invent an electronic computing device met with disappointing results as long as inventors tried to use the decimal number sys- tem, with the digits 0–9. Then John Atanasoff proposed using a coding system that expressed everything in terms of different sequences of only two numerals: one repre- sented by the presence of a charge and one represented by the absence of a charge. The numbering system that can be supported by the expression of only two numerals is called base 2, or binary; it was invented by Ada Lovelace many years before, using the numerals 0 and 1. Under Atanasoff’s design, all numbers and other characters would be converted to this binary number system, and all storage, comparisons, and arithmetic would be done using it. Even today, this is one of the basic principles of computers. Every character or number entered into a computer is first converted into a series of 0s and 1s. Many coding schemes and techniques have been invented to manipulate these 0s and 1s, called bits for binary digits. The most widespread binary coding scheme for microcomputers, which is recog- nized as the microcomputer standard, is called ASCII (American Standard Code for Information Interchange). (Appendix B lists the binary code for the basic 127- character set.) In ASCII, each character is assigned an 8-bit code called a byte. -
Midterm-2020-Solution.Pdf
HONOR CODE Questions Sheet. A Lets C. [6 Points] 1. What type of address (heap,stack,static,code) does each value evaluate to Book1, Book1->name, Book1->author, &Book2? [4] 2. Will all of the print statements execute as expected? If NO, write print statement which will not execute as expected?[2] B. Mystery [8 Points] 3. When the above code executes, which line is modified? How many times? [2] 4. What is the value of register a6 at the end ? [2] 5. What is the value of register a4 at the end ? [2] 6. In one sentence what is this program calculating ? [2] C. C-to-RISC V Tree Search; Fill in the blanks below [12 points] D. RISCV - The MOD operation [8 points] 19. The data segment starts at address 0x10000000. What are the memory locations modified by this program and what are their values ? E Floating Point [8 points.] 20. What is the smallest nonzero positive value that can be represented? Write your answer as a numerical expression in the answer packet? [2] 21. Consider some positive normalized floating point number where p is represented as: What is the distance (i.e. the difference) between p and the next-largest number after p that can be represented? [2] 22. Now instead let p be a positive denormalized number described asp = 2y x 0.significand. What is the distance between p and the next largest number after p that can be represented? [2] 23. Sort the following minifloat numbers. [2] F. Numbers. [5] 24. What is the smallest number that this system can represent 6 digits (assume unsigned) ? [1] 25. -
Appendix D an Alternative to RISC: the Intel 80X86
D.1 Introduction D-2 D.2 80x86 Registers and Data Addressing Modes D-3 D.3 80x86 Integer Operations D-6 D.4 80x86 Floating-Point Operations D-10 D.5 80x86 Instruction Encoding D-12 D.6 Putting It All Together: Measurements of Instruction Set Usage D-14 D.7 Concluding Remarks D-20 D.8 Historical Perspective and References D-21 D An Alternative to RISC: The Intel 80x86 The x86 isn’t all that complex—it just doesn’t make a lot of sense. Mike Johnson Leader of 80x86 Design at AMD, Microprocessor Report (1994) © 2003 Elsevier Science (USA). All rights reserved. D-2 I Appendix D An Alternative to RISC: The Intel 80x86 D.1 Introduction MIPS was the vision of a single architect. The pieces of this architecture fit nicely together and the whole architecture can be described succinctly. Such is not the case of the 80x86: It is the product of several independent groups who evolved the architecture over 20 years, adding new features to the original instruction set as you might add clothing to a packed bag. Here are important 80x86 milestones: I 1978—The Intel 8086 architecture was announced as an assembly language– compatible extension of the then-successful Intel 8080, an 8-bit microproces- sor. The 8086 is a 16-bit architecture, with all internal registers 16 bits wide. Whereas the 8080 was a straightforward accumulator machine, the 8086 extended the architecture with additional registers. Because nearly every reg- ister has a dedicated use, the 8086 falls somewhere between an accumulator machine and a general-purpose register machine, and can fairly be called an extended accumulator machine. -
2018-19 MAP 160 Byte File Layout Specifications
2018-19 MAP 160 Byte File Layout Specifications OVERVIEW: A) ISAC will provide an Eligibility Status File (ESF) record for each student to all schools listed as a college choice on the student’s Student Aid Report (SAR). The ESF records will be available daily as Record Type = 7. ESF records may be retrieved via the File Extraction option in MAP. B) Schools will transmit Payment Requests to ISAC via File Transfer Protocol (FTP) using the MAP 160byte layout and identify these with Record Type = 4. C) When payment requests are processed, ISAC will provide payment results to schools through the MAP system. The payment results records can be retrieved in the 160 byte format using the MAP Payment Results File Extraction Option. MAP results records have a Record Type = 5. The MAP Payment Results file contains some eligibility status data elements. Also, the same student record may appear on both the Payment Results and the Eligibility Status extract files. Schools may also use the Reports option in MAP to obtain payment results. D) To cancel Payment Requests, the school with the current Payment Results record on ISAC's Payment Database must transmit a matching record with MAP Payment Request Code = C, with the Requested Award Amount field equal to zero and the Enrollment Hours field equal to 0 along with other required data elements. These records must be transmitted to ISAC as Record Type = 4. E) Summary of Data Element Changes, revision (highlighted in grey) made to the 2018-19 layout. NONE 1 2018-19 MAP 160 Byte File Layout Specifications – 9/17 2018-19 MAP 160 Byte File Layout Specifications F) The following 160 byte record layout will be used for transmitting data between schools and ISAC. -
United States Patent [19] [11] E Patent Number: Re
United States Patent [19] [11] E Patent Number: Re. 33,629 Palmer et a1. [45] Reissued Date of Patent: Jul. 2, 1991 [54] NUMERIC DATA PROCESSOR 1973, IEEE Transactions on Computers vol. C-22, pp. [75] Inventors: John F. Palmer, Cambridge, Mass; 577-586. Bruce W. Ravenel, Nederland, Co1o.; Bulman, D. M. "Stack Computers: An Introduction," Ra? Nave, Haifa, Israel May 1977, Computer pp. 18-28. Siewiorek, Daniel H; Bell, C. Gordon; Newell, Allen, [73] Assignee: Intel Corporation, Santa Clara, Calif. “Computer Structures: Principles and Examples," 1977, [21] Appl. No.: 461,538 Chapter 29, pp. 470-485 McGraw-Hill Book Co. Palmer, John F., “The Intel Standard for Floatin [22] Filed: Jun. 1, 1990 g-Point Arithmetic,“ Nov. 8-11, 1977, IEEE COMP SAC 77 Proceedings, 107-112. Related US. Patent Documents Coonen, J. T., "Speci?cations for a Proposed Standard Reissue of: t for Floating-Point Arithmetic," Oct. 13, 1978, Mem. [64] Patent No.: 4,338,675 #USB/ERL M78172, pp. 1-32. Issued: Jul. 6, 1982 Pittman, T. and Stewart, R. G., “Microprocessor Stan Appl. No: 120.995 dards,” 1978, AFIPS Conference Proceedings, vol. 47, Filed: Feb. 13, 1980 pp. 935-938. “7094-11 System Support For Numerical Analysis,“ by [51] Int. Cl.-‘ ........................ .. G06F 7/48; G06F 9/00; William Kahan, Dept. of Computer Science, Univ. of G06F 11/00 Toronto, Aug. 1966, pp. 1-51. [52] US. Cl. .................................. .. 364/748; 364/737; “A Uni?ed Decimal Floating-Point Architecture For 364/745; 364/258 The Support of High-Level Languages,‘ by Frederic [58] Field of Search ............. .. 364/748, 745, 737, 736, N. -
Faster Math Functions, Soundly
Faster Math Functions, Soundly IAN BRIGGS, University of Utah, USA PAVEL PANCHEKHA, University of Utah, USA Standard library implementations of functions like sin and exp optimize for accuracy, not speed, because they are intended for general-purpose use. But applications tolerate inaccuracy from cancellation, rounding error, and singularities—sometimes even very high error—and many application could tolerate error in function implementations as well. This raises an intriguing possibility: speeding up numerical code by tuning standard function implementations. This paper thus introduces OpTuner, an automatic method for selecting the best implementation of mathematical functions at each use site. OpTuner assembles dozens of implementations for the standard mathematical functions from across the speed-accuracy spectrum. OpTuner then uses error Taylor series and integer linear programming to compute optimal assignments of function implementation to use site and presents the user with a speed-accuracy Pareto curve they can use to speed up their code. In a case study on the POV-Ray ray tracer, OpTuner speeds up a critical computation, leading to a whole program speedup of 9% with no change in the program output (whereas human efforts result in slower code and lower-quality output). On a broader study of 37 standard benchmarks, OpTuner matches 216 implementations to 89 use sites and demonstrates speed-ups of 107% for negligible decreases in accuracy and of up to 438% for error-tolerant applications. Additional Key Words and Phrases: Floating point, rounding error, performance, approximation theory, synthesis, optimization 1 INTRODUCTION Floating-point arithmetic is foundational for scientific, engineering, and mathematical software. This is because, while floating-point arithmetic is approximate, most applications tolerate minute errors [Piparo et al. -
Floating Point Representation (Unsigned) Fixed-Point Representation
Floating point representation (Unsigned) Fixed-point representation The numbers are stored with a fixed number of bits for the integer part and a fixed number of bits for the fractional part. Suppose we have 8 bits to store a real number, where 5 bits store the integer part and 3 bits store the fractional part: 1 0 1 1 1.0 1 1 $ 2& 2% 2$ 2# 2" 2'# 2'$ 2'% Smallest number: Largest number: (Unsigned) Fixed-point representation Suppose we have 64 bits to store a real number, where 32 bits store the integer part and 32 bits store the fractional part: "# "% + /+ !"# … !%!#!&. (#(%(" … ("% % = * !+ 2 + * (+ 2 +,& +,# "# "& & /# % /"% = !"#× 2 +!"&× 2 + ⋯ + !&× 2 +(#× 2 +(%× 2 + ⋯ + ("%× 2 0 ∞ (Unsigned) Fixed-point representation Range: difference between the largest and smallest numbers possible. More bits for the integer part ⟶ increase range Precision: smallest possible difference between any two numbers More bits for the fractional part ⟶ increase precision "#"$"%. '$'#'( # OR "$"%. '$'#'(') # Wherever we put the binary point, there is a trade-off between the amount of range and precision. It can be hard to decide how much you need of each! Scientific Notation In scientific notation, a number can be expressed in the form ! = ± $ × 10( where $ is a coefficient in the range 1 ≤ $ < 10 and + is the exponent. 1165.7 = 0.0004728 = Floating-point numbers A floating-point number can represent numbers of different order of magnitude (very large and very small) with the same number of fixed bits. In general, in the binary system, a floating number can be expressed as ! = ± $ × 2' $ is the significand, normally a fractional value in the range [1.0,2.0) .