Analysis and Evaluation of Binary Cascade Iterative Refinement And

Total Page:16

File Type:pdf, Size:1020Kb

Analysis and Evaluation of Binary Cascade Iterative Refinement And MASTERARBEIT Titel der Masterarbeit Analysis and Evaluation of Binary Cascade Iterative ” Refinement and Comparison to other Iterative Refinement Algorithms for Solving Linear Systems“ Verfasser Karl E. Prikopa, BSc angestrebter akademischer Grad Diplom-Ingenieur (Dipl.-Ing.) Wien, 2011 Studienkennzahl lt. Studienblatt: A 066 940 Studienrichtung lt. Studienblatt: Scientific Computing UG2002 Betreuer: Univ.-Ass. Privatdoz. Dr. Wilfried Gansterer Abstract Iterative refinement is a widely used method to improve the round-off errors of a solution of a linear system and is also used in software packets like LAPACK. The cost of the iterative improvement is very low compared to the cost of the factorization of the matrix but results in a solution which can be accurate to machine precision. Many variations on the standard iterative refinement method exist, which use different working precisions to refine the solution. The extra precise iterative refinement can use extended precision to improve the result. The mixed precision iterative refinement tries to exploit the benefits of using lower precisions to compute a solution and then uses iterative refinement to achieve the higher precision accuracy. The focus of this thesis will be the binary cascade iterative refinement (BCIR), which chooses the working precisions according to the input data. This algorithm depends on arbitrary precision arithmetic to support working precisions outside the IEEE standard data types provided by most hardware vendors. The thesis will analyse the properties of BCIR and conduct experiments which will compare the algorithm to other iterative refinement methods and focus on the numerical accuracy and the convergence. The arbitrary precision arithmetic will be implemented using the GNU MPFR software library. The simulated arbitrary precision does not provide accurate information about the gains and losses in performance due to the use of the different precisions. Therefore a per- formance model will be introduced in order to be able to compare the performance of the algorithms and to analyse the possible performance gains, which could be exploited in future works by hardware implementations for example using reconfigurable hardware like FPGAs. Contents Contents i List of Figures iv List of Listings vi 1 Introduction 1 1.1 ThesisOutline ................................... 2 2 Precisions of Floating Point Arithmetic 4 2.1 StandardPrecision ............................... 4 2.1.1 History ................................... 4 2.1.2 IEEEStandard............................... 6 2.2 ArbitraryPrecision.............................. ... 8 2.2.1 Constant Folding with Arbitrary Precision. ....... 9 2.3 Arbitrary-precision Software and Libraries . ........... 9 2.3.1 Hardwaresupport ............................. 10 2.3.2 Stand-aloneSoftware. 10 2.3.3 Programming languages and software libraries . ....... 11 2.4 GNUMPFR .................................... 12 2.4.1 MPFRVariables .............................. 12 2.4.2 MPFRRoundingModes. 14 2.4.3 MPFRFunctions.............................. 14 2.4.4 MPFRExampleSourceCode . 15 2.5 PerformanceEvaluation . ... 16 3 Iterative Refinement 19 3.1 StandardIterativeRefinement(SIR) . ...... 19 3.2 ExtraPreciseIterative Refinement (EPIR) . ....... 21 3.3 Mixed Precision Iterative Refinement (MPIR) . ........ 22 3.4 ModelEstimatingNumberofIterations . ...... 25 i 4 BCIR - Binary Cascade Iterative Refinement 26 4.1 TheAlgorithm ................................... 26 4.2 TheParameters................................... 27 4.2.1 Analysis of the Levels of Recursion p ................... 30 4.2.2 Analysis of the Precisions βj ....................... 31 4.3 RefinedVersionofBCIR............................. 33 4.4 Accessing Matrix A and Vector b atDifferentPrecisions . 35 4.5 Conclusion ..................................... 36 5 Implementation 37 5.1 Requirements .................................... 37 5.2 GenerateMatrix .................................. 38 5.3 FileFormat ..................................... 39 5.4 BlockLUdecomposition. 39 5.5 ImplementedPrograms. 40 5.5.1 FileStructureoftheResults . 41 5.5.2 Binary Cascade Iterative Refinement (BCIR) . ..... 42 5.5.3 StandardIRandMPIR(APIR) . 44 5.5.4 Extra Precise Iterative Refinement (EPIR) . ..... 46 5.5.5 DirectLUSolver(NoIR). 47 6 Experiments 48 6.1 GeneratedData................................... 49 6.2 TargetandWorkingPrecisions . .... 50 6.3 TerminationCriteria . ... 50 6.4 PrecisionsusedbyBCIR. 51 6.5 Comparison of Iterative Refinement Methods . ....... 53 6.5.1 RandomMatrices.............................. 53 6.5.2 HilbertMatrices .............................. 58 6.6 ExperimentsforRefinedVersionofBCIR . ..... 59 6.6.1 PrecisionsusedbyRefinedBCIR . 60 6.6.2 NumberofIterations. 61 6.6.3 RelativeResidual.............................. 62 6.6.4 ConclusionofRefinedBCIR. 63 7 Performance Model 64 7.1 PerformanceModels ............................... 65 7.1.1 PerformanceModelforDirectLUSolver. .... 65 7.1.2 PerformanceModelforSIR . 65 7.1.3 PerformanceModelforMPIR. 65 7.1.4 PerformanceModelforEPIR . 65 7.1.5 PerformanceModelforBCIR . 67 7.2 ResultsfortheModelledSpeed-Up . ..... 68 ii 8 Conclusion 71 Bibliography 73 A Additional and Extended Information 78 A.1 List of available Arbitrary Precision Libraries . ............ 78 A.2 ProjectFiles .................................... 80 A.3 TestEnvironment................................. 81 B Summaries and Curriculum Vitae 82 B.1 EnglishSummary.................................. 82 B.2 DeutscheZusammenfassung . ... 84 B.3 CurriculumVitae................................. 86 iii List of Figures 2.1 General representation of floating-point numbers [48].............. 7 2.2 Slow down effect of a matrix multiplication using the built-in C variable type double and arbitrary precision packages GNU MPFR and ARPREC compared toDGEMMfromtheATLASBLASlibrary . 17 2.3 Slow down effect of a matrix multiplication using the GNU MPFR library, comparedtoDGEMMfromtheATLASBLASlibrary . 18 3.1 Comparison of the performance of the direct solver and the mixed precision iterative refinement using the Cell BE processor on the Sony PlayStation 3 [27]. 24 4.1 Initial working precisions β0 coloured by the number of recursion levels p . 31 4.2 Initial working precisions β0 for the two matrix sizes n = 10 and n = 1000 31 4.3 Initial working precisions β0 coloured by the condition number κ ....... 32 4.4 Final working precisions βp coloured by the number of recursion levels p ... 33 4.5 Final working precisions βp for the two matrix sizes n = 10 and n = 1000 33 4.6 Initial working precisions β0 coloured by the number of recursion levels p using the definition of c in Equation (4.22) ....................... 34 4.7 Initial working precisions β0 for the two matrix sizes n = 10 and n = 1000 using the definition of c in Equation (4.22) ................... 34 4.8 Final working precisions βp coloured by the number of recursion levels p using the definition of c in Equation (4.22) ....................... 35 4.9 Final working precisions βp for the two matrix sizes n = 10 and n = 1000 using the definition of c in Equation (4.22) .................... 35 5.1 Block LU factorization [30] ............................ 40 6.1 Precisions βj used throughout the iterative process for n = 10 plotted on the left y-axis and the relative residual depicted by the orange dotted line on the righty-axis ..................................... 51 6.2 Precisions βj used throughout the iterative process for n = 2000 plotted on the left y-axis and the relative residual depicted by the orange dotted line on therighty-axis ................................... 52 6.3 Number of iterations for the different iterative refinement methods for n = 100 53 6.4 Number of iterations for the different iterative refinement methods for n = 2000 54 6.5 Number of iterations for the different iterative refinement methods for κ = 103 55 6.6 Relative residual for different iterative refinement methods for n = 2000 . 56 iv 6.7 Relative residual for different iterative refinement methods for κ = 103 .... 57 6.8 Relative residual for different iterative refinement methods for κ = 107 .... 57 6.9 Number of iterations for the different iterative refinement methods for Hilbert matrices....................................... 58 6.10 Relative residual for different iterative refinement methods for Hilbert matrices 59 6.11 Precisions βj used throughout the iterative process by the refined version of BCIR plotted on the left y-axis and the relative residual depicted by the orange dottedlineontherighty-axis. 60 6.12 Number of iterations for the different iterative refinement methods for n = 1000 61 6.13 Relative residual for different iterative refinement methods for κ =1 ..... 62 7.1 Modelled speed-up for different system sizes . ......... 69 7.2 Modelled speed-up for different condition numbers . .......... 70 v Listings 2.1 The definition of the data type mpfr t ...................... 13 2.2 All available MPFR functions to add two numbers using different input data types......................................... 14 2.3 DotproductimplementedinstandardC . ..... 15 2.4 Dot product implemented using the GNU MPFR library . ....... 16 5.1 Example for an output file created by bcir ................... 43 5.2 Example for an XML file created by bcir .................... 44 5.3 Example for an output file created by apir ................... 45 5.4 Example for an XML file created by apir .................... 45 5.5 Example for an output file created by apir ................... 47 5.6 Example for an XML file created
Recommended publications
  • Recent and Future Developments of GNU MPFR
    Recent and future developments of GNU MPFR Paul Zimmermann iRRAM/MPFR/MPC workshop, Dagstuhl, April 18, 2018 The GNU MPFR library • a software implementation of binary IEEE-754 • variable/arbitrary precision (up to the limits of your computer) • each variable has its own precision: mpfr_init2 (a, 35) • global user-defined exponent range (might be huge): mpfr_set_emin (-123456789) • mixed-precision operations: a b − c where a has 35 bits, b has 42 bits, c has 17 bits • correctly rounded mathematical functions (exp; log; sin; cos; :::) as in Section 9 of IEEE 754-2008 2 History I 2000: first public version; I 2008: MPFR is used by GCC 4.3.0 for constant folding: double x = sin (3.14); I 2009: MPFR becomes GNU MPFR; I 2016: 4th developer meeting in Toulouse. I Dec 2017: release 4.0.0 I mpfr.org/pub.html mentions 2 books, 27 PhD theses, 63 papers citing MPFR I Apr 2018: iRRAM/MPFR/MPC developer meeting in Dagstuhl 3 MPFR is used by SageMath SageMath version 8.1, Release Date: 2017-12-07 Type "notebook()" for the browser-based notebook interface. Type "help()" for help. sage: x=1/7; a=10^-8; b=2^24 sage: RealIntervalField(24)(x+a*sin(b*x)) [0.142857119 .. 0.142857150] 4 Representation of MPFR numbers (mpfr_t) I precision p ≥ 1 (in bits); I sign (−1 or +1); I exponent (between Emin and Emax), also used to represent special numbers (NaN, ±∞, ±0); I significand (array of dp=64e limbs), defined only for regular numbers (neither NaN, nor ±∞ and ±0, which are singular values).
    [Show full text]
  • Mod Perl 2.0 Source Code Explained 1 Mod Perl 2.0 Source Code Explained
    mod_perl 2.0 Source Code Explained 1 mod_perl 2.0 Source Code Explained 1 mod_perl 2.0 Source Code Explained 15 Feb 2014 1 1.1 Description 1.1 Description This document explains how to navigate the mod_perl source code, modify and rebuild the existing code and most important: how to add new functionality. 1.2 Project’s Filesystem Layout In its pristine state the project is comprised of the following directories and files residing at the root direc- tory of the project: Apache-Test/ - test kit for mod_perl and Apache2::* modules ModPerl-Registry/ - ModPerl::Registry sub-project build/ - utilities used during project build docs/ - documentation lib/ - Perl modules src/ - C code that builds libmodperl.so t/ - mod_perl tests todo/ - things to be done util/ - useful utilities for developers xs/ - source xs code and maps Changes - Changes file LICENSE - ASF LICENSE document Makefile.PL - generates all the needed Makefiles After building the project, the following root directories and files get generated: Makefile - Makefile WrapXS/ - autogenerated XS code blib/ - ready to install version of the package 1.3 Directory src 1.3.1 Directory src/modules/perl/ The directory src/modules/perl includes the C source files needed to build the libmodperl library. Notice that several files in this directory are autogenerated during the perl Makefile stage. When adding new source files to this directory you should add their names to the @c_src_names vari- able in lib/ModPerl/Code.pm, so they will be picked up by the autogenerated Makefile. 1.4 Directory xs/ Apache2/ - Apache specific XS code APR/ - APR specific XS code ModPerl/ - ModPerl specific XS code maps/ - tables/ - Makefile.PL - 2 15 Feb 2014 mod_perl 2.0 Source Code Explained 1.4.1 xs/Apache2, xs/APR and xs/ModPerl modperl_xs_sv_convert.h - modperl_xs_typedefs.h - modperl_xs_util.h - typemap - 1.4.1 xs/Apache2, xs/APR and xs/ModPerl The xs/Apache2, xs/APR and xs/ModPerl directories include .h files which have C and XS code in them.
    [Show full text]
  • Typedef in C
    CC -- TTYYPPEEDDEEFF http://www.tutorialspoint.com/cprogramming/c_typedef.htm Copyright © tutorialspoint.com The C programming language provides a keyword called typedef, which you can use to give a type a new name. Following is an example to define a term BYTE for one-byte numbers: typedef unsigned char BYTE; After this type definitions, the identifier BYTE can be used as an abbreviation for the type unsigned char, for example:. BYTE b1, b2; By convention, uppercase letters are used for these definitions to remind the user that the type name is really a symbolic abbreviation, but you can use lowercase, as follows: typedef unsigned char byte; You can use typedef to give a name to user defined data type as well. For example you can use typedef with structure to define a new data type and then use that data type to define structure variables directly as follows: #include <stdio.h> #include <string.h> typedef struct Books { char title[50]; char author[50]; char subject[100]; int book_id; } Book; int main( ) { Book book; strcpy( book.title, "C Programming"); strcpy( book.author, "Nuha Ali"); strcpy( book.subject, "C Programming Tutorial"); book.book_id = 6495407; printf( "Book title : %s\n", book.title); printf( "Book author : %s\n", book.author); printf( "Book subject : %s\n", book.subject); printf( "Book book_id : %d\n", book.book_id); return 0; } When the above code is compiled and executed, it produces the following result: Book title : C Programming Book author : Nuha Ali Book subject : C Programming Tutorial Book book_id : 6495407 typedef vs #define The #define is a C-directive which is also used to define the aliases for various data types similar to typedef but with following differences: The typedef is limited to giving symbolic names to types only where as #define can be used to define alias for values as well, like you can define 1 as ONE etc.
    [Show full text]
  • Enum, Typedef, Structures and Unions CS 2022: Introduction to C
    Enum, Typedef, Structures and Unions CS 2022: Introduction to C Instructor: Hussam Abu-Libdeh Cornell University (based on slides by Saikat Guha) Fall 2009, Lecture 6 Enum, Typedef, Structures and Unions CS 2022, Fall 2009, Lecture 6 Numerical Types I int: machine-dependent I Standard integers I defined in stdint.h (#include <stdint.h>) I int8 t: 8-bits signed I int16 t: 16-bits signed I int32 t: 32-bits signed I int64 t: 64-bits signed I uint8 t, uint32 t, ...: unsigned I Floating point numbers I float: 32-bits I double: 64-bits Enum, Typedef, Structures and Unions CS 2022, Fall 2009, Lecture 6 Complex Types I Enumerations (user-defined weekday: sunday, monday, ...) I Structures (user-defined combinations of other types) I Unions (same data, multiple interpretations) I Function pointers I Arrays and Pointers of the above Enum, Typedef, Structures and Unions CS 2022, Fall 2009, Lecture 6 Enumerations enum days {mon, tue, wed, thu, fri, sat, sun}; // Same as: // #define mon 0 // #define tue 1 // ... // #define sun 6 enum days {mon=3, tue=8, wed, thu, fri, sat, sun}; // Same as: // #define mon 3 // #define tue 8 // ... // #define sun 13 Enum, Typedef, Structures and Unions CS 2022, Fall 2009, Lecture 6 Enumerations enum days day; // Same as: int day; for(day = mon; day <= sun; day++) { if (day == sun) { printf("Sun\n"); } else { printf("day = %d\n", day); } } Enum, Typedef, Structures and Unions CS 2022, Fall 2009, Lecture 6 Enumerations I Basically integers I Can use in expressions like ints I Makes code easier to read I Cannot get string equiv.
    [Show full text]
  • Evaluation of On-Ramp Control Algorithms
    CALIFORNIA PATH PROGRAM INSTITUTE OF TRANSPORTATION STUDIES UNIVERSITY OF CALIFORNIA, BERKELEY Evaluation of On-ramp Control Algorithms Michael Zhang, Taewan Kim, Xiaojian Nie, Wenlong Jin University of California, Davis Lianyu Chu, Will Recker University of California, Irvine California PATH Research Report UCB-ITS-PRR-2001-36 This work was performed as part of the California PATH Program of the University of California, in cooperation with the State of California Business, Transportation, and Housing Agency, Department of Transportation; and the United States Department of Transportation, Federal Highway Administration. The contents of this report reflect the views of the authors who are responsible for the facts and the accuracy of the data presented herein. The contents do not necessarily reflect the official views or policies of the State of California. This report does not constitute a standard, specification, or regulation. Final Report for MOU 3013 December 2001 ISSN 1055-1425 CALIFORNIA PARTNERS FOR ADVANCED TRANSIT AND HIGHWAYS Evaluation of On-ramp Control Algorithms September 2001 Michael Zhang, Taewan Kim, Xiaojian Nie, Wenlong Jin University of California at Davis Lianyu Chu, Will Recker PATH Center for ATMS Research University of California, Irvine Institute of Transportation Studies One Shields Avenue University of California, Davis Davis, CA 95616 ACKNOWLEDGEMENTS Technical assistance on Paramics simulation from the Quadstone Technical Support sta is also gratefully acknowledged. ii EXECUTIVE SUMMARY This project has three objectives: 1) review existing ramp metering algorithms and choose a few attractive ones for further evaluation, 2) develop a ramp metering evaluation framework using microscopic simulation, and 3) compare the performances of the selected algorithms and make recommendations about future developments and eld tests of ramp metering systems.
    [Show full text]
  • GNU MPFR the Multiple Precision Floating-Point Reliable Library Edition 4.1.0 July 2020
    GNU MPFR The Multiple Precision Floating-Point Reliable Library Edition 4.1.0 July 2020 The MPFR team [email protected] This manual documents how to install and use the Multiple Precision Floating-Point Reliable Library, version 4.1.0. Copyright 1991, 1993-2020 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back- Cover Texts. A copy of the license is included in Appendix A [GNU Free Documentation License], page 59. i Table of Contents MPFR Copying Conditions ::::::::::::::::::::::::::::::::::::::: 1 1 Introduction to MPFR :::::::::::::::::::::::::::::::::::::::: 2 1.1 How to Use This Manual::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 2 2 Installing MPFR ::::::::::::::::::::::::::::::::::::::::::::::: 3 2.1 How to Install ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 3 2.2 Other `make' Targets :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 4 2.3 Build Problems :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 4 2.4 Getting the Latest Version of MPFR ::::::::::::::::::::::::::::::::::::::::::::::: 4 3 Reporting Bugs::::::::::::::::::::::::::::::::::::::::::::::::: 5 4 MPFR Basics ::::::::::::::::::::::::::::::::::::::::::::::::::: 6 4.1 Headers and Libraries :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 6
    [Show full text]
  • This PDF Is Provided for Academic Purposes
    This PDF is provided for academic purposes. If you want this document for commercial purposes, please get this article from IEEE. FLiT: Cross-Platform Floating-Point Result-Consistency Tester and Workload Geof Sawaya∗, Michael Bentley∗, Ian Briggs∗, Ganesh Gopalakrishnan∗, Dong H. Ahny ∗University of Utah, yLaurence Livermore National Laboratory Abstract—Understanding the extent to which computational much performance gain to pass up, and so one must embrace results can change across platforms, compilers, and compiler flags result-changes in practice. However, exploiting such compiler can go a long way toward supporting reproducible experiments. flags is fraught with many dangers. A scientist publishing In this work, we offer the first automated testing aid called FLiT (Floating-point Litmus Tester) that can show how much these a piece of code with these flags used in the build may not results can vary for any user-given collection of computational really understand the extent to which the results would change kernels. Our approach is to take a collection of these kernels, across inputs, platforms, and other compilers. For example, the disperse them across a collection of compute nodes (each with optimization flag -O3 does not hold equal meanings across a different architecture), have them compiled and run, and compilers. Also, some flags are exclusive to certain compilers; bring the results to a central SQL database for deeper analysis. Properly conducting these activities requires a careful selection in those cases, a user who is forced to use a different compiler (or design) of these kernels, input generation methods for them, does not know which substitute flags to use.
    [Show full text]
  • Presentation on Ocaml Internals
    OCaml Internals Implementation of an ML descendant Theophile Ranquet Ecole Pour l’Informatique et les Techniques Avancées SRS 2014 [email protected] November 14, 2013 2 of 113 Table of Contents Variants and subtyping System F Variants Type oddities worth noting Polymorphic variants Cyclic types Subtyping Weak types Implementation details α ! β Compilers Functional programming Values Why functional programming ? Allocation and garbage Combinatory logic : SKI collection The Curry-Howard Compiling correspondence Type inference OCaml and recursion 3 of 113 Variants A tagged union (also called variant, disjoint union, sum type, or algebraic data type) holds a value which may be one of several types, but only one at a time. This is very similar to the logical disjunction, in intuitionistic logic (by the Curry-Howard correspondance). 4 of 113 Variants are very convenient to represent data structures, and implement algorithms on these : 1 d a t a t y p e tree= Leaf 2 | Node of(int ∗ t r e e ∗ t r e e) 3 4 Node(5, Node(1,Leaf,Leaf), Node(3, Leaf, Node(4, Leaf, Leaf))) 5 1 3 4 1 fun countNodes(Leaf)=0 2 | countNodes(Node(int,left,right)) = 3 1 + countNodes(left)+ countNodes(right) 5 of 113 1 t y p e basic_color= 2 | Black| Red| Green| Yellow 3 | Blue| Magenta| Cyan| White 4 t y p e weight= Regular| Bold 5 t y p e color= 6 | Basic of basic_color ∗ w e i g h t 7 | RGB of int ∗ i n t ∗ i n t 8 | Gray of int 9 1 l e t color_to_int= function 2 | Basic(basic_color,weight) −> 3 l e t base= match weight with Bold −> 8 | Regular −> 0 in 4 base+ basic_color_to_int basic_color 5 | RGB(r,g,b) −> 16 +b+g ∗ 6 +r ∗ 36 6 | Grayi −> 232 +i 7 6 of 113 The limit of variants Say we want to handle a color representation with an alpha channel, but just for color_to_int (this implies we do not want to redefine our color type, this would be a hassle elsewhere).
    [Show full text]
  • Effectiveness of Floating-Point Precision on the Numerical Approximation by Spectral Methods
    Mathematical and Computational Applications Article Effectiveness of Floating-Point Precision on the Numerical Approximation by Spectral Methods José A. O. Matos 1,2,† and Paulo B. Vasconcelos 1,2,∗,† 1 Center of Mathematics, University of Porto, R. Dr. Roberto Frias, 4200-464 Porto, Portugal; [email protected] 2 Faculty of Economics, University of Porto, R. Dr. Roberto Frias, 4200-464 Porto, Portugal * Correspondence: [email protected] † These authors contributed equally to this work. Abstract: With the fast advances in computational sciences, there is a need for more accurate compu- tations, especially in large-scale solutions of differential problems and long-term simulations. Amid the many numerical approaches to solving differential problems, including both local and global methods, spectral methods can offer greater accuracy. The downside is that spectral methods often require high-order polynomial approximations, which brings numerical instability issues to the prob- lem resolution. In particular, large condition numbers associated with the large operational matrices, prevent stable algorithms from working within machine precision. Software-based solutions that implement arbitrary precision arithmetic are available and should be explored to obtain higher accu- racy when needed, even with the higher computing time cost associated. In this work, experimental results on the computation of approximate solutions of differential problems via spectral methods are detailed with recourse to quadruple precision arithmetic. Variable precision arithmetic was used in Tau Toolbox, a mathematical software package to solve integro-differential problems via the spectral Tau method. Citation: Matos, J.A.O.; Vasconcelos, Keywords: floating-point arithmetic; variable precision arithmetic; IEEE 754-2008 standard; quadru- P.B.
    [Show full text]
  • Linux from Scratch Version 7.7-Systemd
    Linux From Scratch Version 7.7-systemd 创建者:Gerard Beekmans 编辑者:Matthew Burgess 和 Armin K. 翻译团队:LCTT 译者/校对:wxy, ictlyh, dongfengweixiao, zpl1025, H-mudcup, Yuking-net, kevinSJ Copyright © 1999-2015 Gerard Beekmans 目录 序章 前言 致读者 LFS 的目标架构 LFS 和标准 本书中的软件包逻辑 前置需求 宿主系统需求 排版约定 本书结构 勘误表 第一部分 介绍 第一章 介绍 如何构建 LFS 系统 上次发布以来的更新 更新日志 资源 帮助 第二部分 准备构建 第二章 准备新分区 简介 创建新分区 在分区上创建文件系统 挂载新分区 设置 $LFS 变量 第三章 软件包与补丁 简介 所有软件包 所需补丁 第四章 最后的准备 简介 创建 $LFS/tools 目录 添加 LFS 用户 设置环境 关于 SBU 关于测试套件 第五章 构建临时文件系统 简介 工具链技术备注 通用编译指南 Binutils-2.25 - 第一遍 GCC-4.9.2 - 第一遍 Linux-3.19 API 头文件 Glibc-2.21 Libstdc++-4.9.2 Binutils-2.25 - 第二遍 GCC-4.9.2 - 第二遍 Tcl-8.6.3 Expect-5.45 DejaGNU-1.5.2 Check-0.9.14 Ncurses-5.9 Bash-4.3.30 Bzip2-1.0.6 Coreutils-8.23 Diffutils-3.3 File-5.22 Findutils-4.4.2 Gawk-4.1.1 Gettext-0.19.4 Grep-2.21 Gzip-1.6 M4-1.4.17 Make-4.1 Patch-2.7.4 Perl-5.20.2 Sed-4.2.2 Tar-1.28 Texinfo-5.2 Util-linux-2.26 Xz-5.2.0 清理无用内容 改变属主 第三部分 构建 LFS 系统 第六章 安装基本的系统软件 简介 准备虚拟内核文件系统 软件包管理 进入 chroot 环境 创建目录 创建必需的文件和符号链接 Linux-3.19 API Headers Man-pages-3.79 Glibc-2.21 调整工具链 Zlib-1.2.8 File-5.22 Binutils-2.25 GMP-6.0.0a MPFR-3.1.2 MPC-1.0.2 GCC-4.9.2 Bzip2-1.0.6 Pkg-config-0.28 Ncurses-5.9 Attr-2.4.47 Acl-2.2.52 Libcap-2.24 Sed-4.2.2 Shadow-4.2.1 Psmisc-22.21 Procps-ng-3.3.10 E2fsprogs-1.42.12 Coreutils-8.23 Iana-Etc-2.30 M4-1.4.17 Flex-2.5.39 Bison-3.0.4 Grep-2.21 Readline-6.3 Bash-4.3.30 Bc-1.06.95 Libtool-2.4.6 GDBM-1.11 Expat-2.1.0 Inetutils-1.9.2 Perl-5.20.2 XML::Parser-2.44 Autoconf-2.69 Automake-1.15 Diffutils-3.3 Gawk-4.1.1 Findutils-4.4.2 Gettext-0.19.4 Intltool-0.50.2 Gperf-3.0.4 Groff-1.22.3 Xz-5.2.0 GRUB-2.02~beta2 Less-458 Gzip-1.6 IPRoute2-3.19.0 Kbd-2.0.2 Kmod-19 Libpipeline-1.4.0 Make-4.1 Patch-2.7.4 Systemd-219 D-Bus-1.8.16 Util-linux-2.26 Man-DB-2.7.1 Tar-1.28 Texinfo-5.2 Vim-7.4 关于调试符号 再次清理无用内容 清理 第七章 基本系统配置 简介 通用网络配置 LFS 系统中的设备和模块控制 定制设备的符号链接 配置系统时钟 配置 Linux 主控台 配置系统本地化 创建 /etc/inputrc 文件 创建 /etc/shells 文件 使用及配置 Systemd 第八章 让 LFS 系统可引导 简介 创建 /etc/fstab 文件 Linux-3.19 用 GRUB 设置引导过程 第九章 尾声 最后的最后 为 LFS 用户数添砖加瓦 重启系统 接下来做什么呢? 第四部分 附录 附录 A.
    [Show full text]
  • Fast Fourier Transform Using Compiler Auto-Vectorization Master’S Thesis in Complex Adaptive Systems
    DF Fast Fourier Transform using compiler auto-vectorization Master’s thesis in Complex Adaptive Systems DUNDAR GOC¨ Department of Mathematical Sciences CHALMERS UNIVERSITY OF TECHNOLOGY Gothenburg, Sweden 2019 Master’s thesis 2019 Fast Fourier Transform using compiler auto-vectorization DUNDAR GOC¨ DF Department of Mathematical Sciences Chalmers University of Technology Gothenburg, Sweden 2019 Fast Fourier Transform using compiler auto-vectorization DUNDAR GOC¨ c DUNDAR GOC,¨ 2019. Supervisor: Martin Raum, Department of Mathematical Sciences at Chalmers University of Technology Examiner: Martin Raum, Department of Mathematical Sciences at Chalmers University of Technology Master’s Thesis 2019 Department of Mathematical Sciences Chalmers University of Technology SE-412 96 Gothenburg Telephone +46 31 772 1000 Cover: An illustration of the FFT algorithm structure. Typeset in LATEX Digital Gothenburg, Sweden 2019 iv Fast Fourier Transform using compiler auto-vectorization DUNDAR GOC¨ Department of Mathematical Sciences Chalmers University of Technology Abstract The purpose of this thesis is to develop a fast Fourier transformation algorithm written in C with the use of GCC (GNU Compiler Collection) auto-vectorization in the multiplicative group of integers modulo n, where n is a word sized odd integer. The specific Fourier transform algorithm employed is the cache-friendly version of the Truncated Fourier Transform. The algorithm was implemented by modifying an existing library for modular arithmetic written in C called zn poly. The majority of the thesis work consisted of changing the code in a way to make integration into FLINT possible. FLINT is a versatile library, written in C, aimed at fast arithmetic of different kinds. The results show that auto-vectorization is possible with a potential speedup factor of 3.
    [Show full text]
  • Introduction to the GNU MPFR Library
    Introduction to the GNU MPFR Library Vincent LEFÈVRE AriC, INRIA Grenoble – Rhône-Alpes / LIP, ENS-Lyon GdT AriC, 2014-04-22 [gdt201404.tex 68929 2014-04-22 10:17:19Z vinc17/xvii] Outline Introduction Presentation of GNU MPFR, History Why MPFR? MPFR Basics Output Functions Test of MPFR (make check) Applications Timings The Future – Work in Progress Conclusion [gdt201404.tex 68929 2014-04-22 10:17:19Z vinc17/xvii] Vincent LEFÈVRE (INRIA / LIP, ENS-Lyon) Introduction to the GNU MPFR Library GdT AriC, 2014-04-22 2 / 56 Introduction: Arbitrary Precision Arbitrary precision: the ability to perform calculations on numbers whose “size” can be arbitrarily large, with environmental limits (available memory, size of basic data types, etc.). [gdt201404.tex 68929 2014-04-22 10:17:19Z vinc17/xvii] Vincent LEFÈVRE (INRIA / LIP, ENS-Lyon) Introduction to the GNU MPFR Library GdT AriC, 2014-04-22 3 / 56 Introduction: Arbitrary Precision Arbitrary precision: the ability to perform calculations on numbers whose “size” can be arbitrarily large, with environmental limits (available memory, size of basic data types, etc.). Different kinds of (arbitrary-precision) arithmetics: integer (e.g., GMP/mpz), rational (e.g., GMP/mpq), fixed-point, floating-point, etc. [gdt201404.tex 68929 2014-04-22 10:17:19Z vinc17/xvii] Vincent LEFÈVRE (INRIA / LIP, ENS-Lyon) Introduction to the GNU MPFR Library GdT AriC, 2014-04-22 3 / 56 Introduction: Arbitrary Precision Arbitrary precision: the ability to perform calculations on numbers whose “size” can be arbitrarily large, with environmental limits (available memory, size of basic data types, etc.). Different kinds of (arbitrary-precision) arithmetics: integer (e.g., GMP/mpz), rational (e.g., GMP/mpq), fixed-point, floating-point, etc.
    [Show full text]