Algorithms and Architectures for Decimal Transcendental Function Computation

Total Page:16

File Type:pdf, Size:1020Kb

Algorithms and Architectures for Decimal Transcendental Function Computation View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by University of Saskatchewan's Research Archive Algorithms and Architectures for Decimal Transcendental Function Computation A Thesis Submitted to the College of Graduate Studies and Research in Partial Fulfillment of the Requirements for the degree of Doctor of Philosophy in the Department of Electrical and Computer Engineering University of Saskatchewan Saskatoon, Saskatchewan, Canada By Dongdong Chen c Dongdong Chen, January, 2011. All rights reserved. Permission to Use In presenting this thesis in partial fulfilment of the requirements for a Postgraduate degree from the University of Saskatchewan, I agree that the Libraries of this University may make it freely available for inspection. I further agree that permission for copying of this thesis in any manner, in whole or in part, for scholarly purposes may be granted by the professor or professors who supervised my thesis work or, in their absence, by the Head of the Department or the Dean of the College in which my thesis work was done. It is understood that any copying or publication or use of this thesis or parts thereof for financial gain shall not be allowed without my written permission. It is also understood that due recognition shall be given to me and to the University of Saskatchewan in any scholarly use which may be made of any material in my thesis. Requests for permission to copy or to make other use of material in this thesis in whole or part should be addressed to: Head of the Department of Electrical and Computer Engineering 57 Campus Drive University of Saskatchewan Saskatoon, Saskatchewan Canada S7N 5A9 i Abstract Nowadays, there are many commercial demands for decimal floating-point (DFP) arith- metic operations such as financial analysis, tax calculation, currency conversion, Internet based applications, and e-commerce. This trend gives rise to further development on DFP arithmetic units which can perform accurate computations with exact decimal operands. Due to the significance of DFP arithmetic, the IEEE 754-2008 standard for floating-point arithmetic includes it in its specifications. The basic decimal arithmetic unit, such as dec- imal adder, subtracter, multiplier, divider or square-root unit, as a main part of a decimal microprocessor, is attracting more and more researchers’ attentions. Recently, the decimal- encoded formats and DFP arithmetic units have been implemented in IBM’s system z900, POWER6, and z10 microprocessors. Increasing chip densities and transistor count provide more room for designers to add more essential functions on application domains into upcoming microprocessors. Decimal transcendental functions, such as DFP logarithm, antilogarithm, exponential, reciprocal and trigonometric, etc, as useful arithmetic operations in many areas of science and engineering, has been specified as the recommended arithmetic in the IEEE 754-2008 standard. Thus, vir- tually all the computing systems that are compliant with the IEEE 754-2008 standard could include a DFP mathematical library providing transcendental function computation. Based on the development of basic decimal arithmetic units, more complex DFP transcendental arithmetic will be the next building blocks in microprocessors. In this dissertation, we researched and developed several new decimal algorithms and architectures for the DFP transcendental function computation. These designs are com- posed of several different methods: 1) the decimal transcendental function computation based on the table-based first-order polynomial approximation method; 2) DFP logarithmic and antilogarithmic converters based on the decimal digit-recurrence algorithm with selec- tion by rounding; 3) a decimal reciprocal unit using the efficient table look-up based on Newton-Raphson iterations; and 4) a first radix-100 division unit based on the non-restoring algorithm with pre-scaling method. Most decimal algorithms and architectures for the DFP transcendental function computation developed in this dissertation have been the first at- ii tempt to analyze and implement the DFP transcendental arithmetic in order to achieve faithful results of DFP operands, specified in IEEE 754-2008. To help researchers evaluate the hardware performance of DFP transcendental arithmetic units, the proposed architectures based on the different methods are modeled, verified and synthesized using FPGAs or with CMOS standard cells libraries in ASIC. Some of implemen- tation results are compared with those of the binary radix-16 logarithmic and exponential converters; recent developed high performance decimal CORDIC based architecture; and Intel’s DFP transcendental function computation software library. The comparison results show that the proposed architectures have significant speed-up in contrast to the above designs in terms of the latency. The algorithms and architectures developed in this dis- sertation provide a useful starting point for future hardware-oriented DFP transcendental function computation researches. iii Acknowledgements It is a pleasure to express my sincere appreciation to the people who have assisted me throughout the years of research that have led to this Ph.D dissertation at University of Saskatchewan. First and foremost, I sincerely thank my supervisor, Dr. Seok-Bum Ko, who has sup- ported me throughout this dissertation with his expertise, understanding, and patience whilst allowing me the room to work in my own way. Without his constant guidance, advice and encouragement, this dissertation would not have been possible. Second, I would like to thank the members of my dissertation committee, Dr. Daniel Teng, Dr. Khan Wahid and Dr. Derek Eager, for the invaluable suggestions for my Ph.D dissertation. Also, I would like to thank Dr. Tor Aamodt from the University of British Columbia for taking time out from his busy schedule to serve as my external examiner. Third, I thank all people in the Department of Electrical and Computer Engineering for providing such a great academic environment, in which I can carry out this dissertation. In particular, I would like to thank the faculties and staff of VLSI research group; and thank all workmates for collaborations on several research projects. Fourth, I would appreciate the anonymous reviewers from IEEE Symposium on Computer Arithmetic and IEEE Transactions on Computer for their invaluable comments. I am very grateful to the Dr. Ivan Godard for his insightful advice and brilliant idea for several future research projects. I would like to mention Dr. Liang-Kai Wang, Dr. Mark A. Erle and Dr. Alvaro´ V´azquez for their impressive Ph.D works in the area of the DFP computer arithmetic, which continually inspire me on doing my Ph.D research. Fifth, I would to deliver my thanks to my friends in 2C60 for their friendship and help; special thanks to my beloved family for their love and support. They always encouraged me and asked me to be patience and work harder, and that is the thing I really kept in mind. Finally, I am grateful to the College of Graduate Studies and the Department of Electrical and Computer Engineering for providing financial assistance through scholarships that were invaluable to me. iv Dedicate to my beloved family v Contents Permission to Use i Abstract ii Acknowledgements iv Contents vi List of Tables x List of Figures xi List of Abbreviations xii I Preface 1 1 Introduction 2 1.1 WhyDecimalArithmetic............................. 2 1.2 Motivation..................................... 4 1.3 ResearchOverview ................................ 7 1.4 ResearchContributions.............................. 9 II Research Background 12 2 Decimal Transcendental Arithmetic 13 2.1 DFPFormatsinIEEE754-2008Standard . 13 2.1.1 DFPFormatsandEncodings . 14 2.1.2 DFPArithmeticOperations . 17 2.1.3 DFPRoundingModes .......................... 19 2.1.4 ExceptionHandling............................ 20 2.2 DecimalTranscendentalUnitDesign . 20 2.2.1 Some Details of DFP Transcendental Operations . 21 2.2.2 Classification of Hardware Approaches in BFP . 23 2.2.3 Considerations of Hardware Implementation . 26 2.2.4 Related Basic Decimal Arithmetic . 28 III Table-based First-Order Polynomial Approximation 29 3 A Dynamic Non-Uniform Segmentation Method 30 vi 3.1 Introduction.................................... 30 3.2 Minimax Polynomial Approximation . 32 3.2.1 Notations ................................. 32 3.2.2 Minimax Error Analysis in One Segment . 33 3.3 ANon-UniformSegmentationMethod . 35 3.3.1 Determination of Initial UFB by MiniBit Approach . 37 3.3.2 Evaluation of Bit-Width of Segment Boundary . 38 3.3.3 Partition of Non-Uniform Segments by BSPS . 39 3.4 HardwareArchitecture .............................. 40 3.4.1 SegmentIndexEncoder ......................... 41 3.4.2 EstimationofMemorySizes . .. .. 42 3.5 ExperimentalResults............................... 43 3.5.1 ComparisonResults............................ 43 3.5.2 Evaluation Results for More Functions . 44 3.5.3 MemorySizesforTwoMethods . 45 3.5.4 CPUTimeConsumed .......................... 46 3.6 Summary ..................................... 47 4 Decimal Logarithmic and Antilogarithmic Converters 50 4.1 Introduction.................................... 50 4.2 Decimal Logarithm and Antilogarithm Conversion . .. 51 4.2.1 Binary-based Decimal Logarithm Conversion (Alg. 1) . 51 4.2.2 Decimal Logarithm Conversion (Alg. 2) . 52 4.2.3 Decimal Antilogarithm Conversion (Alg. 3) . 53 4.3 Piecewise Linear Approximation Method . 54 4.3.1 Notations ................................
Recommended publications
  • High Performance Decimal Floating-Point Units
    UNIVERSIDADE DE SANTIAGO DE COMPOSTELA DEPARTAMENTO DE ELECTRONICA´ E COMPUTACION´ PhD. Dissertation High-Performance Decimal Floating-Point Units Alvaro´ Vazquez´ Alvarez´ Santiago de Compostela, January 2009 To my family A´ mina˜ familia Acknowledgements It has been a long way to see this thesis successfully concluded, at least longer than what I imagined it. Perhaps the moment to thank and acknowledge everyone’s contributions is the most eagerly awaited. This thesis could not have been possible without the support of several people and organizations whose contributions I am very grateful. First of all, I want to express my sincere gratitude to my thesis advisor, Elisardo Antelo. Specially, I would like to emphasize the invaluable support he offered to me all these years. His ideas and contributions have a major influence on this thesis. I would like to thank all people in the Departamento de Electronica´ e Computacion´ for the material and personal help they gave me to carry out this thesis, and for providing a friendly place to work. In particular, I would like to mention to Prof. Javier D. Bruguera and the other staff of the Computer Architecture Group. Many thanks to Paula, David, Pichel, Marcos, Juanjo, Oscar,´ Roberto and my other workmates for their friendship and help. I am very grateful to IBM Germany for their financial support though a one-year research contract. I would like to thank Ralf Fischer, lead of hardware development, and Peter Roth and Stefan Wald, team managers at IBM Deutchland Entwicklung in Boblingen.¨ I would like to extend my gratitude to the FPU design team, in special to Silvia Muller¨ and Michael Kroner,¨ for their help and the warm welcome I received during my stay in Boblingen.¨ I would also like to thank Eric Schwarz from IBM for his support.
    [Show full text]
  • Fully Redundant Decimal Arithmetic
    2009 19th IEEE International Symposium on Computer Arithmetic Fully Redundant Decimal Arithmetic Saeid Gorgin and Ghassem Jaberipur Dept. of Electrical & Computer Engr., Shahid Beheshti Univ. and School of Computer Science, institute for research in fundamental sciences (IPM), Tehran, Iran [email protected], [email protected] Abstract In both decimal and binary arithmetic, partial products in multipliers and partial remainders in Hardware implementation of all the basic radix-10 dividers are often represented via a redundant number arithmetic operations is evolving as a new trend in the system (e.g., Binary signed digit [11], decimal carry- design and implementation of general purpose digital save [5], double-decimal [6], and minimally redundant processors. Redundant representation of partial decimal [9]). Such use of redundant digit sets, where products and remainders is common in the the number of digits is sufficiently more than the radix, multiplication and division hardware algorithms, allows for carry-free addition and subtraction as the respectively. Carry-free implementation of the more basic operations that build-up the product and frequent add/subtract operations, with the byproduct of remainder, respectively. In the aforementioned works enhancing the speed of multiplication and division, is on decimal multipliers and dividers, inputs and outputs possible with redundant number representation. are nonredundant decimal numbers. However, a However, conversion of redundant results to redundant representation is used for the intermediate conventional representations entails slow carry partial products or remainders. The intermediate propagation that can be avoided if the results are kept additions and subtractions are semi-redundant in redundant format for later use as operands of other operations in that only one of the operands as well as arithmetic operations.
    [Show full text]
  • Mock In-Class Test COMS10007 Algorithms 2018/2019
    Mock In-class Test COMS10007 Algorithms 2018/2019 Throughout this paper log() denotes the binary logarithm, i.e, log(n) = log2(n), and ln() denotes the logarithm to base e, i.e., ln(n) = loge(n). 1 O-notation 1. Let f : N ! N be a function. Define the set Θ(f(n)). Proof. Θ(f(n)) = fg(n) : There exist positive constants c1; c2 and n0 s.t. 0 ≤ c1f(n) ≤ g(n) ≤ c2f(n) for all n ≥ n0g 2. Give a formal proof of the statement: p 10 n 2 O(n) : p Proof. We need to show that there are positive constants c; n0 such that 10 n ≤ c · n, 10 p for every n ≥ n0. The previous inequality is equivalent to c ≤ n, which in turn 100 100 gives c2 ≤ n. Hence, we can pick c = 1 and n0 = 12 = 100. 3. Use the racetrack principle to prove the following statement: n 2 O(2n) : Hint: The following facts can be useful: • The derivative of 2n is ln(2)2n. 1 • 2 ≤ ln(2) ≤ 1 holds. n Proof. We need to show that there are positive constants c; n0 such that n ≤ c·2 . We n pick c = 1 and n0 = 1. Observe that n ≤ 2 holds for n = n0(= 1). It remains to show n that n ≤ 2 also holds for every n ≥ n0. To show this, we use the racetrack principle. Observe that the derivative of n is 1 and the derivative of 2n is ln(2)2n. Hence, by n the racetrack principle it is enough to show that 1 ≤ ln(2)2 holds for every n ≥ n0, 1 1 1 1 or log( ln 2 ) ≤ n.
    [Show full text]
  • Algorithms and Architectures for Decimal Transcendental Function Computation
    Algorithms and Architectures for Decimal Transcendental Function Computation A Thesis Submitted to the College of Graduate Studies and Research in Partial Fulfillment of the Requirements for the degree of Doctor of Philosophy in the Department of Electrical and Computer Engineering University of Saskatchewan Saskatoon, Saskatchewan, Canada By Dongdong Chen c Dongdong Chen, January, 2011. All rights reserved. Permission to Use In presenting this thesis in partial fulfilment of the requirements for a Postgraduate degree from the University of Saskatchewan, I agree that the Libraries of this University may make it freely available for inspection. I further agree that permission for copying of this thesis in any manner, in whole or in part, for scholarly purposes may be granted by the professor or professors who supervised my thesis work or, in their absence, by the Head of the Department or the Dean of the College in which my thesis work was done. It is understood that any copying or publication or use of this thesis or parts thereof for financial gain shall not be allowed without my written permission. It is also understood that due recognition shall be given to me and to the University of Saskatchewan in any scholarly use which may be made of any material in my thesis. Requests for permission to copy or to make other use of material in this thesis in whole or part should be addressed to: Head of the Department of Electrical and Computer Engineering 57 Campus Drive University of Saskatchewan Saskatoon, Saskatchewan Canada S7N 5A9 i Abstract Nowadays, there are many commercial demands for decimal floating-point (DFP) arith- metic operations such as financial analysis, tax calculation, currency conversion, Internet based applications, and e-commerce.
    [Show full text]
  • Optimizing MPC for Robust and Scalable Integer and Floating-Point Arithmetic
    Optimizing MPC for robust and scalable integer and floating-point arithmetic Liisi Kerik1, Peeter Laud1, and Jaak Randmets1,2 1 Cybernetica AS, Tartu, Estonia 2 University of Tartu, Tartu, Estonia {liisi.kerik, peeter.laud, jaak.randmets}@cyber.ee Abstract. Secure multiparty computation (SMC) is a rapidly matur- ing field, but its number of practical applications so far has been small. Most existing applications have been run on small data volumes with the exception of a recent study processing tens of millions of education and tax records. For practical usability, SMC frameworks must be able to work with large collections of data and perform reliably under such conditions. In this work we demonstrate that with the help of our re- cently developed tools and some optimizations, the Sharemind secure computation framework is capable of executing tens of millions integer operations or hundreds of thousands floating-point operations per sec- ond. We also demonstrate robustness in handling a billion integer inputs and a million floating-point inputs in parallel. Such capabilities are ab- solutely necessary for real world deployments. Keywords: Secure Multiparty Computation, Floating-point operations, Protocol design 1 Introduction Secure multiparty computation (SMC) [19] allows a group of mutually distrust- ing entities to perform computations on data private to various members of the group, without others learning anything about that data or about the intermedi- ate values in the computation. Theory-wise, the field is quite mature; there exist several techniques to achieve privacy and correctness of any computation [19, 28, 15, 16, 21], and the asymptotic overheads of these techniques are known.
    [Show full text]
  • Version 13.0 Release Notes
    Release Notes The tool of thought for expert programming Version 13.0 Dyalog is a trademark of Dyalog Limited Copyright 1982-2011 by Dyalog Limited. All rights reserved. Version 13.0 First Edition April 2011 No part of this publication may be reproduced in any form by any means without the prior written permission of Dyalog Limited. Dyalog Limited makes no representations or warranties with respect to the contents hereof and specifically disclaims any implied warranties of merchantability or fitness for any particular purpose. Dyalog Limited reserves the right to revise this publication without notification. TRADEMARKS: SQAPL is copyright of Insight Systems ApS. UNIX is a registered trademark of The Open Group. Windows, Windows Vista, Visual Basic and Excel are trademarks of Microsoft Corporation. All other trademarks and copyrights are acknowledged. iii Contents C H A P T E R 1 Introduction .................................................................................... 1 Summary........................................................................................................................... 1 System Requirements ....................................................................................................... 2 Microsoft Windows .................................................................................................... 2 Microsoft .Net Interface .............................................................................................. 2 Unix and Linux ..........................................................................................................
    [Show full text]
  • CS 270 Algorithms
    CS 270 Algorithms Week 10 Oliver Kullmann Binary heaps Sorting Heapification Building a heap 1 Binary heaps HEAP- SORT Priority 2 Heapification queues QUICK- 3 Building a heap SORT Analysing 4 QUICK- HEAP-SORT SORT 5 Priority queues Tutorial 6 QUICK-SORT 7 Analysing QUICK-SORT 8 Tutorial CS 270 General remarks Algorithms Oliver Kullmann Binary heaps Heapification Building a heap We return to sorting, considering HEAP-SORT and HEAP- QUICK-SORT. SORT Priority queues CLRS Reading from for week 7 QUICK- SORT 1 Chapter 6, Sections 6.1 - 6.5. Analysing 2 QUICK- Chapter 7, Sections 7.1, 7.2. SORT Tutorial CS 270 Discover the properties of binary heaps Algorithms Oliver Running example Kullmann Binary heaps Heapification Building a heap HEAP- SORT Priority queues QUICK- SORT Analysing QUICK- SORT Tutorial CS 270 First property: level-completeness Algorithms Oliver Kullmann Binary heaps In week 7 we have seen binary trees: Heapification Building a 1 We said they should be as “balanced” as possible. heap 2 Perfect are the perfect binary trees. HEAP- SORT 3 Now close to perfect come the level-complete binary Priority trees: queues QUICK- 1 We can partition the nodes of a (binary) tree T into levels, SORT according to their distance from the root. Analysing 2 We have levels 0, 1,..., ht(T ). QUICK- k SORT 3 Level k has from 1 to 2 nodes. Tutorial 4 If all levels k except possibly of level ht(T ) are full (have precisely 2k nodes in them), then we call the tree level-complete. CS 270 Examples Algorithms Oliver The binary tree Kullmann 1 ❚ Binary heaps ❥❥❥❥ ❚❚❚❚ ❥❥❥❥ ❚❚❚❚ ❥❥❥❥ ❚❚❚❚ Heapification 2 ❥ 3 ❖ ❄❄ ❖❖ Building a ⑧⑧ ❄ ⑧⑧ ❖❖ heap ⑧⑧ ❄ ⑧⑧ ❖❖❖ 4 5 6 ❄ 7 ❄ HEAP- ⑧ ❄ ⑧ ❄ SORT ⑧⑧ ❄ ⑧⑧ ❄ Priority 10 13 14 15 queues QUICK- is level-complete (level-sizes are 1, 2, 4, 4), while SORT ❥ 1 ❚❚ Analysing ❥❥❥❥ ❚❚❚❚ QUICK- ❥❥❥ ❚❚❚ SORT ❥❥❥ ❚❚❚❚ 2 ❥❥ 3 ♦♦♦ ❄❄ ⑧ Tutorial ♦♦ ❄❄ ⑧⑧ ♦♦♦ ⑧ 4 ❄ 5 ❄ 6 ❄ ⑧⑧ ❄❄ ⑧ ❄ ⑧ ❄ ⑧⑧ ❄ ⑧⑧ ❄ ⑧⑧ ❄ 8 9 10 11 12 13 is not (level-sizes are 1, 2, 3, 6).
    [Show full text]
  • CS211 ❒ Overview • Code (Number Guessing Example)
    CS211 1. Number Guessing Example ASYMPTOTIC COMPLEXITY 1.1 Code ❒ Overview import java.io.*; • code (number guessing example) public class NumberGuess { • approximate analysis to motivate math public static void main(String[] args) throws IOException { int guess; • machine model int count; final int LOW=Integer.parseInt(args[0]); • analysis (space, time) final int HIGH=Integer.parseInt(args[1]); final int STOP=HIGH-LOW+1; • machine architecture and time assumptions int target = (int)(Math.random()*(HIGH-LOW+1))+(int)(LOW); BufferedReader in = new BufferedReader(new • code analysis InputStreamReader(System.in)); • more assumptions System.out.print("\nGuess an integer: "); guess = Integer.parseInt(in.readLine()); • Big Oh notation count = 1; • examples while (guess != target && guess >= LOW && • extra material (theory, background, math) guess <= HIGH && count < STOP ) { if (guess < target) System.out.println("Too low!"); else if (guess>target) System.out.println("Too high!"); else System.exit(0); System.out.print("\nGuess an integer: "); guess = Integer.parseInt(in.readLine()); count++ ; } if (target == guess) System.out.println("\nCongratulations!\n"); } } 12 1.2 Solution Algorithms 1.4 More General • random: • count only comparisons of guess to target - pick any number (same number as guesses) - worst-case time could be infinite • other examples: • linear: range linea binar pattern r y - guess one number at a time 1111→1 - start from bottom and head to top 1–10 10 5 5→7→8→9→10 1–100 100 8 50→75→87→93→96→98→99→100 - worst-case time could
    [Show full text]
  • History of Binary and Other Nondecimal Numeration
    HISTORY OF BINARY AND OTHER NONDECIMAL NUMERATION BY ANTON GLASER Professor of Mathematics, Pennsylvania State University TOMASH PUBLISHERS Copyright © 1971 by Anton Glaser Revised Edition, Copyright 1981 by Anton Glaser All rights reserved Printed in the United States of America Library of Congress Cataloging in Publication Data Glaser, Anton, 1924- History of binary and other nondecimal numeration. Based on the author's thesis (Ph. D. — Temple University), presented under the title: History of modern numeration systems. Bibliography: p. 193 Includes Index. 1. Numeration — History. I. Title QA141.2.G55 1981 513'.5 81-51176 ISBN 0-938228-00-5 AACR2 To My Wife, Ruth ACKNOWLEDGMENTS THIS BOOK is based on the author’s doctoral dissertation, History of Modern Numeration Systems, written under the guidance of Morton Alpren, Sara A. Rhue, and Leon Steinberg of Temple University in Philadelphia, Pa. Extensive help was received from the libraries of the Academy of the New Church (Bryn Athyn, Pa.), the American Philosophical Society, Pennsylvania State University, Temple University, the University of Michigan, and the University of Pennsylvania. The photograph of Figure 7 was made available by the New York Public Library; the library of the University of Pennsylvania is the source of the photographs in Figures 2 and 6. The author is indebted to Harold Hanes, Joseph E. Hofmann, Donald E. Knuth, and Brian J. Winkel, who were kind enough to communicate their comments about the strengths and weaknesses of the original edition. The present revised edition is the better for it. A special thanks is also owed to John Wagner for his careful editorial work and to Adele Clark for her thorough preparation of the Index.
    [Show full text]
  • Binary Search Algorithm Anthony Lin¹* Et Al
    WikiJournal of Science, 2019, 2(1):5 doi: 10.15347/wjs/2019.005 Encyclopedic Review Article Binary search algorithm Anthony Lin¹* et al. Abstract In In computer science, binary search, also known as half-interval search,[1] logarithmic search,[2] or binary chop,[3] is a search algorithm that finds a position of a target value within a sorted array.[4] Binary search compares the target value to an element in the middle of the array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found. If the search ends with the remaining half being empty, the target is not in the array. Binary search runs in logarithmic time in the worst case, making 푂(log 푛) comparisons, where 푛 is the number of elements in the array, the 푂 is ‘Big O’ notation, and 푙표푔 is the logarithm.[5] Binary search is faster than linear search except for small arrays. However, the array must be sorted first to be able to apply binary search. There are spe- cialized data structures designed for fast searching, such as hash tables, that can be searched more efficiently than binary search. However, binary search can be used to solve a wider range of problems, such as finding the next- smallest or next-largest element in the array relative to the target even if it is absent from the array. There are numerous variations of binary search.
    [Show full text]
  • Introduction to Algorithms
    CS 5002: Discrete Structures Fall 2018 Lecture 9: November 8, 2018 1 Instructors: Adrienne Slaughter, Tamara Bonaci Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications. They may be distributed outside this class only with the permission of the Instructor. Introduction to Algorithms Readings for this week: Rosen, Chapter 2.1, 2.2, 2.5, 2.6 Sets, Set Operations, Cardinality of Sets, Matrices 9.1 Overview Objective: Introduce algorithms. 1. Review logarithms 2. Asymptotic analysis 3. Define Algorithm 4. How to express or describe an algorithm 5. Run time, space (Resource usage) 6. Determining Correctness 7. Introduce representative problems 1. foo 9.2 Asymptotic Analysis The goal with asymptotic analysis is to try to find a bound, or an asymptote, of a function. This allows us to come up with an \ordering" of functions, such that one function is definitely bigger than another, in order to compare two functions. We do this by considering the value of the functions as n goes to infinity, so for very large values of n, as opposed to small values of n that may be easy to calculate. Once we have this ordering, we can introduce some terminology to describe the relationship of two functions. 9-1 9-2 Lecture 9: November 8, 2018 Growth of Functions nn 4096 n! 2048 1024 512 2n 256 128 n2 64 32 n log(n) 16 n 8 p 4 n log(n) 2 1 1 2 3 4 5 6 7 8 From this chart, we see: p 1 log n n n n log(n) n2 2n n! nn (9.1) Complexity Terminology Θ(1) Constant Θ(log n) Logarithmic Θ(n) Linear Θ(n log n) Linearithmic Θ nb Polynomial Θ(bn) (where b > 1) Exponential Θ(n!) Factorial 9.2.1 Big-O: Upper Bound Definition 9.1 (Big-O: Upper Bound) f(n) = O(g(n)) means there exists some constant c such that f(n) ≤ c · g(n), for large enough n (that is, as n ! 1).
    [Show full text]
  • Comparison of Parallel Sorting Algorithms
    Comparison of parallel sorting algorithms Darko Božidar and Tomaž Dobravec Faculty of Computer and Information Science, University of Ljubljana, Slovenia Technical report November 2015 TABLE OF CONTENTS 1. INTRODUCTION .................................................................................................... 3 2. THE ALGORITHMS ................................................................................................ 3 2.1 BITONIC SORT ........................................................................................................................ 3 2.2 MULTISTEP BITONIC SORT ................................................................................................. 3 2.3 IBR BITONIC SORT ................................................................................................................ 3 2.4 MERGE SORT .......................................................................................................................... 4 2.5 QUICKSORT ............................................................................................................................ 4 2.6 RADIX SORT ........................................................................................................................... 4 2.7 SAMPLE SORT ........................................................................................................................ 4 3. TESTING ENVIRONMENT .................................................................................... 5 4. RESULTS ................................................................................................................
    [Show full text]