Hardware Implementations of Fixed-Point Atan2 Florent De Dinechin, Matei Istoan
To cite this version: Florent de Dinechin, Matei Istoan. Hardware implementations of fixed-point Atan2. 22nd IEEE Symposium on Computer Arithmetic, Jun 2015, Lyon, France. hal-01091138

HAL Id: hal-01091138, https://hal.inria.fr/hal-01091138. Submitted on 4 Dec 2014. Distributed under a Creative Commons Attribution 4.0 International License.

Université de Lyon, INRIA, INSA-Lyon, CITI-INRIA, F-69621, Villeurbanne, France

This work is partly supported by the MetaLibm project (ANR-13-INSE-0007) of the French Agence Nationale de la Recherche.

Abstract: The atan2 function computes the polar angle $\arctan(y/x)$ of a point given by its cartesian coordinates. It is widely used in digital signal processing to recover the phase of a signal. This article studies, for this context, the implementation of atan2 with fixed-point inputs and outputs. It compares the prevalent CORDIC shift-and-add algorithm to two multiplier-based techniques. The first one reduces the bivariate atan2 function to two functions of one variable: the reciprocal and the arctangent. These two functions may be tabulated, or evaluated using bipartite or polynomial approximation methods. The second technique directly uses piecewise bivariate polynomial approximations, in degree 1 and degree 2. It requires larger tables but has the shortest latency. Each of these approaches requires a relevant argument reduction, which is also discussed. All the algorithms are described with the same accuracy target (faithful rounding) and implemented with similar care in an open-source library. Based on synthesis results on FPGAs, their relevance domains are discussed.

I. INTRODUCTION

A. Definitions and notations

Throughout this article we implement the function $\mathrm{atan2}(y,x) = \arctan(y/x)$ with fixed-point inputs and outputs. This function (part of the standard mathematical library) returns an angle in $[-\pi, \pi]$. Compared to a plain composition of division and the arctan function (which returns an angle in $[-\pi/2, \pi/2]$), $\mathrm{atan2}(y,x)$ keeps track of the respective signs of $x$ and $y$. It is used to compute the phase of a complex number $x + iy$.

Fig. 1. Fixed-point $\arctan(y/x)$: a point $(x,y)$ with both coordinates between $-1$ and $1$, and its angle $\alpha = \mathrm{atan2}(y,x)$.

We will denote by $w$ the input and output width, in bits. As $\arctan(ky/kx) = \arctan(y/x)$, the range of the inputs is not really relevant, as long as both $x$ and $y$ are in the same format. We choose to have both inputs as fixed-point numbers in the range $[-1, 1)$, represented classically in two's complement.

There are two sensible choices for the output.
• It may be expressed in radians, from $-\pi$ to $\pi$;
• It may be expressed in $[-1, 1)$, in which case the function being computed is $\frac{1}{\pi}\mathrm{atan2}(y,x)$. Throughout this article, we will refer to this option as "binary angles".

There are several reasons for preferring binary angles. The first is that it fully uses the representation space. An output on $[-\pi, \pi]$ will be coded on a fixed-point format that may represent the interval $[-4, 4)$. Therefore, all the codes between $\pi$ and $4$ (and likewise between $-4$ and $-\pi$) will never be used.

The second is that it slightly simplifies implementation: the modulo-$2\pi$ periodicity of radian angles becomes a modulo-2 periodicity that comes for free in two's complement binary arithmetic. For instance, the fact that the two's complement output range $[-1, 1)$ is slightly asymmetrical is not a problem, as modulo-2 arithmetic will map angle $1 \times \pi$ to angle $-1 \times \pi$ as it should. We will see that it also saves some computations in the range reduction, and allows it to be exact, where radian angles require computations with the constant $\pi/2$, which cannot be represented exactly in binary.

The third reason for preferring binary angles is that it generally simplifies the application context as well. For instance, in QPSK decoding, the atan2 values of incoming samples are averaged to estimate a phase shift. Computing this average is simpler with binary angles than with radian angles, again because it is a modulo operation.

Therefore, we implement in this article the function

$f(x,y) = \frac{1}{\pi}\,\mathrm{atan2}(y,x)$.

However, all the algorithms in this paper may be straightforwardly adapted to radian angles with only minor overhead.

B. Overview of hardware implementation methods

This paper compares several hardware implementation techniques for fixed-point atan2, some classical, some new. To ensure a meaningful comparison, each technique is implemented with the same care and with the same accuracy objective: $f(x,y)$ is computed with last-bit accuracy, i.e. the error (the difference between the returned result and the infinitely accurate one) is less than the weight of the LSB of the result. In other words, all the architectures presented are as accurate as their output format allows. They are also almost bit-for-bit compatible with each other (differing by at most one bit in the last place), which enables a fair comparison of methods.

All these techniques begin with a range reduction (detailed in Section II) to the first octant defined by $\{0 \le x \le 1;\ 0 \le y \le 1;\ y \le x\}$ (see Figure 2).

The most classical technique for fixed-point hardware implementation of atan2 is CORDIC. It performs a series of micro-rotations of the point $(x,y)$ to send it onto the $x$ axis. The angle of each micro-rotation is chosen so that it can be implemented by shift and add. CORDIC is studied in Section III, which contributes a fine error and range analysis that guarantees last-bit accuracy at the smallest possible datapath width.

An option would be to simply tabulate all the values of atan2 in a table addressed by a concatenation of $x$ and $y$. However, this table would have $2^{2w}$ entries of $w$ bits each, which is impractical even for small values of $w$. Besides, classical table-compression techniques (bipartite, etc.) do not apply here, since they address functions of one variable, whereas $f(x,y)$ is a function of two variables.

The next technique, studied in Section IV, is to decompose the computation in two steps: first compute the division $z = y/x$, then compute $\arctan(z)$. It is possible to compute the division $y/x$ by first computing the reciprocal of $x$, then multiplying it by $y$, as illustrated in Figure 5. Thus, the computation of atan2, a function of two variables, is reduced to the computation of two functions of one variable. There is a wide body of techniques that can be used to compute functions of one variable. However, we now have the problem that $z$ may become very large when $x$ comes near to zero. This would require either a floating-point format, or a very large fixed-point format for $z$. This can be avoided by a scaling-based range reduction that will be detailed in Section II.

The last and most original technique, studied in Section V, is the use of a piecewise approximation by bivariate polynomials.

bivariate approach having shorter latency, but larger tables. The main objective of this article is therefore to study the relevance domain of each method in Section VI.

This work essentially focuses on FPGAs. An unexpected result is that, even on modern FPGAs enhanced with DSP blocks and memories, CORDIC is a clear winner. This will be analyzed in Section VI. However, this work also contributes an open-source VHDL generator that covers all the presented methods, so that comparisons can be performed in other contexts.

C. Notations for fixed-point formats

Throughout the article, we will describe a signed fixed-point format by sfix(m, l) and an unsigned fixed-point format by ufix(m, l). The two integers $m$ and $l$ denote respectively the weights of the most and least significant bits (MSB and LSB) of the format. For instance, our inputs and outputs in $[-1, 1)$ on $w$ bits will be on the format sfix$(0, -w+1)$: the sign bit has weight $-2^{0}$ and the LSB has weight $2^{-w+1}$.

II. RANGE REDUCTIONS

A. Using parity and symmetries

As arctan is an odd function, inputs may be straightforwardly reduced to the positive quadrant. Besides, there is a symmetry between $x$ and $y$:

$\arctan(y/x) = \pi/2 - \arctan(x/y)$.

If $y > x$, we may therefore swap $x$ and $y$, so the computation is reduced to the first octant, i.e.

$x \ge 0,\ y \ge 0,\ y \le x$.

Out of the first quadrant, one must use similar range reduction formulae, one example being given by Figure 2.

In these formulae, a non-radian output is beneficial. Indeed, the final addition of $\pi/2$ becomes an addition of $1/2$ to a number in $[-1, 1)$. This is an exact operation that involves only a one-bit carry propagation. Conversely, adding $\pi/2$ requires a full-width carry propagation, while entailing a systematic error due to the
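The range reduction by parity and symmetry can be illustrated by the following Python sketch. It is a floating-point reference model, not a hardware architecture. It assumes one possible arrangement of the reconstruction formulae (Figure 2 may arrange them differently), and the function name `atan2_binary_reduced` is hypothetical. With binary angles, the constants 1/2 and 1 used in the reconstruction are exactly representable, which is the benefit discussed in the text.

```python
import math

def atan2_binary_reduced(y, x):
    """Binary angle atan2(y, x)/pi computed through a first-octant reduction.

    Illustrative model only: the core arctangent is evaluated on an
    argument in [0, 1], and symmetries rebuild the full-circle angle.
    """
    ax, ay = abs(x), abs(y)
    swap = ay > ax                       # second octant: exchange the roles of x and y
    xr, yr = (ay, ax) if swap else (ax, ay)
    # Core evaluation on the first octant, result in [0, 1/4].
    core = math.atan(yr / xr) / math.pi if xr != 0 else 0.0
    angle = 0.5 - core if swap else core  # arctan(y/x) = pi/2 - arctan(x/y), scaled by 1/pi
    if x < 0:
        angle = 1.0 - angle               # mirror across the y axis
    if y < 0:
        angle = -angle                    # arctan is odd
    return angle

# Compare against the library function on one point per quadrant and octant.
for y, x in [(0.3, 0.8), (0.8, 0.3), (0.3, -0.8), (-0.9, 0.2), (-1.0, -1.0)]:
    print(y, x, atan2_binary_reduced(y, x), math.atan2(y, x) / math.pi)
```

Only the core evaluation needs an approximation method; every other step is a comparison, a swap, a negation or the addition of a constant.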
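The fixed-point conventions of Sections I-A and I-C can likewise be illustrated by a small reference model. This is a sketch under stated assumptions: the helper names `sfix_range`, `to_sfix` and `binary_atan2` are hypothetical, and round-to-nearest is used for simplicity, whereas the architectures of this article target faithful rounding.

```python
import math
from fractions import Fraction

def sfix_range(m, l):
    """(min, max, lsb) of sfix(m, l): sign bit of weight -2**m, LSB of weight 2**l."""
    lsb = Fraction(2) ** l
    return -Fraction(2) ** m, Fraction(2) ** m - lsb, lsb

def to_sfix(value, w):
    """Integer code of value in sfix(0, -w+1), wrapped modulo 2**w (two's complement)."""
    half = 1 << (w - 1)              # weight of the sign bit, in LSB units
    code = round(value * half)       # round to nearest (an assumption, see above)
    return (code + half) % (1 << w) - half

def binary_atan2(y, x, w):
    """Reference model of f(x, y) = atan2(y, x)/pi on w bits, as an integer code."""
    return to_sfix(math.atan2(y, x) / math.pi, w)

w = 8
print(sfix_range(0, -w + 1))         # range and resolution of the w-bit format
print(sfix_range(2, -w + 3))         # a w-bit format covering [-4, 4), as radians would need
print(binary_atan2(1.0, 1.0, w))     # binary angle 1/4, i.e. pi/4 radians
print(binary_atan2(0.0, -1.0, w))    # binary angle 1 wraps to the code of -1
```

The last call shows the modulo-2 wrap at work: the angle $1 \times \pi$ has no code in $[-1, 1)$, and two's complement arithmetic maps it to $-1 \times \pi$ without any extra logic.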
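The micro-rotation principle of CORDIC mentioned in Section I-B can be sketched in the same spirit. The following is a floating-point model of the vectoring mode for a first-octant input. It does not reproduce the fixed-point datapath or the error analysis of Section III, and the function name and the default iteration count are assumptions made for illustration. In hardware, the multiplications by $2^{-i}$ are shifts and the constants $\arctan(2^{-i})/\pi$ are precomputed.

```python
import math

def cordic_binary_atan2(y, x, iterations=16):
    """Approximate atan2(y, x)/pi for 0 <= y <= x, x > 0, by micro-rotations."""
    angle = 0.0
    for i in range(iterations):
        d = 1 if y >= 0 else -1          # rotate towards the x axis
        shift = 2.0 ** -i                # a right shift by i bits in hardware
        # Micro-rotation by -d * arctan(2**-i), up to a scaling of the vector length
        # that does not affect the angle.
        x, y = x + d * y * shift, y - d * x * shift
        angle += d * math.atan(shift) / math.pi   # tabulated constant in hardware
    return angle

print(cordic_binary_atan2(0.3, 0.8), math.atan2(0.3, 0.8) / math.pi)
```

Each iteration adds roughly one bit of accuracy to the accumulated angle, which is why the iteration count grows with the output width $w$.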