A Survey of CORDIC Algorithms for FPGAs

A survey of CORDIC algorithms for FPGA based computers

Ray Andraka
Andraka Consulting Group, Inc
16 Arcadia Drive
North Kingstown, RI 02852
401/884-7930 FAX 401/884-7950
email:[email protected]

1. ABSTRACT

The current trend back toward hardware intensive signal processing has uncovered a relative lack of understanding of hardware signal processing architectures. Many hardware efficient algorithms exist, but these are generally not well known due to the dominance of software systems over the past quarter century. Among these algorithms is a set of shift-add algorithms collectively known as CORDIC for computing a wide range of functions including certain trigonometric, hyperbolic, linear and logarithmic functions. While there are numerous articles covering various aspects of CORDIC algorithms, very few survey more than one or two, and even fewer concentrate on implementation in FPGAs. This paper attempts to survey commonly used functions that may be accomplished using a CORDIC architecture, explain how the algorithms work, and explore implementation specific to FPGAs.

1.1 Keywords

CORDIC, sine, cosine, vector magnitude, polar conversion

2. INTRODUCTION

The digital signal processing landscape has long been dominated by microprocessors with enhancements such as single cycle multiply-accumulate instructions and special addressing modes. While these processors are low cost and offer extreme flexibility, they are often not fast enough for truly demanding DSP tasks. The advent of reconfigurable logic computers permits the higher speeds of dedicated hardware solutions at costs that are competitive with the traditional software approach. Unfortunately, algorithms optimized for these microprocessor based systems do not usually map well into hardware. While hardware-efficient solutions often exist, the dominance of the software systems has kept those solutions out of the spotlight. Among these hardware-efficient algorithms is a class of iterative solutions for trigonometric and other transcendental functions that use only shifts and adds to perform. The trigonometric functions are based on vector rotations, while other functions such as square root are implemented using an incremental expression of the desired function. The trigonometric algorithm is called CORDIC, an acronym for COordinate Rotation DIgital Computer. The incremental functions are performed with a very simple extension to the hardware architecture, and while not CORDIC in the strict sense, are often included because of the close similarity. The CORDIC algorithms generally produce one additional bit of accuracy for each iteration.

The trigonometric CORDIC algorithms were originally developed as a digital solution for real-time navigation problems. The original work is credited to Jack Volder [4,9]. Extensions to the CORDIC theory based on work by John Walther[1] and others provide solutions to a broader class of functions. The CORDIC algorithm has found its way into diverse applications including the 8087 math coprocessor[7], the HP-35 calculator, radar signal processors[3] and robotics. CORDIC rotation has also been proposed for computing Discrete Fourier[4], Discrete Cosine[4], Discrete Hartley[10] and Chirp-Z [9] transforms, filtering[4], Singular Value Decomposition[14], and solving linear systems[1].

This paper attempts to survey the existing CORDIC and CORDIC-like algorithms with an eye toward implementation in Field Programmable Gate Arrays (FPGAs). First a brief description of the theory behind the algorithm and the derivation of several functions is presented. Then the theory is extended to the so-called unified CORDIC algorithms, after which implementation of FPGA CORDIC processors is discussed.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or [email protected].
FPGA 98 Monterey CA USA
Copyright 1998 ACM 0-89791-978-5/98/01..$5.00

3. CORDIC THEORY: AN ALGORITHM FOR VECTOR ROTATION

All of the trigonometric functions can be computed or derived from functions using vector rotations, as will be discussed in the following sections. Vector rotation can also be used for polar to rectangular and rectangular to polar conversions, for vector magnitude, and as a building block in certain transforms such as the DFT and DCT. The CORDIC algorithm provides an iterative method of performing vector rotations by arbitrary angles using only shifts and adds. The algorithm, credited to Volder[4], is derived from the general (Givens) rotation transform:

$$x' = x \cos\phi - y \sin\phi$$
$$y' = y \cos\phi + x \sin\phi$$

which rotates a vector in a Cartesian plane by the angle $\phi$. These can be rearranged so that:

$$x' = \cos\phi \cdot [x - y \tan\phi]$$
$$y' = \cos\phi \cdot [y + x \tan\phi]$$

So far, nothing is simplified. However, if the rotation angles are restricted so that $\tan(\phi) = \pm 2^{-i}$, the multiplication by the tangent term is reduced to a simple shift operation. Arbitrary angles of rotation are obtainable by performing a series of successively smaller elementary rotations. If the decision at each iteration, $i$, is which direction to rotate rather than whether or not to rotate, then the $\cos(\delta_i)$ term becomes a constant (because $\cos(\delta_i) = \cos(-\delta_i)$). The iterative rotation can now be expressed as:

$$x_{i+1} = K_i [x_i - y_i \cdot d_i \cdot 2^{-i}]$$
$$y_{i+1} = K_i [y_i + x_i \cdot d_i \cdot 2^{-i}]$$

where:

$$K_i = \cos(\tan^{-1} 2^{-i}) = 1 / \sqrt{1 + 2^{-2i}}$$
$$d_i = \pm 1$$

Removing the scale constant from the iterative equations yields a shift-add algorithm for vector rotation. The product of the $K_i$'s can be applied elsewhere in the system or treated as part of a system processing gain. That product approaches 0.6073 as the number of iterations goes to infinity. Therefore, the rotation algorithm has a gain, $A_n$, of approximately 1.647. The exact gain depends on the number of iterations, and obeys the relation

$$A_n = \prod_{n} \sqrt{1 + 2^{-2i}}$$

The angle of a composite rotation is uniquely defined by the sequence of the directions of the elementary rotations. That sequence can be represented by a decision vector. The set of all possible decision vectors is an angular measurement system based on binary arctangents. Conversions between this angular system and any other can be accomplished using a look-up. A better conversion method uses an additional adder-subtractor that accumulates the elementary rotation angles at each iteration. The elementary angles can be expressed in any convenient angular unit. Those angular values are supplied by a small lookup table (one entry per iteration) or are hardwired, depending on the implementation. The angle accumulator adds a third difference equation to the CORDIC algorithm:

$$z_{i+1} = z_i - d_i \cdot \tan^{-1}(2^{-i})$$

Obviously, in cases where the angle is useful in the arctangent base, this extra element is not needed.

The CORDIC rotator is normally operated in one of two modes. The first, called rotation by Volder[4], rotates the input vector by a specified angle (given as an argument). The second mode, called vectoring, rotates the input vector to the x axis while recording the angle required to make that rotation.

In rotation mode, the angle accumulator is initialized with the desired rotation angle. The rotation decision at each iteration is made to diminish the magnitude of the residual angle in the angle accumulator. The decision at each iteration is therefore based on the sign of the residual angle after each step. Naturally, if the input angle is already expressed in the binary arctangent base, the angle accumulator may be eliminated. For rotation mode, the CORDIC equations are:

$$x_{i+1} = x_i - y_i \cdot d_i \cdot 2^{-i}$$
$$y_{i+1} = y_i + x_i \cdot d_i \cdot 2^{-i}$$
$$z_{i+1} = z_i - d_i \cdot \tan^{-1}(2^{-i})$$

where

$$d_i = -1 \text{ if } z_i < 0, \; +1 \text{ otherwise}$$

which provides the following result:

$$x_n = A_n [x_0 \cos z_0 - y_0 \sin z_0]$$
$$y_n = A_n [y_0 \cos z_0 + x_0 \sin z_0]$$
$$z_n = 0$$
$$A_n = \prod_{n} \sqrt{1 + 2^{-2i}}$$

In the vectoring mode, the CORDIC rotator rotates the input vector through whatever angle is necessary to align the result vector with the x axis. The result of the vectoring operation is a rotation angle and the scaled magnitude of the original vector (the x component of the result). The vectoring function works by seeking to minimize the y component of the residual vector at each rotation. The sign of the residual y component is used to determine which direction to rotate next. If the angle accumulator is initialized with zero, it will contain the traversed angle at the end of the iterations. In vectoring mode, the CORDIC equations are:

$$x_{i+1} = x_i - y_i \cdot d_i \cdot 2^{-i}$$
$$y_{i+1} = y_i + x_i \cdot d_i \cdot 2^{-i}$$
$$z_{i+1} = z_i - d_i \cdot \tan^{-1}(2^{-i})$$

where

$$d_i = +1 \text{ if } y_i < 0, \; -1 \text{ otherwise.}$$

Then:

$$x_n = A_n \sqrt{x_0^2 + y_0^2}$$
$$y_n = 0$$
$$z_n = z_0 + \tan^{-1}(y_0 / x_0)$$
$$A_n = \prod_{n} \sqrt{1 + 2^{-2i}}$$

The CORDIC rotator described is usable to compute several trigonometric functions directly and others indirectly. Judicious choice of initial values and modes permits direct computation of sine, cosine, arctangent, vector magnitude and transformations between polar and Cartesian coordinates.

3.1 Sine and Cosine

The rotational mode CORDIC operation can simultaneously compute the sine and cosine of the input angle. Setting the y component of the input vector to zero reduces the rotation mode result to:

$$x_n = A_n \cdot x_0 \cos z_0$$
$$y_n = A_n \cdot x_0 \sin z_0$$

By setting $x_0$ equal to $1/A_n$, the rotation produces the unscaled sine and cosine of the angle argument, $z_0$. Very often, the sine and cosine values modulate a magnitude value. Using other techniques (e.g., a look up table) requires a pair of multipliers to obtain the modulation. The CORDIC technique performs the multiply as part of the rotation operation, and therefore eliminates the need for a pair of explicit multipliers.
