<<

IMPLEMENTATION AND OPTIMIZATION OF CORDIC DESIGN ON MODULATION AND

1ANANT PUKALE, 2RAMYA H.R

1Digital Communication Engineering, VTU, 2Telecommunication Engineering, VTU MSRIT, Bangalore, India

Abstract- Rotation of vectors through fixed and known angles has wide applications in robotics, digital processing, graphics, games, and animation. But, we do not find any optimized coordinate rotation digital computer (CORDIC) design for vector-rotation through specific angles. Therefore, in this paper, we present optimization schemes and CORDIC circuits for fixed and known rotations with different levels of accuracy. For reducing the area- and time-complexities, we have proposed a hardwired pre-shifting scheme in barrel-shifters of the proposed circuits. Two dedicated CORDIC cells are proposed for the fixed-angle rotations. In one of those cells, micro-rotations and scaling are interleaved, and in the other they are implemented in two separate stages. Pipelined schemes are suggested further for cascading dedicated single-rotation CORDIC units for high-throughput and reduced implementations. We have the optimized set of micro-rotations for fixed and known angles. The derived optimized scale-factors from a set of micro rotations and dedicated shift-add circuits are used to implement the scaling. We have synthesized the proposed CORDIC cells using Xilinx field programmable gate- array platform and shown that the proposed design offer higher throughput and less area- product than the reference CORDIC design for fixed and known angles of rotation.

Index Terms- Coordinate rotation digital computer (CORDIC),numerically controlled oscillator(NCO), look up table (LUT)

I. INTRODUCTION some areas like , engineering application, communication by multiplication of The name CORDIC is nothing but iterative based complex number with a known complex constants. mathematics for coordinated rotation of digital Previously, CORDIC circuits were developed for computer. In 1959, Volder was the scientist who implanting complex multiplication which is more developed CORDIC iterative form of computation for often used in digital signal processing (DSP) trigonometry functions , multiplication and division. application[16]-[18], but there is no detailed Over the period of time many application of CORDIC information with respect to efficiency of CORDIC has been formed with variety of for high realization of fixed and known angle rotations and performance and low cost hardware solutions. This constant complex multiplication. paper stresses on design of CORDIC in terms fixed angle rotation and implementation of same on Latency of computation is the major issue with the modulation and demodulation units. implementation of CORDIC algorithm due to its linear-rate convergence [19]. It requires (n+1) For known and fixed angel of vector rotation has iterations to have - precision of the output. Overall wide applications in graphics, game, robotics and latency of computation increases linearly with the robotics. Modulation and demodulation can be product of the word-length and the CORDIC iteration performed by successive vector rotation through fixed period. The speed of CORDIC operations is, there- angels and translation of the links. The translation fore, constrained either by the precision requirement operation is realized by simple addition of old (iteration count) or the duration of the clock period. coordinated values to obtain new coordinates for The angle recoding (AR) schemes [5]–[9] could be rotational steps to be accomplished by suitable applied for reducing the iteration count for CORDIC successive rotation of CORDIC circuit for fixed implementation of constant complex multiplications rotation through know small angels. There are plenty by encoding the angle of rotation as a linear of examples of uniform rotation starting from combination of a set of selected elementary angles of electrons inside an atom to the planets and satellites. micro rotations. In the conventional CORDIC, any A simple example of uniform rotations is the hands of given n rotation angle is expressed as a linear an animated mechanical clock which perform one combination of values of elementary angles that degree rotation each time. There are several cases belong to the set {(σ. Arctan(2-r)):σ {-1,1} where high-speed constant rotation is required in to obtain an -bit value of games, graphic, and animation. In modeling, . simulation, animation and games are area where objects with constant rotation are more often used. However, in AR methods, this constraint is relaxed Dedicated CORDIC circuits are used to implement by adding zero into the linear combination to obtain efficiently through known small angle rotation. In the desired angle using relatively fewer terms of

Proceedings of 4th IRF International Conference, 06th April-2014, Hyderabad, India, ISBN: 978-93-84209-00-1 52 Implementation and Optimization of Cordic Design on Modulation and Demodulation for {1,0,-1}. The elementary angle presented and in VI comparion of results VII set (EAS) used by the AR scheme is given by SEAS= conclusion is presented.

Hu and Naganathan [5] have proposed an AR method II. OPTIMIZATION OF ELEMENTARY based on the greedy algorithm that tries to represent ANGLE SET the remaining angle using the closest elementary angle ±arctan2-i. .Using this recoding schemes the The rotation-mode CORDIC algorithm to rotate a T total number of iterations could be reduced to less vector U=[UXUY] through an angle Φ to obtain a T than half of the conventional CORDIC algorithm for rotated vector V=[VXVY] is given by [1], [2] the same accuracy. Wu et al. [7] have suggested an AR scheme based on an extended elementary-angle- set (EEAS), that provides a more flexible way of decomposing the target rotation angle. . EEAS has better recoding efficiency in terms of the number of iterations and can yield better error performance than Such that when n is sufficiently large the AR scheme based on EAS. But the iteration period for EEAS is longer, and involves double the numbers of adders/ in the CORDIC cell compared with that of the other. Most of the advantages gained in the AR schemes are cancelled out by the hardware and time involved in scaling the pseudo-rotated vector. Where if σi = -1 then Φi < 0 and σi=1 otherwise, and is the scale-factor of the CORDIC algorithm, given It is desirable to have exhaustive search than greedy by search for elementary angle set (EAS) due to the fact that angle of rotation for fixed rotation case is known a priori. Moreover , the complexity is reduced to almost half of that of a CORDIC. So thus we have proposed some techniques to minimize For fixed angle rotation case, the pre-computed value complexity of barrel shifter. As CORDIC is an Φi and σi indicating sign bit can be stored in sign bit sequential it is desirable to use pipelined register (SBR) in CORDIC circuit. This during the implementation which is yet to be exploited but this CORDIC iteration, CORDIC circuit need not makes it out of parallel process execution. compute rest of angle Φi.[3].

This paper tries to present optimization schemes for reducing the number of iterations of micro rotation and even reducing the complexity of barrel shifter for vector rotation in case of fixed angle. We also aiming for cascaded pipelined circuit for future use, which is expected to be involving less area delay complexity and faster than the existing approaches.

The contribution of paper are listed as below 1. Implementation of fixed angle vector rotation with optimized set of micro-rotation.

2. Scaling circuits are derived from shift-add Fig. 1. Reference CORDIC circuit for fixed rotations operation. 3. To reduce barrel shifter complexity, a hardware As shown in fig.1. and with respect to (1a) and (1b) a pre-shifting scheme is been introduced. reference CORDIC circuit for fixed rotation X0 and

4. Single rotation CORDIC circuits are designed. Y0 can b fed as set or reset input in to the input register and hence later feedback values Xi and Yi are The remainder of this paper is organized as follows. even fed parallel to input register at ith iteration. So Section II deals with the optimization of elementary with the conventional concept of CORDIC we feed in angle set for different accuracies of implementation. initial values X0 and Y0 as well as feedback values Xi Efficient circuits for implementation of micro- and Yi via a pair of MUX. rotations for fixed rotations are presented in Section III. Implementation of micro rotation. Scaling Here we can find a set of small number of pre- optimization and implementation is discussed in determined elementary angle {αi, for 0 ≤ i ≤ m-1}, -k(i) Section IV. Application on modulation and where αi = arctan (2 ) for known and fixed angle demodulation is discussed. Section V application are rotation of vector through rotation mode CORDIC

Proceedings of 4th IRF International Conference, 06th April-2014, Hyderabad, India, ISBN: 978-93-84209-00-1 53 Implementation and Optimization of Cordic Design on Modulation and Demodulation circuit which are used in ith micro rotation of selected micro-rotations is enough. In Table I, it is CORDIC algorithm(1), where m is minimum number shown that rotations through any angle in the range of micro-rotation. The rotation through any angle in 0 0<Φ<=π/4 (in odd integer degrees) could be achieved ≤ θ ≤ 2π, can be mapped on to positive rotation 0 ≤ Φ with maximum angular deviation ΔΦ=0.0370, where . ≤ π/4 without much arithmetic operation[10]. Hence ΔΦ =|Φ-ΦA| Using a maximum of two selected to perform optimization rotation mapping is done so micro-rotations, the rotations could be achieved with that rotation angle lies in the range of 0 ≤ Φ ≤ π/4. maximum angular deviation with ΔΦ=1.8750 (0.033 Thus in next step of optimization, the elementary radian). In case of six micro-rotations, angular angles {αi} as according to the requirement of deviation ΔΦ could be reduced to ~0.0005. accuracy. Thus (1) can be modified as given in (3a) and (3b) TABLE I OPTIMIZATION OF FULL ROTATIONS WITH FOUR MICRO-ROTATIONS

Such that minimum number m

Where the scale factor K depends on the set {αi}. now thus the accuracy of CORDIC algorithm depends on how closely the resultant rotation ΦA due to all micro-rotation in (1) approximated and desired rotation Φ, which are in turn factor to determine the deviation actual rotation angle from estimated value.

Algorithm 1: it gives the simple pseudo code to optimize a set of micro-rotation. Φ is the maximum accuracy which is defined as maximum tolerable error between approximated angle and desired angle is given as input, the parameters k(i) and σi are searched in optimization algorithm such that the objective function ΔΦ minimizes. The algorithm In Table II, it is shown further that rotations begins with the first micro rotation i.e. m=1, if the through 0.10<|Φ|<20 in an interval of 0.1 could be micro rotation smaller angle of deviation than Φ obtained by four micro-rotations with angular cannot be found, m is increased by 1 and optimization deviation, ~0.0030. Here we can make an observation algorithm is increased by 1. In algorithm exhaustive that we can always achieve higher accuracy with search is employed in entire parameter space for all more number of micro-rotations. From Table II, we combination of k(i) and σ . Based on obtained result i find that higher accuracy could be achieved in case of on micro rotation, the scaling parameters are searched small rotation angles like 10 or 20 , compared to the with different objective function. If optimal set of most of the larger angles when the same number of micro rotations cannot satisfy design constraint for micro-rotations is used. scaling, then sub-optimal set of micro-rotations are 0 0 used. As shown in Table 1. For angle 31 and 35 are III. IMPLEMENTATION OF MICRO- obtained by sub-optimal solutions because scaling ROTATIONS requires more terms in these two cases if optimal way is used. For a given angle of rotation the elementary angles

and direction of micro rotation are already predetermined, the angle estimation data-path is not required in the CORDIC circuit for fixed and know rotations. The control can be stored in ROM as only few micro rotation are involved corresponding to control bits. Fig.2. shows the circuit for complex constant multiplication.

The barrel shifter is used to obtain number of In the experiment with the maximum input angular required bit shift corresponding control bits and 0 deviation єΦ=0.04 , I found that a set of four direction of micro rotation is stored

Proceedings of 4th IRF International Conference, 06th April-2014, Hyderabad, India, ISBN: 978-93-84209-00-1 54 Implementation and Optimization of Cordic Design on Modulation and Demodulation TABLE II shifts and linearly with word length. We can reduce OPTIMIZATION OF SMALL ROTATIONS WITH FOUR number of stages of MUXes and effective word MICRO-ROTATIONS length of MUXes of barrel shifter by as shown in fig.3. hardwired pre-shifting in basics CORDIC. So thus load only (L - l) MSB’s of an input to barrel shifter, where l is minimum number of shifts required, anyhow l LSB’s are truncated while shifting. If s is maximum number of shifts in the set selected micro rotation then the barrel shifter require only (s-l) shifts. The add/sub unit will receive (L- l) LSB’s from barrel shifter and the l MSB’s of corresponding operand of add/sub unit are hardwired to 0.So thus reducing hardware complexity of barrel shifter by hardwired pre-shifting method. In turn it reduces time involved in barrel shifter by hardwired pre-shifting, since delay of barrel shifter is due number of stages of MUXes, and its also possible to reduce number of stages by hardwired pre-shifting. More than 75% cases in table-1 require more than one shifts for minimum number of shift l. where as in table-2 we find that l is always greater than five except at angle 1.50. the hardware complexity of barrel shifter can be substantially reduced by hardwired pre-shifting method. If number shifts are known priori, one design barrel shifter for particular number of shifts. For implementation of 4 shifts (see table1) irrespective of maximum number of shifts, the barrel shifter would require only 3 stages of MUXes of 2:1 by hardwiring the shifts.

Fig. 2. CORDIC cell for constant complex multiplications 2) High-Throughput Implementation Using Cascaded in sign-bit register (SBR). The major contributors to Multi-Stage CORDIC: the hardware- complexity in the implementation of a If we observe carefully table1 and table2, the CORDIC circuit are the barrel-shifters and the adders. implementation of small rotations has the minimum There are several options for the implementation of of 5≤ l. thus with an advantage of hardwired pre- adders [22], from which a designer can always shifting, if separate module of CORDIC cascaded in a choose depending on the constraints and requirements pipelined fashion, it would provide high throughput. of the application. But, we have some scope to To implement CORDIC with higher degree of develop techniques for reducing the complexity of accuracy without affecting the throughput of result, barrel-shifters over the conventional designs as we can cascade multistage CORDIC, each stage discussed in the followings consisting of single rotation cells.

Fig. 3. Hardwired pre-shifting in basic CORDIC module.

1) Minimization of Barrel-Shifter Complexity by Hardwired Pre-Shifting: A barrel sifter having [log2(S+1)] stages demultiplexors , where S is maximum number of shifts required on word length L, where each stage Fig. 4. (a) Multi-stage single-rotation cascaded CORDIC circuit. (b) Structure of ith rotation module.  s(i) indicates will have L number of 1:2 MUXes. Thus hardware right-shift by bit-locations. complexity increases logarithmically with number of

Proceedings of 4th IRF International Conference, 06th April-2014, Hyderabad, India, ISBN: 978-93-84209-00-1 55 Implementation and Optimization of Cordic Design on Modulation and Demodulation 3)Cascaded CORDIC with Single-Rotation Cells: As depicted in fig.4. the multiple stages of cascaded pipelined CORDIC circuit with each stage performing specific rotation in a dedicated CORDIC Replacing each term in (4) by the expression of (6), module. The functional block of each stage is as we can obtain an approximate scale-factor as a shown in fig.4(b). Each stage of rotation module will product of shift-add terms of form have pair of add/sub units. To each add/subtract unit one of pair of input is loaded directly where as other input is pre-shifted form (L-s(i)) LSB location, where s(i) being number of right shifts required in the implementation of ith micro-rotation. The s(i) MSB Where m2 is maximum number of scaling iterations locations are set to 0. The rotation module does not required for the approximation and s(i) is the number require any barrel shifter as only 1 fixed micro- of shifts performed for the ith iteration of scaling, δi rotation is required and the SBR are also not required = ±1. as each module does either add or subtract operation. As show in fig. 4 the output of preceding stage is The number of accountable terms of (6) depends on taken as input to succeeding stages. In this module desired degree of output accuracy with an critical-path amounts only one add/sub computation approximated scale factor KA [see (7)] is limited by in module. Thus latency of n-stage single number of micro rotation used. For six micro-rotation rotation cascaded is n(TA+TFF) , where TFF and TA are CORDIC implementation, where the error in Φ is D-flip flop delay and addition/subtraction time, ~0.00050 , the first two terms in (6) contribute for respectively. (0≤i≤8) , while up to the third, fourth and fifth terms contribute for (0≤i≤3), (0≤i≤2), and (0≤i≤1) , More than two-third of rotation angles in table1 respectively and similarly it can be found that for four require 3 micro rotations for maximum deviation of micro-rotation CORDIC implementation, where the Φ up to 0.040. Using only 3 stages of CORDIC circuit error in Φ is ~0.040 , only the first two terms in (6) as shown in fig 4 even the complex multiplications contribute for (0≤i≤4) , while up to the third and the can be implemented involving 3 micro rotation (n=3). fifth terms contribute for (0≤ i ≤ 2) and (0≤ i ≤ 1) , This can also be implemented using non pipelined respectively. Accordingly, we have obtained the form of (n-1) carry propagate adders having total recursive shift-add expressions of scale-factor KA in latency of where TFA and TA are, the form of (7). respectively, full adder delay and the time required for L-bit addition-time. k(i) is the number of shifts of Here is algorithm2 gives optimization method to th the i stages, , L being word-length of search the parameter k(i) and σi. on knowing the set implementation. of micro rotation from algorithm1,from (4) the ideal scaling factor can be computed. The objective IV. SCALING OPTIMIZATION AND function ΔK=[1- KA/K] gives the deviation of KA/K IMPLEMENTATION from 1. The maximum tolerable deviation K is set to be same that of algorithm1 Φ. Here the algorithm To match with the optimized set of elementary angles begins with the single term of scaling, then is for the micro-rotations, the optimization of scaling increased by 1 until ΔK becomes smaller than K.. factor is discussed here. TABLE III A. Scaling Approximation for Fixed Rotations OPTIMIZED SHIFTS TO IMPLEMENT SCALING FOR THE CASE OF ROTATION WITH FOUR MICRO- The generalized form of expression for the scaling ROTATIONS factor as given in (2) can be explicitly expressed for the selected set of micro-rotations as

where k(i) for 0 ≤ i < m1 is the number of shifts in the ith micro rotation. Except for k(i)=0 (i.e., rotation by 450 ), by binomial expansion, any term in (4) can be written as

Where x=2-2i , i being the number of shifts in a micro rotation, and can be expressed alternatively in terms of i as

Proceedings of 4th IRF International Conference, 06th April-2014, Hyderabad, India, ISBN: 978-93-84209-00-1 56 Implementation and Optimization of Cordic Design on Modulation and Demodulation

We derive here the expression of scale factors separately for 43 and 45 rotations to get scaling with desired accuracy with less number of iterations compared with the above approach.

Fig. 6. CORDIC circuit for interleaved implementation of micro-rotations as well as scaling circuit. (a) The CORDIC circuit. (b) Structure and function of line-changer. For control- bit=1 it performs micro-rotations and for control-bit=0 it performs the shift-add operations for scaling.

2)Implementation of Scaling for Cascaded Single- Rotation CORDIC:

Fig. 5. Shift-add scaling circuit using hardwired pre-shifted loading.

To have an accuracy up to , the scale- factor for rotation through 43 and 45 can be expressed as Fig. 7. Shift-add circuit for single-rotation-cascaded-scaling using hardwired pre-shifting.

Equation (8) in recursive shift add forms The dedicated pair of adder sub-tractors as shown in fig.7. it does not require any sign bit register or multiplexor. A pair of input are fed to add/sub unit with one of input fed directly where as other input is shifted to right by s(i) locations. The shift-add term B. Implementation of Scaling sign factor will decide the computation of either Micro rotations and scaling can be implemented on 2 addition or subtraction. separate stages or on same circuit. Desired level of accuracy affects the implementation of scaling and V. APPLICATION micro rotation, and even the scaling implementation depends on implementation of micro rotation. Numerically controlled oscillator (NCO): Therefore the realization of scaling circuit IQ modulation is widely used in wireless corresponding to different implantation of micro communication in order to convert the base band rotation are discussed here. signal to pass band. In a typical wireless transmitter system the base band signal is converted to pass band 1) Generalized Implementation of Scaling: signal of required carrier frequency and then applied According to (7) the shift add circuit is shown in to a DAC after which the analog wave is transmitted fig.5. for minimizing the barrel shifter complexity the in air via an antenna. This wave is received by the scaling circuit of fig.5 can be hardwired and could antenna in the receiver system and is converted to a placed after CORDIC circuit of fig.2 to perform digital signal by using an ADC. Then this pass band micro rotation and scaling in 2 separate stage. The signal is converted back to base band via generalized CORDIC circuit for fixed rotation is as demodulation schemes for further processing. NCO is shown in fig.6 can perform scaling and micro rotation basically a digital oscillator which is used to generate in interleaved manner of alternate cycles. The cell of synchronous sine and cosine carrier waves which are fig.6 is similar to that of fig.2. an addition line used in modulation and demodulation processes in changer circuit is introduced to change un-shifted wireless communication. It consists of a phase input path. As shown in fig.(6b) the structure and accumulator and a waveform generation unit as functional block of line changer. shown in the following fig 8.

Proceedings of 4th IRF International Conference, 06th April-2014, Hyderabad, India, ISBN: 978-93-84209-00-1 57 Implementation and Optimization of Cordic Design on Modulation and Demodulation wrong base band appear after demodulation. The demodulation process is followed by a low pass filter (LPF) to get back the original base band I and Q signals.

Fig8: numerically controlled oscillator block diagram Phase increment = (Fo * 2^m)/Fclk

Where Fo is the desired output frequency, m is the accumulator precision and Fclk is the clock frequency. Fig 10: demodulation block diagram

VI. COMPARISON RESULTS

Fig 9: modulation block diagram

The above diagrams depict modulation and Fig 11: comparison graph Fig demodulation schemes. Equation for modulation and demodulation are as follows: mod_data = (i_data * nco_cos) + ( q_data * nco_sin)

The phase increment is applied as the input is accumulated every clock cycle inside the phase accumulator. The waveform generation unit can be a memory based, multiplier based or CORDIC based. For our application, CORDIC is used as the waveform generation unit. The CORDIC algorithm computes the sine and cosine of an input phase value by iteratively shifting the phase angle to approximate the Cartesian coordinate values for the input angle. At the end of the CORDIC iteration, the x and y coordinates for a given angle represent the cosine and sine of that angle, respectively. These sinusoidal waves are used to modulate the base band I and Q 12: comparison table signals and demodulate to get back I and Q signals as well. The numerically controlled oscillator as show in fig 11, the chart manhattans graph depicts that number of demod_i_data = mod_data * nco_cos flip flop required are reduced almost 3 times in demod_q_data = mod_data * nco_sin proposed system. Number of look up table required are also reduced drastically almost 3 times in The same NCO which has been used for modulation proposed system and power requirement is also should be used for demodulation as well. Otherwise reduced almost half in proposed system

Proceedings of 4th IRF International Conference, 06th April-2014, Hyderabad, India, ISBN: 978-93-84209-00-1 58 Implementation and Optimization of Cordic Design on Modulation and Demodulation CONCLUSION [10] K. Maharatna, S. Banerjee, E. Grass, M. Krstic, and A. Troya, “Modified virtually scaling free adaptive CORDIC rotator algorithm and architecture,” IEEE Trans. Circuits The proposed CORDIC as used in modulation and Syst. for Video Technol., vol. 15, no. 11, pp. 1463–1474, demodulation of incoming in-phase and quadrature Nov. 2005. phase data are modulated using sine and cosine waves generated by CORDIC waveform generator and a [11] C. Y. Kang and E. E. Swartzlander, Jr., “Digit-pipelined direct digital frequency synthesis based on differential reverse process is performed to retrieve the original CORDIC,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. fed data. The above fig12: table of comparison shows 53, no. 5, pp. 1035–1044, May 2006. that number of flip flop , number of 4 input LUT , number of shift register and power(mW) are [12] L. Vachhani, K. Sridharan, and P. K. Meher, “Efficient CORDIC and architectures for low area and high drastically reduced so thus improving throughput throughput implementation,” IEEE Trans. Circuits Syst. II, Less latency, Less ADP (Area Delay Product = Exp. Briefs, vol. 56, no. 1, pp. 61–65, Jan. 2009. Number of LUTs x CLK Period of design). As show in above comparison table it is clear that the number [13] B. G. Blundell, An Introduction to and Creative 3-D Environments. London, U.K.: Springer- of resources required is drastically less than the Verlag, 2008. conventional radix 4 CORDIC process than the processed system. [14] T. Lang and E. Antelo, “High-throughput CORDIC-based geometry operations for 3D computer graphics,” IEEE Trans. Comput., vol. 54, no. 3, pp. 347–361, Mar. 2005. REFERENCES [15] M.-P. Cani, T. Igarashi, and G.Wyvill, Interactive [1] J. E. Volder, “The CORDIC trigonometric computing ShapeDesign. San Rafael, CA: Morgan & Claypool, 2007. technique,” IRE Trans. Electron. Comput., vol. EC-8, pp. 330–334, Sep. 1959. [16] K. J. Jones, “2D systolic solution to discrete Fourier transform,” IEEE Proc. Comput. Digit. Techn., vol. 136, no. [2] J. S. Walther, “A unified algorithm for elementary 3, pp. 211–216,May 1989. functions,” in Proc. 38th Spring Joint Comput. Conf., 1971, pp. 379–385. [17] P. K. Meher, J. K. Satapathy, and G. Panda, “Efficient systolic solution for a new prime factor discrete Hartley [3] Y. H. Hu, “CORDIC-based VLSI architectures for digital transformalgorithm,” IEEE Proc. Circuits, Devices Syst., signal processing,” IEEE Signal Process. Mag., vol. 9, no. 3, vol. 140, no. 2, pp. 135–139, Apr. 1993. pp. 16–35, Jul. 1992. [18] S. Freeman and M. O’Donnell, “A complex arithmetic [4] P. K. Meher, J. Valls, T.-B. Juang,K. Sridharan, andK. digital signal using rotators,” in Proc. IEEE Maharatna, “50 years of CORDIC: Algorithms, Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1995, architectures and applications,” IEEE Trans. Circuits Syst. I, pp. 3191–3194. Reg. Papers, vol. 56, no. 9, pp. 1893–1907, Sep. 2009. [19] J. R. Cavallaro and F. T. Luk, “CORDIC arithmetic for a [5] Y. H. Hu and S. Naganathan, “An angle recoding method SVD processor,” J. for Parallel Distrib. Comput., vol. 5, pp. for CORDIC algorithm implementation,” IEEE Trans. 271–290, 1988. Comput., vol. 42, no. 1, pp.99–102, Jan. 1993. [6] Y. H. Hu and H. H.M. Chern, “A novel implementation of [20] C. H. Lin and A. Y. Wu, “Mixed-scaling-rotation CORDIC CORDIC algorithmusing backward angle recoding (BAR),” (MSRCORDIC) algorithm and architecture for high- IEEE Trans.Comput., vol. 45, no. 12, pp. 1370–1378, Dec. performance vector rotational DSP applications,” IEEE 1996. Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 11, pp. 2385–2396, Nov. 2005. [7] C.-S.Wu, A.-Y.Wu, and C.-H. Lin, “A high- performance/low-latency vector rotational CORDIC [21] S. Y. Park and N. I. Cho, “Fixed-point error analysis of architecture based on extended elementary angle set and CORDIC processor based on the variance propagation trellis-based searching schemes,” IEEE Trans. Circuits Syst. formula,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 51, II, Analog Digit. Signal Process., vol. 50, no. 9, pp. 589– no. 3, pp. 573–584, Mar. 2004. 601, Sep. 2003. [22] A. R. Omondi, Computer Arithmetic Systems: Algorithms, [8] T.-B. Juang, S.-F. Hsiao, and M.-Y. Tsai, “Para-CORDIC: Architectures, and Implementations. New York: Prentice- Parallel cordic rotation algorithm,” IEEE Trans. Circuits Hall, 1994. Syst. I, Reg. Papers, vol. 51, no. 8, pp. 1515–1524, Aug. 2004. [23] A. K. Jain, Fundamentals of . Englewood Cliffs, NJ: Prentice-Hall, 1989. [9] T. K. Rodrigues and E. E. Swartzlander, “Adaptive CORDIC: Using parallel angle recoding to accelerate [24] Signal Compression : Coding of Speech, Audio, Text, CORDIC rotations,” in Proc. 40th Asilomar Conf. Signals, Image And Video,ser. Selected Topics in and Syst. Comput. (ACSSC), 2006, pp. 323–327. Systems, N. E. Jayant, Ed. Singapore: World Scientific, 1997, vol. 9.



Proceedings of 4th IRF International Conference, 06th April-2014, Hyderabad, India, ISBN: 978-93-84209-00-1 59