A Dual-Purpose Real/Complex Logarithmic Number System ALU

Mark G. Arnold
Computer Science and Engineering, Lehigh University

Sylvain Collange
ELIAUS, Université de Perpignan Via Domitia

Abstract: The real Logarithmic Number System (LNS) allows fast and inexpensive multiplication and division but more expensive addition and subtraction as precision increases. Recent advances in higher-order and multipartite table methods, together with cotransformation, allow real LNS ALUs to be implemented effectively on FPGAs for a wide variety of medium-precision special-purpose applications. The Complex LNS (CLNS) is a generalization of LNS which represents complex values in log-polar form. CLNS is a more compact representation than traditional rectangular methods, reducing the cost of busses and memory in intensive complex-number applications like the FFT; however, prior CLNS implementations were either slow CORDIC-based or expensive 2D-table-based approaches. This paper attempts to leverage the recent advances made in real-valued LNS units for the more specialized context of CLNS. This paper proposes a novel approach to reduce the cost of CLNS addition by re-using a conventional real-valued LNS ALU with specialized CLNS hardware that is smaller than the real-valued LNS ALU to which it is attached. The resulting ALU is much less expensive than prior fast CLNS units at the cost of some extra delay. The extra hardware added to the ALU is for trigonometric-related functions, and may be useful in LNS applications other than CLNS. The novel algorithm proposed here is implemented using the FloPoCo library (which incorporates recent HOTBM advances in function-unit generation), and FPGA synthesis results are reported.

Keywords: Complex Arithmetic, Logarithmic Number System, hardware function evaluation, FPGA.

I. INTRODUCTION

The usual approach to complex arithmetic works with pairs of real numbers that represent points in a rectangular-coordinate system. To multiply two complex numbers, denoted in this paper by upper-case variables, $\bar{X}$ and $\bar{Y}$, using rectangular coordinates, involves four real multiplications:

\[ \Re[\bar{X}\bar{Y}] = \Re[\bar{X}] \cdot \Re[\bar{Y}] - \Im[\bar{X}] \cdot \Im[\bar{Y}] \tag{1} \]
\[ \Im[\bar{X}\bar{Y}] = \Re[\bar{X}] \cdot \Im[\bar{Y}] + \Im[\bar{X}] \cdot \Re[\bar{Y}], \tag{2} \]

where $\Re[\bar{X}]$ is the real part and $\Im[\bar{X}]$ is the imaginary part of a complex value, $\bar{X}$. The bar indicates this is an exact linearly-represented unbounded-precision number. There are many implementation alternatives for the real arithmetic in (1) and (2): fixed-point, Floating-Point (FP), or more unusual systems like the scaled Residue Number System (RNS) [8] or the real-Logarithmic Number System (LNS) [19]. Swartzlander et al. [19] analyzed fixed-point, FP, and LNS usage in an FFT implemented with rectangular coordinates, and found several advantages to using logarithmic arithmetic to represent $\Re[\bar{X}]$ and $\Im[\bar{X}]$. Several hundred papers [26] have considered conventional, real-valued LNS; most have found some advantages for low-precision implementation of multiply-rich algorithms, like (1) and (2).

This paper considers an even more specialized number system in which complex values are represented in log-polar coordinates. The advantage is that the cost of complex multiplication and division is reduced even more than in rectangular LNS at the cost of a very complicated addition algorithm. This paper proposes a novel approach to reduce the cost of this log-polar addition algorithm by using a conventional real-valued LNS ALU, together with additional hardware that is less complex than the real-valued LNS ALU to which it is attached.

Arnold et al. [1] introduced a generalization of logarithmic arithmetic, known as the Complex-Logarithmic Number System (CLNS), which represents each complex point in log-polar coordinates. CLNS was inspired by a nineteenth-century paper by Mehmke [16] on manual usage of such log-polar representation, and shares only the polar aspect with highly-unusual complex-level-index representation [20]. The initial CLNS implementations, like the manual approach, were based on straight table lookup, which grows quite expensive as precision increases since the lookup involves a two-dimensional table.

Cotransformation [2] cuts the cost of CLNS significantly by converting difficult cases for addition and subtraction into easier cases, but the table sizes are still large.

Lewis [14] overcame such large area requirements in the design of a 32-bit CLNS ALU by using a CORDIC algorithm, which is significantly less expensive; however, the implementation involves many steps, making it rather slow. Despite the implementation cost, CLNS may be preferred in certain applications. For example, Arnold et al. [3], [4] showed that CLNS is significantly more compact than a comparable rectangular fixed-point representation for a radix-two FFT, making the memory and busses in the system much less expensive. These savings counterbalance the extra cost of the CLNS ALU if the precision requirements are low enough. Vouzis et al. [21] analyzed a CLNS approach for Orthogonal Frequency Division Multiplexing (OFDM) demodulation of Ultra-Wide Band (UWB) receivers. To reduce the implementation cost of such CLNS applications, Vouzis et al. [22] proposed using a Range-Addressable Lookup Table (RALUT), similar to [17].

II. REAL-VALUED LNS

The format of the real-valued LNS [18] has a base-$b$ (usually, $b = 2$) logarithm (consisting of $k_L$ signed integer bits and $f_L$ fractional bits) to represent the absolute value of a real number and often an additional sign bit to allow for that real to be negative. We can describe the ideal logarithmic transformation and the quantization with distinct notations. For an arbitrary non-zero real, $\bar{x}$, the ideal (infinite precision) LNS value is $x_L = \log_b |\bar{x}|$. The exact linear $\bar{x}$ has been transformed into an exact but non-linear $x_L$, which is then quantized as $\hat{x}_L = \lfloor x_L 2^{f_L} + 0.5 \rfloor 2^{-f_L}$. In analogy to the FP number system, the $k_L$ integer bits behave like an FP exponent (negative exponents mean smaller than unity; positive exponents mean larger than unity); the $f_L$ fractional bits are similar to the FP mantissa. Multiplication or division simply require adding or subtracting the logarithms and exclusive-ORing the sign bits.

To compute real LNS sums, a conventional LNS ALU computes a special function, known as $s_b(z) = \log_b(1 + b^z)$, which, in effect, increments the LNS representation by 1.0. (The $s$ reminds us this function is only used for sums when the sign bits are the same.) The LNS addition algorithm starts with $x_L = \log_b |\bar{x}|$ and $y_L = \log_b |\bar{y}|$ already in LNS format. The result, $t_L = \log_b(\bar{x} + \bar{y})$, is computed as $t_L = y_L + s_b(x_L - y_L)$, which is justified by the fact that $\bar{t} = \bar{x} + \bar{y} = \bar{y}(\bar{x}/\bar{y} + 1)$, even though such a step never occurs in the hardware. Since $\bar{x} + \bar{y} = \bar{y} + \bar{x}$, we can always choose $z_L = -|x_L - y_L|$ [18] by simultaneously choosing the maximum of $x_L$ and $y_L$. In other words, $t_L = \max(x_L, y_L) + s_b(-|x_L - y_L|)$, thereby restricting $z < 0$.

Analogously to $s_b$, there is a similar function to compute the logarithm of the differences with $t_L = \max(x_L, y_L) + d_b(-|x_L - y_L|)$, where $d_b(z) = \log_b |1 - b^z|$. The decision whether to compute $s_b$ or $d_b$ is based on the signs of the real values.

Early LNS implementations [18] used ROM to look up $\hat{s}_b$ and $\hat{d}_b$, achieving moderate ($f_L = 12$ bits) accuracy. More recent methods [11], [12], [13], [23], [24] can obtain accuracy near single-precision floating point at reasonable cost. The goal here is to leverage the recent advances made in real-valued [...]

[...] back exactly to rectangular form:

\[ \Re[\bar{X}] = b^{X_L} \cos(X_\theta), \qquad \Im[\bar{X}] = b^{X_L} \sin(X_\theta). \tag{4} \]

In a practical implementation, both the logarithm and angle will be quantized:

\[ \hat{X}_L = \lfloor X_L 2^{f_L} + 0.5 \rfloor 2^{-f_L}, \qquad \hat{X}_\theta = \lfloor X_\theta 2^{f_\theta + 2}/\pi + 0.5 \rfloor 2^{-f_\theta}. \tag{5} \]

We scale the angle by $4/\pi$ so that the quantization near the unit circle will be roughly the same in the angular and radial axes when $f_L = f_\theta$ while allowing the complete circle of $2\pi$ radians to be represented by a power of two. The rectangular value, $\tilde{X}$, represented by the quantized CLNS has the following real and imaginary parts:

\[ \Re[\tilde{X}] = b^{\hat{X}_L} \cos\left(\tfrac{\pi}{4} \hat{X}_\theta\right), \qquad \Im[\tilde{X}] = b^{\hat{X}_L} \sin\left(\tfrac{\pi}{4} \hat{X}_\theta\right). \tag{6} \]

To summarize our notation, an arbitrary-precision complex value $\bar{X} \neq 0$ is transformed losslessly to its ideal CLNS representation, $X$. Quantizing $X$ to $\hat{X}$ produces an absolute error perceived by the user in the rectangular system as $|\bar{X} - \tilde{X}|$.

Complex multiplication and division are trivial in CLNS. Given the exact CLNS representations, $X$ and $Y$, the result of multiplication, $Z = X + Y$, can be computed by two parallel adders:

\[ Z_L = X_L + Y_L, \qquad Z_\theta = (X_\theta + Y_\theta) \bmod 2\pi. \tag{7} \]

The modular operation for the imaginary part is not strictly necessary, but reduces the range of angles that the hardware needs to accept. In the quantized system, this reduction comes at no cost in the underlying binary adder because of the $4/\pi$ scaling.

For example, if $\bar{X} = -1 + i$ and $\bar{Y} = 4i$, the $b = 2$ polar-logarithmic representations are $X_L = 0.5 \log_b(1^2 + 1^2) = 0.5$, $X_\theta = 3\pi/4$ and $Y_L = 0.5 \log_b(4^2) = 2.0$, $Y_\theta = \pi/2$. The representation of the product is computed simply as $Z_L = X_L + Y_L = 0.5 + 2.0 = 2.5$ and $Z_\theta = X_\theta + Y_\theta = 3\pi/4 + \pi/2 = 5\pi/4$. We can verify manually that this is correct: $\bar{Z} = b^{2.5}(\cos(5\pi/4) + i \sin(5\pi/4)) = 4\sqrt{2}(-\sqrt{2}/2 - (\sqrt{2}/2) i) = -4 - 4i$.
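The real LNS addition identities and the CLNS multiplication example above can be checked numerically. The following is only a minimal software sketch, assuming double-precision floating point in place of the table-based function units a hardware ALU would use. All function names (`s_b`, `d_b`, `lns_add`, `to_clns`, `quantize`, `clns_mul`, `from_clns`, `from_quantized`) are hypothetical and chosen for illustration; they are not part of FloPoCo or of the proposed ALU.

```python
import cmath
import math

B = 2.0  # logarithm base b (the text notes b = 2 is usual)


def s_b(z: float) -> float:
    """Sum function s_b(z) = log_b(1 + b**z)."""
    return math.log(1.0 + B ** z, B)


def d_b(z: float) -> float:
    """Difference function d_b(z) = log_b|1 - b**z|; undefined at z = 0."""
    return math.log(abs(1.0 - B ** z), B)


def lns_add(x_l: float, y_l: float, same_sign: bool = True) -> float:
    """Log of the magnitude of the sum, via max(x_L, y_L) + f(-|x_L - y_L|)."""
    z = -abs(x_l - y_l)
    if not same_sign and z == 0.0:
        # Assumption of this sketch: exact cancellation is reported as an error,
        # since zero has no finite logarithm.
        raise ValueError("exact cancellation: zero has no finite logarithm")
    return max(x_l, y_l) + (s_b(z) if same_sign else d_b(z))


def to_clns(x: complex) -> tuple[float, float]:
    """Ideal CLNS pair (X_L, X_theta) of a non-zero complex value."""
    return math.log(abs(x), B), cmath.phase(x) % (2.0 * math.pi)


def quantize(x_l: float, x_theta: float, f_l: int, f_theta: int) -> tuple[float, float]:
    """Quantization of equation (5); the angle is scaled by 4/pi."""
    q_l = math.floor(x_l * 2 ** f_l + 0.5) * 2.0 ** (-f_l)
    q_theta = math.floor(x_theta * 2 ** (f_theta + 2) / math.pi + 0.5) * 2.0 ** (-f_theta)
    return q_l, q_theta


def clns_mul(x: tuple[float, float], y: tuple[float, float]) -> tuple[float, float]:
    """Equation (7): add the logarithms, add the angles modulo 2*pi."""
    return x[0] + y[0], (x[1] + y[1]) % (2.0 * math.pi)


def from_clns(x_l: float, x_theta: float) -> complex:
    """Equation (4): convert an ideal CLNS pair back to rectangular form."""
    return complex(B ** x_l * math.cos(x_theta), B ** x_l * math.sin(x_theta))


def from_quantized(q_l: float, q_theta: float) -> complex:
    """Equation (6): rectangular value represented by a quantized CLNS pair."""
    return from_clns(q_l, math.pi / 4.0 * q_theta)


if __name__ == "__main__":
    x, y = to_clns(-1 + 1j), to_clns(4j)
    z = clns_mul(x, y)
    print(z, from_clns(*z))
    print(lns_add(math.log2(3.0), math.log2(5.0)))
```

In this sketch, `clns_mul` needs only two additions, whereas `lns_add` must evaluate `s_b` or `d_b`, which mirrors the cost asymmetry between multiplication and addition that the text describes.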
