Computer Arithmetic: Elementary functions
Hossam A. H. Fahmy
Electronics and Communications Department, Cairo University

What are the elementary functions?

• Originally, calculators and computers only had: sin(x), log(x), n-th roots, tanh(x), ...
• For mathematicians, they are elementary. For hardware people, they are higher level functions.
• Now, some DSPs may include other operations such as the gamma function.
• Simply, they are any functions that can be easily tabulated or represented in a series, a polynomial, or a ratio of polynomials.

© Hossam A. H. Fahmy

Implementation steps

• The approximation of an elementary function is more efficient (time delay and area) if the argument is constrained to a small interval.

Hence, there are three main steps in any elementary function calculation:
1. range reduction,
2. approximation, and
3. reconstruction.

Reduction and reconstruction

• The range reduction and reconstruction steps are related, and they are function-dependent.
• There is no single reduction and reconstruction technique that is applicable to all functions.
• Modular reduction is applicable to the exponential and sinusoidal functions; the case of the logarithm is even simpler.

  e^x = e^(N ln(2) + y) = 2^N × e^y

  sin(x) = sin(N × π/2 + y):
      sin(x) =  sin(y)   if N mod 4 = 0
      sin(x) =  cos(y)   if N mod 4 = 1
      sin(x) = −sin(y)   if N mod 4 = 2
      sin(x) = −cos(y)   if N mod 4 = 3

  log(x) = log(2^exp × 1.f) = exp × log(2) + log(1.f)
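The three steps for e^x can be sketched in software (a toy model: N = round(x / ln 2) gives |y| ≤ ln(2)/2, and a short Taylor polynomial stands in for the hardware approximation step; the function name is ours):

```python
import math

def exp_reduced(x):
    """Sketch of e^x = 2^N * e^y with modular range reduction."""
    N = round(x / math.log(2))            # range reduction: pick N
    y = x - N * math.log(2)               # reduced argument, |y| <= ln(2)/2
    # approximation on the small interval (short Taylor polynomial here)
    e_y = sum(y ** k / math.factorial(k) for k in range(12))
    return math.ldexp(e_y, N)             # reconstruction: scale by 2^N
```

Because |y| is small, a low-degree polynomial already reaches double precision; over the full range the same polynomial would be useless.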

The approximation step

The approximation algorithms are classified as:
1. digit recurrence techniques,
2. functional recurrence techniques,
3. polynomial approximation techniques, and
4. rational approximation techniques.

The digit and functional recurrence

Digit recurrence techniques:
• converge linearly.
• use addition, subtraction, shift, and single digit multiplication.
• restoring/non-restoring division, SRT, Cordic, Briggs and DeLugish, ...

Functional recurrence techniques:
• converge quadratically (or better for higher orders).
• use addition, subtraction, multiplication, and table lookup.
• Newton-Raphson of any order.
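As a sketch of a functional recurrence, the Newton-Raphson iteration for the reciprocal doubles the number of correct bits per step (the seed constants below are the classic linear approximation for b in [0.5, 1); in hardware the seed would come from a table lookup):

```python
def reciprocal_nr(b, iterations=5):
    """Newton-Raphson for 1/b: x <- x * (2 - b*x), quadratic convergence."""
    x = 48.0 / 17.0 - 32.0 / 17.0 * b    # seed for b in [0.5, 1)
    for _ in range(iterations):
        x = x * (2.0 - b * x)            # only multiply and subtract
    return x
```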

The polynomial approximation

Polynomial approximation techniques:
• depending on the function and the implementation details, may converge directly to the required precision.
• use addition, subtraction, multiplication, and table lookup.
• divide the interval of the argument into a number of sub-intervals where the elementary function is approximated by a polynomial of a suitable degree. One or more tables contain the coefficients of the polynomials.

The rational approximation

Rational approximation techniques:
• depending on the function and the implementation details, may converge directly to the required precision.
• use addition, subtraction, multiplication, tables, and division.
• for each sub-interval, approximate the given function by a rational function (a polynomial divided by another polynomial).
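A minimal sketch of the sub-interval scheme (our own toy construction: one quadratic per segment obtained by interpolation at the endpoints and midpoint, where a real design would store minimax coefficients in the tables):

```python
import math

def build_table(f, a, b, n_seg):
    """Per-sub-interval quadratic coefficients (toy stand-in for the tables)."""
    w = (b - a) / n_seg
    table = []
    for s in range(n_seg):
        x0 = a + s * w
        xm, x1 = x0 + w / 2, x0 + w
        y0, ym, y1 = f(x0), f(xm), f(x1)
        d1 = (ym - y0) / (w / 2)                 # first divided difference
        d2 = ((y1 - ym) / (w / 2) - d1) / w      # second divided difference
        table.append((x0, xm, y0, d1, d2))
    return table, w

def eval_table(table, w, a, x):
    s = min(int((x - a) / w), len(table) - 1)    # segment index
    x0, xm, y0, d1, d2 = table[s]
    return y0 + (x - x0) * (d1 + (x - xm) * d2)  # Newton form, Horner-style

# 8 segments on [0, pi/4] already give roughly 1e-5 accuracy for sin
table, w = build_table(math.sin, 0.0, math.pi / 4, 8)
```

In hardware, the segment index comes for free from the leading bits of the argument.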

Digit recurrence: Cordic

• J. Volder in 1959 developed a digit by digit algorithm to compute all the trigonometric functions with minimal hardware support.
• The generalized algorithm also calculates the hyperbolic and the arc functions.
• Cordic has been widely used in calculators and in some processors.

Basic idea of Cordic

[Figure: an initial vector (x0, y0) rotated through a total angle θ to reach (xn, yn).]

To reach an angle of θ, we rotate the initial vector in each iteration i by a small angle α_i = ± tan^-1(2^-i) and watch the error z_i = θ − Σ_{j=0}^{i} (±α_j). At the end (when z_n ≈ 0), we reach

  x_n = x_0 cos θ
  y_n = x_0 sin θ.

Setting x_0 = 1, we directly get the cosine and sine functions.

Derivation of Cordic

[Figure: the vector (x_i, y_i) of magnitude R at angle φ_i, rotated by α_i to (x_{i+1}, y_{i+1}).]

We rotate the current vector (x_i, y_i) by an angle α_i = ± tan^-1(2^-i):

  x'_{i+1} = R cos(φ_i + α_i)
           = R (cos φ_i cos α_i − sin φ_i sin α_i)
           = x_i cos α_i − y_i sin α_i

  x'_{i+1} / cos α_i = x_i − y_i tan α_i
  x_{i+1} = x_i − d_i y_i 2^-i

Notice that, with each iteration,
• the magnitude of |z_i| is decreasing by α_i, and
• the magnitude of the vector is increasing due to the division by cos α_i.

  i               0    1     2    3    4    5    6    7    8    9
  α_i (degrees)   45   26.6  14   7.1  3.6  1.8  0.9  0.4  0.2  0.1

The simple Cordic iteration

Volder's algorithm is based on

  x_{i+1} = x_i − d_i y_i 2^-i
  y_{i+1} = y_i + d_i x_i 2^-i
  z_{i+1} = z_i − d_i tan^-1(2^-i)

  d_i = 1 if z_i ≥ 0,
  d_i = −1 otherwise.
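The iteration above can be sketched directly in software (rotation mode; the 1/k pre-scaling compensates the cos α_i growth of the vector, and hardware would replace the multiplications by 2^-i with shifts):

```python
import math

def cordic_sincos(theta, n=32):
    """Rotation-mode circular Cordic: returns (cos theta, sin theta)."""
    k = 1.0
    for i in range(n):
        k *= math.sqrt(1.0 + 2.0 ** (-2 * i))   # accumulated 1/cos(alpha_i)
    x, y, z = 1.0 / k, 0.0, theta               # pre-scale to compensate
    for i in range(n):
        d = 1.0 if z >= 0.0 else -1.0
        x, y, z = (x - d * y * 2.0 ** -i,       # shift-and-add in hardware
                   y + d * x * 2.0 ** -i,
                   z - d * math.atan(2.0 ** -i))
    return x, y
```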

Compensation

Since α_i = ± tan^-1(2^-i), then 1/cos α_i = sqrt(1 + 2^-2i). Let us define

  k = Π_{i=0}^{∞} sqrt(1 + 2^-2i) = 1.646760258...

and start from

  x_0 = 1/k = 0.60725293...
  y_0 = 0
  z_0 = θ.

Is there a maximum for θ? What is it? What if you want to calculate for a larger angle?

Concluding Cordic

• What we have just explained is the rotation mode of the circular type. There is also a vectoring mode and two other types: linear and hyperbolic.
• With the generalized Cordic, it is possible to compute many functions with minimal hardware support (three adders and a comparison).
• Cordic is slow but area efficient. It has been used in calculators and in the 8087 coprocessor.
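The compensation constant, and a hint for the range question, can be checked numerically: the reachable angle is bounded by the sum of all the α_i (a quick sketch):

```python
import math

k = 1.0
max_angle = 0.0
for i in range(60):
    k *= math.sqrt(1.0 + 2.0 ** (-2 * i))   # converges to 1.646760258...
    max_angle += math.atan(2.0 ** -i)       # sum of the alpha_i

# 1/k ~ 0.60725293..., and max_angle ~ 1.7433 rad (just under 100 degrees);
# larger angles need an extra reduction step such as quadrant folding.
```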


Polynomial approximations

• Taylor series are available for most functions if we know their derivatives. However, such series do not provide the minimum error term.
• Chebyshev polynomials minimize the maximum error (mini-max) in the domain of the approximation. However, the calculation of the coefficients of the polynomial may take some effort.

Note that we actually use truncated coefficients, so we need to compensate for that.

Different polynomials

• To improve the precision, it is better to divide the domain of the input into sub-intervals and to have a specific polynomial for each sub-interval.
• Many different polynomials may approximate the same function; how do we choose the best? ⇒ How do we define best?
• Instead of saving the coefficients in a table, we can save the values of the function at various points and interpolate.
• How do we actually make the calculation? What is the best way?
  – How many points? Which points?
  – How to interpolate between those points?
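To make the choice concrete, a small experiment of our own (same degree 5, sin on [0, π/4]) compares a truncated Taylor series with interpolation at Chebyshev nodes, which is close to the mini-max fit:

```python
import math

def divided_diffs(xs, ys):
    """Newton divided-difference coefficients for interpolation at xs."""
    c = list(ys)
    for j in range(1, len(xs)):
        for i in range(len(xs) - 1, j - 1, -1):
            c[i] = (c[i] - c[i - 1]) / (xs[i] - xs[i - j])
    return c

def newton_eval(c, xs, x):
    acc = c[-1]
    for ci, xi in zip(reversed(c[:-1]), reversed(xs[:-1])):
        acc = acc * (x - xi) + ci                # Horner-like evaluation
    return acc

a, b, deg = 0.0, math.pi / 4, 5
nodes = [0.5 * (a + b)
         + 0.5 * (b - a) * math.cos((2 * i + 1) * math.pi / (2 * deg + 2))
         for i in range(deg + 1)]                # Chebyshev nodes on [a, b]
coefs = divided_diffs(nodes, [math.sin(t) for t in nodes])

def taylor5(x):                                  # truncated Taylor, same degree
    return x - x ** 3 / 6 + x ** 5 / 120

grid = [a + (b - a) * t / 1000 for t in range(1001)]
err_taylor = max(abs(taylor5(x) - math.sin(x)) for x in grid)
err_cheb = max(abs(newton_eval(coefs, nodes, x) - math.sin(x)) for x in grid)
```

Both polynomials have degree 5, but the maximum error of the near-mini-max choice is about two orders of magnitude smaller; "best" here means smallest maximum error.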

Evaluating a polynomial

It is possible to optimize the evaluation of a polynomial f(x) = c_n x^n + c_{n-1} x^{n-1} + ... + c_0 in order to minimize the hardware or to increase the speed.

For hardware: use Horner's rule

  f(x) = ( ... (((c_n x + c_{n-1}) x + c_{n-2}) x + ... ) x + c_0 )

and iterate. This might even be driven and controlled by software.

For speed:
• use parallel powering units to reach the required precision in one iteration if possible.
• use the PPA of a multiplier.

Rational approximations

If a polynomial approximation requires too many terms, there might be a rational function R(x) = P(x)/Q(x) of a lower degree that gives a good approximation.

• To reduce the calculation order, you may use R_{m,n} = x P_m(x²) / Q_n(x²).
• Typically, R_{4,4} is enough for over sixty bits of precision. However, some functions require higher degrees.
• Rational approximations are "better" for double or extended precisions; Cordic is preferred for single precision.
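A minimal software version of the Horner iteration (one multiply-add per coefficient, which is why it maps onto a small hardware loop):

```python
def horner(coeffs, x):
    """Evaluate c_n x^n + ... + c_0; coeffs ordered [c_n, ..., c_0]."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c        # one multiply-add per step
    return acc

horner([2.0, -6.0, 2.0, -1.0], 3.0)   # 2x^3 - 6x^2 + 2x - 1 at x = 3, = 5.0
```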

Using the PPA of a multiplier

In 1993, Eric Schwarz proposed to use the PPA (partial product array) to evaluate the reciprocal. It is possible to generalize the idea for any polynomial.

• Write the operation in terms of the bits and expand the polynomial symbolically.
• Group the terms with the same power of 2 numerical weight and write them in columns.
• The resulting matrix is similar to the partial products. Hence, we add the rows of this matrix using the reduction tree and the carry propagate adder of the multiplier.

Reciprocal with PPA

If we want to calculate q such that b × q = 0.1111... ≈ 1.0 where b = 0.1 b2 b3 b4 ..., we write

        0.1    b2    b3    b4    b5   ...   = b
    ×   q0     q1    q2    q3    q4    q5   ...   = q
    -------------------------------------------------
                                 q4   q4b2  q4b3  q4b4  q4b5 ...
                           q3   q3b2  q3b3  q3b4  q3b5 ...
                     q2   q2b2  q2b3  q2b4  q2b5 ...
               q1   q1b2  q1b3  q1b4  q1b5 ...
        q0    q0b2  q0b3  q0b4  q0b5 ...
    -------------------------------------------------
      ≈ 0.1    1     1     1     1     1   ...

and use redundant digits for the q_i to make each column independent. Hence,

  q0 = 1
  q1 + q0 b2 = 1
  q2 + q1 b2 + q0 b3 = 1
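The column equations can be solved one digit at a time, and the result checked numerically for every 4-bit pattern of b (a sketch; note that the redundant digits q_i may be negative):

```python
def reciprocal_digits(b2, b3, b4, b5):
    """Solve the column equations q_k + q_{k-1} b2 + ... + q0 b_{k+1} = 1."""
    q0 = 1
    q1 = 1 - q0 * b2
    q2 = 1 - q1 * b2 - q0 * b3
    q3 = 1 - q2 * b2 - q1 * b3 - q0 * b4
    q4 = 1 - q3 * b2 - q2 * b3 - q1 * b4 - q0 * b5
    return q0, q1, q2, q3, q4

for bits in range(16):
    b2, b3, b4, b5 = bits >> 3 & 1, bits >> 2 & 1, bits >> 1 & 1, bits & 1
    b = 0.5 + b2 / 4 + b3 / 8 + b4 / 16 + b5 / 32
    digits = reciprocal_digits(b2, b3, b4, b5)
    q = sum(d * 2.0 ** -i for i, d in enumerate(digits))
    assert abs(b * q - 1.0) < 2.0 ** -4   # first five product columns match 0.11111
```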

Steps of PPA implementation

• Solve the equations:

  q0 = 1
  q1 = 1 − q0 b2 = 1 − b2
  q2 = 1 − q1 b2 − q0 b3 = 1 − b3     (bits satisfy b·b = b)

• Put in a PPA form:

  q0     q1     q2     q3        q4       ...
   1      1      1      1         1
         −b2    −b3    −b2      −b2b3
                       −b3      −b4
                       −b4      −b5
                      +2b2b3   +2b2b4

Reduction rules for the PPA evaluation

1. Any M × a ⇒ (Σ k_i 2^i) a. For example, 5a ⇒ (a, 0, a) over three columns.
2. Algebraic reductions. For example, 2a − a ⇒ a.
3. Boolean reductions:
   • a − ab = a(1 − b) ⇒ a b̄
   • a + b − ab = a + ā b ⇒ a OR b
   • a + b − 2ab ⇒ a ⊕ b
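The reduction rules hold as integer identities over bits, which a short exhaustive check confirms:

```python
for a in (0, 1):
    for b in (0, 1):
        assert 5 * a == 4 * a + a              # rule 1: 5a -> (a, 0, a) columns
        assert 2 * a - a == a                  # rule 2: algebraic reduction
        assert a - a * b == a * (1 - b)        # rule 3: a - ab -> a AND (NOT b)
        assert a + b - a * b == (a | b)        #         a + b - ab -> a OR b
        assert a + b - 2 * a * b == (a ^ b)    #         a + b - 2ab -> a XOR b
```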


The rest of the steps

After applying the reduction rules,

• Compensate for any approximation errors to improve the accuracy.
• Complement the negative elements and subtract one (remember that ā = 1 − a).
• Reduce all the constants.

After these steps, the first columns reduce to q0 = 1, q1 = b̄2, q2 = b̄3, while the q3 and q4 columns contain terms such as b̄4, b̄5, b̄2 b̄4, (b2 OR b̄3), and (b̄2 OR b̄3). Reducing all of this, we get a non-redundant representation of the quotient.

Other functions using the PPA

If the function is expressed as a polynomial, we represent the coefficients and the parameter using their bits and expand symbolically. For example, let us evaluate P(h) = c0 + c1 h + c2 h² where

  h  = h1 2^-5 + h2 2^-6
  c0 = c00 + c01 2^-1
  c1 = c10 + c11 2^-1
  c2 = c20 + c21 2^-1

We expand P(h) symbolically and group the terms:

  P(h) = c00 + c01 2^-1 + c10 h1 2^-5 + (c10 h2 + c11 h1) 2^-6 + c11 h2 2^-7
       + (c20 h1 + c20 h1 h2) 2^-10 + (c21 h1 + c21 h1 h2) 2^-11
       + c20 h2 2^-12 + c21 h2 2^-13

then write them in the form of a partial product array:

  c00  c01  0  0  0  c10h1  c10h2  c11h2  0  0  c20h1    c21h1    c20h2  c21h2
                            c11h1              c20h1h2  c21h1h2
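The symbolic grouping above can be verified numerically for all four (h1, h2) bit patterns (the coefficient values below are arbitrary dyadic constants chosen just for the check):

```python
def grouping_error(c00, c01, c10, c11, c20, c21, h1, h2):
    """|direct P(h) - grouped expansion|; h1, h2 are bits, so h_i^2 = h_i."""
    h = h1 * 2.0 ** -5 + h2 * 2.0 ** -6
    c0, c1, c2 = c00 + c01 / 2, c10 + c11 / 2, c20 + c21 / 2
    direct = c0 + c1 * h + c2 * h * h
    grouped = (c00 + c01 * 2.0 ** -1
               + c10 * h1 * 2.0 ** -5 + (c10 * h2 + c11 * h1) * 2.0 ** -6
               + c11 * h2 * 2.0 ** -7
               + (c20 * h1 + c20 * h1 * h2) * 2.0 ** -10
               + (c21 * h1 + c21 * h1 * h2) * 2.0 ** -11
               + c20 * h2 * 2.0 ** -12 + c21 * h2 * 2.0 ** -13)
    return abs(direct - grouped)

for h1 in (0, 1):
    for h2 in (0, 1):
        assert grouping_error(0.75, 0.5, 1.0, 0.25, 0.5, 1.0, h1, h2) < 1e-15
```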

Conclusion on the use of PPA

• The PPA of a multiplier is modified by adding a multiplexer and some logic gates to generate the required bit patterns for the function.
• That minimal hardware allows us to compute many functions at the speed of a multiplication.
• Almost all the elementary functions can be computed (usually to within 12–20 bits of precision).
• The reconfiguration of the multiplier may be done in less than a clock cycle.

Summary of elementary functions

• We presented a general classification of how to implement them.
• What is "best" depends on the goal of the specific unit.
• It is possible to have hardware intensive and extremely fast evaluation. At the opposite end of the spectrum, it is possible to delegate the computations to software.
