Table Lookup Structures for Multiplicative Inverses Modulo 2K

Table Lookup Structures for Multiplicative Inverses Modulo 2k David W. Matula∗ Alex Fit-Florea Mitchell Aaron Thornton† Southern Methodist University Dallas, Texas matula,alex,[email protected] Abstract to componentwise modular addition of the terms in Ben- schop’s exponent triple analogous to the use of traditional We introduce an inheritance property and related ta- logarithms for performing real-valued multiplication as a ble lookup structures applicable to simplified evaluation of sum of the argument’s logarithms. the modular operations “multiplicative inverse”, “discrete In this paper we employ modular function notation [12] k log”, and “exponential residue” in the particular modu- using |n|2k = j to denote the congruence n ≡ j (mod 2 ) lus 2k. Regarding applications, we describe an integer for k ≥ 1, with the further condition that j is the stan- representation system of Benschop for transforming integer dard residue for modulus 2k satisfying 0 ≤ j ≤ 2k − 1. multiplications into additions which benefits from our table Thus, the exponent triple (s, e, p) for j is specified by s e p lookup function evaluation procedures. |(−1) 3 2 |2k = j. We focus herein on the multiplicative inverse modulo 2k Note that Benschop’s representation is essentially a “dis- to exhibit simplifications in hardware implementations real- crete log triple transform”. Conversion between standard ized from the inheritance property. A table lookup structure binary and Benschop’s exponent triples requires efficient al- e given by a bit string that can be interpreted with reference gorithms for the exponential residue operation |3 |2k and to a binary tree is described and analyzed. Using observed the discrete logarithm dlg(j), which is the exponential symmetries, the lookup structure size is reduced allowing a residue inverse operation (when it exists) satisfying j = dlg(j) −1 novel direct lookup process for multiplicative inverses for |3 |2k . The modular multiplicative inverse |n |2k is all 16-bit odd integers to be obtained from a table of size defined for every odd integer 1 ≤ n ≤ 2k −1 by the relation −1 less than two KBytes. The 16-bit multiplicative inverse op- |nn |2k =1. Collectively the three unary operations of eration is also applicable for providing a seed inverse for discrete log, exponential residue, and multiplicative inverse, obtaining 32/64-bit multiplicative inverses by one/two iter- provide a set of arithmetic operations with regard to the par- ations of a known quadratic refinement algorithm. ticular modulus 2k that has the potential both to simplify and significantly extend standard integer arithmetic hardware. The three operations share significant fundamental properties that simplify their evaluation, in practice allow- 1 Introduction and Summary ing 16-bit evaluations by relatively small new lookup table structures (e.g. less than 2 KBytes each). Hardware integer arithmetic is generally provided for ad- Our focus in this paper is on the multiplicative inverse dition and multiplication modulo 2k for k=16, 32, and pos- modulo 2k. The discrete log and an improved exponential sibly 64. Benschop [1] has shown a transformed binary rep- residue algorithm are covered in [4, 5, 8, 9]. The multi- resentation that allows multiplication to be performed as an plicative inverse is particularly efficiently evaluated by the addition of “discrete logarithms”. Specifically, Benschop quadratic refinement formula (e.g. see [7]) k employs the fact that every integer j in the range [0, 2 − 1] −1 −1 −1 |i |22k = ||i |2k (2 − i|i |2k )|22k (1) can be represented by the exponent triple (s, e, p) such that s e p k (−1) 3 2 ≡ j (mod 2 ). Multiplication is then reduced given that we may start with a substantially sized multi- ∗ plicative inverse seed (e.g. 16-bit modular inverses). This work was supported in part by the Semiconductor Research Cor- Note that Equation 1 doubles the number of bits in the poration (SRC) grant RID-1289 †This work was supported in part by the Texas Advanced Technology modular multiplicative inverse with each iteration in a man- Program (ATP) grant 003613-0029-2003 ner strikingly similar to determining a more accurate ap- proximate divisor reciprocal. Recall that the Newton Raph- 2 The Inheritance Property and Operations son reciprocal refinement ρ = ρ(2 − yρ) realizes twice Modulo 2k the “number-of-bits-of-accuracy” where ρ is an approxima- 1 tion of y accurate to a specified “number-of-bits”. Oper- There are three unary operations which have the poten- ationally, at the bit level, the Newton-Raphson reciprocal tial to simplify and extend the applications of integer arith- refinement procedure employed for some floating point di- metic utilizing k-bit strings for typical values k=16, 32, 64, vision implementations is an approximation process gener- and 128. All three operations inherently employ reductions ating accurate bits of the reciprocal from left-to-right, where for residues modulo 2k and share fundamental properties excess low order bits are rounded off. The important dis- in their computation. These operations are the unary op- tinction herein is that the modular Equation 1 is an exact −1 erations of determining the inverse |i |2k for an odd in- process generating the modular multiplicative inverse bits k −1 teger 1 ≤ i ≤ 2 − 1, which satisfies |ii |2k =1,the right-to-left with excess overflow bits simply truncated off i exponential residue function |3 |2k here utilizing the base the top in each iteration. 3, and the discrete logarithm dlg(j), which is the appro- priately defined inverse (when it exists) to the exponential Hardware integer arithmetic units already provide addi- ( ) = |3dlg j | k k residue function yielding j 2 . tion and multiplication modulo 2 for values of k typically e There is an application of the exponential function |3 |2k including k =8, 16, 32 and possibly k =64. Thus, a that has motivated our interest in all three of these opera- seed lookup table for inverses modulo 216 would immedi- tions. It is readily shown that the set of odd k-bit integers ately expand modular integer arithmetic capability for typ- is given by the set of residues modulo 2k determined by ical small (k =8) and half word (k =16) size inte- s e k−2 {|(−1) 3 |2k |s ∈{0, 1}, 0 ≤ e ≤ 2 − 1}. For example, gers. From a half word (k =16) modular inverse, only e for k =4, note that {|3 |16|0 ≤ e ≤ 3} = {1, 3, 9, 11}, and one iteration of quadratic refinement employing Equation 1 e {| − 3 |16|0 ≤ e ≤ 3} = {5, 7, 13, 15}. Benschop [1] de- would be needed to obtain a 32-bit integer modular inverse. veloped an innovative application of this fact in fashioning a Just two iterations would yield a 64-bit integer modular in- representation system where each k-bit integer in [0, 2k −1] verse. In addition to assisting in algorithms for realizing is encoded as a triple (s, e, p) of exponents employing the Benschop’s novel integer transform representation, the in- s e p modular factorization |(−1) 3 2 |2k . verse operation modulo 2k can also be employed for more Conversion of standard k-bit positive integers to Ben- specific applications such as obtaining extremal rounding schop’s exponent triples allows integer multiplication to be test cases for floating point division [10]. performed by additions and the operation of raising an integer to any power from 2 to 10 to be performed by shifts and In Section 2 we introduce the fundamental inheritance at most a single addition/subtraction. To utilize Benschop’s property that simplifies the representation of all three oper- representation in practice it is essential to obtain efficient ations: multiplicative inverse, discrete log, and exponential hardware implementable algorithms for the discrete log and residue. We provide further details on Benschop’s exponent exponential residue operations with regard to the particu- triple representation to justify our focus on operations in the lar modulus 2k. In implementing and understanding these particular binary modulus family 2k. two operations it is helpful to also have an efficient algo- 2k In Section 3 we show how the table lookup structure for rithm for the multiplicative inverse modulo . All three inverses modulo 2k can be given by a bit string of size 2k −1 of these unary operations satisfy an important “inheritance” which can be interpreted with reference to a binary “lookup property simplifying their computation. tree”. We also provide Binary Decision Diagrams (BDDs) Formalization of the inheritance property is a main con- and flowgraph visualizations of the multiplicative inverse tribution of this paper that is introduced in a general form. operation to show how the inheritance property is reflected Definition 1. Let f(ak−1ak−2 ···a0)=bk−1bk−2 ···b0 in simpler structures in both cases. The lookup tree struc- be a one-to-one mapping of k-bit strings to k-bit strings ture is noted to be efficient to realize and represent by an defined for all k ≥ 1. Then f satisfies the inher- array, with access specified by the well known heap data itance property and provides a k-bit hereditary func- structure indexing procedure. Symmetries in the tree reduc- tion if f(ak−1ak−2 ···a0)=bk−1bk−2 ···b0 implies ing needed array storage size are investigated in Section 4, f(an−1an−2 ···a0)=bn−1bn−2 ···b0 for all n ≤ k. That with a resultant table size of less than 2 KBytes sufficient th is, the n output bit bn−1 of f(ak−1ak−2 ···a0) depends for a 16-bit multiplicative inverse table. only on the low order n-bit input an−1an−2 ···a0 indepen- dent of the value of the input bits ak−1ak−2 ···an.

Load more