Hardware Implementation of the Binary Method for Exponentiation in $GF(2^m)$

Mario Alberto García Martínez
Instituto Tecnológico de Orizaba, Av. ITO, Col. Zapata, Orizaba, Ver. 94300

Guillermo Morales Luna
Sección de Computación, CINVESTAV, IPN, Av. IPN 2508, 07300, México D.F.

Francisco Rodríguez Henríquez
Sección de Computación, CINVESTAV, IPN, Av. IPN 2508, 07300, México D.F.

Abstract

Exponentiation in finite or Galois fields, $GF(2^m)$, is a basic operation for several algorithms in areas such as cryptography, error-correction codes and digital signal processing. Nevertheless, the calculations involved are very time consuming, especially when they are performed in software. For performance and security reasons, it is often more convenient to implement cryptographic algorithms in hardware. In order to overcome the well-known drawback of little or nonexistent flexibility associated with traditional Application Specific Integrated Circuit (ASIC) solutions, we propose an architecture using Field Programmable Gate Arrays (FPGAs). A cheap but still flexible modular exponentiation can be implemented using these devices. We provide the VHDL description of an architecture for exponentiation in $GF(2^m)$ based on the square-and-multiply method, also called the binary method, using two multipliers in parallel that we developed previously. Our structure, compared with other designs reported earlier, introduces an important saving in hardware resources.

I. Introduction

Exponentiation in finite or Galois fields is fundamental to several cryptographic algorithms in widespread use at the moment, such as the Diffie-Hellman protocol for key exchange [1], the ElGamal algorithm for digital signatures [2] or the RSA cryptosystem [7]. The calculations required by such algorithms consume a large amount of processing time, especially when they are implemented in software. A conventional method for software exponentiation in finite fields makes use of so-called look-up tables, but this method cannot be efficiently implemented in VLSI circuits. For security and performance reasons, it is often more advantageous to develop cryptographic algorithms in hardware.

In recent years, several hardware algorithms and architectures have been proposed for computing exponentiation in $GF(2^m)$ [8]-[12]. Some of them have been implemented in VLSI circuits and others in FPGAs, taking advantage of the inherent programmability of such devices. Nevertheless, most of those implementations operate with small word lengths since, for values greater than 8 bits, the hardware requirements grow considerably.

In order to avoid the limited flexibility inherent to traditional ASIC designs, we propose an architecture especially tailored for FPGA implementations. Using such devices, an economic and flexible structure for exponentiation can be obtained.

The square-and-multiply method for exponentiation, known as the binary method [3], is a generally accepted technique to compute exponentiation in finite fields.

Let us recall [6] that the multiplicative group of the Galois field $GF(2^m)$ is cyclic with $2^m-1$ elements. Indeed, a generator is given by a root of a primitive polynomial of degree $m$ with coefficients in the prime field $Z_2$. Thus for any non-zero element $M$ in $GF(2^m)$ and any integer exponent $e$, we have

$$M^e = M^{\,e \bmod (2^m-1)}$$

Moreover, the product in $GF(2^m)$ is realized as polynomial multiplication reduced modulo a chosen irreducible polynomial of degree $m$. Hence, the modular exponentiation operation gives, for any given non-zero element $M$ in $GF(2^m)$ and any integer exponent $e$, with $e < 2^m$, the element

$$R = M^e = \underbrace{M \cdot M \cdots M}_{e \text{ times}} \bmod G \tag{1}$$

where $G = g(x)$ is the irreducible polynomial representing $GF(2^m)$.

If we write the exponent $e$ in base 2, $e = e_{m-1}e_{m-2} \ldots e_1e_0$, then we can express $R$ as

$$R = \prod_{e_i = 1} M^{2^i} \tag{2}$$

There exist two main binary algorithms to evaluate the right-hand side of eq. (2): MSB-first and LSB-first (Most and Least Significant Bit, respectively). They differ in which bit of the exponent is scanned first by the procedure. In this work we have used the LSB-first approach because it admits a parallelized version for computing the exponentiation operation defined in eq. (1). The exponentiation operation is computed using two multipliers, designed previously in [4], as main building blocks.

In the remaining sections of this paper, we outline the LSB-first algorithm and give an example of it; then we introduce our proposed exponentiator algorithm and its corresponding hardware implementation. Thereafter we provide a complexity analysis of our design, and finally we formulate some conclusions and comparisons with other designs previously reported in the literature.

II. Exponentiation Algorithm

Let $GF(2^m)$ be the Galois field of degree $m$ over $GF(2) = \{0,1\}$. Let $g(x)$ be an irreducible polynomial of degree $m$ generating the field $GF(2^m)$, and let $\alpha$ be a root of $g(x)$. Then the powers $1, \alpha, \alpha^2, \ldots, \alpha^{m-1}$ form a canonical basis of $GF(2^m)$. Let $M$ be an arbitrary element of $GF(2^m)$ expressed in the canonical basis as

$$M = \sum_{i=0}^{m-1} m_i \alpha^i$$

and let

$$e = \sum_{i=0}^{m-1} e_i 2^i = (e_{m-1}, e_{m-2}, \ldots, e_1, e_0), \qquad e_i \in \{0,1\}$$

be an $m$-bit integer, $1 \le e \le 2^m-1$. It is enough to consider exponents of this size, since the multiplicative structure of $GF(2^m)$ is cyclic with $2^m-1$ elements.

The power $R = M^e$, modulo the irreducible polynomial $G = g(x)$, is also in $GF(2^m)$ and, by using the binary exponentiation method [3], can be computed as shown in the next algorithm:

Algorithm: (Exponentiation LSB-first)
Input: M, e, G
Output: R = M^e (mod G)
===================
1.- C := M; R := 1;
2.- for i := 0 to m-1 do
    2.a).- if e_i = 1 then R := R*C (mod G)
    2.b).- C := C*C (mod G)
    end for;
3.- return R;
===================

Example

e = 1 1 1 1 1 0 1 0 = 250 (bits scanned from e_0 to e_7)

e_i   Step 2.a (R)             Step 2.b (C)
0     1                        (M)^2 = M^2
1     1 * M^2 = M^2            (M^2)^2 = M^4
0     M^2                      (M^4)^2 = M^8
1     M^2 * M^8 = M^10         (M^8)^2 = M^16
1     M^10 * M^16 = M^26       (M^16)^2 = M^32
1     M^26 * M^32 = M^58       (M^32)^2 = M^64
1     M^58 * M^64 = M^122      (M^64)^2 = M^128
1     M^122 * M^128 = M^250    (M^128)^2 = M^256

III. Modular Exponentiator Architecture

The flow chart of the binary algorithm is shown in figure 1.

[Figure 1. Flow chart of binary method. Initialization: C(0) := M, R(0) := "1", i := -1. Loop: increment i; if i < m, compute C := C * C and, if e_i = 1, R := R * C; otherwise output R.]

In figure 2, we propose a parallel architecture for exponentiation in $GF(2^m)$ based on that method.

As we can observe in the algorithm shown in figure 1, registers R and C are initially loaded with "1" and M respectively. Then, in each iteration, one multiplier computes C*C while, depending on the value of $e_i$, the other multiplier computes the product R*C whenever $e_i = 1$. The final result is available in register R once bit $e_{m-1}$ has been examined.

As mentioned before, the algorithm described in figure 1 can be efficiently implemented in hardware by using the architecture shown in figure 2.

[Figure 2. Exponentiator Architecture]

That structure requires $2m$ multiplications and $sm$ clock cycles to compute a modular exponentiation, where the exponent is an $m$-bit word and $s$ is the number of clock cycles required to calculate one multiplication.

We use a systolic and serial multiplier (SSM) [4] with a time delay of $3m-1$ clock cycles as the main building block of the exponentiator design, so that $s = 3m-1$ and the total delay is $m(3m-1) = 3m^2-m$ clock cycles. The architecture uses two SSMs that work in parallel, two $m$-bit registers R and C, and a multiplexer that selects the corresponding operation of SSM(1).

The signals and block labels shown in figure 2 stand for:
SSM = Serial and Systolic Multiplier

IV. Design Comparisons

Table 1 shows the hardware requirements of our architecture and its comparison with [5] and [8].

Table 1. Comparison table

                  [5]              [8]          Here
Multiplications   2(m-1)           2(m-1)       2m
Multipliers       m-1              2(m-1)       2
Squarers          m-1              --           --
Registers         --               --           2
Multiplexers      m                --           1
Time delay        m^2 - m/2 + 1    2m^2 + 2     3m^2 - m

As we can observe, our time delays tend to be 50% greater than those in [8], but we require a constant number of multipliers while in [8] this number grows linearly with $m$.

We have used the ISE 4.1i tools from Xilinx to describe the circuits in VHDL (VHSIC Hardware Description Language, where VHSIC in turn stands for Very High Speed Integrated Circuit); the same tools will be used for synthesis and implementation in the FPGA. We have a prototype card whose device is a Virtex FPGA XSV300 from Xilinx, which integrates 3072 CLBs (Configurable Logic Blocks).

V. Conclusions

We have presented the architecture and VHDL description of a structure for finite field exponentiation that is based on the LSB-first version of the binary method. The proposed architecture operates using two multipliers in parallel and is economic in terms of hardware requirements. If we compare our structure with other designs, such as those reported in [5] and [8], we can see that some hardware complexity may be saved by using our design. As future work, we will physically implement it in an FPGA, which will be used as a basic element for specific algorithms in cryptography and error-correction codes.

Acknowledgements: We thank the anonymous referees, whose suggestions helped us to improve the final presentation and contents of this paper.
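As a software illustration of the LSB-first algorithm of Section II, the following is a minimal Python sketch, not the VHDL design described in this paper. It assumes field elements and the irreducible polynomial G are encoded as integer bit masks (bit i holds the coefficient of x^i, and G includes the x^m term). The function names gf2m_mul and gf2m_pow_lsb_first are hypothetical, and the shift-and-add multiplier merely stands in for the serial systolic multiplier of [4].

```python
def gf2m_mul(a: int, b: int, g: int, m: int) -> int:
    """Multiply a and b in GF(2^m), reducing modulo g (assumed bit-mask encoding)."""
    r = 0
    for _ in range(m):
        if b & 1:            # current coefficient of b is 1: accumulate a
            r ^= a
        b >>= 1
        a <<= 1              # multiply a by x
        if (a >> m) & 1:     # degree reached m: subtract (XOR) g
            a ^= g
    return r


def gf2m_pow_lsb_first(M: int, e: int, g: int, m: int) -> int:
    """Compute M^e in GF(2^m), scanning the m exponent bits from e_0 to e_{m-1}."""
    R, C = 1, M                          # step 1: C := M; R := 1
    for i in range(m):                   # step 2: for i := 0 to m-1
        if (e >> i) & 1:                 # step 2.a: if e_i = 1 then R := R*C
            R = gf2m_mul(R, C, g, m)
        C = gf2m_mul(C, C, g, m)         # step 2.b: C := C*C
    return R                             # step 3


if __name__ == "__main__":
    # GF(2^3) with g(x) = x^3 + x + 1; alpha = 0b010, so alpha^3 = alpha + 1.
    print(gf2m_pow_lsb_first(0b010, 3, 0b1011, 3))  # 3, i.e. alpha + 1
```

In this sequential sketch the two products of steps 2.a and 2.b are evaluated one after the other; both read the value of C from the start of the iteration, which is what allows the hardware to run them on two multipliers at the same time.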