Hardware Architectures of Elliptic Curve Based Cryptosystems Over Binary Fields
Chang Shu
Doctoral Dissertation Defense, Feb. 8, 2007
Advisor: Dr. Kris Gaj
Dept. of Electrical & Computer Engineering, George Mason University

Acknowledgements
Dr. Kris Gaj (Dissertation Director)
Dr. Soonhak Kwon (Dept. of Mathematics, Sungkyunkwan University, Korea)
Dr. Shih-Chun Chang (Committee Member)
Dr. Brian L. Mark (Committee Member)
Dr. Ravi Sandu (Committee Member)
Dr. Andre Manitius (Chair of ECE)
Dr. Yariv Ephraim (Ph.D. Coordinator)
Dr. Tarek El-Ghazawi (Dept. of ECE at The George Washington University)

Overview
• Introduction
– Elliptic Curve Cryptography
– Tate Pairing Based Cryptography
• Architectures for Finite Field Arithmetic
– Polynomial basis multiplier
– Normal basis multiplier
– Composite field arithmetic
• Architectures for Elliptic Curve Cryptosystems
– Optimizations for a single FPGA device
– Reconfigurable computing approach
• Architectures for Tate Pairing Based Cryptosystems
– Optimizations for a single FPGA device
– Reconfigurable computing approach
• Summary

Elliptic Curve Cryptosystems
• Family of public key cryptosystems
• Invented independently by Miller and Koblitz in 1985
• Used primarily for digital signatures & key exchange
• Included in multiple industry, government, and banking standards, such as IEEE P1363, ANSI X9.62, and FIPS 186-2
• Part of standard security protocols, such as IPsec and SSL (proposed extension)

Why Elliptic Curve Cryptography? ECC vs. RSA
Key size comparison:

Security level (bits)   80         112          128    192    256
Symmetric cipher        SKIPJACK   Triple-DES   AES    AES    AES
ECC n                   160        224          256    384    512
RSA n                   1024       2048         3072   8192   15360

Hardware implementation considerations: less area, less memory, lower bandwidth requirements, and more efficient underlying arithmetic
Flexibility: ECC comprises a whole family of cryptosystems

Why Hardware Implementations of Cryptography
Security of data during transmission
SOFTWARE
• low cost
• flexibility (new cryptoalgorithms, protection against new attacks)
HARDWARE
• speed
• random key generation
• access control to keys
• tamper resistance (viruses, internal attacks)

Why Hardware Accelerators for Elliptic Curve Cryptosystems?
• Hardware accelerators for web servers
– SSL (Secure Sockets Layer): high speed requirements for a large number of key exchanges
• Hardware accelerators for Virtual Private Networks (VPNs)
– IPsec (Internet Protocol Security): establishment of a large number of security associations
• Hardware accelerators for wireless gateways
– IEEE 802.11: secure key exchange, achieving low power
• Secure smart cards
– Need to shorten latency despite limitations such as low power, low frequency, and low cost embedded microprocessors
• Selected cryptographic chip manufacturers

What is Elliptic Curve Cryptography?
• Elliptic Curve Cryptosystems (ECC) are a class of public key cryptosystems
• The security of ECC is based on the hardness of the elliptic curve discrete logarithm problem (ECDLP).
• Let $E$ be an elliptic curve over a finite field $F_q$. Let $P$ be a point in $E(F_q)$, and suppose that $P$ has prime order $n$. Then the cyclic subgroup of $E(F_q)$ generated by $P$ is $\langle P \rangle = \{\infty, P, 2P, \ldots, (n-1)P\}$.
• Private key: an integer $d$ chosen randomly from the interval $\{1, 2, \ldots, n-1\}$
• Public key: $Q = dP$
• Encryption: $C = (V, U) = (kP, M + kQ)$, where $k$ is a random integer chosen by the sender
• Decryption: $M = U - dV = U - d \cdot kP = U - kQ$

Elliptic Curve Arithmetic – Group Law
Point addition: $P + Q$
Point doubling: $2P = P + P$
Scalar multiplication: $kP = P + P + \cdots + P$ ($k$ times)

Pairing Based Cryptography
• New family of public key cryptosystems
• Pairings were first used in cryptography by Menezes, Okamoto, and Vanstone in 1993, as a Weil pairing based attack against ECC
• Applied to identity based cryptography, key exchange, and digital signatures by Boneh, Joux, Sakai, et al.
• Not a part of any standard yet
• Very limited number of software and hardware implementations
• Believed to be slower than elliptic curve cryptography

Mathematical Basics of Pairing Based Cryptography
• A pairing is a map between groups, $e: G_1 \times G_1 \rightarrow G_2$, where $G_1 = E(F_q)$ and $G_2 = F_{q^k}$
• The most important property of this map is bilinearity: $e(aP, bQ) = e(P, Q)^{ab}$, where $a, b$ are integers and $P, Q$ are points on the elliptic curve
• In practice, the Tate or Weil pairing is used.
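The bilinearity property $e(aP, bQ) = e(P, Q)^{ab}$ can be checked numerically on a toy map. The sketch below is not a Tate or Weil pairing and offers no security: it assumes a stand-in group $G_1 = (Z_{11}, +)$ in place of curve points and maps into the order-11 subgroup of $Z_{23}^*$. The names `P_MOD`, `N`, `G`, `scalar_mul`, `toy_pairing`, and `is_bilinear` are hypothetical and chosen only for illustration.

```python
# Toy bilinear map, for illustration only (NOT a real Tate or Weil pairing).
# Assumed toy parameters: 2 has order 11 modulo 23, since 2**11 = 2048 = 23*89 + 1.
P_MOD = 23   # small prime modulus of the target group
N = 11       # prime order of both groups; N divides P_MOD - 1
G = 2        # generator of the order-N subgroup of Z_23^*


def scalar_mul(a: int, point: int) -> int:
    """Compute 'aP' in the stand-in group G1 = (Z_N, +)."""
    return (a * point) % N


def toy_pairing(p: int, q: int) -> int:
    """e(P, Q) = G^(P*Q) mod P_MOD, a bilinear map from G1 x G1 into Z_23^*."""
    return pow(G, (p * q) % N, P_MOD)


def is_bilinear(p: int, q: int, a: int, b: int) -> bool:
    """Check e(aP, bQ) == e(P, Q)^(ab) for one choice of P, Q, a, b."""
    lhs = toy_pairing(scalar_mul(a, p), scalar_mul(b, q))
    rhs = pow(toy_pairing(p, q), a * b, P_MOD)
    return lhs == rhs


if __name__ == "__main__":
    ok = all(
        is_bilinear(p, q, a, b)
        for p in range(N) for q in range(N)
        for a in range(1, 5) for b in range(1, 5)
    )
    print("bilinear on all tested inputs:", ok)
```

In this toy group the discrete logarithm is trivial, so the map only illustrates the algebraic identity; a real pairing replaces the integers modulo 11 with points of $E(F_q)$.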
Identity-Based Encryption
Trusted Authority (TA):
• $s$: secret value
• $P$: public value
• $P_{TA} = sP$: public key of the TA
Bob's keys:
• Public key: $P_{ID(Bob)} = H_1(ID(Bob))$
• Private key: $S_{ID(Bob)} = s P_{ID(Bob)}$
Encryption (Alice), with $r$ a random number:
$C = (U, V) = (rP, M + H_2(e(P_{ID(Bob)}, P_{TA})^r))$
Decryption (Bob):
$M = V + H_2(e(S_{ID(Bob)}, U))$
By bilinearity,
$e(S_{ID(Bob)}, U) = e(sP_{ID(Bob)}, rP) = e(P_{ID(Bob)}, sP)^r = e(P_{ID(Bob)}, P_{TA})^r$

Major Contributions of this Thesis
• Finite field arithmetic
– A novel large extension field multiplier architecture for Tate pairing based cryptosystems
– A novel hybrid multiplier architecture for composite fields
– A new mathematical scheme for basis conversion for selected field degrees
• Elliptic curve cryptosystems
– Latency optimization scheme for a single FPGA device
– Analysis of several partitioning schemes for a reconfigurable computer, SRC 6
– Extensive library of over 25 hardware macros for SRC 6 and SGI Altix-4700
• Tate pairing based cryptosystems
– Comparative analysis of two novel algorithms from the point of view of hardware efficiency
– First published implementations via a single FPGA device
– Porting the IP core of pairing over 8 binary fields to SGI Altix-4700
– Comparative analysis of Tate pairing based cryptosystems vs. elliptic curve cryptosystems in hardware

Architectures for Finite Field Arithmetic

Basis Choices in Finite Fields
• Polynomial basis: the successive powers $\{1, \alpha, \alpha^2, \ldots, \alpha^{m-1}\}$ of the root $\alpha$ of an irreducible polynomial $f_m(x)$.
– Low Hamming weight irreducible polynomial, e.g., trinomial or pentanomial
– Maximum Hamming weight irreducible polynomial, e.g., All-One-Polynomial
• Normal basis: the conjugates $\{\beta, \beta^2, \beta^{2^2}, \ldots, \beta^{2^{m-1}}\}$, where $\beta$ is the root of an irreducible polynomial $f_m(x)$.
– Type I or Type II optimal normal basis
• Hybrid basis for composite fields (in terms of $\alpha$, $\beta$, $\gamma$)

Polynomial Basis Multiplier (1)
The bit-serial multiplier is area efficient, at the cost of operating speed.
Example field polynomial: $f_9(x) = x^9 + x + 1$
Linear feedback shift registers (LFSRs) are adopted in both architectures.
Figure: Least significant bit-serial multiplier based on the right-to-left algorithm
Figure: Most significant bit-serial multiplier based on the left-to-right algorithm
The registers of $b(x)$ can be saved in the MSB-serial multiplier because only the partial products need to be updated in each clock cycle.
Less power is consumed in the second architecture because the value of $b(x)$ is fixed during computations.

Polynomial Basis Multiplier (2)
The bit-parallel multiplier can complete one multiplication in one clock cycle. It is impractical to implement for large field sizes, but it can be applied to the ground field arithmetic of the composite multiplier.
Example field polynomial: $f_5(x) = x^5 + x^2 + 1$
Two steps to derive the bit-parallel multiplier:
1. Use Mastrovito's method to compute the partial product with $2m-1$ bits
2. Perform the reduction, exploiting the standard technique for low Hamming weight irreducible polynomials.

Polynomial Basis Multiplier (3)
The digit-serial multiplier is a parallel version of the bit-serial one. Instead of computing one bit of the product, the digit-serial multiplier computes multiple bits each clock cycle. This allows a tradeoff between area and latency.
Figure: MSD serial multiplier in $F_{2^{239}}$, where the digit size $D = 4$ and $f(x) = x^{239} + x^{36} + 1$
Two parts:
1. LFSRs: $c(x) \leftarrow \left( c(x)\,x^{D} + \sum_{i=0}^{D-1} a_{n-D+i}\, x^{i}\, b(x) \right) \bmod f_m(x)$
2. AND-XOR arrays

Normal Basis Multiplier (1)
Massey-Omura's architecture for the normal basis multiplier uses the same combinational circuits together with rotate registers, computing the product serially.
Example: $\gamma = \theta + \theta^{-1}$ is the normal basis generator of $F_{2^5}$, where $\theta \in F_{2^{10}}$ and $\theta^{11} = 1$.

Normal Basis Multiplier (2)
Agnew et al. improved the original Massey-Omura architecture by shortening the critical path.
Kwon et al. improved the architecture of Agnew et al. by decreasing the circuit complexity.

A Novel Normal Basis Hybrid Multiplier for Composite Binary Fields (1)
1. Kwon's bit-serial structure is applied to the tower field multiplication in $GF(2^{3 \times 5})$.
2. A special irreducible trinomial is used to construct the ground field, so that the bit-parallel structure can be efficient.

A Novel Normal Basis Hybrid Multiplier for Composite Binary Fields (2)
Squarer:
Figure: squarer mapping the coefficients $d_{02}, d_{01}, d_{00}$ to $d'_{02}, d'_{01}, d'_{00}$
Computation at the top level is free and equivalent to a cyclic shift.
The standard technique for polynomial basis can be applied to the ground field.
Inverter:
$a^{-1} = (a^{r})^{-1} \cdot a^{r-1}, \quad r = \frac{2^{nm} - 1}{2^{n} - 1}$
Obviously, $A = a^{r}$ is an element of $F_{2^n}$.
Since $r - 1$ can be represented as a sum of powers, $r - 1 = 2^{n} + 2^{2n} + \cdots + 2^{(m-1)n}$, $a^{r-1}$ can be computed using an addition chain. The method requires $\lfloor \log_2(m-1) \rfloor + HW(m-1) + 1$ general multiplications.

A Novel Normal Basis Hybrid Multiplier for Composite Binary Fields (3)
To apply hybrid multipliers in cryptography properly, another issue must be taken into account: the matrix for basis conversion must be obtainable within a reasonable amount of time.
Special irreducible trinomials of the form $f(x) = x^{2^g} + x + 1$ or $f(x) = x^{2^g - 1} + x^{2^t - 1} + 1$ can be used to construct the ground field, so that computing such a conversion matrix is equivalent to solving a set of linear equations.
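To make the role of the ground-field trinomial concrete, the following is a minimal software sketch of polynomial basis multiplication in $GF(2^n)$ modulo a trinomial $x^n + x^k + 1$. It follows the right-to-left (least significant bit first) shift-and-add scheme that underlies the LSB-first bit-serial multiplier: one loop iteration stands in for one clock cycle, and the reduction step plays the role of the LFSR feedback. The function names `gf2_mul` and `gf2_pow` and the integer bit-vector encoding are assumptions made for this illustration, not the hardware design presented here.

```python
def gf2_mul(a: int, b: int, n: int, k: int) -> int:
    """Multiply a(x) * b(x) in GF(2^n) defined by f(x) = x^n + x^k + 1.

    Polynomials are encoded as integers (bit i holds the coefficient of x^i).
    Requires 0 < k < n and a, b < 2**n.
    """
    mask = (1 << n) - 1
    c = 0
    for _ in range(n):           # one iteration per bit of b, LSB first
        if b & 1:
            c ^= a               # accumulate b_i * (a(x) * x^i mod f(x))
        b >>= 1
        a <<= 1                  # a(x) <- a(x) * x
        if a >> n:               # a degree-n term appeared: x^n = x^k + 1
            a = (a & mask) ^ (1 << k) ^ 1
    return c


def gf2_pow(a: int, e: int, n: int, k: int) -> int:
    """Raise a(x) to the power e by square-and-multiply, using gf2_mul."""
    result = 1
    while e:
        if e & 1:
            result = gf2_mul(result, a, n, k)
        a = gf2_mul(a, a, n, k)
        e >>= 1
    return result


if __name__ == "__main__":
    # GF(2^4) with f(x) = x^4 + x + 1: x * x^3 = x^4 = x + 1
    print(bin(gf2_mul(0b0010, 0b1000, n=4, k=1)))
```

Because $x^4 + x + 1$ is irreducible, every nonzero element of this example field satisfies $a^{15} = 1$, which gives a quick sanity check of the multiplier via `gf2_pow`.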
Field size n    Trinomial
2               $x^2 + x + 1$
3               $x^3 + x + 1$
4               $x^4 + x + 1$
7               $x^7 + x + 1$
15              $x^{15} + x + 1$
31              $x^{31} + x^3 + 1$
63              $x^{63} + x + 1$
127             $x^{127} + x + 1$

Summary:
1.