Fpga-Targeted Hardware Implementations of K2
Total Page:16
File Type:pdf, Size:1020Kb
FPGA-TARGETED HARDWARE IMPLEMENTATIONS OF K2 Shinsaku Kiyomoto, Toshiaki Tanaka KDDI R & D Laboratories Inc., 2-1-15 Ohara Fujimino-shi Saitama 356-8502, Japan Kouichi Sakurai Kyushu University, 744 Motooka Nishi-ku Fukuoka 819-0395, Japan Keywords: Stream Cipher, K2, Pseudorandom Generator, Hardware Implementation, FPGA. Abstract: K2 is a new type of word oriented stream cipher that has dynamic feedback control. Existing research has shown that K2 v2.0 is a high performance stream cipher in software implementations and can be used in several applications. However, no evaluation results for its performance in hardware implementations have been published. In this paper, we presented two hardware implementations of K2 v2.0: a high speed implementation and a compact implementation. We then show the evaluation results on FPGA implementation simulations. The implementations of K2 demonstrated high efficiency compared with other stream ciphers, with K2 being 4-10 times higher than AES implementations. We think that the FPGA implementation of K2 is suitable for applications using high speed encryption/decryption. 1 INTRODUCTION tions. On the eSTREAM project 2, many evaluation results of hardware performances of stream ciphers Stream ciphers are used extensively to provide a re- have been reported. liable, efficient method for secure communications. K2 (Kiyomoto et al., 2007b) (Kiyomoto et al., A basic stream cipher uses several independent linear 2007a) is a new word-oriented stream cipher us- feedback shift registers (LFSRs) together with non- ing dynamic feedback control as irregular clocking. linear functions in order to produce a keystream. The The stream cipher has a dynamic feedback control keystream is then XORed with plaintext to produce mechanism for the byte-level feedback function of a ciphertext. Recently, word-oriented stream ciphers FSRs and realizes fast encryption/decryptionfor soft- have been developed in order to improve the perfor- ware implementations. Several cryptographic algo- mance of software implementations. In the NESSIE rithms have been considered hardware implementa- project 1, many word-oriented stream ciphers were tions (Rodr´ıguez-Henr´ıquez et al., 2007). Existing re- proposed, such as SNOW(Ekdahl and Johansson, search has shown that K2 v2.0 is a secure and high- 2000) and SOBER(Rose and Hawkes, 1999), and performance stream cipher in software implementa- demonstrated good performance in software. tions and can be used in several applications. How- The need for secure data exchange becomes im- ever, no evaluation results for the performance of portant not only for high-end PCs but also low-end hardware implementations have been published. In customer products. Most of low-end devices require this paper, we presented two hardware implementa- an additional hardware implementations for encryp- tions of K2 v2.0: a high speed implementation and tion/decryption of transaction data. Efficient hard- a compact implementation. We then show their eval- ware implementations of stream ciphers are impor- uation results on FPGA implementations. The eval- tant in both high-performanceand low-power applica- uation results suggested that the cipher is faster than existing ciphers and attains a reasonable level of ef- 1New European Schemes for Signatures, Integrity, and Encryption (NESSIE), https://www.cosic.esat. 2ECRYPT eSTREAM http://www.ecrypt.eu.org/ kuleuven.be/nessie/ stream/ 270 Kiyomoto S., Tanaka T. and Sakurai K. (2008). FPGA-TARGETED HARDWARE IMPLEMENTATIONS OF K2. In Proceedings of the International Conference on Security and Cryptography, pages 270-277 DOI: 10.5220/0001917002700277 Copyright c SciTePress FPGA-TARGETED HARDWARE IMPLEMENTATIONS OF K2 ficiency. We think that the FPGA implementation of K2 is suitable for applications using high speed en- cryption/decryption. 0 2 STREAM CIPHER K2 V2.0 At FSR-A At+4 Dynamic Feedback In this section, we describe the stream cipher algo- Controller rithm K2 v2.0 (Kiyomoto et al., 2007a), which has a dynamic feedback control mechanism. 1 1 2.1 Linear Feedback Shift Registers or or 3 2 The K2 v2.0 stream cipher consists of two feedback shift registers (FSRs), FSR-A and FSR-B, a non-linear Bt+10 FSR-B Bt function with four internal registers R1, R2, L1, and L2, and a dynamic feedback controller as shown in Fig. 1. FSR-B is a dynamic feedback shift register. The size of each register is 32 bits. FSR-A has five Non-Linear Function registers, and FSR-B has eleven registers. Let β be the roots of the primitive polynomial; H L z t z t x8 + x7 + x6 + x + 1 ∈ GF(2)[x] Keystream (64bits) Figure 1: K2 v2.0 Stream Cipher. A byte string y denotes (y7,y6,...,y1,y0), where y7 is the most significant bit and y0 is the least significant bit. y is represented by x4 + δ34x3 + δ16x2 + δ199x + δ248 ∈ GF(28)[x] 7 6 4 ζ157 3 ζ253 2 ζ56 ζ16 8 y = y7β + y6β + ... + y1β + y0 x + x + x + x + ∈ GF(2 )[x] In the same way, let γ, δ, ζ be the roots of the primitive respectively. polynomials, The feedback polynomials fA(x), and fB(x) of FSR-A and FSR-B, respectively, are as follows; x8 + x5 + x3 + x2 + 1 ∈ GF(2)[x] α 5 2 x8 + x6 + x3 + x2 + 1 ∈ GF(2)[x] fA(x)= 0x + x + 1 8 6 5 2 αcl1t α1−cl1t 11 10 5 αcl2t 3 x + x + x + x + 1 ∈ GF(2)[x] fB(x) = ( 1 + 2 −1)x +x +x + 3 x +1 respectively. Let cl1 and cl2 be the sequences describing the output Let α0 be the root of the irreducible polynomial of of the dynamic feedback controller. The outputs at degree four time t are defined in terms of some bits of FSR-A. Let Ax denote the output of FSR-A at time x, and Ax[y]= {0,1} denote the yth bit of Ax, where Ax[31] is the x4 + β24x3 + β3x2 + β12x + β71 ∈ GF(28)[x] most significant bit of Ax. Then cl1 and cl2 (called clock control bits) are described as follows; A 32-bit string Y denotes (Y3,Y2,Y1,Y0), where Yi is a byte string and Y is the most significant byte. Y is 3 cl1 = A [30], cl2 = A [31] represented by t t+2 t t+2 3 2 Both cl1t and cl2t are binary variables; more pre- Y = Y3α +Y2α +Y1α0 +Y0 0 0 cisely, cl1t = {0,1}, and cl2t = {0,1}. Stop-and-go clocking is effective in terms of computational cost, Let α1, α2, α3 be the roots of the irreducible polyno- mials of degree four because no computation is required in the case of 0. However, the feedback function has no transfor- mation for feedback registers with a probability 1/4 x4 + γ230x3 + γ156x2 + γ93x + γ29 ∈ GF(28)[x] where all clockings are stop-and-go clockings. Thus, 271 SECRYPT 2008 - International Conference on Security and Cryptography we use two types of clocking for the feedback func- FSR-A tion. FSR-B is defined by a primitive polynomial, 0 2 4 where cl2t = 0. Dynamic Feedback Controller 2.2 Nonlinear Function FSR-B 10 9 4 0 The non-linear function of K2 v2.0 is fed the values of two registers of FSR-A and four registers of FSR- B and that of internal registers R1, R2, L1, L2, and L2 R2 outputs 64 bits of the keystream every cycle. Fig. 2 shows the non-linear function of K2 v2.0. The non- Sub Sub Sub Sub linear function includes four substitution steps that are indicated by Sub. L1 R1 The Sub step divides the 32-bit input string into four 1-byte strings and applies a non-linear permuta- tion to each byte using an 8-to-8 bit substitution, and then applies a 32-to-32 bit linear permutation. The 8-to-8 bit substitution is the same as s-boxes of AES (Daemen and Rijmen, 1998), and the permutation is the same as AES Mix Column operation. The 8-to- 8 bit substitution consists of two functions: g and Keystream (64bits) f . The g calculates the multiplicative inverse modulo Figure 2: Non-Linear Function of K2 v2.0. of the irreducible polynomial m(x)= x8 + x4 + x3 + x + 1 without 0x00, and 0x00 is transformed to itself H ⊞ (0x00). f is an affine transformation defined by; zt = Bt+10 L2t ⊕ L1t ⊕ At b7 11111000 a7 0 where Ax and Bx denote outputs of FSR-A and FSR- b6 01111100 a6 1 b 00111110 a 1 B at time x, and R1x, R2x, L1x, and L2x denote the 5 5 b 00011111 a 0 internal registers at time x. The symbol ⊕ denotes 4 = × 4 ⊕ ⊞ b3 10001111 a3 0 bitwise exclusive-or operation and the symbol de- b2 11000111 a2 0 notes a 32-bit addition. Finally, the internal registers b1 11100011 a1 1 are updated as follows; b0 11110001 a0 1 where a = (a7,...,a0) is the input and b = (b7,...,b0) R1t+1 = Sub(L2t ⊞ Bt+9), R2t+1 = Sub(R1t) is the output, and a0 and b0 are the least significant bit (LSB). L1t+1 = Sub(R2t ⊞ Bt+4), L2t+1 = Sub(L1t) Let C be (c3,c2,c1,c0) and output D be (d ,d ,d ,d ), where c , d are 8-bit values. The lin- 3 2 1 0 i i where Sub(X) is an output of the Sub step for X. ear permutation D = p(C) is described as follows; d0 02 03 01 01 c0 2.4 Initialization Process d 01 02 03 01c 1 = 1 d 01 01 02 03 c 2 2 d3 03 01 01 02c3 The initialization process of K2 v2.0 consists of two steps, a key loading step and an internal state 8 8 in GF(2 ) of the irreducible polynomial m(x)= x + initialization step.