Compact Implementation of Threefish and Skein on FPGA

Total Page:16

File Type:pdf, Size:1020Kb

Compact Implementation of Threefish and Skein on FPGA Compact Implementation of Threefish and Skein on FPGA Nuray At∗, Jean-Luc Beuchaty, and Ismail˙ San∗ ∗Department of Electrical and Electronics Engineering, Anadolu University, Eskis¸ehir, Turkey Email: fnat, [email protected] yFaculty of Engineering, Information and Systems, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8573, Japan Email: [email protected] Table I Abstract—The SHA-3 finalist Skein is built from the tweakable NUMBER OF ROUNDS FOR DIFFERENT KEY SIZES (REPRINTED FROM [1]). Threefish block cipher. In order to have a better understanding of the computational efficiency of Skein (resource sharing, memory access scheme, scheduling, etc.), we design a low-area coprocessor Key size # words # rounds for Threefish and describe how to implement Skein on our [bits] Nw Nr architecture. We harness the intrinsic parallelism of Threefish 256 4 72 to design a pipelined ALU and interleave several tasks in order 512 8 72 to achieve a tight scheduling. From our point of view, the main 1024 16 80 advantage of Skein over other SHA-3 finalists is that the same coprocessor allows one to encrypt or hash a message. I. INTRODUCTION T = (t0; t1). K and T are extended with one parity word (Algorithm 1, lines 1 and 2). Each subkey is a combination In this article, we propose a novel hardware implementation of N words of the extended key, two words of the extended of the SHA-3 finalist Skein [1]. As emphasized by Kerckhof et w tweak, and a counter s (Algorithm 1, lines 5 to 9). Note that al., “fully unrolled and pipelined architectures may sometimes the extended key and the extended tweak are rotated by one hide a part of the algorithms’ complexity that is better revealed word position between two consecutive subkeys. in compact implementations” [2]. In order to have a deeper un- derstanding of the computational efficiency of Skein (resource Algorithm 1 Key schedule. sharing, memory access scheme, scheduling, etc.), we decided Input: K = (k ; k ; : : : ; k ) to design a low-area coprocessor on a Field-Programmable A block cipher key 0 1 Nw−1 ; Gate Array (FPGA). Furthermore, such an implementation is a tweak T = (t0; t1); the constant C240 = valuable for constrained environments, where some security 1BD11BDAA9FC1A22. Output: N =4 + 1 k k k protocols mainly rely on cryptographic hash functions (see r subkeys s;0, s;1,..., s;Nw−1, where for instance [3]). 0 ≤ s ≤ Nr=4. Nw−1 Skein is built from the tweakable Threefish block cipher, M 1. k C ⊕ k ; defined with a 256-, 512-, and 1024-bit block size. The main Nw 240 i i=0 contribution of this article is a lightweight implementation of 2. t2 t ⊕ t1; Threefish (Section II). Then, we describe how to implement 3. for s 0 to Nr=4 do Skein on our coprocessor (Section III). We have prototyped 4. for i 0 to Nw − 4 do our architecture on a Xilinx Virtex-6 device and discuss our 5. ks;i k(s+i) mod (Nw+1); results in Section IV. 6. end for 7. ks;Nw−3 k(s+Nw−3) mod (Nw+1) ts mod 3; II. THE THREEFISH BLOCK CIPHER 8. ks;Nw−2 k(s+Nw−2) mod (Nw+1) t(s+1) mod 3; A. Algorithm Specification 9. ks;Nw−1 k(s+Nw−1) mod (Nw+1) s; Threefish operates entirely on unsigned 64-bit integers and 10. end for 11. return k , k ,..., k , where 0 ≤ s ≤ N =4; involves only three operations: rotation of k bits to the left s;0 s;1 s;Nw−1 r (denoted by n k), bitwise exclusive OR (denoted by ⊕), and 64 addition modulo 2 (denoted by ). Therefore, the plaintext A series of Nr rounds (Figure 1 and Algorithm 2, lines 4 P and the cipher key K are converted to Nw 64-bit words. to 19) and a final subkey addition (Algorithm 2, line 21) are Note that the number of words Nw and the number of rounds applied to produce the ciphertext. The core of a round is Nr depend on the key size (Table I). the simple non-linear mixing function Mixd;j (Algorithm 2, The key schedule generates the subkeys from a block lines 13 and 14). It consists of an addition, a rotation by a cipher key K = (k0; k1; : : : ; kNw−1) and a 128-bit tweak constant Rd mod 8;j (repeated every eight rounds and defined in [1, Table 4]), and a bitwise exclusive OR. A word permu- B. The UBI Chaining Mode tation π(i) (defined in [1, Table 3]) is then applied to obtain Let E(K; T; P ) be a tweakable encryption function. The the output of the round (Algorithm 2, line 17). Furthermore, Unique Block Iteration (UBI) chaining mode allows one to a subkey is injected every four rounds (Algorithm 2, line 7). build a compression function out of E. Each block Mi of the message is processed with a unique tweak value Ti encoding v4;0 v4;1 v4;2 v4;3 how many bytes have been processed so far, a type field (see [1] for details), and two bits specifying whether it is the k k k k 1;0 1;1 1;2 1;3 first and/or last block. The UBI chaining mode is computed e4;0 e4;1 e4;2 e4;3 as: H0 G, Hi+1 Mi ⊕ E(Hi;Ti;Mi), 0 1 ; ; 4 4 R4;0 R4;1 nn where G is a starting value of Nw words. Mix Mix C. Hardware Implementation This short description of Threefish gives us the first hints on f f f f 4;0 4;1 4;2 4;3 designing a dedicated coprocessor (Figure 2). Our architecture consists of a register file implemented by means of dual- ported memory, an ALU, and a control unit. The register file Permute is organized into 64-bit words, and stores a plaintext block, v5;0 v5;1 v5;2 v5;3 an internal state (ed;i, where 0 ≤ i ≤ Nw − 1), an extended Figure 1. One of the 72 rounds of Threefish-256. block cipher key, an extended tweak, the constant C240, and all possible values of s involved in the key schedule. Thanks to this approach, the word permutation π(i) and the word rotation of the key schedule are conveniently implemented Algorithm 2 Encryption with the Threefish block cipher. by addressing the register file accordingly. Since the round constants repeat every eight rounds (Algorithm 2, line 14), Input: A plaintext block P = (p ; p ; : : : ; p ); N =4 + 1 0 1 Nw−1 r we decided to unroll eight iterations of the main loop of subkeys k , k ,..., k , where 0 ≤ s ≤ N =4; s;0 s;1 s;Nw−1 r Threefish (Algorithm 2, lines 4 to 19). The rotation constants 4N rotation constants R , where 0 ≤ i ≤ 7 and 0 ≤ w i;j R are included in the microcode executed by the control j ≤ N =2. d;i w unit. Note that our register file is designed for Threefish-1024 Output: A ciphertext block C = (c ; c ; : : : ; c ). 0 1 Nw−1 (i.e. N = 16 and N = 80). It is therefore straightforward 1. for i 0 to N − 1 do w r w to implement the two other variants of the algorithm on our 2. v p ; 0;i i architecture. 3. end for 4. for d 0 to Nr − 1 do Register file (dual-ported memory block) 5. for i 0 to N − 1 do w WE 0 6. if d mod 4 = 0 then A Data p0,..., p15 7. ed;i vd;i kd=4;i; (Key injection) Port 16 Address ei;0,..., ei;15 8. else Port A Port B 32 9. ed;i vd;i; (Rename) k0,..., k15 Arithmetic and 48 k16 Logic Unit 10. end if 64 WE t0, t1, t2, C240 (Figure 3b) 11. end for 96 Address B 0, 1,..., 14, 15 12. for j 0 to Nw=2 − 1 do 16 17 18 19 20 112 Port , , , , Data 13. fd;2j ed;2j ed;2j+1; (Mixd;j) 14. fd;2j+1 fd;2j ⊕ (ed;2j+1 n Rd mod 8;j); 15. end for ctrl10:0 16. for i 0 to N − 1 do w StartControl Idle unit 17. vd+1;i fd,π(i); (Permute) Nw 18. end for 19. end for Figure 2. Architecture of the Threefish coprocessor. 20. for i 0 to Nw − 1 do The next step consists in defining the architecture of the 21. ci vNr ;i kNr =4;i; (Key injection) 22. end for ALU and the instruction set of our coprocessor. In the fol- 23. return C = (c0; c1; : : : ; cNw−1); lowing, Ri denotes a 64-bit register. Figure 3a illustrates our scheduling of the two mixing functions Mix4;0 and Mix4;1 of the fifth round of Threefish-256: LUT6 2 1 bit ctrl1:0 ctrl3:2 ctrl5:4 1 48 3 64 12 bits or or or 6 b carryj+1 2 1 j 32 8 a _ b R j j , , , 1 R4 4 R5 R6 , 0 2 6 2 6 16 , 0 , R R R R 0 ⊕ ⊕ 0 1 3 1 3 From 0 1 n R R R R n n ctrl7 2 0 1 3 3 3 3 3 R R R R R R B 1 aj 0 o (aj ^ bj) ⊕ Ctrl10 T From Port A e4;1 e4;0 e4;3 e4;2 port 1 1 1 0 R 0 0 b R1 e4;1 e4;1 e4;3 e4;3 From R2 1 B 1 bj A aj _ bj R2 e4;0 e4;2 From or R3 port 0 3 o 1 1 port n n R R4 e e 0 T 4;1 4;3 R1 0 a carryj R5 en9 en1 1 0 4;1 4;3 From 1 From (carry0 = Ctrl10) n25 n33 1 aj R6 e4;1 e4;3 10 3 f f f f R 4;0 4;1 4;2 4;3 Ctrl ctrl6 ctrl8 ctrl9 ctrl10 (a) Computation of Mix4;0 and Mix4;1 (Threefish-256) (b) Architecture of the ALU (c) Computation of R3 R1 R2 or R3 R3 ⊕ R6 on a Virtex-6 device Figure 3.
Recommended publications
  • IMPLEMENTATION and BENCHMARKING of PADDING UNITS and HMAC for SHA-3 CANDIDATES in FPGAS and ASICS by Ambarish Vyas a Thesis Subm
    IMPLEMENTATION AND BENCHMARKING OF PADDING UNITS AND HMAC FOR SHA-3 CANDIDATES IN FPGAS AND ASICS by Ambarish Vyas A Thesis Submitted to the Graduate Faculty of George Mason University in Partial Fulfillment of The Requirements for the Degree of Master of Science Computer Engineering Committee: Dr. Kris Gaj, Thesis Director Dr. Jens-Peter Kaps. Committee Member Dr. Bernd-Peter Paris. Committee Member Dr. Andre Manitius, Department Chair of Electrical and Computer Engineering Dr. Lloyd J. Griffiths. Dean, Volgenau School of Engineering Date: ---J d. / q /9- 0 II Fall Semester 2011 George Mason University Fairfax, VA Implementation and Benchmarking of Padding Units and HMAC for SHA-3 Candidates in FPGAs and ASICs A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science at George Mason University By Ambarish Vyas Bachelor of Science University of Pune, 2009 Director: Dr. Kris Gaj, Associate Professor Department of Electrical and Computer Engineering Fall Semester 2011 George Mason University Fairfax, VA Copyright c 2011 by Ambarish Vyas All Rights Reserved ii Acknowledgments I would like to use this oppurtunity to thank the people who have supported me throughout my thesis. First and foremost my advisor Dr.Kris Gaj, without his zeal, his motivation, his patience, his confidence in me, his humility, his diverse knowledge, and his great efforts this thesis wouldn't be possible. It is difficult to exaggerate my gratitude towards him. I also thank Ekawat Homsirikamol for his contributions to this project. He has significantly contributed to the designs and implementations of the architectures. Additionally, I am indebted to my student colleagues in CERG for providing a fun environment to learn and giving invaluable tips and support.
    [Show full text]
  • The Boomerang Attacks on the Round-Reduced Skein-512 *
    The Boomerang Attacks on the Round-Reduced Skein-512 ? Hongbo Yu1, Jiazhe Chen3, and Xiaoyun Wang2;3 1 Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China 2 Institute for Advanced Study, Tsinghua University, Beijing 100084, China fyuhongbo,[email protected] 3 Key Laboratory of Cryptologic Technology and Information Security, Ministry of Education, School of Mathematics, Shandong University, Jinan 250100, China [email protected] Abstract. The hash function Skein is one of the ¯ve ¯nalists of the NIST SHA-3 competition; it is based on the block cipher Three¯sh which only uses three primitive operations: modular addition, rotation and bitwise XOR (ARX). This paper studies the boomerang attacks on Skein-512. Boomerang distinguishers on the compression function reduced to 32 and 36 rounds are proposed, with complexities 2104:5 and 2454 respectively. Examples of the distinguishers on 28-round and 31- round are also given. In addition, the boomerang distinguishers are applicable to the key-recovery attacks on reduced Three¯sh-512. The complexities for key-recovery attacks reduced to 32-/33- /34-round are about 2181, 2305 and 2424. Because Laurent et al. [14] pointed out that the previous boomerang distinguishers for Three¯sh-512 are in fact not compatible, our attacks are the ¯rst valid boomerang attacks for the ¯nal round Skein-512. Key words: Hash function, Boomerang attack, Three¯sh, Skein 1 Introduction Cryptographic hash functions, which provide integrity, authentication and etc., are very impor- tant in modern cryptology. In 2005, as the most widely used hash functions MD5 and SHA-1 were broken by Wang et al.
    [Show full text]
  • An Analytic Attack Against ARX Addition Exploiting Standard Side-Channel Leakage
    An Analytic Attack Against ARX Addition Exploiting Standard Side-Channel Leakage Yan Yan1, Elisabeth Oswald1 and Srinivas Vivek2 1University of Klagenfurt, Klagenfurt, Austria 2IIIT Bangalore, India fyan.yan, [email protected], [email protected] Keywords: ARX construction, Side-channel analysis, Hamming weight, Chosen plaintext attack Abstract: In the last few years a new design paradigm, the so-called ARX (modular addition, rotation, exclusive-or) ciphers, have gained popularity in part because of their non-linear operation’s seemingly ‘inherent resilience’ against Differential Power Analysis (DPA) Attacks: the non-linear modular addition is not only known to be a poor target for DPA attacks, but also the computational complexity of DPA-style attacks grows exponentially with the operand size and thus DPA-style attacks quickly become practically infeasible. We however propose a novel DPA-style attack strategy that scales linearly with respect to the operand size in the chosen-message attack setting. 1 Introduction ever are different: they offer a potentially ‘high reso- lution’ for the adversary. In principle, under suitably Ciphers that base their round function on the sim- strong assumptions, adversaries can not only observe ple combination of modular addition, rotation, and leaks for all instructions that are executed on a proces- exclusive-or, (short ARX) have gained recent popu- sor, but indeed attribute leakage points (more or less larity for their lightweight implementations that are accurately) to instructions (Mangard et al., 2007). suitable for resource constrained devices. Some re- Achieving security in this scenario has proven to cent examples include Chacha20 (Nir and Langley, be extremely challenging, and countermeasures such 2015) and Salsa20 (Bernstein, 2008) family stream as masking (secret sharing) are well understood but ciphers, SHA-3 finalists BLAKE (Aumasson et al., costly (Schneider et al., 2015).
    [Show full text]
  • SHA-3 Conference, March 2012, Skein: More Than Just a Hash Function
    Skein More than just a hash function Third SHA-3 Candidate Conference 23 March 2012 Washington DC 1 Skein is Skein-512 • Confusion is common, partially our fault • Skein has two special-purpose siblings: – Skein-256 for extreme memory constraints – Skein-1024 for the ultra-high security margin • But for SHA-3, Skein is Skein-512 – One hash function for all output sizes 2 Skein Architecture • Mix function is 64-bit ARX • Permutation: relocation of eight 64-bit words • Threefish: tweakable block cipher – Mix + Permutation – Simple key schedule – 72 rounds, subkey injection every four rounds – Tweakable-cipher design key to speed, security • Skein chains Threefish with UBI chaining mode – Tweakable mode based on MMO – Provable properties – Every hashed block is unique • Variable size output means flexible to use! – One function for any size output 3 The Skein/Threefish Mix 4 Four Threefish Rounds 5 Skein and UBI chaining 6 Fastest in Software • 5.5 cycles/byte on 64-bit reference platform • 17.4 cycles/byte on 32-bit reference platform • 4.7 cycles/byte on Itanium • 15.2 cycles/byte on ARM Cortex A8 (ARMv7) – New numbers, best finalist on ARMv7 (iOS, Samsung, etc.) 7 Fast and Compact in Hardware • Fast – Skein-512 at 32 Gbit/s in 32 nm in 58 k gates – (57 Gbit/s if processing two messages in parallel) • To maximize hardware performance: – Use a fast adder, rely on simple control structure, and exploit Threefish's opportunities for pipelining – Do not trust your EDA tool to generate an efficient implementation • Compact design: – Small FPGA
    [Show full text]
  • Authenticated Encryption with Small Stretch (Or, How to Accelerate AERO) ⋆
    Authenticated Encryption with Small Stretch (or, How to Accelerate AERO) ? Kazuhiko Minematsu NEC Corporation, Japan [email protected] Abstract. Standard form of authenticated encryption (AE) requires the ciphertext to be expanded by the nonce and the authentication tag. These expansions can be problematic when messages are relatively short and communication cost is high. To overcome the problem we propose a new form of AE scheme, MiniAE, which expands the ciphertext only by the single variable integrating nonce and tag. An important feature of MiniAE is that it requires the receiver to be stateful not only for detecting replays but also for detecting forgery of any type. McGrew and Foley already proposed a scheme having this feature, called AERO, however, there is no formal security guarantee based on the provable security framework. We provide a provable security analysis for MiniAE, and show several provably-secure schemes using standard symmetric crypto primitives. This covers a generalization of AERO, hence our results imply a provable security of AERO. Moreover, one of our schemes has a similar structure as OCB mode of operation and enables rate-1 operation, i.e. only one blockcipher call to process one input block. This implies that the computation cost of MiniAE can be as small as encryption-only schemes. Keywords: Authenticated Encryption, Stateful Decryption, Provable Security, AERO, OCB 1 Introduction Authenticated encryption (AE) is a symmetric-key cryptographic function for communication which provides both confidentiality and integrity of messages. A standard form of AE requires the ciphertext to be expanded by the amount of nonce, a never-repeating value maintained by the sender, and the authentication tag.
    [Show full text]
  • Bicliques for Preimages: Attacks on Skein-512 and the SHA-2 Family
    Bicliques for Preimages: Attacks on Skein-512 and the SHA-2 family Dmitry Khovratovich1 and Christian Rechberger2 and Alexandra Savelieva3 1Microsoft Research Redmond, USA 2DTU MAT, Denmark 3National Research University Higher School of Economics, Russia Abstract. We present the new concept of biclique as a tool for preimage attacks, which employs many powerful techniques from differential cryptanalysis of block ciphers and hash functions. The new tool has proved to be widely applicable by inspiring many authors to publish new re- sults of the full versions of AES, KASUMI, IDEA, and Square. In this paper, we demonstrate how our concept results in the first cryptanalysis of the Skein hash function, and describe an attack on the SHA-2 hash function with more rounds than before. Keywords: SHA-2, SHA-256, SHA-512, Skein, SHA-3, hash function, meet-in-the-middle attack, splice-and-cut, preimage attack, initial structure, biclique. 1 Introduction The last years saw an exciting progress in preimage attacks on hash functions. In contrast to collision search [29, 26], where differential cryptanalysis has been in use since 90s, there had been no similarly powerful tool for preimages. The situation changed dramatically in 2008, when the so-called splice-and-cut framework has been applied to MD4 and MD5 [2, 24], and later to SHA-0/1/2 [1, 3], Tiger [10], and other primitives. Despite amazing results on MD5, the applications to SHA-x primitives seemed to be limited by the internal properties of the message schedule procedures. The only promising element, the so-called initial structure tool, remained very informal and complicated.
    [Show full text]
  • Darxplorer a Toolbox for Cryptanalysis and Cipher Designers
    DARXplorer a Toolbox for Cryptanalysis and Cipher Designers Dennis Hoppe Bauhaus-University Weimar 22nd April 2009 Dennis Hoppe (BUW) DARXplorer 22nd April 2009 1 / 31 Agenda 1 Introduction to Hash Functions 2 The ThreeFish Block Cipher 3 Differential Cryptanalysis 4 DARXplorer { DC of ThreeFish 5 Results on ThreeFish 6 Generalization of DARXplorer Dennis Hoppe (BUW) DARXplorer 22nd April 2009 2 / 31 Agenda 1 Introduction to Hash Functions 2 The ThreeFish Block Cipher 3 Differential Cryptanalysis 4 DARXplorer { DC of ThreeFish 5 Results on ThreeFish 6 Generalization of DARXplorer Dennis Hoppe (BUW) DARXplorer 22nd April 2009 3 / 31 Introduction to Hash Functions Hash Functions A hash function H : f0; 1g∗ ! f0; 1gn is used to compute an n-bit fingerprint from an arbitrarily-sized input M 2 f0; 1g∗ Most of them are based on a compression function C : f0; 1gn × f0; 1gm ! f0; 1gn with fixed size input Computation: Hi := C(Hi−1;Mi) C C C H[0] H[1] . H[L-1] H[L] M[1] M[2] . M[L] Dennis Hoppe (BUW) DARXplorer 22nd April 2009 4 / 31 Introduction to Hash Functions { cont'd Compression Functions A crucial building block of iterated hash functions is the compression function C Designer often make use of block ciphers Which properties should be imposed on C to guarantee that the hash function satisfies certain properties? Theorem (Damg˚ard-Merkle) If the compression function C is collision-resistant, then the hash function H is collision-resistant as well. If the compression function C is preimage-resistant, then the hash function H is preimage-resistant as well.
    [Show full text]
  • Modes of Operation for Compressed Sensing Based Encryption
    Modes of Operation for Compressed Sensing based Encryption DISSERTATION zur Erlangung des Grades eines Doktors der Naturwissenschaften Dr. rer. nat. vorgelegt von Robin Fay, M. Sc. eingereicht bei der Naturwissenschaftlich-Technischen Fakultät der Universität Siegen Siegen 2017 1. Gutachter: Prof. Dr. rer. nat. Christoph Ruland 2. Gutachter: Prof. Dr.-Ing. Robert Fischer Tag der mündlichen Prüfung: 14.06.2017 To Verena ... s7+OZThMeDz6/wjq29ACJxERLMATbFdP2jZ7I6tpyLJDYa/yjCz6OYmBOK548fer 76 zoelzF8dNf /0k8H1KgTuMdPQg4ukQNmadG8vSnHGOVpXNEPWX7sBOTpn3CJzei d3hbFD/cOgYP4N5wFs8auDaUaycgRicPAWGowa18aYbTkbjNfswk4zPvRIF++EGH UbdBMdOWWQp4Gf44ZbMiMTlzzm6xLa5gRQ65eSUgnOoZLyt3qEY+DIZW5+N s B C A j GBttjsJtaS6XheB7mIOphMZUTj5lJM0CDMNVJiL39bq/TQLocvV/4inFUNhfa8ZM 7kazoz5tqjxCZocBi153PSsFae0BksynaA9ZIvPZM9N4++oAkBiFeZxRRdGLUQ6H e5A6HFyxsMELs8WN65SCDpQNd2FwdkzuiTZ4RkDCiJ1Dl9vXICuZVx05StDmYrgx S6mWzcg1aAsEm2k+Skhayux4a+qtl9sDJ5JcDLECo8acz+RL7/ ovnzuExZ3trm+O 6GN9c7mJBgCfEDkeror5Af4VHUtZbD4vALyqWCr42u4yxVjSj5fWIC9k4aJy6XzQ cRKGnsNrV0ZcGokFRO+IAcuWBIp4o3m3Amst8MyayKU+b94VgnrJAo02Fp0873wa hyJlqVF9fYyRX+couaIvi5dW/e15YX/xPd9hdTYd7S5mCmpoLo7cqYHCVuKWyOGw ZLu1ziPXKIYNEegeAP8iyeaJLnPInI1+z4447IsovnbgZxM3ktWO6k07IOH7zTy9 w+0UzbXdD/qdJI1rENyriAO986J4bUib+9sY/2/kLlL7nPy5Kxg3 Et0Fi3I9/+c/ IYOwNYaCotW+hPtHlw46dcDO1Jz0rMQMf1XCdn0kDQ61nHe5MGTz2uNtR3bty+7U CLgNPkv17hFPu/lX3YtlKvw04p6AZJTyktsSPjubqrE9PG00L5np1V3B/x+CCe2p niojR2m01TK17/oT1p0enFvDV8C351BRnjC86Z2OlbadnB9DnQSP3XH4JdQfbtN8 BXhOglfobjt5T9SHVZpBbzhDzeXAF1dmoZQ8JhdZ03EEDHjzYsXD1KUA6Xey03wU uwnrpTPzD99cdQM7vwCBdJnIPYaD2fT9NwAHICXdlp0pVy5NH20biAADH6GQr4Vc
    [Show full text]
  • High-Speed Hardware Implementations of BLAKE, Blue
    High-Speed Hardware Implementations of BLAKE, Blue Midnight Wish, CubeHash, ECHO, Fugue, Grøstl, Hamsi, JH, Keccak, Luffa, Shabal, SHAvite-3, SIMD, and Skein Version 2.0, November 11, 2009 Stefan Tillich, Martin Feldhofer, Mario Kirschbaum, Thomas Plos, J¨orn-Marc Schmidt, and Alexander Szekely Graz University of Technology, Institute for Applied Information Processing and Communications, Inffeldgasse 16a, A{8010 Graz, Austria {Stefan.Tillich,Martin.Feldhofer,Mario.Kirschbaum, Thomas.Plos,Joern-Marc.Schmidt,Alexander.Szekely}@iaik.tugraz.at Abstract. In this paper we describe our high-speed hardware imple- mentations of the 14 candidates of the second evaluation round of the SHA-3 hash function competition. We synthesized all implementations using a uniform tool chain, standard-cell library, target technology, and optimization heuristic. This work provides the fairest comparison of all second-round candidates to date. Keywords: SHA-3, round 2, hardware, ASIC, standard-cell implemen- tation, high speed, high throughput, BLAKE, Blue Midnight Wish, Cube- Hash, ECHO, Fugue, Grøstl, Hamsi, JH, Keccak, Luffa, Shabal, SHAvite-3, SIMD, Skein. 1 About Paper Version 2.0 This version of the paper contains improved performance results for Blue Mid- night Wish and SHAvite-3, which have been achieved with additional imple- mentation variants. Furthermore, we include the performance results of a simple SHA-256 implementation as a point of reference. As of now, the implementations of 13 of the candidates include eventual round-two tweaks. Our implementation of SIMD realizes the specification from round one. 2 Introduction Following the weakening of the widely-used SHA-1 hash algorithm and concerns over the similarly-structured algorithms of the SHA-2 family, the US NIST has initiated the SHA-3 contest in order to select a suitable drop-in replacement [27].
    [Show full text]
  • An Efficient Parallel Algorithm for Skein Hash Functions
    AN EFFICIENT PARALLEL ALGORITHM FOR SKEIN HASH FUNCTIONS K´evin Atighehchi, Adriana Enache, Traian Muntean, Gabriel Risterucci ERISCS Research Group Universit´ede la M´editerran´ee Parc Scientifique de Luminy-Marseille, France email: [email protected] ABSTRACT by SHA-3: it may be parallelizable, more suitable for cer- Recently, cryptanalysts have found collisions on the MD4, tain classes of applications, more efficient to implement MD5, and SHA-0 algorithms; moreover, a method for find- on actual platforms, or may avoid some of the incidental ing SHA1 collisions with less than the expected calculus generic properties (such as length extension) of the Merkle- complexity has been published. The NIST [1] has thus de- Damgard construct ([10] [9]) that often results in insecure cided to develop a new hash algorithm, so called SHA-3, applications. which will be developed through a public competition [3]. This paper describes sequential and parallel algo- From the set of accepted proposals for the further steps of rithms for Skein cryptographic hash functions, and the the competition, we have decided to explore the design of analysis, testing and optimization thereof. Our approach an efficient parallel algorithm for the Skein [12] hash func- for parallelizing Skein uses the tree hash mode which cre- tion family. The main reason for designing such an algo- ates one virtual thread for each node of the tree, thus rithm is to obtain optimal performances when dealing with providing a generic method for fine-grain maximal paral- critical applications which require efficiently tuned imple- lelism. mentations on multi-core target processors. This prelimi- The paper is structured as following: Section 2 and nary work presents one of the first parallel implementation 3 give a brief description of Skein and a detailed descrip- and associated performance evaluation of Skein available tion of the associated Tree Mode, the Section 4 presents the in the literature.
    [Show full text]
  • Rotational Rebound Attacks on Reduced Skein
    Rotational Rebound Attacks on Reduced Skein Dmitry Khovratovich1;2, Ivica Nikoli´c1, and Christian Rechberger3 1: University of Luxembourg; 2: Microsoft Research Redmond, USA; 3: Katholieke Universiteit Leuven, ESAT/COSIC, and IBBT, Belgium [email protected], [email protected], [email protected] Abstract. In this paper we combine the recent rotational cryptanalysis with the rebound attack, which results in the best cryptanalysis of Skein, a candidate for the SHA-3 competition. The rebound attack approach was so far only applied to AES-like constructions. For the first time, we show that this approach can also be applied to very different construc- tions. In more detail, we develop a number of techniques that extend the reach of both the inbound and the outbound phase, leading to rotational collisions for about 53/57 out of the 72 rounds of the Skein-256/512 com- pression function and the Threefish cipher. At this point, the results do not threaten the security of the full-round Skein hash function. The new techniques include an analytical search for optimal input values in the rotational cryptanalysis, which allows to extend the outbound phase of the attack with a precomputation phase, an approach never used in any rebound-style attack before. Further we show how to combine multiple inside-out computations and neutral bits in the inbound phase of the rebound attack, and give well-defined rotational distinguishers as certificates of weaknesses for the compression functions and block ciphers. Keywords: Skein, SHA-3, hash function, compression function, cipher, rotational cryptanalysis, rebound attack, distinguisher. 1 Introduction Rotational cryptanalysis and the rebound attack proved to be very effective in the analysis of SHA-3 candidates and related primitives.
    [Show full text]
  • Authenticated Encryption on Fpgas from the Reconfigurable Part to the Static Part Karim Moussa Ali Abdellatif
    Authenticated Encryption on FPGAs from the Reconfigurable Part to the Static Part Karim Moussa Ali Abdellatif To cite this version: Karim Moussa Ali Abdellatif. Authenticated Encryption on FPGAs from the Reconfigurable Part to the Static Part. Other [cs.OH]. Université Pierre et Marie Curie - Paris VI, 2014. English. NNT : 2014PA066660. tel-01158431 HAL Id: tel-01158431 https://tel.archives-ouvertes.fr/tel-01158431 Submitted on 1 Jun 2015 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. THESE` DE DOCTORAT DE L'UNIVERSITE´ PIERRE ET MARIE CURIE Sp´ecialit´e:Informatique, T´el´ecommunication et Electronique´ Ecole´ Doctorale Informatique, T´el´ecommunication et Electronique´ Pr´esent´eepar: Karim Moussa Ali Abdellatif Pour obtenir le grade de Docteur de l'Universit´ePierre et Marie Curie CHIFFREMENT AUTHENTIFIE´ SUR FPGAs DE LA PARTIE RECONFIGURABLE A` LA PARTIE STATIC Soutenue le 07/10/2014 devant le jury compos´ede: M. Bruno Robisson ENSM.SE/CEA Rapporteur M. Lilian Bossuet Laboratoire Hubert Curien Rapporteur M. Jean-Claude Bajard LIP6 Examinateur M. Hayder Mrabet FLEXRAS Examinateur M. Olivier Lepape NanoXplore Examinateur M. Habib Mehrez LIP6 Directeur de th`ese Mme. Roselyne.Chotin-Avot LIP6 Encaderante de th`ese Ph.D.
    [Show full text]