LNCS 9065, Pp

Combined Cache Timing Attacks and Template Attacks on Stream Cipher MUGI Shaoyu Du1,4, , Zhenqi Li1, Bin Zhang1,2, and Dongdai Lin3 1 Trusted Computing and Information Assurance Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China 2 State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China 3 State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China 4 University of Chinese Academy of Sciences, Beijing, China du [email protected] Abstract. The stream cipher MUGI was proposed by Hitachi, Ltd. in 2002 and it was specified as ISO/IEC 18033-4 for keystream generation. Assuming that noise-free cache timing measurements are possible, we give the cryptanalysis of MUGI under the cache attack model. Our simulation results show that we can reduce the computation complexity of recovering all the 1216-bits internal state of MUGI to about O(276) when it is implemented in processors with 64-byte cache line. The attack reveals some new inherent weaknesses of MUGI’s structure. The weaknesses can also be used to conduct a noiseless template attack of O(260.51 ) computation complexity to restore the state of MUGI. And then combining these two attacks we can conduct a key-recovery attack on MUGI with about O(230) computation complexity. To the best of our knowledge, it is the first time that the analysis of cache timing attacks and template attacks are applied to full version of MUGI and that these two classes of attacks are combined to attack some cipher. Moreover, the combination can be used to improve the error-tolerance capability of each attack. If each measurement has one additional error, the key-recovery attack will take about O(250) computation complexity. Keywords: Stream cihper, MUGI, analytical side-channel attacks, cache timing attacks, template attacks. 1 Introduction Cache Timing attacks[2,3,4] and template attacks[16] are two classes of side- channel attacks. The cache timing attack assumes that the attacker can use This work was supported by the National Grand Fundamental Research 973 Pro- grams of China(Grant No. 2013CB338002, 2011CB302400),the programs of the National Natural Science Foundation of China (Grant No. 61303258, 60833008, 60603018, 61173134, 91118006, 61272476). Corresponding author. c Springer International Publishing Switzerland 2015 J. Lopez and Y. Wu (Eds.): ISPEC 2015, LNCS 9065, pp. 235–249, 2015. DOI: 10.1007/978-3-319-17533-1_17 236 S. Du et al. time measurements to learn something about the cache accesses of the legitimate party. Since 2005, it has drawn a lot of attention[5,6,7,8], most of which mainly focused on blocks ciphers. In SAC 2008, Erik Zenner[1] applied the cache timing attack to HC-256 and provided a cache timing attack model for stream cipher. Soon after that Gregor Leander[9] et al. applied the model to stream ciphers that use word-based LFSRs. And Goutam Paul et al.[15] extended the attack in [9] to HC-128 in 2012. The common template attack[17,19,20,21,22] model targets on cipher’s S-boxes and collects hamming weight leakages of S-box’s output. In Inscrypt 2009, [23] combined the template attacks’ hamming weight leakage model with algebraic analysis. Then in 2010, [24] combined template attack with cube attack. And in 2014, [18] generalized these two attacks on the template attack model and analyzed its ability in the face of noise. At FSE 2002[10], Hitachi, Ltd. proposed a new keystreamgenerator MUGI, which was designed for use as a streamcipher. MUGI has a 128-bit secret key and a 128-bit initial vector and generates 64-bit output words per round. It was specified as ISO/IEC 18033-4 for keystream generation. To our knowledge, the exiting cryptanalysis of MUGI mainly focuses on its linear[11] and nonlinear[12] part re- spectively or attacks its simplified version[13]. A fault analysis on MUGI[14] was proposed in 2011, which mainly uses the characteristics that MUGI’s two kinds of update functions are mutually dependent. It was stated in [10] that the user can utilize the fast software implementation of AES to speed up the software implementation of MUGI, which is exactly the target of cache timing attacks. And also due to the usage of S-box, the cipher’s microcontroller implementations are insecure under template attacks. In this paper, we firstly present the cache timing analysis of MUGI under Erik Zenner’s model. In the common processors whose cache line is 64 bytes, we only need 10 consecutive rounds’ noise-free cache access measurements to recover the whole state with about O(276) computation complexity and O(271) memory complexity. Then we give a noise-free template attack analysis of MUGI with about O(260.51) computation complexity. These two attacks both guess the internal state of MUGI upon the side-channel leakage information and then filter out the wrong candidates through equations derived from MUGI’s update function. Then due to the characteristic of additive stream cipher that encryption and decryption have the same process of generating keystreams, we can combine these two attacks together. The combined side-channel attack takes advantage of the leakage information of cache accesses and hamming weights simultaneously. The increase of the information contributes to the decrease of the candidates’ num- ber and reduces the attack complexity to about O(230). Moreover, the combined attack can tolerate more noise, which makes the attack model more robust and closer to the practical situation. Assuming that each measurement has one additional error, the combined attack takes about O(250) computation complexity to retrieve the key. All the complexity results are computed by the simulation experiments. Combined Cache Timing Attacks and Template Attacks 237 The paper is organized as follows. In Section 2, we present the notations and a brief description of the MUGI cipher with its implementations. Section 3 specifies the cache timing attack model, the cache timing attack of MUGI and the experimental details. And we introduce the template attack, the combined attack and the attack under noise model in Section 4. Finally, conclusions and future works are given in Section 5. 2 Description of MUGI and Its Implementations 2.1 Notations in This Paper In this paper, “word” is used to denote an 8-byte block and we use the following notations throughout this paper. t ai: An 8-byte state in round t where i =0,...,2 t bi: An 8-byte state in round t where i =0,...,15 (x)i,...,j:Theith to jth bits of word x (0 denotes the most significant bit) x ⊕ y: Bitwise exclusive-OR operation ≫n: Circular rotation of n bits to the right (in the 64-bit register) ≪n: Circular rotation of n bits to the left (in the 64-bit register) (x)i:Theith byte of an 8-byte word x 2.2 Structure of MUGI t The state of MUGI in round t can be divided into two parts: the linear words bi t (i =0,...,15) and nonlinear words ai (i =0,...,2). There are two dependent update functions λ and ρ. The function λ updates the linear states as: t+1 t bj = bj−1(j =0, 4, 10), t+1 t ⊕ t b0 = b15 a0, t+1 t ⊕ t b4 = b3 b7, t+1 t ⊕ t ≪ b10 = b9 (b13 32), and the function ρ updates the nonlinear states as: t+1 t a0 = a1, t+1 t ⊕ t t ⊕ a1 = a2 F (a1,b4) C1, t+1 t ⊕ t t ≪ ⊕ a2 = a0 F (a1,b10 17) C2, where C1 and C2 are known constants and F is the round function. The F function, whose structure is showed in the left part of Fig.1, consists of a key t t ≪ addition (the data addition from the buffer b4 or b10 17), a nonlinear transformation using the S-box, a linear transformation using the MDS matrix and a byte shuffle. The S-box and MDS matrix are the same as that of AES. The output function of MUGI is simple which outflows a 64-bit nonlinear t state word directly, i.e., Output[t]=a2. Note that the output of round t is the state word in the beginning of round t. We ignore the initialization phase which has less correlation with the attack in this paper. For that and a more detailed description of MUGI, please refer to [10]. 238 S. Du et al. 2.3 Target Implementations of MUGI The fast software implementation of MUGI takes advantage of four tables: T0, T1, T2 and T3,whereTi[x]=mi ·S[x], i =0,...,3. Here and hereafter S[·] stands for the S-box of AES and mi stands for one column of AES’s mix-column matrix. The right part of Fig. 1 displays F function’s fast software implementation. t t Once MUGI outputs a word, the F function is called twice, i.e., F (a1,b4)and S S S S S S S S T0 T1 T2 T3 T0 T1 T2 T3 MDS MDS Fig. 1. Round function F (left) and its fast software implementation (right) t t ≪ F (a1,b10 17). It means that in order to output one word, the cipher needs 16 table-look-up operations. Each table Ti (i =0,...,3) is accessed four times. t ⊕ t 0,...,7 t ⊕ For example, the indexes used for looking up table T0 are (a1 b4) ,(a1 t 32,...,39 t ⊕ t ≪ 0,...,7 t ⊕ t ≪ 32,...,39 b4) ,(a1 b10 17) and (a1 b10 17) .

LNCS 9065, Pp

(SMC) MODULE of RC4 STREAM CIPHER ALGORITHM for Wi-Fi ENCRYPTION

Ensuring Fast Implementations of Symmetric Ciphers on the Intel Pentium 4 and Beyond

RC4-2S: RC4 Stream Cipher with Two State Tables

Stream Cipher Designs: a Review

Some Words on Cryptanalysis of Stream Ciphers Maximov, Alexander

New Developments in Cryptology Outline Outline Block Ciphers AES

Cryptanalysis of the A5/1 Stream Cipher

Golden Fish an Intelligent Stream Cipher Fuse Memory Modules

Information Processing Information Processing Research ↔ Practice

Analysis of the Non-Linear Part of Mugi

Specification Ver

New Developments in Cryptology