
EDIC RESEARCH PROPOSAL 1 Aslı Bay I&C, EPFL

Abstract: This report is mainly about Decorrelation Theory. The first part presents an overview of Decorrelation Theory, mainly the security of block ciphers against iterated attacks. Then, the second part gives a brief overview of the security results of the block cipher C, which is practically secure. The third part presents the differential-linear attack on the full COCONUT98, whose security is proven by Decorrelation Theory. Finally, the last part gives a brief overview of my future research.

Index Terms: Decorrelation Theory, Iterated Attacks, The Block Cipher C, COCONUT98

Proposal submitted to committee: September 14th, 2010; Candidacy exam date: September 21st, 2010; Candidacy exam committee: Exam president, thesis director, co-examiner.

This research plan has been approved:
Date: ________
Doctoral candidate (name and signature): ________
Thesis director (name and signature): ________
Thesis co-director (if applicable) (name and signature): ________
Doct. prog. director (R. Urbanke) (signature): ________
EDIC-ru/05.05.2009

I. INTRODUCTION

Most modern block ciphers are resistant to many cryptanalytic techniques such as linear cryptanalysis and differential cryptanalysis, as well as their variants such as the boomerang attack, the impossible differential attack, or the rectangle attack. Even if a given cipher is resistant to the existing attacks, there is no guarantee that a new variant would not break the cipher. Therefore, instead of proving the security of the construction against each attack individually, it is reasonable to find a technique which provides a single proof for a whole family of attacks. For this reason, Vaudenay proposed Decorrelation Theory, which provides tools to prove the security of a given block cipher [8], [6], [7], [3].

Decorrelation Theory provides tools to quantify the security of block ciphers against some families of attacks. It makes it possible to compute or bound the advantage of a d-limited distinguisher in the Luby-Rackoff model. Vaudenay also finds a link between d-limited adversaries and differential and linear attacks, as well as a wider class of attacks called iterated attacks. However, the practicality of the tools in this theory is controversial, since it is hard to compute the exact advantage of an adversary. Therefore, Vaudenay suggested using algebraic constructions called decorrelation modules and proposed a provably secure block cipher called COCONUT98, which resists 2-limited adversaries [8]. However, the decorrelation results of the cipher do not prove anything more than its resistance to 2-limited adversaries. That is, they do not give any guarantee of its security against d-limited adversaries for d > 2. Indeed, Wagner's boomerang attack [5] breaks COCONUT98 with complexity about 2^38, and Biham et al. [4] break it with complexity about 2^33.7 by a differential-linear attack. Nevertheless, it is still feasible to prove several security results for a practical block cipher if we take advantage of the symmetries within the distribution matrices. For example, for the block cipher C, a practically secure block cipher proposed by Baignères and Finiasz [2], the exact advantage of the best 2-limited distinguisher is computed, as well as the advantage of differential and linear attacks.

The structure of this report is as follows: the first section provides a brief overview of Decorrelation Theory and explains the security of block ciphers against iterated attacks. The second section provides the security proofs of the block cipher C. The third section gives the differential-linear attack on the full COCONUT98. Finally, the last section gives a brief overview of my future research.

A. Terminology

Definition 1: The perfect cipher C* denotes a random permutation uniformly distributed among all possible permutations on the given set.

Definition 2: In the Luby-Rackoff model, an adversary is an infinitely powerful algorithm A which has access to an oracle O. The oracle O implements either a cipher C or the perfect cipher C*. The adversary aims to distinguish C from C* by querying the oracle a limited number of times d. Finally, the adversary outputs 1 (accept) or 0 (reject). The advantage of the attacker is $\mathrm{Adv}_A = |p - p^*|$, where p (resp. p*) is the probability of accepting C (resp. C*).

Definition 3: Let $a = (a_1, \dots, a_{16})$ be an array of 128 bits where the $a_i$'s are 8-bit strings. The support of a is a four by four array with 0's at the positions where the entry of a is equal to 0 and 1's where the entry of a is nonzero. It is denoted by SUPP(a).
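The support of Definition 3 can be illustrated with a minimal Python sketch. The helper names `supp` and `support_weight` are hypothetical, and the row-by-row placement of the 16 bytes into the 4x4 array is an assumption made only for this illustration (the text above does not fix the byte ordering).

```python
def supp(a: bytes) -> list[list[int]]:
    """Return the 4x4 support of a 16-byte state: 1 where the byte is nonzero."""
    if len(a) != 16:
        raise ValueError("expected 16 bytes (128 bits)")
    # Assumption: bytes are laid out row by row in the 4x4 array.
    return [[1 if a[4 * r + c] != 0 else 0 for c in range(4)] for r in range(4)]


def support_weight(s: list[list[int]]) -> int:
    """Number of 1's in a support, i.e. the number of nonzero bytes."""
    return sum(sum(row) for row in s)


state = bytes([0, 5, 0, 0] + [0] * 12)
print(supp(state))                  # first row marks the single nonzero byte
print(support_weight(supp(state)))  # number of nonzero bytes in the state
```

Two states with the same support have zero and nonzero bytes in the same positions, which is the only information the support retains.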

Definition 4: A prime p is a strong prime if $\frac{p-1}{2}$ is prime, and it is a strong-strong prime if p and $\frac{p-1}{2}$ are both strong primes.

The following notation is used in the report:
⊕: bitwise XOR.
·: dot product.
wt(s): the number of 1's in the bit string s.

II. DECORRELATION THEORY AND RESISTANCE AGAINST ITERATED ATTACKS

In [8], Vaudenay proposed Decorrelation Theory, which provides tools to prove security results in the Luby-Rackoff model. In this model, the attacker (the algorithm) is limited only by the amount of data (plaintext/ciphertext pairs, limited to d queries) and is computationally unbounded. When the d queries are made at once, the adversary is called nonadaptive, and when each query is made according to the outcome of the previous queries, it is called adaptive. A block cipher C is considered as a random permutation on a message-block space $\mathcal{M} = \{0,1\}^m$ due to the random choice of the secret key. The following defines the distribution matrix of a random function:

Definition 5: Let F be a random function from a given set $\mathcal{M}_1$ to a given set $\mathcal{M}_2$ and let d be an integer. The d-wise distribution matrix $[F]^d$ of F is defined as an $\mathcal{M}_1^d \times \mathcal{M}_2^d$ matrix where the (x, y)-entry of $[F]^d$ corresponding to the multipoints $x = (x_1, \dots, x_d) \in \mathcal{M}_1^d$ and $y = (y_1, \dots, y_d) \in \mathcal{M}_2^d$ is defined as the probability that we simultaneously have $F(x_i) = y_i$ for $i = 1, \dots, d$. It is denoted by $[F]^d_{x,y} = \Pr[x \xrightarrow{F} y]$.

There is no precise definition of decorrelation; however, it can be quantified by comparing the distribution matrix of the given function (or cipher) with the distribution matrix of another function (or cipher). When a random function (or a cipher) has the same distribution matrix as the perfect function (or the perfect cipher), it has a perfect decorrelation. To compare the distribution matrices of two functions (or two ciphers) we need to define the decorrelation distance:

Definition 6: Given two random functions F and G from a given set $\mathcal{M}_1$ to a given set $\mathcal{M}_2$, an integer d, and a distance D over the matrix space $\mathbb{R}^{\mathcal{M}_1^d \times \mathcal{M}_2^d}$, $D([F]^d, [G]^d)$ is called the d-wise decorrelation distance between F and G. If G is the ideal version of F, then $D([F]^d, [G]^d)$ is called the d-wise decorrelation bias of F.

To compute the distance, the following matrix norms are used according to the purpose:

\[ |||A|||_\infty = \max_x \sum_y |A_{x,y}| \tag{1} \]

\[ \|A\|_a = \max_{x_1} \sum_{y_1} \cdots \max_{x_d} \sum_{y_d} |A_{(x_1,\dots,x_d),(y_1,\dots,y_d)}| \tag{2} \]

As mentioned before, Vaudenay finds a connection between the 2-wise decorrelation bias and the advantage of the linear distinguisher and of the differential distinguisher of a given cipher.

Linear cryptanalysis, which was proposed by Matsui [11], [12], is a statistical attack based on the linear distinguisher depicted in Table I. The linear probability is defined as:

\[ LP^C(a, b) = \left(2 \Pr_X[X \cdot a = C(X) \cdot b] - 1\right)^2. \tag{3} \]

Since we consider average complexities of the attack without any information on the key, we concentrate on the average value of the linear probability, $E(LP^C(a, b))$:

\[ ELP^C(a, b) = 2^{-2m} \sum_{x_1, x_2, y_1, y_2} (-1)^{(x_1 \oplus x_2) \cdot a + (y_1 \oplus y_2) \cdot b} \cdot \Pr[(x_1, x_2) \xrightarrow{C} (y_1, y_2)]. \tag{4} \]

Vaudenay shows the link between the advantage of the linear distinguisher and the 2-wise distribution matrix of C as:

\[ \mathrm{Adv}_{\text{Table I}} \le 3 \sqrt[3]{n \cdot |||[C]^2 - [C^*]^2|||_\infty + \frac{n}{M-1}} + 3 \sqrt[3]{\frac{n}{M-1}} \tag{5} \]

where M is the size of the plaintext space and n is the number of iterations.

TABLE I
LINEAR DISTINGUISHER

Parameters: n, a characteristic (a, b), a set A
Oracle: a permutation c
1. initialize the counter δ to zero
2. for i = 1 to n
3.   pick X at random and obtain c(X)
4.   if X · a = c(X) · b, increase the counter δ
5. end for
6. if δ ∈ A output 1, otherwise output 0

Differential cryptanalysis was proposed by Biham and Shamir in [14]. It relies on the differential distinguisher shown in Table II. The differential probability is defined as

\[ DP^C(a, b) = \Pr_X[C(X + a) = C(X) + b]. \tag{6} \]

Vaudenay shows that the advantage of the differential distinguisher satisfies:

\[ \mathrm{Adv}_{\text{Table II}}(C, C^*) \le \frac{n}{M-1} + \frac{n}{2} |||[C]^2 - [C^*]^2|||_\infty. \tag{7} \]

Both bounds (5) and (7) show that a small (negligible) 2-wise decorrelation bias is enough to prove that C is immune against linear and differential cryptanalysis.

TABLE II
DIFFERENTIAL DISTINGUISHER

Parameters: n, a characteristic (a, b)
Oracle: a permutation c
1. for i = 1 to n
2.   pick X at random and obtain c(X) and c(X + a)
3.   if c(X + a) = c(X) + b, output 1 and stop
4. end for
5. output 0

Definition 7: A nonadaptive iterated distinguisher of order d and complexity n, which is illustrated in Table III, is defined by
• a plaintext distribution D on $\mathcal{M}^d$,
• a test function T defined on $\mathcal{M}^{2d}$,
• an acceptance function A from $\{0, 1\}^n$ to [0, 1],
where $\mathcal{M}$ is a set and d is an integer.

The rest of this section discusses the link between d-wise decorrelation and iterated attacks of order d.
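To make Definition 5 and the norm (1) concrete, the following Python sketch builds d-wise distribution matrices by brute force and computes the decorrelation bias against the perfect cipher. It is only an illustration under stated assumptions: the toy family C(x) = Ax + B mod q with q = 5, the table-based cipher representation, and the function names `distribution_matrix` and `inf_norm_distance` are all choices made here, not part of the cited works.

```python
from fractions import Fraction
from itertools import permutations, product


def distribution_matrix(ciphers, domain, d):
    """d-wise distribution matrix as {(x, y): Pr[F(x_i) = y_i for all i]}.

    `ciphers` is a list of lookup tables, each equally likely (uniform key).
    """
    points = list(product(domain, repeat=d))
    weight = Fraction(1, len(ciphers))
    matrix = {(x, y): Fraction(0) for x in points for y in points}
    for table in ciphers:
        for x in points:
            y = tuple(table[xi] for xi in x)
            matrix[(x, y)] += weight
    return matrix


def inf_norm_distance(m1, m2, domain, d):
    """|||m1 - m2|||_inf = max over x of the sum over y of |m1[x,y] - m2[x,y]|."""
    points = list(product(domain, repeat=d))
    return max(sum(abs(m1[(x, y)] - m2[(x, y)]) for y in points) for x in points)


q = 5  # small prime, assumed for illustration only
domain = range(q)
# Toy cipher family: x -> A*x + B mod q with A != 0.
affine = [tuple((a * x + b) % q for x in domain) for a in range(1, q) for b in range(q)]
# Perfect cipher: all q! permutations, uniformly distributed.
perfect = list(permutations(domain))

for d in (2, 3):
    bias = inf_norm_distance(
        distribution_matrix(affine, domain, d),
        distribution_matrix(perfect, domain, d),
        domain,
        d,
    )
    print(f"{d}-wise decorrelation bias: {bias}")
```

Exact rational arithmetic is used so that a perfect decorrelation shows up as a bias of exactly zero. The brute-force enumeration is only feasible for tiny domains, which is precisely why symmetries of the distribution matrices matter for real ciphers.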

TABLE III
NONADAPTIVE ITERATED ATTACK OF ORDER d

Parameters: n, D, T and M
Oracle: a permutation c
1. for t = 1 to n do
2.   pick $X = (X_1, X_2, \dots, X_d)$ at random with distribution D
3.   obtain $Y = (c(X_1), c(X_2), \dots, c(X_d)) = (Y_1, Y_2, \dots, Y_d)$
4.   set $T_t$ = 0 or 1 with expected value T(X, Y)
5. end for
6. if $(T_1, \dots, T_n) \in A$ output 1, otherwise output 0

We can see from the advantage of differential cryptanalysis (equation (7)) that a small 2-wise decorrelation bias is enough to prove that the advantage is small, even though it is an iterated attack of order 2. However, it is not always true that a cipher which has a small d-wise decorrelation bias is resistant to iterated attacks of order d. The following counterexample can be provided. Consider C(x) = Ax + B over GF(q) where $(A, B) \in GF(q)^* \times GF(q)$, which has a perfect 2-wise decorrelation. Let D be a distinguished subset of $GF(q)^* \times GF(q)$ with cardinality $\frac{q(q-1)}{\mu}$. For any pairs $(x_1, x_2)$, $(y_1, y_2)$, the key (A, B) can be expressed as a function $f(x_1, x_2, y_1, y_2)$. Therefore, for the uniform distribution over all pairs $(x_1, x_2)$ with $x_1 \ne x_2$, the test function is

\[ T((x_1, x_2), (y_1, y_2)) = \begin{cases} 1 & \text{if } f(x_1, x_2, y_1, y_2) \in D \\ 0 & \text{otherwise} \end{cases} \]

and the acceptance set is

\[ A(t_1, \dots, t_n) = \begin{cases} 1 & \text{if } (t_1, \dots, t_n) \ne (0, \dots, 0) \\ 0 & \text{otherwise.} \end{cases} \]

The probability that the attacker accepts C is $p = \frac{1}{\mu}$, since the value of $f(x_1, x_2, y_1, y_2)$ is fixed and T gives the same answer in each iteration. But for the perfect cipher C*, each iteration provides a random answer. Therefore, we get $p^* = 1 - (1 - \frac{1}{\mu})^n$. Thus, the advantage of the attacker is $|p - p^*| = |\frac{1}{\mu} - (1 - (1 - \frac{1}{\mu})^n)|$. If the attack has two iterations only, i.e. n = 2, then $|p - p^*| = |\frac{1}{\mu} - (1 - (1 - \frac{1}{\mu})^2)| = \frac{1}{\mu}(1 - \frac{1}{\mu})$, which is high. To conclude, we have an iterated attack of order 2 on the cipher C, which has a perfect 2-wise decorrelation.

The following theorem shows that a small 2d-wise decorrelation bias is sufficient to be secure against this model of attacks [6].

Theorem 1: Let C be a cipher on a message space of size M such that $|||[C]^{2d} - [C^*]^{2d}|||_\infty \le \epsilon$ for some given $d \le M/2$, where C* is the perfect cipher. Let us consider a nonadaptive iterated distinguisher of order d, as depicted in Table III, between C and C* of complexity n. We assume that the distinguisher generates sets of d plaintexts of independent and equal distribution in all iterations. We have

\[ \mathrm{Adv} \le 5 \sqrt[3]{\left(2\delta + \frac{5d^2}{2M} + \frac{3\epsilon}{2}\right) n^2} + n\epsilon \]

where δ is the probability that two different iterations send at least one query in common.

Proof: The probability that the test accepts (X, C(X)) (or (X, C*(X))) is $Z = E_X(T(X, C(X)))$ (or $Z^* = E_X(T(X, C^*(X)))$), and the probability p (or p*) that the attack accepts C (or C*) is $p = E_C(A(T_1, \dots, T_n))$. Then, write p as

\[ p = E_C\Big( \sum_{(t_1,\dots,t_n) \in A} Z^{t_1+\cdots+t_n} (1 - Z)^{n-(t_1+\cdots+t_n)} \Big). \]

This sum can be rewritten as $p = \sum_{i=0}^{n} a_i E(Z^i (1 - Z)^{n-i})$ for some integers $a_i$ where $0 \le a_i \le \binom{n}{i}$. Since $p - p^*$ is maximum if all $a_i$'s are 0 or $\binom{n}{i}$, the acceptance set can be chosen as $A = \{(t_1, \dots, t_n) \mid t_1 + \cdots + t_n \in D\}$ for some set D of integers. Therefore, we get $p = E(f(Z))$ where $f(x) = \sum_{i \in D} \binom{n}{i} x^i (1 - x)^{n-i}$. Since

\[ |f'(x)| \le \sum_{nx \le i \le n} \binom{n}{i} \frac{i - nx}{x(1-x)} x^i (1 - x)^{n-i}, \]

we have $|f'(x)| \le 2n$ for any x. Thus, by the mean value theorem, $|f(Z) - f(Z^*)| \le 2n|Z - Z^*|$. Now, consider $Z^2$, which corresponds to a test with 2d entries, namely $T((X_1, \dots, X_d), (Y_1, \dots, Y_d)) \times T((X_{d+1}, \dots, X_{2d}), (Y_{d+1}, \dots, Y_{2d}))$. According to Theorem 10 in [3], $|E(Z) - E(Z^*)| < \frac{\epsilon}{2}$ and $|E(Z^2) - E(Z^{*2})| < \frac{\epsilon}{2}$, hence $|V(Z) - V(Z^*)| < \frac{3\epsilon}{2}$. Since $|p - p^*| \le E(|f(Z) - f(Z^*)|)$, we need to prove that $|Z - Z^*|$ is small with high probability. Tchebichev's inequality can be used to prove the rest: we have $\Pr[|Z - E(Z)| > \lambda] \le \frac{V(Z)}{\lambda^2}$ and $\Pr[|Z^* - E(Z^*)| > \lambda] \le \frac{V(Z^*)}{\lambda^2}$. Then,

\[ \begin{aligned} |p - p^*| &\le \frac{V(Z)}{\lambda^2} + \frac{V(Z^*)}{\lambda^2} + 2n\left(|E(Z) - E(Z^*)| + 2\lambda\right) \\ &\le \frac{2V(Z^*) + \frac{3\epsilon}{2}}{\lambda^2} + 2n\left(2\lambda + \frac{\epsilon}{2}\right) \\ &\le 5\left(\left(2V(Z^*) + \frac{3\epsilon}{2}\right) n^2\right)^{1/3} + n\epsilon \end{aligned} \]

where $\lambda = \left(\frac{2V(Z^*) + \frac{3\epsilon}{2}}{n}\right)^{1/3}$. When $V(Z^*)$ is computed for $2d \le M$, we have

\[ V(Z^*) \le \delta + \frac{d^2}{4M} + \frac{d^2}{2(M-d)} \le \delta + \frac{5d^2}{4M}, \]

which leads to the announced result.

Notice that the theorem is meaningful if δ is not large and M is large enough. In what follows, if ε is negligible, the cipher is secure against iterated attacks of order d.

III. THE BLOCK CIPHER C

This section describes a practical, provably secure block cipher C proposed by Baignères and Finiasz [2]. They use the same construction as Baignères and Vaudenay [1], which replaced the substitution boxes of AES by independent perfectly random permutations. In [2], they prove the security of the block cipher C against 2-limited adaptive distinguishers, iterated attacks of order 1, linear and differential cryptanalysis, and the impossible differential attack. In addition, they give strong evidence of the security of C against algebraic attacks, the boomerang attack, the rectangle attack and the differential-linear attack.

A. Overview of the Block Cipher C

C is a 128-bit block cipher which has the same SPN structure as AES, except that there is no round key addition, the substitution boxes of AES are replaced by perfectly random permutations, and there is no linear transformation in the last round.

C consists of 10 independent rounds $C^1, C^2, \dots, C^{10}$ such that $C = C^{10} \circ C^9 \circ \cdots \circ C^1$. Each round $C^i$ except for the last round applies a non-linear transformation (S-box layer) $S^i$ and a linear transformation L such that $C^i = L \circ S^i$ for $1 \le i \le 9$, and exceptionally, the last round is $C^{10} = S^{10}$. Each round has 16 independent and perfectly random permutations. The linear transformation L is independent of the round number. It first rotates each row to the left according to its index. That is, row zero is not rotated, the first row is rotated to the left by one byte, the second row is rotated to the left by two bytes, and so on. Then, each column of the resulting state is multiplied by an MDS matrix. The linear transformation layer L is specified as follows:

\[ \begin{pmatrix} b_{0,j} \\ b_{1,j} \\ b_{2,j} \\ b_{3,j} \end{pmatrix} = \begin{pmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{pmatrix} \times \begin{pmatrix} a_{0,j} \\ a_{1,j+1} \\ a_{2,j+2} \\ a_{3,j+3} \end{pmatrix} \]

where a and b denote the input and output of L, respectively.

Key Schedule: for the keyed C, the key schedule produces 160 integers in $[0, 2^8! - 1]$ from a 128-bit secret key. The cryptographically secure Blum-Blum-Shub (BBS) pseudorandom number generator [15] is used to generate the extended key. Each integer defines a permutation through a one-to-one mapping between the integers from 0 to $2^8!$ and the set of permutations of $\{0, 1\}^8$. This algorithm is specified in [2]. In short, the key schedule algorithm is defined as follows: let p and q be strong-strong primes, let n = pq, and let $x_i \in \mathbb{Z}_n^*$ be defined as $x_{-1} = k \cdot 2^{894} + 2^{1023}$ and $x_i = x_{i-1}^2 \bmod n$ for $i = 0, 1, \dots$. Let $BBS = a_1 b_1 a_2 b_2 \dots$ be the pseudorandom bit string where $a_i, b_i \in \{0, 1\}$ denote the least and the most significant bits of $x_i$, respectively. Then, this string is used to find 160 integers, each of which defines a permutation according to the one-to-one mapping algorithm. In total, about 300,000 key bits are generated.

B. Security Results of C

In [2], they compute the exact advantage of the best 2-limited adaptive and nonadaptive distinguishers against r rounds of C, given in Table IV, by using Decorrelation Theory. What they prove is that 7 rounds of C are enough to resist 2-limited distinguishers.

TABLE IV
THE EXACT VALUES OF THE BEST 2-LIMITED ADAPTIVE DISTINGUISHER FOR r ROUNDS OF C

round r   Advantage      round r   Advantage
2         1              7         2^-126.3
3         2^-4.0         8         2^-141.3
4         2^-23.4        9         2^-163.1
5         2^-45.8        10        2^-185.5
6         2^-71.0

According to Theorem 10 in [3], the advantage of the best d-limited nonadaptive distinguisher is:

\[ \mathrm{Adv}_{A_{na}}(C, C^*) = \frac{1}{2} |||[C]^d - [C^*]^d|||_\infty. \]

Computing the exact advantage of the best 2-limited adversary directly from this formula is not practical, since $[C]^2$ and $[C^*]^2$ are both $2^{256} \times 2^{256}$ matrices. However, they found some practical ways to compute the advantages. From now on, we consider C as r rounds with ℓ S-boxes of m bits each. Since $C = C^r \circ \cdots \circ C^1 = S^r \circ L \circ S^{r-1} \circ \cdots \circ S^2 \circ L \circ S^1$ and the S-box layers, which are independent, have the same distribution matrix, we have $[C]^2 = ([S]^2 \times [L]^2)^{r-1} \times [S]^2$. Notice that for two independent random permutations $C_1$ and $C_2$ over $\mathcal{M}$, we have $[C_2 \circ C_1]^d = [C_1]^d \times [C_2]^d$. Let PS be the $2^{2m\ell} \times 2^{\ell}$ matrix defined by $PS_{(x,x'),\gamma} = 1_{\gamma = \mathrm{SUPP}(x \oplus x')}$ and SP be the $2^{\ell} \times 2^{2m\ell}$ matrix defined by $SP_{\gamma,(x,x')} = 1_{\gamma = \mathrm{SUPP}(x \oplus x')} M^{-\ell} (M - 1)^{-\mathrm{wt}(\gamma)}$. Note that the 2-wise distribution matrix of the S layer is $[S]^2_{(x,x'),(y,y')} = 1_{\mathrm{SUPP}(x \oplus x') = \mathrm{SUPP}(y \oplus y')} M^{-\ell} (M - 1)^{-\mathrm{wt}(\mathrm{SUPP}(x \oplus x'))}$. Notice that $SP \times PS = \mathrm{Id}$ and $PS \times SP = [S]^2$, which helps to rewrite $[C]^2$ as

\[ [C]^2 = PS \times (SP \times [L]^2 \times PS)^{r-1} \times SP = PS \times \bar{L}^{r-1} \times SP \]

where $\bar{L} = SP \times [L]^2 \times PS$ is a $2^{16} \times 2^{16}$ matrix indexed by supports. They noticed that $\bar{L}$ equals the matrix M stated in the following theorem from [1]:

Theorem 2: Let $c_0$ and $c_r$ be two masks in $GF(q)^{16}$ of supports $\gamma_0$ and $\gamma_r$, respectively. Let σ = q − 1. The expected linear probability over r > 1 rounds of C, when $c_0$ is the input mask and $c_r$ the output mask, is

\[ ELP^C(c_0, c_r) = \sigma^{-\mathrm{wt}(\gamma_r)} \times (M^{r-1})_{\gamma_0, \gamma_r} \]

where M is a $2^{16} \times 2^{16}$ matrix, indexed by pairs of masks $(\gamma_{i-1}, \gamma_i)$, such that $M_{\gamma_{i-1}, \gamma_i} = \sigma^{-\mathrm{wt}(\gamma_{i-1})} N[\gamma_{i-1}, \gamma_i]$.

Therefore, the 2-wise distribution matrix of r rounds of C is computed as $[C]^2_{(x,x'),(y,y')} = q^{-16}\, ELP^C(\mathrm{SUPP}(x \oplus x'), \mathrm{SUPP}(y \oplus y'))$.

For the adaptive distinguisher, the corresponding norm is computed as:

\[ \begin{aligned} \|[C]^2 - [C^*]^2\|_a &= \max_x \sum_y \max_{x'} \sum_{y'} \left| q^{-16}\, ELP^C(\mathrm{SUPP}(x \oplus x'), \mathrm{SUPP}(y \oplus y')) - [C^*]^2_{(x,x'),(y,y')} \right| \\ &= q^{-16} \max_x \sum_y \max_{\gamma \ne 0} \sum_{\gamma' \ne 0} \left| ELP^C(\gamma, \gamma') - \frac{1}{q^{16} - 1} \right| \sum_{y' \ne y} 1_{\gamma' = \mathrm{SUPP}(y \oplus y')} \\ &= \max_{\gamma \ne 0} \sum_{\gamma' \ne 0} \left| ELP^C(\gamma, \gamma') - \frac{1}{q^{16} - 1} \right| (q - 1)^{\mathrm{wt}(\gamma')}. \end{aligned} \]

They get the same result for the nonadaptive distinguisher, which shows that the adaptive distinguisher is not better than the nonadaptive one.

Theorem 3: [2] The advantages of the best 2-limited nonadaptive and adaptive distinguishers, $\mathrm{Adv}_{A_{na}}(C, C^*)$ and $\mathrm{Adv}_A(C, C^*)$ respectively, are:

\[ \mathrm{Adv}_{A_{na}}(C, C^*) = \mathrm{Adv}_A(C, C^*) = \frac{1}{2} \max_{\gamma} \sum_{\gamma' \ne 0} \left| ELP^C(\gamma, \gamma') - ELP^{C^*}(\gamma, \gamma') \right| (q - 1)^{\mathrm{wt}(\gamma')}. \]

We can further decrease the size of the matrix by considering the weights of the supports. More explicitly, $ELP^C$ only depends on the weights of the diagonals of γ and the weights of the columns of γ'. Let $w = (w_0, w_1, w_2, w_3)$ and $w' = (w'_0, w'_1, w'_2, w'_3)$ be the weights of the diagonals of γ and the weights of the columns of γ', respectively. Then Theorem 3 can be rewritten as:

\[ \begin{aligned} \mathrm{Adv}_A &= \frac{1}{2} \max_{\gamma} \sum_{\gamma' \ne 0} \left| ELP^C(\gamma, \gamma') - ELP^{C^*}(\gamma, \gamma') \right| (q - 1)^{\mathrm{wt}(\gamma')} \\ &= \frac{1}{2} \max_{w} \sum_{w' \ne 0} \left| ELP^C(w, w') - ELP^{C^*}(w, w') \right| (q - 1)^{\mathrm{wt}(w')} \sum_{\gamma' \ne 0} 1_{\mathrm{wt}(\gamma') = w'} \\ &= \frac{1}{2} \max_{w} \sum_{w' \ne 0} \left| ELP^C(w, w') - ELP^{C^*}(w, w') \right| (q - 1)^{\mathrm{wt}(w')} N[w'] \end{aligned} \]

where $N[w']$ is the number of γ' with weight pattern w', i.e. $N[w'] = \binom{4}{w'_0} \binom{4}{w'_1} \binom{4}{w'_2} \binom{4}{w'_3}$.

According to Theorem 1, since C is immune to 2-limited distinguishers, it is immune to iterated attacks of order 1. The paper [2] also treats the keyed C and proves that all security proofs of C remain valid for the keyed C.

IV. THE KEYED C IS NOT LESS SECURE THAN THE UNKEYED C

In all security results obtained so far, it is assumed that the S-boxes are independent and perfectly random. However, this assumption does not hold when we use a key schedule with a 128-bit key. Under the assumption that the BBS PRNG is secure, the keyed C is as secure as the unkeyed C. The idea of the proof is that if there exists an attack which is much more powerful on the keyed C than on the unkeyed C, then there exists a powerful distinguisher on the BBS PRNG. In other words, under the security assumption of the BBS PRNG, an attack which is more efficient against the keyed C than against the unkeyed C does not give a significant advantage to the adversary.

V. CRYPTANALYSIS OF COCONUT98

This section is about the differential-linear cryptanalysis of COCONUT98, which was proposed by Biham et al. in [4]. COCONUT98 is a cipher with perfect decorrelation of order 2, and it is provably secure against differential and linear cryptanalysis and iterated attacks of order 1. Although it is provably secure in this sense, the full COCONUT98 is broken, because differential-linear cryptanalysis is outside the set of attacks which it provably resists.

COCONUT98 is a 64-bit block cipher with a 256-bit key, consisting of 4 Feistel rounds, a decorrelation module and 4 more Feistel rounds, consecutively. Let the 256-bit user key K be $K = (K_1, K_2, \dots, K_8)$ where all $K_i$'s are 32-bit strings. Then the subkeys $k_i$ of the 8 Feistel rounds are $(k_1, k_2, k_3, k_4, k_5, k_6, k_7, k_8) = (K_1, K_1 \oplus K_3, K_1 \oplus K_3 \oplus K_4, K_1 \oplus K_4, K_2, K_2 \oplus K_3, K_2 \oplus K_3 \oplus K_4, K_2 \oplus K_4)$. The decorrelation module is $M(xy) = (xy \oplus K_5K_6) \times K_7K_8 \bmod GF(2^{64})$ where x and y are 32-bit data. The multiplication is over the finite field $GF(2^{64})$, which is defined by the polynomial $x^{64} + x^{11} + x^2 + x + 1$ over GF(2). One Feistel round is defined as $F_i(x, y) = (y, x \oplus \phi(ROL_{11}(\phi(y \oplus k_i)) + c \bmod 2^{32}))$ where $\phi(x) = x + 256 \cdot S(x \bmod 256) \bmod 2^{32}$, c = B7E15162 (hexadecimal) and S is an 8 × 24 S-box defined in [8].

A. A Brief Description of Differential-Linear Cryptanalysis

Langford and Hellman introduced differential-linear cryptanalysis in [13]. The attack uses a differential characteristic that induces a linear relation between two intermediate values with probability one. Then, Biham et al. [4] improved the technique so that the differential part may have a probability smaller than 1.

Let E be a block cipher which is composed of two subciphers $E_0$ and $E_1$, i.e. $E = E_1 \circ E_0$. Let $\Omega_P \to \Omega_T$ be a differential for $E_0$ with probability 1 and $\lambda_P \to \lambda_T$ be a linear approximation for $E_1$ with probability 1/2 + q. Their attack requires that the bits masked in $\lambda_P$ have a zero difference in $\Omega_T$, i.e. $\Omega_T \cdot \lambda_P = 0$. Let $P_1$ and $P_2$ be two plaintexts satisfying $P_1 \oplus P_2 = \Omega_P$, and let $C_1$ and $C_2$ be their corresponding ciphertexts. Then we have $\lambda_P \cdot E_0(P_1) = \lambda_P \cdot E_0(P_2)$ since $\Omega_T \cdot \lambda_P = 0$. According to the linear relation, $\lambda_P \cdot E_0(P_1) = \lambda_T \cdot E_1(E_0(P_1))$ is satisfied with probability 1/2 + q, and $\lambda_P \cdot E_0(P_2) = \lambda_T \cdot E_1(E_0(P_2))$ is satisfied with probability 1/2 + q. Since $\lambda_P \cdot E_0(P_1) = \lambda_P \cdot E_0(P_2)$, the relation $\lambda_T \cdot C_1 = \lambda_T \cdot C_2$ is satisfied with probability $1/2 + 2q^2$. This is called a differential-linear distinguisher. It makes it possible to attack the given block cipher by taking many plaintext pairs with the desired difference and checking whether the ciphertexts agree on the parity of the masked subset of bits. Afterwards, Biham et al. used a differential with probability less than 1, i.e. p < 1. Then, the probability of the distinguisher becomes $p(1/2 + 2q^2) + (1 - p) \cdot 1/2 = 1/2 + 2pq^2$. Furthermore, even if $\Omega_T \cdot \lambda_P = 1$, the attack is still applicable by checking $\lambda_T \cdot C_1 \ne \lambda_T \cdot C_2$ instead of checking $\lambda_T \cdot C_1 = \lambda_T \cdot C_2$. The data complexity of the attack is $O(p^{-2} q^{-4})$. Note that, since we only consider the masked bits of the output difference of the differential characteristic, the bias will be $4pq^2$.

B. The Attack

The attack recovers 15 bits of the last round subkey. It is based on a 7-round differential-linear distinguisher which is constructed as follows: they use the 4-round differential characteristic with probability $0.83 \cdot 2^{-4}$ [5]

\[ (e_{19}, e_{18}) \to (e_{18} \oplus e_{8}, e_{29}) \to (e_{29}, e_{18}) \to (e_{18}, 0) \to (0, e_{18}). \]

The fixed input difference to the decorrelation module results in a fixed but unknown output difference, because $K_7$ and $K_8$ are unknown. Let us call this differential $\Omega_{module}$. The unknown output difference of the decorrelation module does not prevent the attack as long as the result of $\Omega_{module} \cdot \lambda_P$ is a constant value. This result is unknown to the attacker; however, it does not affect the analysis. For the linear part, they use a 3-round linear relation with $\lambda_P = \lambda_T = (000008D7_x, 00000001_x)$ with probability 1/2 + q = 1/2 + 0.0364. Therefore, the bias of the distinguisher is $4pq^2 \approx 1/3638$. For the key recovery attack, we need to check the following equation:

\[ \left((000008D7_x) \cdot (C_R \oplus C_R^*)\right) \oplus \left((00000001_x) \cdot (f(C_R, C_R^*, k_8) \oplus C_L \oplus C_L^*)\right) = 0 \tag{8} \]

where $C = (C_L, C_R)$ and $C^* = (C_L^*, C_R^*)$ are the ciphertexts corresponding to P and P* satisfying $P \oplus P^* = \Omega_P$. The only unknown value in this equation is the least significant bit of the output of the f-function. Therefore, we need to guess the 22 least significant bits of $k_8$. However, the authors avoid guessing all 22 bits of the subkey in order to decrease the time complexity. If the 8 least significant bits and the 7 bits in positions [22 ∼ 16] are guessed and the remaining 7 bits are not guessed, the bias will be $4pq^2 \cdot (1 - 2^{-7+1}) \approx \frac{1}{3700}$, since the 7 unguessed bits cause an error in that bit (the 22nd bit) with probability $2^{-7}$. The attack procedure goes as follows:
• Initialize $2^{15}$ counters to 0.
• Take $8 \cdot (\frac{1}{3700})^{-2} \approx 2^{26.7}$ plaintext-ciphertext pairs $(P_i, C_i)$ and $(P_j^*, C_j^*)$ such that $P_i \oplus P_j^* = \Omega_P$.
• For each possible 15-bit subkey value of $k_8$, check for each ciphertext pair whether Equation (8) is satisfied. If yes, increment the counter of that key guess by 1.
• Take the key with the maximal bias, that is, the key for which |Count/N − 1/2| is maximal, and adopt this key candidate as the correct subkey value.

The data complexity of the attack is $2 \cdot 2^{26.7} = 2^{27.7}$ chosen plaintexts and their corresponding ciphertexts. The time complexity is $\frac{1}{8} \cdot 2^{27.7} \cdot 2^{15} = 2^{39.7}$ full COCONUT98 encryptions. The memory complexity is the same as the data complexity.

The time complexity can be reduced by precomputing the last round for every 15-bit ciphertext value and 15-bit key value. This costs $2^{15} \cdot 2^{15} = 2^{30}$ last round encryptions. Then, the time complexity of the attack becomes $2^{39.7}$ memory accesses to the precomputed table, which is equivalent to $2^{33.7}$ full COCONUT98 encryptions. The memory complexity is dominated by data storage.

Finally, the success rate of the attack is computed as 75.46%.

VI. MY FUTURE RESEARCH PROPOSAL

My future research will mainly concentrate on the provable security of block ciphers by using Decorrelation Theory. Decorrelation Theory has a very important role in the provable security of block ciphers, since it makes it possible to design provably secure block ciphers such as the practical block cipher C with its security guarantees. However, the theory has some drawbacks. So far, all proofs are done with respect to perfect decorrelation, which has results similar to Shannon's perfect secrecy theory. In addition, Vaudenay also proved that perfect decorrelation implies perfect secrecy. According to Shannon's theorem, if a cipher C has perfect secrecy for any distribution of plaintexts over the plaintext space, that is, $H(X \mid C(X)) = H(X)$ where H denotes Shannon entropy, then $H(C) \ge \log_2 |\mathcal{M}|$. This says that the key length should be greater than or equal to the plaintext length for perfect secrecy. For perfect decorrelation, Vaudenay proved a similar result for a block cipher in [3] (Theorem 8):

\[ H(C) \ge d \cdot \log_2 |\mathcal{M}| + d \cdot \log_2\left(1 - \frac{d}{|\mathcal{M}|}\right) \]

which shows that if we want to prove the security of the cipher against d-limited adversaries for d > 2, the key has to be very long. To cope with this, a new notion of nonperfect security can be defined for Shannon's theorem and related to nonperfect decorrelation. In other words, defining nonperfect security notions and formalizing nonperfect decorrelation can help to reduce the key size. In this way, we expect to need a smaller key size and still have a cipher that is secure to some extent. This is what we want to improve in my future work.

Secondly, in my future research, we will also focus on Theorem 1, which proves that if a block cipher has a small 2d-wise decorrelation bias, it is secure against iterated attacks of order d. However, in order to resist iterated attacks of order d, the cipher may need a smaller decorrelation order, which is less than 2d. If the required decorrelation order is exactly determined, it takes less work to prove the security of the block cipher against iterated attacks of order d. In addition, this theorem may be rewritten in terms of nonperfect decorrelation, too.

REFERENCES

[1] T. Baignères and S. Vaudenay. Proving the security of AES substitution-permutation network. In B. Preneel and S. E. Tavares, editors, Selected Areas in Cryptography, SAC'05, vol. 3897 of LNCS, pp. 65-81, Springer-Verlag, 2006.
[2] T. Baignères and M. Finiasz. Dial C for Cipher. In Biham and Youssef, editors, Selected Areas in Cryptography, SAC'06, vol. 4356 of LNCS, pp. 76-95, Springer-Verlag, 2007.
[3] S. Vaudenay. Decorrelation: a theory for block ciphers security. Journal of Cryptology, 16(4), pp. 249-286, 2003.
[4] E. Biham, O. Dunkelman, N. Keller. Enhancing Differential-Linear Cryptanalysis. In Advances in Cryptology ASIACRYPT'02, vol. 2501 of LNCS, pp. 254-266, Springer-Verlag, 2002.
[5] D. Wagner. The Boomerang Attack. Proceedings of FSE'99, vol. 1636 of LNCS, pp. 156-170, 1999.
[6] S. Vaudenay. Resistance Against General Iterated Attacks. In Advances in Cryptology EUROCRYPT'99, vol. 1592 of LNCS, pp. 255-271, Springer-Verlag, 1999.
[7] S. Vaudenay. Feistel Ciphers with L2-Decorrelation. In Selected Areas in Cryptography, vol. 1556 of LNCS, pp. 1-14, Springer-Verlag, 1999.
[8] S. Vaudenay. Provable Security for Block Ciphers by Decorrelation. In STACS 98, vol. 1373 of LNCS, pp. 249-275, Springer-Verlag, 1998.
[9] K. Nyberg and L. R. Knudsen. Provable security against differential cryptanalysis. Journal of Cryptology, vol. 8, pp. 27-37, 1995.
[10] F. Chabaud, S. Vaudenay. Links between Differential and Linear Cryptanalysis. In Advances in Cryptology EUROCRYPT'94, vol. 950 of LNCS, pp. 356-365, Springer-Verlag, 1995.
[11] M. Matsui. Linear Cryptanalysis Methods for DES Cipher. In Advances in Cryptology EUROCRYPT'93, vol. 765 of LNCS, pp. 386-397, Springer-Verlag, 1994.
[12] M. Matsui. The First Experimental Cryptanalysis of Data Encryption Standard. In Advances in Cryptology CRYPTO'94, vol. 839 of LNCS, pp. 1-11, Springer-Verlag, 1994.
[13] S. K. Langford, M. Hellman. Differential-Linear Cryptanalysis. In Advances in Cryptology, proceedings of CRYPTO'94, vol. 839 of LNCS, pp. 17-25, 1994.
[14] E. Biham and A. Shamir. Differential cryptanalysis of DES-like cryptosystems. In CRYPTO'90: Proceedings of the 10th Annual International Cryptology Conference on Advances in Cryptology, pp. 2-21, Springer-Verlag, 1991.

[15] L. Blum, M. Blum, and M. Shub. A simple unpredictable pseudorandom number generator. SIAM J. Comput., 15(2), pp. 364-383, 1986.
[16] C. E. Shannon. Communication Theory of Secrecy Systems. Bell System Technical Journal, vol. 28, pp. 656-715, 1949.