
Chapter 5 - Integral cryptanalysis.

James McLaughlin

1 Introduction.

The history of integral cryptanalysis is a little complicated, and the most important papers to study regarding it are not in fact the ones in which it was first defined. We give a brief recap here.

In 1997, Daemen, Knudsen, and Rijmen published a paper [3] describing a new cipher. This cipher, SQUARE, was a forerunner of Rijndael [10], the eventual AES, and was designed using the same wide trail strategy to provide security against differential and linear cryptanalysis. However, while working on the paper, the authors discovered a new kind of chosen-plaintext attack which broke six rounds of SQUARE. Since SQUARE had originally been designed with this many rounds, the authors were forced to add more rounds and to publish details of the attack as well as the cipher. The attack - not given a name at the time, but later referred to as the “Square attack” - did not scale well to more rounds, and also bore certain similarities to linear and differential cryptanalysis. Because of this, and because of the level of diffusion stipulated by the wide trail strategy, the authors decided that only two extra rounds needed to be added to the cipher to defeat the attack.

Although the attack was specific to SQUARE, the similarities between SQUARE, Rijndael, and the cipher CRYPTON [8] meant that it could easily be adapted to these ciphers. Again, it broke six rounds of Rijndael, in an attack much the same as the attack against SQUARE, and the creator of CRYPTON conjectured [8] that it would not break more than six rounds of that cipher either. Lucks [9] went on to generalise the attack to non-SQUARE-like ciphers. He used it to attack the cipher Twofish [4], and renamed his generalised attack the “saturation attack”. Other ciphers were attacked using modifications of the original techniques (in papers including [1], [5], [2]) until Knudsen and Wagner [7] drew the various divergent techniques together into a single framework, coined the term integral cryptanalysis, and defined the more powerful higher-order integral attack.

2 Technical details

Definition 2.1. The definition of a set is such that it cannot contain duplicate elements. A multiset can, however, although like a set no ordering on its elements is necessarily defined. For example, {1, 1, 2} is a multiset containing 1 twice and 2 once. It is equivalent to {1, 2, 1} but not to {1, 2}.
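As a small illustration of Definition 2.1, Python's collections.Counter can stand in for a multiset. This is only a sketch; the helper name multiset is ours and does not appear in the literature cited here.

```python
from collections import Counter

def multiset(items):
    """Model a multiset as a mapping from each element to its multiplicity."""
    return Counter(items)

a = multiset([1, 1, 2])
b = multiset([1, 2, 1])
c = multiset([1, 2])

print(a == b)  # True: the order of the elements is irrelevant
print(a == c)  # False: the multiplicities of 1 differ
print(set([1, 1, 2]) == set([1, 2]))  # True: an ordinary set discards duplicates
```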

When referring to a “set” of chosen or known plaintexts, there are times when “multiset” would in fact be correct, since these “sets” can contain duplicates. We make the distinction here because integral cryptanalysis has its roots in multiset theory. The structures of chosen plaintexts it takes as input are typically larger than pairs.

Usually, some division of the plaintext block into w “words” is defined that corresponds to the way in which the cipher operates. For instance, AES treats the data as a 4x4 array of bytes, and operates on it bytewise, so the example attacks on reduced-round AES use these bytes as the words. The plaintexts will be chosen so that the multiset of all their ith words (1 ≤ i ≤ w) satisfies some specific property for at least one value of i - for instance, every plaintext’s 12th word might be the zero vector. (In practice, it would seem unlikely that specifying a property for merely one value of i would be sufficient to be able to predict properties of the data after several rounds of diffusion; indeed, we may need to specify a property for every i.)

Before we continue, it should be noted that not every cipher operates “wordwise” on small blocks of data. AES, as we have noted, does, but DES operates on individual bits instead, and this seems to have contributed to its being much less vulnerable to integral cryptanalysis than its successor.

Definition 2.2. Like linear approximations or differential characteristics, an integral “covers” several rounds of the cipher in sequence, and describes how we expect the specified properties to be affected by each successive round. In particular, we expect with some probability that at least one word of the output at the end of these rounds will satisfy a particular property.

For example, this diagram (the symbols are explained below) shows a three-round integral for AES:

(Diagram taken from [6])

The symbols (here ordered by the amount of information each specifies) have the following meanings:

C: A word labelled C is such that every plaintext/ciphertext/interim data block in our multiset has the same value for that word. (Note that if the ith and jth words are both labelled C, the value for the ith words isn’t necessarily the same as that for the jth words.) Now, a typical integral attack uses a number m of chosen plaintexts equal to the number k of possible words, and in such a situation labelling the ith word C also means that the sum of all the ith words is the zero vector.

A: If the ith word is labelled A, then no two data blocks have the same value for this word. If m = k, this specifies even more information, to wit that every possible value for the ith word occurs in precisely one block, and that we can predict the sum of these words. (If each word is a string of at least two bits, the addition operation will probably be bitwise xor (⊕) and the sum will be the zero vector.) Explaining why the A symbol seems to give us less information than the C symbol would be a little complicated, but note the way in which, in the examples, the Cs are replaced with As as the rounds progress. This is quite strong evidence of information being lost as a result of the cipher’s diffusion properties.

S: If the ith word is labelled S, the sum of all the ith words can be predicted (the attacker must be able to specify what that sum is when drawing up the attack). As already stated, the addition operation used to define “sum” will probably be ⊕ - we are not aware of any integral attacks using addition with carry here. If m = k, we know that this property holds for words labelled with the C or A symbols, and so the S symbol gives us less information than either. If this is not the case, we know that there are some situations where S does provide more information than A, however. In addition, this is the only one of the various properties that, if specified at the end of the integral, would not necessarily occur if the wrong TPS was tried in the final round. For this reason, it is vitally important that we have at least one S in the last round of the integral.

There is also one more symbol, which did not occur above:

?: The question mark tells us that no information is known or can be predicted about the multiset of ith words. (A diagram in the next section depicts an integral for a different cipher, in which an increasing number of words are labelled with ? as the integral nears its final round.)
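The four symbols can be made concrete with a short Python sketch that inspects the multiset of ith words and reports the most informative label it satisfies. The function name classify is ours, and testing S as “the ⊕ sum is zero” is an assumption made for simplicity: the text only requires that the sum be predictable.

```python
from functools import reduce
from operator import xor

def classify(words):
    """Return the first of C, A, S, ? that the multiset of words satisfies.

    `words` holds the i-th word of every text in the structure, as integers.
    S is checked as "XOR sum equals zero" (an assumption of this sketch).
    """
    distinct = set(words)
    if len(distinct) == 1:
        return "C"          # every text has the same value in this word
    if len(distinct) == len(words):
        return "A"          # no two texts share a value in this word
    if reduce(xor, words, 0) == 0:
        return "S"          # only the sum is known
    return "?"              # nothing can be said

print(classify([7] * 256))          # C
print(classify(list(range(256))))   # A
print(classify([1, 1, 2, 2]))       # S
print(classify([1, 1, 2]))          # ?
```

Note that with m = k = 256 the C and A examples above also have a zero ⊕ sum, which matches the remark that S conveys less information than either in that setting.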

More such symbols may be defined in future research (indeed a variant attack in the next section replaces A with two new symbols). However, regardless of the symbols used, the attacker will still need to devise an integral with at least one symbol like S in its last round, to create a property observable for the correct TPS but not for at least some of the incorrect guesses.

Returning to the integral described above, the reader may well be surprised to discover that it holds with probability 1! In fact, this is also the case for every one of the example integrals in [6]! We will therefore assume for the remainder of this chapter that any integral we refer to has probability 1 - although this is not necessarily so, the specified relations do frequently determine the behaviour of cipher components with 100% probability. This can be explained in part by noting that the nature of the relations and the number of words input do give us more information than a single function input (or pair thereof) with some simple property would in predicting the output. (That said, it may be that accepting the loss of such determinism as the rounds progress, and defining new relations which only hold with certain probability in later rounds, may turn out to be necessary to extend the attack any further.)
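The probability-1 behaviour can be checked experimentally. The Python sketch below assumes the well-known three-round AES pattern in which one plaintext byte is labelled A and the other fifteen are labelled C; since the diagram is not reproduced here, that starting pattern is our assumption. The round keys are drawn at random rather than derived from the AES key schedule, because the property does not depend on the key. All function names are ours.

```python
import random

def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def make_sbox():
    """Build the AES S-box: field inversion (x^254) followed by the affine map."""
    box = []
    for x in range(256):
        inv = 1
        for _ in range(254):
            inv = gmul(inv, x)
        box.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return box

SBOX = make_sbox()

def aes_round(state, round_key):
    """One full AES round on a 16-byte state stored column by column."""
    s = [SBOX[b] for b in state]                                        # SubBytes
    s = [s[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]  # ShiftRows
    out = []
    for c in range(4):                                                  # MixColumns
        a = s[4 * c:4 * c + 4]
        for r in range(4):
            out.append(gmul(2, a[r]) ^ gmul(3, a[(r + 1) % 4]) ^ a[(r + 2) % 4] ^ a[(r + 3) % 4])
    return [o ^ k for o, k in zip(out, round_key)]                      # AddRoundKey

def three_round_sums(seed=0):
    """XOR-sum every state byte after three rounds over 256 chosen plaintexts."""
    rng = random.Random(seed)
    keys = [[rng.randrange(256) for _ in range(16)] for _ in range(4)]  # whitening + 3 rounds
    base = [rng.randrange(256) for _ in range(16)]                      # the fifteen C bytes
    sums = [0] * 16
    for v in range(256):
        state = base[:]
        state[0] = v                                                    # the single A byte
        state = [s ^ k for s, k in zip(state, keys[0])]
        for rk in keys[1:]:
            state = aes_round(state, rk)
        sums = [x ^ y for x, y in zip(sums, state)]
    return sums

print(three_round_sums())  # sixteen zeros: every byte is labelled S
```

Changing the seed changes the keys and the constant bytes, but under these assumptions the sums remain zero, which is what “probability 1” means here.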

2.1 Using integrals in an attack

Let’s assume we’ve chosen an integral covering the first (r − 1) rounds, where r is the total number of rounds. We can then attack the cipher in the same way as we would with differential or linear cryptanalysis. That is, we choose a subset of the ciphertext words such that some property of their partial decryptions is predicted by the integral, and define the TPS to be the set of bits affecting those words in the final round. For each candidate TPS, we decrypt those words in all of the ciphertexts. Any TPS for which the multiset of decrypted words does not satisfy the predicted property can be instantly eliminated.

We can use integrals covering different subsets of rounds in much the same way as we would with differential cryptanalysis. [6] uses the three-round integral shown earlier to cover rounds 2, 3, and 4 of a reduced six-round AES. We count on TPS candidates which include one byte from the final-round key, and four bytes from the previous round key, corresponding to the data in that round that would affect the fifth byte in the next. It is also necessary to count on four more bytes in the first round, and to use the results of this to select 256 plaintexts from a total of 2^32 before partially decrypting their corresponding ciphertexts.

Not all ciphers are as easy to define words and integrals for as AES. DES, as we’ve stated, is one example - although it divides the data up into two 32-bit blocks, it operates on those blocks at the level of individual bits, and the expansion and permutation make it hard to see any way of subdividing the blocks into words. Various two-round integrals were still found for DES by treating the 32-bit blocks as the words, then dividing them up into smaller subwords that didn’t correspond to the 6-bit S-box inputs but still allowed properties of the multisets of S-box outputs to be predicted. (Insofar as we can currently tell, however, these integrals can only be used to attack versions of DES reduced to three or four rounds.)
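The elimination step can be sketched on a deliberately tiny model. The Python below is not an attack on any real cipher: it assumes a hypothetical final round acting on a single byte, c = S(x) ⊕ key with a randomly chosen bijective S-box, and it assumes the integral predicts that the words x entering that round are labelled S with sum zero. The names surviving_keys and toy_experiment are ours.

```python
import random
from functools import reduce
from operator import xor

def surviving_keys(ciphertext_words, inv_sbox, predicted_sum=0):
    """Keep each one-byte key guess whose partial decryptions XOR to predicted_sum."""
    survivors = []
    for guess in range(256):
        partial = (inv_sbox[c ^ guess] for c in ciphertext_words)
        if reduce(xor, partial, 0) == predicted_sum:
            survivors.append(guess)
    return survivors

def toy_experiment(seed):
    """Build a toy last round and return (true key byte, surviving guesses)."""
    rng = random.Random(seed)
    sbox = list(range(256))
    rng.shuffle(sbox)                        # hypothetical bijective S-box
    inv_sbox = [0] * 256
    for x, y in enumerate(sbox):
        inv_sbox[y] = x
    key = rng.randrange(256)
    # Words entering the last round: the XOR of two permutations of 0..255,
    # so their sum is zero (label S) without their being all-distinct in general.
    p, q = list(range(256)), list(range(256))
    rng.shuffle(p)
    rng.shuffle(q)
    words = [a ^ b for a, b in zip(p, q)]
    ciphertext_words = [sbox[w] ^ key for w in words]
    return key, surviving_keys(ciphertext_words, inv_sbox)

key, survivors = toy_experiment(0)
print(key in survivors)   # True: the correct guess always satisfies the S property
print(len(survivors))     # usually small; wrong guesses survive only by chance
```

In this model a wrong guess passes the check with probability of roughly 1/256, so repeating the experiment with a second structure of texts and intersecting the survivors would normally isolate the correct byte. Had the words been labelled A instead, every guess would have produced a permutation and nothing could have been eliminated, which is why an S-like symbol is needed in the last round.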

Definition 2.3. The term “integral” is also used in another context. Given a multiset S of vectors, an integral for S is the sum of all the vectors in S. (Note that the vectors will typically be strings of bits, and that the addition operation will probably be ⊕.)
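Under the usual assumption that the vectors are bit strings added with ⊕, Definition 2.3 amounts to the following Python sketch. The function name integral and the encoding of vectors as integers are our choices.

```python
from functools import reduce
from operator import xor

def integral(vectors):
    """XOR together every vector in the multiset (vectors encoded as integers)."""
    return reduce(xor, vectors, 0)

print(integral([0b01, 0b01, 0b10]))  # 2: the duplicated vector cancels itself
print(integral(range(256)))          # 0: all 256 byte values sum to the zero vector
```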

3 Higher-order integrals.

As stated in the previous section, a typical integral attack uses a number m of chosen plaintexts equal to the number k of possible words. This may, for instance, be because only one word varies between the plaintexts, and that word takes all k possible values. But what if we were to use a number of chosen plaintexts equal to k^d for some integer d > 1? Would it be possible to predict properties, not only of individual words, but of concatenations of d words? The answer is that it would, and integrals predicting such properties and working with the correspondingly larger amounts of data are known as dth-order integrals. We can use dth-order integrals to improve on the effectiveness of integral cryptanalysis. To keep the notation consistent with [7], we will say that the number of chosen plaintexts is equal to m^d, which in turn is equal to k^d. Let us first of all define two new symbols to replace A:

A^d: A word labelled A^d takes each of the k possible values precisely m^{d−1} = k^{d−1} times.

A_i^d: A set of d words, all labelled A_i^d for the same value of i, is such that the concatenation of these words takes all m^d possible values precisely once.
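The difference between the two symbols can be seen on a small example. The Python sketch below uses 2-bit words (k = 4) and d = 2 purely for illustration; the function names is_A_d and is_A_i_d are ours, and words are encoded as integers in range(k).

```python
from collections import Counter
from itertools import product

def is_A_d(words, k, d):
    """A^d: each of the k word values occurs exactly k**(d-1) times."""
    counts = Counter(words)
    return len(words) == k ** d and all(counts[v] == k ** (d - 1) for v in range(k))

def is_A_i_d(word_tuples, k, d):
    """A_i^d: the concatenation of the d words takes each of the k**d values once."""
    return (len(word_tuples) == k ** d
            and set(word_tuples) == set(product(range(k), repeat=d)))

k, d = 4, 2

# Structure 1: the two words jointly run through all 16 combinations.
full = list(product(range(k), repeat=d))
print(is_A_i_d(full, k, d))                      # True
print(is_A_d([t[0] for t in full], k, d))        # True: A_i^d implies A^d

# Structure 2: both words always equal each other, each value used 4 times.
diagonal = [(v, v) for v in range(k)] * k
print(is_A_d([t[0] for t in diagonal], k, d))    # True: each word alone is A^d
print(is_A_i_d(diagonal, k, d))                  # False: only 4 of 16 pairs occur
```

The second structure shows why the two symbols are distinguished: each word on its own satisfies A^d, yet the pair does not satisfy A_i^d.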

Since any word labelled A_i^d also satisfies the A^d property, A_i^d gives us more information than A^d. It still gives us less information than C, and both symbols convey more information than S since the sums of the words they label can still be predicted.

For example, this diagram from [6] shows a four-round fourth-order integral for AES, requiring 2^32 chosen plaintexts:

It can be used to launch an attack on six-round AES which is faster than the one described previously, but which only recovers five key bytes instead of nine. In this attack, rounds 1 to 4 are the ones covered by the new integral.

This diagram, also from [6], shows a second-order integral for Nyberg’s “Generalised Feistel Network” cipher [11]. As mentioned earlier, observe how the number of words about which we can predict nothing (those labelled ?) increases rapidly in the later rounds:

Finally, we note that a near-generalisation of Definition 2.3’s “integrals” to the higher-order case also exists:

Definition 3.1. Consider a multiset of k^d chosen plaintexts, such that for one particular d-subset of the set of words, the concatenation of these words takes all possible values at least once, and all other words remain constant. The sum of the plaintexts in this set is known as a dth-order integral. (Note that an integral according to Definition 2.3 is not equivalent to a first-order integral here.)

References

[1] P.S.L.M. Barreto, H.Y. Kim, J. Nakahara Jr, B. Preneel, V. Rijmen, and J. Vandewalle. Improved SQUARE attacks against reduced-round HIEROCRYPT. In M. Matsui, editor, Proceedings of the Eighth International Workshop on Fast Software Encryption (FSE 2001), volume 2355 of Lecture Notes in Computer Science, pages 165–173. IACR, Springer, April 2001.

[2] P.S.L.M. Barreto, H.Y. Kim, J. Nakahara Jr, B. Preneel, and J. Vandewalle. SQUARE attacks on reduced-round PES and IDEA block ciphers. Cryptology ePrint Archive, Report 2001/068, August 2001. http://eprint.iacr.org/2001/068.

[3] J. Daemen, L. Knudsen, and V. Rijmen. The block cipher SQUARE. In E. Biham, editor, Proceedings of the Fourth International Workshop on Fast Software Encryption (FSE 1997), volume 1267 of Lecture Notes in Computer Science, pages 149–165. IACR, Springer, January 1997.

[4] N. Ferguson, C. Hall, J. Kelsey, B. Schneier, D. Wagner, and D. Whiting. Twofish: A 128-bit block cipher, June 1998. http://www.schneier.com/paper-twofish-paper.pdf.

[5] Yeping He and Sihan Qing. Square attack on reduced Camellia cipher. In Sihan Qing, Tatsuaki Okamoto, and Jianying Zhou, editors, Proceedings of the 3rd International Conference on Information and Communications Security (ICICS 2001), volume 2229 of Lecture Notes in Computer Science, pages 238–245. Springer, November 2001.

[6] L.R. Knudsen and D. Wagner. Integral cryptanalysis. Hosted on NESSIE project website, December 2001. Full version of FSE 2002 paper. https://www.cosic.esat.kuleuven.be/nessie/reports/phase2/uibwp5-015-1..

[7] L.R. Knudsen and D. Wagner. Integral cryptanalysis (extended abstract). In J. Daemen and V. Rijmen, editors, Proceedings of the Ninth International Workshop on Fast Software Encryption (FSE 2002), volume 2365 of Lecture Notes in Computer Science, pages 629–632. IACR, Springer, February 2002.

[8] C.H. Lim. A revised version of CRYPTON: CRYPTON v1.0. In L.R. Knudsen, editor, Proceedings of the Sixth International Workshop on Fast Software Encryption (FSE 1999), volume 1636 of Lecture Notes in Computer Science, pages 31–45. IACR, Springer, March 1999.

[9] S. Lucks. The saturation attack - a bait for Twofish. In M. Matsui, editor, Proceedings of the Eighth International Workshop on Fast Software Encryption (FSE 2001), volume 2355 of Lecture Notes in Computer Science, pages 187–205. IACR, Springer, April 2001.

[10] National Institute of Standards and Technology (NIST). Advanced Encryption Standard (FIPS PUB 197), November 2001. http://www.csrc.nist.gov/publications/fips/fips197/fips-197.pdf.

[11] K. Nyberg. Generalized Feistel networks. In Kwangjo Kim and Tsutomu Matsumoto, editors, Advances in Cryptology - Asiacrypt ’96, volume 1163 of Lecture Notes in Computer Science, pages 91–104. IACR, Springer, November 1996.