Integral Cryptanalysis

Chapter 5 - Integral cryptanalysis

James McLaughlin

1 Introduction

The history of integral cryptanalysis is a little complicated, and the most important papers to study are not in fact the ones in which it was first defined. We give a brief recap here.

In 1997, Daemen, Knudsen, and Rijmen published a paper [3] describing a new cipher. This cipher, SQUARE, was a forerunner of Rijndael [10], the eventual AES, and was designed using the same wide trail strategy to provide security against differential and linear cryptanalysis. However, while working on the paper, the authors discovered a new kind of chosen-plaintext attack which broke six rounds of SQUARE. Since SQUARE had originally been designed with this many rounds, the authors were forced to add more rounds and to publish details of the attack as well as the cipher.

The attack - not given a name at the time, but later referred to as the “Square attack” - did not scale well to more rounds, and also bore certain similarities to linear and differential cryptanalysis. Because of this, and because of the level of diffusion stipulated by the wide trail strategy, the authors decided that only two extra rounds needed to be added to the cipher to defeat the attack.

Although the attack was specific to SQUARE, the similarities between SQUARE, Rijndael, and the cipher CRYPTON [8] meant that it could easily be adapted to these ciphers. Again, it broke six rounds of Rijndael, in an attack much the same as the attack against SQUARE, and the creator of CRYPTON conjectured [8] that it would not break more than six rounds of that cipher either.

Lucks [9] went on to generalise the attack to non-SQUARE-like ciphers. He used it to attack the Feistel cipher Twofish [4], and renamed his generalised attack the “saturation attack”.
Other ciphers were attacked using modifications of the original techniques (in papers including [1], [5], [2]) until Knudsen and Wagner [7] drew the various divergent techniques together into a single framework, coined the term “integral cryptanalysis”, and defined the more powerful higher-order integral attack.

2 Technical details

Definition 2.1. A set, by definition, cannot contain duplicate elements. A multiset can, although, as with a set, no ordering on its elements is necessarily defined. For example, {1, 1, 2} is a multiset containing 1 twice and 2 once. It is equivalent to {1, 2, 1} but not to {1, 2}.

When we refer to a “set” of chosen or known plaintexts, these “sets” can contain duplicates, so there are times when “multiset” would in fact be the correct term. We make the distinction here because integral cryptanalysis has its roots in multiset theory.

The structures of chosen plaintexts that integral cryptanalysis takes as input are typically larger than pairs. Usually, some division of the plaintext block into w “words” is defined that corresponds to the way in which the cipher operates. For instance, AES treats the data as a 4x4 array of bytes, and operates on it bytewise, so the example attacks on reduced-round AES use these bytes as the words.

The plaintexts will be chosen so that the multiset of all their ith words (1 ≤ i ≤ w) satisfies some specific property for at least one value of i - for instance, every plaintext’s 12th word might be the zero vector. (In practice, it seems unlikely that specifying a property for merely one value of i would be sufficient to predict properties of the data after several rounds of diffusion; indeed, we may need to specify a property for every i.)

Before we continue, it should be noted that not every cipher operates “wordwise” on small blocks of data.
AES, as we have noted, does, but DES operates on individual bits instead, and this seems to have contributed to its being much less vulnerable to integral cryptanalysis than its successor.

Definition 2.2. Like a linear approximation or a differential characteristic, an integral “covers” several rounds of the cipher in sequence, and describes how we expect the specified properties to be affected by each successive round. In particular, we expect with some probability that at least one word of the output at the end of these rounds will satisfy a particular property.

For example, this diagram (the symbols are explained below) shows a three-round integral for AES:

(Diagram taken from [6])

The symbols (here ordered by the amount of information each specifies) have the following meanings:

C: A word labelled C is such that every plaintext/ciphertext/interim data block in our multiset has the same value for that word. (Note that if the ith and jth words are both labelled C, the value for the ith words isn’t necessarily the same as that for the jth words.) Now, a typical integral attack uses a number m of chosen plaintexts equal to the number k of possible word values, and in such a situation labelling the ith word C also means that the sum of all the ith words is the zero vector.

A: If the ith word is labelled A, then no two data blocks have the same value for this word. If m = k, this specifies even more information, to wit that every possible value for the ith word occurs in precisely one block, and that we can predict the sum of these words. (If each word is a string of at least two bits, the addition operation will probably be bitwise xor (⊕) and the sum will be the zero vector.) Explaining why the A symbol seems to give us less information than the C symbol would be a little complicated, but note the way in which, in the examples, the Cs are replaced with As as the rounds progress.
This is quite strong evidence of information being lost as a result of the cipher’s diffusion properties.

S: If the ith word is labelled S, the sum of all the ith words can be predicted (the attacker must be able to specify what that sum is when drawing up the attack). As already stated, the addition operation used to define “sum” will probably be ⊕ - we are not aware of any integral attacks using addition with carry here. If m = k, we know that this property also holds for words labelled with the C or A symbols, and so the S symbol gives us less information than either. If m ≠ k, however, there are some situations where S does provide more information than A. In addition, this is the only one of the various properties that, if specified at the end of the integral, would not necessarily occur if the wrong TPS were tried in the final round. For this reason, it is vitally important that we have at least one S in the last round of the integral.

There is also one more symbol, which did not occur above:

?: The question mark tells us that no information is known or can be predicted about the multiset of ith words. (A diagram in the next section depicts an integral for a different cipher, in which an increasing number of words are labelled with ? as the integral nears its final round.)

More such symbols may be defined in future research (indeed, a variant attack in the next section replaces A with two new symbols). However, regardless of the symbols used, the attacker will still need to devise an integral with at least one symbol like S in its last round, to create a property observable for the correct TPS but not for at least some of the incorrect guesses.

Returning to the integral described above: the reader may well be surprised to discover that it holds with probability 1! In fact, this is also the case for every one of the example integrals in [6]!
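The C, A and S properties defined above can be checked mechanically on a multiset of words. The following is a minimal sketch in Python, assuming 8-bit words, m = k = 256 texts, and bitwise xor as the sum. The S-box is a randomly shuffled permutation invented for illustration (it is not the AES S-box), and the helper names are hypothetical.

```python
import random
from functools import reduce
from operator import xor

WORD_BITS = 8
K = 1 << WORD_BITS          # k: number of possible word values (here m = k)

def is_constant(words):
    """Property C: every text has the same value in this word."""
    return len(set(words)) == 1

def is_all(words):
    """Property A: no two texts share a value in this word."""
    return len(set(words)) == len(words)

def xor_sum(words):
    """The 'sum' used by property S, taken as bitwise xor."""
    return reduce(xor, words, 0)

# Hypothetical bijective S-box: a fixed random permutation of 0..255.
rng = random.Random(0)
sbox = list(range(K))
rng.shuffle(sbox)

all_words = list(range(K))      # an A word: each value occurs exactly once
const_words = [0x3A] * K        # a C word: arbitrary constant

after_sbox = [sbox[w] for w in all_words]            # A through a bijection
shifted = [w ^ 0x5C for w in all_words]              # A xor a constant
mixed = [a ^ b for a, b in zip(all_words, after_sbox)]  # xor of two A words

print(is_all(after_sbox), is_all(shifted))   # A is preserved in both cases
print(xor_sum(const_words), xor_sum(all_words), xor_sum(mixed))
```

In this sketch both the C word and the A word sum to zero, as stated for the case m = k. The xor of two A words still sums to zero but need not be an A word itself, which is one way an A label can decay into an S label as the rounds progress.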
We will therefore assume for the remainder of this chapter that any integral we refer to has probability 1 - although this is not necessarily so, the specified relations do frequently determine the behaviour of cipher components with 100% probability. This can be explained in part by noting that the nature of the relations and the number of words input give us more information than a single function input (or pair thereof) with some simple property would in predicting the output. (That said, it may be that accepting the loss of such determinism as the rounds progress, and defining new relations which only hold with certain probability in later rounds, will turn out to be necessary to extend the attack any further.)

2.1 Using integrals in an attack

Let’s assume we’ve chosen an integral covering the first (r − 1) rounds, where r is the total number of rounds. We can then attack the cipher in the same way as we would with differential or linear cryptanalysis. That is, we choose a subset of the ciphertext words such that some property of their partial decryptions is predicted by the integral, and define the TPS to be the set of key bits affecting those words in the final round. For each candidate TPS, we decrypt those words in all of the ciphertexts. Any TPS for which the multiset of decrypted words does not satisfy the predicted property can be instantly eliminated.

We can use integrals covering different subsets of rounds in much the same way as we would with differential cryptanalysis. [6] uses the three-round integral shown earlier to cover rounds 2, 3, and 4 of a reduced six-round AES. We count on TPS candidates which include one byte from the final-round key, and four bytes from the previous round key, corresponding to the data in that round that would affect the fifth byte in the next.
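The elimination step described in this section can be illustrated on a toy cipher. The sketch below is not AES and is not the attack from [6]: it assumes a hypothetical three-round cipher on two 8-bit words, with a randomly generated S-box and a simple invertible mixing step, all invented for illustration. Choosing plaintexts of the form (A, C) gives an integral over the first r − 1 = 2 rounds that ends with an S (xor sum zero) in word 0, and the TPS is the final-round key byte for that word.

```python
import random
from functools import reduce
from operator import xor

K = 256  # number of possible 8-bit word values

rng = random.Random(2024)
SBOX = list(range(K))
rng.shuffle(SBOX)                       # hypothetical bijective S-box
INV_SBOX = [0] * K
for x, y in enumerate(SBOX):
    INV_SBOX[y] = x

def encrypt(block, round_keys):
    """Toy cipher on two 8-bit words (an assumption, not a real design)."""
    w0, w1 = block
    last = len(round_keys) - 1
    for rnd, (k0, k1) in enumerate(round_keys):
        w0, w1 = SBOX[w0], SBOX[w1]     # substitution
        if rnd < last:                  # no mixing in the final round
            w0, w1 = w0 ^ w1, w0        # invertible linear mixing
        w0, w1 = w0 ^ k0, w1 ^ k1       # key addition
    return w0, w1

def surviving_guesses(ciphertexts):
    """Keep final-round key guesses for word 0 whose partial
    decryptions satisfy the predicted property (xor sum zero)."""
    survivors = set()
    for guess in range(K):
        total = reduce(xor, (INV_SBOX[c0 ^ guess] for c0, _ in ciphertexts), 0)
        if total == 0:
            survivors.add(guess)
    return survivors

round_keys = [(rng.randrange(K), rng.randrange(K)) for _ in range(3)]

# Each plaintext set is (A, C): word 0 takes every value, word 1 is constant.
candidates = set(range(K))
for constant in (0x00, 0x01, 0x02):
    plaintexts = [(a, constant) for a in range(K)]
    ciphertexts = [encrypt(p, round_keys) for p in plaintexts]
    candidates &= surviving_guesses(ciphertexts)

print(round_keys[2][0] in candidates, len(candidates))
```

Because the integral holds with probability 1, the correct key byte always survives. A wrong guess can still pass by chance for a single plaintext set (roughly once in 256 tries, if the partial decryptions behave randomly), which is why the sketch intersects the survivors from several sets.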
