
EE 418 Network Security and Cryptography
Lecture #6
October 18, 2016

Cryptanalysis

Lecture notes prepared by Professor Radha Poovendran.
Tamara Bonaci
Department of Electrical Engineering
University of Washington, Seattle

Outline:
1. Review: Introduction to cryptanalysis
2. Remarks on Letter Distribution of the English Language
3. Cryptanalysis of the Affine Cipher
4. Cryptanalysis of the Vigenère Cipher
5. Cryptanalysis of the Hill Cipher

1 Review: Introduction to Cryptanalysis

Last lecture, we started our discussion of how secure cryptosystems are, and how one could go about breaking them. In doing so, we turned to cryptanalysis, and started by considering one of the most important assumptions in modern cryptography, namely Kerckhoffs's principle, which states that in assessing the security of a cryptosystem, one should always assume that an attacker knows the details of the cryptosystem being used. Therefore, the security of the system should always be based on the secrecy of the key, and not on the obscurity of the cryptographic algorithm.

1.1 Attack models

We then considered different goals that an attacker can have when attacking a channel between communicating parties. For example, an attacker may wish to:
1. Read one specific message.
2. Find the encryption/decryption key, and thus read all of the exchanged messages.
3. Corrupt Alice's message into another message in such a way that Bob thinks that Alice has sent the altered message.
4. Masquerade as Alice in order to communicate with Bob such that Bob believes he is communicating with Alice.

For each of these goals, there are four main types of attacks that an attacker can use. The types differ in the amount of information the attacker has available when trying to determine the key. The four attack types are as follows.
Type of attack             Description
Ciphertext only attack     Eve only observes the ciphertext y.
Known plaintext attack     Eve knows the ciphertext y corresponding to a plaintext x.
Chosen plaintext attack    Eve has temporary access to an encryption box, which takes as input any chosen plaintext x and outputs the ciphertext y.
Chosen ciphertext attack   Eve has temporary access to a decryption box, which takes as input any chosen ciphertext y and outputs the plaintext x.

Based on these models, we can analyze the security of every cryptosystem.

2 Cryptanalysis of the Shift Cipher

- Ciphertext only: Let K = 3 and the plaintext be shift. Encrypting with a right shift gives the ciphertext VKLIW. Assume Eve knows only the ciphertext VKLIW, and that a shift cipher was used for encryption. Given the small cardinality of the key space, Eve can try all 26 possible shifts. Undoing the shift one position at a time, she obtains the following candidate plaintexts:

$$\text{vkliw} \xrightarrow{\text{1st left shift}} \text{ujkhv} \xrightarrow{\text{2nd left shift}} \text{tijgu} \xrightarrow{\text{3rd left shift}} \text{shift},$$

and so on. Since "shift" is the only dictionary word in the list of 26 candidates, Eve assumes that it is indeed the plaintext that was encrypted. Therefore, Eve can also infer the original key K = 3.

- Known plaintext: If Eve knows a (plaintext, ciphertext) pair, then Eve can find the key by subtracting the plaintext from the ciphertext mod 26. For instance, if Eve knows that plaintext b = 1 corresponds to ciphertext E = 4, then Eve can determine that K = 4 − 1 = 3.

- Chosen plaintext: Choose the letter a = 0 as plaintext; the resulting ciphertext will be the key. For example, if the ciphertext is P, then K = 15.

- Chosen ciphertext: Choose A = 0 as the ciphertext. The resulting plaintext is then the negative of the key, −K mod 26. For example, if the plaintext is x = 23, then K = −23 mod 26 = 3.

3 Remarks on Letter Distribution of the English Language

English-language text has different frequencies for different alphabetic characters. Estimates of the relative frequencies (probabilities) of the 26 letters are presented in Table 1.
Note that the letter e has the maximum relative frequency, 0.127.

Table 1. Probabilities of occurrence of the 26 letters of the English language alphabet.

A      B      C      D      E      F      G      H      I      J      K      L      M
0.082  0.015  0.028  0.043  0.127  0.022  0.020  0.061  0.070  0.002  0.008  0.040  0.024

N      O      P      Q      R      S      T      U      V      W      X      Y      Z
0.067  0.075  0.019  0.001  0.060  0.063  0.091  0.028  0.010  0.023  0.001  0.020  0.001

Similarly, we can define frequencies of digrams, trigrams, initial letters, final letters, etc. More generally, we can use the statistical properties of the English language to perform cryptanalysis. A key observation here is that the vowels a, e, i, o and the consonants t, n, s, h, d have a relatively high probability of appearance compared to other characters. Table 2 gives the rank order of the vowels based on their frequencies, and Table 3 gives the rank order of the consonants t, n, s, h, d based on their frequencies. The values in both tables are taken from Table 1.

Table 2. Rank order of the probabilities of occurrence of the vowels.

E  0.127
A  0.082
O  0.075
I  0.070
U  0.028

Table 3. Probabilities of most frequently occurring consonants.

T  0.091
N  0.067
S  0.063
H  0.061
D  0.043

4 Cryptanalysis of the Affine Cipher

- Ciphertext only attack: Let's assume that Eve has intercepted the following ciphertext:

FMXVEDKAPHFERBNDKRXRSREFMORUDSDKDVSHVUFEDKAPRKDLYEVLRHHRH

The most frequent letters are R with 8 occurrences, D with 7, E, H, K with 5 each, and F and V with 4 each.

First guess: R = e and D = t. Given the encryption function

$$e_K(x) = ax + b \pmod{26} \tag{1}$$

and the letter values e = 4, R = 17, t = 19, D = 3, we get the following linear system:

$$4a + b = 17 \pmod{26} \tag{2}$$
$$19a + b = 3 \pmod{26} \tag{3}$$

Solving the system, we obtain the unique solution a = 6, b = 19 (note that a solution must be in $\mathbb{Z}_{26}$). But for the affine cipher, a has to be relatively prime to 26. Given that gcd(26, 6) = 2, the pair a = 6, b = 19 is not a valid key.

Second guess: R = e and E = t. Solving the corresponding linear system yields a = 13, which again is not a legal key, since gcd(26, 13) = 13.
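Each guess reduces to solving two congruences in $\mathbb{Z}_{26}$ and then checking that gcd(a, 26) = 1. The following is a minimal sketch of that check for the two guesses tried so far. It assumes a brute-force search over all 676 pairs (a, b) instead of solving the system by elimination, and the function names are illustrative only.

```python
from math import gcd

def solve_affine_guess(x1, y1, x2, y2):
    """Return every (a, b) with a*x + b = y (mod 26) for both (x, y) pairs."""
    return [
        (a, b)
        for a in range(26)
        for b in range(26)
        if (a * x1 + b) % 26 == y1 and (a * x2 + b) % 26 == y2
    ]

def is_valid_affine_key(a):
    """An affine key is usable only if a is invertible modulo 26."""
    return gcd(a, 26) == 1

# Letter values used in the guesses: e = 4, t = 19, R = 17, D = 3, E = 4.
guess_1 = solve_affine_guess(4, 17, 19, 3)  # R = e, D = t
guess_2 = solve_affine_guess(4, 17, 19, 4)  # R = e, E = t

for label, solutions in (("R=e, D=t", guess_1), ("R=e, E=t", guess_2)):
    for a, b in solutions:
        status = "valid" if is_valid_affine_key(a) else "not a valid key"
        print(label, (a, b), status)
```

Both guesses produce a value of a that shares a factor with 26, so neither yields a usable key.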
Third guess: R = e and K = t, which yields a = 3 and b = 5. Since this is a valid key, we decrypt the entire ciphertext to see whether we get meaningful English text:

algorithms are quite general definitions of arithmetic processes

Note: Besides the statistical analysis, Eve could have tried all 312 possible pairs (a, b) that constitute a valid key for the affine cipher.

- Known plaintext attack: Suppose Eve knows that the plaintext uw = (20, 22) has ciphertext KQ = (10, 16). She can then set up the following system of linear equations:

$$10 = 20a + b \pmod{26} \tag{4}$$
$$16 = 22a + b \pmod{26} \tag{5}$$

Subtracting Equation 4 from Equation 5 gives 6 = 2a (mod 26), i.e. 2a = q × 26 + 6 ⇒ a = 3 or a = 16. But gcd(16, 26) ≠ 1 ⇒ a = 3. From Equation 4 we can now get b as follows:

$$10 = 20 \times 3 + b \pmod{26} \tag{6}$$
$$\text{i.e.}\quad -50 = b \pmod{26} \tag{7}$$
$$\text{i.e.}\quad b = q \times 26 + (-50) \Rightarrow q = 2 \Rightarrow b = 2. \tag{8}$$

Hence Eve only needs to know two (plaintext, ciphertext) pairs.

- Chosen plaintext: If Eve chooses ab = (0, 1) as plaintext, the ciphertext will be:

$$0 \times a + b \equiv b \pmod{26} \tag{9}$$
$$1 \times a + b \equiv a + b \pmod{26} \tag{10}$$

The first ciphertext letter is b, and subtracting it from the second gives a, so Eve can easily find the key K = (a, b).

- Chosen ciphertext: Eve chooses AB as ciphertext, and proceeds as above.

5 Cryptanalysis of the Vigenère Cipher

5.1 Known Plaintext Attack

If Eve knows at least m consecutive (plaintext, ciphertext) letter pairs, then by subtracting the plaintext from the ciphertext mod 26 she obtains the key vector $(K_1, \ldots, K_m)$.

5.2 Chosen plaintext attack

Choose $x = \underbrace{aa\ldots a}_{m}$ as plaintext, and get K as the ciphertext:

$$
\begin{array}{rccccc}
  & a & a & a & \cdots & a \\
  & 0 & 0 & 0 & \cdots & 0 \\
+ & K_1 & K_2 & K_3 & \cdots & K_m \\
\hline
  & K_1 & K_2 & K_3 & \cdots & K_m
\end{array}
$$

Note 1: One does not need to choose $x = \underbrace{aa\ldots a}_{m}$ as plaintext, as any known plaintext will also reveal the key K.

5.3 Chosen Ciphertext Attack

Choose $y = \underbrace{AA\ldots A}_{m}$ as the ciphertext; the obtained plaintext is then the negative of the key K:

$$
\begin{array}{rccccc}
  & A & A & A & \cdots & A \\
  & 0 & 0 & 0 & \cdots & 0 \\
- & K_1 & K_2 & K_3 & \cdots & K_m \\
\hline
  & -K_1 & -K_2 & -K_3 & \cdots & -K_m
\end{array}
$$

Note 2: Again, one does not need to choose $\underbrace{AA\ldots A}_{m}$ as the ciphertext. Any chosen ciphertext will do.
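The three attacks above all come down to a letter-wise subtraction mod 26. The following is a minimal sketch of that step, assuming the letters a to z are numbered 0 to 25 and using the pair weed / YIKK from the example in Section 5.4 as test data. The helper names are illustrative, and the chosen-plaintext ciphertext CEGH is simply the key (2, 4, 6, 7) written as letters.

```python
def letters_to_numbers(text):
    """Map letters to 0..25, ignoring case (a = 0, ..., z = 25)."""
    return [ord(ch) - ord("a") for ch in text.lower()]

def recover_key(plaintext, ciphertext):
    """Subtract the plaintext from the ciphertext letter by letter, mod 26."""
    p = letters_to_numbers(plaintext)
    c = letters_to_numbers(ciphertext)
    return [(ci - pi) % 26 for pi, ci in zip(p, c)]

# Known plaintext: one aligned block of m = 4 letters reveals the key.
key_known = recover_key("weed", "YIKK")

# Chosen plaintext: encrypting aaaa returns the key itself as the ciphertext.
key_chosen = recover_key("aaaa", "CEGH")

# Chosen ciphertext: decrypting AAAA returns -K mod 26.
neg_key = [(-k) % 26 for k in key_known]

print(key_known, key_chosen, neg_key)
```

The same `recover_key` call covers Note 1 and Note 2: any aligned plaintext and ciphertext block of length m gives the key, whichever side Eve was able to choose.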
5.4 Ciphertext only attack

We left this attack for last, as it is the hardest to launch. In general, an exhaustive search is very slow due to the large cardinality of the key space ($26^m$ keys for a key of length m). We can, however, perform a statistical analysis based on the structure of the English language. This analysis is more difficult than in the affine and substitution cipher cases because: (a) the Vigenère cipher is a polyalphabetic cryptosystem, and (b) the length of the key m is not known to Eve.

Consider the following example, where the plaintext is x = weed:

PLAINTEXT:   22    4    4    3    (w e e d)
KEY:          2    4    6    7
CIPHER:      24    8   10   10    (Y I K K)

In this example, the letter e is mapped to I the first time, and to K the second time. Moreover, the letters e and d both map to the same ciphertext letter K. For long texts, we can therefore expect the ciphertext letters to have nearly equal frequencies of occurrence, and hence single-letter frequencies alone may not be particularly useful. Eve can still attempt to break the cryptosystem by executing the following attack in two stages:
1. Finding the key vector length m;
2. Finding the key vector K.

Finding key vector length m using the Kasiski test: The key length m can be found using the Kasiski test.
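As a rough illustration of the idea, the sketch below assumes the standard formulation of the Kasiski test: identical plaintext segments encrypted at the same key position give identical ciphertext segments, so the distances between repeated ciphertext trigrams tend to be multiples of m. The sample plaintext thebigdogandthecat, the reuse of the key (2, 4, 6, 7) from the weed example, and all function names are assumptions made for this example.

```python
from collections import defaultdict
from functools import reduce
from math import gcd

def vigenere_encrypt(plaintext, key):
    """Encrypt lowercase plaintext; key is a list of shifts in 0..25."""
    return "".join(
        chr((ord(ch) - ord("a") + key[i % len(key)]) % 26 + ord("A"))
        for i, ch in enumerate(plaintext)
    )

def repeated_trigram_distances(ciphertext):
    """Map each repeated trigram to its distances from the first occurrence."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - 2):
        positions[ciphertext[i:i + 3]].append(i)
    return {
        trigram: [p - found[0] for p in found[1:]]
        for trigram, found in positions.items()
        if len(found) > 1
    }

# "the" occurs at positions 0 and 12; 12 is a multiple of the key length 4.
ciphertext = vigenere_encrypt("thebigdogandthecat", [2, 4, 6, 7])
distances = repeated_trigram_distances(ciphertext)

# The key length is expected to divide the gcd of the observed distances.
all_distances = [d for ds in distances.values() for d in ds]
key_length_multiple = reduce(gcd, all_distances)

print(ciphertext, distances, key_length_multiple)
```

Here the only repeated trigram lies 12 positions apart, so the candidate key lengths are the divisors of 12, which include the true value m = 4. A text this short gives only one repetition; longer ciphertexts usually give several distances, and their gcd narrows the candidates further.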
