Cryptanalysis
Lecture 2: The adversary joins the twentieth century

John Manferdelli
[email protected]
[email protected]

© 2004-2008, John L. Manferdelli. This material is provided without warranty of any kind including, without limitation, warranty of non-infringement or suitability for any purpose. This material is not guaranteed to be error free and is intended for instructional use only.

jlm20081004

Dramatis personae

Users
• Alice (party A)
• Bob (party B)
• Trent (trusted authority)
• Peggy and Victor (authentication participants)

Adversaries
• Eve (passive eavesdropper)
• Mallory (active interceptor)
• Fred (forger)
• Daffy (disruptor)
• Mother Nature
• Users (Yes Brutus, the fault lies in us, not the stars)

Users' Agents
• Cryptographic designer
• Personnel Security
• Security Guards
• Security Analysts

Adversaries' Agents
• Dopey (dim attacker)
• Einstein (smart attacker --- you)
• Rockefeller (rich attacker)
• Klaus (inside spy)

Adversaries and their discontents

Wiretap adversary (Eve): Alice encrypts plaintext (P), the ciphertext crosses the channel, and Bob decrypts back to plaintext (P); Eve passively reads the channel.

Man-in-the-middle adversary (Mallory): Alice encrypts plaintext (P), but Mallory sits in the channel between Alice and Bob and can read, modify, inject, or drop traffic before Bob decrypts.

Claude Shannon

Information Theory Motivation

• How much information is in a binary string?
• Game: I have a value between 0 and 2^n - 1 (inclusive); find it by asking the minimum number of yes/no questions.
• Write the number as [b_{n-1} b_{n-2} … b_0]_2.
• Questions: Is b_{n-1} 1? Is b_{n-2} 1? … Is b_0 1?
• So, what is the amount of information in a number between 0 and 2^n - 1?
• Answer: n bits.
• The same question: let X be a probability distribution taking on values between 0 and 2^n - 1 with equal probability. What is the information content of an observation?
• There is a mathematical function that measures the information in an observation drawn from a probability distribution. It is denoted H(X):
  H(X) = Σ_i -p_i lg(p_i)

What is the form of H(X)?
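The definition H(X) = Σ_i -p_i lg(p_i) is easy to check numerically. A minimal Python sketch (the helper name `entropy` is ours, not from the notes):

```python
from math import log2

def entropy(probs):
    """H(X) = sum_i -p_i * lg(p_i), in bits (terms with p_i = 0 contribute 0)."""
    return -sum(p * log2(p) for p in probs if p > 0)

# A uniform distribution over 2^n values carries exactly n bits of
# information, matching the n yes/no questions of the guessing game.
n = 4
print(entropy([1 / 2**n] * 2**n))  # -> 4.0
```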
• If H is continuous and satisfies:
  – monotonicity: H(1/n, …, 1/n) < H(1/(n+1), …, 1/(n+1)),
  – grouping: H(p_1, …, q·p_j, (1-q)·p_j, …, p_n) = H(p_1, …, p_j, …, p_n) + p_j·H(q, 1-q),
  – normalization: H(1/2, 1/2) = 1,
  then H(p) = Σ_{i=1}^{n} -p_i lg(p_i).
• H(p_1, p_2, …, p_n) is maximized iff p_j = 1/n for all j.

Information Theory

• The "definition" of H(X) has two desirable properties:
  – Doubling the storage (in the bits you're familiar with) doubles the information content.
  – Grouping is consistent: H(1/2, 1/3, 1/6) = H(1/2, 1/2) + (1/2)·H(2/3, 1/3).
• Information theory was originally developed to study how efficiently one can reliably transmit information over a "noisy" channel.
• Applied by Shannon to cryptography (BSTJ, 1949).
• The information learned about Y by observing X is I(Y; X) = H(Y) - H(Y|X).
• Used to estimate the requirements for cryptanalysis of a cipher.

Sample key distributions

• Studying key search:
• Distribution A: 2-bit key, each key equally likely.
• Distribution B: 4-bit key, each key equally likely.
• Distribution C: n-bit key, each key equally likely.
• Distribution A': 2-bit key selected from the distribution (1/2, 1/6, 1/6, 1/6).
• Distribution B': 4-bit key selected from the distribution (1/2, 1/30, 1/30, …, 1/30).
• Distribution C': n-bit key selected from the distribution (1/2, (1/2)·1/(2^n - 1), …, (1/2)·1/(2^n - 1)).

H for the key distributions

• Distribution A: H(X) = 4 × (1/4) lg(4) = 2 bits
• Distribution B: H(X) = 16 × (1/16) lg(16) = 4 bits
• Distribution C: H(X) = 2^n × (1/2^n) lg(2^n) = n bits
• Distribution A': H(X) = (1/2) lg(2) + 3 × (1/6) lg(6) ≈ 1.79 bits
• Distribution B': H(X) = (1/2) lg(2) + 15 × (1/30) lg(30) ≈ 2.95 bits
• Distribution C': H(X) = (1/2) lg(2) + (2^n - 1) × (1/2)·(1/(2^n - 1)) lg(2(2^n - 1)) ≈ n/2 + 1 bits

Some Theorems

• Bayes: P(X=x|Y=y) P(Y=y) = P(Y=y|X=x) P(X=x) = P(X=x, Y=y)
• X and Y are independent iff P(X=x, Y=y) = P(X=x) P(Y=y)
• H(X,Y) = H(Y) + H(X|Y)
• H(X,Y) ≤ H(X) + H(Y)
• H(Y|X) ≤ H(Y), with equality iff X and Y are independent.
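The entropies computed above for distributions A, B', and C' can be verified numerically; a small Python sketch (variable names are ours):

```python
from math import log2

def entropy(probs):
    # H(X) = sum_i -p_i * lg(p_i), in bits
    return -sum(p * log2(p) for p in probs if p > 0)

n = 8  # C' should come out near n/2 + 1 for any n

dist_A       = [1/4] * 4                                  # uniform 2-bit key
dist_B_prime = [1/2] + [1/30] * 15                        # distribution B'
dist_C_prime = [1/2] + [(1/2) / (2**n - 1)] * (2**n - 1)  # distribution C'

print(round(entropy(dist_A), 2))        # -> 2.0
print(round(entropy(dist_B_prime), 2))  # -> 2.95
print(round(entropy(dist_C_prime), 2))  # -> 5.0  (= n/2 + 1 for n = 8)
```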
• If X is a random variable representing the selection of one of N items from a set S, then H(X) ≤ lg(N), with equality iff every selection is equally likely. (Selecting a key has highest entropy iff each key is equally likely.)

Huffman Coding

• Uniquely decodable (prefix-free), like Morse code.
• The average code length L satisfies H(X) ≤ L ≤ H(X) + 1.
• Example: symbols S1, S2, S3, S4 with probabilities .4, .35, .2, .05. Repeatedly merging the two least likely nodes (.05 + .2 = .25, .25 + .35 = .6, .6 + .4 = 1.0) gives the codes:
  S1 (p = .40): 0
  S2 (p = .35): 10
  S3 (p = .20): 110
  S4 (p = .05): 111
• H(X) = -(.4 lg(.4) + .35 lg(.35) + .2 lg(.2) + .05 lg(.05)) ≈ 1.74, so ⌈H(X)⌉ = 2. (⌈y⌉ means the ceiling function, the smallest integer greater than or equal to y.)
• Morse code, shown on the slide as a familiar variable-length code:
  A .-    B -...  C -.-.  D -..   E .     F ..-.  G --.
  H ....  I ..    J .---  K -.-   L .-..  M --    N -.
  O ---   P .--.  Q --.-  R .-.   S ...   T -     U ..-
  V ...-  W .--   X -..-  Y -.--  Z --..

Long term equivocation

• H_E = lim_{n→∞} -(1/n) Σ_{(x[1],…,x[n])} Pr(X = (x[1],…,x[n])) lg(Pr(X = (x[1],…,x[n])))
• For a random stream of letters: H_R = Σ_i (1/26) lg(26) = lg(26) ≈ 4.7004
• For English: H_E ≈ 1.2-1.5 (so English is about 75% redundant)
• There are approximately T(n) = 2^{nH} n-symbol messages that can be drawn from the meaningful English sample space.
• How many possible ciphertexts make sense?
  H(P^n) + H(K) ≥ H(C^n)
  n·H_E + lg(|K|) ≥ n·lg(|S|)
  n ≤ lg(|K|)/(lg(|S|) - H_E)
• Redundancy: R = 1 - H_E/lg(|S|)

Unicity and random ciphers

Question: How many messages do I need to trial-decode so that the expected number of false keys under which all m messages land in the meaningful subset is less than 1?
Answer: The unicity point. A nice application of information theory.

Theorem: Let H be the entropy of the source (say English), let S be the alphabet, and let K be the set of (equiprobable) keys. Then u = lg(|K|)/(lg(|S|) - H).

[Diagram: the |S|^n possible cipher messages split into roughly 2^{Hn} meaningful messages and the non-meaningful rest; decoding with the correct key lands in the meaningful subset, while decoding with an incorrect key almost always lands in the non-meaningful subset.]

Unicity distance for mono-alphabet

• H_CaesarKey = H_random = lg(26) ≈ 4.7004; H_English ≈ 1.2.
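Plugging these constants into the unicity theorem above gives concrete numbers; a quick Python check (the helper name `unicity` is ours):

```python
from math import log2, factorial

H_ENGLISH = 1.2   # entropy per letter of English (the notes' estimate)
LG_26 = log2(26)  # lg|S| ~ 4.7004 for the 26-letter alphabet

def unicity(lg_keyspace):
    # u = lg|K| / (lg|S| - H), from the theorem above
    return lg_keyspace / (LG_26 - H_ENGLISH)

print(round(unicity(log2(26)), 2))             # -> 1.34  (Caesar: 26 keys)
print(round(unicity(log2(factorial(26))), 2))  # -> 25.25 (substitution: 26! keys)
```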
• For Caesar, u = lg(26)/(4.7 - 1.2) ≈ 1.3, so about 4 symbols suffice in practice for a ciphertext-only attack. For a known-plaintext attack, only 1 corresponding plain/cipher symbol is required for a unique decode.
• For an arbitrary substitution, u = lg(26!)/(4.7 - 1.2) ≈ 25 symbols for a ciphertext-only attack. For a corresponding plain/ciphertext attack, about 8-10 symbols are required.
• Both estimates are remarkably close to actual experience.

Information theoretic estimates to break mono-alphabet

Cipher        Type of attack    Information resources                Computational resources
Caesar        Ciphertext only   u ≈ 4 letters                        26 computations
Caesar        Known plaintext   1 corresponding plain/cipher pair    1 computation
Substitution  Ciphertext only   ~30 letters                          O(1)
Substitution  Known plaintext   ~10 letters                          O(1)

One Time Pad (OTP)

• The one-time pad or Vernam cipher takes a plaintext consisting of symbols p = (p_0, p_1, …, p_n) and a keystream k = (k_0, k_1, …, k_n), where the symbols come from the alphabet Z_m, and produces the ciphertext c = (c_0, c_1, …, c_n) where c_i = (p_i + k_i) (mod m).
• Perfect security of the one-time pad: if the k_i are iid with P(k_i = j) = 1/m for 0 ≤ j < m, then H(p|c) = H(p), so the scheme is secure.
• m = 2 in the binary case and m = 26 in the case of the Roman alphabet.
• Stream ciphers replace the "perfectly random" sequence k with a pseudo-random sequence k' (generated from a much smaller input key k_s by a stream generator R).
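The alphabetic scheme just described is a few lines of Python; a minimal sketch (the function name `otp` is ours), using the same plaintext and key as the worked example that follows:

```python
# One-time pad over Z_26: c_i = (p_i + k_i) mod 26 to encrypt,
# p_i = (c_i - k_i) mod 26 to decrypt.
def otp(text, key, decrypt=False):
    sign = -1 if decrypt else 1
    a = ord('A')
    return ''.join(chr((ord(t) - a + sign * (ord(k) - a)) % 26 + a)
                   for t, k in zip(text, key))

ct = otp("BULLWINKLEISADOPE", "NOWISTHETIMEFORAL")
print(ct)                                          # -> OIHTOBUOEMUWFRFPP
print(otp(ct, "NOWISTHETIMEFORAL", decrypt=True))  # -> BULLWINKLEISADOPE
```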
One-time pad alphabetic encryption

Plaintext + Key (mod 26) = Ciphertext

Plaintext:   B  U  L  L  W  I  N  K  L  E  I  S  A  D  O  P  E
            01 20 11 11 22 08 13 10 11 04 08 18 00 03 14 15 04
Key:         N  O  W  I  S  T  H  E  T  I  M  E  F  O  R  A  L
            13 14 22 08 18 19 07 04 19 08 12 04 05 14 17 00 11
Ciphertext:  O  I  H  T  O  B  U  O  E  M  U  W  F  R  F  P  P
            14 08 07 19 14 01 20 14 04 12 20 22 05 17 05 15 15

Legend:
 A  B  C  D  E  F  G  H  I  J  K  L  M
00 01 02 03 04 05 06 07 08 09 10 11 12
 N  O  P  Q  R  S  T  U  V  W  X  Y  Z
13 14 15 16 17 18 19 20 21 22 23 24 25

One-time pad alphabetic decryption

Ciphertext + 26 - Key (mod 26) = Plaintext

Ciphertext:  O  I  H  T  O  B  U  O  E  M  U  W  F  R  F  P  P
            14 08 07 19 14 01 20 14 04 12 20 22 05 17 05 15 15
Key:         N  O  W  I  S  T  H  E  T  I  M  E  F  O  R  A  L
            13 14 22 08 18 19 07 04 19 08 12 04 05 14 17 00 11
Plaintext:   B  U  L  L  W  I  N  K  L  E  I  S  A  D  O  P  E
            01 20 11 11 22 08 13 10 11 04 08 18 00 03 14 15 04

Binary one-time pad

Plaintext ⊕ Key = Ciphertext, and Ciphertext ⊕ Key = Plaintext:

Plaintext:  10101110 01110000 01011101 10110000
Key:        10000100 00011011 01001000 00100111
Ciphertext: 00101010 01101011 00010101 10010111

The one time pad has perfect security

• E is perfect if H(X|Y) = H(X), where X is the plaintext distribution and Y is the ciphertext distribution with respect to the cipher E.
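The binary case is one line of XOR, and it also makes the perfect-secrecy claim concrete: for a fixed ciphertext, every equal-length plaintext is produced by exactly one key, so the ciphertext alone carries no information about the plaintext. A minimal sketch (the helper name `xor_bytes` is ours; the key used here is the one for which plaintext ⊕ key equals the slide's ciphertext):

```python
# Binary one-time pad: ciphertext = plaintext XOR key; XOR-ing with the
# key a second time recovers the plaintext.
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

plaintext = int('10101110011100000101110110110000', 2).to_bytes(4, 'big')
key       = int('10000100000110110100100000100111', 2).to_bytes(4, 'big')

ciphertext = xor_bytes(plaintext, key)
print(format(int.from_bytes(ciphertext, 'big'), '032b'))
# -> 00101010011010110001010110010111
assert xor_bytes(ciphertext, key) == plaintext  # decryption round-trips

# Perfect secrecy: any 4-byte message could be "the" plaintext under
# some (unique) key, so observing the ciphertext alone reveals nothing.
for guess in (bytes(4), b'\xff' * 4, plaintext):
    k = xor_bytes(ciphertext, guess)  # the unique key sending guess -> ciphertext
    assert xor_bytes(guess, k) == ciphertext
```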