Cryptography and Network Security
Spring 2012 http://users.abo.fi/ipetre/crypto/
Lecture 2: Classical encryption
Ion Petre Department of IT, Åbo Akademi University
January 10, 2012 http://users.abo.fi/ipetre/crypto/ 1 Overview of the course
I. CRYPTOGRAPHY
• Secret-key cryptography
  • Classical encryption techniques
  • DES, AES, RC5, RC4
• Public-key cryptography
  • RSA
• Key management
II. AUTHENTICATION
• MAC
• Hashes and message digests
• Digital signatures
• Kerberos
III. NETWORK SECURITY
• Email security
• IP security
• Web security (SSL, secure electronic transactions)
• Firewalls
• Wireless security
IV. OTHER ISSUES
• Viruses
• Digital cash
• Secret sharing schemes
• Zero-knowledge techniques

Part I. Cryptography
• Will cover more than half of this course
I.1 Secret-key cryptography
• Also called symmetric or conventional cryptography
• Five ingredients
  • Plaintext
  • Encryption algorithm: runs on the plaintext and the encryption key to yield the ciphertext
  • Secret key: an input to the encryption algorithm, value independent of the plaintext; different keys will yield different outputs
  • Ciphertext: the scrambled text produced as an output by the encryption algorithm
  • Decryption algorithm: runs on the ciphertext and the key to produce the plaintext
• Requirements for secure conventional encryption
  • Strong encryption algorithm
    • An opponent who knows one or more ciphertexts would not be able to find the plaintexts or the key
    • Ideally, even if he knows one or more plaintext-ciphertext pairs, he would not be able to find the key
  • Sender and receiver must share the same key. Once the key is compromised, all communications using that key are readable
  • It is impractical to decrypt the message on the basis of the ciphertext plus the knowledge of the encryption algorithm
    • The encryption algorithm is not a secret

Cryptography – some notations
• Notation for relating the plaintext, ciphertext, and the keys
  • C = E_K(P) denotes that C is the encryption of the plaintext P using the key K
  • P = D_K(C) denotes that P is the decryption of the ciphertext C using the key K
  • Then D_K(E_K(P)) = P

Caesar Cipher
• It is a typical substitution cipher and the oldest known – attributed to Julius Caesar
• Simple rule: replace each letter of the alphabet with the letter standing 3 places further down the alphabet
• Example:
  plain:  MEET ME AFTER THE TOGA PARTY
  cipher: PHHW PH DIWHU WKH WRJD SDUWB
• Here the key is 3 – choose another key to get a different substitution
• The alphabet is wrapped around so that after Z follows A:
  plain:  a b c d e f g h i j k l m n o p q r s t u v w x y z
  cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

Caesar cipher
• Mathematically, give each letter a number
  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
  0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
• The key is a number k from 1 to 25 (k = 0 would leave the text unchanged)
• Caesar cipher can now be given as
  E(p) = (p + k) mod 26
  D(C) = (C - k) mod 26
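The two formulas can be tried out directly. The following is a minimal sketch in Python; the function names are illustrative only, and non-letter characters are simply passed through, which is an assumption not made on the slide.

```python
import string

ALPHABET = string.ascii_lowercase  # a=0, b=1, ..., z=25


def caesar_encrypt(plaintext: str, k: int) -> str:
    """E(p) = (p + k) mod 26, ciphertext written in upper case."""
    out = []
    for ch in plaintext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26].upper())
        else:
            out.append(ch)  # assumption: blanks etc. are left untouched
    return "".join(out)


def caesar_decrypt(ciphertext: str, k: int) -> str:
    """D(C) = (C - k) mod 26, plaintext written in lower case."""
    out = []
    for ch in ciphertext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - k) % 26])
        else:
            out.append(ch)
    return "".join(out)


print(caesar_encrypt("meet me after the toga party", 3))
# PHHW PH DIWHU WKH WRJD SDUWB
```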

Attacking Caesar
• Caesar can be broken if we only know one pair (plain letter, encrypted letter)
  • The difference between them is the key
• Caesar can be broken even if we only have the encrypted text and no knowledge of the plaintext
  • Brute-force attack is easy: there are only 25 keys possible
  • Try all 25 keys and check to see which key gives an intelligible message
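A brute-force attack of this kind can be sketched in a few lines of Python. The helper names are hypothetical, and the "intelligibility" check is left to the human reader, as on the slide.

```python
def shift_back(ciphertext: str, k: int) -> str:
    """Undo a Caesar shift of k on the letters A-Z; other characters are kept."""
    out = []
    for c in ciphertext.upper():
        if "A" <= c <= "Z":
            out.append(chr((ord(c) - ord("A") - k) % 26 + ord("a")))
        else:
            out.append(c)
    return "".join(out)


def brute_force(ciphertext: str) -> dict:
    """Return the candidate plaintext for each of the 25 possible keys."""
    return {k: shift_back(ciphertext, k) for k in range(1, 26)}


candidates = brute_force("PHHW PH DIWHU WKH WRJD SDUWB")
for k, text in candidates.items():
    print(k, text)  # only one line (k = 3) reads as English
```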

Why is Caesar easy to break?
• Only 25 keys to try
• The language of the plaintext is known and easily recognizable
  • What if the language is unknown?
  • What if the plaintext is a binary file of an unknown format?
From Stallings, "Cryptography and Network Security"

Strengthening Caesar: monoalphabetic ciphers
• Caesar only has 25 possible keys – far from secure
• Idea: instead of shifting the letters by a fixed amount, how about allowing any permutation of the alphabet?
  Plain:  abcdefghijklmnopqrstuvwxyz
  Cipher: DKVQFIBJWPESCXHTMYAUOLRGZN
  Plaintext:  if we wish to replace letters
  Ciphertext: WI RF RWAJ UH YFTSDVF SFUUFYA
• This is called a monoalphabetic substitution cipher – a single alphabet is used
• The increase in the number of keys is dramatic: 26!, i.e., more than 4 x 10^26 possible keys
• Compare: DES only has on the order of 10^16 possible keys
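As a small illustration, the substitution table above can be applied with Python's built-in translation tables. This is only a sketch; the variable names are not from the lecture.

```python
import math
import string

PLAIN = string.ascii_lowercase
CIPHER = "DKVQFIBJWPESCXHTMYAUOLRGZN"  # the permutation from the slide

ENC = str.maketrans(PLAIN, CIPHER)   # plaintext letter -> ciphertext letter
DEC = str.maketrans(CIPHER, PLAIN)   # inverse mapping for decryption

ciphertext = "if we wish to replace letters".translate(ENC)
print(ciphertext)                 # WI RF RWAJ UH YFTSDVF SFUUFYA
print(ciphertext.translate(DEC))  # if we wish to replace letters

# Size of the key space: every permutation of 26 letters is a key.
print(math.factorial(26))
```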

How large is large?
Reference                                   Order of magnitude
Seconds in a year                           ≈ 3 x 10^7
Age of our solar system (years)             ≈ 6 x 10^9
Seconds since creation of solar system      ≈ 2 x 10^17
Clock cycles per year, 3 GHz computer       ≈ 9.6 x 10^16
Binary strings of length 64                 2^64 ≈ 1.8 x 10^19
Binary strings of length 128                2^128 ≈ 3.4 x 10^38
Binary strings of length 256                2^256 ≈ 1.2 x 10^77
Number of 75-digit prime numbers            ≈ 5.2 x 10^72
Electrons in the universe                   ≈ 8.37 x 10^77
Adapted from Handbook of Applied Cryptography (A. Menezes, P. van Oorschot, S. Vanstone), 1996

Monoalphabetic ciphers
• Having more than 4 x 10^26 possible keys appears to make the system challenging: difficult to perform brute-force attacks
• There is however another line of attack that easily defeats the system even when a relatively small ciphertext is known
• If the cryptanalyst knows the nature of the text, e.g., noncompressed English text, then he can exploit the regularities of the language

Language redundancy and cryptanalysis
• Human languages are redundant
• Letters are not equally commonly used
• In English E is by far the most common letter
  • Followed by T, R, N, I, O, A, S
• Other letters are fairly rare
  • See Z, J, K, Q, X
• Tables of single, double and triple letter frequencies exist
  • Most common digram in English is TH
  • Most common trigram in English is THE

English Letter Frequencies (figure)

Cryptanalysis of monoalphabetic ciphers
• Key concept: monoalphabetic substitution ciphers do not change relative letter frequencies
• Discovered by Arabs in the 9th century
• Calculate letter frequencies for ciphertext
• Compare counts/plots against known values
  • Most frequent letter in the ciphertext may well encrypt E
  • The next one could encrypt T or A
  • After relatively few tries the system is broken
• If the ciphertext is relatively short (and so the frequencies are not fully relevant) then more guesses may be needed
• Powerful tool: look at the frequency of two-letter combinations (digrams)

Example of cryptanalysis
• Ciphertext:
  UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZVUEPHZHMDZSHZOWSFPAPPDTSVPQUZ
  WYMXUZUHSXEPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
• Count relative letter frequencies:
  • P is the most frequent (13.33%), followed by Z (11.67%), S (8.33%), U (8.33%), O (7.5%), M (6.67%), H (5.83%), etc.
  • Guess that P and Z stand for E and T, but the order is not clear because of the small difference in frequency
  • The next set of letters {S, U, O, M, H} may stand for letters from {A, H, I, N, O, R, S}, but again it is not completely clear which is which
• One may try to guess and see how the text translates
• Also, a good guess is that ZW, the most common digram in the ciphertext, is TH, the most common digram in English: thus, ZWP is THE
• Proceed with trial and error and finally get, after inserting the proper blanks:
  it was disclosed yesterday that several informal but direct contacts have been made with political representatives of the viet cong in moscow
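The counting step of this example is easy to reproduce. The sketch below uses Python's collections.Counter on the ciphertext above (written as one string, without the line break); the guessing and trial-and-error steps are not automated here.

```python
from collections import Counter

CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZVUEPHZHMDZSHZOWSFPAPPDTSVPQUZ"
    "WYMXUZUHSXEPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)

total = len(CIPHERTEXT)
counts = Counter(CIPHERTEXT)

# Relative single-letter frequencies, most frequent first.
for letter, n in counts.most_common(7):
    print(letter, n, f"{100 * n / total:.2f}%")

# Digram counts (overlapping pairs of consecutive letters).
digrams = Counter(CIPHERTEXT[i:i + 2] for i in range(total - 1))
print("ZW occurs", digrams["ZW"], "times")
```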

Some conclusions after this cryptanalysis
• Monoalphabetic ciphers are easy to break because they reflect the frequency of the original alphabet
  • Essential to know the original alphabet
• Countermeasure: provide multiple substitutes for a given letter
  • Highly frequent letters such as E could be encrypted using a larger number of letters than less frequent letters such as Z: to encrypt E one could choose any one of, say, 15 fixed letters, and to encrypt Z one could choose either one of, say, 2 fixed letters
  • The number of encryptions for a letter may be proportional to its frequency in the original language (English)
  • This would hide the letter-frequency information
• However:
  • Multiple-letter patterns (digrams, trigrams, etc.) survive in the text, providing a tool for cryptanalysis
  • Each element of the plaintext only affects one element in the ciphertext
  • Longer text is needed for breaking the system, but cryptanalysis is still relatively straightforward

Measures to hide the structure of the plaintext
1. Encrypt multiple letters of the plaintext at once
2. Use more than one substitution in encryption and decryption (polyalphabetic ciphers)
Consider both these approaches in the following

Playfair Cipher
• The Playfair Cipher is an example of multiple-letter encryption
• Invented by Sir Charles Wheatstone in 1854, but named after his friend Baron Playfair who championed the cipher at the British Foreign Office
• Based on the use of a 5x5 matrix in which the letters of the alphabet are written (I is considered the same as J)
  • This is called the key matrix

Playfair key matrix
• A 5x5 matrix of letters based on a keyword
• Fill in letters of keyword (no duplicates)
  • Left to right, top to bottom
• Fill the rest of the matrix with the other letters in alphabetic order
• E.g. using the keyword MONARCHY, we obtain the following matrix
  M O N A R
  C H Y B D
  E F G I K
  L P Q S T
  U V W X Z
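The filling rule can be sketched as follows. The function name playfair_matrix is hypothetical; J is folded into I as described on the previous slide, and the keyword is assumed to contain letters only.

```python
import string


def playfair_matrix(keyword: str) -> list:
    """Build the 5x5 key matrix: keyword letters first (no duplicates),
    then the remaining letters in alphabetic order, with J treated as I."""
    letters = []
    for ch in keyword.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch not in letters:
            letters.append(ch)
    # 25 letters, written left to right, top to bottom
    return [letters[r * 5:(r + 1) * 5] for r in range(5)]


for row in playfair_matrix("MONARCHY"):
    print(" ".join(row))
```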

Encrypting and decrypting with Playfair
• The plaintext is encrypted two letters at a time:
  1. Break the plaintext into pairs of two consecutive letters
  2. If a pair is a repeated letter, insert a filler like 'X' in the plaintext, e.g. "balloon" is treated as "ba lx lo on"
  3. If both letters fall in the same row of the key matrix, replace each with the letter to its right (wrapping back to start from end), e.g. "AR" encrypts as "RM"
  4. If both letters fall in the same column, replace each with the letter below it (again wrapping to top from bottom), e.g. "MU" encrypts to "CM"
  5. Otherwise each letter is replaced by the one in its row in the column of the other letter of the pair, e.g. "HS" encrypts to "BP", and "EA" to "IM" or "JM" (as desired)
• Decryption works in the reverse direction
• The examples above are based on this key matrix:
  M O N A R
  C H Y B D
  E F G I K
  L P Q S T
  U V W X Z
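The five rules above can be sketched in Python as follows. All names are illustrative. One assumption goes beyond the slide: a single leftover letter at the end of the plaintext is also padded with the filler.

```python
import string


def build_matrix(keyword):
    """5x5 key matrix: keyword letters, then the rest of the alphabet (J = I)."""
    letters = []
    for ch in keyword.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch not in letters:
            letters.append(ch)
    return [letters[r * 5:(r + 1) * 5] for r in range(5)]


def split_pairs(text, filler="X"):
    """Rules 1-2: split into pairs, inserting a filler between repeated letters.
    Assumption: a trailing single letter is padded with the filler too."""
    letters = ["I" if c == "J" else c for c in text.upper() if c.isalpha()]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        b = letters[i + 1] if i + 1 < len(letters) else None
        if b is None or a == b:
            pairs.append((a, filler))
            i += 1
        else:
            pairs.append((a, b))
            i += 2
    return pairs


def encrypt_pair(matrix, a, b):
    """Rules 3-5 for one pair of letters."""
    pos = {matrix[r][c]: (r, c) for r in range(5) for c in range(5)}
    (ra, ca), (rb, cb) = pos[a], pos[b]
    if ra == rb:   # same row: take the letter to the right, wrapping around
        return matrix[ra][(ca + 1) % 5] + matrix[rb][(cb + 1) % 5]
    if ca == cb:   # same column: take the letter below, wrapping around
        return matrix[(ra + 1) % 5][ca] + matrix[(rb + 1) % 5][cb]
    # otherwise: same row, column of the other letter
    return matrix[ra][cb] + matrix[rb][ca]


def playfair_encrypt(text, keyword):
    matrix = build_matrix(keyword)
    return "".join(encrypt_pair(matrix, a, b) for a, b in split_pairs(text))


M = build_matrix("MONARCHY")
print(split_pairs("balloon"))
print(encrypt_pair(M, "A", "R"), encrypt_pair(M, "M", "U"),
      encrypt_pair(M, "H", "S"), encrypt_pair(M, "E", "A"))
```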

Security of Playfair
• Security much improved over monoalphabetic
• There are 26 x 26 = 676 digrams
• Needs a 676-entry digram frequency table to analyse (vs. 26 for a monoalphabetic) and correspondingly more ciphertext
• Widely used for many years (e.g. US and British military in WW I, other allied forces in WW II)
• Can be broken, given a few hundred letters
  • Still has much of the plaintext structure

Source: W. Stallings, Cryptography and Network Security, 2011 (figure 2.6)

Hill cipher
• Developed by mathematician Lester Hill, 1929
• Based on linear algebra
• Recall
  • denote by I the (square) unit matrix, having 1 on the main diagonal and 0 everywhere else
  • for any square matrix M we have that M x I = I x M = M
    • the property holds over any semiring, e.g. Z, R, but also Z_26
  • for a square matrix M, if there is a matrix N such that M x N = N x M = I, then we say that N is the inverse of M and we denote it M^{-1}
  • we do not discuss here determinants and calculating the inverse of a matrix (if it exists)

Hill cipher
• Each letter is represented by a number from 0 to 25, from a to z, as in the Caesar cipher
  • calculations are done modulo 26
• Key: an invertible matrix K modulo 26, of size m
  • Example with m=3:
    $$K = \begin{pmatrix} 17 & 17 & 5 \\ 21 & 18 & 21 \\ 2 & 2 & 19 \end{pmatrix}, \qquad K^{-1} = \begin{pmatrix} 4 & 9 & 15 \\ 15 & 17 & 6 \\ 24 & 0 & 17 \end{pmatrix}$$
• Plaintext:
  • split it into blocks of m consecutive letters
  • consider each block as a row vector p with m entries modulo 26
  • encrypt each block separately to yield an encrypted row vector of the same size
• Encryption: c = pK mod 26
    $$(c_1, c_2, c_3) = (p_1, p_2, p_3) \begin{pmatrix} k_{11} & k_{12} & k_{13} \\ k_{21} & k_{22} & k_{23} \\ k_{31} & k_{32} & k_{33} \end{pmatrix} \bmod 26$$
• Decryption: p = cK^{-1} mod 26
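A minimal sketch of the row-vector arithmetic with the example matrices above. The function names and the sample block "pay" are chosen here for illustration; the text is assumed to consist of lower-case letters only, with a length that is a multiple of m.

```python
K = [[17, 17, 5], [21, 18, 21], [2, 2, 19]]
K_INV = [[4, 9, 15], [15, 17, 6], [24, 0, 17]]


def vec_mat_mod26(v, m):
    """Row vector times matrix, modulo 26."""
    n = len(m)
    return [sum(v[i] * m[i][j] for i in range(n)) % 26 for j in range(n)]


def hill(text, matrix):
    """Apply c = pK mod 26 block by block (use K_INV to decrypt)."""
    nums = [ord(c) - ord("a") for c in text]
    m = len(matrix)
    if len(nums) % m != 0:
        raise ValueError("text length must be a multiple of the block size")
    out = []
    for start in range(0, len(nums), m):
        out.extend(vec_mat_mod26(nums[start:start + m], matrix))
    return "".join(chr(n + ord("a")) for n in out)


print(hill("pay", K))       # rrl
print(hill("rrl", K_INV))   # pay
```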

Hill cipher - cryptanalysis
• Quite strong against ciphertext-only attacks
• Weak against the known-plaintext attack
  • collect m plaintext-ciphertext pairs, where m is the size of the key
  • write the m plaintexts as the rows of a square matrix P of size m
  • write the m ciphertexts as the rows of a square matrix C of size m
  • we have that C = PK mod 26
  • if P is invertible modulo 26, then K = P^{-1}C mod 26
  • if P is not invertible, then collect more plaintext-ciphertext pairs until an invertible P is obtained
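The attack can be illustrated for m = 2, where the inverse modulo 26 has a simple closed form. This is only a sketch: the 2x2 key SECRET_K and the plaintext blocks "hi" and "ll" are hypothetical values chosen for the demonstration, and the inverse uses the 2x2 adjugate formula (determinants are not covered in the lecture). It needs Python 3.8+ for pow(x, -1, 26).

```python
def mat_mul(a, b):
    """Product of two square matrices, modulo 26."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % 26 for j in range(n)]
            for i in range(n)]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix modulo 26 via the adjugate formula.
    pow(det, -1, 26) raises ValueError when the matrix is not invertible."""
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % 26
    det_inv = pow(det, -1, 26)
    return [[(m[1][1] * det_inv) % 26, (-m[0][1] * det_inv) % 26],
            [(-m[1][0] * det_inv) % 26, (m[0][0] * det_inv) % 26]]


SECRET_K = [[3, 2], [8, 5]]   # hypothetical key, unknown to the attacker
P = [[7, 8], [11, 11]]        # known plaintext blocks "hi" and "ll"
C = mat_mul(P, SECRET_K)      # the matching ciphertext blocks, C = PK mod 26

recovered = mat_mul(inverse_2x2(P), C)   # K = P^{-1} C mod 26
print(recovered)
```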

Measures to hide the structure of the plaintext
1. Encrypt multiple letters of the plaintext at once
2. Use more than one substitution in encryption and decryption (polyalphabetic ciphers)

Polyalphabetic substitution ciphers
• Idea: use different monoalphabetic substitutions as one proceeds through the plaintext
  • Makes cryptanalysis harder, with more alphabets (substitutions) to guess, and flattens the frequency distribution
  • A key determines which particular substitution is used in each step
• Example: the Vigenère cipher

Vigenère Cipher
• Proposed by Giovan Battista Bellaso (1553) and reinvented by Blaise de Vigenère (1586), called "le chiffre indéchiffrable" for 300 years
• Effectively multiple Caesar ciphers
• Key is a word K = k1 k2 ... kd
• Encryption
  • Read one letter t from the plaintext and one letter k from the key-word
  • t is encrypted according to the Caesar cipher with key k
  • for the next plain-letter, use the next letter from the key-word
  • When the key word is finished, start reading the key from the beginning
  • In other words: c_i = (p_i + k_{i mod d}) mod 26, with the key letters used cyclically
• Decryption works in reverse
• Example: key is "bcde"; "testing" is encrypted as "ugvxjpj"
  • Note that the two 't' are encrypted by different letters: 'u' and 'x'
  • The two 'j' in the cryptotext come from different plain letters: 'i' and 'g'
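The encryption rule can be sketched as below. The single function with a decrypt flag is a design choice made here, not something from the lecture; text and key are assumed to contain letters only, and positions are counted from 0 so that the key index is simply i mod d.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """c_i = (p_i + k_{i mod d}) mod 26; decryption subtracts the key letter."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text.lower()):
        shift = ord(key[i % len(key)].lower()) - ord("a")
        out.append(chr((ord(ch) - ord("a") + sign * shift) % 26 + ord("a")))
    return "".join(out)


print(vigenere("testing", "bcde"))                 # ugvxjpj
print(vigenere("ugvxjpj", "bcde", decrypt=True))   # testing
```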

Source: W. Stallings, Cryptography and Network Security, 2011 (Table 2.3)

Vigenère tableau
(columns: plaintext letters; rows: key letters)
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
B   B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
C   C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
D   D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
E   E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
F   F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
G   G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
H   H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
I   I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
J   J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
K   K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
L   L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
M   M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
N   N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
O   O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
P   P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
Q   Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
R   R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
S   S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
T   T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
U   U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
V   V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
W   W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
X   X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
Y   Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
Z   Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
Example
• write the plaintext out
• write the keyword repeated above it
• use each key letter as a Caesar cipher key
• encrypt the corresponding plaintext letter
• e.g. using keyword deceptive
  key:    deceptivedeceptivedeceptive
  plain:  wearediscoveredsaveyourself
  cipher: ZICVTWQNGRZGVTWAVZHCQYGLMGJ

Security of Vigenère Ciphers
• Its strength lies in the fact that each plaintext letter has multiple ciphertext letters
• Letter frequencies are obscured (but not totally lost)

Breaking Vigenère: the Kasiski Method (cryptotext only)
• Method developed by Babbage (1854) / Kasiski (1863)
  • Famous incident with breaking the Zimmermann telegram (Jan 16, 1917), contributed to the US entering WWI
• We need to find the key word and for this, we first find its length
  • Idea: if the length is N, then the letters on positions 1, N+1, 2N+1, 3N+1, etc. are encrypted with Caesar; same for letters on positions i, N+i, 2N+i, 3N+i, etc., where i runs from 1 to N
  • Clearly, if we deduce the length of the key word, then breaking the system is easy: break N Caesar systems
• Finding the length of the key word
  • If the plaintext starts with "the" (encrypted say by "XYZ") and "the" also occurs starting from position N+1, then the 2nd occurrence of "the" will also be encrypted by "XYZ"
  • Idea: repetitions in ciphertext give clues to the period
  • Approach: find a piece of ciphertext that is repeated several times (say, at distance 6, 9, 18, 9 from each other)
  • If they really come from the same piece of plaintext, then the length of the key word will be a divisor of all those distances (in our example, the length of the key word must be 3)
• Example
  plain:  wearediscoveredsaveyourself
  key:    deceptivedeceptivedeceptive
  cipher: ZICVTWQNGRZGVTWAVZHCQYGLMGJ
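The search for repetitions can be sketched as follows; the function names are illustrative. For the example ciphertext the only repeated trigram is VTW, and the distance between its two occurrences is a multiple of the key length.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeated_ngrams(ciphertext: str, n: int = 3) -> dict:
    """Map each n-gram that occurs more than once to its start positions."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    return {g: p for g, p in positions.items() if len(p) > 1}


def kasiski_guess(ciphertext: str, n: int = 3):
    """gcd of all distances between consecutive repetitions; the key length
    should divide this number if the repetitions are not accidental."""
    distances = [b - a
                 for p in repeated_ngrams(ciphertext, n).values()
                 for a, b in zip(p, p[1:])]
    return reduce(gcd, distances) if distances else None


CT = "ZICVTWQNGRZGVTWAVZHCQYGLMGJ"
print(repeated_ngrams(CT))   # {'VTW': [3, 12]}
print(kasiski_guess(CT))     # 9
```

With such a short text only one repetition is available, so the guess is the distance itself; longer ciphertexts give several distances and a more reliable gcd.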

Improvement on Vigenère: autokey system
• If the key were as long as the message, then the system would be defended against the previous attack
• Vigenère proposed the autokey cipher
  • the keyword is followed by the message itself (see example below)
• Decryption
  • Knowing the keyword, one can recover the first few letters
  • Use these in turn on the rest of the message
• Note: the system still has frequency characteristics to attack and can be rather easily defeated
  • Weakness: plaintext and key share the same statistical distribution of letters
• Example: the key is deceptive
  plaintext:  wearediscoveredsaveyourself
  key:        deceptivewearediscoveredsav
  ciphertext: ZICVTWQNGKZEIIGASXSTSLVVWLA
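A minimal sketch of the autokey system, assuming lower-case letters only. The function names are hypothetical. Decryption shows how each recovered plaintext letter is appended to the key stream.

```python
A = ord("a")


def autokey_encrypt(plaintext: str, keyword: str) -> str:
    """Key stream = keyword followed by the plaintext itself."""
    keystream = (keyword + plaintext)[:len(plaintext)]
    return "".join(chr((ord(p) - A + ord(k) - A) % 26 + A)
                   for p, k in zip(plaintext, keystream))


def autokey_decrypt(ciphertext: str, keyword: str) -> str:
    """Recover letters one by one; each recovered letter extends the key."""
    keystream = list(keyword)
    plain = []
    for i, c in enumerate(ciphertext):
        p = chr((ord(c) - ord(keystream[i])) % 26 + A)
        plain.append(p)
        keystream.append(p)
    return "".join(plain)


ct = autokey_encrypt("wearediscoveredsaveyourself", "deceptive")
print(ct.upper())                        # ZICVTWQNGKZEIIGASXSTSLVVWLA
print(autokey_decrypt(ct, "deceptive"))  # wearediscoveredsaveyourself
```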

Vernam cipher
• Proposed as a reaction to the Kasiski method and to the statistical attack on the auto-key method
• Proposed by Gilbert Vernam (1918), an AT&T engineer
• Key: a (very) long sequence of bits written on a self-looped tape
• Plaintext: binary sequence (rather than sequence of letters)
• Encryption: c_i = p_i ⊕ k_i
• Decryption: p_i = c_i ⊕ k_i
• Weakness: the repeating key
  • possible attack with (very long) ciphertext or with known plaintext
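The XOR rule and the weakness of a looping key can be sketched on bytes instead of single bits. The message and the 3-byte key below are made-up values for the demonstration; a real tape would be far longer.

```python
from itertools import cycle


def vernam(data: bytes, key: bytes) -> bytes:
    """c_i = p_i XOR k_i, with the key repeated like a looped tape.
    The same function decrypts, since XOR is its own inverse."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


message = b"attack at dawn"      # hypothetical plaintext
key = bytes([0x5A, 0xC3, 0x19])  # hypothetical (far too short) key

ciphertext = vernam(message, key)
print(vernam(ciphertext, key))   # b'attack at dawn'

# Known-plaintext attack: plaintext XOR ciphertext reveals the key stream,
# and the repetition of the key becomes visible.
keystream = bytes(p ^ c for p, c in zip(message, ciphertext))
print(keystream.hex())
```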

One-Time pad
• Proposed by Army Signal Corps officer Joseph Mauborgne (1918) as an improvement over the Vernam cipher
• Idea: use a (truly) random key as long as the plaintext
• It is unbreakable since the ciphertext bears no statistical relationship to the plaintext
  • Moreover, for any plaintext and any ciphertext there exists a key mapping one to the other
  • Thus, a ciphertext can be decrypted to any plaintext of the same length
  • The cryptanalyst is in an impossible situation
  • Example: the ciphertext RPAY may have come from "dead", "live", "book" or any other 4-letter combination
  • there is nothing to learn about the key by listening on the channel because the key is never repeated
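The RPAY example can be made concrete. The sketch below assumes a letter-based pad with addition modulo 26 (as in the Vigenère cipher), which the slide does not state explicitly; the function names are illustrative.

```python
def key_for(plaintext: str, ciphertext: str) -> str:
    """The unique key k with c_i = (p_i + k_i) mod 26 for the given pair."""
    return "".join(chr((ord(c) - ord(p.upper())) % 26 + ord("A"))
                   for p, c in zip(plaintext, ciphertext))


def decrypt(ciphertext: str, key: str) -> str:
    """p_i = (c_i - k_i) mod 26, plaintext in lower case."""
    return "".join(chr((ord(c) - ord(k)) % 26 + ord("a"))
                   for c, k in zip(ciphertext, key))


# Every 4-letter word is a possible decryption of RPAY under some key.
for word in ("dead", "live", "book"):
    k = key_for(word, "RPAY")
    print(word, k, decrypt("RPAY", k))
```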

Security of the one-time pad
• The security is entirely given by the randomness of the key and by never repeating a key
  • If the key is truly random, then the ciphertext is random
  • A key can only be used once if the cryptanalyst is to be kept in the "dark"
  • Perfect secrecy
• Problems with this perfect cryptosystem
  • Making large quantities of truly random characters is a significant task
  • Key distribution is enormously difficult: for any message to be sent, a key of equal length must be available to both parties
• Very limited use in practice, only over limited-bandwidth channels requiring perfect security

Other technique of encryption: transpositions
• We have considered so far substitutions to hide the plaintext: each letter is mapped into a letter according to some substitution
• Different idea: perform some sort of permutation on the plaintext letters
  • Hide the message by rearranging the letter order without altering the actual letters used
• The simplest such technique: rail fence technique

Rail Fence cipher
• Idea: write plaintext letters diagonally over a number of rows, then read off cipher row by row
• E.g., with a rail fence of depth 2, to encrypt the text "meet me after the toga party", write message out as:
  m e m a t r h t g p r y
   e t e f e t e o a a t
• Ciphertext is read from the above row-by-row:
  MEMATRHTGPRYETEFETEOAAT
• Attack: trivial (no key involved)
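A short sketch of the rail fence for an arbitrary depth; the zigzag handling for depths above 2 is the usual generalization and is an assumption here, since the slide only shows depth 2.

```python
def rail_fence_encrypt(plaintext: str, depth: int = 2) -> str:
    """Write the letters in a zigzag over `depth` rows, then read row by row."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    if depth < 2:
        return "".join(letters)
    rails = [[] for _ in range(depth)]
    row, step = 0, 1
    for ch in letters:
        rails[row].append(ch)
        if row == 0:
            step = 1           # bounce off the top rail
        elif row == depth - 1:
            step = -1          # bounce off the bottom rail
        row += step
    return "".join("".join(r) for r in rails)


print(rail_fence_encrypt("meet me after the toga party", 2))
# MEMATRHTGPRYETEFETEOAAT
```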

Row transposition ciphers
• More complex scheme: row transposition
  • Write letters of message out in rows over a specified number of columns
  • Read the cryptotext column-by-column, with the columns permuted according to some key
• Example: "attack postponed until two am" with key 4312567: first read the column marked by 1, then the one marked by 2, etc.
  Key:        4 3 1 2 5 6 7
  Plaintext:  a t t a c k p
              o s t p o n e
              d u n t i l t
              w o a m x y z
  Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ
• If we number the letters in the plaintext from 1 to 28, then the result of the first encryption is the following permutation of letters from plaintext:
  03 10 17 24 04 11 18 25 02 09 16 23 01 08 15 22 05 12 19 26 06 13 20 27 07 14 21 28
• Note the regularity of that sequence!
• Cryptanalysis: write the ciphertext onto columns and play with the order of the columns
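The example can be reproduced with a few lines of Python. The function name is illustrative; the text is assumed to be already padded (here with "xyz") to a multiple of the number of columns.

```python
def row_transposition(text: str, key: list) -> str:
    """Write text in rows of len(key) columns, then read the columns
    in the order given by the key labels 1, 2, 3, ..."""
    ncols = len(key)
    if len(text) % ncols != 0:
        raise ValueError("pad the text to a multiple of the key length")
    rows = [text[i:i + ncols] for i in range(0, len(text), ncols)]
    # order[0] is the index of the column labelled 1, and so on
    order = sorted(range(ncols), key=lambda c: key[c])
    return "".join(row[c] for c in order for row in rows)


KEY = [4, 3, 1, 2, 5, 6, 7]
print(row_transposition("ATTACKPOSTPONEDUNTILTWOAMXYZ", KEY))
# TTNAAPTMTSUOAODWCOIXKNLYPETZ
```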

Iterating the encryption makes it more secure
• Idea: use the same scheme once more to increase security
  Key:    4 3 1 2 5 6 7
  Input:  T T N A A P T
          M T S U O A O
          D W C O I X K
          N L Y P E T Z
  Output: NSCYAUOPTTWLTMDNAOIEPAXTTOKZ
• After the second transposition we get the following sequence of letters:
  17 09 05 27 24 16 12 07 10 02 22 20 03 25 15 13 04 23 19 14 11 01 26 21 18 08 06 28
• This is far less structured and so, more difficult to cryptanalyze
• In general, it is easy to recognize pure transposition ciphers: they have the same letter frequency as the language of the plaintext
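To see how the letter positions move, the same transposition can be applied to the list of positions 1..28 instead of the letters. This is a sketch with illustrative names; it works on lists so that it can track either letters or position numbers.

```python
def transpose(items: list, key: list) -> list:
    """Row transposition on a list: fill rows of len(key) columns,
    read the columns in the order of the key labels."""
    ncols = len(key)
    rows = [items[i:i + ncols] for i in range(0, len(items), ncols)]
    order = sorted(range(ncols), key=lambda c: key[c])
    return [row[c] for c in order for row in rows]


KEY = [4, 3, 1, 2, 5, 6, 7]
positions = list(range(1, 29))

once = transpose(positions, KEY)    # positions after one transposition
twice = transpose(once, KEY)        # positions after the second one
print(" ".join(f"{p:02d}" for p in once))
print(" ".join(f"{p:02d}" for p in twice))

letters = list("ATTACKPOSTPONEDUNTILTWOAMXYZ")
print("".join(transpose(transpose(letters, KEY), KEY)))
```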

Product Ciphers
• Ciphers using substitutions or transpositions are not secure because of language characteristics
• Idea: using several ciphers in succession increases security
• However:
  • two substitutions only make another (more complex?) substitution
  • two transpositions make another (more complex?) transposition
  • a substitution followed by a transposition makes a new, much harder cipher
• This is the bridge from classical to modern ciphers

Rotor Machines
• Before modern ciphers, rotor machines were the most common product cipher
• Widely used in WWII
  • German Enigma, Allied Hagelin, Japanese Purple
• Implemented a very complex, varying substitution cipher
• Principle: the machine has a set of independently rotating cylinders through which electrical impulses flow
  • Each cylinder has 26 input pins and 26 output pins with internal wiring that connects each input pin to a unique, fixed output pin (one cylinder thus defines a monoalphabetic substitution cipher)
  • The output pins of one cylinder are connected to the input pins of the next cylinder
  • After each keystroke, the last cylinder rotates one position and the others remain still
  • After a complete rotation of the last cylinder (26 keystrokes), the cylinder before it rotates one position, etc.
• 3 cylinders have a period of 26^3 = 17 576
• 4 cylinders have a period of 26^4 = 456 976
• 5 cylinders have a period of 26^5 = 11 881 376
  • each period is far larger than the length of the typical message sent at any one time
  • different transmissions would use different keys
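The stepping rule behaves like an odometer, which is where the periods come from. The following sketch only simulates the cylinder positions (not the wiring) and counts keystrokes until all cylinders are back in their start position; the function name is illustrative.

```python
def period(n_cylinders: int, size: int = 26) -> int:
    """Count keystrokes until every cylinder returns to position 0.
    The last cylinder steps on each keystroke; a full turn of one
    cylinder steps the cylinder before it."""
    positions = [0] * n_cylinders
    steps = 0
    while True:
        i = n_cylinders - 1
        while i >= 0:
            positions[i] = (positions[i] + 1) % size
            if positions[i] != 0:
                break          # no carry: the cylinders before stay still
            i -= 1             # full turn: step the previous cylinder too
        steps += 1
        if not any(positions):
            return steps


print(period(3))       # 17576
print(26**4, 26**5)    # 456976 11881376
```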

The Enigma machine (pictures from Wikipedia)