
Cryptography and Network Security

Spring 2012 http://users.abo.fi/ipetre/crypto/

Lecture 2: Classical encryption techniques

Ion Petre Department of IT, Åbo Akademi University

January 10, 2012 http://users.abo.fi/ipetre/crypto/

Overview of the course

I. CRYPTOGRAPHY
  • Secret-key cryptography
    • Classical encryption techniques
    • DES, AES, RC5, RC4
  • Public-key cryptography
    • RSA

II. AUTHENTICATION
  • MAC
  • Hashes and message digests
  • Digital signatures
  • Kerberos

III. NETWORK SECURITY
  • Email security
  • IP security
  • Web security (SSL, secure electronic transactions)
  • Firewalls
  • Wireless security

IV. OTHER ISSUES
  • Viruses
  • Digital cash
  • schemes
  • Zero-knowledge techniques

Part I. Cryptography

• Will cover more than half of this course
• I.1 Secret-key cryptography
  • Also called symmetric or conventional cryptography
  • Five ingredients
    • Plaintext
    • Encryption algorithm: runs on the plaintext and the encryption key to yield the ciphertext
    • Secret key: an input to the encryption algorithm, with a value independent of the plaintext; different keys yield different outputs
    • Ciphertext: the scrambled text produced as output by the encryption algorithm
    • Decryption algorithm: runs on the ciphertext and the key to produce the plaintext
  • Requirements for secure conventional encryption
    • Strong encryption algorithm
      • An opponent who knows one or more ciphertexts should not be able to find the plaintexts or the key
      • Ideally, even if he knows one or more plaintext-ciphertext pairs, he should not be able to find the key
    • Sender and receiver must share the same key. Once the key is compromised, all communications using that key are readable
  • It should be impractical to decrypt a message on the basis of the ciphertext plus knowledge of the encryption algorithm
    • The encryption algorithm is not a secret

Cryptography: some notations

• Notation for relating the plaintext, the ciphertext, and the keys
• $C = E_K(P)$ denotes that $C$ is the encryption of the plaintext $P$ using the key $K$
• $P = D_K(C)$ denotes that $P$ is the decryption of the ciphertext $C$ using the key $K$
• Then $D_K(E_K(P)) = P$

Caesar cipher

• It is a typical substitution cipher and the oldest known one, attributed to Julius Caesar
• Simple rule: replace each letter of the alphabet with the letter standing 3 places further down the alphabet
• Example:
    MEET ME AFTER THE TOGA PARTY
    PHHW PH DIWHU WKH WRJD SDUWB
• Here the key is 3; choose another key to get a different substitution
• The alphabet is wrapped around so that A follows Z:
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

• Mathematically, give each letter a number:

    a  b  c  d  e  f  g  h  i  j  k  l  m
    0  1  2  3  4  5  6  7  8  9 10 11 12

    n  o  p  q  r  s  t  u  v  w  x  y  z
   13 14 15 16 17 18 19 20 21 22 23 24 25

• The key is a number $k$ from 1 to 25 (a shift of 0 would leave the text unchanged)
• The Caesar cipher can now be given as
    $E(p) = (p + k) \bmod 26$
    $D(C) = (C - k) \bmod 26$
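The two formulas can be tried out directly. The following is a minimal Python sketch, not part of the original slides; the function names `caesar_encrypt` and `caesar_decrypt` are illustrative choices, and non-letters are simply passed through unchanged.

```python
import string

ALPHABET = string.ascii_lowercase  # a=0, b=1, ..., z=25


def caesar_encrypt(plaintext: str, k: int) -> str:
    """E(p) = (p + k) mod 26, applied letter by letter (ciphertext in capitals)."""
    out = []
    for ch in plaintext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26].upper())
        else:
            out.append(ch)  # spaces and punctuation are left as they are
    return "".join(out)


def caesar_decrypt(ciphertext: str, k: int) -> str:
    """D(C) = (C - k) mod 26; shifting by -k undoes the encryption."""
    return caesar_encrypt(ciphertext, -k).lower()


print(caesar_encrypt("meet me after the toga party", 3))
# PHHW PH DIWHU WKH WRJD SDUWB
```

Python's `%` operator always returns a non-negative result for a positive modulus, so the backward shift needs no special handling.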

Attacking Caesar

• Caesar can be broken if we know just one pair (plain letter, encrypted letter)
  • The difference between them is the key
• Caesar can be broken even if we only have the encrypted text and no knowledge of the plaintext
  • Brute-force attack is easy: there are only 25 possible keys
  • Try all 25 keys and check which key gives an intelligible message
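A brute-force attack of this kind fits in a few lines. The sketch below is an illustration only: the helper names are made up here, and recognizing the intelligible candidate is left to the human reader.

```python
import string

UPPER = string.ascii_uppercase


def shift_back(ciphertext: str, k: int) -> str:
    """Undo a Caesar shift of k on the letters of the ciphertext."""
    return "".join(
        UPPER[(UPPER.index(c) - k) % 26].lower() if c in UPPER else c
        for c in ciphertext.upper()
    )


def caesar_candidates(ciphertext: str) -> dict:
    """Return the 25 candidate plaintexts, indexed by the key that produced them."""
    return {k: shift_back(ciphertext, k) for k in range(1, 26)}


for k, guess in caesar_candidates("PHHW PH DIWHU WKH WRJD SDUWB").items():
    print(k, guess)
```

Only the line for key 3 reads as English, which is exactly the check the attack relies on.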

Why is Caesar easy to break?

• Only 25 keys to try
• The language of the plaintext is known and easily recognizable
  • What if the language is unknown?
  • What if the plaintext is a binary file of an unknown format?

From Stallings, “Cryptography and Network Security”

Strengthening Caesar: monoalphabetic ciphers

• Caesar only has 25 possible keys, which is far from secure
• Idea: instead of shifting the letters by a fixed amount, allow any permutation of the alphabet

    Plain:  abcdefghijklmnopqrstuvwxyz
    Cipher: DKVQFIBJWPESCXHTMYAUOLRGZN

    Plaintext:  if we wish to replace letters
    Ciphertext: WI RF RWAJ UH YFTSDVF SFUUFYA

• This is called a monoalphabetic substitution cipher: a single cipher alphabet is used
• The increase in the number of keys is dramatic: 26!, i.e., more than $4 \times 10^{26}$ possible keys
• Compare: DES only has on the order of $10^{16}$ possible keys
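The substitution above can be reproduced with Python's built-in translation tables. This is a small sketch for illustration; the names `encrypt` and `decrypt` are arbitrary, and the cipher alphabet is the one given in the example.

```python
import math
import string

PLAIN = string.ascii_lowercase
CIPHER = "DKVQFIBJWPESCXHTMYAUOLRGZN"  # the permutation used in the example

ENC_TABLE = str.maketrans(PLAIN, CIPHER)
DEC_TABLE = str.maketrans(CIPHER, PLAIN)


def encrypt(plaintext: str) -> str:
    """Replace every plaintext letter by its partner in the cipher alphabet."""
    return plaintext.lower().translate(ENC_TABLE)


def decrypt(ciphertext: str) -> str:
    """Apply the inverse permutation."""
    return ciphertext.upper().translate(DEC_TABLE)


print(encrypt("if we wish to replace letters"))
# WI RF RWAJ UH YFTSDVF SFUUFYA
print(math.factorial(26))  # size of the key space, 26!
```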

How large is large?

Reference                                   Order of magnitude
------------------------------------------  ---------------------------------------
Seconds in a year                           $\approx 3 \times 10^{7}$
Age of our solar system (years)             $\approx 6 \times 10^{9}$
Seconds since creation of solar system      $\approx 2 \times 10^{17}$
Clock cycles per year, 3 GHz computer       $\approx 9.6 \times 10^{16}$
Binary strings of length 64                 $2^{64} \approx 1.8 \times 10^{19}$
Binary strings of length 128                $2^{128} \approx 3.4 \times 10^{38}$
Binary strings of length 256                $2^{256} \approx 1.2 \times 10^{77}$
Number of 75-digit prime numbers            $\approx 5.2 \times 10^{72}$
Electrons in the universe                   $\approx 8.37 \times 10^{77}$

Adapted from Handbook of Applied Cryptography (A. Menezes, P. van Oorschot, S. Vanstone), 1996

Monoalphabetic ciphers

• Having more than $4 \times 10^{26}$ possible keys appears to make the system challenging: brute-force attacks are difficult to perform
• There is however another line of attack that easily defeats the system even when only a relatively small ciphertext is known
• If the cryptanalyst knows the nature of the text, e.g., uncompressed English text, then he can exploit the regularities of the language

Language redundancy and cryptanalysis

• Human languages are redundant
• Letters are not equally commonly used
• In English, E is by far the most common letter
  • Followed by T, R, N, I, O, A, S
• Other letters are fairly rare
  • See Z, J, K, Q, X
• Tables of single, double and triple letter frequencies exist
  • The most common digram in English is TH
  • The most common trigram in English is THE

English Letter Frequencies (figure)

Cryptanalysis of monoalphabetic ciphers

• Key concept: monoalphabetic substitution ciphers do not change relative letter frequencies
• Discovered by Arab scholars in the 9th century
• Calculate letter frequencies for the ciphertext
• Compare counts/plots against known values
  • The most frequent letter in the ciphertext may well encrypt E
  • The next one could encrypt T or A
  • After relatively few tries the system is broken
• If the ciphertext is relatively short (so that the frequencies are not fully reliable), then more guesses may be needed
• Powerful tool: look at the frequency of two-letter combinations (digrams)

Example of cryptanalysis

• Ciphertext:

    UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
    VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
    EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ

• Count relative letter frequencies: P is the most frequent (13.33%), followed by Z (11.67%), S (8.33%), U (8.33%), O (7.5%), M (6.67%), H (5.83%), etc.
• Guess that P and Z stand for E and T, but the order is not clear because of the small difference in frequency
• The next set of letters {S, U, O, M, H} may stand for letters from {A, H, I, N, O, R, S}, but again it is not completely clear which is which
• One may try to guess and see how the text translates
• Also, a good guess is that ZW, the most common digram in the ciphertext, is TH, the most common digram in English: thus, ZWP is THE
• Proceed with trial and error and finally get, after inserting the proper blanks:

it was disclosed yesterday that several informal but direct contacts have been made with political representatives of the viet cong in moscow
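The counting step of this analysis is easy to automate. The sketch below is only an illustration, with helper names chosen here; it computes the single-letter percentages and the digram counts for the ciphertext of the example, and the guessing remains manual.

```python
from collections import Counter

CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)


def letter_frequencies(text: str) -> list:
    """Return (letter, percentage) pairs, most frequent first."""
    counts = Counter(text)
    return [(ch, round(100 * n / len(text), 2)) for ch, n in counts.most_common()]


def digram_counts(text: str) -> Counter:
    """Count all overlapping two-letter combinations."""
    return Counter(a + b for a, b in zip(text, text[1:]))


for letter, pct in letter_frequencies(CIPHERTEXT)[:7]:
    print(letter, pct)
print(digram_counts(CIPHERTEXT).most_common(3))
```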

Some conclusions after this cryptanalysis

• Monoalphabetic ciphers are easy to break because they reflect the letter frequencies of the original alphabet
  • Essential to know the original alphabet
• Countermeasure: provide multiple substitutes for a given letter
  • Highly frequent letters such as E could be encrypted using a larger number of letters than less frequent letters such as Z: to encrypt E one could choose any one of, say, 15 fixed letters, and to encrypt Z any one of, say, 2 fixed letters
  • The number of substitutes for a letter may be proportional to its frequency in the original language (English)
  • This would hide the letter-frequency information
• However: multiple-letter patterns (digrams, trigrams, etc.) survive in the text, providing a tool for cryptanalysis
  • Each element of the plaintext only affects one element of the ciphertext
  • A longer text is needed for breaking the system, but cryptanalysis is still relatively straightforward

Measures to hide the structure of the plaintext

1. Encrypt multiple letters of the plaintext at once
2. Use more than one substitution in encryption and decryption (polyalphabetic ciphers)

• We consider both these approaches in the following

• The Playfair cipher is an example of multiple-letter encryption
• Invented by Sir Charles Wheatstone in 1854, but named after his friend Baron Playfair, who championed the cipher at the British Foreign Office
• Based on the use of a 5x5 matrix in which the letters of the alphabet are written (I is considered the same as J)
  • This is called the key matrix

Playfair key matrix

• A 5x5 matrix of letters based on a keyword
• Fill in the letters of the keyword (no duplicates)
  • Left to right, top to bottom
• Fill the rest of the matrix with the other letters in alphabetic order
• E.g., using the keyword MONARCHY, we obtain the following matrix:

    M O N A R
    C H Y B D
    E F G I K
    L P Q S T
    U V W X Z

Encrypting and decrypting with Playfair

• The plaintext is encrypted two letters at a time:

1. Break the plaintext into pairs of consecutive letters
2. If a pair is a repeated letter, insert a filler like 'X' in the plaintext, e.g., "balloon" is treated as "ba lx lo on"
3. If both letters fall in the same row of the key matrix, replace each with the letter to its right (wrapping back to the start from the end), e.g., "AR" encrypts as "RM"
4. If both letters fall in the same column, replace each with the letter below it (again wrapping to the top from the bottom), e.g., "MU" encrypts to "CM"
5. Otherwise each letter is replaced by the one in its row in the column of the other letter of the pair, e.g., "HS" encrypts to "BP", and "EA" to "IM" or "JM" (as desired)

• Decryption works in the reverse direction
• The examples above are based on this key matrix:

    M O N A R
    C H Y B D
    E F G I K
    L P Q S T
    U V W X Z
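The five rules translate almost line by line into code. The following Python sketch is an illustration under a few assumptions made here: J is folded into I, an odd trailing letter is padded with the filler 'X', and all function names are arbitrary.

```python
LETTERS = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # 25 letters, J is merged with I


def build_matrix(keyword: str) -> list:
    """Keyword letters first (no duplicates), then the remaining letters in order."""
    seen = []
    for ch in keyword.upper().replace("J", "I") + LETTERS:
        if ch in LETTERS and ch not in seen:
            seen.append(ch)
    return [seen[r * 5:(r + 1) * 5] for r in range(5)]


def digrams(plaintext: str, filler: str = "X") -> list:
    """Rules 1 and 2: split into pairs, separating repeated letters with a filler."""
    letters = [c for c in plaintext.upper().replace("J", "I") if c in LETTERS]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        b = letters[i + 1] if i + 1 < len(letters) else filler
        if a == b:
            pairs.append(a + filler)
            i += 1
        else:
            pairs.append(a + b)
            i += 2
    return pairs


def playfair_encrypt(plaintext: str, keyword: str = "MONARCHY") -> str:
    m = build_matrix(keyword)
    pos = {m[r][c]: (r, c) for r in range(5) for c in range(5)}
    out = []
    for a, b in digrams(plaintext):
        (ra, ca), (rb, cb) = pos[a], pos[b]
        if ra == rb:      # rule 3: same row, take the letter to the right
            out.append(m[ra][(ca + 1) % 5] + m[rb][(cb + 1) % 5])
        elif ca == cb:    # rule 4: same column, take the letter below
            out.append(m[(ra + 1) % 5][ca] + m[(rb + 1) % 5][cb])
        else:             # rule 5: same row, column of the other letter
            out.append(m[ra][cb] + m[rb][ca])
    return "".join(out)


print(digrams("balloon"))           # ['BA', 'LX', 'LO', 'ON']
print(playfair_encrypt("balloon"))  # IBSUPMNA
```

Decryption would use the same matrix with the shifts reversed (left instead of right, up instead of down); it is omitted to keep the sketch short.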

Security of Playfair

• Security much improved over monoalphabetic ciphers
• There are 26 x 26 = 676 digrams
• Needs a 676-entry digram frequency table to analyse (vs. 26 entries for a monoalphabetic cipher) and correspondingly more ciphertext
• Widely used for many years (e.g., US and British military in WW I, other Allied forces in WW II)
• Can be broken, given a few hundred letters
• Still retains much of the plaintext structure

Source: W. Stallings, Cryptography and Network Security, 2011 (Figure 2.6)

• The Hill cipher was developed by the mathematician Lester Hill, 1929
• Based on linear algebra
• Recall
  • denote by $I$ the unit matrix, having 1 on the main diagonal and 0 everywhere else
  • for any square matrix $M$ we have that $M \times I = I \times M = M$
  • the property holds over any semiring, e.g., $\mathbb{Z}$, $\mathbb{R}$, but also $\mathbb{Z}_{26}$
  • for a square matrix $M$, if there is a matrix $N$ such that $M \times N = N \times M = I$, then we say that $N$ is the inverse of $M$ and we denote it $M^{-1}$
  • we do not discuss here how to decide whether a matrix is invertible or how to calculate its inverse (if it exists)

Hill cipher

• Each letter is represented by a number from 0 to 25, from a to z, in the same way as in the Caesar cipher
  • calculations are done modulo 26
• Key: an invertible matrix $K$ modulo 26, of size $m$
• Example with $m = 3$:

$$
K = \begin{pmatrix} 17 & 17 & 5 \\ 21 & 18 & 21 \\ 2 & 2 & 19 \end{pmatrix},
\qquad
K^{-1} = \begin{pmatrix} 4 & 9 & 15 \\ 15 & 17 & 6 \\ 24 & 0 & 17 \end{pmatrix}
$$

• Plaintext: split it into blocks of $m$ consecutive letters
  • consider each block as a row vector $p$ with $m$ entries modulo 26
  • encrypt each block separately to yield an encrypted row vector of the same size
• Encryption: $c = pK \bmod 26$, i.e., for $m = 3$,

$$
(c_1, c_2, c_3) = (p_1, p_2, p_3)
\begin{pmatrix} k_{11} & k_{12} & k_{13} \\ k_{21} & k_{22} & k_{23} \\ k_{31} & k_{32} & k_{33} \end{pmatrix}
\bmod 26
$$

• Decryption: $p = cK^{-1} \bmod 26$
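The row-vector encryption and decryption can be checked with a few lines of plain Python. This is a sketch only: the function name `hill_apply` and the sample plaintext "paymoremoney" are illustrative choices that do not come from the slides, while the two matrices are the ones from the example with m=3.

```python
import string

LOWER = string.ascii_lowercase

K = [[17, 17, 5], [21, 18, 21], [2, 2, 19]]
K_INV = [[4, 9, 15], [15, 17, 6], [24, 0, 17]]


def hill_apply(text: str, matrix: list) -> str:
    """Multiply each block of m letters, seen as a row vector, by the matrix mod 26."""
    m = len(matrix)
    nums = [LOWER.index(c) for c in text.lower() if c in LOWER]
    if len(nums) % m != 0:
        raise ValueError("text length must be a multiple of the block size")
    out = []
    for i in range(0, len(nums), m):
        block = nums[i:i + m]
        for col in range(m):
            out.append(sum(block[row] * matrix[row][col] for row in range(m)) % 26)
    return "".join(LOWER[n] for n in out)


ciphertext = hill_apply("paymoremoney", K).upper()
print(ciphertext)                     # RRLMWBKASPDH
print(hill_apply(ciphertext, K_INV))  # paymoremoney
```

Encryption and decryption share one routine because both are a vector-matrix product; only the matrix differs.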

Hill cipher: cryptanalysis

• Quite strong against ciphertext-only attacks
• Weak against the known-plaintext attack
  • collect $m$ plaintext-ciphertext pairs, where $m$ is the size of the key
  • write the $m$ plaintexts as the rows of a square matrix $P$ of size $m$
  • write the $m$ ciphertexts as the rows of a square matrix $C$ of size $m$
  • we have that $C = PK \bmod 26$
  • if $P$ is invertible modulo 26, then $K = P^{-1}C \bmod 26$
  • if $P$ is not invertible, then collect more plaintext-ciphertext pairs until an invertible $P$ is obtained
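To make the attack concrete, here is a small sketch for the case m=2. The plaintext-ciphertext pairs ("hi" to "HC", "ll" to "RZ") are an assumed toy example, not taken from the slides, and the helper names are arbitrary. The modular inverse of the determinant uses `pow(det, -1, 26)`, which needs Python 3.8 or later and raises `ValueError` when the matrix is not invertible modulo 26.

```python
def inverse_2x2_mod26(M: list) -> list:
    """Inverse of a 2x2 matrix modulo 26 via the adjugate formula."""
    (a, b), (c, d) = M
    det = (a * d - b * c) % 26
    det_inv = pow(det, -1, 26)  # ValueError if gcd(det, 26) != 1
    return [
        [(d * det_inv) % 26, (-b * det_inv) % 26],
        [(-c * det_inv) % 26, (a * det_inv) % 26],
    ]


def matmul_mod26(A: list, B: list) -> list:
    """Product of two 2x2 matrices modulo 26."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) % 26 for j in range(2)]
        for i in range(2)
    ]


def recover_key(P: list, C: list) -> list:
    """K = P^{-1} C mod 26, following the known-plaintext attack."""
    return matmul_mod26(inverse_2x2_mod26(P), C)


P = [[7, 8], [11, 11]]   # plaintext blocks "hi" and "ll" as rows
C = [[7, 2], [17, 25]]   # ciphertext blocks "HC" and "RZ" as rows
print(recover_key(P, C))  # [[3, 2], [8, 5]]
```

Multiplying `P` by the recovered key gives back `C`, which confirms the candidate key.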

Polyalphabetic substitution ciphers

• Idea: use different monoalphabetic substitutions as one proceeds through the plaintext
  • Makes cryptanalysis harder, with more alphabets (substitutions) to guess, and flattens the frequency distribution
  • A key determines which particular substitution is used in each step
• Example: the Vigenère cipher

Vigenère Cipher

• Proposed by Giovan Battista Bellaso (1553) and reinvented by Blaise de Vigenère (1586); called “le chiffre indéchiffrable” for 300 years
• Effectively multiple Caesar ciphers
• Key is a word $K = k_1 k_2 \ldots k_d$
• Encryption
  • Read one letter t from the plaintext and one letter k from the keyword
  • t is encrypted according to the Caesar cipher with key k
  • For the next plaintext letter, use the next letter from the keyword
  • When the keyword is finished, start reading the key from the beginning again
• In other words, with the key letters used cyclically: $c_i = (p_i + k_{((i-1) \bmod d) + 1}) \bmod 26$
• Decryption works in reverse
• Example: the key is “bcde”; “testing” is encrypted as “ugvxjpj”
  • Note that the two ‘t’ are encrypted by different letters: ‘u’ and ‘x’
  • The two ‘j’ in the cryptotext come from different plaintext letters: ‘i’ and ‘g’
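The encryption rule can be sketched in Python as follows. The function name `vigenere` and its `decrypt` flag are illustrative, and the key letters are numbered a=0, ..., z=25 as in the Caesar cipher.

```python
import string

LOWER = string.ascii_lowercase


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift the i-th letter by the key letter at position i mod len(key)."""
    shifts = [LOWER.index(k) for k in key.lower()]
    out, i = [], 0
    for ch in text.lower():
        if ch in LOWER:
            s = shifts[i % len(shifts)]
            if decrypt:
                s = -s
            out.append(LOWER[(LOWER.index(ch) + s) % 26])
            i += 1  # the key only advances on letters
        else:
            out.append(ch)
    result = "".join(out)
    return result if decrypt else result.upper()


print(vigenere("testing", "bcde"))                 # UGVXJPJ
print(vigenere("UGVXJPJ", "bcde", decrypt=True))   # testing
```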

Source: W. Stallings, Cryptography and Network Security, 2011 (Table 2.3)

Vigenère tableau (columns: plaintext letters; rows: key letters)

      A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  A   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  B   B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
  C   C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
  D   D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
  E   E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
  F   F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
  G   G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
  H   H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
  I   I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
  J   J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
  K   K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
  L   L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
  M   M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
  N   N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
  O   O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
  P   P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
  Q   Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
  R   R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
  S   S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
  T   T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
  U   U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
  V   V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
  W   W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
  X   X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
  Y   Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
  Z   Z A B C D E F G H I J K L M N O P Q R S T U V W X Y

Example
• write the plaintext out
• write the keyword repeated above it
• use each key letter as a Caesar cipher key
• encrypt the corresponding plaintext letter
• e.g., using the keyword deceptive:

    plain:  wearediscoveredsaveyourself
    key:    deceptivedeceptivedeceptive
    cipher: ZICVTWQNGRZGVTWAVZHCQYGLMGJ

Security of Vigenère Ciphers

• Its strength lies in the fact that each plaintext letter has multiple ciphertext letters
• Letter frequencies are obscured (but not totally lost)

Breaking Vigenère: the Kasiski Method (cryptotext only)

• Method developed by Babbage (1854) / Kasiski (1863)
• Famous incident with breaking the Zimmermann telegram (Jan 16, 1917), which contributed to the US entering WWI
• We need to find the keyword, and for this we first find its length
• Idea: if the length is N, then the letters on positions 1, N+1, 2N+1, 3N+1, etc. are all encrypted with the same Caesar cipher; the same holds for the letters on positions i, N+i, 2N+i, 3N+i, etc., where i runs from 1 to N
• Clearly, if we deduce the length of the keyword, then breaking the system is easy: break N Caesar systems
• Finding the length of the keyword
  • If the plaintext starts with “the” (encrypted, say, by “XYZ”) and “the” also occurs starting from position N+1, then the 2nd occurrence of “the” will also be encrypted by “XYZ”
  • Idea: repetitions in the ciphertext give clues to the period
  • Approach: find a piece of ciphertext that is repeated several times (say, at distances 6, 9, 18, 9 from each other)
  • If the repetitions really come from the same piece of plaintext, then the length of the keyword is a divisor of all those distances (in our example, the keyword length must be 3, apart from the trivial length 1)

Example

    plain:  wearediscoveredsaveyourself
    key:    deceptivedeceptivedeceptive
    cipher: ZICVTWQNGRZGVTWAVZHCQYGLMGJ
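The search for repeated ciphertext pieces is mechanical and can be sketched as below. The function names are illustrative; the sketch looks only at repeated trigrams and takes the greatest common divisor of the distances as a candidate key length.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeated_ngrams(ciphertext: str, n: int = 3) -> dict:
    """Map every n-gram that occurs more than once to its start positions."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    return {g: p for g, p in positions.items() if len(p) > 1}


def kasiski_distances(ciphertext: str, n: int = 3) -> dict:
    """Distances between consecutive occurrences of each repeated n-gram."""
    return {
        g: [b - a for a, b in zip(p, p[1:])]
        for g, p in repeated_ngrams(ciphertext, n).items()
    }


def candidate_period(distances: list) -> int:
    """The keyword length should divide the gcd of all observed distances."""
    return reduce(gcd, distances)


print(kasiski_distances("ZICVTWQNGRZGVTWAVZHCQYGLMGJ"))  # {'VTW': [9]}
print(candidate_period([6, 9, 18, 9]))                   # 3
```

In the example ciphertext the only repeated trigram is VTW, at distance 9, which matches the length of the keyword deceptive.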

Improvement on Vigenère: autokey system

• If the key were as long as the message, then the system would be defended against the previous attack
• Vigenère proposed the autokey cipher
  • the keyword is followed by the message itself (see the example below)
• Decryption
  • Knowing the keyword, one can recover the first few letters
  • Use these in turn on the rest of the message
• Note: the system still has frequency characteristics to attack and can be rather easily defeated
  • Weakness: plaintext and key share the same statistical distribution of letters
• Example: the key is deceptive

    plaintext:  wearediscoveredsaveyourself
    key:        deceptivewearediscoveredsav
    ciphertext: ZICVTWQNGKZEIIGASXSTSLVVWLA
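A short sketch of the autokey idea follows; the function names are arbitrary and the input is assumed to consist of letters only. Decryption shows the bootstrapping step: each recovered plaintext letter is appended to the key and used later on.

```python
import string

LOWER = string.ascii_lowercase


def autokey_encrypt(plaintext: str, keyword: str) -> str:
    """Key stream = keyword followed by the plaintext itself."""
    p = plaintext.lower()
    key = (keyword.lower() + p)[:len(p)]
    return "".join(
        LOWER[(LOWER.index(a) + LOWER.index(b)) % 26] for a, b in zip(p, key)
    ).upper()


def autokey_decrypt(ciphertext: str, keyword: str) -> str:
    """Recover letters one by one, extending the key with each recovered letter."""
    key = list(keyword.lower())
    out = []
    for i, c in enumerate(ciphertext.lower()):
        p = LOWER[(LOWER.index(c) - LOWER.index(key[i])) % 26]
        out.append(p)
        key.append(p)
    return "".join(out)


print(autokey_encrypt("wearediscoveredsaveyourself", "deceptive"))
# ZICVTWQNGKZEIIGASXSTSLVVWLA
```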

Vernam cipher

• Proposed as a reaction to the Kasiski method and to the statistical attack on the autokey method
• Proposed by Gilbert Vernam (1918), an AT&T engineer
• Key: a (very) long sequence of bits written on a self-looped tape
• Plaintext: binary sequence (rather than a sequence of letters)
• Encryption: $c_i = p_i \oplus k_i$
• Decryption: $p_i = c_i \oplus k_i$
• Weakness: the repeating key
  • possible attack with (very long) ciphertext or with known plaintext
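The XOR rule and the looped key tape can be sketched as follows. This is an illustration on bytes rather than single bits; the function name, the sample key, and the sample message are assumptions made here.

```python
from itertools import cycle


def vernam(data: bytes, key: bytes) -> bytes:
    """XOR every data byte with the key, restarting the key when it runs out."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


message = b"attack postponed until two am"
key = b"\x13\x37\xc0\xde"  # a short looped "tape", for illustration only

ciphertext = vernam(message, key)
print(vernam(ciphertext, key))  # b'attack postponed until two am'

# Known-plaintext attack: XORing ciphertext with plaintext exposes the key stream,
# and the repetition of the short key becomes visible.
print(vernam(ciphertext, message)[:8])
```

The same function serves for encryption and decryption because XOR is its own inverse.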

One-Time pad

• Proposed by Army Signal Corps officer Joseph Mauborgne (1918) as an improvement over the Vernam cipher
• Idea: use a (truly) random key as long as the plaintext
• It is unbreakable since the ciphertext bears no statistical relationship to the plaintext
• Moreover, for any plaintext and any ciphertext there exists a key mapping one to the other
  • Thus, a ciphertext can be decrypted to any plaintext of the same length
  • The cryptanalyst is in an impossible situation
  • Example: the ciphertext RPAY may have come from “dead”, “live”, “book” or any other 4-letter combination
  • There is nothing to learn about the key by listening on the channel because the key is never repeated
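The claim that some key maps any plaintext to any ciphertext of the same length can be checked directly. The sketch below assumes a letter-based pad with addition modulo 26 (the slide's RPAY example uses letters); the function names are illustrative.

```python
import string

LOWER = string.ascii_lowercase


def key_for(plaintext: str, ciphertext: str) -> str:
    """The unique pad k with c_i = (p_i + k_i) mod 26 for the given pair."""
    return "".join(
        LOWER[(LOWER.index(c) - LOWER.index(p)) % 26]
        for p, c in zip(plaintext.lower(), ciphertext.lower())
    )


def otp_decrypt(ciphertext: str, key: str) -> str:
    """p_i = (c_i - k_i) mod 26."""
    return "".join(
        LOWER[(LOWER.index(c) - LOWER.index(k)) % 26]
        for c, k in zip(ciphertext.lower(), key.lower())
    )


for guess in ("dead", "live", "book"):
    pad = key_for(guess, "RPAY")
    print(guess, pad, otp_decrypt("RPAY", pad))
```

Every guess is explained by some pad, so the ciphertext alone gives no reason to prefer one guess over another.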

Security of the one-time pad

• The security is entirely given by the randomness of the key and by never repeating a key
  • If the key is truly random, then the ciphertext is random
  • A key can only be used once if the cryptanalyst is to be kept in the “dark”
• Perfect secrecy
• Problems with this perfect cryptosystem
  • Making large quantities of truly random characters is a significant task
  • Key distribution is enormously difficult: for any message to be sent, a key of equal length must be available to both parties
• Very limited use in practice, only over limited-bandwidth channels requiring perfect security

Another encryption technique: transpositions

• So far we have considered substitutions to hide the plaintext: each letter is mapped into a letter according to some substitution
• Different idea: perform some sort of permutation on the plaintext letters
  • Hide the message by rearranging the letter order without altering the actual letters used
• The simplest such technique: the rail fence technique

• Idea: write the plaintext letters diagonally over a number of rows, then read off the cipher row by row
• E.g., with a rail fence of depth 2, to encrypt the text “meet me after the toga party”, write the message out as:

    m e m a t r h t g p r y
     e t e f e t e o a a t

• The ciphertext is read from the above row by row: MEMATRHTGPRYETEFETEOAAT
• Attack: trivial (no key involved)
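For depth 2, writing diagonally simply alternates between the two rows, which slicing expresses directly. The sketch below covers only this depth-2 case; the function names are illustrative.

```python
def rail_fence_2(plaintext: str) -> str:
    """Depth-2 rail fence: letters at even positions, then letters at odd positions."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    return "".join(letters[0::2] + letters[1::2])


def rail_fence_2_decrypt(ciphertext: str) -> str:
    """Interleave the top row (first half, rounded up) with the bottom row."""
    half = (len(ciphertext) + 1) // 2
    top, bottom = ciphertext[:half], ciphertext[half:]
    out = []
    for i in range(len(ciphertext)):
        out.append(top[i // 2] if i % 2 == 0 else bottom[i // 2])
    return "".join(out).lower()


print(rail_fence_2("meet me after the toga party"))
# MEMATRHTGPRYETEFETEOAAT
```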

Row transposition ciphers

• More complex scheme: row transposition
  • Write the letters of the message out in rows over a specified number of columns
  • Read the cryptotext column by column, with the columns permuted according to some key
• Example: “attack postponed until two am” with key 4312567: first read the column marked by 1, then the one marked by 2, etc.

    Key:        4 3 1 2 5 6 7
    Plaintext:  a t t a c k p
                o s t p o n e
                d u n t i l t
                w o a m x y z
    Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ

• If we number the letters in the plaintext from 1 to 28, then the result of the first encryption is the following permutation of the letters of the plaintext:

    03 10 17 24 04 11 18 25 02 09 16 23 01 08 15 22 05 12 19 26 06 13 20 27 07 14 21 28

• Note the regularity of that sequence!
• Cryptanalysis: write the ciphertext into columns and play with the order of the columns
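The column-reading rule can be sketched as follows. The function name is illustrative, and the sketch assumes the text has already been padded to a multiple of the key length (here with x, y, z, as in the example).

```python
def row_transposition(text: str, key: list) -> str:
    """Write the text in rows of len(key) letters, read columns in key order."""
    cols = len(key)
    if len(text) % cols != 0:
        raise ValueError("pad the text to a multiple of the key length first")
    rows = [text[i:i + cols] for i in range(0, len(text), cols)]
    # Column indices sorted by their key label: the column labelled 1 comes first.
    order = sorted(range(cols), key=lambda c: key[c])
    return "".join(row[c] for c in order for row in rows).upper()


KEY = [4, 3, 1, 2, 5, 6, 7]
print(row_transposition("attackpostponeduntiltwoamxyz", KEY))
# TTNAAPTMTSUOAODWCOIXKNLYPETZ
```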

Iterating the encryption makes it more secure

• Idea: use the same scheme once more to increase security

    Key:    4 3 1 2 5 6 7
    Input:  T T N A A P T
            M T S U O A O
            D W C O I X K
            N L Y P E T Z
    Output: NSCYAUOPTTWLTMDNAOIEPAXTTOKZ

• After the second transposition we get the following sequence of letters:

    17 09 05 27 24 16 12 07 10 02 22 20 03 25 15 13 04 23 19 14 11 01 26 21 18 08 06 28

• This is far less structured and so more difficult to cryptanalyze
• In general, pure transposition ciphers are easy to recognize: the ciphertext has the same letter frequencies as the language of the plaintext
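The two position sequences can be recomputed by transposing the numbers 1 to 28 instead of letters. This is a small sketch with illustrative names, useful mainly for checking such sequences by machine.

```python
def transpose(items: list, key: list) -> list:
    """Row transposition on a list: fill rows, read columns in key order."""
    cols = len(key)
    rows = [items[i:i + cols] for i in range(0, len(items), cols)]
    order = sorted(range(cols), key=lambda c: key[c])
    return [row[c] for c in order for row in rows]


KEY = [4, 3, 1, 2, 5, 6, 7]
positions = list(range(1, 29))

once = transpose(positions, KEY)
twice = transpose(once, KEY)
print(" ".join(f"{n:02d}" for n in once))
print(" ".join(f"{n:02d}" for n in twice))
```

The first line shows runs that increase by 7; after the second round no such simple pattern remains.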

Product Ciphers

• Ciphers using only substitutions or only transpositions are not secure because of language characteristics
• Idea: using several ciphers in succession increases security
• However:
  • two substitutions only make another (more complex?) substitution
  • two transpositions only make another (more complex?) transposition
  • a substitution followed by a transposition makes a new, much harder cipher

This is the bridge from classical to modern ciphers

Rotor Machines

• Before modern ciphers, rotor machines were the most common
• Widely used in WWII
  • German Enigma, Allied Hagelin, Japanese Purple
• Implemented a very complex, varying substitution cipher
• Principle: the machine has a set of independently rotating cylinders through which electrical impulses flow
  • Each cylinder has 26 input pins and 26 output pins, with internal wiring that connects each input pin to a unique, fixed output pin (one cylinder thus defines a monoalphabetic substitution cipher)
  • The output pins of one cylinder are connected to the input pins of the next cylinder
  • After each keystroke, the last cylinder rotates one position and the others remain still
  • After a complete rotation of the last cylinder (26 keystrokes), the cylinder before it rotates one position, etc.
• 3 cylinders have a period of $26^3 = 17576$
• 4 cylinders have a period of 456 976
• 5 cylinders have a period of 11 881 376
  • each period is far larger than the length of the typical message sent at any one time
  • different transmissions would use different keys
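The stepping rule works like an odometer, which explains the periods quoted above. The sketch below models only the cylinder positions (no wiring) and counts keystrokes until the starting configuration returns; the function names are illustrative.

```python
def step(positions: tuple) -> tuple:
    """One keystroke: the last cylinder advances; a full turn carries to the left."""
    pos = list(positions)
    for i in reversed(range(len(pos))):
        pos[i] = (pos[i] + 1) % 26
        if pos[i] != 0:  # no complete rotation, so no carry to the next cylinder
            break
    return tuple(pos)


def period(cylinders: int) -> int:
    """Number of keystrokes before the cylinders return to their start positions."""
    start = (0,) * cylinders
    pos, count = step(start), 1
    while pos != start:
        pos = step(pos)
        count += 1
    return count


print(period(3))         # 17576
print(26 ** 4, 26 ** 5)  # 456976 11881376
```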

(Pictures from Wikipedia)
