Classical Encryption

Cryptography and Network Security, Spring 2012
http://users.abo.fi/ipetre/crypto/
Lecture 2: Classical encryption
Ion Petre, Department of IT, Åbo Akademi University
January 10, 2012

Overview of the course

I. CRYPTOGRAPHY
- Secret-key cryptography
  - Classical encryption techniques
  - DES, AES, RC5, RC4
- Public-key cryptography
  - RSA
- Key management

II. AUTHENTICATION
- MAC
- Hashes and message digests
- Digital signatures
- Kerberos

III. NETWORK SECURITY
- Email security
- IP security
- Web security (SSL, secure electronic transactions)
- Firewalls
- Wireless security

IV. OTHER ISSUES
- Viruses
- Digital cash
- Secret sharing schemes
- Zero-knowledge techniques

Part I. Cryptography

Will cover more than half of this course.

I.1 Secret-key cryptography

Also called symmetric or conventional cryptography.

Five ingredients:
- Plaintext: the original message, given as input to the encryption algorithm
- Encryption algorithm: runs on the plaintext and the encryption key to yield the ciphertext
- Secret key: an input to the encryption algorithm whose value is independent of the plaintext; different keys yield different outputs
- Ciphertext: the scrambled text produced as output by the encryption algorithm
- Decryption algorithm: runs on the ciphertext and the key to produce the plaintext

Requirements for secure conventional encryption:
- Strong encryption algorithm
  - An opponent who knows one or more ciphertexts should not be able to find the plaintexts or the key
  - Ideally, even an opponent who knows one or more plaintext-ciphertext pairs should not be able to find the key
- Sender and receiver must share the same key
  - Once the key is compromised, all communications using that key are readable
- It must be impractical to decrypt a message given only the ciphertext and knowledge of the encryption algorithm: the encryption algorithm is not a secret

Cryptography: some notations

Notation relating the plaintext, the ciphertext, and the key:
- C = E_K(P) denotes that C is the encryption of the plaintext P using the key K
- P = D_K(C) denotes that P is the decryption of the ciphertext C using the key K
- Then D_K(E_K(P)) = P

Caesar cipher

- A typical substitution cipher and the oldest known, attributed to Julius Caesar
- Simple rule: replace each letter of the alphabet with the letter standing 3 places further down the alphabet
- Example:
    plaintext:  MEET ME AFTER THE TOGA PARTY
    ciphertext: PHHW PH DIWHU WKH WRJD SDUWB
- Here the key is 3; choosing another key gives a different substitution
- The alphabet is wrapped around so that A follows Z:
    plain:  a b c d e f g h i j k l m n o p q r s t u v w x y z
    cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

Caesar cipher

- Mathematically, give each letter a number:
     a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
- The key is a number k from 0 to 25 (k = 0 leaves the text unchanged, so there are 25 useful keys)
- With p the number of a plaintext letter and C the number of the corresponding ciphertext letter, the Caesar cipher can now be given as
    C = E(p) = (p + k) mod 26
    p = D(C) = (C - k) mod 26

Attacking Caesar

- Caesar can be broken if we know just one pair (plain letter, encrypted letter): the difference between them, taken mod 26, is the key
- Caesar can be broken even if we only have the encrypted text and no knowledge of the plaintext
  - Brute-force attack is easy: there are only 25 possible keys
  - Try all 25 keys and check which one gives an intelligible message

Why is Caesar easy to break?
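A minimal Python sketch of the shift rule E(p) = (p + k) mod 26 and of the exhaustive key search makes the point concrete. The function names (caesar_encrypt, caesar_decrypt, brute_force) are illustrative and not part of the lecture, and passing non-letters through unchanged is an assumption made here for readability.

```python
import string

ALPHABET = string.ascii_uppercase  # A..Z, numbered 0..25


def caesar_encrypt(plaintext, k):
    """Shift every letter k places: C = (p + k) mod 26.

    Non-letters (spaces, punctuation) are copied unchanged (an assumption).
    """
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26])
        else:
            out.append(ch)
    return "".join(out)


def caesar_decrypt(ciphertext, k):
    """Inverse shift: p = (C - k) mod 26."""
    return caesar_encrypt(ciphertext, -k)


def brute_force(ciphertext):
    """Return all 25 candidate (key, plaintext) pairs, for keys 1..25."""
    return [(k, caesar_decrypt(ciphertext, k)) for k in range(1, 26)]


if __name__ == "__main__":
    c = caesar_encrypt("MEET ME AFTER THE TOGA PARTY", 3)
    print(c)
    # A human (or a dictionary check) picks the one readable candidate.
    for k, candidate in brute_force(c):
        print(k, candidate)
```

The search only lists candidates; recognizing the intelligible one is left to the reader, which is exactly the step that becomes hard when the plaintext language or format is unknown.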
- Only 25 keys to try
- The language of the plaintext is known and easily recognizable
  - What if the language is unknown?
  - What if the plaintext is a binary file of an unknown format?

From Stallings, “Cryptography and Network Security”

Strengthening Caesar: monoalphabetic ciphers

- Caesar only has 25 possible keys, which is far from secure
- Idea: instead of shifting the letters by a fixed amount, allow any permutation of the alphabet
    Plain:      abcdefghijklmnopqrstuvwxyz
    Cipher:     DKVQFIBJWPESCXHTMYAUOLRGZN
    Plaintext:  if we wish to replace letters
    Ciphertext: WI RF RWAJ UH YFTSDVF SFUUFYA
- This is called a monoalphabetic substitution cipher: a single cipher alphabet is used
- The increase in the number of keys is dramatic: 26!, i.e., more than 4 x 10^26 possible keys
- Compare: DES only has on the order of 10^16 possible keys (2^56 ≈ 7.2 x 10^16)

How large is large?

Reference                                  Order of magnitude
Seconds in a year                          ≈ 3 x 10^7
Age of our solar system (years)            ≈ 6 x 10^9
Seconds since creation of solar system     ≈ 2 x 10^17
Clock cycles per year, 3 GHz computer      ≈ 9.6 x 10^16
Binary strings of length 64                2^64 ≈ 1.8 x 10^19
Binary strings of length 128               2^128 ≈ 3.4 x 10^38
Binary strings of length 256               2^256 ≈ 1.2 x 10^77
Number of 75-digit prime numbers           ≈ 5.2 x 10^72
Electrons in the universe                  ≈ 8.37 x 10^77

Adapted from Handbook of Applied Cryptography (A. Menezes, P. van Oorschot, S. Vanstone), 1996

Monoalphabetic ciphers

- Having more than 4 x 10^26 possible keys appears to make the system challenging: it is difficult to perform brute-force attacks
- There is, however, another line of attack that easily defeats the system even when only a relatively small ciphertext is known
- If the cryptanalyst knows the nature of the text, e.g., noncompressed English text, then he can exploit the regularities of the language

Language redundancy and cryptanalysis

- Human languages are redundant
- Letters are not equally commonly used
- In English, E is by far the most common letter
  - Followed by T, R, N, I, O, A, S
- Other letters are fairly rare
  - See Z, J, K, Q, X
- Tables of single, double and triple letter frequencies exist
  - The most common digram in English is TH
  - The most common trigram in English is THE

[Figure: English letter frequencies]

Cryptanalysis of monoalphabetic ciphers

- Key concept: monoalphabetic substitution ciphers do not change relative letter frequencies
- Discovered by Arab scholars in the 9th century
- Calculate the letter frequencies for the ciphertext
- Compare the counts/plots against known values
  - The most frequent letter in the ciphertext may well encrypt E
  - The next one could encrypt T or A
  - After relatively few tries the system is broken
- If the ciphertext is relatively short (so that the frequencies are not fully reliable), more guesses may be needed
- Powerful tool: look at the frequency of two-letter combinations (digrams)

Example of cryptanalysis

Ciphertext:

    UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZVUEPHZHMDZSHZOWSFPAPPDTSVPQUZ
    WYMXUZUHSXEPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ

Count relative letter frequencies: P is the most frequent (13.33%), followed by Z (11.67%), S (8.33%), U (8.33%), O (7.5%), M (6.67%), H (5.83%), etc.
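The frequency count for this ciphertext can be reproduced with a short Python sketch. The helper names letter_frequencies and digram_counts are hypothetical, chosen here for illustration; the ciphertext is the one quoted in the example, written as one continuous string.

```python
from collections import Counter

# Ciphertext from the example, joined into one continuous string.
CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)


def letter_frequencies(text):
    """Return (letter, count, percent) tuples, most frequent letter first."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    total = len(letters)
    return [
        (ch, n, round(100 * n / total, 2))
        for ch, n in Counter(letters).most_common()
    ]


def digram_counts(text):
    """Count overlapping two-letter combinations (digrams)."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return Counter(a + b for a, b in zip(letters, letters[1:]))


if __name__ == "__main__":
    for letter, count, percent in letter_frequencies(CIPHERTEXT)[:7]:
        print(letter, count, percent)
    print(digram_counts(CIPHERTEXT).most_common(5))
```

The counts only rank the ciphertext letters; mapping them to plaintext letters still takes the guessing described in the text.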
Guess that P and Z stand for E and T; which is which is not clear because the difference in frequency is small. The next set of letters {S, U, O, M, H} may stand for letters from {A, H, I, N, O, R, S}, but again it is not completely clear which is which.

One may try to guess and see how the text translates. A good guess is also that ZW, the most common digram in the ciphertext, is TH, the most common digram in English; thus ZWP is THE.

Proceeding by trial and error, and inserting the proper blanks, one finally gets:

    it was disclosed yesterday that several informal but direct contacts have been made with political representatives of the viet cong in moscow

Some conclusions after this cryptanalysis

- Monoalphabetic ciphers are easy to break because they reflect the letter frequencies of the original alphabet
  - It is essential to know the original alphabet
- Countermeasure: provide multiple substitutes for a given letter
  - Highly frequent letters such as E could be encrypted using a larger number of substitutes than less frequent letters such as Z: to encrypt E one could choose any one of, say, 15 fixed letters, and to encrypt Z any one of, say, 2 fixed letters
  - The number of substitutes for a letter may be proportional to its frequency in the original language (English)
  - This would hide the letter-frequency information
- However:
  - Multiple-letter patterns (digrams, trigrams, etc.) survive in the text, providing a tool for cryptanalysis
  - Each element of the plaintext only affects one element of the ciphertext
  - A longer text is needed to break the system, but cryptanalysis is still relatively straightforward

Measures to hide the structure of the plaintext

1. Encrypt multiple letters of the plaintext at once
2. Use more than one substitution in encryption and decryption (polyalphabetic ciphers)

Both approaches are considered in the following.

Playfair cipher

- The Playfair cipher is an example of multiple-letter encryption
- Invented by Sir Charles Wheatstone in 1854, but named after his friend Baron Playfair, who championed the cipher at the British Foreign Office
- Based on a 5x5 matrix in which the letters of the alphabet are written (I is considered the same as J)
  - This is called the key matrix

Playfair key matrix

- A 5x5 matrix of letters based on a keyword
- Fill in the letters of the keyword (no duplicates), left to right, top to bottom
- Fill the rest of the matrix with the other letters in alphabetic order

E.g.
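The fill rule for the key matrix can be sketched in Python as follows. This is an illustration only: the function name playfair_key_matrix and the keyword "PLAYFAIR" are hypothetical choices made here, not the example used in the lecture, and folding J into I follows the convention stated above.

```python
import string


def playfair_key_matrix(keyword):
    """Build a 5x5 Playfair key matrix as a list of five 5-letter rows.

    Keyword letters come first (duplicates dropped), then the remaining
    letters in alphabetic order. J is treated as I, leaving 25 letters.
    """
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch not in string.ascii_uppercase:
            continue  # ignore spaces, digits, punctuation in the keyword
        if ch == "J":
            ch = "I"
        if ch not in seen:
            seen.append(ch)
    # Cut the 25 letters into rows, left to right, top to bottom.
    return ["".join(seen[r * 5:(r + 1) * 5]) for r in range(5)]


if __name__ == "__main__":
    # "PLAYFAIR" is an illustrative keyword, not taken from the slides.
    for row in playfair_key_matrix("PLAYFAIR"):
        print(" ".join(row))
```

Appending the whole alphabet to the keyword and skipping letters already seen handles both fill steps in a single pass.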
