<<

An Introduction to

Gianluigi Me [email protected] Anno Accademico 2011/12 Overview

(and )  Modern and Decryption Principles  Symmetric Cryptography  Stream and Block Modes  Key Management for Conventional Cryptography  Attack on Bad Implementation of Cryptography: IEEE 802.11 WEP   Public Key Cryptography  Digital Signatures  Key Management for Public-Key Cryptography Main sources

 Network Security Essential / Stallings  Applied Cryptography / Schneier  Handbook of Applied Cryptography / Menezes, van Oorschot, Vanstone  Innovative Cryptography, N. Moldovyan , A.Moldovyan  Modern Cryptography: Theory and Practice, Wenbo Mao

History of Steganography and Cryptography Steganography

 Steganography  Steganos = “covered” in Greek, Graphein = “to write”  Being able to communicate secretly has always been considered an advantage  Secret messages were often not written down, but rather memorized by sworn messengers  Or hidden  Demaratus, a Greek immigrant to Persia, reveals Persia’s intention to attack Athens. Writes the secret message on a tablet, and covers it with wax.  Histaiaeus encourages Aristagoras of Miletus to revolt against the Persian King. Writes message on shaved head of the messenger, and sends him after his hair grew  Chinese wrote on silk, turned into wax-covered ball that was swallowed by the messenger Steganography (cont.)  Invisible Ink  Certain organic fluids are transparent when dried but the deposit can be charred and is then visible  A mixture of alum and vinegar may be used to write on hardboiled eggs, so that can only be read once shell is broken

 Embedded information  Germans used “microdots” - documents shrunk to the size of a dot, and embedded within innocent letters  Secret messages within music (Beatles) Steganography (cont.)  Steganography is also used to foil piracy in digital content  Watermarking copyright information into images, music  Programmers sometime embed “easter eggs”

 Steganography has been used by spies and children alike  Most recently, US argued that Bin Laden implanted instructions within taped interviews

 Steganography is weaker than cryptography because the information is revealed once the message is intercepted

 However, steganography can be used in conjunction with cryptography Cryptography  In Cryptography, the meaning of the message is hidden, not its existence  = “hidden” in Greek  Historically, and also today, encryption involves  transposition of letters  ’s is first cryptographic device (5th Century BC)  Message written on a leather strip, which is then unwound to scramble the message

 substitution  Kama-Sutra suggests that women learn to encrypt their love messages by substituting pre-paired letters (4th Century AD)  Cipher – replace letters  – replace words Cryptography

 The classical cryptographic task is to provide for a reversible transformation of an understandable (original text) to a seemingly random character sequence called a or a .  The security of modern is not based on the secrecy of the , but on the secrecy of a relatively small amount of information, called a secret key. Cryptography

Definition : A cryptographic system consists of the following:  a plaintext message space M: a set of strings over some alphabet  a ciphertext message space C: a set of possible ciphertext messages  an encryption key space K: a set of possible encryption keys, and a decryption key space K1: a set of possible decryption keys  an efficient key generation algorithm  an efficient encryption algorithm  an efficient decryption algorithm . l l For integer 1 , G(1 ) outputs a key pair (ke, kd)  K x K' of length l . For ke  K and m  M, we denote by

the encryption transformation and read it as "c is an encryption of m under key ke,"

Cryptography

 the decryption transformation and read it as "m is the decryption of c under key kd."

 It is necessary that for all m  M and all ke  K, there exists kd  K': Cryptography

 Combining Shannon's semantic characterization for and Kerchoffs' principle, we can provide a summary for a good cryptosystem as follows:  ε and D contain no component or design part which is secret;  ε distributes meaningful messages fairly uniformly over the entire ciphertext message space; it may even be possible that the random distribution is due to some internal random operation of ;  With the correct cryptographic key, ε and D are practically efficient;  Without the correct key, the task for recovering from a ciphertext the correspondent plaintext is a problem of a difficulty determined solely by the size of the key parameter, which usually takes a size s such that solving the problem requires computational resource of a quantitative measure beyond p(s) for p being any polynomial.

Cryptography

 If a cipher achieves independence between the distributions of its plaintext and ciphertext, then we say that the cipher is secure in an information-theoretically secure sense  The notion of information-theoretic-based cryptographic security is developed by Shannon. According to Shannon's theory, we can summarize two conditions for secure use of classical ciphers: Cryptography

 Conditions for Secure Use of Classical Ciphers  #K ≥ #M;

 k εU K and is used once in each encryption only.  So if a encrypts a message string of length l , then in order for the encryption to be secure, the length of a key string should be at least l , and the key string should be used once only.  The scheme (D; ε) is perfectly secure if for

every pair of messages x; x0, ε Un(x) ≡ ε Un(x0). Cryptography

Definition:  A cryptosystem is perfectly message indistinguishable if for all x1 and x2 and all y

Pr (εk(x1)=y)=Pr (εk(x2)=y)  Theorem: Shannon secrecy and message indistinguishability are equivalent. When restricting to #P =#C=#K key ambiguity is equivalent as well.  LEMMA: #K<#P implies not perfectly secret. Vernam Cipher

 Assume that the message is a string of n binary bits  the key is also a string of n binary bits

 where U stands for uniformely random  Encryption takes place one bit at a time and the ciphertext string c = c1c2…cn is found by the bit operation XOR (exclusive or) each message bit with the corresponding key bit 1in Vernam Cipher

 Considering M = C = K = {0, 1}*, the Vernam cipher is a special case of substitution ciphers  Confidentiality offered by the one-time-key Vernam cipher is in the information-theoretically secure sense, or is, unconditional.  In fact, a ciphertext message string c provides (an eavesdropper) no information whatsoever about the plaintext message string m since any m could have yield c if the key k is equal to c m (bit by bit).

Review: Bayes’ Theorem

 Let X and Y be two random variables. Define:

 Theorem (Chain Rule):

Apriori  Theorem (Bayes):

Aposteriori Theoretical (Perfect) Security

 What does it mean for a cryptosystem to be perfectly secure?  Essentially, the adversary doesn’t learn anything from the ciphertext: Perfect Security means |K|>|P|

 Reminder: adversary “knows the system” and has unlimited power  If key-space is finite, each ciphertext must map to a finite number of plaintexts

 If |P|>|K|, some plaintexts will be “impossible” for some ciphertexts

The Vernam Cipher (1)

 Is there a perfectly secure cryptosystem for which |K|=|P|?

 Theorem (Shannon): Let (P,K,C,E,D) be a cryptosystem for which |K|=|P|=|C|. Then the cryptosystem provides perfect if: The Vernam Cipher (2)

 Proof: Let (P,K,C,E,D) be a cryptosystem for which |K|=|P|=|C|.  Because of perfect secrecy:

 |K|=|P|=|C|, so there is a unique key associated with every pair (p,c)

The Vernam Cipher (3)

 Fix c. For all possible plaintexts pi, let ki be the key satisfying eki(pi)=c  By Bayes: Mathematical Proof

mi ki prob. ci prob. prob. 0 x 0 ½ 0 ½ x 0 x 1 ½ 1 ½ x 1 1-x 0 ½ 1 ½ (1-x) 1 1-x 1 ½ 0 ½ (1-x) • We find out the probability of a ciphertext bit being 1 or 0 is equal to (½)x + (½)(1-x) = ½. Ciphertext looks like a random sequence. exclusive or Operator

a b c = a  b

0 0 0 0 1 1

1 0 1 1 1 0 Example

 message =‘IF’  then its ASCII code =(1001001 1000110)  key = (1010110 0110001)  Encryption:  1001001 1000110 plaintext  1010110 0110001 key  0011111 1110110 ciphertext  Decryption:  0011111 1110110 ciphertext  1010110 0110001 key  1001001 1000110 plaintext

 In parallel with the development of cryptographic systems, methods have been developed that make it possible to restore an original message based on the ciphertext and other known information. These methods are collectively known as cryptanalysis. Advances in cryptanalysis have led to tightening the requirements on cryptographic algorithms Cryptanalysis

 Ciphertext  Plaintext and its corresponding ciphertext  Chosen plaintext  Chosen ciphertext  Adapted plaintext  Adapted ciphertext Additionally, some attacks use:  Hardware faults  Power consumption measurements  Calculation time measurements Cryptanalysis

As cryptography is the science and art of creating secret codes, cryptanalysis is the science and art of breaking those codes.

Figure 3.3 Cryptanalysis attacks

3.29 Cryptanalysis Ciphertext-Only Attack

3.30 Cryptanalysis

 known plaintext cryptanalysis, it is assumed that the cryptanalyst knows the ciphertext and a portion of the original text, and in special cases knows the correspondence between the ciphertext and the original text.  chosen plaintext cryptanalysis, it is assumed that the cryptanalyst can enter a specially chosen text into the enciphering device and get a cryptogram created under the control of the secret key.  Chosen ciphertext cryptanalysis assumes that the opponent can use ciphertexts created by him or her for deciphering. The texts were specially chosen to most easily compute the secret key from texts obtained at the output of the deciphering device.  Adapted text cryptanalysis corresponds to a case in which the attacker repeatedly submits texts for encryption (or decryption), with each new portion being chosen depending on previously obtained cryptanalysis results. This kind of attack is the one most favorable for the opponent. Cryptanalysis Known-Plaintext Attack

3.32 Cryptanalysis Chosen-Plaintext Attack

3.33 Cryptanalysis Chosen-Ciphertext Attack

3.34 Applications of Cryptography

 Protection against unauthorized reading (or providing information privacy)  Protection against creating false messages (both intentional and unpremeditated)  Valid user authentication  Information integrity control  Information authentication  Digital signatures  Computerized secret voting  Digital cash  Computerized coin-tossing  Protection against the repudiation of the receipt of a message  Simultaneous contract signing  Protection against document forgery Historical Cryptographic Exemplars  Julius Caesar liked encrypting messages  Replaced Greek letters for Roman letters  Caesar Shift Cipher  Each letter substituted by shifting n places  E X A M P L E  H A D P S O H  Only 25 such ciphers  Substitution based on key phrase  Substitution key consists of phrase’s letters (uniquely) followed by rest of the alphabet  THIS IS ’S KEY  THISALCENDBOKY-FGJMPQRUVWXZ  26! (roughly 1026) monoalphabetic substitution ciphers Historical Cryptographic Exemplars  Coding  Louis XIV’s (Rossignols) used one symbol (3- digit number) per syllable (held 200 years)  Mary Queen of Scots used a combination of cipher and coded words (nomenclator)  e.g,

 US Army used Navajo language as code in WWII Historical Cryptographic Exemplars  Coding  Mary Queen of Scots usò una combinazione di cifrari e parole codificate (nomenclator)  Obiettivo: uccidere la regina Elisabetta I

Historical Cryptographic Exemplars

 Etienne Bazeries postulated that the cluster 124-22-125-46-345 encoded "les ennemis" (the enemies) and from that information was able to unravel the entire cipher

Substitution cipher

 The encryption algorithm ε(m) is a substitution function which replaces each m  M with a corresponding c  C.  The substitution function is parameterized by a secret key k.  The decryption algorithm Dk(c) is merely the reverse substitution.  the substitution can be given by a mapping

 and the reverse substitution is just the corresponding inverse mapping .  The simplest case is called shift ciphers. In shift ciphers, K = M = C;  let N = #M, the encryption and decryption mappings are defined by

 with m, c, Substitution cipher  For the case of M being the capital letters of the Latin alphabet, i.e., , the shift cipher is also known as (k = 3).  gcd(k, N) = 1, then for every m < N

 ranges over the entire message space . Therefore for such k and for m, c < N Substitution cipher

 provide a simple substitution cipher.  With can also define a simple substitution cipher called : Substitution cipher

 These ciphers are called monoalphabetic ciphers:  for a given encryption key, each element in the plaintext message space will be substituted into a unique element in the ciphertext message space.  Therefore, monoalphabetic ciphers are extremely vulnerable to attacks Historical Cryptographic Exemplars  The broke monoalphabetic substitution using frequency analysis  In English (Beker&Piper)

a 8.2% j 0.2 s 6.3

b 1.5 k 0.8 t 9.1 c 2.8 l 4.0 u 2.8 d 4.3 m 2.4 v 1.0 e 12.7 n 6.7 w 2.4 f 2.2 o 7.5 x 0.2 g 2.0 p 1.9 y 2.0 h 6.1 0.1 z 0.1 i 7.0 r 6.0  Thus, letters ciphering e, t, and a are easily discovered  Subsequently can look for the rest of the letters and letter pairs Frequency  Frequency of distribution of english alphabet letters Substitution cipher

 Cryptanalyze:  Ek wso veptemheka tsk vl jloa qdsk broy Substitution cipher Da qui

 Cryptanalyze:  Ek wso, veptemheka tsk vl jloa qdsk broy Substitution cipher  Let  Interpret A = 0, B = 1, …, Z=25  The encryption algorithm ε(m) is defined as the following permutation

 Then the corresponding decryption algorithm DK(C) is given by Substitution cipher  Encrypt: Sicurezza dei sistemi informatici

 …………………

 Fill in the blanks!

 Decrypt? Substitution cipher

 The key space of this cipher has the size 26! > 4 x 1026  Huge in comparison with the size of the message space!  But each plaintext character is encrypted to a unique ciphertext character!  This weakness renders this cipher extremely vulnerable to a cryptanalysis technique called frequency analysis which exploits the fact that natural languages contain a high volume of redundancy Cifrario di sostituzione omofonica

 Homos – phonos  E’ possibile sostituire la stessa lettera con più simboli dell’alfabeto di cifratura (e.g. cifre) in modo da bilanciare la frequenza dei simboli ed abbattere l’efficacia della crittanalisi basata sull’analisi di frequenza dei simboli.  Meglio o peggio dei codici polialfabetici?  E’ un tipo di codice monoalfabetico  Peggio, perché ad una lettera corrispondono un numero inferiore di simboli  Peggio perchè un simbolo, in un cifrario polialfabetico, può rappresentare più lettere 1 Codici di sostituzione omofonica

Polyalphabetic cipher

 A substitution cipher is called a if a plaintext message element in P may be substituted into many, possibly any, ciphertext message element in C Historical Cryptographic Exemplars  Vigenere’s polyalphabetic cipher (16th century) generalizes Caesar’s shift cipher  Can alternate between lines; or  Use keyword: it consists of a period p and a sequence k1,k2,...,kp of Caesar shifts

 The Vigenere cipher is not amenable to simple frequency analysis Historical Cryptographic Exemplars Vigenere

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A A B C D E F G H I J K L M N O P Q R S T U V W X Y Z B B C D E F G H I J K L M N O P Q R S T U V W X Y Z A C C D E F G H I J K L M N O P Q R S T U V W X Y Z A B D D E F G H I J K L M N O P Q R S T U V W X Y Z A B C E E F G H I J K L M N O P Q R S T U V W X Y Z A B C D F F G H I J K L M N O P Q R S T U V W X Y Z A B C D E G G H I J K L M N O P Q R S T U V W X Y Z A B C D E F H H I J K L M N O P Q R S T U V W X Y Z A B C D E F G Ci≡(Pi+Ki)mod26 I I J K L M N O P Q R S T U V W X Y Z A B C D E F G H J J K L M N O P Q R S T U V W X Y Z A B C D E F G H I K K L M N O P Q R S T U V W X Y Z A B C D E F G H I J L L M N O P Q R S T U V W X Y Z A B C D E F G H I J K M M N O P Q R S T U V W X Y Z A B C D E F G H I J K L P ≡(C -K )mod26 N N O P Q R S T U V W X Y Z A B C D E F G H I J K L M i i i O O P Q R S T U V W X Y Z A B C D E F G H I J K L M N P P Q R S T U V W X Y Z A B C D E F G H I J K L M N O Q Q R S T U V W X Y Z A B C D E F G H I J K L M N O P R R S T U V W X Y Z A B C D E F G H I J K L M N O P S S S T U V W X Y Z A B C D E F G H I J K L M N O P Q R T T U V W X Y Z A B C D E F G H I J K L M N O P Q R S U U V W X Y Z A B C D E F G H I J K L M N O P Q R S T V V W X Y Z A B C D E F G H I J K L M N O P Q R S T U W W X Y Z A B C D E F G H I J K L M N O P Q R S T U V X X Y Z A B C D E F G H I J K L M N O P Q R S T U V W Y Y Z A B C D E F G H I J K L M N O P Q R S T U V W X Z Z A B C D E F G H I J K L M N O P Q R S T U V W X Y Keyword: white z Plaintext: Divert Troops To East Ridge w h i t e w h i t e w h D i v e r t T r o o p s

(Ciphertext) Confronto Testo da cifrare: gauguininfouniroma2itmeTeachingSicurezzadeiSistemiInformatici

KEY: CHARLESV Exercise on Vigenere Vigenere Square

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A A B C D E F G H I J K L M N O P Q R S T U V W X Y Z B B C D E F G H I J K L M N O P Q R S T U V W X Y Z A C C D E F G H I J K L M N O P Q R S T U V W X Y Z A B D D E F G H I J K L M N O P Q R S T U V W X Y Z A B C E E F G H I J K L M N O P Q R S T U V W X Y Z A B C D F F G H I J K L M N O P Q R S T U V W X Y Z A B C D E G G H I J K L M N O P Q R S T U V W X Y Z A B C D E F H H I J K L M N O P Q R S T U V W X Y Z A B C D E F G Ci≡(Pi+Ki)mod26 I I J K L M N O P Q R S T U V W X Y Z A B C D E F G H J J K L M N O P Q R S T U V W X Y Z A B C D E F G H I K K L M N O P Q R S T U V W X Y Z A B C D E F G H I J L L M N O P Q R S T U V W X Y Z A B C D E F G H I J K M M N O P Q R S T U V W X Y Z A B C D E F G H I J K L P ≡(C -K )mod26 N N O P Q R S T U V W X Y Z A B C D E F G H I J K L M i i i O O P Q R S T U V W X Y Z A B C D E F G H I J K L M N P P Q R S T U V W X Y Z A B C D E F G H I J K L M N O Q Q R S T U V W X Y Z A B C D E F G H I J K L M N O P R R S T U V W X Y Z A B C D E F G H I J K L M N O P S S S T U V W X Y Z A B C D E F G H I J K L M N O P Q R T T U V W X Y Z A B C D E F G H I J K L M N O P Q R S U U V W X Y Z A B C D E F G H I J K L M N O P Q R S T V V W X Y Z A B C D E F G H I J K L M N O P Q R S T U W W X Y Z A B C D E F G H I J K L M N O P Q R S T U V X X Y Z A B C D E F G H I J K L M N O P Q R S T U V W Y Y Z A B C D E F G H I J K L M N O P Q R S T U V W X Z Z A B C D E F G H I J K L M N O P Q R S T U V W X Y Keyword:GME Plaintext: sicurezza dei sistemi informatici Exercise on Vigenere

Compare the frequencies of the resulting ciphertexts (Sub/Vigenere)

Chosen Plaintext/Ciphertext attack? Playfair ciphers

 Matrix-based block  p l a y f  cipher used in WWI    i r b c d   In a 5x5 matrix, write e g h k m   the letters of the word n o q s t  “playfair” (for u v w x z  example) without   dups, and fill in with other letters of the alphabet, except I,J used interchangeably. Playfair encryption  p l a y f  1. Break plaintext into letter pairs    If a pair would contain double letters,  i r b c d  split with x P  e g h k m  Pad end with x    hellothere becomes… n o q s t   he lx lo th er ex 2. For each pair, u v w x z     If they are in the same row, replace each with the letter to its right (mod 5) He lx lo th er ex  he  KG  If they are in the same column, replace KG YV RV QM GI KU each with the letter below it (mod 5)  lo  RV  Otherwise, replace each with letter we’d get if we swapped their column To decrypt, just reverse! indices  lx YV

Weaknesses

 p l a y f   P is unknown, but structure of    i r b c d  its bottom is predictable P  e g h k m   n o q s t   Can break using digram u v w x z  frequencies  Example: If a digram and its reverse both appear often, it’s He lx lo th er ex probably ER and RE.

 Each plaintext letter maps to KG YV RV QM GI KU how many possible ciphertext letters?

Playfair

 Sconfiggere il nemico senza combattere è la massima abilità  Ciphertext?  P:=5x5 filled in alphabetical order Historical Cryptographic Exemplars

 Babbage broke Vigenere’s Cipher (19th century)  Stage 1: Discover key length  Look for repeated sequences, and measure the distance between them  The key length is a factor of these distances  Stage 2: Identify the key itself  Compare distributions for each of the key letters with the standard distribution, to identify the shift  Kasiski test Kasiski Test

 Determinazione della periodicità del ciphertext  Si basa sulla considerazione che alcune parole comuni (quali “the” in inglese) saranno cifrate con lo stesso crittogramma, con conseguente presenza di gruppi ripetuti nel crittogramma. CRYPTO IS SHORT FOR CRYPTOGRAPHY IZYCEIQYATSXBFBCWZEXFSMZACSSHF gianluigime IZYCEIQYAHBCNNUZCEJJBUORNABGFH gianlui IZYCZWIFYPOEZNOEIZYCZWGEGXHLFH gian Assumendo che i crittogrammi ripetuti siano corrispondenti allo stesso testo, si nota che la distanza tra i due è 16:la chiave è lunga 16,8,4 o 2.

 The Kasiski test relies on the occasional coincidental alignment of letter groups in the plaintext with the keyword to give a keyword length estimate. The test says if a string of characters appears repeatedly in a polyalphabetic ciphertext message, it is possible (though not certain), that the distance between the occurrences is a multiple of the length of the keyword.

 To demonstrate how this works, suppose the Vigene re cipher is used to encipher the message “THE CHILD IS FATHER OF THE MAN” using the keyword “POETRY” to produce the following ciphertext. Plaintext T H E C H I L D I S F A T H E R O F T H E M A N Keyword P O E T R Y P O E T R Y P O E T R Y P O E T R Y Ciphertext I V E V Y G A R M L M Y I V E K F D I V E F R L

The keyword “POETRY” is six letters long. Note that the trigraph “IVE” occurs three times in the ciphertext. The second occurrence of “IVE” occurs 12 character positions after the first. The third occurrence of “IVE” occurs 6 character positions after the second.

 This leads to the assertion that the separations of common letter occurrences stand a good chance of being multiple of the keyword. This observation leads to the following fact concerning the Kasiski test.

Fact: The greatest common divisor or divisor of it of the separations of common characters that occur in a ciphertext enciphered by the Vigene re cipher tends to be a good chance of being equal or at least some multiple of the keyword.

 Since “IVE” was separated by 12 characters and then 6 characters, then by observing that gcd(6, 12) = 6, we see that we have hit exactly the number of letters that occurred in the keyword. We conclude with one more example illustrating the Kasiski test. Kasiski Test

 Vigenere has a period m such that for

each fixed j the mi+j-th characters are given simple substitution of the

corresponding mi+jth plaintext Kappa Test

 inventato nel 1925 da William F. Friedman  Indice di coincidenza: probabilità che 2 lettere cifrate rappresentano la stessa dell’alfabeto nel testo in chiaro. Conoscendo la probabilità di 2 lettere scelte a caso nell’alfabeto inglese, l’indice che rappresentino la stessa lettera è 6.5 % . Probability of Selecting Multiple Letters in Standard English  In the standard English frequency table for letters, the probability of selecting a single letter list is the relative frequency converted to decimal, that is

selecing a single letter relative frequency of the letter found in the table P    in standard English  100

In a large sample of English text, estimate the probability of selecting two E’s. Two A’s.

For convenience, we will assign the variables

p0 , p1, p2 , p3, p4 , p24, p25 to represent the probabilities of selecting the letters A, B, C, D, E, …, Y, Z from the standard English alphabet. The subscripts of the variables correspond to the MOD 26 alphabet assignment number of the corresponding alphabet letter. We will use this variable assignment in the next example.

Example 3: What is the probability that two randomly selected English letters are the same?

Solution: Using the standard English frequency table, we see that  two letters  P   P(2 A's)  P(2 B's)  P(2 C's)  P(2 Y' s)  P(2 Z's) are the same 2 2 2 2 2  p0  p1  p2  p24  p25  (0.08167)2  (0.01492)2  (0.02782)2  (0.04253)2  (0.12702)2  (0.02228)2  (0.02015)2  (0.06094)2  (0.06966)2  (0.00153)2  (0.00772)2  (0.04025)2  (0.02406)2  (0.06749)2  (0.07507)2  (0.01929)2  (0.00095)2  (0.05987)2  (0.06327)2  (0.09056)2  (0.02758)2  (0.00978)2  (0.02360)2  (0.00150)2  (0.01974)2  (0.00074)2  0.065

Index of Coincidence for Monoalphabetic Ciphers

In monoalphabetic ciphers, the frequencies of letters in standard English are preserved when converting from plaintext to ciphertext. The following example illustrates why this is true.

Illustrate why the Caesar shift cipher preserves frequencies when converting from plain the ciphertext.

Solution:

Recall the represents the probability that two randomly selected letters are identical. Since monoalphabetic ciphers preserves frequencies, the index of coincidence of the plaintext alphabet of standard English will be exactly the same as the index of coincidence of the ciphertext alphabet for a monoalphabetic cipher. Using the result from previous example, this fact results in the following statement:

Index of Coincidence for Monoalphabetic Ciphers

Index of Coincidence Two randomly selected  I  P   0.065 For Monoalphabetic Ciphers  letters are the same 

Index of Coincidence for Polyalphabetic Ciphers

 In a polyalphabetic cipher, the goal is to distribute the letter frequencies so that each letters has the same likelihood of occurring in the ciphertext. The next example determines what the index of coincidence is for a polyalphabetic cipher for a large collection of letters.

Problem:Determine the probability that two randomly selected letters are identical of the ciphertext of a message enciphered with a polyalphabetic cipher, assuming there are a very large number of letters in the ciphertext.

 two letters  P   P(2 A's) P(2 B's) P(2 C's) P(2 Y' s) P(2 Z's) are the same 2 2 2 2 2  p0  p1  p2  p24  p25 2  26* p0  26*(1/ 26)2  (1/ 26)  0.003846  0.0385

Since the index of coincidence represents the probability that two randomly selected letters are identical, Example 5 allows us to make the following statement:

Index of Coincidence for Polyalphabetic Ciphers Index of Coincidence Two randomly selected  I  P   0.0385 For Polyalphabetic Ciphers  letters are the same 

The index of coincidence values for monoalphabetic (0.065) and polyalphabetic ciphers (0.0385) were derived assuming that the plaintext message has a very large number of letters. When messages are enciphered and deciphered, these messages are normally much shorter. Hence, the index of coincidence for a typical enciphered message enciphered will be bounded somewhere between 0.0385 and 0.065. This leads to the following statement:

Index of Coincidence Bound

For a typical ciphertext message, the index of coincidence I satisfies: 0.0385  I  0.065

Fact: If I is close to 0.0385, then the cipher is likely to have been obtained from a polyalphabetic cipher. If I is closer to 0.065, the cipher is likely to be monoalphabetic.

Knowing what the value for the index of coincidence tells us, we now need to derive a formula for calculating it. Before doing this, we need to recall the following fact concerning summation notation: Kappa Test

 La lunghezza della chiave è circa (Friedman):

L = IC = k

 n è la lunghezza del testo ed n1 ,…, n26 sono le frequenze delle lettere. Derivation of Formula for the Index of Coincidence for a Given Ciphertext Message Suppose a ciphertext message is received. Let n 0 , n 1 , n 2 ,  , n 24 , n 25 be the counts of the number of occurrences of the alphabet letters A, B, C, …, Y, Z that occur in the ciphertext (note that the subscript of each variable corresponds to the MOD 26 alphabet assignment number of the corresponding letter). Suppose

Total number of letters  n  n0  n1  n2  n24  n25 represents the total sum of all of the letters in the ciphertext. Recall that the index of coincidence represents the probability that two randomly selected letters are identical. Using the multiplication principle of probability for n total letters, we can compute the following probabilities for each individual letter:

2 A's are randomly  n0 (n0 1) n0(n0 1) P      selected  n (n 1) n(n 1)

2 B's are randomly  n1 (n1 1) n1(n1 1) P      selected  n (n 1) n(n 1)

2 C's are randomly  n2 (n2 1) n2(n2 1) P      selected  n (n 1) n(n 1) 

2 Y' s are randomly  n24 (n24 1) n24(n24 1) P      selected  n (n 1) n(n 1)

2 Z's are randomly  n25 (n25 1) n25(n25 1) P      selected  n (n 1) n(n 1)

Since these probabilities are mutually exclusive, we can sum the individual probabilities for each letters to find the index on coincidence.

Index of  2 letters randomly   P  Coincidence selected are the same n (n 1) n (n 1) n (n 1) n (n 1) n (n 1)  0 0  1 1  2 2  24 24  25 25 n(n 1) n(n 1) n(n 1) n(n 1) n(n 1) 1  n (n 1)  n (n 1)  n (n 1)  n (n 1)  n (n 1) n(n 1) 0 0 1 1 2 2 24 24 25 25 1 25  ni (ni 1) n(n 1) i0

We summarize this result.

Formula for the Index of Coincidence The index of coincidence I is given by the formula: 1 25 I  n (n 1) n(n 1)  i i i0 1  n (n 1)  n (n 1)  n (n 1)  n (n 1)  n (n 1) n(n 1) 0 0 1 1 2 2 24 24 25 25 where n represents the total number of letters in the ciphertext and n i , i  0 , 1,  , 25 represents the number of letters corresponding to each individual letter in the ciphertext with n 0  the number of A' s , n 1  the number of B' s , n2  the number of C's etc. Index of coincidence  Expected values of IC for several periods.

 The lower the index of coincidence, the less variation in the characters of the ciphertext and (from our model of English) the longer the period of the cipher Transposition Ciphers

 A (also called permutation cipher) transforms a message by rearranging the positions of the elements of the message without changing the identities of the elements.  Consider that the elements of a plaintext message are letters in ; let b be a fixed positive integer representing the size of a message block; let ;  let K be all permutations of (1, 2, …, b). Transposition Ciphers

 Example: b=4 P: This is a secret message Encryption Key: Group by 4: This isas ecre tmes sage

C: hstissiaceermsteaesg Decryption:

Decryption Key: Transposition Ciphers

 Railfence: TRHCEEIETGSSMAIAEASS

T 5 T R H 3 H C E E E 1 I E T G K 4 S S M A E 2 I A E S Y 6 S S

 Redfence (by key): IETGIAESHCEESSMATRSS  Columnar T H E K E Y  IEEIRSHSMESCSTATGSEA 5 3 1 4 2 6

T H I S I S A S E C R E T M E S S A G E The German ADFGVX cipher

 Was introduced in 1918, it was a combination of substitution and transposition.  Draw up a 6x6 grid, and fill the grid with a random combination of the 26 letters of the alphabet and the 10 digits.  The arrangement of the elements in the grid is part of the key. The German ADFGVX cipher

A D F G V X A 8 P 3 D 1 N D L T 4 O A H F 7 K B C 5 Z G J U 6 W G M V X S V I R 2 X 9 E Y 0 F Q The German ADFGVX cipher The first step is to take each letter of the plaintext, locate its position, and substitute it with the letters that label its row and column. A D F G V X For example, 8 becomes AA A 8 P 3 D 1 N P becomes AD, L is DA D L T 4 O A H F 7 K B C 5 Z G J U 6 W G M V X S V I R 2 X 9 E Y 0 F Q

The German ADFGVX cipher So the message

ATTACK AT 10 PM

BECOMES

DV DD DD DV FG FD DV DD AV XG AD GX

The German ADFGVX cipher This is a simple mono-alphabetic substitution cipher, which can be broken by frequency analysis. But now we add some transposition into the mix. For step 2, we need a keyword. In our example, we use the keyword MARK The keyword is the second piece of information we must share with the receiver. The German ADFGVX cipher Now we transpose by arranging the message in columns and shifting the columns around according to the alphabetical order of the keyword: MARK AKMR DVDD VDDD DDDV DVDD FGFD GDFF DVDD VDDD AVXG VGAX ADGX DXAG The German ADFGVX cipher Now we read off the message column wise: AKMR VDDD DVDD VDGVVDDVDDGXDDFDXG GDFF VDDD VGAX DXAG Substitution ADFGVX

 Try with

 KEY:SITV

 In war, discipline can do more than fury The German (Scherbius)

 Electrical encryption machine, performing 8 substitutions  uses n=3 rotating scramblers (26n orientations)  scramblers can be configured in n! orders  pre-keyboard k=6 swapped letter-pairs  Encryption consists of  optional letter pair switch  compounded scrambling, with shifts  reflector swaps letter pairs  and backward scrambling  Scrambles rotate in tandem  Total of 1017 possible configurations  changed daily according to a  each message has own orientation (message key)  later added 4th scrambler  Used extensively by Germany in WW2  Hitler used a more complex version of Enigma called The German Enigma Machine (Scherbius) Poles Crack the Enigma  Polish cryptanalysts obtained information about the encryption procedure from commercial Enigmas  Obtained information on its usage  the Germans used a different orientation key for each message, encrypted twice in the message header (using the day key)  Rejewski focused on the repetitions  Formalized relationships between 1st-4th ,2nd-5th, and 3rd-6th letters  ABCDEFGHIJKLMNOPQRSTUVWXYZ  FQHPLWOGBMVRXUYCZITNJEASDK  Built chains  (AFWA), (BQZKVELRIB), (CHGOYDPC), (JMXSTNUJ)  Chains depend only on scrambler orientation, not pair swaps  Thus need to consider only 6 x 263 = 105456 configurations  Built a catalog of characteristic chains for all configurations Poles Crack the Enigma  Rejewski’s algorithm to discover the day key  First, use catalog to identify the scrambler setting and orientation  Then, run the ciphertext through an Enigma and look at the text to identify swapped letter pairs  “Bomb” machines were constructed to mechanize the search British Crack Improved Enigma  In 1939, Germans increased Enigma security  added 2 extra scramblers to choose – 10x arrangements  increased to 10 letter pair swaps  British Cryptanalysts () took from the Polish  Recruited best Mathematicians (Turing) and large staff (7000)  Received from Polish  Used human weaknesses provided hints and cribs  Trivial message keys (key sequences, names initials)  Artificial “intelligent” restrictions on scramblers arrangements and pair swaps restricted the search space  Standard message formats, e.g., weather  Some German were captured  Turing constructed swap-independent chains similar to Rejewski  First British Bomb (Victory) delivered in 1940  Search still required significant human help  The British – broken German, Italian and Japanese communications were crucial to winning the war Kerchoff’s Principle  Everybody should know design and algorithm, except values of secrete keys

 Lots of horror stories about ``Security by obscurity''... Is there Unbreakable Encryption? (Resume)

 One-time pads  Sender and receiver use a pre-arranged random stream of letters

 Encryption=addition mod 26, exor M E S S A G E  Every letter in the key used once T H I S K E Y

F L A K K K C

 Example: plaintext 1000 letters, key length=5,20,1000->5 sets of 200 letters (good!), 20 sets of 50 letters (harder), 1000 sets of 1 letter (impossible!)  Perfectly secure encryption (Shannon)  Used by Soviet spies, and also for US-Soviet hotline  Requires significant logistical effort and coordination  Relies on of key One Time Pad  Alice and Bob share a use-once key K (random key), say 1100

 To produce ciphertext, Alice XOR message M (say 0110) and key: C = M  K = 1010

 Ciphertext sent over insecure medium

 When Bob gets ciphertext, XOR with (same key) to decrypt: M = C  K=0110 Information Theoretic Security

 THEOREM: If pad NEVER used again, ciphertext does not increase adversary's knowledge about message.  Why is it important to use pad only once?  Assume same pad K used twice:

C1 = M1  K; C2 = M2  K  Eve guesses it. She can compute:

C1  C2 = (M1  K)  (M2  K)=M1  M2  Not much?  By language statistics (frequency and redundancy) one can easily get M1 and M2! Computationally Secure Encryption

 Encryption scheme is computationally secure if  The time required to break the cipher exceeds the useful lifetime of the information, or  The cost of breaking the cipher exceeds the value of the encrypted information;  Most schemes that we will discuss are not unbreakable in principle, but are computationally secure  Usually rely on very large key-space, impregnable to brute force  Moreover, the most advanced schemes rely on lack of knowledge of effective algorithms for certain hard problems, not on a proven inexistence of such algorithms  Usually factorization, discrete logarithms, or square roots mod p Project Venoma  Recently disclosed by CIA  During Cold War Russians used one time pads for encrypted communication between embassies  Western intelligence agencies guessed (hoped) pads reused (this was the case)  Collected huge number of ciphertexts, looked for dependencies, and decrypted good portions of messages

Summary

 Encryption Algorithms and Keys  Substitution : bits, letters, words  Transposition  Decryption Algorithms  Reversed process  Knowledge of the algorithm and the key  Cryptanalysis  Identify algorithm  Obtain as many plaintext-ciphertext pairs  Use systematicity (patterns)  Use cribs An Introduction to Cryptography

Gianluigi Me [email protected] Anno Accademico 2010/11 Modern Encryption and Cryptanalysis Principles Modern Encryption Principles  Encryption scheme has 5 ingredients  Plaintext, Encryption Algorithm, Key, Ciphertext, and Decryption Algorithm  Kerchoff: Security depends on secrecy of key, not of algorithm

Notation

 M, or P will usually denote plaintext message  C will usually denote ciphertext  K will usually denote a key

 Ek(M) = C is the encryption function

 Dk(C) = M is the decryption function

 Dk(Ek(M)) = M represents the typical flow Computational Security

 Key management for one time pad difficult: sacrifice information theoretic security for computational security

 Computational security based on computational difficulty of breaking an encryption scheme

 Strength relies on fact that adversary cannot determine K even if he/she has (M,C) Cryptographic Protocols

 Self enforcing protocols  Arbitrated protocols  Trusted third party helps in real time  Adjudicated protocols  Trusted third party, but only if needed and after the fact Attacks Against Cryptographic Protocols Mostly attacks against implementations:  Passive attacks (eavesdropping)  Cryptanalysis  Traffic analysis  Active attacks  Impersonation  Interruption / denial  Modification of messages  Fabrication of new messages  Replay / Reflect messages Cryptographic Algorithms

 Type of operations applies to plaintext E.g., Substitution and transposition  Type of key(s) Symmetric : same key

Asymmetric, Public-Key : Dk2(Ek1(M))=M  How plaintext is processed into ciphertext How many and which operations How the operations are combined Block ciphers, Stream ciphers Cryptanalysis (attacks against cryptographic algorithm)  Ciphertext only  Uses only knowledge of algorithm and ciphertext  Known plaintext  Also one or more plain-ciphertext pairs  Or, probable words: dictionary, known formats, etc.  Chosen text  Chosen to reveal information about the key  Chosen plaintext and its ciphertext

 Differential chosen plaintext  Adaptive chosen plaintext  Chosen ciphertext and its original plaintext

 Mostly against public-keys Modern Encryption and Cryptanalysis Principles-End