An Introduction to Cryptography
Gianluigi Me
Academic Year 2011/12

Overview
- History of Cryptography (and Steganography)
- Modern Encryption and Decryption Principles
- Symmetric Key Cryptography
- Stream Ciphers and Cipher Block Modes
- Key Management for Conventional Cryptography
- Attack on Bad Implementation of Cryptography: IEEE 802.11 WEP
- Message Authentication
- Public Key Cryptography
- Digital Signatures
- Key Management for Public-Key Cryptography

Main sources
- Network Security Essentials, W. Stallings
- Applied Cryptography, B. Schneier
- Handbook of Applied Cryptography, A. Menezes, P. van Oorschot, S. Vanstone
- Innovative Cryptography, N. Moldovyan, A. Moldovyan
- Modern Cryptography: Theory and Practice, Wenbo Mao
History of Steganography and Cryptography Steganography
Steganography
- Steganos = "covered" in Greek, graphein = "to write"
- Being able to communicate secretly has always been considered an advantage
- Secret messages were often not written down, but rather memorized by sworn messengers
- Or hidden:
  - Demaratus, a Greek immigrant to Persia, reveals Persia's intention to attack Greece. He writes the secret message on a tablet and covers it with wax.
  - Histiaeus encourages Aristagoras of Miletus to revolt against the Persian king. He writes the message on the shaved head of a messenger and sends him once his hair has grown back.
  - The Chinese wrote on silk, which was rolled into a wax-covered ball and swallowed by the messenger.

Steganography (cont.)
- Invisible ink
  - Certain organic fluids are transparent when dried, but the deposit can be charred and is then visible
  - A mixture of alum and vinegar may be used to write on hard-boiled eggs, so that the message can only be read once the shell is broken
- Embedded information
  - Germans used "microdots": documents shrunk to the size of a dot and embedded within innocent letters
  - Secret messages within music (Beatles)

Steganography (cont.)
- Steganography is also used to foil piracy in digital content
  - Watermarking copyright information into images and music
  - Programmers sometimes embed "easter eggs"
Steganography has been used by spies and children alike Most recently, US argued that Bin Laden implanted instructions within taped interviews
Steganography is weaker than cryptography because the information is revealed once the message is intercepted
However, steganography can be used in conjunction with cryptography

Cryptography
- In cryptography, the meaning of the message is hidden, not its existence
  - Kryptos = "hidden" in Greek
- Historically, and also today, encryption involves
  - transposition of letters
    - Sparta's scytale is the first cryptographic device (5th century BC)
    - The message is written on a leather strip wound around a staff; unwinding the strip scrambles the message
  - substitution
    - The Kama Sutra suggests that women learn to encrypt their love messages by substituting pre-paired letters (4th century AD)
    - Cipher: replace letters
    - Code: replace words

Cryptography
The classical cryptographic task is to provide for a reversible transformation of an understandable plaintext (original text) to a seemingly random character sequence called a ciphertext or a cryptogram. The security of modern cryptosystems is not based on the secrecy of the algorithm, but on the secrecy of a relatively small amount of information, called a secret key. Cryptography
Definition: A cryptographic system consists of the following:
- a plaintext message space $M$: a set of strings over some alphabet
- a ciphertext message space $C$: a set of possible ciphertext messages
- an encryption key space $K$: a set of possible encryption keys, and a decryption key space $K'$: a set of possible decryption keys
- an efficient key generation algorithm $G: \mathbb{N} \to K \times K'$
- an efficient encryption algorithm $\varepsilon: M \times K \to C$
- an efficient decryption algorithm $D: C \times K' \to M$

For integer $1^\ell$, $G(1^\ell)$ outputs a key pair $(k_e, k_d) \in K \times K'$ of length $\ell$. For $k_e \in K$ and $m \in M$, we denote by
$$c = \varepsilon_{k_e}(m)$$
the encryption transformation and read it as "c is an encryption of m under key ke,"

Cryptography

and we denote by
$$m = D_{k_d}(c)$$
the decryption transformation and read it as "m is the decryption of c under key kd."

It is necessary that for all $m \in M$ and all $k_e \in K$, there exists $k_d \in K'$ such that
$$D_{k_d}(\varepsilon_{k_e}(m)) = m.$$

Cryptography
Combining Shannon's semantic characterization of a cryptosystem and Kerckhoffs' principle, we can summarize the properties of a good cryptosystem as follows:
- Algorithms ε and D contain no component or design part which is secret;
- ε distributes meaningful messages fairly uniformly over the entire ciphertext message space; the random distribution may even be due to some internal random operation of ε;
- With the correct cryptographic key, ε and D are practically efficient;
- Without the correct key, recovering the plaintext from a ciphertext is a problem whose difficulty is determined solely by the size of the key parameter, which usually takes a size s such that solving the problem requires computational resources beyond p(s) for any polynomial p.
Cryptography
If a cipher achieves independence between the distributions of its plaintext and ciphertext, then we say that the cipher is secure in the information-theoretic sense. The notion of information-theoretic cryptographic security was developed by Shannon. According to Shannon's theory, we can summarize two conditions for the secure use of classical ciphers:

Cryptography
Conditions for Secure Use of Classical Ciphers
- $\#K \geq \#M$;
- $k \in_U K$, and $k$ is used in one encryption only.

So if a classical cipher encrypts a message string of length $\ell$, then for the encryption to be secure the key string must have length at least $\ell$ and must be used only once.

The scheme $(D, \varepsilon)$ is perfectly secure if for every pair of messages $x, x'$: $\varepsilon_{U_n}(x) \equiv \varepsilon_{U_n}(x')$, that is, under a uniformly random key $U_n$ the two ciphertext distributions are identical.

Cryptography
Definition: A cryptosystem is perfectly message indistinguishable if for all plaintexts $x_1$ and $x_2$ and all ciphertexts $y$
$$\Pr(\varepsilon_k(x_1) = y) = \Pr(\varepsilon_k(x_2) = y)$$

Theorem: Shannon secrecy and message indistinguishability are equivalent. When restricting to $\#P = \#C = \#K$, key ambiguity is equivalent as well.

Lemma: $\#K < \#P$ implies that the cryptosystem is not perfectly secret.

Vernam Cipher
Assume that the message is a string of n binary bits
$$m = m_1 m_2 \ldots m_n \in \{0,1\}^n$$
and the key is also a string of n binary bits
$$k = k_1 k_2 \ldots k_n \in_U \{0,1\}^n$$
where U stands for uniformly random. Encryption takes place one bit at a time: the ciphertext string $c = c_1 c_2 \ldots c_n$ is found by XORing (exclusive or) each message bit with the corresponding key bit,
$$c_i = m_i \oplus k_i \quad \text{for } 1 \leq i \leq n.$$

Vernam Cipher
Considering M = C = K = {0, 1}*, the Vernam cipher is a special case of substitution ciphers. The confidentiality offered by the one-time-key Vernam cipher is secure in the information-theoretic sense, that is, unconditional. In fact, a ciphertext message string c gives an eavesdropper no information whatsoever about the plaintext message string m, since any m could have yielded c if the key k is equal to c ⊕ m (bit by bit).
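The bitwise XOR encryption of the Vernam cipher can be sketched in a few lines of Python. This is a minimal illustration, not a hardened implementation: the function names `xor_bytes` and `vernam_encrypt` are chosen here for the example, and the key is drawn from the standard `secrets` module as a stand-in for a truly uniform one-time key.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every data byte with the key byte at the same position."""
    if len(key) != len(data):
        # Shannon's condition: the key must be as long as the message.
        raise ValueError("one-time key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


def vernam_encrypt(message: bytes) -> tuple[bytes, bytes]:
    """Draw a fresh random key (assumed uniform) and return (key, ciphertext)."""
    key = secrets.token_bytes(len(message))
    return key, xor_bytes(message, key)


if __name__ == "__main__":
    key, ciphertext = vernam_encrypt(b"IF")
    # Decryption is the same XOR with the same key.
    print(xor_bytes(ciphertext, key))
```

Because XOR is its own inverse, the same function serves for encryption and decryption; the security argument only holds if each key is used once.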
Review: Bayes’ Theorem
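Let X and Y be two random variables. Define:
$$p(x) = \Pr[X = x], \quad p(x, y) = \Pr[X = x, Y = y], \quad p(x \mid y) = \Pr[X = x \mid Y = y]$$

Theorem (Chain Rule):
$$p(x, y) = p(x \mid y)\, p(y)$$

Theorem (Bayes):
$$p(x \mid y) = \frac{p(y \mid x)\, p(x)}{p(y)}$$
Here $p(x)$ is the a priori probability of x and $p(x \mid y)$ is its a posteriori probability.

Theoretical (Perfect) Security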
What does it mean for a cryptosystem to be perfectly secure? Essentially, the adversary doesn't learn anything from the ciphertext:
$$\Pr[p \mid c] = \Pr[p] \quad \text{for every plaintext } p \text{ and every ciphertext } c$$
Perfect security requires |K| ≥ |P|.
- Reminder: the adversary "knows the system" and has unlimited power
- If the key space is finite, each ciphertext can decrypt to at most |K| different plaintexts
- If |P| > |K|, some plaintexts will be "impossible" for some ciphertexts
The Vernam Cipher (1)
Is there a perfectly secure cryptosystem for which |K|=|P|?
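Theorem (Shannon): Let (P,K,C,E,D) be a cryptosystem for which |K|=|P|=|C|. Then the cryptosystem provides perfect secrecy if and only if:
- every key is used with equal probability 1/|K|, and
- for every plaintext p and every ciphertext c there is a unique key k such that $e_k(p) = c$.

The Vernam Cipher (2)

Proof: Let (P,K,C,E,D) be a cryptosystem for which |K|=|P|=|C| and which provides perfect secrecy. Because of perfect secrecy, for every plaintext p and every ciphertext c
$$\Pr[c \mid p] = \Pr[c] > 0,$$
so at least one key encrypts p to c.

|K|=|P|=|C|, so there is a unique key associated with every pair (p,c).

The Vernam Cipher (3)

Fix c. For all possible plaintexts $p_i$, let $k_i$ be the key satisfying $e_{k_i}(p_i) = c$. By Bayes:
$$\Pr[p_i \mid c] = \frac{\Pr[c \mid p_i]\,\Pr[p_i]}{\Pr[c]} = \frac{\Pr[k_i]\,\Pr[p_i]}{\Pr[c]}$$
Perfect secrecy gives $\Pr[p_i \mid c] = \Pr[p_i]$, hence $\Pr[k_i] = \Pr[c]$ for every i: all keys are equally likely.

Mathematical Proof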
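Let x be the probability that a message bit is 0; each key bit is 0 or 1 with probability ½.

m_i   prob.   k_i   prob.   c_i   prob.
0     x       0     ½       0     ½ x
0     x       1     ½       1     ½ x
1     1-x     0     ½       1     ½ (1-x)
1     1-x     1     ½       0     ½ (1-x)

- Adding the rows with the same c_i, we find that the probability of a ciphertext bit being 1 or 0 is equal to (½)x + (½)(1-x) = ½. The ciphertext looks like a random sequence.

exclusive or Operator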
a   b   c = a ⊕ b
0   0   0
0   1   1
1   0   1
1   1   0

Example
message = 'IF', so its ASCII code = (1001001 1000110)
key = (1010110 0110001)

Encryption:
1001001 1000110   plaintext
1010110 0110001   key
0011111 1110111   ciphertext

Decryption:
0011111 1110111   ciphertext
1010110 0110001   key
1001001 1000110   plaintext

Cryptanalysis
In parallel with the development of cryptographic systems, methods have been developed that make it possible to restore an original message based on the ciphertext and other known information. These methods are collectively known as cryptanalysis. Advances in cryptanalysis have led to tightening the requirements on cryptographic algorithms Cryptanalysis
Attacks are classified by what the cryptanalyst has access to:
- Ciphertext
- Plaintext and its corresponding ciphertext
- Chosen plaintext
- Chosen ciphertext
- Adapted plaintext
- Adapted ciphertext

Additionally, some attacks use:
- Hardware faults
- Power consumption measurements
- Calculation time measurements

Cryptanalysis
As cryptography is the science and art of creating secret codes, cryptanalysis is the science and art of breaking those codes.
Figure 3.3 Cryptanalysis attacks
Cryptanalysis
Ciphertext-Only Attack

Cryptanalysis
- In known-plaintext cryptanalysis, it is assumed that the cryptanalyst knows the ciphertext and a portion of the original text, and in special cases knows the correspondence between the ciphertext and the original text.
- In chosen-plaintext cryptanalysis, it is assumed that the cryptanalyst can enter a specially chosen text into the enciphering device and get a cryptogram created under the control of the secret key.
- Chosen-ciphertext cryptanalysis assumes that the opponent can submit ciphertexts of his or her own making for deciphering. The texts are specially chosen so that the secret key can be computed as easily as possible from the texts obtained at the output of the deciphering device.
- Adapted-text cryptanalysis corresponds to a case in which the attacker repeatedly submits texts for encryption (or decryption), with each new portion being chosen depending on previously obtained cryptanalysis results. This kind of attack is the one most favorable for the opponent.

Cryptanalysis
Known-Plaintext Attack
Cryptanalysis
Chosen-Plaintext Attack

Cryptanalysis
Chosen-Ciphertext Attack

Applications of Cryptography
- Protection against unauthorized reading (or providing information privacy)
- Protection against creating false messages (both intentional and unpremeditated)
- Valid user authentication
- Information integrity control
- Information authentication
- Digital signatures
- Computerized secret voting
- Digital cash
- Computerized coin-tossing
- Protection against the repudiation of the receipt of a message
- Simultaneous contract signing
- Protection against document forgery

Historical Cryptographic Exemplars
- Julius Caesar liked encrypting messages
  - Substituted Greek letters for Roman letters
- Caesar shift cipher
  - Each letter is substituted by the letter n places further along the alphabet
      E X A M P L E
      H A D P S O H
  - Only 25 such ciphers
- Substitution based on a key phrase
  - The substitution key consists of the phrase's letters (each used once) followed by the rest of the alphabet
      THIS IS ALICE AND BOB'S KEY
      THISALCENDBOKY-FGJMPQRUVWXZ
  - 26! (roughly 4 x 10^26) monoalphabetic substitution ciphers

Historical Cryptographic Exemplars
- Coding
  - Louis XIV's Great Cipher (Rossignols) used one symbol (a 3-digit number) per syllable (held for 200 years)
  - Mary Queen of Scots used a combination of cipher and coded words (nomenclator)
  - The US Marines used the Navajo language as a code in WWII

Historical Cryptographic Exemplars
- Coding
  - Mary Queen of Scots used a combination of ciphers and coded words (nomenclator)
  - Objective: to kill Queen Elizabeth I
Historical Cryptographic Exemplars
Etienne Bazeries postulated that the cluster 124-22-125-46-345 encoded "les ennemis" (the enemies) and from that information was able to unravel the entire cipher
Substitution cipher
The encryption algorithm $\varepsilon_k(m)$ is a substitution function which replaces each $m \in M$ with a corresponding $c \in C$. The substitution function is parameterized by a secret key k. The decryption algorithm $D_k(c)$ is merely the reverse substitution. In general, the substitution can be given by a mapping
$$\pi: M \to C$$
and the reverse substitution is just the corresponding inverse mapping $\pi^{-1}: C \to M$.

Substitution cipher
The simplest case is called shift ciphers. In shift ciphers, K = M = C; let N = #M; the encryption and decryption mappings are defined by
$$\varepsilon_k(m) = m + k \pmod N, \qquad D_k(c) = c - k \pmod N$$
with $m, c, k \in \mathbb{Z}_N$.

Substitution cipher
For the case of M being the capital letters of the Latin alphabet, i.e., $M = \mathbb{Z}_{26}$, the shift cipher is also known as the Caesar cipher (k = 3). If gcd(k, N) = 1, then for every m < N
$$km \pmod N$$
ranges over the entire message space $\mathbb{Z}_N$. Therefore for such k and for m, c < N

Substitution cipher
$$\varepsilon_k(m) = km \pmod N, \qquad D_k(c) = k^{-1}c \pmod N$$
provide a simple substitution cipher. With keys $k = (k_1, k_2)$ such that $\gcd(k_1, N) = 1$,
$$\varepsilon_k(m) = k_1 m + k_2 \pmod N, \qquad D_k(c) = k_1^{-1}(c - k_2) \pmod N$$
can also define a simple substitution cipher called the affine cipher.

Substitution cipher
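The shift and affine mappings over Z_26 can be tried out with a short Python sketch. The helper names (`encode`, `decode`, `shift_encrypt`, `affine_encrypt` and their inverses) are hypothetical names introduced for this illustration, and the alphabet is assumed to be A = 0, ..., Z = 25.

```python
from math import gcd

N = 26  # size of the message space Z_26 (A = 0, ..., Z = 25)


def encode(text: str) -> list[int]:
    """Map letters to numbers, silently dropping anything that is not A-Z."""
    return [ord(ch) - ord("A") for ch in text.upper() if "A" <= ch <= "Z"]


def decode(numbers: list[int]) -> str:
    return "".join(chr(n + ord("A")) for n in numbers)


def shift_encrypt(m: int, k: int) -> int:
    return (m + k) % N


def shift_decrypt(c: int, k: int) -> int:
    return (c - k) % N


def affine_encrypt(m: int, k1: int, k2: int) -> int:
    if gcd(k1, N) != 1:
        # Without gcd(k1, N) = 1 the map m -> k1*m is not a permutation of Z_N.
        raise ValueError("k1 must be coprime to N")
    return (k1 * m + k2) % N


def affine_decrypt(c: int, k1: int, k2: int) -> int:
    k1_inv = pow(k1, -1, N)  # modular inverse (Python 3.8+)
    return (k1_inv * (c - k2)) % N


if __name__ == "__main__":
    # Caesar cipher: shift with k = 3.
    print(decode([shift_encrypt(m, 3) for m in encode("EXAMPLE")]))
```

The shift cipher is the affine cipher with k1 = 1, so both have tiny key spaces (26 and 12 x 26 = 312 keys for N = 26).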
These ciphers are called monoalphabetic ciphers: for a given encryption key, each element in the plaintext message space is substituted by a unique element in the ciphertext message space. Therefore, monoalphabetic ciphers are extremely vulnerable to frequency analysis attacks.

Historical Cryptographic Exemplars
- The Arabs broke monoalphabetic substitution using frequency analysis
- Letter frequencies in English, in percent (Beker & Piper):

    a  8.2     j  0.2     s  6.3
    b  1.5     k  0.8     t  9.1
    c  2.8     l  4.0     u  2.8
    d  4.3     m  2.4     v  1.0
    e 12.7     n  6.7     w  2.4
    f  2.2     o  7.5     x  0.2
    g  2.0     p  1.9     y  2.0
    h  6.1     q  0.1     z  0.1
    i  7.0     r  6.0

- Thus, the letters enciphering e, t, and a are easily discovered
- Subsequently one can look for the rest of the letters and letter pairs

Figure: frequency distribution of English alphabet letters

Substitution cipher
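Frequency analysis starts from a letter count of the ciphertext, which is then compared with the English table. A minimal sketch, assuming only the letters A-Z matter and using a hypothetical helper name `letter_frequencies`:

```python
from collections import Counter


def letter_frequencies(text: str) -> dict[str, float]:
    """Return letter frequencies in percent, most frequent letter first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    counts = Counter(letters)
    return {ch: 100.0 * n / total for ch, n in counts.most_common()}


if __name__ == "__main__":
    # Ciphertext of the exercise; the most frequent symbols are the first
    # candidates for e, t and a.
    for letter, percent in letter_frequencies(
        "Ek wso, veptemheka tsk vl jloa qdsk broy"
    ).items():
        print(letter, round(percent, 1))
```

On a text this short the counts are only a hint; the statistics become reliable for ciphertexts of a few hundred letters.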
Cryptanalyze: Ek wso, veptemheka tsk vl jloa qdsk broy

Substitution cipher
Let us interpret A = 0, B = 1, …, Z = 25. The encryption algorithm $\varepsilon_K(m)$ is defined as the following permutation

Then the corresponding decryption algorithm $D_K(c)$ is given by

Substitution cipher
Encrypt: Sicurezza dei sistemi informatici
…………………
Fill in the blanks!
Decrypt? Substitution cipher
The key space of this cipher has size 26! > 4 x 10^26. Huge in comparison with the size of the message space! But each plaintext character is encrypted to a unique ciphertext character. This weakness renders the cipher extremely vulnerable to a cryptanalysis technique called frequency analysis, which exploits the fact that natural languages contain a high volume of redundancy.

Homophonic substitution cipher
- Homos (same), phonos (sound)
- The same letter can be replaced by several symbols of the cipher alphabet (e.g. digits), so as to balance the symbol frequencies and blunt cryptanalysis based on symbol frequency analysis.
- Better or worse than polyalphabetic ciphers?
  - It is a kind of monoalphabetic cipher
  - Worse, because each letter corresponds to fewer symbols
  - Worse, because in a polyalphabetic cipher one symbol can represent several letters

Homophonic substitution codes
Polyalphabetic cipher
A substitution cipher is called a polyalphabetic cipher if a plaintext message element in P may be substituted by many, possibly any, ciphertext message elements in C.

Historical Cryptographic Exemplars
- Vigenère's polyalphabetic cipher (16th century) generalizes Caesar's shift cipher
  - Can alternate between lines of the Vigenère square; or
  - Use a keyword: it consists of a period p and a sequence k1, k2, ..., kp of Caesar shifts
- The Vigenère cipher is not amenable to simple frequency analysis

Historical Cryptographic Exemplars
Vigenere Square
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
B   B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
C   C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
D   D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
E   E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
F   F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
G   G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
H   H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
I   I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
J   J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
K   K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
L   L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
M   M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
N   N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
O   O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
P   P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
Q   Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
R   R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
S   S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
T   T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
U   U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
V   V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
W   W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
X   X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
Y   Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
Z   Z A B C D E F G H I J K L M N O P Q R S T U V W X Y

$$C_i \equiv (P_i + K_i) \bmod 26, \qquad P_i \equiv (C_i - K_i) \bmod 26$$

Keyword: white
Plaintext: Divert Troops To East Ridge

w h i t e w h i t e w h i t e w h i t e w h i   (keyword)
d i v e r t t r o o p s t o e a s t r i d g e   (plaintext)
Z P D X V P A Z H S L Z B H I W Z B K M Z N M   (ciphertext)

Comparison
Text to encrypt: gauguininfouniroma2itmeTeachingSicurezzadeiSistemiInformatici
KEY: CHARLESV

Exercise on Vigenere
Vigenere Square
The square is the same as the one shown above.

Keyword: GME
Plaintext: sicurezza dei sistemi informatici

Exercise on Vigenere
Compare the frequencies of the resulting ciphertexts (Sub/Vigenere)
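The rule C_i ≡ (P_i + K_i) mod 26 is easy to check with a small Python sketch, which can also be used to produce the ciphertexts for the exercises. The function name `vigenere` and the choice to drop spaces and punctuation are assumptions made for this example; the keyword is assumed to be a non-empty string of letters.

```python
from itertools import cycle


def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Apply the Vigenère cipher; non-letters in the text are dropped."""
    shifts = cycle(ord(k) - ord("A") for k in keyword.upper())
    out = []
    for ch in text.upper():
        if not "A" <= ch <= "Z":
            continue  # spaces, digits and punctuation are not enciphered
        k = next(shifts)
        p = ord(ch) - ord("A")
        c = (p - k) % 26 if decrypt else (p + k) % 26
        out.append(chr(c + ord("A")))
    return "".join(out)


if __name__ == "__main__":
    ciphertext = vigenere("sicurezza dei sistemi informatici", "GME")
    print(ciphertext)
    print(vigenere(ciphertext, "GME", decrypt=True))
```

Feeding the output into a letter count shows the flattening effect: the same plaintext letter is mapped to different ciphertext letters depending on its position.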
Chosen Plaintext/Ciphertext attack? Playfair ciphers
- Matrix-based block cipher used in WWI
- In a 5x5 matrix P, write the letters of the word "playfair" (for example) without duplicates, and fill in with the other letters of the alphabet, with I and J used interchangeably.

        p l a y f
        i r b c d
    P = e g h k m
        n o q s t
        u v w x z

Playfair encryption
1. Break the plaintext into letter pairs
   - If a pair would contain double letters, split them with x
   - Pad the end with x
   - hellothere becomes he lx lo th er ex
2. For each pair,
   - If they are in the same row, replace each with the letter to its right (mod 5): he → KG
   - If they are in the same column, replace each with the letter below it (mod 5): lo → RV
   - Otherwise, replace each with the letter we'd get if we swapped their column indices: lx → YV

    he lx lo th er ex
    KG YV RV QM GI KU

To decrypt, just reverse!
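The two encryption steps can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation: the names `build_square`, `split_pairs` and `playfair_encrypt` are invented here, J is folded into I, and the corner case of a doubled x in the plaintext is deliberately not handled.

```python
ALPHABET = "abcdefghiklmnopqrstuvwxyz"  # 25 letters: j is merged into i


def build_square(keyword: str) -> list[str]:
    """Keyword letters without duplicates, then the rest of the alphabet."""
    seen: list[str] = []
    for ch in (keyword.lower() + ALPHABET).replace("j", "i"):
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    return ["".join(seen[r * 5:(r + 1) * 5]) for r in range(5)]


def split_pairs(text: str) -> list[tuple[str, str]]:
    """Step 1: split into pairs, separating doubles and padding with x."""
    letters = [ch for ch in text.lower().replace("j", "i") if ch in ALPHABET]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        if i + 1 < len(letters) and letters[i + 1] != a:
            pairs.append((a, letters[i + 1]))
            i += 2
        else:
            pairs.append((a, "x"))  # double letter or lone last letter
            i += 1
    return pairs


def playfair_encrypt(text: str, keyword: str) -> str:
    """Step 2: apply the row, column and rectangle rules to every pair."""
    square = build_square(keyword)
    pos = {ch: (r, c) for r, row in enumerate(square) for c, ch in enumerate(row)}
    out = []
    for a, b in split_pairs(text):
        (ra, ca), (rb, cb) = pos[a], pos[b]
        if ra == rb:      # same row: take the letter to the right
            out += [square[ra][(ca + 1) % 5], square[rb][(cb + 1) % 5]]
        elif ca == cb:    # same column: take the letter below
            out += [square[(ra + 1) % 5][ca], square[(rb + 1) % 5][cb]]
        else:             # rectangle: swap the column indices
            out += [square[ra][cb], square[rb][ca]]
    return "".join(out).upper()


if __name__ == "__main__":
    print(playfair_encrypt("hellothere", "playfair"))
```

Decryption uses the same square with the row and column rules reversed (left instead of right, above instead of below); the rectangle rule is its own inverse.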
Weaknesses
- P is unknown, but the structure of its bottom rows is predictable
- Can break using digram frequencies
  - Example: if a digram and its reverse both appear often, it's probably ER and RE.
- Each plaintext letter maps to how many possible ciphertext letters?
Playfair
Plaintext: Sconfiggere il nemico senza combattere è la massima abilità
Ciphertext?
P := 5x5 matrix filled in alphabetical order

Historical Cryptographic Exemplars
- Babbage broke Vigenère's cipher (19th century)
- Stage 1: Discover the key length (Kasiski test)
  - Look for repeated sequences, and measure the distance between them
  - The key length is a factor of these distances
- Stage 2: Identify the key itself
  - Compare the distribution for each of the key letters with the standard distribution, to identify the shift

Kasiski Test
Determining the period of the ciphertext
The test relies on the observation that some common words (such as "the" in English) will at times be encrypted to the same cryptogram, so that repeated groups appear in the ciphertext.

Plaintext: CRYPTO IS SHORT FOR CRYPTOGRAPHY

Ciphertext                         Key
IZYCEIQYATSXBFBCWZEXFSMZACSSHF     gianluigime
IZYCEIQYAHBCNNUZCEJJBUORNABGFH     gianlui
IZYCZWIFYPOEZNOEIZYCZWGEGXHLFH     gian

Assuming that the repeated cryptograms correspond to the same plaintext, note that the distance between the two occurrences in the last ciphertext is 16: the key has length 16, 8, 4 or 2.
The Kasiski test relies on the occasional coincidental alignment of letter groups in the plaintext with the keyword to give a keyword length estimate. The test says if a string of characters appears repeatedly in a polyalphabetic ciphertext message, it is possible (though not certain), that the distance between the occurrences is a multiple of the length of the keyword.
To demonstrate how this works, suppose the Vigenère cipher is used to encipher the message "THE CHILD IS FATHER OF THE MAN" using the keyword "POETRY" to produce the following ciphertext.

Plaintext    T H E C H I L D I S F A T H E R O F T H E M A N
Keyword      P O E T R Y P O E T R Y P O E T R Y P O E T R Y
Ciphertext   I V I V Y G A R M L W Y I V I K F D I V I F R L

The keyword "POETRY" is six letters long. Note that the trigraph "IVI" occurs three times in the ciphertext. The second occurrence of "IVI" begins 12 character positions after the first. The third occurrence begins 6 character positions after the second.

This suggests that the separations between repeated letter groups stand a good chance of being multiples of the keyword length. This observation leads to the following fact concerning the Kasiski test.

Fact: The greatest common divisor of the separations between repeated groups of characters in a ciphertext enciphered by the Vigenère cipher, or a divisor of it, has a good chance of being equal to the keyword length.

Since the occurrences of "IVI" were separated by 12 characters and then 6 characters, and gcd(6, 12) = 6, we have hit exactly the number of letters in the keyword. We conclude with one more example illustrating the Kasiski test.

Kasiski Test
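The search for repeated groups and the gcd of their separations can be automated. The sketch below is one possible way to do it, with hypothetical function names (`vigenere_encrypt`, `repeated_ngrams`, `kasiski_gcd`); it enciphers the message itself so that the example is self-contained, and it only looks at groups of a fixed length n.

```python
from collections import defaultdict
from functools import reduce
from itertools import cycle
from math import gcd


def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    """Vigenère encryption of an A-Z string with an A-Z keyword (A = 0)."""
    shifts = cycle(ord(k) - ord("A") for k in keyword)
    return "".join(
        chr((ord(p) - ord("A") + k) % 26 + ord("A"))
        for p, k in zip(plaintext, shifts)
    )


def repeated_ngrams(ciphertext: str, n: int = 3) -> dict[str, list[int]]:
    """Map every n-gram that occurs more than once to its start positions."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    return {gram: pos for gram, pos in positions.items() if len(pos) > 1}


def kasiski_gcd(ciphertext: str, n: int = 3) -> int:
    """gcd of the separations between consecutive repeats (0 if none)."""
    distances = []
    for pos in repeated_ngrams(ciphertext, n).values():
        distances += [b - a for a, b in zip(pos, pos[1:])]
    return reduce(gcd, distances, 0)


if __name__ == "__main__":
    ct = vigenere_encrypt("THECHILDISFATHEROFTHEMAN", "POETRY")
    print(repeated_ngrams(ct))
    print(kasiski_gcd(ct))
```

On real ciphertexts some repeats are coincidental, so in practice one looks at the most common factors of the distances rather than trusting a single gcd.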
A Vigenère cipher has a period m: for each fixed j, the ciphertext characters at positions mi + j (i = 0, 1, 2, ...) are a simple substitution of the plaintext characters at the same positions.

Kappa Test
- Invented in 1925 by William F. Friedman
- Index of coincidence: the probability that two ciphertext letters represent the same letter of the plaintext alphabet.
- Given the letter probabilities of English, the probability that two randomly chosen letters are the same is 6.5%.

Probability of Selecting Multiple Letters in Standard English
In the standard English frequency table for letters, the probability of selecting a single letter is the relative frequency converted to decimal, that is
$$P(\text{selecting a single letter in standard English}) = \frac{\text{relative frequency of the letter found in the table}}{100}$$
In a large sample of English text, estimate the probability of selecting two E’s. Two A’s.
For convenience, we will assign the variables
$p_0, p_1, p_2, p_3, p_4, \ldots, p_{24}, p_{25}$ to represent the probabilities of selecting the letters A, B, C, D, E, …, Y, Z from the standard English alphabet. The subscripts of the variables correspond to the MOD 26 alphabet assignment number of the corresponding alphabet letter. We will use this variable assignment in the next example.
Example 3: What is the probability that two randomly selected English letters are the same?
Solution: Using the standard English frequency table, we see that
$$P(\text{two letters are the same}) = P(\text{2 A's}) + P(\text{2 B's}) + P(\text{2 C's}) + \cdots + P(\text{2 Y's}) + P(\text{2 Z's})$$
$$= p_0^2 + p_1^2 + p_2^2 + \cdots + p_{24}^2 + p_{25}^2$$
$$= (0.08167)^2 + (0.01492)^2 + (0.02782)^2 + (0.04253)^2 + (0.12702)^2 + (0.02228)^2 + (0.02015)^2 + (0.06094)^2 + (0.06966)^2$$
$$\quad + (0.00153)^2 + (0.00772)^2 + (0.04025)^2 + (0.02406)^2 + (0.06749)^2 + (0.07507)^2 + (0.01929)^2 + (0.00095)^2 + (0.05987)^2$$
$$\quad + (0.06327)^2 + (0.09056)^2 + (0.02758)^2 + (0.00978)^2 + (0.02360)^2 + (0.00150)^2 + (0.01974)^2 + (0.00074)^2$$
$$\approx 0.065$$
Index of Coincidence for Monoalphabetic Ciphers
In monoalphabetic ciphers, the frequencies of letters in standard English are preserved when converting from plaintext to ciphertext. The following example illustrates why this is true.
Illustrate why the Caesar shift cipher preserves frequencies when converting from plaintext to ciphertext.
Solution:
Recall that the index of coincidence represents the probability that two randomly selected letters are identical. Since monoalphabetic ciphers preserve frequencies, the index of coincidence of the plaintext alphabet of standard English will be exactly the same as the index of coincidence of the ciphertext alphabet for a monoalphabetic cipher. Using the result from the previous example, this fact results in the following statement:
Index of Coincidence for Monoalphabetic Ciphers
$$I = P(\text{two randomly selected letters are the same}) \approx 0.065 \quad \text{(index of coincidence for monoalphabetic ciphers)}$$
Index of Coincidence for Polyalphabetic Ciphers
In a polyalphabetic cipher, the goal is to spread out the letter frequencies so that each letter has the same likelihood of occurring in the ciphertext. The next example determines what the index of coincidence is for a polyalphabetic cipher for a large collection of letters.

Problem: Determine the probability that two randomly selected letters of the ciphertext are identical, for a message enciphered with a polyalphabetic cipher, assuming there are a very large number of letters in the ciphertext.

Solution: If every letter is equally likely, then $p_0 = p_1 = \cdots = p_{25} = 1/26$, and
$$P(\text{two letters are the same}) = P(\text{2 A's}) + P(\text{2 B's}) + \cdots + P(\text{2 Z's}) = p_0^2 + p_1^2 + \cdots + p_{25}^2$$
$$= 26\, p_0^2 = 26 \cdot (1/26)^2 = 1/26 \approx 0.03846 \approx 0.0385$$
Since the index of coincidence represents the probability that two randomly selected letters are identical, Example 5 allows us to make the following statement:
Index of Coincidence for Polyalphabetic Ciphers
$$I = P(\text{two randomly selected letters are the same}) \approx 0.0385 \quad \text{(index of coincidence for polyalphabetic ciphers)}$$
The index of coincidence values for monoalphabetic (0.065) and polyalphabetic ciphers (0.0385) were derived assuming that the plaintext message has a very large number of letters. Messages that are actually enciphered and deciphered are normally much shorter. Hence, the index of coincidence for a typical enciphered message will lie somewhere between 0.0385 and 0.065. This leads to the following statement:

Index of Coincidence Bound

For a typical ciphertext message, the index of coincidence I satisfies: $0.0385 \leq I \leq 0.065$
Fact: If I is close to 0.0385, then the cipher is likely to have been obtained from a polyalphabetic cipher. If I is closer to 0.065, the cipher is likely to be monoalphabetic.
Knowing what the value for the index of coincidence tells us, we now need to derive a formula for calculating it. Before doing this, we need to recall the following fact concerning summation notation: Kappa Test
The key length is approximately (Friedman):
$$L \approx \frac{0.0265\, n}{(n-1)\, IC - 0.0385\, n + 0.065}, \qquad IC = \frac{\sum_{i=1}^{26} n_i (n_i - 1)}{n(n-1)}$$
where n is the length of the text and $n_1, \ldots, n_{26}$ are the letter counts.

Derivation of Formula for the Index of Coincidence for a Given Ciphertext Message
Suppose a ciphertext message is received. Let $n_0, n_1, n_2, \ldots, n_{24}, n_{25}$ be the numbers of occurrences of the alphabet letters A, B, C, …, Y, Z in the ciphertext (note that the subscript of each variable corresponds to the MOD 26 alphabet assignment number of the corresponding letter). Suppose
$$n = n_0 + n_1 + n_2 + \cdots + n_{24} + n_{25}$$
represents the total number of letters in the ciphertext. Recall that the index of coincidence represents the probability that two randomly selected letters are identical. Using the multiplication principle of probability for n total letters, we can compute the following probabilities for each individual letter:

$$P(\text{2 A's are randomly selected}) = \frac{n_0}{n} \cdot \frac{n_0 - 1}{n - 1} = \frac{n_0(n_0 - 1)}{n(n - 1)}$$

$$P(\text{2 B's are randomly selected}) = \frac{n_1}{n} \cdot \frac{n_1 - 1}{n - 1} = \frac{n_1(n_1 - 1)}{n(n - 1)}$$

$$P(\text{2 C's are randomly selected}) = \frac{n_2}{n} \cdot \frac{n_2 - 1}{n - 1} = \frac{n_2(n_2 - 1)}{n(n - 1)}$$

$$\vdots$$

$$P(\text{2 Y's are randomly selected}) = \frac{n_{24}}{n} \cdot \frac{n_{24} - 1}{n - 1} = \frac{n_{24}(n_{24} - 1)}{n(n - 1)}$$

$$P(\text{2 Z's are randomly selected}) = \frac{n_{25}}{n} \cdot \frac{n_{25} - 1}{n - 1} = \frac{n_{25}(n_{25} - 1)}{n(n - 1)}$$

Since these events are mutually exclusive, we can sum the individual probabilities for each letter to find the index of coincidence.

$$I = P(\text{2 randomly selected letters are the same})$$
$$= \frac{n_0(n_0 - 1)}{n(n - 1)} + \frac{n_1(n_1 - 1)}{n(n - 1)} + \frac{n_2(n_2 - 1)}{n(n - 1)} + \cdots + \frac{n_{24}(n_{24} - 1)}{n(n - 1)} + \frac{n_{25}(n_{25} - 1)}{n(n - 1)}$$
$$= \frac{1}{n(n - 1)} \left[ n_0(n_0 - 1) + n_1(n_1 - 1) + n_2(n_2 - 1) + \cdots + n_{24}(n_{24} - 1) + n_{25}(n_{25} - 1) \right]$$
$$= \frac{1}{n(n - 1)} \sum_{i=0}^{25} n_i(n_i - 1)$$
We summarize this result.
Formula for the Index of Coincidence
The index of coincidence I is given by the formula:
$$I = \frac{1}{n(n - 1)} \sum_{i=0}^{25} n_i(n_i - 1) = \frac{1}{n(n - 1)} \left[ n_0(n_0 - 1) + n_1(n_1 - 1) + n_2(n_2 - 1) + \cdots + n_{24}(n_{24} - 1) + n_{25}(n_{25} - 1) \right]$$
where n represents the total number of letters in the ciphertext and $n_i$, $i = 0, 1, \ldots, 25$, represents the number of occurrences of each individual letter in the ciphertext, with $n_0$ the number of A's, $n_1$ the number of B's, $n_2$ the number of C's, etc.

Index of coincidence
Expected values of IC for several periods.
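The index of coincidence formula can be evaluated directly from the letter counts of a ciphertext. A minimal sketch, with a hypothetical function name `index_of_coincidence` and the assumption that only the letters A-Z are counted:

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """I = sum over letters of n_i (n_i - 1), divided by n (n - 1)."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters to pick a pair")
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


if __name__ == "__main__":
    sample = "IZYCZWIFYPOEZNOEIZYCZWGEGXHLFH"
    # Values near 0.065 suggest a monoalphabetic cipher,
    # values near 0.0385 a polyalphabetic one.
    print(round(index_of_coincidence(sample), 4))
```

For short texts such as this sample the estimate is noisy, which is why the bounds 0.0385 and 0.065 are only reached approximately in practice.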
The lower the index of coincidence, the less variation in the characters of the ciphertext and (from our model of English) the longer the period of the cipher Transposition Ciphers
A transposition cipher (also called a permutation cipher) transforms a message by rearranging the positions of the elements of the message without changing the identities of the elements. Consider that the elements of a plaintext message are letters in $\mathbb{Z}_{26}$; let b be a fixed positive integer representing the size of a message block; let $M = C = (\mathbb{Z}_{26})^b$; let K be all permutations of (1, 2, …, b).

Transposition Ciphers
Example: b = 4
P: This is a secret message
Encryption key: $\pi = (2, 4, 1, 3)$, i.e. ciphertext position i takes the plaintext letter at position $\pi(i)$
Group by 4: this isas ecre tmes sage
C: hstissiaceermsteaesg

Decryption:
Decryption key: $\pi^{-1} = (3, 1, 4, 2)$

Transposition Ciphers
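A block transposition with b = 4 can be sketched in Python. The key (2, 4, 1, 3) used below is an assumption derived from the plaintext and ciphertext of the example (it is the permutation that maps "this" to "hsti"), and the names `permute_blocks` and `invert` are invented for this illustration.

```python
def permute_blocks(text: str, perm: list[int]) -> str:
    """Output position i of every block takes the input letter at perm[i]."""
    b = len(perm)
    if len(text) % b != 0:
        raise ValueError("text length must be a multiple of the block size")
    out = []
    for start in range(0, len(text), b):
        block = text[start:start + b]
        out.append("".join(block[p - 1] for p in perm))  # perm is 1-based
    return "".join(out)


def invert(perm: list[int]) -> list[int]:
    """Inverse permutation, used as the decryption key."""
    inverse = [0] * len(perm)
    for i, p in enumerate(perm, start=1):
        inverse[p - 1] = i
    return inverse


if __name__ == "__main__":
    key = [2, 4, 1, 3]
    ciphertext = permute_blocks("thisisasecretmessage", key)
    print(ciphertext)
    print(permute_blocks(ciphertext, invert(key)))
```

Since letters are only moved, never replaced, the letter frequencies of the ciphertext equal those of the plaintext.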
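Plaintext: THIS IS A SECRET MESSAGE, key: THEKEY (alphabetical order of its letters: 5 3 1 4 2 6)

Rail fence with one rail per key letter:

Key   Order   Rail
T     5       T R
H     3       H C E E
E     1       I E T G
K     4       S S M A
E     2       I A E S
Y     6       S S

Railfence (rails read top to bottom): TRHCEEIETGSSMAIAESSS
Redfence (rails read in key order): IETGIAESHCEESSMATRSS

Columnar:

T H E K E Y
5 3 1 4 2 6
T H I S I S
A S E C R E
T M E S S A
G E

Columnar (columns read in key order): IEEIRSHSMESCSTATGSEA

The German ADFGVX cipher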
Introduced in 1918, it combined substitution and transposition. Draw up a 6x6 grid, and fill the grid with a random arrangement of the 26 letters of the alphabet and the 10 digits. The arrangement of the elements in the grid is part of the key.

The German ADFGVX cipher
    A D F G V X
A   8 P 3 D 1 N
D   L T 4 O A H
F   7 K B C 5 Z
G   J U 6 W G M
V   X S V I R 2
X   9 E Y 0 F Q

The German ADFGVX cipher
The first step is to take each letter of the plaintext, locate its position in the grid, and substitute it with the letters that label its row and column. For example, 8 becomes AA, P becomes AD, and L becomes DA.
The German ADFGVX cipher So the message
ATTACK AT 10 PM
BECOMES
DV DD DD DV FG FD DV DD AV XG AD GX
The German ADFGVX cipher
This is a simple mono-alphabetic substitution cipher, which can be broken by frequency analysis. But now we add some transposition into the mix. For step 2, we need a keyword. In our example, we use the keyword MARK. The keyword is the second piece of information we must share with the receiver.

The German ADFGVX cipher
Now we transpose by arranging the message in columns and shifting the columns around according to the alphabetical order of the keyword:

M A R K        A K M R
D V D D        V D D D
D D D V        D V D D
F G F D        G D F F
D V D D        V D D D
A V X G        V G A X
A D G X        D X A G

The German ADFGVX cipher
Now we read off the message column-wise:

VDGVVD DVDDGX DDFDAA DDFDXG

Substitution ADFGVX
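Both steps of the ADFGVX cipher, the grid substitution and the keyword-driven column transposition, fit in a short Python sketch. The grid below is the one from the lecture example; the names `GRID`, `LABELS` and `adfgvx_encrypt` are chosen for this illustration, and characters that are not in the grid (such as spaces) are simply skipped.

```python
LABELS = "ADFGVX"
# Rows of the 6x6 grid, labelled A, D, F, G, V, X from top to bottom.
GRID = ["8P3D1N", "LT4OAH", "7KBC5Z", "JU6WGM", "XSVIR2", "9EY0FQ"]


def adfgvx_encrypt(message: str, keyword: str) -> str:
    # Step 1: replace every symbol by its row label followed by its column label.
    coords = {
        ch: LABELS[r] + LABELS[c]
        for r, row in enumerate(GRID)
        for c, ch in enumerate(row)
    }
    stream = "".join(coords[ch] for ch in message.upper() if ch in coords)

    # Step 2: write the stream in rows under the keyword, then read the
    # columns in the alphabetical order of the keyword letters.
    width = len(keyword)
    rows = [stream[i:i + width] for i in range(0, len(stream), width)]
    order = sorted(range(width), key=lambda i: keyword[i])
    return "".join(row[i] for i in order for row in rows if i < len(row))


if __name__ == "__main__":
    print(adfgvx_encrypt("ATTACK AT 10 PM", "MARK"))
```

The same function can be used for the exercise by changing the message and the keyword; a repeated keyword letter keeps its left-to-right order because Python's sort is stable.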
Exercise (ADFGVX): try with

KEY: SITV

In war, discipline can do more than fury

The German Enigma Machine (Scherbius)
Electrical encryption machine, performing 8 substitutions
  uses n=3 rotating scramblers (26^n orientations)
  scramblers can be configured in n! orders
  plugboard (between keyboard and scramblers): k=6 swapped letter pairs
Encryption consists of
  optional letter-pair switch
  compounded scrambling, with shifts
  reflector swaps letter pairs
  backward scrambling
Scramblers rotate in tandem
Total of about 10^16 possible configurations (26^3 orientations x 3! orders x 100,391,791,500 ways to choose 6 letter pairs)
  changed daily according to a codebook
  each message has its own orientation (message key)
  later added a 4th scrambler
Used extensively by Germany in WW2
For communications between Hitler and his generals, Germany used a separate, more complex machine, the Lorenz cipher

Poles Crack the Enigma
Polish cryptanalysts obtained information about the encryption procedure from commercial Enigmas
Obtained information on its usage
  the Germans used a different orientation key for each message, encrypted twice in the message header (using the day key)
Rejewski focused on the repetitions
  Formalized relationships between the 1st-4th, 2nd-5th, and 3rd-6th letters
    ABCDEFGHIJKLMNOPQRSTUVWXYZ
    FQHPLWOGBMVRXUYCZITNJEASDK
  Built chains (AFWA), (BQZKVELRIB), (CHGOYDPC), (JMXSTNUJ)
  The number and lengths of the chains depend only on the scrambler setting, not on the pair swaps
  Thus need to consider only 6 x 26^3 = 105,456 configurations
  Built a catalog of characteristic chains for all configurations
Rejewski's algorithm to discover the day key
  First, use the catalog to identify the scrambler arrangement and orientation
  Then, run the ciphertext through an Enigma and inspect the output to identify the swapped letter pairs
"Bombe" machines were constructed to mechanize the search

British Crack Improved Enigma
In 1939, the Germans increased Enigma security
  added 2 extra scramblers to choose from: 10x as many arrangements
  increased to 10 letter-pair swaps
British cryptanalysts (Bletchley Park) took over from the Poles
  Recruited the best mathematicians (Turing) and a large staff (7000)
  Received bombes from the Poles
Human weaknesses provided hints and cribs
  Trivial message keys (key sequences, name initials)
  Artificial "intelligent" restrictions on scrambler arrangements and pair swaps restricted the search space
  Standard message formats, e.g., weather
  Some German codebooks were captured
Turing constructed swap-independent chains similar to Rejewski's
  First British bombe (Victory) delivered in 1940
  Search still required significant human help
The British ULTRA: broken German, Italian and Japanese communications were crucial to winning the war

Kerckhoffs's Principle
Everybody should know the design and the algorithm; only the values of the secret keys stay hidden
Lots of horror stories about "security by obscurity"...

Is there Unbreakable Encryption? (Resume)
One-time pads
Sender and receiver use a pre-arranged random stream of letters
Encryption = addition mod 26 (for bits: XOR)
Every letter in the key is used once

  M E S S A G E   (plaintext)
  T H I S K E Y   (key)
  F L A K K K C   (ciphertext)
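The letter-by-letter addition mod 26 can be sketched in Python. The slide's key THISKEY is used only to reproduce the worked example; a real pad would be drawn at random, for which the hypothetical helper `random_pad` below uses the standard `secrets` module. All function names are illustrative.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # A=0, B=1, ..., Z=25


def otp_encrypt(plaintext, pad):
    """Add pad letters to plaintext letters modulo 26."""
    if len(pad) < len(plaintext):
        raise ValueError("the pad must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, pad)
    )


def otp_decrypt(ciphertext, pad):
    """Subtract pad letters from ciphertext letters modulo 26."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, pad)
    )


def random_pad(length):
    """Draw a fresh pad; each letter is chosen uniformly and independently."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(otp_encrypt("MESSAGE", "THISKEY"))  # FLAKKKC
print(otp_decrypt("FLAKKKC", "THISKEY"))  # MESSAGE
```

With a truly random pad, every 7-letter plaintext is an equally plausible decryption of FLAKKKC, since some pad maps it to that ciphertext.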
Example (why the key must be as long as the message): for a 1000-letter plaintext, a repeating key of length 5, 20 or 1000 leaves the cryptanalyst 5 sets of 200 letters (good for frequency analysis!), 20 sets of 50 letters (harder), or 1000 sets of 1 letter (impossible!)
Perfectly secure encryption (Shannon)
Used by Soviet spies, and also for the US-Soviet hotline
Requires significant logistical effort and coordination
Relies on the randomness of the key

One Time Pad
Alice and Bob share a use-once random key K, say 1100
To produce the ciphertext, Alice XORs the message M (say 0110) with the key: C = M ⊕ K = 1010
The ciphertext is sent over an insecure medium
When Bob gets the ciphertext, he XORs it with the same key to decrypt: M = C ⊕ K = 0110

Information Theoretic Security
THEOREM: If the pad is NEVER used again, the ciphertext does not increase the adversary's knowledge about the message.
Why is it important to use the pad only once? Assume the same pad K is used twice:
C1 = M1 ⊕ K; C2 = M2 ⊕ K
Suppose Eve guesses that the pad was reused. She can compute:
C1 ⊕ C2 = (M1 ⊕ K) ⊕ (M2 ⊕ K) = M1 ⊕ M2
Not much? By language statistics (frequency and redundancy) one can often recover M1 and M2!

Computationally Secure Encryption
An encryption scheme is computationally secure if
  the time required to break the cipher exceeds the useful lifetime of the information, or
  the cost of breaking the cipher exceeds the value of the encrypted information
Most schemes that we will discuss are not unbreakable in principle, but are computationally secure
  Usually rely on a very large key space, out of reach of brute force
  Moreover, the most advanced schemes rely on the lack of known efficient algorithms for certain hard problems, not on a proof that no such algorithms exist
  Usually factorization, discrete logarithms, or square roots modulo a composite

Project Venona
Recently disclosed by the CIA
During the Cold War, the Soviets used one-time pads for encrypted communication between embassies
Western intelligence agencies guessed (hoped) that pads were reused (this was the case)
Collected a huge number of ciphertexts, looked for dependencies, and decrypted good portions of the messages
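The XOR one-time pad described earlier, and the kind of leak that pad reuse creates, can be illustrated with a minimal Python sketch. The two messages are invented for the illustration, and the "attack" shown is only the simplest case, where Eve already knows or guesses one of the two plaintexts; it is not a reconstruction of the actual Venona analysis.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# The 4-bit example: M = 0110, K = 1100.
m, k = 0b0110, 0b1100
c = m ^ k
print(format(c, "04b"))      # 1010
print(format(c ^ k, "04b"))  # 0110

# Pad reuse: two hypothetical messages encrypted under the same pad.
m1 = b"ATTACK AT DAWN"
m2 = b"RETREAT AT TEN"
pad = secrets.token_bytes(len(m1))
c1 = xor_bytes(m1, pad)
c2 = xor_bytes(m2, pad)

# The pad cancels out: Eve learns M1 xor M2 without knowing the pad.
leak = xor_bytes(c1, c2)
print(leak == xor_bytes(m1, m2))  # True

# If Eve knows (or guesses) M1, she recovers M2 directly.
print(xor_bytes(leak, m1))        # b'RETREAT AT TEN'
```

In practice Eve would not know M1 in full; she would try probable words against the leaked M1 ⊕ M2 and keep the positions where the other message also comes out as readable text.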
Summary
Encryption algorithms and keys
  Substitution: bits, letters, words
  Transposition
Decryption algorithms
  Reversed process
  Knowledge of the algorithm and the key
Cryptanalysis
  Identify the algorithm
  Obtain as many plaintext-ciphertext pairs as possible
  Use systematicity (patterns)
  Use cribs

An Introduction to Cryptography
Gianluigi Me [email protected] Academic Year 2010/11
Modern Encryption and Cryptanalysis Principles

Modern Encryption Principles
An encryption scheme has 5 ingredients
  Plaintext, Encryption Algorithm, Key, Ciphertext, and Decryption Algorithm
Kerckhoffs: security depends on the secrecy of the key, not of the algorithm
Notation
M or P will usually denote the plaintext message
C will usually denote the ciphertext
K will usually denote a key
Ek(M) = C is the encryption function
Dk(C) = M is the decryption function
Dk(Ek(M)) = M represents the typical flow

Computational Security
Key management for the one-time pad is difficult, so we sacrifice information-theoretic security for computational security
Computational security is based on the computational difficulty of breaking an encryption scheme
Strength relies on the fact that the adversary cannot determine K even when given pairs (M, C)

Cryptographic Protocols
Self-enforcing protocols
Arbitrated protocols
  Trusted third party helps in real time
Adjudicated protocols
  Trusted third party, but only if needed and after the fact

Attacks Against Cryptographic Protocols
Mostly attacks against implementations:
Passive attacks (eavesdropping)
  Cryptanalysis
  Traffic analysis
Active attacks
  Impersonation
  Interruption / denial
  Modification of messages
  Fabrication of new messages
  Replay / reflection of messages

Cryptographic Algorithms
Type of operations applied to the plaintext
  E.g., substitution and transposition
Type of key(s)
  Symmetric: same key
  Asymmetric, public-key: Dk2(Ek1(M)) = M
How the plaintext is processed into ciphertext
  How many and which operations
  How the operations are combined
  Block ciphers, stream ciphers

Cryptanalysis (attacks against cryptographic algorithm)
Ciphertext only
  Uses only knowledge of the algorithm and the ciphertext
Known plaintext
  Also one or more plaintext-ciphertext pairs
  Or probable words: dictionary, known formats, etc.
Chosen text
  Chosen to reveal information about the key
  Chosen plaintext and its ciphertext
    Differential chosen plaintext
    Adaptive chosen plaintext
  Chosen ciphertext and its original plaintext
    Mostly against public-key schemes

Modern Encryption and Cryptanalysis Principles-End