Finishing polyalphabetics & demonstrating transpositions

Recorded lecture for 4/6/20

CS 330 Polyalphabetics & Transpositions HW is out

• Due Wednesday night

• Write a C program to count letter frequencies in English text (or plaintext)

Recall: Vigenère

• write the keyword repeatedly (identifies column)

• write the plaintext out (identifies row)

• Encrypt each letter by writing the value at that row and column

key:        deceptivedeceptivedeceptive
plaintext:  wearediscoveredsaveyourself
ciphertext: ZICVTWQNGRZGVTWAVZHCQYGLMGJ
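The table lookup amounts to adding the key letter's position to the plaintext letter's position, mod 26. A minimal Python sketch of that idea follows; the function name `vigenere_encrypt` is an illustrative choice, not something from the lecture, and it assumes only the letters a-z matter.

```python
import string

ALPHABET = string.ascii_lowercase


def vigenere_encrypt(plaintext, keyword):
    """Encrypt with a repeating keyword; non-letters are dropped (an assumption)."""
    letters = [ch for ch in plaintext.lower() if ch in ALPHABET]
    shifts = [ALPHABET.index(ch) for ch in keyword.lower()]
    out = []
    for i, ch in enumerate(letters):
        shift = shifts[i % len(shifts)]  # keyword written repeatedly
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
    return "".join(out).upper()


print(vigenere_encrypt("wearediscoveredsaveyourself", "deceptive"))
# prints ZICVTWQNGRZGVTWAVZHCQYGLMGJ
```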

Recall: Finding the key length using sorted frequencies

Finding the key length, Part two

• we can quantify this by noticing that the best coset signature curves are generally those that are as low as possible on the left and as high as possible on the right.

• we can measure this tendency by taking the area under the curve for the top half (from 14 to 26) minus the area under the curve for the bottom half (from 1 to 13); call this difference Vi

Finding the key length, Part two

• The average (Aj) of these differences over all the cosets for a suggested key length L gives us a value which measures how close we are to the signature of English.

• The key length at which these averages hit a local maximum, over all the numbers of cosets tried, is the likely key length.

Algorithm for finding key length:

1. read in the text and the maximum key length to try

2. for every L from 1 to the max key length do

   3. break up the text into L cosets

   4. compute the frequency counts for each coset

   5. compute the frequency values (divide by length of each coset)

   6. sort the frequency values

   7. compute the Vi’s for each of the L cosets

   8. compute the average (Aj) of the Vi’s over the L cosets

9. find a place where an Aj is a local maximum. That’s a likely key length.
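The steps above can be sketched in Python roughly as follows. This is a sketch under stated assumptions rather than the exact method from the lecture: the "area under the curve" is approximated by simply summing the sorted frequency values, and all function names are illustrative.

```python
from collections import Counter


def cosets(text, length):
    """Split the letters of text into `length` cosets (every length-th letter)."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    return ["".join(letters[i::length]) for i in range(length)]


def coset_score(coset):
    """V for one coset: sum of the 13 largest sorted frequencies minus the 13 smallest."""
    if not coset:
        return 0.0
    counts = Counter(coset)
    freqs = sorted(counts.get(chr(ord("a") + k), 0) / len(coset) for k in range(26))
    return sum(freqs[13:]) - sum(freqs[:13])


def average_score(text, length):
    """A for one suggested key length: the average V over its cosets."""
    return sum(coset_score(c) for c in cosets(text, length)) / length


def local_maxima(scores):
    """1-based key lengths whose score is higher than both neighbours."""
    return [i + 1 for i in range(1, len(scores) - 1)
            if scores[i] > scores[i - 1] and scores[i] > scores[i + 1]]


def likely_key_lengths(text, max_length):
    scores = [average_score(text, L) for L in range(1, max_length + 1)]
    return local_maxima(scores)
```

With this definition the two endpoints (length 1 and the maximum length tried) are never reported, since they have only one neighbour. Multiples of the true key length tend to score well too, so the smallest reported length is usually the one to try first.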

Finding the key, Part one

• Now that you know the key length, you can find the key.

• First, remember that each of the cosets contains letters that were encrypted with the same cipher alphabet

• So each of them is a monoalphabetic substitution

Finding the key, Part one

• With a Vigenère, the substitution alphabets are all shifted (Caesar) alphabets

• So each of the cosets will have a graph that is close to a shifted graph of the English signature.

• So, if you find the shifted graph that is closest to the English signature graph, you’ve found a key letter in the keyword.

The English Scrawl...

The scrawl…

Finding the key, Part two

• So to quantify this…

• One way to look at the scrawl of each coset and the signature of English is that graphically we’re trying to make them parallel to each other.

– (so that each scrawl value is identical to a signature value or a multiple of it.)

• How can we quantify this idea of ‘parallel’?

Stealing from linear algebra

• Vector

• Dot product

• Vector magnitude

• Cauchy-Schwarz Inequality: a • b <= ||a|| ||b||, and a • b == ||a|| ||b|| iff a and b are parallel

• Every shifted version of a coset’s frequency vector has the same magnitude, so the shift with the largest dot product is the closest to the product of the magnitudes, and the closest to parallel

Algorithm for key letters

1. read in the text and the value for the length of the key, L

2. divide the text into L cosets

3. count the letter occurrences in each coset

4. compute the letter frequencies in each coset

5. for each coset j

   1. compute the 26 shifted lists for this coset and their dot products with b, the standard frequencies of English

   2. find the index of the max dot product value over the 26 dot products; that is where the key letter is in the alphabet
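A possible Python sketch of this algorithm follows. The `ENGLISH` table is an assumed set of commonly quoted approximate letter frequencies, not values taken from the lecture, and the function names are illustrative. Since the magnitudes are the same for all 26 shifts of a coset, the sketch compares raw dot products only.

```python
from collections import Counter

# Approximate English letter frequencies for a..z (assumed standard table).
ENGLISH = [
    0.08167, 0.01492, 0.02782, 0.04253, 0.12702, 0.02228, 0.02015,
    0.06094, 0.06966, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749,
    0.07507, 0.01929, 0.00095, 0.05987, 0.06327, 0.09056, 0.02758,
    0.00978, 0.02360, 0.00150, 0.01974, 0.00074,
]


def letter_frequencies(coset):
    """Relative frequency of each letter a..z in one coset."""
    counts = Counter(coset)
    n = len(coset) or 1
    return [counts.get(chr(ord("a") + k), 0) / n for k in range(26)]


def best_shift(coset):
    """Shift (0..25) whose un-shifted frequencies have the largest dot product with ENGLISH."""
    freqs = letter_frequencies(coset)

    def dot(shift):
        # Plaintext letter i was encrypted to letter (i + shift) mod 26.
        return sum(freqs[(i + shift) % 26] * ENGLISH[i] for i in range(26))

    return max(range(26), key=dot)


def find_key(ciphertext, key_length):
    """Recover one key letter per coset."""
    letters = [ch for ch in ciphertext.lower() if "a" <= ch <= "z"]
    return "".join(chr(ord("a") + best_shift(letters[j::key_length]))
                   for j in range(key_length))
```

On short ciphertexts the cosets are small and an individual key letter can come out wrong, so it is worth decrypting with the recovered key and checking that the result reads as English.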

To summarize…

• To cryptanalyze a (possible) polyalphabetic cipher:

– do a frequency analysis and look at the relative frequencies

– The best way today is to use the Barr-Simoson technique to find the key length and the key itself, and, just to be safe…

• compute the IC to get a guess, N, of how many alphabets there might be, or use the key length signature algorithm.

• try separating the ciphertext into the N separate texts and do a frequency analysis of each of them. Compute the IC for each to see if it matches your guess (it should be near .066) or use the scrawl algorithm.

• Use Kasiski. Look for confirmation by finding repeated ciphertext at intervals that match the key length suggested by the IC, or a multiple of it.

– cryptanalyze each of the monoalphabets to find the key.

So how secure is Vigenere?

• Well, from 1586 till 1863 it was pretty unbreakable.

• Now, it’s chump change…

– useful for short messages that don’t need to be secure for very long.

• So what’s next???

– it seems like we’re running out of ways to make substitution secure…

– It’s that pesky key length thing…

Autokey Cipher

• Ideally we want a key as long as the message

• Vigenère proposed the autokey cipher

– where the keyword is prefixed to the message as the initial key

– then use the message itself as the rest of the key

• e.g., given key deceptive

key:    deceptivewearediscoveredsav
plain:  wearediscoveredsaveyourself
cipher: ZICVTWQNGKZEIIGASXSTSLVVWLA
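The autokey idea can be sketched in a few lines of Python. The function names are illustrative, and the sketch assumes the message contains only the letters a-z once spaces and punctuation are dropped. Decryption has to rebuild the key stream as it goes, because each recovered plaintext letter becomes a later key letter.

```python
def autokey_encrypt(plaintext, keyword):
    letters = [ch for ch in plaintext.lower() if "a" <= ch <= "z"]
    # Key stream: the keyword first, then the message itself.
    stream = (list(keyword.lower()) + letters)[:len(letters)]
    return "".join(
        chr((ord(p) - 97 + ord(k) - 97) % 26 + 65) for p, k in zip(letters, stream)
    )


def autokey_decrypt(ciphertext, keyword):
    stream = list(keyword.lower())
    plain = []
    for i, c in enumerate(ciphertext.lower()):
        p = chr((ord(c) - ord(stream[i])) % 26 + 97)
        plain.append(p)
        stream.append(p)  # each recovered letter extends the key stream
    return "".join(plain)


print(autokey_encrypt("wearediscoveredsaveyourself", "deceptive"))
# prints ZICVTWQNGKZEIIGASXSTSLVVWLA
```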

Transposition Ciphers

• Transpositions re-arrange the original plaintext to create the ciphertext.

• Two usual types

– a complete columnar transposition

– an incomplete columnar transposition

• Other types exist including

– route transposition

– rail-fence ciphers

Complete Columnar Transposition

– the message can be written as an n x m table of letters (some dummy letters may need to be added to fill out the table)

– the message is written row-by-row

– then it is read off column-by-column using a key to pick the columns.

– the key is either a word (so you pick the columns off in alphabetical order)

– or a number (and you pick the columns off in ascending order)

plaintext: Laser beams can be modulated to carry more intelligence than radio waves

key: SORCERY

S O R C E R Y
6 3 4 1 2 5 7
L A S E R B E
A M S C A N B
E M O D U L A
T E D T O C A
R R Y M O R E
I N T E L L I
G E N C E T H
A N R A D I O
W A V E S X X

ciphertext: ECDTM ECAER AUOOL EDSAM MERNE NASSO DYTNR VBNLC RLTIX LAETR IGAWE BAAEI HOX

Incomplete Columnar Transposition

– same as complete, but you don’t include the dummy letters so some columns are shorter than others.

– harder to decipher because of the irregular length of the columns.
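Both variants can be sketched with one Python function. The names `column_order` and `columnar_encrypt` and the `pad` parameter are illustrative choices, not from the lecture. Passing a dummy letter as `pad` gives the complete form; leaving it as `None` gives the incomplete form. Repeated key letters (the two Rs in SORCERY) are assumed to be taken left to right, which matches the numbering on the SORCERY slide.

```python
def column_order(key):
    """Column indices in the order they are pulled off (alphabetical, ties left to right)."""
    return sorted(range(len(key)), key=lambda i: (key[i], i))


def columnar_encrypt(plaintext, key, pad=None):
    letters = [ch for ch in plaintext.upper() if ch.isalpha()]
    width = len(key)
    if pad is not None:
        # Complete rectangle: add dummy letters to fill out the last row.
        letters += [pad] * (-len(letters) % width)
    # letters[c::width] is column c of the row-by-row table.
    return "".join("".join(letters[c::width]) for c in column_order(key))


print(column_order("SORCERY"))
# prints [3, 4, 1, 2, 5, 0, 6]
```

The printed order is zero-based: column index 3 (the C) comes off first, then index 4 (the E), and so on.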

Transposition - Example 2

HOLMES HAD BEEN SEATED FOR SOME HOURS IN SILENCE

Length of text is 40

Key is CRYPT

Pull columns off in the order: 1 4 2 5 3

1 2 3 4 5
H O L M E
S H A D B
E E N S E
A T E D F
O R S O M
E H O U R
S I N S I
L E N C E

Cipher is: HSEAO ESLMD SDOUS COHET RHIEE BEFMR IELAN ESONN

HOLMES HAD BEEN SEATED FOR SOME HOURS IN SILENCE

Length of text is 40

Key is HOLMES

Pull columns off in the order: 5 1 3 4 2 6

1 2 3 4 5 6
H O L M E S
H A D B E E
N S E A T E
D F O R S O
M E H O U R
S I N S I L
E N C E

Number of rows is 7

Downside here is that you’re fooled into thinking it’s a complete rectangle (because it factors nicely)

Cipher is: EETSU IHHND MSELD EOHNC MBARO SEOAS FEINS EEORL

Cryptanalysis of Transposition Ciphers

1. Do a frequency analysis to confirm that it’s a transposition

2. Factor the length of the message to guess at rectangle size

3. Pick a pair of factors and write down the message column-wise

4. These columns will be numbered 1..n

5. Re-arrange the columns till you get text that makes sense. (Key is the order in which you’ve re-arranged the columns)
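Steps 2 through 5 can be sketched in Python for the complete-rectangle case. Two assumptions are worth flagging: "text that makes sense" is replaced here by a search for a known or guessed word (a crib), and every column ordering is tried by brute force, which is only practical for a small number of columns. All names are illustrative.

```python
from itertools import permutations


def rectangle_shapes(n):
    """All (rows, columns) pairs with rows * columns == n."""
    return [(n // c, c) for c in range(1, n + 1) if n % c == 0]


def split_columns(ciphertext, n_cols):
    """Write the ciphertext down column-wise into n_cols equal columns."""
    letters = ciphertext.replace(" ", "")
    rows = len(letters) // n_cols
    return [letters[i * rows:(i + 1) * rows] for i in range(n_cols)]


def read_rows(columns, order):
    """Read row by row after arranging the columns in the given order."""
    return "".join(columns[c][r] for r in range(len(columns[0])) for c in order)


def search(ciphertext, n_cols, crib):
    """Column orders (zero-based) whose row-by-row reading contains the crib."""
    columns = split_columns(ciphertext, n_cols)
    return [order for order in permutations(range(n_cols))
            if crib in read_rows(columns, order)]
```

For the SORCERY ciphertext, `rectangle_shapes(63)` includes `(9, 7)`, and the order `(5, 2, 3, 0, 1, 4, 6)` reads back the plaintext; that is the key numbering 6 3 4 1 2 5 7 minus one. An incomplete rectangle would need the column lengths guessed as well, which this sketch does not attempt.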
