<<

Lakehead University Theory of Cryptology Math–3375

The ADFGVX

Wilson Poulter & Justin Kulp, Lakehead University

he ADFGVX cipher is a private- en- as cryptanalytic security [2]. cryption method that uses a At the time, the French army had a dedicated Tsquare to encrypt a plaintext message group known as the Bureau du Chiffre, once, it then uses a keyword to transpose let- or Cipher Bureau. By April 4, 1918, Lieutenant ters of the singly encrypted text, adding ad- , a member of the Bureau, was able ditional difficulty for cryptanalysis. to identify two messages with identical strings and orderings of text, indicating plaintext with the same History beginning and keys. This, and other clues led to a systematic and statistical breakdown of the Ger- On the frontline of trench warfare during the First man scheme during times of heavy communication. World War (1914-1918), communication was an in- However, the Allies never did develop an univer- tegral part of daily activities. In order to carry sal method for decrypting ADFGVX . The out an effective offensive, messages containing bat- largest success of Painvin was on June 3, 1918, when tle plans had to be transmitted along the kilome- he intercepted and decrypted the (translated) mes- tres of trench networks to commanding officers. The sage “Rush munitions Stop Even by day if not seen,” obvious method for accomplishing this task at the which was later confirmed, and prepared the French time was through the use of radio . By army for an attack on June 7 [2]. these means, operators could send messages using Morse to other operators located far away. Al- Method though this was the only practical way of sending messages at the time, the use of radio had the draw- The ADFGVX cipher first employs a 6 × 6 Poly- back that one could not control who the message bius square to encrypt plaintext monographs into was sent to. Thus, enemy operators could intercept digraphs and then applies a single columnar trans- virtually every message sent, providing them with a position on the modified text. Simply put, the plain- tactical advantage. In order to protect the content text letters are substituted by digraphs and are then of messages even after they had been intercepted, transposed in columns by the use of a keyword [3]. cryptographers from all countries began to develop This combination of fractionation and transposition ciphers that would protect their messages even if makes the cipher especially difficult for cryptanaly- they were intercepted [1]. sis [1]. The ADFGX cipher originally came into use on March 5, 1918, and later evolved into the more com- plicated ADFGVX cipher to include numbers. The method was invented by the German Colonel Fritz To accomplish the first step of , plain- Nebel and was chosen by a conference of German text characters are substituted using a 6 × 6 Poly- cipher specialists to be used during the war. It was bius square. The Polybius square is a checkerboard designed to optimize radio operator success as well scheme that uses the letters A, , , , , and ,

Page 1 of 5 Lakehead University Theory of Cryptology Math–3375

A D F G V X from the keyword, moving down to a new row when A P 4 G all letters from the keyword have been used. D 3 5 I As an example, using the keyword MATHEMAT- F 8 V ICS, which becomes MATHEICS, we would write G 1 D 2 X out (1) as follows, V 0 A F X 7 U 9 6 MATHEICS FGDFFXAG (2) Table 1: A sample ADFGVX Polybius Square FVAAAXDF VGAGDVFX in order, as column and row identifiers, which forms Thus the characters of the digraphs form columns a grid where each element of the grid can be identi- underneath the letters of the keyword. × fied by its row and column header [3]. In this 6 6 The keyword’s letters are then rearranged in al- scheme, the English alphabet and the digits 0–9 are phabetical order, with their corresponding columns randomly placed within the grid and each of these rearranged in the same order simultaneously. In our characters maps to an unique pair of letters. Using previous example MATHEICS would be rearranged the sample Polybius square in Table 1, we get that to read ACEHIMST, and (2) would become, the number 8 encrypts to the pair FA, and the pair AD decrypts to the letter K. ACEHIMST In the original ADFGX cipher, one uses a 5 × 5 GAFFXFGD (3) Polybius square, with each letter mapping to one of VDAAXFFA the 25 different digraphs. Of course, since there are GFDGVVXA 26 letters in the English alphabet, one traditionally To complete the encryption, the left-most column would merge the letters i and j into a single letter. is now written as a string of characters horizontally, Whereupon the correct letter would be chosen con- with the topmost letters of the column becoming the textually when decrypting [1]. leftmost in the string, the bottommost becoming the Since, in each case, a single plaintext character rightmost. This is then repeated for the second col- is mapped to more than one character, umn, etc., until the last column [3]. This transforms the ADFGVX and ADFGX ciphers are examples of (3) into the final, encrypted, string: fractionating ciphers. As an example, using Table 1, the plaintext mes- GVGAD FFADF AGXXV FFVGF XDAA. (4) sage CRYPTOGRAPHY will result in the cipher- text: Now the ciphertext may now be sent with some degree of confidence that it will not be decrypted by FGDFF XAGFV AAAXD FVGAG DVFX. (1) someone without a key.

Columnar Trasposition Strengths To make this cipher even stronger, our already once encrypted text is encrypted again using columnar The ADFGVX cipher was designed to not only be transposition, or in other words, by moving around a strong cipher, but also to be a relatively simple groups of the ciphertext via some keyword. cipher to use. The major advantage of this cipher To elaborate, suppose we have some keyword (in in the context of the First World War had to do the language the message is written in), we eliminate with the substituted Polybius square. The Polybius all the duplicate letters of the keyword, preserving square generally uses the digits 1–5 to act as column the leftmost letters. For example, if the keyword is headers and row identifiers, which was suited for its ZYXY we write ZYX, rather than ZXY [1]. With original method of encoding (knocks or light flashes), the remaining ordered characters, the message is ar- however as mentioned above, these messages were ranged so that the encoded message can be read being encoded using . Thus it is most “correctly” (that is, so that if it were decrypted it sensible to use characters which have Morse equiv- would be a readable message) from left to right, and alents that are all very distinct from one another, from top to bottom. Then each character of the these being the letters A, D, F, G, V, and X. Ex- encrypted message is placed directly under a letter amples of the International Morse symbols for the

Page 2 of 5 Lakehead University Theory of Cryptology Math–3375

A · — assuming that each character maps to only one di- D — ·· graph and vice-versa, since monographs are simply F ·· — substituted with digraphs. For this same reason if G —— · an calculation was performed V ··· — on the encrypted text of 36 digraphic combinations, X — ·· — the index of coincidence would remain the same. It is when this method of substituting monographs Table 2: International Morse Symbols for ADFGVX with digraphs is coupled with the columnar transpo- sition described above, that the ADFGVX cipher be- letters A, D, F, G, V, and X can be seen in Table comes strong [3]. By the process of columnar trans- 2. By using these distinct letters, one may assume position, each of the two singular characters from ev- operator accuracy would increase [2]. ery digraph is re-associated with the character above As previously mentioned, the ADFGVX and re- it or below it in its column, or the character at the lated schemes act as a fractionating cipher since they top or bottom of the row next to it (right for top, left map each character to more than one letter, as a for bottom). In the general case, the frequency of result of the usage of the Polybius square. Addition- every digraph should be changed to something that ally, the underlying concept of the Polybius square is more reflective of a , since can be modified, adding additional dimensions, by two of the same digraphs may no longer correspond the example of the ADFGVX cipher, to aid in en- to the same plaintext letter. For 36 characters, the coding an alphabet of any length. Although, if the index of coincidence for a uniform distribution of length of the alphabet cannot be factored into di- text is 1/36 ≃ 0.027778. Using Table 1 and the key- mensions to form a Polybius square (or modified word WARIOPNCES to encrypt the text used to “Polybius rectangle”), then either some letters will compute (5), and then computing the new index of have to map to a non-unique digraph (one letter coincidence leads to the following, appears in more than one position in the square), 36 ( ) or certain digraphs will have to map back to a non- ∑ n 2 IC ≃ i ≃ 0.032469 (6) unique plaintext letter (more than one letter appears new N in one position of the square). Of course, this is also i=1 generalizable to greater than a digraphic scheme, in While this change does not seem very large, we note the sense that a single plaintext character could map that this is a change of about 36% towards the per- to a string of any length by use of a higher dimen- fectly polyalphabetic scheme for 36 letters. A sim- sional “Polybius Hyperrectangle.” However, these ilarly sized change in the index of coincidence for fractionation methods are a result of the Polybius the more familiar 26 letter scheme is comparable to square, and are also not traditionally used in the changing the index of coincidence from 0.066700 to implementation of the ADFGVX cipher. 0.056534. Which is the relative equivalent to chang- It is clear that the first operation of this cipher ing from a monoalphabetic cipher to a polyalpha- will not prevent one’s ability to break the code on betic cipher using 2 alphabets in a 26 letter scheme its own. Encryption of sample English text [4], of [5]. length 23,664 eligible characters, leads to the follow- In a sense, the columnar transposition allows a ing sample index of coincidence, letter to be distributed to more than one place at a time, which hides the properties of a typical plain- ∑36 n (n − 1) IC := i i ≃ 0.034884 (5) text message such as letter frequency, which in turn N(N − 1) i=1 hides the clues to how the message was transposed [2]. Which is exactly what is suspected of English text1 (and numbers 0–9) with the approximation that the count of numbers relative to regular text is negligible. Weaknesses Applying the Polybius square does not change the performed on text if the person Since the ADFGVX cipher is a private-key cipher decrypting knows a Polybius square is being used, it suffers from the weakness that if anyone has the full Polybius square and keyword, then decryption 1Performing this analysis on just the text, excluding the 0– 9 leads to IC ≃ 0.067622, which is in line with typical becomes a matter of writing out the message and monoalphabetic English text as well applying the algorithm in reverse.

Page 3 of 5 Lakehead University Theory of Cryptology Math–3375

As mentioned prior, the application of the Poly- columns becomes very useful now. Since the longer bius square alone is not very strong as it does not columns would have been on the left side before change any frequency analysis, since one uses strings transposition, and the shorter ones on the right, we of two characters for frequency analysis instead of can concern ourselves with finding the arrangements singular characters. With limited observance it be- of two subsets, which greatly decreases the number comes clear that the text has been encrypted using of possible arrangements. For instance, if our key- a Polybius square. In fact, the interception of the word is of length X, then there are X! possible col- first ADFGX messages having only five characters umn arrangements. However, if there are Y columns immediately led Painvin to suspect a checkerboard of greater length, then we need only concern our- encryption scheme. This is similar for the ADFGVX selves with Y !(X − Y )! arrangements, far less than cipher, except that considering the plaintext - before. To test these arrangements, one carries out bet uses contains 36 characters (with the inclusion a frequency analysis on the adjacent pairs of letters of the digits 0–9), a cryptanalyst only has to de- to see if the analysis is similar to the encoded lan- cide if the extra ten digits are to encrypt numbers guage [3]. If it is, then monoalphabetic decryption or reduce frequency clues with homophones. Once can begin to determine if the arrangement is correct. one can determine that the cipher is monoalpha- If not, then we make take further steps from the betic, frequency analysis can quickly solve the cipher results of our analysis. [2]. Thus, for people without access to the keyword, For instance, if we compare two frequency analy- breaking the ADFGVX cipher is almost entirely a ses performed on two different arrangements of the matter of undoing the columnar transposition. same message, and we find that AD is the most fre- The addition of transposition eliminates the possi- quent in the first, and DA is the most frequent in bility of breaking the cipher using frequency analysis. the second, then it is likely that a transposition has However, once one is working under the notion of occurred in the arrangement. To solve this issue, columnar transposition, breaking it can become pos- one need only switch around the two columns in one sible given a high frequency of interceptions. Take, of the arrangements, and see if it yields a better fre- for instance, a case where many operators are all us- quency analysis on all characters [1]. By matching ing the same key to encrypt their messages, and all digraph frequency to column arrangements as such, of the messages begin with HI THERE. Although in one may slowly begin to find the correct arrange- the encrypted text these letters may be positioned ment of text. differently, we note that when the text is arranged Another technique that is effective, but only when in columns the ciphertext that corresponds to HI the keyword is of even length, is to examine the THERE should be found in the first fourteen letters. frequency of a single character in the odd columns If in the column arrangements of our messages, even against the frequency of that same character in only one of these fourteen letters does not match the oth- the even columns. The reason for doing this is re- ers, then there is an error in our arrangement. Thus lated to the way in which an even keyword arranges given a large enough sample size, knowing that the the digraphs into columns. Under an even keyword, messages all begin the same (or end the same), we the first letter of each digraph always belongs to an can attempt to find the length of the keyword by odd column, and the second always belongs to an identifying the matching arrangements of letters at even column. Using Table 1 as our example, if we the beginning of the message [2]. examine a single character in the ADFGVX Polybius The issue with this strategy is that the messages square, say A, we note that when it is the first letter are not required to be of a length that is divisible by in the digraph it corresponds to one of the labels: O, the length of the keyword, and so some columns may K, Z, P, 4, or G. Whereas if A is the second letter in have up to one more additional letter than another the digraph, it corresponds to one of: O, J, 8, 1, B, column, which can change the steps one may need to or 7. When calculating the sum of the relative fre- take arrangements. However, given a large enough quencies [7] of these first six letters (assuming the oc- sample size, it is likely that there will be some of currence of digits is approximately zero), we obtain a length divisible by the keyword, or only off by a a value of 0.12297, while the sum of the relative fre- small number of characters [3]. quencies of the other six letters is 0.09152, two very Once the length of the keyword is known, the issue different frequencies. This difference should occur in becomes rearranging the columns to their original general for any character we choose, since it is likely state before the transposition. Although they were that six random characters of the Polybius square a hindrance in the prior step, the varying lengths of will have very different relative frequency from the

Page 4 of 5 Lakehead University Theory of Cryptology Math–3375 other six (with only one letter in common). Thus, FDDGG GVGDG DDXVV FVXXG ADDDV AVXDF when we test even and odd columns in this manner, DXDVG XGXDV VDVAG VXFDV VDFFA VVDXD we expect the frequency analysis to be quite differ- XFXXX XXFFD GDFFF DVVDV VAAXF GFFFD ent between even and odd columns. If not, then we GAXVG DXVXV DGXFV DADVV FVGAG VDGAV may suspect that an even column is arranged as an AFVXV XAAFV VFGGG DDDXD FVFXV GDVGV odd column (and vice-versa) [6]. VXXVD VFFXX VAVVV FVGFF VFDXF GGXGG Using all of these techniques in a large sample size FVAVG GAADV VVFDF GDDVX FVVVV FDXFV should exhaust many of the potential possibilities, VAFDV AVFXV VDGGF DAVVF XVFVV XVXVF eventually leading to the key of the cipher, or a key FDDFG GGVXF XDVXD GVAVG GAVDD GDAAV of the same length and ordering. In essence, the ADGXA VGGFG GDVXX DGVDA GVVGV GAXFF solution to an ADFGVX cipher is a combination GAXDG FGAGV GGXDV GVGXX DVFXG VVDVD of brute-force and statistical analysis, as statistical XDDAF FDGXG FXGGG VDAFF VVVDV VGVFD analysis is used to: guess the length of the keyword, AFXFG VVVVD DVDGF FVXGF VXGAX DVAGV find the original arrangement of columns, and then DVXDG FXVXV XVVVD VGVXV DDAFF VDGVX decrypt the monoalphabetic fractionated text. DXDVF VGVDV VVFVX DFFVV VVVVD VDDVV AFFVX DVXVX VVXDV AVDVG VGDGG AAFAG A Sample Passage XXGAD DFDVD FFFDV VXGGX GAAFV DFGGA XDAFG GVGDG VDGDD DGVVG DGGFG GXFGA Using the Polybius Square from Table 1 and the key- AFXVD GDXVA FGAGV GDVGF DXFGA AVGDD word WARRIOR PRINCESS, which becomes WAR- FAVFG VVDAF VVGDA A IOPNCES, we translate the plaintext [8]: References HERE THE RED QUEEN BEGAN AGAIN CAN YOU ANSWER USEFUL QUESTIONS SHE [1] Greg Goebel, Codes & Codebreakers In World SAID HOW IS BREAD MADE I KNOW THAT War I ALICE CRIED EAGERLY YOU TAKE SOME http://www.vectorsite.net/ttcode 04.html#m3 FLOUR WHERE DO YOU PICK THE FLOWER (2014). THE WHITE QUEEN ASKED IN A GARDEN OR IN THE HEDGES WELL IT ISNT [2] David Khan, The Codebreakers – The Story of PICKED AT ALL ALICE EXPLAINED ITS Secret Writing, Macmillan Publishing GROUND HOW MANY ACRES OF GROUND Company, New York, 1st edition, (1967). SAID THE WHITE QUEEN YOU MUSTNT [3] Cecily Morrison, and Ben Roberts, Codes and LEAVE OUT SO MANY THINGS FAN HER Ciphers - ADFGVX Cipher, HEAD THE RED QUEEN ANXIOUSLY http://www.srcf.ucam.org/˜bgr25/cipher/adfgvx.php INTERRUPTED SHELL BE FEVERISH AFTER (2008). SO MUCH THINKING SO THEY SET TO WORK AND FANNED HER WITH BUNCHES OF [4] James de Mille, A Strange Manuscript Found LEAVES TILL SHE HAD TO BEG THEM TO in a Copper Cylinder, Harper & Brothers, New LEAVE OFF IT BLEW HER HAIR ABOUT SO York, (1888).

Which becomes the, doubly-long, ciphertext: [5] dCode, http://www.dcode.fr/ (2016). [6] James Lyons, Practical Cryptography, VVVVV GFXFF VDXDG AAFVA GXFVF ADDAV http://practicalcryptography.com/ciphers/adfgvx- FDDVV VVVDX GFFVV XXGVG XVGFX FDXVX cipher/ FXVVV ADXVD FXVVG VVFXX VDFXV FGFDV (2012). XAXVF FFVVF DVDXV AVVGV AVGAD GDXVX DXVGG XFGGD GVGXD VFGGF VAVFG AFGFD [7] Robert Edward Lewand, Cryptological DXVGA GDGAG FFAFV GVGGX AXDVX VDDXF Mathematics, The Mathematical Association XFFFV DGDGD GGGGX DGGGG DGVFA GGFAX of America, (2000). XDDXG VGVGF AVGAF FDVFD GXXVV VXFGV GXGGD GVAFV GGAFF GVDDF DDAFD XXGVG [8] Lewis Carroll, Through the Looking-Glass, DDGFD XFVVF AXFAD VFAAG FDXXV XVXVF Macmillan Publishers, (1871). AXDFG GXXDD FDVFG FADXA DGFVV GFVDV

Page 5 of 5