Variable-Length Hill Cipher with MDS Key Matrix

Variable-length Hill Cipher with MDS Key Matrix

Kondwani Magamba †, Solomon Kadaleka †† and Ansley Kasambara †† [email protected] [email protected] [email protected] †Department of Mathematics, Catholic University of Malawi, Montfort Campus, Malawi ††Department of Mathematics and Statistics, Polytechnic College, University of Malawi, Blantyre, Malawi mostly used is: A = 0, B = 1, . . . , Z = 25. Additionally, an Summary 푚 × 푚 matrix 퐾 with entries from ℤ푛 is chosen as the The Hill Cipher is a classical symmetric cipher which breaks secret key. The encryption and decryption is then given as; plaintext into blocks of size 푚 and then multiplies each block by an 푚 × 푚 key matrix to yield ciphertext. However, it is well 퐶 = 퐾푃 mod 푛 푃 = 퐾퐶 mod 푛 (1) known that the Hill cipher succumbs to cryptanalysis relatively easily. As a result, there have been efforts to strengthen the where 퐶 is the ciphertext. For encryption to be possible 푃 cipher through the use of various techniques e.g. permuting rows and columns of the key matrix to encrypt each plaintext vector is divided into substrings of length 푚. If 푙, the length of with a new key matrix. In this paper, we strengthen the security the plaintext, is not divisible by 푚 then the plaintext block of the Hill cipher against a known-plaintext attack by encrypting must be padded with extra characters. Another each plaintext matrix by a variable-length key matrix obtained complication arises when 퐾 is not invertible. A necessary from a Maximum Distance Separable (MDS) master key matrix. and sufficient condition for a matrix 퐾 over ℤ푛 to be Key words: invertible is that the determinant of 퐾 be non-zero and co- Hill cipher, Cryptanalysis, MDS codes prime to 푛. See [7] for the size of the keyspace of a Hill cipher. 1. Introduction 3. Literature Review The Hill cipher is a classical symmetric cipher based on matrix transformation. It has several advantages including Several research works have been done with an aim to its resistance to frequency analysis and simplicity due to improve the security of Hill cipher. For example, a method the fact that it uses matrix multiplication and inversion for HCM-PT found in [5] tries to make Hill cipher secure by encryption and decryption. However, it succumbs to the using dynamic key matrix 퐾푡 obtained by random known-plaintext attack [4] and as such there have been permutations 푀푡 of columns and rows of the master key efforts to strengthen the cipher through the use of matrix and transfers an encrypted plaintext and encrypted various techniques which have improved the security of permutation vector 푢 = 퐾푡 to the receiving side. The the cipher quite significantly [1],[2],[3]. Unlike the number of dynamic keys generated is 푚! where 푚 refers original Hill cipher the proposed modifications of the Hill to the size of the key matrix. cipher have practical applications including image encryption see for example [6]. HCM-PT Encryption The method employed in this paper aims at generating 퐾 = 푀 퐾푀−1 dynamic variable-length key matrices from a given shared 푡 푡 푡 퐶 = 퐾푡 푃 MDS master key matrix. With the generated matrices, each plaintext matrix is encrypted using a different key 푢 = 퐾푡 matrix and this renders the ciphertext immune to a known- plaintext attack. Even though this method thwarts the known-plaintext The rest of the paper is divided as follows; in Section 2 the attack on the plaintext, it is vulnerable to known-plaintext basic concept of a Hill cipher is outlined, a literature attack on the permutation vector. This prompted [11] to review is in Section 3, Sections 4 and 5 present a modify HCM-PT. discussion on MDS matrices and in Section 6 the proposed algorithm is described. The modification to HCM-PT called SHC-M found in [11] works the same way as HCM-PT but does not transfer the permutation vector, instead both the sender and the receiver use a pseudo-random permutation generator, 2. Hill Cipher 푡 = 푃푅푃퐺 푠푒푒푑, 푛 and only the number of the necessary permutation is transferred to the receiver. The number of Let us consider a plaintext string 푃 of length 푙 defined over dynamic keys is the same as HCM-PT. an alphabet of order 푛. Each letter is mapped to an element of ℤ푛 . Often 푛 = 26 is used and the correspondence 2

SHC-M Encryption 6. Variable-Length Hill Cipher 1. Sender, S, selects a number, 푛 2. calculates 푡 = 푃푅푃퐺 푠푒푒푑, 푛 As pointed out earlier SHC-M has a dynamic key space 3. S → R: C, 푛 xor 푆퐸퐸퐷 (NDK) of 푚! but in this section we show that by requiring 4. Proceed as in HCM − PT that the key matrix be MDS we increase the key space of SHC-M. We also develop a way of encrypting variable- Another modification to the Hill cipher is found in [6] length plaintext blocks. which tries to improve the security of Hill cipher by using a one-time-one key matrix to encrypt each plaintext block. 6.1 Enlarging the Key space This unique key is computed by multiplying the current key with a secret Initial Vector (IV). However, [10] shows We will work as follows; suppose the secret shared MDS that the method is prone to a known-plain text attack. matrix 퐾 is of size 푚 × 푚 where 푚 ≥ 3. The cases The method presented here works the same way as 푚 = 1, 2 are trivial. Then for each of the 푚! permutations SHC-M but our method requires that the master key matrix of the columns of 퐾we can form 푗 × 푗 invertible be MDS. With an MDS master key, the key space is bigger submatrices where 푗 ∈ 2. . 푚 . Each of these matrices can than that of SHC-M and our method can be used to encrypt be used for encryption in our method. This can be done by plaintext blocks of variable length. As far as we know this choosing columns of a required length from any of the is the first instance of a variable length Hill cipher. permutations. Alternatively, we can start by choosing 푗 columns from a total of 푚 columns. We then form 푗 × 푗 invertible 4. MDS Matrices 푚 submatrices from the 푗 columns. Each possible Let 퐶 be a linear code of length 푛, dimension 푘, and submatrix is then permuted to give 푗! permuted distance 푑 with parameters 푛, 푘, 푑 . A generator matrix 퐺 submatrices. We do this for all possible matrix lengths for a linear 퐶 is a 푘 × 푛 matrix whose rows form a basis 푗 ∈ 2. . 푚 . It is not difficult to see that the total number of for 퐶. Linear codes obey the Singleton bound, dynamic matrices (NDK) is; 푑 ≤ 푛 − 푘 + 1. If a linear code meets the Singleton bound, 푚 푚 2 푑 = 푛 − 푘 + 1, then it is called a Maximum Distance NDK = 푗! 푗 =2 푗 Separable code or MDS code. Also, an 푛, 푘, 푑 -error Obviously, this number gets bigger as 푚 increases. AS fra correcting code with generator matrix 퐺 = 퐼푘×푘 |퐴 where as we know, the number of dynamic keys found here is the 퐼푘×푘 is the 푘 × 푘 identity matrix, and 퐴 is a 푘 × 푛 − 푘 best of all the known modifications of the Hill cipher. The matrix, is MDS if and only if every square submatrix of 퐴 best thing with this method is that it applies to any Hill is nonsingular [8]. In most applications e.g. [12] the code cipher modification which proposes dynamic matrix keys. 퐶 is taken as a linear code with parameters [2푘, 푘, 푘 + 1]and generator matrix 퐺 = 퐼푘×푘 |퐴푘×푘 . We illustrate the method with an example. Let When we take a generator matrix of a linear MDS code with parameters [2푘, 푘, 푘 + 1] as 퐺 = 퐼 |퐴 , the 푘×푘 푘×푘 푘11 푘12 푘13 푘 × 푘 matrix 퐴 is the matrix we call MDS matrix. MDS 퐾 = 푘 푘 푘 matrices are an important building block adopted by 21 22 23 푘31 푘32 푘33 different block ciphers as they guarantee fast and effective diffusion in a small number of rounds. See for example be a 3 × 3 MDS secret key matrix. Then applying the [12]. In this paper MDS matrices are used as key matrices formula above we have 18 matrices of order 2 × 2 and 6 for a Hill cipher. This modification increases the key space matrices of order 3 × 3 giving a total of 24dynamic of SHC-M because in an MDS matrix every square matrices. We now show how to generate all the 24 submatrix is non-singular which is the requirement for a matrices. key matrix of a Hill cipher.

We begin by choosing 2 columns from 퐾. We get the

5. Generating MDS Matrices 푘11 푘12 푘11 푘13 푘12 푘13 following matrices, 푘21 푘22 , 푘21 푘23 and 푘22 푘23 . 푘31 푘32 푘31 푘33 푘32 푘33 There are different methods of generating MDS matrices From these matrices we form 2 × 2 matrices e.g from e.g. using Cauchy and Hadamard matrices. MDS matrices 푘 푘 11 12 푘11 푘12 푘11 푘12 푘21 푘22 can also be generated exhaustively but probably the most 푘 푘 we get , and and 21 22 푘21 푘22 푘31 푘32 푘31 푘32 popular method is to use Reed Solomon Codes. 푘31 푘32 column-permuted versions of these matrices. Continuing

−1 in the same fashion and considering the 6 permutations of Calculate 푀푡 ; Calculate 퐾푡 = 푀푡 퐾푀푡 퐾 we get all 24 dynamic matrices. Get 푚 from control key It can be seen that the key space of SHC-M is now Obtain 퐾푚 from 퐾푡 2 Decrypt using 푃 = 퐾−1퐶 푚 푚 푗! which is a significant increase from 푚!. This 푚 푚 푚 푗=2 푗 end number doubles when we permute the dynamic matrices in a row wise manner. The key space can also be increased From the encryption algorithm one easily sees that the by using 푗 × 푗 matrices to form 2푗 × 2푗 matrices for sender and receiver need to find a way of obtaining a encryption. For example, if we have the 4 푗 × 푗 matrices length 푚 encryption matrix 퐾푚 from 퐾푡 . One way could 퐾1, 퐾2, 퐾3 and 퐾4 we may, 푤푙표푔, form a new matrix; be to take matrices in a left-right then top-bottom manner. 퐾1 퐾2 It is worth noting that there are different ways of choosing 퐾∗ = variable-length matrices from 퐾 and this strengthens the 퐾3 퐾4 푡 proposed algorithm even more. 6.2. Encryption and Decryption

Encryption is the same as SHC-M but we require that we 7. Security Analysis choose a control key (preferably binary) to govern The main limitation with Hill Cipher is that it is prone to encryption, see [13]. This key determines the length of a known plaintext attack, that is, if an attacker has 푚 block to encrypt i.e. when to encrypt a 2 × 2 block, a distinct plaintext and ciphertext pairs then they can 3 × 3 block and so on. To avoid having too many secret retrieve the key by solving 퐾 = 푃−1퐶 or 퐾 = 퐶푃−1 keys we can use the seed as the control key but a problem depending on the encryption equation used [6]. If 푃 is not may arise when we have to expand the seed due to a long invertible, then other sets of 푚 plaintext-ciphertext pairs plaintext. Once we have the binary control key we set up a have to be tried. By encrypting variable-length blocks of correspondence between binary strings and block lengths. plaintext the proposed method is safe from known For example, if 푚 = 4 then the possible lengths of plaintext attacks. Also the proposed method inherits subkeys are 2, 3 and 4 therefore we can have the following immunity from a cipher-text only attack from the original set up; Hill cipher. The proposed method does not, however, 01 → 2, 10 → 3 and 11 → 4. introduce sufficient confusion and diffusion to be ranked

among the best and efficient symmetric key This means we read the control key two bits at a time and cryptosystems. It is worth pointing out that the proposed encrypt block lengths accordingly. It is therefore very algorithm relies on many matrix transformations and this important that the control key be random to prevent slows down the algorithm. attacks. We now give a full description of the proposed method: Algorithm 1: Encryption 8. Conclusion Data: Plaintext P, K, SEED, Control Key Result: Cipher Text, C We have presented a variable-length Hill cipher which is a begin modification of the original Hill cipher. It is also based on 푟 ← block number SHC-M which is itself also a modification of the Hill 푡 = PRPG(SEED, 푟) ← a permutation order cipher. Our method strengthens the Hill cipher against 푀푡 ← a permutation from 푡 known-plaintext and ciphertext only attacks. The former is −1 퐾푡 = 푀푡 퐾푀푡 ← encryption matrix due to the use of dynamically changing key matrices. Our Determine block length 푚 from control key method is better than the methods found in the review in Obtain a length 푚 encryption matrix 퐾푚 from 퐾푡 that it has a larger key space. On the downside our method Get plaintext of length 푚, 푃푚 , from P is not efficient due to the use of many matrix operations. Encrypt using 퐶 = 퐾푚 푃푚 Send C, r xor SEED end

Algorithm 2: Decryption Data: Ciphertext C, K, SEED, Control Key Result: Plaintext, P begin Calculate 푡 from SEED and 푟

References [1] V. U. K. Sastry, D. S. R. Murthy, S. Durga Bhavani, “A Block Cipher Involving a Key Applied on Both the Sides of the Plain Text,” International Journal of Computer and Network Security (IJCNS), Vol. 1, No. 1, pp. 27 -30, Oct. 2009. [2] V. U. K. Sastry, V. Janaki, “A Modified Hill Cipher with Multiple Keys”, International Journal of Computational Science, Vol. 2, No. 6, 815-826, Dec. 2008. [3] Bhibhudendra Acharya, Girija Sankar Rath, and Sarat Kumar Patra, “Novel Modified Hill Cipher Algorithm,” Proceedings of ICTAETS, pp. 126-130, 2008 [4] D.R. Stinson, “Cryptography Theory and Practice,” 3rd edition, Chapman and Hall/CRC, pp.13-37, 2006. [5] Shahrokh Saeednia, “How to Make Hill cipher Secure,” Cryp- tologia 24:4, pp. 353-360, Oct 2000. [6] Ismail I A, Amin Mohammed, Diab Hossam, “How to Repair the Hill Cipher,” Journal of Zhejiang University Science, 7(12), pp. 2022-2030, 2006. [7] Overbey, J., Traves, W., and Wojdylo, J., “On the keyspace of the Hill cipher,” Cryptologia, 29(l), pp. 59-72, 2005. [8] F. J. McWilliams and N. J. A. Sloane, “The Theory of Error Correcting Codes,” Amsterdam, The Netherlands, North Holland, 1977. [9] Sriram Ramanujam and Marimuthu Karuppiah, “Designing an algorithm with high Avalanche Effect,” IJCSNS International Journal of Computer Science and Network Security, VOL.11, No.1, 106-111 January 2011. [10] Romero, Y. R. Garcia, R. V. et al., “Comments on How to Repair the Hill Cipher,” J. Zhejiang Univ. Sci. A 9(2): pp. 211-214, 2008. [11] Chefranov, A. G., “Secure Hill Cipher Modification,” Proc. Of the First International Conference on Security of Information and Network (SIN2007) 7-10 May 2007, Gazimagusa (TRNC) North Cyprus, Elci, A., Ors, B., and Preneel, B (Eds) Trafford Publishing, Canada, 2008: pp 34- 37, 2007. [12] Daemen, J., Rijmen, V.,The Design of Rijndael: “AES – The Advanced Encryption Standard.” Springer, 2002. [13] Monem, A.M., Rahma1 and Yacob, B.Z.,”The Dynamic Dual Key Encryption Algorithm Based on joint Galois Fields,” IJCSNS International Journal of Computer Science and Network Security, VOL.11, No.8, August 2011.