
Advances in Mathematics of Communications doi:10.3934/amc.2019045 Volume 13, No. 4, 2019, 779–843

CRYPTOGRAPHICALLY SIGNIFICANT MDS MATRICES OVER FINITE FIELDS: A BRIEF SURVEY AND SOME GENERALIZED RESULTS

Kishan Chand Gupta, Applied Statistics Unit, Indian Statistical Institute, 203, B.T. Road, Kolkata-700108, India

Sumit Kumar Pandey*, Ashoka University, Sonepat, Haryana, India

Indranil Ghosh Ray, School of Engineering and Mathematical Sciences, City University London, London EC1V 0HB, United Kingdom

Susanta Samanta, Applied Statistics Unit, Indian Statistical Institute, 203, B.T. Road, Kolkata-700108, India

Abstract. A square matrix is MDS or super-regular if and only if every square submatrix of it is nonsingular. MDS matrices provide perfect diffusion in block ciphers and hash functions. In this paper we provide a brief survey on cryptographically significant MDS matrices - a first to the best of our knowledge. In addition to providing a summary of existing results, we make several contributions. We exhibit some deep and nontrivial interconnections between different constructions of MDS matrices. For example, we prove that all known Vandermonde constructions are basically equivalent to Cauchy constructions. We prove some folklore results which are used in the MDS matrix literature. Wherever possible, we provide some simpler alternative proofs. We do not discuss efficiency issues or hardware implementations; however, the theory accumulated and discussed here should provide an easy guide towards efficient implementations.

1. Introduction

Claude Shannon, in his paper "Communication Theory of Secrecy Systems" [60], introduced the concepts of confusion and diffusion, which play key roles in the design of block ciphers and hash functions. The idea of confusion is to make the statistical relation between the ciphertext and the message too complex to be exploited by the attacker; it is achieved by nonlinear functions like S-boxes and Boolean functions. Diffusion ensures that each bit of the message and each bit of the secret key influence many bits of the ciphertext, and that after a few rounds all the output bits depend on all the input bits. One possibility of formalizing the notion of perfect diffusion is

2010 Mathematics Subject Classification: Primary: 58F15, 58F17; Secondary: 53C35.
Key words and phrases: Diffusion, MDS matrix, branch number.
* Corresponding author: Sumit Kumar Pandey.

the concept of multipermutation, which was introduced in [59, 65]. Another way to define it is using branch numbers and Maximum Distance Separable (MDS) matrices [15]. In [30, 31, 32], Heys and Tavares showed that the replacement of the permutation layer of Substitution Permutation Networks (SPNs) with a diffusive linear transformation improves the avalanche characteristics of the block cipher, which increases the cipher's resistance to differential and linear cryptanalysis. Thus the main application of MDS matrices in cryptography is in designing block ciphers and hash functions that provide security against differential and linear cryptanalysis. MDS matrices offer diffusion properties and are one of the vital constituents of modern age ciphers and hash functions. The idea of an MDS matrix comes from MDS codes, and in this survey we will discuss the construction of MDS matrices.

A great deal of research on MDS matrices with cryptography in mind was done during the period 1994 to 1998. In the year 1994, Schnorr and Vaudenay [59] introduced multipermutations as a formalization of the diffusion layer. In 1995, Vaudenay [65] showed the usefulness of multipermutations in the design of cryptographic primitives. During 1994 to 1996, Heys and Tavares [30, 31, 32] showed that the replacement of the permutation layer of Substitution Permutation Networks (SPNs) with a diffusive linear transformation improves the avalanche characteristics of the block cipher, which increases the cipher's resistance to differential and linear cryptanalysis. In the year 1996, Rijmen et al. were the first to use MDS matrices in the cipher SHARK [50], and later, in 1997, Daemen et al. used MDS matrices in the cipher SQUARE [14]. Then, in the year 1998, Daemen and Rijmen used a circulant MDS matrix in the cipher AES [15]. During the period 1998 to 1999, Schneier et al. used an MDS matrix in the block cipher Twofish [57, 58]. Now the usefulness of MDS matrices in the diffusion layer is well established. The stream cipher MUGI [66] uses the AES MDS matrix in its linear transformations. MDS matrices are also used in the design of hash functions. Hash functions like Whirlpool [5, 61], SPN-Hash [12], Maelstrom [16], Grøstl [17] and the PHOTON [19] family of lightweight hash functions use MDS matrices for their diffusion layers. In 2011, the authors of the hash function PHOTON [19] and the block cipher LED [20] used MDS matrices constructed from companion matrices, which opened a new area of research in the construction of MDS matrices for lightweight cryptography.

We provide a brief sketch of the construction of cryptographically significant MDS matrices. There are two main approaches to constructing MDS matrices - nonrecursive and recursive. In recursive constructions, we generally start with a companion matrix A of order n, with a proper choice of coefficients of the characteristic polynomial such that A^n is an MDS matrix. Recursive constructions are very popular for lightweight applications. In nonrecursive constructions, the constructed matrices are themselves MDS.

Another way to classify the techniques used to find MDS matrices is based on whether the matrix is constructed directly or a search method is employed by enumerating some search space. Direct constructions use algebraic properties to find MDS matrices, while in search methods, elements of the matrix are judiciously selected and it is checked whether the matrix is MDS or not. It may be noted that the problem of verifying whether a matrix is MDS or not is NP-complete.
Hence, the search technique is useful only for finding MDS matrices of small orders. There are two main direct methods for constructing nonrecursive MDS matrices - one is from a Cauchy matrix and the other is from two Vandermonde matrices. These methods provide MDS matrices of any order, but these matrices are generally

not efficient for implementation. So, we use the search method in nonrecursive constructions that output efficiently implementable MDS matrices. One popular technique for such constructions is to search for the elements of a circulant matrix. AES [15] uses a circulant MDS matrix. There are several circulant-like matrices [24] and generalized circulant matrices [44] which are also used in such constructions. Toeplitz matrices and Hankel matrices are deeply interconnected with circulant matrices. Recently, Toeplitz matrices have been used to construct MDS matrices [55, 56]. Similar to circulant, circulant-like and generalized circulant MDS matrices, search methods are used to construct Toeplitz MDS matrices. As in nonrecursive constructions, there are several direct recursive constructions as well. However, as before, they are not so efficient for implementation, and search methods provide efficient MDS matrices of low order. In the general context of implementation of block ciphers, we note that if an efficient MDS matrix M used in encryption happens to be involutory or orthogonal, then its inverse M^{-1} applied for decryption will also be efficient. So, it is of special interest to find efficient MDS matrices which are also involutory or orthogonal.

Our contribution: We believe that this is the first survey on MDS matrices. While most of the results in this paper are already known, some results and insights are new. In Theorem 5.1 we provide a nontrivial and deep interconnection between all the known Cauchy based constructions and their corresponding Vandermonde based constructions. In [24], it was proved that Type-I circulant-like MDS matrices of even order cannot be involutory or orthogonal, but the case of odd orders was not discussed. In Lemma 6.14 and Lemma 6.16 we prove that Type-I circulant-like MDS matrices of odd order are neither orthogonal nor involutory. In Lemma 1 of [44], the authors provided a necessary and sufficient condition for the equivalence between two circulant matrices. In Lemma 6.22 we provide a simpler alternative proof. In Remark 30 we point out the interconnection that a left-circulant matrix is nothing but a row-permuted circulant matrix. It may be noted here that a left-circulant matrix is symmetric while a circulant matrix is not. Using this interconnection, we propose an idea to find involutory left-circulant MDS matrices of order n, where n is not a power of 2. In [44] it was proved that left-circulant matrices of order 2^n are not involutory. In Theorem 6.19, we show that this result easily follows from the above interconnection and known results on circulant matrices. A similar connection shows up between Hankel and Toeplitz matrices. A Hankel matrix is a row-permuted form of a corresponding Toeplitz matrix. By itself, a Toeplitz matrix is not symmetric. However, the corresponding Hankel matrix is symmetric. MDS matrices have been constructed from Toeplitz matrices in [55, 56]. In Section 7, we use the above interconnection to easily extend these constructions to MDS matrices from Hankel matrices. As in the case of circulant matrices, we use the above interconnection to prove Theorem 7.5, stating that a Hankel matrix of order 2^n is not involutory. We prove some folklore results which are often used in the literature, mostly without formal proofs. For example, in Corollary 2 and Corollary 3, we prove that if A is MDS, then A^T and A^{-1} are also MDS. In Lemma 2.5 we show that if A is an MDS matrix over F_{2^r}, then A', obtained by multiplying a row (or column) of A with any c ∈ F_{2^r}^*, is MDS as well. We prove that for any two permutation matrices P and Q

and any two nonsingular diagonal matrices D_1 and D_2, if A is MDS then so are D_1AD_2 (Corollary 1) and PAQ (Corollary 4). We find a gap in one of the lemmas of the paper [2, Lemma 2] and then provide the correct statement in Subsection 8.2. The result is stated in Lemma 8.7, followed by an example which shows the existence of a gap in the statement of Lemma 2 of the paper [2].

The organization of the paper is as follows. In Section 2 we provide definitions and preliminaries with proofs of some folklore results. In arranging the constructions and the associated results, we follow the classification of MDS matrix constructions that has been previously described. In Sections 3 and 4, we describe direct nonrecursive constructions. In Section 3, we point out the various constructions of MDS matrices from Cauchy matrices. In Section 4 we provide the various constructions of MDS matrices from Vandermonde matrices. In Section 5, we point out the interconnection and equivalence between these two constructions. To overcome the inefficiencies of direct constructions, we move on to constructions using the search method in Sections 6 and 7. In Section 6 we point out the constructions of MDS matrices from circulant matrices and their variants, while in Section 7 we provide the constructions of MDS matrices from Toeplitz and Hankel matrices. In Section 8 we describe direct constructions of recursive MDS matrices, i.e. powers of some companion matrices. In Section 9 we conclude the paper.

2. Definition and preliminaries

Let F_2 = {0, 1} be the finite field of two elements, F_{2^r} be the finite field of 2^r elements and F_{2^r}^* be the multiplicative group of F_{2^r}. Elements of F_{2^r} can be represented as polynomials of degree less than r over F_2. For example, let β ∈ F_{2^r}; then β can be represented as Σ_{i=0}^{r−1} b_i α^i, where b_i ∈ F_2 and α is the root of the constructing polynomial of F_{2^r}. Again, if α is a primitive element of F_{2^r}, all the nonzero elements of F_{2^r} can be expressed as powers of α. Therefore F_{2^r} = {0, 1, α, α^2, α^3, . . . , α^{2^r−2}}. Another representation uses hexadecimal digits. Here the hexadecimal digits are used to denote the coefficients of the corresponding polynomial representation. For example α^7 + α^4 + α^2 + 1 = 1·α^7 + 0·α^6 + 0·α^5 + 1·α^4 + 0·α^3 + 1·α^2 + 0·α + 1 = (10010101)_2 = 95_x ∈ F_{2^8}.

An MDS matrix provides diffusion properties that have useful applications in cryptography. The idea comes from coding theory, in particular from maximum distance separable (MDS) codes.
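To make the polynomial, binary and hexadecimal representations concrete, the following minimal Python sketch (ours, not from the paper; the function name gf_mul and the integer bit-vector encoding are our choices) multiplies elements of F_{2^r} modulo a constructing polynomial, here x^4 + x + 1, the polynomial used in most examples of this survey.

```python
# A minimal sketch (not from the paper): an element of F_{2^r} is stored as an
# integer whose bits are the coefficients b_{r-1} ... b_1 b_0 of its polynomial
# representation.  The default modulus is x^4 + x + 1 for F_{2^4}.

def gf_mul(a, b, poly=0b10011, r=4):
    """Multiply two elements of F_{2^r} modulo the constructing polynomial."""
    p = 0
    for _ in range(r):
        if b & 1:
            p ^= a                 # add a (XOR is addition in characteristic 2)
        b >>= 1
        a <<= 1                    # multiply a by alpha
        if a & (1 << r):
            a ^= poly              # reduce modulo the constructing polynomial
    return p

# Hexadecimal representation: alpha^7 + alpha^4 + alpha^2 + 1 in F_{2^8} has
# coefficient vector (10010101)_2, i.e. 95 in hexadecimal ("95_x" above).
assert 0b10010101 == 0x95

# In F_{2^4} with x^4 + x + 1, alpha = 0b0010 is primitive: its powers run
# through all 15 nonzero elements and alpha^15 = 1.
alpha = 0b0010
powers, x = [], 1
for _ in range(15):
    x = gf_mul(x, alpha)
    powers.append(x)
assert len(set(powers)) == 15 and powers[-1] == 1
print(bin(gf_mul(gf_mul(gf_mul(alpha, alpha), alpha), alpha)))   # alpha^4 = alpha + 1 = 0b11
```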

Definition 2.1. [21] Let F be a finite field and p and q be two integers. Let x → M × x be a mapping from F^p to F^q defined by the q × p matrix M. We say that M is an MDS matrix if the set of all pairs (x, M × x) is an MDS code, i.e. a linear code of dimension p, length p + q and minimal distance q + 1.

In this context we state two important theorems from coding theory.

Theorem 2.2. (The Singleton bound) [45, page 33] Let C be an [n, k, d] code. Then d ≤ n − k + 1.

Definition 2.3. A code with d = n − k + 1 is called a maximum distance separable code, or MDS code in short.

Theorem 2.4. [45, page 321] An [n, k, d] code C with generator matrix G = [I | A], where A is a k × (n − k) matrix, is MDS if and only if every square submatrix (formed

from any i rows and any i columns, for any i = 1, 2, . . . , min{k, n − k}) of A is nonsingular.

The following fact is another way to characterize an MDS matrix.

Fact 1. A square matrix A is an MDS matrix if and only if every square submatrix of A is nonsingular.

Fact 2. All square submatrices of an MDS matrix are MDS.

Now we briefly record the MDS conjecture in the following fact.

Fact 3. (MDS Conjecture) [33][45, page 328] Let C be an [n, k, d] linear MDS code over F_q. Then n ≤ q + 1 for 2 ≤ k ≤ q, and n ≤ k + 1 for q < k, except for k ∈ {3, q − 1} and q even, in which case the code has length at most q + 2.

In this paper we are interested in the cases for which n ≤ q + 1 holds. We also like to mention that [n, 1, n], [n, n − 1, 2] and [n, n, 1] codes are called trivial MDS codes; other MDS codes are called nontrivial.

We now prove some folklore results on MDS matrices. One of the elementary row operations on matrices is multiplying a row of a matrix by a nonzero scalar. The MDS property remains invariant under such operations. So we have the following lemma.

Lemma 2.5. Let A be an MDS matrix over F_{2^r}. Then A', obtained by multiplying a row (or column) of A by any c ∈ F_{2^r}^*, is also MDS.

Proof. Take an arbitrary square submatrix B' of A'. Suppose B is the corresponding submatrix of A. If the submatrix contains the row (or column) which has been multiplied by c, then det(B') = c · det(B); otherwise det(B') = det(B). Since A is MDS, we have det(B') ≠ 0. Therefore A' is MDS.

We generalize Lemma 2.5 as follows.

Corollary 1. Let A be an MDS matrix. Then for any nonsingular diagonal matrices D_1 and D_2, D_1AD_2 is also an MDS matrix.

Proof. Let D_1 = diag(c_0, c_1, . . . , c_{n−1}) and D_2 = diag(d_0, d_1, . . . , d_{n−1}). Multiplying A by D_1 on the left multiplies the i-th row of A by c_i for 0 ≤ i ≤ n − 1. Therefore, by Lemma 2.5, D_1A is an MDS matrix. Multiplying D_1A by D_2 on the right multiplies the j-th column of D_1A by d_j for 0 ≤ j ≤ n − 1. Therefore, by Lemma 2.5, D_1AD_2 is an MDS matrix.

Note that the converse of Corollary 1 is also true. In the following two corollaries two important properties of MDS matrices are studied.

Corollary 2. If A is an MDS matrix, then A^T is also an MDS matrix.

Proof. Consider an arbitrary submatrix of order k of A^T obtained by choosing, say, the i_1, i_2, . . . , i_k-th rows and the j_1, j_2, . . . , j_k-th columns. Denote this submatrix by A^T(i_1, i_2, . . . , i_k | j_1, j_2, . . . , j_k). It is easy to check that

    A^T(i_1, i_2, . . . , i_k | j_1, j_2, . . . , j_k) = A(j_1, j_2, . . . , j_k | i_1, i_2, . . . , i_k)^T.

Now

    det(A^T(i_1, i_2, . . . , i_k | j_1, j_2, . . . , j_k)) = det(A(j_1, j_2, . . . , j_k | i_1, i_2, . . . , i_k)^T) = det(A(j_1, j_2, . . . , j_k | i_1, i_2, . . . , i_k)).

Since A is MDS, we have det(A(j_1, j_2, . . . , j_k | i_1, i_2, . . . , i_k)) ≠ 0. Therefore det(A^T(i_1, i_2, . . . , i_k | j_1, j_2, . . . , j_k)) ≠ 0, and hence A^T is MDS.

Corollary 3. The inverse of an MDS matrix is MDS.

Proof. Suppose G = [I | A] is a generator matrix of an MDS code. Elementary row operations change G = [I | A] to G' = [A^{−1} | I]. As elementary row operations do not change the code, G' is also a generator matrix of the MDS code. So the code defined by [I | A^{−1}] has the same minimal distance. Therefore A^{−1} is an MDS matrix.

The diffusion power of a linear transformation (specified by a matrix) is measured by its branch numbers [15, pages 130–132].

Definition 2.6. [15, page 132] The differential branch number β_d(M) of a matrix M of order n over the finite field F_{2^r} is defined as the minimum number of nonzero components in the input vector x and the output vector Mx as we range over all nonzero x ∈ (F_{2^r})^n, i.e.

    β_d(M) = min_{x≠0}(w(x) + w(Mx)),

where w(x) denotes the weight of the vector x, i.e. the number of nonzero components of the vector x.

Note that the differential branch number of a matrix M is exactly the distance of the linear code generated by the matrix [I | M].

Definition 2.7. [15, page 132] The linear branch number β_l(M) of a matrix M of order n over the finite field F_{2^r} is defined as the minimum number of nonzero components in the input vector x and the output vector M^T x as we range over all nonzero x ∈ (F_{2^r})^n, i.e.

    β_l(M) = min_{x≠0}(w(x) + w(M^T x)),

where w(x) denotes the weight of the vector x.

Remark 1. [15, page 132] Note that the maximal value of β_d(M) and β_l(M) is n + 1. In general β_d(M) ≠ β_l(M), but if a matrix has the maximum possible differential or linear branch number, then both branch numbers are equal. Therefore the following fact is another characterization of an MDS matrix.

Fact 4. [15] A square matrix A of order n is MDS if and only if β_d(A) = β_l(A) = n + 1.

In this paper we discuss MDS matrices, and since for them both branch numbers are equal, we will call β_d(M) and β_l(M) simply the branch number, denoted β_M.

Definition 2.8. A permutation matrix P is a binary matrix which is obtained from the identity matrix by permuting its rows (or columns).

Note that a permutation matrix is invertible with P^{−1} = P^T, and the product of two permutation matrices is a permutation matrix. Permuting rows (or columns) does not change the branch number of a matrix. So we have the following result from [44], with a different proof.


Proposition 1. For any permutation matrices P and Q, the branch numbers of the two matrices M and PMQ are the same.

Proof. Suppose that x is a nonzero vector such that

    w(x) + w(Mx) = β_M.

Note that the inverse of a permutation matrix and the product of two permutation matrices are again permutation matrices. Also, multiplication by a permutation matrix does not change the weight of a vector. Therefore for y = Q^{−1}x we have

    w(y) + w(PMQy) = w(Q^{−1}x) + w(PMx) = w(x) + w(Mx) = β_M.

Since β_{PMQ} = min_{y≠0}(w(y) + w(PMQy)), we have β_{PMQ} ≤ β_M. Again suppose that x is a nonzero vector such that w(x) + w(PMQx) = β_{PMQ}. Let y = Qx. Now

    w(y) + w(My) = w(Qx) + w(MQx) = w(x) + w(PMQx) = β_{PMQ}.

Therefore β_M ≤ β_{PMQ}. Hence β_M = β_{PMQ}.

Corollary 4. If A is an MDS matrix, then for any permutation matrices P and Q, PAQ is an MDS matrix.

Proof. From Proposition 1, we know that A and PAQ have the same branch number. Also, from Fact 4, we know that a matrix is MDS if and only if it attains the optimal branch number. Therefore PAQ is MDS.

Many modern block ciphers use MDS matrices as a vital constituent to incorporate the diffusion property. In general, two different modules are needed for the encryption and decryption operations. In [68], the authors proposed a special class of SPNs that uses the same network for both the encryption and decryption operations. The idea was to use involutory MDS matrices for incorporating diffusion.

Definition 2.9. A square matrix M is called an involutory matrix if it satisfies the condition M^2 = I, i.e. M = M^{−1}.

For lightweight cryptographic applications, it is desirable to have matrices whose elements are of low Hamming weight with as many zeros as possible in the higher order bits. If such a matrix is involutory (or orthogonal), then encryption and decryption can be implemented with the same (or almost the same) circuitry and the same computational cost.

Definition 2.10. A square matrix M is called an orthogonal matrix if MM^T = I.
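Fact 1 gives a direct, if expensive, test for the MDS property, and by Fact 4 it is equivalent to the matrix attaining branch number n + 1. The Python sketch below (ours, not from the paper; all helper names are our choices) implements the submatrix test over F_{2^r} and applies it to the AES MixColumns circulant circ(02, 03, 01, 01) over F_{2^8} with the AES polynomial x^8 + x^4 + x^3 + x + 1, a well-known MDS matrix that is not one of this paper's examples.

```python
from itertools import combinations

def gf_mul(a, b, poly, r):
    """Multiply two elements of F_{2^r} modulo the constructing polynomial."""
    p = 0
    for _ in range(r):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & (1 << r):
            a ^= poly
    return p

def gf_inv(a, poly, r):
    """Inverse of a nonzero element via a^(2^r - 2)."""
    result, base, e = 1, a, (1 << r) - 2
    while e:
        if e & 1:
            result = gf_mul(result, base, poly, r)
        base = gf_mul(base, base, poly, r)
        e >>= 1
    return result

def gf_det(M, poly, r):
    """Determinant over F_{2^r} by Gaussian elimination (char 2: no signs)."""
    M = [row[:] for row in M]
    n, det = len(M), 1
    for c in range(n):
        piv = next((i for i in range(c, n) if M[i][c]), None)
        if piv is None:
            return 0
        M[c], M[piv] = M[piv], M[c]
        det = gf_mul(det, M[c][c], poly, r)
        inv = gf_inv(M[c][c], poly, r)
        for i in range(c + 1, n):
            if M[i][c]:
                f = gf_mul(M[i][c], inv, poly, r)
                M[i] = [M[i][j] ^ gf_mul(f, M[c][j], poly, r) for j in range(n)]
    return det

def is_mds(M, poly, r):
    """Fact 1: M is MDS iff every square submatrix is nonsingular."""
    n = len(M)
    return all(
        gf_det([[M[i][j] for j in cols] for i in rows], poly, r) != 0
        for k in range(1, n + 1)
        for rows in combinations(range(n), k)
        for cols in combinations(range(n), k)
    )

AES_POLY, R = 0x11B, 8
first_row = [0x02, 0x03, 0x01, 0x01]
M = [first_row[-i:] + first_row[:-i] for i in range(4)]   # circ(2, 3, 1, 1)
print(is_mds(M, AES_POLY, R))                 # True
print(is_mds([[1, 1], [1, 1]], AES_POLY, R))  # False: a singular 2x2 minor
```

The later sketches in this survey assume these helpers (gf_mul, gf_inv, gf_det, is_mds) have been saved in a local file gf_tools.py; that module name is purely our convention.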

In [53], the authors defined a special form of matrices over F_{2^r} called Finite Field Hadamard (FFHadamard) matrices. In [4, 48, 63], the authors call an FFHadamard matrix simply a Hadamard matrix. In this paper we also call it Hadamard.

Definition 2.11. [53] A 2^n × 2^n matrix H is a Hadamard matrix in F_{2^r} if it can be represented as

    H = [ U  V ]
        [ V  U ],

where the submatrices U and V are also Hadamard.


For example, a 2^2 × 2^2 Hadamard matrix is

    H = [ x_0  x_1  x_2  x_3 ]
        [ x_1  x_0  x_3  x_2 ]
        [ x_2  x_3  x_0  x_1 ]
        [ x_3  x_2  x_1  x_0 ].

A Hadamard matrix (often called a dyadic matrix) is a symmetric matrix. Noting that Hadamard matrices commute and that we are working in a field of characteristic 2, it is easy to check by induction that H^2 = c^2 I, where c is the sum of the elements of the first row. Therefore if the sum of the elements of the first row is equal to 1, then it is an involutory matrix. Note that the block cipher Anubis [4] uses a Hadamard involutory MDS matrix whose first row is (1, α, α^2, α + α^2), where α is the root of the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1. Also, the block ciphers Khazad [3] and CLEFIA [62] use Hadamard involutory MDS matrices in their diffusion layers. Because of the positional structure of Hadamard matrices, we have the following results.

Fact 5. A 2^n × 2^n matrix H = (h_{i,j}) is Hadamard in F_{2^r} if and only if h_{i,j} = h_{i+k,j+k} and h_{i,j+k} = h_{j+k,i} for 0 ≤ i, j ≤ 2^{m−1} − 1 and k = 2^{m−1}, where 1 ≤ m ≤ n.

Lemma 2.12. Let H = (h_{i,j}) be a 2^n × 2^n matrix whose first row is (h_0, h_1, . . . , h_{2^n−1}). Then H is Hadamard if and only if h_{i,j} = h_{i⊕j}, where in i ⊕ j, i and j are taken in their n-bit binary representations.

Proof. If part: Suppose that h_{i,j} = h_{i⊕j}. Then for 1 ≤ m ≤ n, 0 ≤ i, j ≤ 2^{m−1} − 1 and k = 2^{m−1}, we have (i + k) ⊕ (j + k) = i ⊕ j. Therefore h_{i,j} = h_{i⊕j} = h_{(i+k)⊕(j+k)} = h_{i+k,j+k}. Again h_{i,j+k} = h_{i⊕(j+k)} = h_{(j+k)⊕i} = h_{j+k,i}. Therefore, by Fact 5, H is a Hadamard matrix.

Only if part: Suppose that H is a Hadamard matrix of order 2^n. We have to show that h_{i,j} = h_{i⊕j} for 0 ≤ i, j ≤ 2^n − 1. We will prove this by using the principle of mathematical induction. For n = 1,

    H = [ h_0  h_1 ]
        [ h_1  h_0 ].

Here h_{0,0} = h_0 = h_{0⊕0}, h_{1,1} = h_0 = h_{1⊕1}, h_{0,1} = h_1 = h_{0⊕1} and h_{1,0} = h_1 = h_{1⊕0}. Therefore the result is true for n = 1. Suppose that the result is true for n = l. Now suppose that H is a Hadamard matrix of order 2^{l+1} with first row (h_0, h_1, . . . , h_{2^{l+1}−1}). Since H is Hadamard,

    H = [ U  V ]
        [ V  U ],

where U = (u_{i,j}) and V = (v_{i,j}) are Hadamard matrices of order 2^l with first rows (h_0, h_1, . . . , h_{2^l−1}) and (h_{2^l}, h_{2^l+1}, . . . , h_{2^{l+1}−1}) respectively. Now for 0 ≤ i, j ≤ 2^l − 1 and k = 2^l, h_{i,j} = u_{i,j} and h_{i,j+k} = v_{i,j}. By the induction hypothesis, as U is Hadamard, h_{i,j} = h_{i⊕j} for 0 ≤ i, j ≤ 2^l − 1. Since k = 2^l and 0 ≤ i, j ≤ 2^l − 1, we have i ⊕ j = (i + k) ⊕ (j + k). Therefore, from Fact 5, we have h_{i+k,j+k} = h_{i,j} = h_{i⊕j} = h_{(i+k)⊕(j+k)}. Similarly for V, it can be checked by applying the induction hypothesis that h_{i,j+k} = h_{i⊕(j+k)}. From Fact 5 we have h_{j+k,i} = h_{i,j+k} = h_{i⊕(j+k)} = h_{(j+k)⊕i}. Therefore h_{i,j} = h_{i⊕j} for 0 ≤ i, j ≤ 2^{l+1} − 1. Therefore, by induction, the result is true for all n.


Note that a Hadamard matrix can be represented by its first row. We will denote the Hadamard matrix with first row (h_0, h_1, . . . , h_{2^n−1}) by had(h_0, h_1, . . . , h_{2^n−1}). Because of the structure of Hadamard matrices, the following fact is easy to verify.

Fact 6. Let H = (h_{i,j}) be a square matrix of order 2^n and f be a bijection such that f(h_{i,j}) = h'_{i,j}. Then H is Hadamard if and only if H' = (h'_{i,j}) is Hadamard.

Lemma 2.13. Let G = {x_0, x_1, . . . , x_{2^n−1}} be an additive subgroup of F_{2^r} which is the linear span of n linearly independent elements {x_1, x_2, x_{2^2}, . . . , x_{2^{n−1}}} such that x_i = Σ_{k=0}^{n−1} i_k x_{2^k}, where (i_{n−1}, . . . , i_1, i_0) is the binary representation of i. Then x_i + x_j = x_{i⊕j}.

Proof. Suppose (i_{n−1}, . . . , i_1, i_0) and (j_{n−1}, . . . , j_1, j_0) are the binary representations of i and j respectively. Therefore x_i = i_0 x_1 + i_1 x_2 + i_2 x_{2^2} + . . . + i_{n−1} x_{2^{n−1}} and x_j = j_0 x_1 + j_1 x_2 + j_2 x_{2^2} + . . . + j_{n−1} x_{2^{n−1}}. Therefore x_i + x_j = (i_0 + j_0) x_1 + (i_1 + j_1) x_2 + (i_2 + j_2) x_{2^2} + . . . + (i_{n−1} + j_{n−1}) x_{2^{n−1}} = x_{i⊕j}.

Remark 2. The additive subgroup G = {x_0, . . . , x_{2^n−1}} in Lemma 2.13 is constructed as the linear span of n linearly independent elements labeled x_1, x_2, x_{2^2}, . . . , x_{2^{n−1}}. Once x_1, x_2, x_{2^2}, . . . , x_{2^{n−1}} have been fixed, every other element x_i ∈ G is fixed so as to satisfy x_i + x_j = x_{i⊕j}. Now we are ready to provide the following corollary from [21].

Corollary 5. [21, Fact 9] Let G = {x_0, x_1, . . . , x_{2^n−1}} be an additive subgroup of F_{2^r} with x_i + x_j = x_{i⊕j}, where in i ⊕ j, i and j are taken in their n-bit binary representations. Then for l ∈ F_{2^r} \ G, the matrix H' = (h'_{i,j}) = (1/(l + x_{i⊕j})) is Hadamard.

Proof. Consider the matrix H = (h_{i,j}) of order 2^n, where h_{i,j} = x_{i⊕j} for 0 ≤ i, j ≤ 2^n − 1. The first row of H is (x_0, x_1, . . . , x_{2^n−1}), so from Lemma 2.12, H = had(x_0, x_1, x_2, . . . , x_{2^n−1}). Since 0 ∉ l + G, we have l + x_{i⊕j} = l + x_i + x_j ≠ 0. Now consider the bijection f(h_{i,j}) = 1/(l + h_{i,j}). Therefore, by Fact 6, H' is a Hadamard matrix.
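A small sketch (ours, not from the paper) that builds had(h_0, . . . , h_{2^n−1}) from its first row via h_{i,j} = h_{i⊕j} (Lemma 2.12) and checks the observation H^2 = c^2 I, where c is the sum of the first row, for the Anubis-style first row (1, α, α^2, α + α^2) taken here over F_{2^4}. It assumes the gf_mul helper from the earlier sketches saved in the hypothetical gf_tools.py.

```python
# Sketch (ours): had() from a first row, plus the check H^2 = c^2 I.
from functools import reduce
from gf_tools import gf_mul          # assumed helper module (our convention)

POLY, R = 0b10011, 4                 # F_{2^4} with x^4 + x + 1

def had(first_row):
    n = len(first_row)               # n must be a power of 2
    return [[first_row[i ^ j] for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    n = len(A)
    return [[reduce(lambda s, k: s ^ gf_mul(A[i][k], B[k][j], POLY, R),
                    range(n), 0) for j in range(n)] for i in range(n)]

alpha = 0b0010
a2 = gf_mul(alpha, alpha, POLY, R)
first_row = [1, alpha, a2, alpha ^ a2]            # (1, a, a^2, a + a^2)
c = reduce(lambda s, x: s ^ x, first_row, 0)      # row sum; here c = 1
H = had(first_row)
c2 = gf_mul(c, c, POLY, R)
target = [[c2 if i == j else 0 for j in range(4)] for i in range(4)]
print(mat_mul(H, H) == target)        # True: row sum 1, so H is involutory
```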

There are several design techniques for constructing MDS matrices, including exhaustive search for small matrices. But for large MDS matrices, exhaustive search is not possible, as the problem of testing whether a matrix is MDS or not is NP-complete. We start with nonrecursive direct constructions, which surely provide MDS matrices. In the next two sections we discuss Cauchy based constructions and Vandermonde based constructions.

3. Constructing MDS matrices from Cauchy matrices

Applications of Cauchy matrices for constructing MDS codes are widely available in the literature [13, 21, 45, 46, 51, 52, 63, 69]. Youssef et al. used Cauchy matrices for constructing MDS matrices with efficient cryptographic applications in mind [69]. Gupta et al. [21] used similar methods in a more formal setup. Cui et al. [13] define compact Cauchy matrices and provide several interesting results. Mattoussi et al. [46] used a triangular array to construct MDS codes, which is related to Cauchy matrices [46, 51]. We will mainly discuss [13, 21, 46, 51, 52, 69] in this section.


Definition 3.1. Given {x_0, x_1, . . . , x_{n−1}} ⊆ F_{2^r} and {y_0, y_1, . . . , y_{n−1}} ⊆ F_{2^r} such that x_i + y_j ≠ 0 for all 0 ≤ i, j ≤ n − 1, the matrix A = (a_{i,j}), 0 ≤ i, j ≤ n − 1, where a_{i,j} = 1/(x_i + y_j), is called a Cauchy matrix.

It is known that [45, page 323]

    det(A) = Π_{0≤i<j≤n−1} (x_j − x_i)(y_j − y_i) / Π_{0≤i,j≤n−1} (x_i + y_j).

So, provided the x_i's are distinct, the y_j's are distinct and x_i + y_j ≠ 0 for all 0 ≤ i, j ≤ n − 1, we have det(A) ≠ 0, i.e. A is nonsingular. This was formalized in [21] as follows.

Fact 7. [21] For distinct x_0, x_1, . . . , x_{n−1} ∈ F_{2^r} and distinct y_0, y_1, . . . , y_{n−1} ∈ F_{2^r}, such that x_i + y_j ≠ 0 for all 0 ≤ i, j ≤ n − 1, the Cauchy matrix A = (a_{i,j}), 0 ≤ i, j ≤ n − 1, where a_{i,j} = 1/(x_i + y_j), is nonsingular.

Fact 8. [21] Any square submatrix of a Cauchy matrix is again a Cauchy matrix.

Thus, from Fact 7 and Fact 8, all square submatrices of a Cauchy matrix are nonsingular. This leads to an MDS matrix construction. We record this in the following lemma.

Lemma 3.2. [21, Lemma 1][45, page 323][69] For distinct x_0, x_1, . . . , x_{n−1} and distinct y_0, y_1, . . . , y_{n−1}, such that x_i + y_j ≠ 0 for all 0 ≤ i, j ≤ n − 1, the matrix A = (a_{i,j}), where a_{i,j} = 1/(x_i + y_j), is an MDS matrix.

We will call this construction the Cauchy based construction of type 1. Depending on the nature of the x_i's and y_i's there are basically four types of constructions available in the literature. We will call these the type 1, type 2, type 3 and type 4 constructions, and we will come back to them whenever we discuss them.
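The following sketch (ours) instantiates the type 1 Cauchy construction of Lemma 3.2 over F_{2^4} on the data of Example 1 below and confirms the MDS property; it assumes the gf_inv and is_mds helpers from the earlier sketches in the hypothetical gf_tools.py.

```python
# Sketch (ours) of the type 1 Cauchy construction (Lemma 3.2): pick distinct
# x_i's and distinct y_j's with x_i + y_j != 0 and set a_{i,j} = (x_i + y_j)^{-1}.
from gf_tools import gf_inv, is_mds     # assumed helper module (our convention)

POLY, R = 0b10011, 4                    # F_{2^4}, x^4 + x + 1

def cauchy(xs, ys, poly=POLY, r=R):
    assert len(set(xs)) == len(xs) and len(set(ys)) == len(ys)
    assert all((x ^ y) != 0 for x in xs for y in ys)   # x_i + y_j != 0
    return [[gf_inv(x ^ y, poly, r) for y in ys] for x in xs]

# Data of Example 1 below: x = (0, a^4, a^8), y = (1, a^3, a^5), as bit vectors.
xs = [0b0000, 0b0011, 0b0101]           # 0, a^4 = a+1, a^8 = a^2+1
ys = [0b0001, 0b1000, 0b0110]           # 1, a^3, a^5 = a^2+a
A = cauchy(xs, ys)
print(is_mds(A, POLY, R))               # True: A is MDS (but not involutory)
```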

Remark 3. One special case of Lemma 3.2 is that in which y_i is of the form l + x_i, where l is an arbitrary nonzero element of F_{2^r}. We will call this construction the Cauchy based construction of type 2.

The following lemma and its corollary study the number of distinct entries in the construction of Lemma 3.2, which is crucial for studying the construction of efficient MDS matrices from Cauchy matrices.

Lemma 3.3. [21, Lemma 2] Each row (and each column) of the n × n MDS matrix A formed using the construction of Lemma 3.2 has n distinct elements.

Proof. The elements of the i-th row of A are 1/(x_i + y_j) for j = 0, . . . , n − 1. Now 1/(x_i + y_{j_1}) = 1/(x_i + y_{j_2}) for any two j_1, j_2 ∈ {0, . . . , n − 1} with j_1 ≠ j_2 implies y_{j_1} = y_{j_2}, which contradicts the fact that the y_j's are distinct. Since i is arbitrary, the result holds for all rows of A. The proof for columns is similar.

Since each row (and column) of the matrix constructed by Lemma 3.2 has n distinct elements, we have the following corollary.

Corollary 6. [21, Corollary 1] The n × n MDS matrix A formed using the construction of Lemma 3.2 has at least n distinct elements.


Example 1. Let α be a primitive element of F_{2^4} whose constructing polynomial is x^4 + x + 1. Let x_0 = 0, x_1 = α^4, x_2 = α^8 and y_0 = 1, y_1 = α^3, y_2 = α^5. Then the matrix A given by Lemma 3.2 is

    A = [ 1              1/α^3            1/α^5          ]
        [ 1/(1 + α^4)    1/(α^3 + α^4)    1/(α^4 + α^5)  ]
        [ 1/(1 + α^8)    1/(α^3 + α^8)    1/(α^5 + α^8)  ]

      = [ 1                  α^3 + α^2 + α + 1    α^2 + α + 1    ]
        [ α^3 + 1            α^2 + 1              α^3 + α + 1    ]
        [ α^3 + α^2 + 1      α^2                  α^3 + α^2 + α  ],

which is MDS but not involutory. Note that each row (and column) has n = 3 distinct elements and the total number of distinct elements is 9.

The following is an example of the special case of Lemma 3.2.

Example 2. Let x_0 = α, x_1 = α^2, x_2 = α^3 and y_i = l + x_i for 0 ≤ i ≤ 2, where l = 1. Therefore

    A = [ 1                1/(1 + α + α^2)    1/(1 + α + α^3)    ]
        [ 1/(1 + α + α^2)  1                  1/(1 + α^2 + α^3)  ]
        [ 1/(1 + α + α^3)  1/(1 + α^2 + α^3)  1                  ]

      = [ 1          α^2 + α    α^2 + 1 ]
        [ α^2 + α    1          α^2     ]
        [ α^2 + 1    α^2        1       ]

is MDS but not involutory. Note that here each row (and column) has n = 3 distinct elements and the total number of distinct elements is 4.

From Corollary 6, an n × n matrix constructed using Lemma 3.2 has at least n distinct elements. In [21], the authors constructed n × n MDS matrices with exactly n distinct elements. This has a two-fold advantage. Firstly, only n suitable and efficient elements need to be chosen (say of low implementation cost) to form the MDS matrix using the Cauchy construction. It may be noted that, to construct efficient MDS matrices, it may be desirable to have a minimum number of distinct entries in order to minimize the implementation overheads (see [34]). In [13] the authors called such MDS Cauchy matrices having exactly n distinct elements compact Cauchy matrices. Formally, let an n × n matrix A_X = (a_{i,j}) be a Cauchy matrix generated by the vector X = (x_0, x_1, . . . , x_{n−1}, x_n, . . . , x_{2n−1}), i.e. a_{i,j} = 1/(x_i + x_{n+j}). Then A_X is called a compact Cauchy matrix if A_X has precisely n distinct entries.

Remark 4. We will call an MDS matrix A of order n a compact MDS matrix if the number of distinct elements in A is ≤ n.

Lemma 3.4. [21, Lemma 6] Let G = {x_0, x_1, . . . , x_{n−1}} be an additive subgroup of F_{2^r}. Consider the coset l + G, l ∉ G, of G, having elements y_j = l + x_j, j = 0, . . . , n − 1. Then the n × n matrix A = (a_{i,j}), where a_{i,j} = 1/(x_i + y_j) for all 0 ≤ i, j ≤ n − 1, is an MDS matrix.

Proof. We first prove that x_i + y_j ≠ 0 for all 0 ≤ i, j ≤ n − 1. Now, x_i + y_j = x_i + l + x_j = l + x_i + x_j ∈ l + G. But 0 ∉ l + G (as l ∉ G and 0 ∈ G). So x_i + y_j ≠ 0 for all 0 ≤ i, j ≤ n − 1. Also, all x_i's are distinct elements of the group G and the y_j's are distinct elements of the coset l + G. Thus, from Lemma 3.2, A is an MDS matrix.

We will call this construction the Cauchy based construction of type 3.

Remark 5. Lemma 3.4 gives MDS matrices of order n, where n is a power of 2. When n is not a power of 2, the construction of n × n MDS matrices over F_{2^r} (n < 2^{r−1}) is done in two steps. First we construct a 2^m × 2^m MDS matrix A' over


F_{2^r}, where 2^{m−1} < n < 2^m, using Lemma 3.4. In the next step, we select an n × n submatrix A of A' of our liking (select n rows and n columns).

Remark 6. Lemma 3.4 is a particular case of Lemma 3.2.

Lemma 3.5. [21, Lemma 7] The n × n matrix A of Lemma 3.4 has exactly n distinct entries.

Proof. The elements in the i-th row are a_{i,j} = 1/(l + x_i + x_j) for j = 0, 1, . . . , n − 1. Since the x_j's form the additive group G, x_i + x_j for j = 0, 1, . . . , n − 1 gives all n distinct elements of G for a fixed i. Thus l + x_i + x_j for j = 0, 1, . . . , n − 1 gives all n distinct elements of l + G. Since i is arbitrary, each row of A therefore contains n distinct elements. Since these elements are nothing but the multiplicative inverses of the elements of l + G in F_{2^r}, the matrix A has exactly n distinct elements.

Corollary 7. [21, Corollary 3] The matrix A of Lemma 3.4 is symmetric and all rows are permutations of the first row.

Proof. a_{i,j} = a_{j,i} = 1/(x_i + y_j) = 1/(l + x_i + x_j) for all 0 ≤ i, j ≤ n − 1. Therefore A is symmetric. The second part follows directly from Lemma 3.3 and Lemma 3.5.

In [21], the authors provided a sufficient condition (Lemma 3.4 of this paper) for a Cauchy MDS matrix to be compact, but did not discuss the converse. Later, in [13], the authors provided a necessary and sufficient condition for a Cauchy matrix of order n to have exactly n distinct elements.

Theorem 3.6. [13, Theorem 1] A_X is an n × n compact Cauchy matrix over F_{2^r} generated by a vector X = (x_0, . . . , x_{n−1}, x_n, . . . , x_{2n−1}) if and only if there exist an additive subgroup H of F_{2^r} and a, b ∈ F_{2^r} such that a + b ∉ H, a + H = {x_0, x_1, . . . , x_{n−1}} and b + H = {x_n, x_{n+1}, . . . , x_{2n−1}}.

Proof. If A_X = (a_{i,j}) is a compact Cauchy matrix, then for all i ∈ Z_n we have

    {a_{i,0}, a_{i,1}, . . . , a_{i,n−1}} = {a_{0,0}, a_{0,1}, . . . , a_{0,n−1}}.

Since the set {a_{i,0}, a_{i,1}, . . . , a_{i,n−1}} contains distinct entries, we may define a permutation π_i : Z_n → Z_n such that a^{−1}_{i,π_i(t)} = a^{−1}_{0,t} = a^{−1}_{j,π_j(t)}. Note that a^{−1}_{i_1,i_2} = x_{i_1} + x_{n+i_2} for all i_1, i_2 ∈ Z_n. We have, for any i, j, t ∈ Z_n,

(1)    a^{−1}_{i,π_i(t)} = x_i + x_{n+π_i(t)} = x_0 + x_{n+t} = x_j + x_{n+π_j(t)} = a^{−1}_{j,π_j(t)}.

Moreover, if i ≠ j, then x_{n+π_i(t)} + x_{n+π_j(t)} = x_i + x_j ≠ 0 (from Equation 1), which implies π_i(t) ≠ π_j(t). Hence, for any t ∈ Z_n, it holds that {π_i(t) : i ∈ Z_n} = Z_n. In other words,

(2)    {(k, s) : k, s ∈ Z_n} = {(k, π_i(k)) : k, i ∈ Z_n}.

Now we define

    H_x = {x_0 + x_s : s ∈ Z_n},        H'_x = {x_k + x_s : k, s ∈ Z_n},
    G_x = {x_n + x_{n+s} : s ∈ Z_n},    G'_x = {x_{n+k} + x_{n+s} : k, s ∈ Z_n}.

Therefore H_x ⊆ H'_x and G_x ⊆ G'_x. As G'_x = {x_{n+k} + x_{n+π_i(k)} : k, i ∈ Z_n} from Equation 2, and x_{n+k} + x_{n+π_i(k)} = x_0 + x_i from Equation 1, we have G'_x = {x_0 + x_i : i ∈ Z_n} = H_x. Since A_X^T (the transpose of A_X) is also a compact Cauchy matrix, generated by the vector X' = (x_n,


x_{n+1}, . . . , x_{2n−1}, x_0, . . . , x_{n−1}), for the same reason we get H'_x = G_x. Thus we have G'_x = H_x ⊆ H'_x = G_x ⊆ G'_x.

So H_x = H'_x, which implies that H_x is closed under addition. Since H_x is finite, H_x is a subgroup of F_{2^r}. Let H = H_x. By the definition of H_x and G_x, we arrive at {x_0, x_1, . . . , x_{n−1}} = x_0 + H and {x_n, x_{n+1}, . . . , x_{2n−1}} = x_n + G_x = x_n + H_x = x_n + H. Since A_X is a Cauchy matrix, we have x_0 ∉ x_n + H. Therefore x_0 + x_n ∉ H. Here x_0 and x_n play the roles of a and b respectively.

For the converse part, we can proceed as in Lemma 3.4.
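To illustrate Lemma 3.5 and the compactness characterized by Theorem 3.6 numerically, the sketch below (ours) builds the type 3 Cauchy matrix from an additive subgroup and a coset of F_{2^4} and counts its distinct entries; it again assumes the helpers from the earlier sketches in the hypothetical gf_tools.py.

```python
# Sketch (ours): the type 3 Cauchy matrix built from an additive subgroup G and
# the coset l + G has exactly |G| distinct entries (Lemma 3.5), i.e. it is a
# compact Cauchy matrix in the sense of Theorem 3.6.
from gf_tools import gf_inv, is_mds     # assumed helper module (our convention)

POLY, R = 0b10011, 4
G = [0b0000, 0b0010, 0b1000, 0b1010]    # additive subgroup {0, a, a^3, a + a^3}
l = 0b0100                              # a^2, not in G
A = [[gf_inv(l ^ xi ^ xj, POLY, R) for xj in G] for xi in G]

print(is_mds(A, POLY, R))                       # True
print(len({a for row in A for a in row}))       # 4 distinct entries: compact
```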

Lemma 3.4 provides a construction of MDS matrices. These matrices may not be involutory. In general, in substitution permutation networks (SPNs), decryption needs the inverse of A. If an efficient MDS matrix A used in encryption happens to be involutory, then its inverse A^{−1} applied for decryption will also be efficient. So we may like to make our MDS matrix involutory. Towards this we study the following lemma, which is also given in [69], but in a slightly different setting.

Lemma 3.7. [21, Lemma 8] Let A = (a_{i,j}) be an n × n matrix formed by Lemma 3.4. Then A^2 = c^2 I, where c = Σ_{j=0}^{n−1} 1/(l + x_j).

Proof. Let A^2 = B = (b_{i,j}). Since A is a symmetric matrix, the element b_{i,j} is the inner product of the i-th row and the j-th row of A. Therefore,

    b_{i,i} = Σ_{k=0}^{n−1} 1/(l + x_i + x_k)^2 = Σ_{j=0}^{n−1} 1/(l + x_j)^2 = c^2,

because the underlying field has characteristic 2. Similarly, for i ≠ j,

    b_{i,j} = Σ_{k=0}^{n−1} 1/((l + x_i + x_k)(l + x_j + x_k)) = (1/(x_i + x_j)) Σ_{k=0}^{n−1} ( 1/(l + x_i + x_k) + 1/(l + x_j + x_k) ) = 0,

since the two inner sums both equal c and cancel in characteristic 2. Thus A^2 = c^2 I.

Corollary 8. [21, Corollary 5] If an n × n MDS matrix A is constructed from Lemma 3.4, then c^{−1}A is an involutory MDS matrix, where c is the sum of all elements of any row.

Therefore if the sum of the elements of any row of the matrix A of Lemma 3.4 is 1, A will be involutory.

Remark 7. In [13], the authors proposed a construction of involutory compact Cauchy matrices, which directly follows from Corollary 8.

The following is an example of a compact Cauchy matrix constructed using Lemma 3.4.

Example 3. Let α be a primitive element of F_{2^4} whose constructing polynomial is x^4 + x + 1. Let G = {0, α, α^3, α + α^3} and l = α^2. Therefore

    l + G = {α^2, α + α^2, α^3 + α^2, α + α^3 + α^2}.


Then the matrix

    A = [ 1/α^2              1/(α + α^2)          1/(α^3 + α^2)        1/(α + α^3 + α^2) ]
        [ 1/(α + α^2)        1/α^2                1/(α + α^3 + α^2)    1/(α^3 + α^2)     ]
        [ 1/(α^3 + α^2)      1/(α + α^3 + α^2)    1/α^2                1/(α + α^2)       ]
        [ 1/(α + α^3 + α^2)  1/(α^3 + α^2)        1/(α + α^2)          1/α^2             ]

      = [ α^3 + α^2 + 1    α^2 + α + 1      α^3 + α          α + 1           ]
        [ α^2 + α + 1      α^3 + α^2 + 1    α + 1            α^3 + α         ]
        [ α^3 + α          α + 1            α^3 + α^2 + 1    α^2 + α + 1     ]
        [ α + 1            α^3 + α          α^2 + α + 1      α^3 + α^2 + 1   ]

is an MDS matrix with exactly 4 distinct elements, but it is not involutory. The sum of any row is (α^3 + α^2 + 1) + (α^2 + α + 1) + (α^3 + α) + (α + 1) = α + 1, and (1/(α+1)^2) A^2 = I. Hence (1/(α+1)) A is an involutory MDS matrix.
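A quick numerical confirmation of Corollary 8 on Example 3 (our sketch; assumes the earlier gf_tools.py helpers): scaling the type 3 Cauchy matrix by the inverse of its row sum yields an involutory MDS matrix.

```python
# Sketch (ours): verify Corollary 8 on Example 3.  With G = {0, a, a^3, a+a^3}
# and l = a^2 in F_{2^4}, the row sum is c = a + 1 and (c^{-1} A)^2 = I.
from functools import reduce
from gf_tools import gf_mul, gf_inv, is_mds   # assumed helper module

POLY, R = 0b10011, 4
G = [0b0000, 0b0010, 0b1000, 0b1010]
l = 0b0100                                    # a^2
A = [[gf_inv(l ^ xi ^ xj, POLY, R) for xj in G] for xi in G]

c = reduce(lambda s, x: s ^ x, A[0], 0)       # row sum; equals a + 1 = 0b0011
c_inv = gf_inv(c, POLY, R)
B = [[gf_mul(c_inv, a, POLY, R) for a in row] for row in A]   # B = c^{-1} A

def mat_mul(X, Y):
    n = len(X)
    return [[reduce(lambda s, k: s ^ gf_mul(X[i][k], Y[k][j], POLY, R),
                    range(n), 0) for j in range(n)] for i in range(n)]

I = [[int(i == j) for j in range(4)] for i in range(4)]
print(c == 0b0011, is_mds(B, POLY, R), mat_mul(B, B) == I)    # True True True
```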

Remark 8. Multiplication by 1 in F_{2^r} is trivial. So, for an implementation friendly design, it is desirable to have the maximum number of 1's in the MDS matrices to be used in block ciphers and hash functions. We know that each element in an n × n matrix A constructed by Lemma 3.4 occurs exactly n times (see Lemma 3.5). So, in the construction of an n × n matrix A by Lemma 3.4, the maximum number of 1's that can occur in A is n. It is to be noted that A can be converted to have the maximum number of 1's (i.e. n 1's) without disturbing the MDS property, just by multiplying A by the inverse of one of its entries. Although this guarantees the occurrence of a 1 in every row, with this technique we may not control the other n − 1 elements. Also, if A is an involutory MDS matrix, such a conversion will disturb the involutory property.

Remark 9. In [34], the authors introduced the idea of efficient MDS matrices obtained by maximizing the number of 1's and minimizing the number of occurrences of the other distinct elements from F_{2^r}^*. It is to be noted that multiplying each row of an n × n MDS matrix A by the inverse of the first element of the respective row leads to an MDS matrix A' having all 1's in the first column (see Lemma 2.5). Again, multiplying each column of the n × n MDS matrix A' (starting from the second column) by the inverse of the first element of the respective column leads to an MDS matrix A'' having all 1's in the first row and first column. Thus the number of 1's in this matrix is 2n − 1. Although A'' contains the maximum number of 1's that can be achieved starting from the MDS matrix A, the number of other distinct entries in this case may be greater than n − 1. If the order of the matrix is even, then A'' will never be involutory.

Lemma 3.8. [21, Theorem 4] Let G = {x_0, x_1, . . . , x_{2^n−1}} be an additive subgroup of F_{2^r} which is the linear span of n linearly independent elements {x_1, x_2, x_{2^2}, . . . , x_{2^{n−1}}} such that x_i = Σ_{k=0}^{n−1} i_k x_{2^k}, where (i_{n−1}, . . . , i_1, i_0) is the binary representation of i. Let y_i = l + x_i for 0 ≤ i ≤ 2^n − 1, where l ∈ F_{2^r} \ G. Then the matrix A = (a_{i,j}), where a_{i,j} = 1/(x_i + y_j), is a Hadamard MDS matrix.

Proof. Consider the matrix H = (h_{i,j}) = (x_i + x_j). Then h_{i,j} = x_{i⊕j}. Therefore, by Lemma 2.12, H is Hadamard. Now a_{i,j} = 1/(x_i + y_j) = 1/(l + x_i + x_j) = 1/(l + x_{i⊕j}). Therefore, from Corollary 5, A is Hadamard. Again, by Lemma 3.4, A is MDS. Therefore A is a Hadamard MDS matrix.


Remark 10. We will call this construction the Cauchy based construction of type 4. Also note that the matrix constructed using Lemma 3.8 may not be involutory, whereas (1/c) A is a Hadamard involutory MDS matrix, where c is the sum of the elements of any row. Anubis [4] uses a Hadamard involutory matrix which was constructed by exhaustive search and not by Lemma 3.8.

Example 4. Let α be a primitive element of F_{2^4} whose constructing polynomial is x^4 + x + 1. Let G = {x_0 = 0, x_1 = α, x_2 = α^3, x_3 = α + α^3} be the additive group spanned by x_1 = α, x_2 = α^3, and let l = α^2. Therefore

    y_0 = α^2, y_1 = α + α^2, y_2 = α^3 + α^2 and y_3 = α + α^3 + α^2.

Then the matrix

    A = [ 1/α^2              1/(α + α^2)          1/(α^3 + α^2)        1/(α + α^3 + α^2) ]
        [ 1/(α + α^2)        1/α^2                1/(α + α^3 + α^2)    1/(α^3 + α^2)     ]
        [ 1/(α^3 + α^2)      1/(α + α^3 + α^2)    1/α^2                1/(α + α^2)       ]
        [ 1/(α + α^3 + α^2)  1/(α^3 + α^2)        1/(α + α^2)          1/α^2             ]

      = [ α^3 + α^2 + 1    α^2 + α + 1      α^3 + α          α + 1           ]
        [ α^2 + α + 1      α^3 + α^2 + 1    α + 1            α^3 + α         ]
        [ α^3 + α          α + 1            α^3 + α^2 + 1    α^2 + α + 1     ]
        [ α + 1            α^3 + α          α^2 + α + 1      α^3 + α^2 + 1   ]

is a Hadamard MDS matrix but not involutory. The sum of the elements of any row is α + 1, and hence (1/(α+1)) A is involutory.

Remark 11. So far we have the type 1, type 2, type 3 and type 4 Cauchy based constructions given by Lemma 3.2, Remark 3, Lemma 3.4 and Lemma 3.8 respectively. Similarly, in the next section, we will discuss type 1, type 2, type 3 and type 4 Vandermonde based constructions of MDS matrices.

Remark 12. If A = (a_{i,j}) is a Cauchy matrix, where a_{i,j} = 1/(x_i + y_j) and x_i + y_j ≠ 0 for 0 ≤ i, j ≤ n − 1, then for any two nonsingular diagonal matrices D_1 = diag(c_0, c_1, . . . , c_{n−1}) and D_2 = diag(d_0, d_1, . . . , d_{n−1}), the matrix D_1AD_2 = (c_i d_j / (x_i + y_j)) is called a generalized Cauchy matrix. We know from Corollary 1 that if A is MDS then D_1AD_2 is MDS. Also note that even if a Cauchy matrix is not involutory, its corresponding generalized Cauchy matrix can be made involutory by a suitable choice of D_1 and D_2. We will discuss this later in Section 5.

We provide another construction of an MDS matrix, which is a slightly modified version of [46, 51] and is closely related to the Cauchy based construction.

Theorem 3.9. [51, Theorem 3] Suppose q = 2^r and γ is an arbitrary primitive element of the field F_q. Let S_q be a triangular array whose coefficients are constant along skew diagonals, in a Hankel matrix fashion, defined as

    S_q = [ a_1      a_2      a_3    . . .   a_{q−3}   a_{q−2} ]
          [ a_2      a_3      a_4    . . .   a_{q−2}           ]
          [ a_3      a_4      . . .  a_{q−2}                   ]
          [ a_4      . . .                                     ]
          [ . . .                                              ]
          [ a_{q−3}  a_{q−2}                                   ]
          [ a_{q−2}                                            ]


where a_i = (1 − γ^i)^{−1} for 1 ≤ i ≤ q − 2. Then every square submatrix of S_q is nonsingular and hence MDS.

Proof. For 1 ≤ i ≤ q − 2 and 1 ≤ j ≤ q − i − 1, let s_{i,j} be the entries of S_q. Thus,

    s_{i,j} = a_{i+j−1} = 1/(1 − γ^{i+j−1}), for 1 ≤ i ≤ q − 2 and 1 ≤ j ≤ q − i − 1.

Consider the vectors x = (x_1, x_2, . . . , x_{q−2}) and y = (y_1, y_2, . . . , y_{q−2}) defined by

    x_i = −γ^{−(i−1)},  y_j = γ^j,  for 1 ≤ i ≤ q − 2 and 1 ≤ j ≤ q − 2.

It is easy to check that the x_i's and y_j's are distinct and x_i + y_j ≠ 0 for i + j ≤ q − 1. It can be readily verified that

    s_{i,j} = x_i / (x_i + y_j), for 1 ≤ i ≤ q − 2 and 1 ≤ j ≤ q − i − 1.

Since all the x_i's are distinct and nonzero, all the y_j's are distinct, and x_i + y_j ≠ 0 for i and j in the defined ranges, we conclude that every square submatrix of S_q is a nonsingular generalized Cauchy matrix.

We close this section by providing an interconnection between Reed-Solomon codes and generalized Cauchy matrices [51, Theorem 1].

Theorem 3.10. [51, 52] A matrix of the form G = [I | A] over a finite field F generates a generalized Reed-Solomon code if and only if A = (a_{i,j}) is a generalized Cauchy matrix, i.e. a_{i,j} = c_i d_j / (x_i + y_j) for 0 ≤ i, j ≤ n − 1, where the x_i, y_j's are 2n distinct elements of F such that x_i + y_j ≠ 0 for all i and j, and c_i, d_j ≠ 0.

4. Constructing MDS matrices from Vandermonde matrices

Applications of Vandermonde matrices for constructing MDS codes are widely available in the literature [21, 40, 41, 46, 53]. Vandermonde matrices defined over a finite field can contain singular square submatrices (see Fact 9). Consequently, these matrices by themselves need not be MDS over a finite field. The authors of [29, Theorem 2] missed this and provided a wrong construction. Lacan and Fimes [40, 41] used two Vandermonde matrices to build an MDS matrix. Later, Sajadieh et al. [53] used a similar method to find an MDS matrix which is also involutory. We will mainly discuss [21, 41, 48, 53] in this section.

Definition 4.1. The matrix

    A = vand(a_0, a_1, a_2, . . . , a_{n−1}) =
        [ 1            1            1          . . .   1              ]
        [ a_0          a_1          a_2        . . .   a_{n−1}        ]
        [ a_0^2        a_1^2        a_2^2      . . .   a_{n−1}^2      ]
        [ . . .                                                       ]
        [ a_0^{n−1}    a_1^{n−1}    a_2^{n−1}  . . .   a_{n−1}^{n−1}  ]

is called a Vandermonde matrix, where the a_i's are elements of a finite or infinite field. It is known that det(A) = Π_{0≤i<j≤n−1} (a_j − a_i), which is nonzero if and only if the a_i's are distinct.


Fact 9. [45, page 323] Any square submatrix of a Vandermonde matrix with real, positive entries is nonsingular, but this is not true over finite fields. For an example, consider

    vand(1, α, α^4, α^5) = [ 1    1      1       1      ]
                           [ 1    α      α^4     α^5    ]
                           [ 1    α^2    α^8     α^10   ]
                           [ 1    α^3    α^12    α^15   ],

where α is a primitive element of the finite field F_{2^4} defined by the polynomial x^4 + x + 1. Consider the 2 × 2 submatrix

    [ 1    1     ]
    [ 1    α^15  ],

which is singular, as α^15 = 1.

Theorem 4.2. [41, Theorem 2] Let V_1 = vand(a_0, a_1, . . . , a_{n−1}) and V_2 = vand(b_0, b_1, . . . , b_{n−1}) be two Vandermonde matrices such that the a_i, b_j are 2n distinct elements of some field. Then the matrices V_1^{−1}V_2 and V_2^{−1}V_1 are such that every square submatrix of them is nonsingular, and hence they are MDS matrices.

Proof. Let us denote by U the n × 2n matrix [V_1 | V_2]. Consider the product W = V_1^{−1}U = [I | R], where R = V_1^{−1}V_2. Now, we prove that R does not contain any singular square submatrix. Every n × n submatrix of U is nonsingular because it is also a Vandermonde matrix built from n distinct elements. Then any n × n submatrix of W is also nonsingular, for it is the product of V_1^{−1} and the corresponding nonsingular n × n submatrix of U. Now, from Remark 13 (written below), the code defined by the generator matrix [I | R] is an MDS code. Thus V_1^{−1}V_2 is an MDS matrix. For V_2^{−1}V_1 the proof is identical.

Remark 13. In the above theorem we have used Corollary 3 of [45, page 319]: a generator matrix G = [I | R] generates a [2n, n, n + 1] MDS code if and only if every set of n columns of G is linearly independent.

Remark 14. We will call the construction using Theorem 4.2 the Vandermonde based construction of type 1. Note that in the Cauchy based construction of type 1 (see Lemma 3.2), an extra condition x_i + y_j ≠ 0 for 0 ≤ i, j ≤ n − 1 is needed.

Remark 15. Some authors [21, 53] use the notation vand(a_0, a_1, a_2, . . . , a_{n−1}) = A^T, where A is as defined in Definition 4.1. With this notation, V_1V_2^{−1} and V_2V_1^{−1} will be MDS.

Example 5. Let α be a primitive element of F_{2^4} whose constructing polynomial is x^4 + x + 1. Consider the Vandermonde matrices V_1 = vand(0, α^4, α^8) and V_2 = vand(1, α^3, α^5). Then the matrix

    V_1^{−1}V_2 = [ α^3 + α^2    α^2 + 1              1 ]
                  [ α^2 + 1      α^3 + α + 1          1 ]
                  [ α^3          α^3 + α^2 + α + 1    1 ]

is MDS but not involutory.
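The following sketch (ours) reproduces the type 1 Vandermonde based construction of Theorem 4.2 on the data of Example 5; it assumes the gf_mul, gf_inv and is_mds helpers from the earlier sketches in the hypothetical gf_tools.py and adds small Vandermonde, matrix-product and Gauss-Jordan inversion routines over F_{2^4}.

```python
# Sketch (ours): type 1 Vandermonde construction, V1^{-1} V2 with 2n distinct points.
from functools import reduce
from gf_tools import gf_mul, gf_inv, is_mds   # assumed helper module

POLY, R = 0b10011, 4

def gf_pow(a, e):
    return reduce(lambda p, _: gf_mul(p, a, POLY, R), range(e), 1)

def vand(points):
    n = len(points)
    return [[gf_pow(a, i) for a in points] for i in range(n)]

def mat_mul(A, B):
    n = len(A)
    return [[reduce(lambda s, k: s ^ gf_mul(A[i][k], B[k][j], POLY, R),
                    range(n), 0) for j in range(n)] for i in range(n)]

def mat_inv(M):
    """Gauss-Jordan inversion over F_{2^r}."""
    n = len(M)
    A = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        piv = next(i for i in range(c, n) if A[i][c])
        A[c], A[piv] = A[piv], A[c]
        inv = gf_inv(A[c][c], POLY, R)
        A[c] = [gf_mul(inv, x, POLY, R) for x in A[c]]
        for i in range(n):
            if i != c and A[i][c]:
                f = A[i][c]
                A[i] = [A[i][j] ^ gf_mul(f, A[c][j], POLY, R) for j in range(2 * n)]
    return [row[n:] for row in A]

a = 0b0010                                    # alpha
V1 = vand([0, gf_pow(a, 4), gf_pow(a, 8)])    # vand(0, a^4, a^8)
V2 = vand([1, gf_pow(a, 3), gf_pow(a, 5)])    # vand(1, a^3, a^5)
M = mat_mul(mat_inv(V1), V2)                  # V1^{-1} V2
print(is_mds(M, POLY, R))                     # True, as in Example 5
```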

In [53], the authors showed that for two Vandermonde matrices V_1 = vand(a_0, a_1, . . . , a_{n−1}) and V_2 = vand(b_0, b_1, . . . , b_{n−1}) = vand(l + a_0, l + a_1, . . . , l + a_{n−1}), where l is an arbitrary nonzero element of F_{2^r}, the matrix V_1^{−1}V_2 is involutory (see also Remark 16). Again, if the a_i's and b_i's are 2n different values, then by Theorem


4.2, V_1^{−1}V_2 will be an involutory MDS matrix. Corollary 9 states this result formally; it is a direct consequence of Theorem 4.2 and Theorem 4.4. Theorem 4.3 is an intermediate result for proving Theorem 4.4.

Remark 16. V_1^{−1}V_2 is involutory if and only if V_1^{−1}V_2 = V_2^{−1}V_1.

Theorem 4.3. [53, Theorem 3] If V_1 = vand(a_0, a_1, . . . , a_{n−1}) and V_2 = vand(b_0, b_1, . . . , b_{n−1}) are two invertible Vandermonde matrices such that b_i = l + a_i, then V_2V_1^{−1} is a lower triangular matrix whose nonzero elements are determined by powers of l.

Proof. Let V_1^{−1} = (t_{i,j}) and V = (v_{i,j}) = V_2 · V_1^{−1}, 0 ≤ i, j ≤ n − 1. As V_1 · V_1^{−1} = I, we have

    V_1row(0) · V_1^{−1}column(0) = Σ_{i=0}^{n−1} t_{i,0} = 1, and
    V_1row(k) · V_1^{−1}column(0) = Σ_{i=0}^{n−1} a_i^k · t_{i,0} = 0 for 1 ≤ k ≤ n − 1.

It can be checked that

    v_{0,0} = V_2row(0) · V_1^{−1}column(0) = Σ_{i=0}^{n−1} t_{i,0} = 1,
    v_{k,0} = V_2row(k) · V_1^{−1}column(0) = Σ_{i=0}^{n−1} b_i^k · t_{i,0} = Σ_{i=0}^{n−1} (l + a_i)^k · t_{i,0}
            = Σ_{i=0}^{n−1} ( C(k,0)a_i^k + C(k,1)a_i^{k−1}·l + . . . + C(k,k−1)a_i·l^{k−1} + C(k,k)l^k ) · t_{i,0}
            = Σ_{i=0}^{n−1} l^k · t_{i,0} = l^k for 1 ≤ k ≤ n − 1,

where C(k, j) denotes the binomial coefficient. So we have computed the 0-th column of V. Again, as V_1 · V_1^{−1} = I,

    Σ_{i=0}^{n−1} t_{i,1} = 0,
    Σ_{i=0}^{n−1} a_i · t_{i,1} = 1, and
    Σ_{i=0}^{n−1} a_i^k · t_{i,1} = 0 for 2 ≤ k ≤ n − 1.

Again it can be checked that

    v_{0,1} = Σ_{i=0}^{n−1} t_{i,1} = 0,
    v_{1,1} = Σ_{i=0}^{n−1} b_i · t_{i,1} = Σ_{i=0}^{n−1} (l + a_i) · t_{i,1} = Σ_{i=0}^{n−1} a_i · t_{i,1} = 1,

and

    v_{k,1} = Σ_{i=0}^{n−1} b_i^k · t_{i,1} = Σ_{i=0}^{n−1} (l + a_i)^k · t_{i,1}
            = Σ_{i=0}^{n−1} ( C(k,0)a_i^k + C(k,1)a_i^{k−1}·l + . . . + C(k,k−1)a_i·l^{k−1} + C(k,k)l^k ) · t_{i,1}
            = Σ_{i=0}^{n−1} C(k,k−1)a_i·l^{k−1} · t_{i,1} = C(k,k−1)·l^{k−1} = C(k,1)·l^{k−1} for 2 ≤ k ≤ n − 1.

So we have computed the 1-st column of V. Similarly,

    v_{0,2} = v_{1,2} = 0,  v_{2,2} = 1  and  v_{k,2} = C(k,2)·l^{k−2} for 3 ≤ k ≤ n − 1,
    v_{0,3} = v_{1,3} = v_{2,3} = 0,  v_{3,3} = 1  and  v_{k,3} = C(k,3)·l^{k−3} for 4 ≤ k ≤ n − 1,

and so on. Therefore

    V = V_2 · V_1^{−1} =
        [ 1          0                   0                   0                  . . .  0   0 ]
        [ l          1                   0                   0                  . . .  0   0 ]
        [ l^2        C(2,1)·l            1                   0                  . . .  0   0 ]
        [ l^3        C(3,1)·l^2          C(3,2)·l            1                  . . .  0   0 ]
        [ l^4        C(4,1)·l^3          C(4,2)·l^2          C(4,3)·l           . . .  0   0 ]
        [ . . .                                                                              ]
        [ l^{n−1}    C(n−1,1)·l^{n−2}    C(n−1,2)·l^{n−3}    C(n−1,3)·l^{n−4}   . . .  l   1 ].

Thus V_2V_1^{−1} is a lower triangular matrix.

Theorem 4.4. [53, Theorem 4] If V_1 = vand(a_0, a_1, . . . , a_{n−1}) and V_2 = vand(b_0, b_1, . . . , b_{n−1}) are two invertible Vandermonde matrices such that a_i = l + b_i, then V_2V_1^{−1}V_2 = V_1.

Proof. Let V = (v_{i,j}) = V_2 · V_1^{−1}. Note that (V_1)_{i,j} = a_j^i and (V_2)_{i,j} = b_j^i. Therefore

    (V · V_2)_{i,j} = V_row(i) · V_2column(j)
                    = l^i + C(i,1)l^{i−1}·b_j + C(i,2)l^{i−2}·b_j^2 + . . . + C(i,i−1)l·b_j^{i−1} + b_j^i
                    = (l + b_j)^i = a_j^i.

Therefore V_2V_1^{−1}V_2 = V_1.

Remark 17. V_2V_1^{−1}V_2 = V_1 implies that (V_1^{−1}V_2)^2 = I, i.e. V_1^{−1}V_2 is involutory.

Corollary 9. [53, Corollary 1] If V_1 = vand(a_0, a_1, . . . , a_{n−1}) and V_2 = vand(b_0, b_1, . . . , b_{n−1}) are two invertible Vandermonde matrices in the field F_{2^r} satisfying the two properties a_i = l + b_i and a_i ≠ b_j for 0 ≤ i, j ≤ n − 1, then V_1^{−1}V_2 is an involutory MDS matrix.

We will call this construction the Vandermonde based construction of type 2. This construction gives involutory MDS matrices, whereas in the Cauchy based construction of type 2 (see Remark 3) the constructed MDS matrix need not be involutory.


Example 6. Let α be a primitive element of F_{2^4} whose constructing polynomial is x^4 + x + 1. Let l = 1, x_0 = α, x_1 = α^2, x_2 = α^3 and y_0 = 1 + α, y_1 = 1 + α^2, y_2 = 1 + α^3. Consider the Vandermonde matrices V_1 = vand(α, α^2, α^3) and V_2 = vand(1 + α, 1 + α^2, 1 + α^3). Then the matrix

    V_1^{−1}V_2 = [ α^3              α^3 + 1              α^3 + 1          ]
                  [ α^3 + α^2 + α    α^3 + α^2 + α + 1    α^3 + α^2 + α    ]
                  [ α^2 + α + 1      α^2 + α + 1          α^2 + α          ]

is involutory MDS, and

    V_2V_1^{−1} = [ 1  0  0 ]
                  [ 1  1  0 ]
                  [ 1  0  1 ]

is a lower triangular matrix.
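Continuing the previous sketch (ours; it assumes the vand, mat_mul, mat_inv and gf_pow helpers defined there, together with gf_tools.py, are still in scope), the type 2 construction of Corollary 9 can be checked on the data of Example 6 above: V_1^{−1}V_2 is involutory MDS and V_2V_1^{−1} is lower triangular, as Theorem 4.3 predicts.

```python
# Sketch (ours): type 2 Vandermonde construction (Corollary 9) on Example 6.
from gf_tools import is_mds              # assumed helper module

POLY, R, a = 0b10011, 4, 0b0010
xs = [gf_pow(a, i) for i in (1, 2, 3)]   # a, a^2, a^3
V1 = vand(xs)
V2 = vand([1 ^ x for x in xs])           # b_i = l + a_i with l = 1

M = mat_mul(mat_inv(V1), V2)             # V1^{-1} V2
I = [[int(i == j) for j in range(3)] for i in range(3)]
print(is_mds(M, POLY, R), mat_mul(M, M) == I)      # True True

L = mat_mul(V2, mat_inv(V1))             # V2 V1^{-1}
print(all(L[i][j] == 0 for i in range(3) for j in range(i + 1, 3)))  # lower triangular
```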

Remark 18. Let G = {x_0, x_1, . . . , x_{n−1}} be an additive subgroup of F_{2^r}. Consider the coset l + G, l ∉ G, having elements y_j = l + x_j, j = 0, . . . , n − 1. If V_1 = vand(x_0, x_1, . . . , x_{n−1}) and V_2 = vand(l + x_0, l + x_1, . . . , l + x_{n−1}), then V_1^{−1}V_2 is an involutory MDS matrix by Corollary 9. We will call this construction the Vandermonde based construction of type 3. Note the similarity of the x_i's and y_j's with those of the Cauchy based construction of type 3 using Lemma 3.4.

In [53], the authors defined Special Vandermonde matrices, which were restated differently but equivalently in [21] as follows.

Definition 4.5. [21] Let G be an additive subgroup {x_0, x_1, . . . , x_{2^n−1}} of F_{2^r} of order 2^n which is the linear span of n linearly independent elements {x_1, x_2, x_{2^2}, . . . , x_{2^{n−1}}} such that x_i = Σ_{k=0}^{n−1} b_k x_{2^k}, where (b_{n−1}, . . . , b_1, b_0) is the binary representation of i. A Vandermonde matrix vand(y_0, y_1, . . . , y_{2^n−1}) is called a Special Vandermonde matrix if y_i = l + x_i.

In [53, Corollary 2], the authors provided a construction of Hadamard involutory MDS matrices using Special Vandermonde matrices, which was generalized in [21, Lemma 5]. We restate it in the following lemma.

Lemma 4.6. [21, Lemma 5] Let V_1 = vand(x_0, x_1, . . . , x_{2^n−1}) and V_2 = vand(y_0, y_1, . . . , y_{2^n−1}) be Special Vandermonde matrices in F_{2^r}, where y_i = x_0 + y_0 + x_i and y_0 ∉ {x_0, x_1, . . . , x_{2^n−1}}. Then V_1^{−1}V_2 is a Hadamard involutory MDS matrix.

The proof of Corollary 2 of [53] was several pages long. The authors of [21] proposed an alternative and much simpler proof of the above lemma [21, Corollary 8], and we will provide another proof in Section 5, Corollary 11. We will call the construction of Lemma 4.6 the Vandermonde based construction of type 4. Note that the Cauchy based construction of type 4 using Lemma 3.8 also provides a Hadamard MDS matrix, but it may not be involutory.

Example 7. Let α be a primitive element of F_{2^4} whose constructing polynomial is x^4 + x + 1. Let G = {0, α, α^3, α + α^3} and let y_0 = α^2. Therefore x_0 + y_0 = α^2. Consider the matrices V_1 = vand(0, α, α^3, α + α^3) and V_2 = vand(α^2, α + α^2, α^3 + α^2, α + α^3 + α^2). The matrix

    V_1^{−1}V_2 = [ α^3 + α      α^3 + α^2    α^2 + α      1            ]
                  [ α^3 + α^2    α^3 + α      1            α^2 + α      ]
                  [ α^2 + α      1            α^3 + α      α^3 + α^2    ]
                  [ 1            α^2 + α      α^3 + α^2    α^3 + α      ]

is a Hadamard involutory MDS matrix.

Remark 19. Lemma 3.8 provides a Hadamard MDS matrix; Lemma 4.6 provides a Hadamard MDS matrix which is also involutory. For the sake of efficiency, the Hadamard involutory MDS matrix used in the Anubis block cipher [4] was constructed by a search method. The authors of [63] discussed the construction of Hadamard MDS matrices by search methods in detail. To reduce the search space they defined equivalence classes of Hadamard matrices in terms of branch number. In [44], the authors discussed the similarity between equivalence classes of Hadamard matrices and equivalence classes of circulant matrices. We will discuss equivalence classes of circulant matrices in Section 6.

Theorem 4.7. Let M be an MDS matrix and D be a nonsingular diagonal matrix. Then DMD^{−1} is also an MDS matrix. If M^2 = cI for some constant c, then (DMD^{−1})^2 = cI.

Proof. From Corollary 1, DMD^{−1} is MDS. Let B = DMD^{−1}; then B^2 = DMD^{−1}DMD^{−1} = DM^2D^{−1} = D(cI)D^{−1} = cI.
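A short sketch (ours) of the conjugation idea behind Theorem 4.7 and the GHadamard matrices discussed next: conjugating the involutory Hadamard MDS matrix of Example 3/4 by a nonsingular diagonal matrix preserves both the MDS and the involutory properties. It assumes the gf_tools.py helpers from the earlier sketches; the particular diagonal d is an arbitrary choice of ours.

```python
# Sketch (ours): Theorem 4.7 — if M is MDS and M^2 = cI, then for any
# nonsingular diagonal D, D M D^{-1} is MDS and (D M D^{-1})^2 = cI.
# Here M is the involutory Hadamard MDS matrix (1/(a+1)) A of Example 3/4.
from functools import reduce
from gf_tools import gf_mul, gf_inv, is_mds   # assumed helper module

POLY, R = 0b10011, 4
G = [0b0000, 0b0010, 0b1000, 0b1010]
l, c_inv = 0b0100, gf_inv(0b0011, POLY, R)    # l = a^2, c = a + 1
M = [[gf_mul(c_inv, gf_inv(l ^ xi ^ xj, POLY, R), POLY, R) for xj in G] for xi in G]

def mat_mul(A, B):
    n = len(A)
    return [[reduce(lambda s, k: s ^ gf_mul(A[i][k], B[k][j], POLY, R),
                    range(n), 0) for j in range(n)] for i in range(n)]

d = [0b0010, 0b0111, 0b0001, 0b1101]          # arbitrary nonzero diagonal entries
D = [[d[i] if i == j else 0 for j in range(4)] for i in range(4)]
Dinv = [[gf_inv(d[i], POLY, R) if i == j else 0 for j in range(4)] for i in range(4)]
B = mat_mul(mat_mul(D, M), Dinv)

I = [[int(i == j) for j in range(4)] for i in range(4)]
print(is_mds(B, POLY, R), mat_mul(B, B) == I)  # True True
```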

In [48], the authors proposed a new form of matrices, which they call generalized Hadamard (GHadamard) matrices, and provided several efficient constructions. If H is a Hadamard matrix, then DHD^{−1} is called a GHadamard matrix, where D is a nonsingular matrix. The underlying idea of their construction is provided in Theorem 4.7. Note that H is involutory if and only if DHD^{−1} is involutory.

Remark 20. In this section we discussed the type 1, type 2, type 3 and type 4 Vandermonde based constructions, given by Theorem 4.2, Corollary 9, Remark 18 and Lemma 4.6 respectively. Recall Remark 11 for the type 1, type 2, type 3 and type 4 Cauchy based constructions. Suppose M and V are the MDS matrices from a Cauchy based construction and the corresponding Vandermonde based construction respectively. In the next section we will show that they are related by D_1MD_2 = V, where D_1 and D_2 are nonsingular diagonal matrices.

5. Interconnection between Vandermonde based construction and Cauchy based construction
Till now we have discussed four types of MDS matrix constructions using Cauchy and Vandermonde matrices. In this section we provide a nontrivial interconnection between Cauchy based constructions and Vandermonde based constructions. Suppose $x_0, x_1, \ldots, x_{n-1}$ and $y_0, y_1, \ldots, y_{n-1}$ are $2n$ distinct elements of $\mathbb{F}_{2^r}$ such that $x_i + y_j \neq 0$ for all $0 \le i, j \le n-1$. Consider the matrices $V_1 = vand(x_0, x_1, \ldots, x_{n-1})$, $V_2 = vand(y_0, y_1, \ldots, y_{n-1})$ and $M = (m_{i,j})$, where $m_{i,j} = \frac{1}{x_i + y_j}$. Then we know that $V_1^{-1}V_2$, $V_2^{-1}V_1$ and $M$ are MDS matrices.
Now we will prove that the type 1 Vandermonde based construction is equivalent to the type 1 Cauchy based construction; type 2, type 3 and type 4 are just special cases. Note that Gupta et al. [21, Theorem 5] proved the equivalence for the type 4 construction, which is a particular case of Theorem 5.1.
Theorem 5.1. Suppose $V_1$, $V_2$ and $M$ are as defined above and $V_1^{-1} = (b_{i,j})$, $0 \le i, j \le n-1$. Then $D_1 M D_2 = V_1^{-1}V_2$, where

$$D_1 = diag(b_{0,n-1}, b_{1,n-1}, b_{2,n-1}, \ldots, b_{n-1,n-1})$$

and
$$D_2 = diag\Big(\prod_{k=0}^{n-1}(x_k + y_0),\ \prod_{k=0}^{n-1}(x_k + y_1),\ \prod_{k=0}^{n-1}(x_k + y_2),\ \ldots,\ \prod_{k=0}^{n-1}(x_k + y_{n-1})\Big).$$
Proof. Consider the polynomial
$$P_i(x) = b_{i,0} + b_{i,1}x + b_{i,2}x^2 + \ldots + b_{i,n-1}x^{n-1} = \sum_{k=0}^{n-1} b_{i,k}x^k$$
whose coefficients are the elements of the $i$-th row of $V_1^{-1}$. Consider the $(i,j)$-th element of $V_1^{-1}V_2$:
(3)  $$(V_1^{-1}V_2)_{i,j} = \sum_{k=0}^{n-1} b_{i,k} \cdot (V_2)_{k,j} = \sum_{k=0}^{n-1} b_{i,k} \cdot y_j^k = P_i(y_j).$$
We will prove that $(D_1 M D_2)_{i,j} = (V_1^{-1}V_2)_{i,j}$. Now

$$(D_1 M D_2)_{i,j} = (D_1)_{i,i} \cdot m_{i,j} \cdot (D_2)_{j,j} = b_{i,n-1} \cdot \frac{1}{x_i + y_j} \cdot \prod_{k=0}^{n-1}(x_k + y_j)$$
(4)  $$= b_{i,n-1}(x_0 + y_j)(x_1 + y_j)\ldots(x_{i-1} + y_j)(x_{i+1} + y_j)\ldots(x_{n-1} + y_j).$$
The $i$-th row of $V_1^{-1}V_1$ is
$$\begin{pmatrix} b_{i,0} & b_{i,1} & \ldots & b_{i,n-1} \end{pmatrix} \cdot V_1 = \begin{pmatrix} \sum_{k=0}^{n-1} b_{i,k}x_0^k & \sum_{k=0}^{n-1} b_{i,k}x_1^k & \ldots & \sum_{k=0}^{n-1} b_{i,k}x_{n-1}^k \end{pmatrix} = \begin{pmatrix} P_i(x_0) & P_i(x_1) & \ldots & P_i(x_{n-1}) \end{pmatrix}.$$
As $V_1^{-1}V_1 = I$, we have $P_i(x_i) = 1$ and $P_i(x_j) = 0$ for $i \neq j$, i.e. $x_0, x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{n-1}$ are the roots of $P_i(x)$. Therefore

(5)  $$P_i(x) = b_{i,n-1}(x + x_0)(x + x_1)\ldots(x + x_{i-1})(x + x_{i+1})\ldots(x + x_{n-1}).$$

From Equation 4 and Equation 5, we have $(D_1 M D_2)_{i,j} = P_i(y_j)$, and from Equation 3 we have $(V_1^{-1}V_2)_{i,j} = P_i(y_j)$. Therefore $(D_1 M D_2)_{i,j} = (V_1^{-1}V_2)_{i,j}$.
Remark 21. In the type 2 construction, $V_1^{-1}V_2$ is involutory but $M$ is not. However, we can make it involutory as $D_1 M D_2$, where $D_1$ and $D_2$ are the two nonsingular diagonal matrices defined in Theorem 5.1. For example, consider $x_0 = 0$, $x_1 = \alpha^4$, $x_2 = \alpha^8$ and $y_i = \alpha + x_i$ over $\mathbb{F}_{2^4}$ whose constructing polynomial is $x^4 + x + 1$, with $\alpha$ a primitive element. Then
$$V_1^{-1}V_2 = \begin{pmatrix} \alpha^3+\alpha^2+1 & \alpha^3+\alpha^2 & \alpha^3+\alpha^2 \\ \alpha^2 & \alpha^2+1 & \alpha^2 \\ \alpha^3 & \alpha^3 & \alpha^3+1 \end{pmatrix}$$
is an involutory MDS matrix. But the Cauchy matrix
$$M = \begin{pmatrix} \frac{1}{\alpha} & \frac{1}{\alpha+\alpha^4} & \frac{1}{\alpha+\alpha^8} \\ \frac{1}{\alpha+\alpha^4} & \frac{1}{\alpha} & \frac{1}{\alpha+\alpha^4+\alpha^8} \\ \frac{1}{\alpha+\alpha^8} & \frac{1}{\alpha+\alpha^4+\alpha^8} & \frac{1}{\alpha} \end{pmatrix} = \begin{pmatrix} \alpha^3+1 & 1 & \alpha^2+\alpha \\ 1 & \alpha^3+1 & \alpha^3+\alpha^2+1 \\ \alpha^2+\alpha & \alpha^3+\alpha^2+1 & \alpha^3+1 \end{pmatrix}$$
is not involutory.


Now
$$V_1^{-1} = \begin{pmatrix} 1 & \alpha^2+1 & \alpha^3 \\ 0 & \alpha^3+1 & \alpha^3+\alpha^2 \\ 0 & \alpha^3+\alpha^2 & \alpha^2 \end{pmatrix}.$$
Therefore $D_1 = diag(c_0, c_1, c_2)$ and $D_2 = diag(d_0, d_1, d_2)$, where $c_0 = \alpha^3$, $c_1 = \alpha^3+\alpha^2$, $c_2 = \alpha^2$, $d_0 = \alpha \cdot (\alpha + \alpha^4) \cdot (\alpha + \alpha^8) = \alpha^3+\alpha^2+\alpha$, $d_1 = (\alpha + \alpha^4) \cdot \alpha \cdot (\alpha + \alpha^4 + \alpha^8) = \alpha^3$ and $d_2 = (\alpha + \alpha^8) \cdot (\alpha + \alpha^4 + \alpha^8) \cdot \alpha = \alpha^3+\alpha^2+1$. Now it is easy to check that $D_1 M D_2 = V_1^{-1}V_2$. Therefore the generalized Cauchy matrix $D_1 M D_2$ is involutory.
The Cauchy based construction of type 3 provides a compact MDS matrix (see Lemma 3.5). In Corollary 10 we will show that the Vandermonde based construction of type 3 also provides a compact MDS matrix. To prove this we need the following lemmas. Let $\{x_0, x_1, \ldots, x_{n-1}\}$ be an additive subgroup $G$ of $\mathbb{F}_{2^r}$ where $x_0 = 0$, let $V_1 = vand(x_0, x_1, \ldots, x_{n-1})$ and
$$V_1^{-1} = \begin{pmatrix} b_{0,0} & b_{0,1} & \ldots & b_{0,n-1} \\ b_{1,0} & b_{1,1} & \ldots & b_{1,n-1} \\ \vdots & \vdots & & \vdots \\ b_{n-1,0} & b_{n-1,1} & \ldots & b_{n-1,n-1} \end{pmatrix}, \quad \text{where } b_{i,j} \in \mathbb{F}_{2^r},$$
and let $\gamma$ be the product of all nonzero elements of $G$, i.e. $\gamma = \prod_{i=1}^{n-1} x_i$.
Lemma 5.2. Let $V_1$ and $\gamma$ be as defined above. Then $\det(V_1) = \gamma^{\frac{n}{2}}$.

Proof. $\det(V_1) = \prod_{k<l}(x_k + x_l)$. Since the $x_i$'s form the additive group $G$, every nonzero element $g \in G$ occurs as $x_k + x_l$ for exactly $\frac{n}{2}$ pairs $k < l$. Therefore $\det(V_1) = \prod_{g \in G \setminus \{0\}} g^{\frac{n}{2}} = \gamma^{\frac{n}{2}}$.
Lemma 5.3. Let $V_1$, $V_1^{-1} = (b_{i,j})$ and $\gamma$ be as defined above. Then $b_{i,n-1} = \frac{1}{\gamma}$ for $0 \le i \le n-1$.

Proof. Let $i \in \{0, 1, \ldots, n-1\}$ be arbitrary. Then $b_{i,n-1} = \frac{\det(V_1')}{\det(V_1)}$, where
$$V_1' = \begin{pmatrix} 1 & 1 & 1 & \ldots & 1 & 1 & \ldots & 1 \\ x_0 & x_1 & x_2 & \ldots & x_{i-1} & x_{i+1} & \ldots & x_{n-1} \\ x_0^2 & x_1^2 & x_2^2 & \ldots & x_{i-1}^2 & x_{i+1}^2 & \ldots & x_{n-1}^2 \\ \vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\ x_0^{n-2} & x_1^{n-2} & x_2^{n-2} & \ldots & x_{i-1}^{n-2} & x_{i+1}^{n-2} & \ldots & x_{n-1}^{n-2} \end{pmatrix} = vand(x_0, x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{n-1}).$$
Therefore $\det(V_1') = \prod_{k<l;\, k,l \neq i}(x_k + x_l) = \frac{\det(V_1)}{\prod_{k \neq i}(x_k + x_i)} = \frac{\det(V_1)}{\gamma}$, since $\{x_k + x_i : k \neq i\}$ is exactly the set of nonzero elements of $G$. Hence $b_{i,n-1} = \frac{\det(V_1')}{\det(V_1)} = \frac{1}{\gamma}$.

Corollary 10. Let $\{x_0, x_1, \ldots, x_{n-1}\}$ be an additive subgroup $G$ of $\mathbb{F}_{2^r}$ of order $n$ where $x_0 = 0$, and let $y_i = l + x_i$ for $0 \le i \le n-1$, where $l \notin G$. Let $V_1 = vand(x_0, x_1, \ldots, x_{n-1})$ and $V_2 = vand(y_0, y_1, \ldots, y_{n-1})$. Then $V_1^{-1}V_2$ is a compact involutory MDS matrix.


Proof. From Theorem 5.1, we know $V_1^{-1}V_2 = D_1 M D_2$, where $D_1$ and $D_2$ are the nonsingular diagonal matrices defined in Theorem 5.1. From Lemma 5.3, we have $D_1 = diag(\frac{1}{\gamma}, \frac{1}{\gamma}, \ldots, \frac{1}{\gamma})$. Since the $x_j$'s form the additive subgroup $G$, the sums $x_i + x_j$ for $i = 0, 1, \ldots, n-1$ give all $n$ distinct elements of $G$ for a fixed $j$. Thus $x_i + y_j = l + x_i + x_j$ for $i = 0, 1, \ldots, n-1$ gives all $n$ distinct elements of $l + G$. Therefore $\prod_{k=0}^{n-1}(x_k + y_0) = \prod_{k=0}^{n-1}(x_k + y_1) = \ldots = \prod_{k=0}^{n-1}(x_k + y_{n-1}) = d$ for some $d$, i.e. $D_2 = diag(d, d, \ldots, d)$. Since $M$ is a compact MDS matrix, $D_1 M D_2$ is a compact MDS matrix. Again by Corollary 9, $V_1^{-1}V_2$ is involutory. Therefore $V_1^{-1}V_2$ is a compact involutory MDS matrix.
Note that in [21] it was proved that the MDS matrices obtained from the Vandermonde based construction of type 4 are involutory and Hadamard. In the following corollary we prove this in a different way.

Corollary 11. Suppose $\{x_0, x_1, \ldots, x_{n-1}\}$ is an additive subgroup $G$ of $\mathbb{F}_{2^r}$ of order $n$ such that $x_i + x_j = x_{i \oplus j}$, and let $y_i = l + x_i$ for $0 \le i \le n-1$, where $l \notin G$. Let $V_1 = vand(x_0, x_1, \ldots, x_{n-1})$ and $V_2 = vand(y_0, y_1, \ldots, y_{n-1})$. Then $V_1^{-1}V_2$ is a Hadamard involutory MDS matrix.
Proof. As in the proof of Corollary 10, we obtain $D_1 = diag(\frac{1}{\gamma}, \frac{1}{\gamma}, \ldots, \frac{1}{\gamma})$ and $D_2 = diag(d, d, \ldots, d)$. Since the matrix $M$ constructed in the Cauchy based construction of type 4 is MDS and Hadamard, $D_1 M D_2$ remains MDS and Hadamard. Again by Corollary 9, $V_1^{-1}V_2$ is involutory. Therefore $V_1^{-1}V_2$ is a Hadamard involutory MDS matrix.
We now compare all the known Vandermonde based constructions with their corresponding Cauchy based constructions in Table 1. Let $x_0, x_1, \ldots, x_{n-1}$ and $y_0, y_1, \ldots, y_{n-1}$ be $2n$ distinct elements of $\mathbb{F}_{2^r}$ such that $x_i + y_j \neq 0$ for all $0 \le i, j \le n-1$. Then the matrices $V_1^{-1}V_2$, $V_2^{-1}V_1$ and $M$ are MDS, where $V_1 = vand(x_0, x_1, \ldots, x_{n-1})$, $V_2 = vand(y_0, y_1, \ldots, y_{n-1})$ and $M = (m_{i,j})$ with $m_{i,j} = \frac{1}{x_i + y_j}$.

Table 1. Comparison between Vandermonde and Cauchy based constructions of MDS matrices over a finite field

Construction Type | Vandermonde based construction ($V_1^{-1}V_2$ and $V_2^{-1}V_1$) | Cauchy based construction ($M$)

Type 1 (no extra condition): Vandermonde -- 1. Need not be involutory, 2. Need not be Hadamard, 3. Need not be compact. Cauchy -- 1. Need not be involutory, 2. Need not be Hadamard, 3. Need not be compact.

Type 2 ($y_i = l + x_i$, where $l$ is an arbitrary nonzero element of $\mathbb{F}_{2^r}$): Vandermonde -- 1. Involutory and equal, 2. Need not be Hadamard, 3. Need not be compact. Cauchy -- 1. Need not be involutory, whereas $D_1 M D_2$ is involutory for some nonsingular diagonal matrices $D_1$ and $D_2$ (see Remark 21), 2. Need not be Hadamard, 3. Need not be compact.

Type 3 ($x_i$'s are the elements of an additive subgroup $G = \{x_0, x_1, \ldots, x_{n-1}\}$ of $\mathbb{F}_{2^r}$ of order $n$ and $l \notin G$): Vandermonde -- 1. Involutory and equal, 2. Need not be Hadamard, 3. Compact. Cauchy -- 1. Need not be involutory, whereas $\frac{1}{c}M$ is involutory, where $c$ is the sum of the elements of any row, 2. Need not be Hadamard, 3. Compact.

Type 4 ($x_i$'s are the elements of an additive subgroup $G = \{x_0, x_1, \ldots, x_{n-1}\}$ of $\mathbb{F}_{2^r}$ of order $n$ such that $x_i + x_j = x_{i \oplus j}$ and $l \notin G$): Vandermonde -- 1. Involutory and equal, 2. Hadamard, 3. Compact. Cauchy -- 1. Need not be involutory, whereas $\frac{1}{c}M$ is involutory, where $c$ is the sum of the elements of any row, 2. Hadamard, 3. Compact.


Till now we have discussed Cauchy and Vandermonde based constructions. In these methods the constructed matrices are themselves MDS, so they are direct nonrecursive constructions. One design goal for an MDS matrix is that it should have the maximum number of 1's and the minimum number of distinct elements [34]. The minimum number of distinct elements in the Vandermonde and Cauchy based constructions is the dimension of the matrix. Next we consider circulant matrices, where the number of distinct elements can be even smaller.

6. Constructing MDS matrices from circulant matrices and their variants
To the best of our knowledge, till date there is no known method that provides a circulant matrix of arbitrary order which is MDS by construction itself. However, there are constructions of circulant MDS matrices based on search. Such matrices find applications in lightweight cryptography mainly due to the repetition of their entries. For instance, in the AES [15] diffusion matrix, which is circulant, there are two 1's in its first row, and 1, being the multiplicative identity, has an implementational advantage since multiplication by 1 requires no processing at all. Though search methods provide efficient MDS matrices of moderate order over a moderate size search space, they fail for higher orders and large search spaces. Note that the MDS matrix in AES [15] was found by a search method. In this section we mainly discuss ideas from [11, 15, 23, 24, 44].
We start with the definition of a circulant matrix, which is a special kind of matrix where each row vector is rotated one element to the right relative to the preceding row vector.
Definition 6.1. [49, page 290] The $n \times n$ matrix of the form
$$C = \begin{pmatrix} x_0 & x_1 & x_2 & \ldots & x_{n-1} \\ x_{n-1} & x_0 & x_1 & \ldots & x_{n-2} \\ \vdots & \vdots & \vdots & & \vdots \\ x_1 & x_2 & x_3 & \ldots & x_0 \end{pmatrix}$$
is called a circulant matrix and will be denoted by $Circ(x_0, \ldots, x_{n-1})$.
It can be checked that the $(i,j)$-th entry of $Circ(x_0, \ldots, x_{n-1})$ can be expressed as $(C)_{i,j} = x_{(j-i) \bmod n}$. In AES [15], the circulant MDS matrix used is $Circ(\alpha, 1+\alpha, 1, 1)$, where $\alpha$ is a root of $x^8 + x^4 + x^3 + x + 1$. In [36], the $8 \times 8$ circulant MDS matrix used is $Circ(1, 1, \alpha^2, 1, \alpha^3, 1+\alpha^2, \alpha, 1+\alpha^3)$, where $\alpha$ is a root of $x^{16} + x^5 + x^3 + x^2 + 1$. There are several advantages of using a circulant matrix in a diffusion layer:
1. It has a higher probability of yielding an MDS matrix as compared to a random square matrix [14].
2. It has at most $n$ distinct entries, and in addition it can be MDS and contain repeated lightweight entries, which tends to have lower implementation cost as compared to matrices like Hadamard and Cauchy matrices that must have at least $n$ distinct entries in order to be MDS.
3. It has the flexibility to be implemented in both round-based and serialized implementations [44].
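The following small sketch (not from the paper) illustrates Definition 6.1: it builds $Circ(x_0, \ldots, x_{n-1})$ from its first row and checks the index formula $(C)_{i,j} = x_{(j-i) \bmod n}$. Entries are kept symbolic as strings, since the property is purely structural; the helper name `circ` is ours.

```python
def circ(first_row):
    n = len(first_row)
    # row i is the first row rotated i positions to the right
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

x = ["x0", "x1", "x2", "x3"]
C = circ(x)
for row in C:
    print(row)
# ['x0', 'x1', 'x2', 'x3']
# ['x3', 'x0', 'x1', 'x2']
# ['x2', 'x3', 'x0', 'x1']
# ['x1', 'x2', 'x3', 'x0']

assert all(C[i][j] == x[(j - i) % 4] for i in range(4) for j in range(4))
```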


Definition 6.2. An index permutation σ on an ordered set {c0, c1, . . . , cn−1 } is a permutation that permutes the index elements.

For example, let σ be an index permutation on an ordered set {c0, c1, c2, c3, c4}, where σ(i) = 4 − i, the resultant ordered set will be {c4, c3, c2, c1, c0}. In the following lemma we point out an important property of circulant matrices. Lemma 6.3. [49] The product of two circulant matrices is a circulant matrix. Also the inverse and transpose of a circulant matrix are circulant. A circulant matrix can also be written as a polynomial in some suitable permu- tation matrix. So we have the following proposition.

Proposition 2. [49, page 290] An $n \times n$ circulant matrix $A = Circ(x_0, \ldots, x_{n-1})$ can be written in the form $A = x_0 I + x_1 P + x_2 P^2 + \ldots + x_{n-1}P^{n-1}$, where $P = Circ(0, 1, 0, \ldots, 0)$.
Note that the circulant MDS matrix $Circ(\alpha, 1+\alpha, 1, 1)$ used in the AES MixColumn operation has elements of low Hamming weight, but the number of 1's in this matrix is 8. In [34], Junod et al. showed that the maximum number of 1's in a $4 \times 4$ MDS matrix is 9, and towards this they constructed a new class of efficient MDS matrices whose submatrices were circulant matrices. In [24], Gupta et al. formalized this new type of matrices as Type-I circulant-like matrices and carried out an elaborate study of such matrices towards the construction of efficient and perfect diffusion layers. Here we provide the definition of Type-I circulant-like matrices from [24, 34].
Definition 6.4 (Type-I circulant-like matrix). [24, 34] The $n \times n$ matrix
$$\begin{pmatrix} a & \mathbf{1} \\ \mathbf{1}^T & A \end{pmatrix}$$
is called a Type-I circulant-like matrix, where $A = Circ(1, x_1, \ldots, x_{n-2})$, $\mathbf{1} = (\underbrace{1, \ldots, 1}_{n-1 \text{ times}})$, 1 is the unit element, and the $x_i$'s and $a$ are any nonzero elements of the underlying field other than 1. This matrix is denoted as $TypeI(a, Circ(1, x_1, \ldots, x_{n-2}))$.
Recall that the inverse of an MDS matrix is of interest in the case of SPN networks. In [24], it was observed that the inverses of Type-I circulant-like matrices are almost of the same form. Towards this we have the following definition.
Definition 6.5 (AlmostType-I circulant-like matrix). [24] The $n \times n$ matrix
$$\begin{pmatrix} a & \mathbf{b} \\ \mathbf{b}^T & A \end{pmatrix}$$
is called an AlmostType-I circulant-like matrix, where $A = Circ(x_0, x_1, \ldots, x_{n-2})$, $\mathbf{b} = (\underbrace{b, \ldots, b}_{n-1 \text{ times}})$, and $a$, $b$ and the $x_i$'s are any elements of the underlying field. This matrix is denoted as $AlmostTypeI(a, b, Circ(x_0, \ldots, x_{n-2}))$.
Example 8. Consider the Type-I circulant-like matrix $A = TypeI(\alpha, Circ(1, 1 + \alpha + \alpha^{-1}, \alpha))$ over $\mathbb{F}_{2^8}$ whose constructing polynomial is $x^8 + x^7 + x^6 + x^5 + x^4 + x^3 + 1$ and $\alpha$ is a root of that polynomial. Then the inverse of $A$ is an AlmostType-I circulant-like matrix, where $A^{-1} = AlmostTypeI(\alpha^7+\alpha^6+\alpha^5+\alpha^4+\alpha^3+\alpha^2+1,\ \alpha,\ Circ(1, \alpha^7+\alpha^6+\alpha^4, \alpha^7+\alpha^6+\alpha^4+\alpha^2+1))$.


It may be noted that involutory or orthogonal MDS matrices are desirable for SPN networks. It will be shown in Lemma 6.12 that circulant matrices cannot be both involutory and MDS. In Lemma 6.9 it will also be shown that a $2^n \times 2^n$ circulant matrix cannot be both MDS and orthogonal. Again, Lemma 6.13 to Lemma 6.16 show that Type-I circulant-like matrices are neither involutory nor orthogonal. Towards this, the authors in [24] introduced a new type of circulant-like MDS matrices which are involutory by construction. This construction was based on the scheme initially proposed in [69], where the authors considered the construction of $2n \times 2n$ involutory MDS matrices starting from some $n \times n$ submatrix which was an MDS matrix. In [24] the authors took the $n \times n$ submatrices as circulant MDS matrices and obtained a new type of circulant-like matrices. This leads to the following definition.

Definition 6.6 (Type-II circulant-like matrix). [24] The $2n \times 2n$ matrix
$$\begin{pmatrix} A & A^{-1} \\ A^3 + A & A \end{pmatrix}$$
is called a Type-II circulant-like matrix, where $A = Circ(x_0, \ldots, x_{n-1})$. This matrix is denoted as $TypeII(Circ(x_0, \ldots, x_{n-1}))$.

In [44], the authors propose a new type of matrices, called left-circulant matrices, which preserve the benefits of circulant matrices and have the potential of being involutory.

Definition 6.7. [44] A left-circulant matrix $L$ of order $n$ is a matrix where each subsequent row is a left rotation of the previous row. It is denoted as $l\text{-}Circ(x_0, x_1, \ldots, x_{n-1})$, where the $x_i$'s are the entries of the first row of the matrix.

It can be checked that the $(i,j)$-th entry of $L$ can be expressed as $(L)_{i,j} = x_{(i+j) \bmod n}$. For example, an $n \times n$ left-circulant matrix is
$$L = \begin{pmatrix} x_0 & x_1 & x_2 & \ldots & x_{n-2} & x_{n-1} \\ x_1 & x_2 & x_3 & \ldots & x_{n-1} & x_0 \\ x_2 & x_3 & x_4 & \ldots & x_0 & x_1 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ x_{n-1} & x_0 & x_1 & \ldots & x_{n-3} & x_{n-2} \end{pmatrix}.$$
MDS matrices of dimension $2^n \times 2^n$ are of special cryptographic interest. Note that in AES, a $2^2 \times 2^2$ MDS matrix is used. In MDS-AES of [47], the proposed matrix is of dimension $2^4 \times 2^4$. In Lemma 6.9 it is proved that a $2^n \times 2^n$ circulant matrix cannot be both MDS and orthogonal. In Lemma 6.8 and Corollary 12, we study two important properties of $2^n \times 2^n$ circulant MDS matrices, and using these results we prove Lemma 6.9.

Lemma 6.8. [24, Lemma 4] $Circ(x_0, x_1, \ldots, x_{2^n-1})^{2^n} = \big(\sum_{i=0}^{2^n-1} x_i^{2^n}\big) I$, where $x_0, \ldots, x_{2^n-1} \in \mathbb{F}_{2^r}$.


Proof. From Proposition 2, $Circ(x_0, x_1, \ldots, x_{2^n-1}) = x_0 I + x_1 P + x_2 P^2 + \ldots + x_{2^n-1}P^{2^n-1}$, where $P = Circ(0, 1, 0, \ldots, 0)$ is a $2^n \times 2^n$ matrix. So
$$\begin{aligned} Circ(x_0, x_1, \ldots, x_{2^n-1})^{2^n} &= (x_0 I + x_1 P + x_2 P^2 + \ldots + x_{2^n-1}P^{2^n-1})^{2^n} \\ &= x_0^{2^n} I + x_1^{2^n} P^{2^n} + x_2^{2^n} (P^2)^{2^n} + \ldots + x_{2^n-1}^{2^n} (P^{2^n-1})^{2^n} \\ &= (x_0^{2^n} + x_1^{2^n} + x_2^{2^n} + \ldots + x_{2^n-1}^{2^n}) I. \end{aligned}$$
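A quick numeric check of Lemma 6.8 (not from the paper) for $n = 2$, i.e. a $4 \times 4$ circulant over GF($2^4$) with constructing polynomial $x^4 + x + 1$: $Circ(x_0, x_1, x_2, x_3)^4$ should equal $(x_0^4 + x_1^4 + x_2^4 + x_3^4) I$. The sample entries `xs` and all helper names are our own choices.

```python
POLY, DEG = 0b10011, 4                     # x^4 + x + 1

def gmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << DEG):
            a ^= POLY
    return r

def gpow(a, e):
    r = 1
    for _ in range(e):
        r = gmul(r, a)
    return r

def circ(row):
    n = len(row)
    return [[row[(j - i) % n] for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0
            for k in range(n):
                acc ^= gmul(A[i][k], B[k][j])     # addition in GF(2^r) is XOR
            C[i][j] = acc
    return C

xs = [0b0010, 0b0111, 0b0001, 0b1100]             # arbitrary nonzero field elements
A = circ(xs)
A4 = matmul(matmul(A, A), matmul(A, A))           # A^4

c = 0
for x in xs:
    c ^= gpow(x, 4)                               # x0^4 + x1^4 + x2^4 + x3^4

assert A4 == [[c if i == j else 0 for j in range(4)] for i in range(4)]
print("Circ(xs)^4 == (sum xi^4) * I holds for xs =", xs)
```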

Remark 22. If $\sum_{i=0}^{2^n-1} x_i^{2^n} = 1$, then $Circ(x_0, x_1, \ldots, x_{2^n-1})^{2^n} = I$.
Corollary 12. [24, Corollary 1] $\det(Circ(x_0, x_1, \ldots, x_{2^n-1})) = \sum_{i=0}^{2^n-1} x_i^{2^n}$, where $x_0, x_1, \ldots, x_{2^n-1} \in \mathbb{F}_{2^r}$.
Proof. Let $A = Circ(x_0, x_1, \ldots, x_{2^n-1})$ and $\det(A) = \delta$. Then $\delta^{2^n} = (\det(A))^{2^n} = \det(A^{2^n})$. From Lemma 6.8, $A^{2^n} = \big(\sum_{i=0}^{2^n-1} x_i^{2^n}\big) I$. So $\delta^{2^n} = \det\big(\big(\sum_{i=0}^{2^n-1} x_i^{2^n}\big) I\big) = \big(\sum_{i=0}^{2^n-1} x_i^{2^n}\big)^{2^n}$. Therefore $\delta = \sum_{i=0}^{2^n-1} x_i^{2^n}$.
Lemma 6.9. [24, Lemma 5] For $n \ge 2$, any $2^n \times 2^n$ circulant orthogonal matrix over $\mathbb{F}_{2^r}$ is non MDS.

Proof. Let $A = Circ(a_0, a_1, \ldots, a_{2^n-1})$ be an orthogonal matrix, where $a_0, \ldots, a_{2^n-1} \in \mathbb{F}_{2^r}$. Let the row vectors of $A$ be $R_0, R_1, \ldots, R_{2^n-1}$, where $R_0 = (a_0, a_1, \ldots, a_{2^n-1})$ and $R_i$ is obtained by rotating $R_{i-1}$ one element to the right. Since $A$ is orthogonal, $R_i \cdot R_j = 0$ whenever $i \neq j$. Consider the cases $R_0 \cdot R_j = 0$ for $j \in \{2k+1 : k = 0, \ldots, 2^{n-2}-1\}$, which give the following $2^{n-2}$ equations:
$$\sum_{i=0}^{2^n-1} a_i a_{i+1} = 0, \quad \sum_{i=0}^{2^n-1} a_i a_{i+3} = 0, \quad \sum_{i=0}^{2^n-1} a_i a_{i+5} = 0, \quad \ldots, \quad \sum_{i=0}^{2^n-1} a_i a_{i+2^{n-1}-1} = 0,$$
where suffixes are computed modulo $2^n$. Adding these equations, we get $\sum_{i,j} a_{2i}a_{2j+1} = (a_0 + a_2 + a_4 + \ldots + a_{2^n-2})(a_1 + a_3 + a_5 + \ldots + a_{2^n-1}) = 0$.
Note that $A$ has a $2^{n-1} \times 2^{n-1}$ submatrix $Circ(a_0, a_2, a_4, \ldots, a_{2^n-2})$ formed by the 0th, 2nd, 4th, ..., $(2^n-2)$th rows and the 0th, 2nd, 4th, ..., $(2^n-2)$th columns. From Corollary 12, $\det(Circ(a_0, a_2, a_4, \ldots, a_{2^n-2})) = a_0^{2^{n-1}} + a_2^{2^{n-1}} + a_4^{2^{n-1}} + \ldots + a_{2^n-2}^{2^{n-1}} = (a_0 + a_2 + a_4 + \ldots + a_{2^n-2})^{2^{n-1}}$.
Similarly, $A$ has a $2^{n-1} \times 2^{n-1}$ submatrix $Circ(a_1, a_3, a_5, \ldots, a_{2^n-1})$ formed by the 0th, 2nd, 4th, ..., $(2^n-2)$th rows and the 1st, 3rd, 5th, ..., $(2^n-1)$th columns, and $\det(Circ(a_1, a_3, a_5, \ldots, a_{2^n-1})) = a_1^{2^{n-1}} + a_3^{2^{n-1}} + a_5^{2^{n-1}} + \ldots + a_{2^n-1}^{2^{n-1}} = (a_1 + a_3 + a_5 + \ldots + a_{2^n-1})^{2^{n-1}}$.
Now $\sum_{i,j} a_{2i}a_{2j+1} = (a_0 + a_2 + a_4 + \ldots + a_{2^n-2})(a_1 + a_3 + a_5 + \ldots + a_{2^n-1}) = 0$, which implies that at least one of these submatrices is singular. So $A$ is non MDS.
Remark 23. Lemma 6.9 is a slightly modified version of [24, Lemma 5]: the original did not require $n \ge 2$. We observe that the result is not true for matrices of order 2. For example, consider the matrix $A = \begin{pmatrix} \alpha & 1+\alpha \\ 1+\alpha & \alpha \end{pmatrix}$, where $\alpha$ is a primitive element of $\mathbb{F}_{2^4}$ whose constructing polynomial is $x^4 + x + 1$. Then it is easy to check that the matrix $A$ is circulant, MDS and orthogonal. With this example it may be checked that similar errors were present in the original versions of Lemma 6.12, Theorem 6.19, Theorem 7.2 and Theorem 7.3.


Remark 24. Although $2^n \times 2^n$ circulant MDS matrices cannot be orthogonal, circulant matrices of other orders may be orthogonal. For example, let the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$ be the constructing polynomial of $\mathbb{F}_{2^8}$; then the $3 \times 3$ matrix $Circ(\alpha, 1+\alpha^2+\alpha^3+\alpha^4+\alpha^6, \alpha+\alpha^2+\alpha^3+\alpha^4+\alpha^6)$ and the $6 \times 6$ matrix $Circ(1, 1, \alpha, 1+\alpha^2+\alpha^3+\alpha^5+\alpha^6+\alpha^7, \alpha+\alpha^5, \alpha^2+\alpha^3+\alpha^6+\alpha^7)$ are orthogonal.
In Lemma 6.12 we show that circulant matrices cannot be both involutory and MDS. Before that we study two useful properties in Lemma 6.10 and Lemma 6.11.

Lemma 6.10. [24, Lemma 7] Let $A = Circ(x_0, x_1, \ldots, x_{2n-1})$ be a $2n \times 2n$ circulant matrix, where $x_0, \ldots, x_{2n-1} \in \mathbb{F}_{2^r}$. Then
$$A^2 = Circ(x_0^2 + x_n^2,\ 0,\ x_1^2 + x_{n+1}^2,\ 0,\ \ldots,\ x_{n-1}^2 + x_{2n-1}^2,\ 0).$$
Proof. From Proposition 2,

$$A = x_0 I + x_1 P + x_2 P^2 + \ldots + x_{2n-1}P^{2n-1},$$
where $P = Circ(0, 1, 0, \ldots, 0)$ is a $2n \times 2n$ matrix. So

$$\begin{aligned} A^2 &= x_0^2 I + x_1^2 P^2 + x_2^2 P^4 + \ldots + x_{2n-1}^2 P^{2(2n-1)} \\ &= (x_0^2 I + x_n^2 P^{2n}) + (x_1^2 P^2 + x_{n+1}^2 P^{2n+2}) + \ldots + (x_{n-1}^2 P^{2(n-1)} + x_{2n-1}^2 P^{2(2n-1)}) \\ &= (x_0^2 + x_n^2) I + (x_1^2 + x_{n+1}^2) P^2 + \ldots + (x_{n-1}^2 + x_{2n-1}^2) P^{2n-2} \\ &= Circ(x_0^2 + x_n^2,\ 0,\ x_1^2 + x_{n+1}^2,\ 0,\ \ldots,\ x_{n-1}^2 + x_{2n-1}^2,\ 0). \end{aligned}$$

Lemma 6.11. [24, Lemma 8] Let $A = Circ(x_0, x_1, \ldots, x_{2n})$ be a $(2n+1) \times (2n+1)$ circulant matrix, where $x_0, \ldots, x_{2n} \in \mathbb{F}_{2^r}$. Then
$$A^2 = Circ(x_0^2, x_{n+1}^2, x_1^2, x_{n+2}^2, \ldots, x_{n-1}^2, x_{2n}^2, x_n^2).$$
Proof. From Proposition 2, $A$ can be written as

$$A = x_0 I + x_1 P + x_2 P^2 + \ldots + x_{2n}P^{2n},$$
where $P = Circ(0, 1, 0, \ldots, 0)$ is a $(2n+1) \times (2n+1)$ matrix. So

$$\begin{aligned} A^2 &= x_0^2 I + x_1^2 P^2 + x_2^2 P^4 + \ldots + x_{2n}^2 P^{2(2n)} \\ &= x_0^2 I + x_{n+1}^2 P^{(2n+1)+1} + x_1^2 P^2 + x_{n+2}^2 P^{(2n+1)+3} + x_2^2 P^4 + \ldots + x_{n-1}^2 P^{2n-2} + x_{2n}^2 P^{(2n+1)+(2n-1)} + x_n^2 P^{2n} \\ &= x_0^2 I + x_{n+1}^2 P + x_1^2 P^2 + x_{n+2}^2 P^3 + \ldots + x_{n-1}^2 P^{2n-2} + x_{2n}^2 P^{2n-1} + x_n^2 P^{2n} \\ &= Circ(x_0^2, x_{n+1}^2, x_1^2, x_{n+2}^2, \ldots, x_{n-1}^2, x_{2n}^2, x_n^2). \end{aligned}$$

Remark 25. For any $A = Circ(a_0, \ldots, a_{2^n-1})$ with $\sum_{i=0}^{2^n-1} a_i = 1$, from Remark 22 we have $A^{2^n} = I$. So $A^{-1} = A^{2^n-1} = \prod_{k=0}^{n-1} A^{2^k}$. Also note that matrices of the form $A^{2^k}$ for $k > 0$ are efficient, as most of their elements are zero. So the InvMixColumn operation can be implemented as a simple preprocessing step of multiplication by $A^2 \times A^4 \times \ldots \times A^{2^{n-1}}$ followed by the MixColumn step. For example, when $n = 2$, $A^{-1} = A \times A^2$.
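A minimal check of Remark 25 (not from the paper) with the AES MixColumn matrix $A = Circ(0x02, 0x03, 0x01, 0x01)$ over GF($2^8$), constructing polynomial $x^8 + x^4 + x^3 + x + 1$ (0x11B): the entries sum to 1, so $A^4 = I$ and $A^{-1} = A \cdot A^2$. The helper names are ours.

```python
def gmul(a, b, poly=0x11B, deg=8):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << deg):
            a ^= poly
    return r

def circ(row):
    n = len(row)
    return [[row[(j - i) % n] for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] ^= gmul(A[i][k], B[k][j])
    return C

A = circ([0x02, 0x03, 0x01, 0x01])
A2 = matmul(A, A)                  # Circ(x0^2+x2^2, 0, x1^2+x3^2, 0) by Lemma 6.10
Ainv = matmul(A, A2)               # candidate inverse A * A^2

I = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
assert matmul(A, Ainv) == I        # so A^4 = I and A^{-1} = A * A^2
print("first row of A^{-1}:", [hex(v) for v in Ainv[0]])   # 0x0e, 0x0b, 0x0d, 0x09
```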


Remark 26. In AES [15], the MDS matrix used in the MixColumn operation is $M = Circ(\alpha, 1+\alpha, 1, 1)$, where $\alpha$ is a root of $x^8 + x^4 + x^3 + x + 1$. Barreto observed that in the InvMixColumn operation [15] of decryption, instead of $M^{-1}$, the product $M \times M^2 = Circ(\alpha, 1+\alpha, 1, 1) \times Circ(1+\alpha^2, 0, \alpha^2, 0)$ can be used for a more efficient implementation. This is a consequence of Lemma 6.8, Lemma 6.10 and Remark 25.
Lemma 6.12. [24, Lemma 9] Circulant involutory matrices of order $n \ge 3$ over $\mathbb{F}_{2^r}$ are non MDS.

Proof. Let $A = Circ(x_0, x_1, \ldots, x_{2n-1})$ be a $2n \times 2n$ involutory circulant matrix. Then $A^2 = I$. But from Lemma 6.10, $A^2 = Circ(x_0^2 + x_n^2,\ 0,\ x_1^2 + x_{n+1}^2,\ 0,\ \ldots,\ x_{n-1}^2 + x_{2n-1}^2,\ 0)$. So clearly $x_1^2 + x_{n+1}^2 = 0$. But $A$ has a $2 \times 2$ submatrix $Circ(x_1, x_{n+1})$ which can be obtained from the 0th and $n$th rows and the 1st and $(n+1)$th columns of $A$, and $\det(Circ(x_1, x_{n+1})) = x_1^2 + x_{n+1}^2 = 0$. So $A$ is not MDS.
Again, for a $(2n+1) \times (2n+1)$ involutory circulant matrix $A = Circ(x_0, x_1, \ldots, x_{2n})$, from Lemma 6.11, $A^2 = Circ(x_0^2, x_{n+1}^2, x_1^2, x_{n+2}^2, \ldots, x_{2n}^2, x_n^2)$. But since $A$ is involutory, $A^2 = I$. So clearly $x_i = 0$ for all $i \in \{1, \ldots, 2n\}$. So $A$ is not MDS.
Remark 27. Over a field of odd characteristic, an even order involutory circulant matrix is not MDS, whereas an odd order involutory circulant matrix may be MDS [11].
In [24], it was proved that Type-I circulant-like MDS matrices of even order cannot be involutory or orthogonal, but the odd order case was not discussed. In this section we prove that Type-I circulant-like MDS matrices of odd order cannot be involutory or orthogonal either.
Lemma 6.13. [24, Lemma 6] Any $2n \times 2n$ Type-I circulant-like MDS matrix over $\mathbb{F}_{2^r}$ is not orthogonal.
Proof. Let $M = \begin{pmatrix} a & \mathbf{1} \\ \mathbf{1}^T & A \end{pmatrix}$, where $A = Circ(1, x_1, \ldots, x_{2n-2})$. Now $M \times M^T = \begin{pmatrix} a^2+1 & \mathbf{c} \\ \mathbf{c}^T & B \end{pmatrix}$, where $\mathbf{c} = (\underbrace{c, \ldots, c}_{2n-1 \text{ times}})$, $c = a + 1 + \sum_{i=1}^{2n-2} x_i$, $B = U + A \times A^T$ and $U = (u_{i,j})$ with $u_{i,j} = 1$ for $0 \le i, j \le 2n-2$. Suppose $M$ is orthogonal; then $M \times M^T = I$, which gives $a^2 + 1 = 1$ and hence $a = 0$. Thus $M$ is not MDS.
In the following lemma we prove that there is no orthogonal Type-I circulant-like matrix of odd order as well.

Lemma 6.14. Any $(2n+1) \times (2n+1)$ Type-I circulant-like matrix over $\mathbb{F}_{2^r}$ is not orthogonal.
Proof. Let $M = \begin{pmatrix} a & \mathbf{1} \\ \mathbf{1}^T & A \end{pmatrix}$, where $A = Circ(1, x_1, \ldots, x_{2n-1})$. Now $M \times M^T = \begin{pmatrix} a^2 & \mathbf{c} \\ \mathbf{c}^T & B \end{pmatrix}$, where $\mathbf{c} = (\underbrace{c, \ldots, c}_{2n \text{ times}})$, $c = a + 1 + \sum_{i=1}^{2n-1} x_i$, $B = U + A \times A^T$ and $U = (u_{i,j})$ is a $2n \times 2n$ matrix with $u_{i,j} = 1$. Suppose $M$ is orthogonal; then $M \times M^T = I$, which gives $a^2 = 1$ and hence $a = 1$.
Now $c = a + 1 + \sum_{i=1}^{2n-1} x_i = \sum_{i=1}^{2n-1} x_i$. Again, as $M$ is orthogonal, we have $c = 0$, which implies that $\sum_{i=1}^{2n-1} x_i = 0$. Therefore $\sum_{i=1}^{2n-1} x_i^2 = 0$, so $(AA^T)_{0,0} = 1 + \sum_{i=1}^{2n-1} x_i^2 = 1$. Therefore $(MM^T)_{1,1} = (B)_{0,0} = (U)_{0,0} + (AA^T)_{0,0} = 1 + 1 = 0$,


which is a contradiction. Therefore $M$ cannot be orthogonal.

In [24], it was proved that there is no involutory Type-I circulant-like matrix of even order. Therefore we have the following lemma from [24].

Lemma 6.15. [24, Lemma 10] Any $2n \times 2n$ Type-I circulant-like matrix over $\mathbb{F}_{2^r}$ cannot be involutory.
Proof. Let $M = \begin{pmatrix} a & \mathbf{1} \\ \mathbf{1}^T & A \end{pmatrix}$, where $A = Circ(1, x_1, \ldots, x_{2n-2})$. Now $M^2 = \begin{pmatrix} a^2+1 & \mathbf{c} \\ \mathbf{c}^T & B \end{pmatrix}$, where $\mathbf{c} = (\underbrace{c, \ldots, c}_{2n-1 \text{ times}})$, $c = a + 1 + \sum_{i=1}^{2n-2} x_i$, $B = U + A^2$ and $U = (u_{i,j})$ is the $(2n-1) \times (2n-1)$ matrix with $u_{i,j} = 1$. Since $(A^2)_{0,0} = 1$, we have $(M^2)_{1,1} = (B)_{0,0} = (U)_{0,0} + (A^2)_{0,0} = 1 + 1 = 0$, and so $M^2 \neq I$. Hence $M$ is not involutory.

Remark 28. In Lemma 6.15, if $n = 2$ and $A = Circ(1, b, a)$, then
$$M^2 = \begin{pmatrix} a^2+1 & b+1 & b+1 & b+1 \\ b+1 & 0 & 1+a^2 & 1+b^2 \\ b+1 & 1+b^2 & 0 & 1+a^2 \\ b+1 & 1+a^2 & 1+b^2 & 0 \end{pmatrix} \neq I.$$
So $M$ is not involutory. When $a = \alpha$ and $b = 1 + \alpha^{-1}$, where $\alpha$ is a root of the constructing polynomial $x^8 + x^7 + x^6 + x^5 + x^4 + x^3 + 1$ of $\mathbb{F}_{2^8}$, we get the matrix $M$ which is used in the block cipher FOX64 [35].
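A quick numeric check of Remark 28 (not from the paper), carried out over GF($2^4$) with constructing polynomial $x^4 + x + 1$ instead of the FOX64 field: $M = TypeI(a, Circ(1, b, a))$ is squared and compared against the pattern given in the remark. The values of `a` and `b` are arbitrary samples, and the helper names are ours.

```python
POLY, DEG = 0b10011, 4

def gmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << DEG):
            a ^= POLY
    return r

def matmul(A, B):
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] ^= gmul(A[i][k], B[k][j])
    return C

a, b = 0b0010, 0b0110                       # sample entries
M = [[a, 1, 1, 1],
     [1, 1, b, a],
     [1, a, 1, b],
     [1, b, a, 1]]                          # TypeI(a, Circ(1, b, a))

a2, b2 = gmul(a, a), gmul(b, b)
expected = [[a2 ^ 1, b ^ 1, b ^ 1, b ^ 1],
            [b ^ 1, 0, 1 ^ a2, 1 ^ b2],
            [b ^ 1, 1 ^ b2, 0, 1 ^ a2],
            [b ^ 1, 1 ^ a2, 1 ^ b2, 0]]

assert matmul(M, M) == expected             # matches Remark 28, and clearly M^2 != I
print("M^2 has zero diagonal entries in rows 1..3, so M is not involutory")
```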

In the following lemma we also prove that there is no involutory Type-I circulant- like matrix of odd order.

Lemma 6.16. Any $(2n+1) \times (2n+1)$ Type-I circulant-like matrix over $\mathbb{F}_{2^r}$ cannot be involutory.
Proof. Let $M = \begin{pmatrix} a & \mathbf{1} \\ \mathbf{1}^T & A \end{pmatrix}$, where $A = Circ(1, x_1, \ldots, x_{2n-1})$. Now $M^2 = \begin{pmatrix} a^2 & \mathbf{c} \\ \mathbf{c}^T & B \end{pmatrix}$, where $\mathbf{c} = (\underbrace{c, \ldots, c}_{2n \text{ times}})$, $c = a + 1 + \sum_{i=1}^{2n-1} x_i$, $B = U + A^2$ and $U = (u_{i,j})$ is the $2n \times 2n$ matrix with $u_{i,j} = 1$. Since $A$ is a circulant matrix of even order, by Lemma 6.10 we have $(A^2)_{0,1} = 0$. Therefore $(M^2)_{1,2} = (B)_{0,1} = (U)_{0,1} + (A^2)_{0,1} = 1$. Therefore $M^2 \neq I$. Hence $M$ is not involutory.

Regarding the possibility of constructing involutory MDS matrices from Type-II circulant-like matrices, Gupta et al. [24] showed that Type-II circulant-like matrices are always involutory.

Lemma 6.17. [24, Lemma 11] Type-II circulant-like matrices over $\mathbb{F}_{2^r}$ are involutory.
Proof. Let $A$ be an $n \times n$ circulant matrix and $M = \begin{pmatrix} A & A^{-1} \\ A^3 + A & A \end{pmatrix}$ be a Type-II circulant-like matrix. Now
$$M^2 = \begin{pmatrix} A^2 + A^{-1}(A^3 + A) & AA^{-1} + A^{-1}A \\ (A^3 + A)A + A(A^3 + A) & (A^3 + A)A^{-1} + A^2 \end{pmatrix} = \begin{pmatrix} A^2 + A^2 + I_{n \times n} & 0 \\ 0 & A^2 + I_{n \times n} + A^2 \end{pmatrix} = I_{2n \times 2n}.$$
Hence $M$ is involutory.
Example 9. Consider the Type-II circulant-like matrix $A = TypeII(Circ(\alpha, 1, 1 + \alpha^2))$ over $\mathbb{F}_{2^8}$ whose constructing polynomial is $x^8 + x^4 + x^3 + x + 1$, where $\alpha$ is a root of the constructing polynomial. Then $A$ is an involutory MDS matrix.
In [69], the authors considered the construction of $2n \times 2n$ MDS matrices starting from some random $n \times n$ MDS matrix as a submatrix. They were unable to obtain any MDS matrix by random search for $n = 4$. The authors of [24] proved that whenever the $n \times n$ submatrix is a circulant MDS matrix and $n$ is even, the corresponding $2n \times 2n$ matrix is non MDS. So we have the following lemma from [24].

Lemma 6.18. [24, Lemma 12] Any 2n × 2n Type-II circulant-like matrix over F2r is non MDS for even values of n.

Proof. Let $n = 2d$ and $A = Circ(a_0, a_1, \ldots, a_{2d-1})$. From Lemma 6.10,
$$A^2 = Circ(a_0^2 + a_d^2,\ 0,\ a_1^2 + a_{d+1}^2,\ 0,\ \ldots,\ a_{d-1}^2 + a_{2d-1}^2,\ 0).$$
Let $b_i = a_i^2 + a_{d+i}^2$ for $i = 0, \ldots, d-1$, so $A^2 = Circ(b_0, 0, b_1, 0, \ldots, b_{d-1}, 0)$. Now $A^3 = A \times A^2 = Circ(a_0, a_1, \ldots, a_{2d-1}) \times Circ(b_0, 0, b_1, 0, \ldots, b_{d-1}, 0) = Circ(e_0, e_1, \ldots, e_{2d-1})$, where $e_{2k} = \sum_{i=0}^{d-1} a_{2i} b_{d-i+k}$ and $e_{2k+1} = \sum_{i=0}^{d-1} a_{2i+1} b_{d-i+k}$ for $k \in \{0, 1, \ldots, d-1\}$, the suffixes of the $a_i$'s being computed modulo $2d$ and the suffixes of the $b_j$'s modulo $d$. Let $C_0, \ldots, C_{2d-1}$ be the column vectors of $A$, where $C_0 = (a_0, a_{2d-1}, a_{2d-2}, \ldots, a_1)^T$ and $C_i$ is obtained from $C_{i-1}$ by one shift vertically downward. Now, the first column of $A^3$ is
$$\begin{pmatrix} e_0 \\ e_{2d-1} \\ e_{2d-2} \\ \vdots \\ e_2 \\ e_1 \end{pmatrix} = \begin{pmatrix} \sum_{i=0}^{d-1} a_{2i} b_{d-i} \\ \sum_{i=0}^{d-1} a_{2i+1} b_{d-i+d-1} \\ \sum_{i=0}^{d-1} a_{2i} b_{d-i+d-1} \\ \vdots \\ \sum_{i=0}^{d-1} a_{2i} b_{d-i+1} \\ \sum_{i=0}^{d-1} a_{2i+1} b_{d-i} \end{pmatrix} = \begin{pmatrix} a_0 b_0 + a_2 b_{d-1} + a_4 b_{d-2} + \ldots + a_{2d-2} b_1 \\ a_{2d-1} b_0 + a_1 b_{d-1} + a_3 b_{d-2} + \ldots + a_{2d-3} b_1 \\ a_{2d-2} b_0 + a_0 b_{d-1} + a_2 b_{d-2} + \ldots + a_{2d-4} b_1 \\ \vdots \\ a_2 b_0 + a_4 b_{d-1} + a_6 b_{d-2} + \ldots + a_0 b_1 \\ a_1 b_0 + a_3 b_{d-1} + a_5 b_{d-2} + \ldots + a_{2d-1} b_1 \end{pmatrix}$$
$$= b_0 C_0 + b_{d-1} C_2 + b_{d-2} C_4 + \ldots + b_1 C_{2d-2}.$$
It is to be noted that the first column of $A^3$ can be written as a linear combination of $C_0, C_2, C_4, \ldots, C_{2d-2}$. So the first column of $A^3 + A$ is
$$(b_0 + 1)C_0 + b_{d-1}C_2 + b_{d-2}C_4 + \ldots + b_1 C_{2d-2}.$$


Let $M = TypeII(Circ(a_0, \ldots, a_{2d-1}))$ be a $4d \times 4d$ Type-II circulant-like matrix whose row vectors are $R_0, R_1, \ldots, R_{4d-1}$ and column vectors are $T_0, T_1, \ldots, T_{4d-1}$. Then the $(d+1) \times (d+1)$ submatrix of $M$ obtained from the rows $R_{2d}, R_{2d+1}, \ldots, R_{2d+d}$ and the columns $T_0, T_{2d}, T_{2d+2}, \ldots, T_{2d+2d-2}$ is singular. So from Theorem 2.4, $M$ is non MDS.

Remark 29. It is to be noted that circulant and Type-I circulant-like MDS matrices lack the involutory and orthogonal properties. So we focus on constructing such matrices for which the inverse can also be implemented efficiently. Note that although Type-II circulant-like matrices are always involutory, they are not MDS when their dimensions are of the form $2(2n) \times 2(2n)$. See also Table 2.
Now we briefly discuss left-circulant matrices. It is easy to check that a left-circulant matrix is symmetric, so if it is orthogonal then it is involutory and vice versa. Many properties of left-circulant matrices are similar to those of circulant matrices; in this context we provide a few properties through Proposition 3 and Proposition 4. In [44], these propositions were used to prove Theorem 6.19, but here we provide an alternative proof.
Proposition 3. [44, Proposition 4] The product of two left-circulant matrices is a circulant matrix.

Proof. Let A = l-Circ(x0, x1, ..., xn−1) and B = l-Circ(y0, y1, ..., yn−1) be two left- circulant matrices. Then the (i, j)-th entry of their product is

$$\sum_{k=0}^{n-1} (A)_{i,k} \cdot (B)_{k,j} = \sum_{k=0}^{n-1} x_{i+k} \cdot y_{k+j} = \sum_{k=0}^{n-1} x_k \cdot y_{k+(j-i)}.$$
This shows that $AB$ is a circulant matrix.
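The following structural sketch (not from the paper) illustrates Proposition 3. The identity only uses commutativity of the entries, so plain integers suffice for the illustration; the helper names are ours.

```python
def l_circ(row):
    n = len(row)
    return [[row[(i + j) % n] for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = l_circ([3, 1, 4, 1, 5])
B = l_circ([9, 2, 6, 5, 3])
C = matmul(A, B)

# circulant means the (i, j)-entry depends only on (j - i) mod n
n = len(C)
assert all(C[i][j] == C[0][(j - i) % n] for i in range(n) for j in range(n))
print("first row of the circulant product:", C[0])
```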

Proposition 4. [44, Proposition 5] For a $2^n \times 2^n$ left-circulant matrix $L = l\text{-}Circ(x_0, x_1, \ldots, x_{2^n-1})$ over $\mathbb{F}_{2^r}$, $L^{2^{n+1}} = \big(\sum_{i=0}^{2^n-1} x_i\big)^{2^{n+1}} I$ and $\det(L) = \big(\sum_{i=0}^{2^n-1} x_i\big)^{2^n}$.

Proof. By Proposition 3, $L^2$ is circulant with $(i,j)$-th entry $\sum_{k=0}^{2^n-1} x_k \cdot x_{k+(j-i)}$, and hence
$$(L^2)^{2^n} = \Big(\sum_{i=0}^{2^n-1} \sum_{k=0}^{2^n-1} x_k \cdot x_{k+i}\Big)^{2^n} I = \Big(\big(\sum_{k=0}^{2^n-1} x_k\big)^2\Big)^{2^n} I = \Big(\sum_{k=0}^{2^n-1} x_k\Big)^{2^{n+1}} I,$$

which also implies that $\det(L) = \big(\sum_{k=0}^{2^n-1} x_k\big)^{2^n}$.
Remark 30. From Lemma 4, we know that if $A$ is an MDS matrix then for any permutation matrix $P$, $PA$ is also an MDS matrix. Also from Remark 24, we know that there may exist a circulant MDS matrix $A = Circ(x_0, x_1, \ldots, x_{n-1})$ over $\mathbb{F}_{2^r}$ which is orthogonal, where $n$ is not a power of 2. Now consider the permutation matrix
$$P = \begin{pmatrix} 1 & 0 & 0 & \ldots & 0 & 0 & 0 \\ 0 & 0 & 0 & \ldots & 0 & 0 & 1 \\ 0 & 0 & 0 & \ldots & 0 & 1 & 0 \\ 0 & 0 & 0 & \ldots & 1 & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\ 0 & 1 & 0 & \ldots & 0 & 0 & 0 \end{pmatrix}.$$


It is easy to check that $PA = l\text{-}Circ(x_0, x_1, \ldots, x_{n-1})$. The MDS property and orthogonality of $A$ are not disturbed by premultiplying with $P$. Therefore the matrix $PA = l\text{-}Circ(x_0, x_1, \ldots, x_{n-1})$ will be an orthogonal MDS matrix and hence an involutory MDS matrix.
In [24], it was proved that a circulant matrix cannot be both involutory and MDS and that a $2^n \times 2^n$ circulant matrix cannot be both MDS and orthogonal. Similarly, in [44, Theorem 4] it was proved that a $2^n \times 2^n$ left-circulant matrix cannot be both MDS and involutory (orthogonal). In the following theorem we provide an alternative and much simpler proof of Theorem 4 of [44]. For the original proof, readers are advised to go through [44].
Theorem 6.19. [44, Theorem 4] For $n \ge 2$, if $L$ is a $2^n \times 2^n$ left-circulant MDS matrix over $\mathbb{F}_{2^r}$, then $L$ is not involutory (orthogonal).

Proof. Assume that $L = l\text{-}Circ(x_0, x_1, \ldots, x_{2^n-1})$ is an involutory MDS matrix over $\mathbb{F}_{2^r}$. As $L$ is symmetric, it is also an orthogonal MDS matrix. It is easy to check that $PL = Circ(x_0, x_1, \ldots, x_{2^n-1})$, where $P$ is the permutation matrix of Remark 30. Since $P$ is a permutation matrix, the matrix $PL = Circ(x_0, x_1, \ldots, x_{2^n-1})$ is also orthogonal and MDS, which contradicts Lemma 6.9.
Remark 31. Although $2^n \times 2^n$ left-circulant MDS matrices are not involutory, left-circulant MDS matrices of other orders may be involutory. For example, over $\mathbb{F}_{2^8}$ with constructing polynomial $x^8 + x^6 + x^5 + x^2 + 1$, the $5 \times 5$ matrix $l\text{-}Circ(1, \alpha, \alpha^7+\alpha^5+\alpha^4+\alpha+1, \alpha^7+\alpha^5+\alpha^4+\alpha^3+\alpha+1, \alpha^3+\alpha)$ is MDS and involutory, and the $6 \times 6$ matrix $l\text{-}Circ(1, 1, \alpha^7+\alpha^5+\alpha^4+\alpha+1, \alpha^5+\alpha^3+\alpha^2, \alpha^2, \alpha^7+\alpha^4+\alpha^3+\alpha)$ is MDS and involutory.
We finish this section by discussing an equivalence relation between circulant matrices. In [44], the authors provided an equivalence relation by which we can partition the $n!$ possible circulant matrices of order $n$ into $\frac{(n-1)!}{\phi(n)}$ equivalence classes, each containing $n\phi(n)$ circulant matrices having the same branch number. Here $\phi$ is Euler's totient function.
Definition 6.20. [44] Two matrices $M$ and $M'$ are called permutation-equivalent, denoted by $M \sim M'$, if there exist two permutation matrices $P$ and $Q$ such that $M' = PMQ$.

For example, the circulant matrix $Circ(x_0, x_1, \ldots, x_{n-1})$ is permutation-equivalent to $l\text{-}Circ(x_0, x_1, \ldots, x_{n-1})$.
Remark 32. It is easy to check that the relation $\sim$ is an equivalence relation, and from Fact 1, permutation-equivalent matrices have the same branch number.
Definition 6.21. An equivalence class of circulant matrices is a set of circulant matrices that are equivalent to each other with respect to the equivalence relation $\sim$.
In [44], Liu and Sim provided a necessary and sufficient condition for two circulant matrices to be permutation-equivalent. In this paper we record the lemma without its original proof; we will provide an alternative, more elementary proof. For the original proof, readers are requested to go through Lemma 1 of [44].

Lemma 6.22. [44, Lemma 1] Given two circulant matrices $C = Circ(x_0, x_1, \ldots, x_{n-1})$ and $C^\sigma = Circ(x_{\sigma(0)}, x_{\sigma(1)}, \ldots, x_{\sigma(n-1)})$, $C \sim C^\sigma$ if and only if $\sigma$ is some

index permutation satisfying $\sigma(i) = (bi + a) \bmod n$ for all $i \in \{0, 1, \ldots, n-1\}$, where $a, b \in \mathbb{Z}_n$ and $\gcd(b, n) = 1$.
Proof. If part: Suppose that $\sigma(i) = bi + a$ with $\gcd(b, n) = 1$. We have to show that $PCQ = C^\sigma$ for some permutation matrices $P$ and $Q$. We will construct three permutation matrices $P_1, P_2, Q_1$ such that $C^\sigma = P_1 P_2 C Q_1$. Note that the inverse of a permutation matrix $Q$ is $Q^T$ and the product of two permutation matrices is a permutation matrix. Construct the permutation matrix $Q_1$ whose $i$-th column has a 1 at the $(i \cdot b)$-th position for $0 \le i \le n-1$, and let $P_1 = Q_1^T = Q_1^{-1}$. Note that $Q_1$ is a permutation matrix since $\gcd(b, n) = 1$. Now
$$\begin{aligned} P_1 C Q_1 &= x_0 P_1 Q_1 + x_1 P_1 P Q_1 + x_2 P_1 P^2 Q_1 + \ldots + x_{n-1} P_1 P^{n-1} Q_1 \\ &= x_0 Q_1^T Q_1 + x_1 Q_1^T P Q_1 + x_2 Q_1^T P^2 Q_1 + \ldots + x_{n-1} Q_1^T P^{n-1} Q_1 \\ &= x_0 I + x_1 Q_1^T P Q_1 + x_2 (Q_1^T P Q_1)^2 + \ldots + x_{n-1}(Q_1^T P Q_1)^{n-1}. \end{aligned}$$
Since $Q_1$ and $Q_1^T$ are permutation matrices, it is easy to check that $Q_1^T P Q_1 = P^j$ for some $j$. Therefore
$$P_1 C Q_1 = x_0 I + x_1 P^j + x_2 P^{2j} + \ldots + x_{n-1} P^{(n-1)j}.$$

So $P_1 C Q_1$ is a circulant matrix. Moreover, it is easy to check that the first row of $P_1 C Q_1$ is $(x_0, x_b, x_{2b}, \ldots, x_{(n-1)b})$. Now consider the permutation matrix $P_2 = Circ(0, 0, \ldots, 0, \underbrace{1}_{(n-a)\text{-th position}}, 0, \ldots, 0)$. It is easy to check that $P_2 C = Circ(x_a, x_{a+1}, x_{a+2}, \ldots, x_{a+n-1})$ and hence $P_1 P_2 C Q_1 = Circ(x_a, x_{a+b}, x_{a+2b}, \ldots, x_{a+(n-1)b}) = C^\sigma$.
Only if part: Suppose that $C$ and $C^\sigma$ are two circulant matrices such that $C \sim C^\sigma$. Then there exist two permutation matrices $Q_1$ and $Q_2$ such that $C^\sigma = Q_1 C Q_2$. Since $C$ is a circulant matrix, by Proposition 2 we have
$$C = x_0 I + x_1 P + x_2 P^2 + \ldots + x_{n-1}P^{n-1},$$
where $(x_0, x_1, x_2, \ldots, x_{n-1})$ is the first row of $C$. Therefore
(6)  $$Q_1 C Q_2 = x_0 Q_1 Q_2 + x_1 Q_1 P Q_2 + x_2 Q_1 P^2 Q_2 + \ldots + x_{n-1} Q_1 P^{n-1} Q_2.$$

Comparing the positions of $x_i$ on both the L.H.S. and the R.H.S. of Equation 6, and since $Q_1 C Q_2$ is circulant, we have $Q_1 P^i Q_2 = P^{d_i}$ for some $d_i \ge 0$. So from $Q_1 Q_2 = P^{d_0}$ we get $Q_1 = P^{d_0} Q_2^{-1}$. Again,

$$Q_1 P Q_2 = P^{d_1} \implies P^{d_0} Q_2^{-1} P Q_2 = P^{d_1} \implies Q_2^{-1} P Q_2 = P^{d_1 - d_0}.$$
Let $b = (d_1 - d_0) \bmod n$, so $Q_2^{-1} P Q_2 = P^b$. Therefore

$$Q_1 P^i Q_2 = P^{d_0} Q_2^{-1} P^i Q_2 = P^{d_0} (Q_2^{-1} P Q_2)^i = P^{d_0} P^{bi} = P^{d_0 + bi}.$$

Therefore $C^\sigma = Q_1 C Q_2 = \sum_{i=0}^{n-1} x_i P^{d_0 + bi}$, which gives $\sigma(i) = bi + d_0$. Taking $d_0 = a$, we have $\sigma(i) = bi + a$. As $\sigma$ is a permutation on $\{0, 1, \ldots, n-1\}$, we must have $\gcd(b, n) = 1$.
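The following illustrative sketch (not from the paper) checks the "if" direction of Lemma 6.22 for one choice of parameters: for $\sigma(i) = (bi + a) \bmod n$ with $\gcd(b, n) = 1$, the permutation matrices built as in the proof map $Circ(x_0, \ldots, x_{n-1})$ to $Circ(x_{\sigma(0)}, \ldots, x_{\sigma(n-1)})$. Entries are symbolic strings, since the statement is purely about positions; all helper names are ours.

```python
from math import gcd

def circ(row):
    n = len(row)
    return [[row[(j - i) % n] for j in range(n)] for i in range(n)]

def perm_matrix(perm):                       # column i has a 1 at row perm[i]
    n = len(perm)
    return [[1 if perm[j] == i else 0 for j in range(n)] for i in range(n)]

def apply(P, A, Q):                          # P*A*Q for 0/1 permutation matrices P and Q
    n = len(A)
    rows = [next(j for j in range(n) if P[i][j]) for i in range(n)]   # P picks rows
    cols = [next(i for i in range(n) if Q[i][j]) for j in range(n)]   # Q picks columns
    return [[A[rows[i]][cols[j]] for j in range(n)] for i in range(n)]

n, a, b = 5, 2, 3
assert gcd(b, n) == 1
x = [f"x{i}" for i in range(n)]
C = circ(x)
C_sigma = circ([x[(b * i + a) % n] for i in range(n)])

I_n = [[int(i == j) for j in range(n)] for i in range(n)]
Q1 = perm_matrix([(i * b) % n for i in range(n)])    # column i has its 1 at position i*b
P1 = [list(col) for col in zip(*Q1)]                 # P1 = Q1^T
P2 = circ([1 if i == (n - a) % n else 0 for i in range(n)])

assert apply(P1, apply(P2, C, I_n), Q1) == C_sigma
print("P1 * P2 * C * Q1 == C^sigma for sigma(i) = (3*i + 2) mod 5")
```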


Remark 33. In an equivalence class defined by the relation $\sim$, all the circulant matrices have the same branch number. But it may be noted that different equivalence classes can have the same branch number. In [44], the authors proved that these are the most compact equivalence classes for circulant matrices in the sense of the equivalence classes defined in Definition 6.21. It seems interesting, but difficult, to define an equivalence relation such that different equivalence classes have different branch numbers and all MDS circulant matrices lie in one equivalence class.

7. Constructing MDS matrices from Toeplitz and Hankel matrices
Toeplitz matrices have a deep interconnection with circulant matrices, and interested readers may consult [18, 39] for more information about the connection. Recently, Toeplitz matrices have been used to construct MDS matrices by search techniques [55, 56], in a way that is similar to the constructions of circulant MDS matrices discussed in [24]. A Toeplitz matrix is a special kind of matrix in which every descending diagonal from left to right is constant.
Definition 7.1. The $n \times n$ matrix
$$A = \begin{pmatrix} a_0 & a_1 & a_2 & \ldots & a_{n-2} & a_{n-1} \\ a_{-1} & a_0 & a_1 & \ldots & a_{n-3} & a_{n-2} \\ a_{-2} & a_{-1} & a_0 & \ldots & a_{n-4} & a_{n-3} \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ a_{-(n-1)} & a_{-(n-2)} & a_{-(n-3)} & \ldots & a_{-1} & a_0 \end{pmatrix}$$
is called a Toeplitz matrix of order $n$.

A Toeplitz matrix is defined by its first row and first column. For instance, $\{a_0, a_1, \ldots, a_{n-1}, a_{-1}, a_{-2}, \ldots, a_{-(n-1)}\}$ defines the Toeplitz matrix $A$ of Definition 7.1, and in this paper we will use the notation $Toep(a_0, a_1, \ldots, a_{n-1};\ a_{-1}, a_{-2}, \ldots, a_{-(n-1)})$ to describe $A$. Also it can be checked that the $(i,j)$-th entry of $A$ can be expressed as $(A)_{i,j} = a_{j-i}$.

Theorem 7.2. [55, Theorem 1] Toeplitz matrices of order $n \ge 3$ over $\mathbb{F}_{2^r}$ cannot be both MDS and involutory.
Proof. Let $A$ be an $n \times n$ Toeplitz matrix, $n \ge 3$, which is both involutory and MDS.
Case 1. When $n$ is odd. The $(n-2)$-th element in the 0-th row of $A^2$ is
$$\begin{aligned} (A^2)_{0,n-2} &= A_{row(0)} \cdot A_{column(n-2)} \\ &= a_0 a_{n-2} + a_1 a_{n-3} + \ldots + a_{\frac{n-1}{2}} a_{\frac{n-3}{2}} + \ldots + a_{n-2} a_0 + a_{n-1} a_{-1} \\ &= a_{n-1} a_{-1}. \end{aligned}$$
Since $A$ is involutory,
$$a_{n-1} a_{-1} = 0,$$
which implies that $a_{n-1} = 0$ or $a_{-1} = 0$. This contradicts that $A$ is MDS.


Case 2. When $n$ is even.
$$\begin{aligned} (A^2)_{0,n-2} &= A_{row(0)} \cdot A_{column(n-2)} \\ &= a_0 a_{n-2} + a_1 a_{n-3} + \ldots + a_{\frac{n-2}{2}} a_{\frac{n-2}{2}} + \ldots + a_{n-2} a_0 + a_{n-1} a_{-1} \\ &= a_{\frac{n-2}{2}}^2 + a_{n-1} a_{-1}. \end{aligned}$$
Therefore, as $A$ is an involutory matrix, we have
$$a_{\frac{n-2}{2}}^2 + a_{n-1} a_{-1} = 0.$$
Consider the $2 \times 2$ submatrix of $A$ formed by the 0-th and $\frac{n}{2}$-th rows and the $\frac{n-2}{2}$-th and $(n-1)$-th columns,
$$T = \begin{pmatrix} a_{\frac{n-2}{2}} & a_{n-1} \\ a_{-1} & a_{\frac{n-2}{2}} \end{pmatrix},$$
which is singular. Therefore $A$ is not MDS.
Like circulant matrices, Toeplitz matrices of order $2^n$ cannot be both orthogonal and MDS.
Theorem 7.3. [55, Theorem 2] For $n \ge 2$, any $2^n \times 2^n$ Toeplitz orthogonal matrix over $\mathbb{F}_{2^r}$ is non MDS.
Proof. Suppose that $A$ is a Toeplitz matrix of order $2^n$ which is both orthogonal and MDS. Let $\delta_i$ be the $i$-th diagonal element of $AA^T$ for $i = 0, 1, \ldots, 2^n-1$. Then
$$\delta_i = \sum_{j=0}^{2^n-1} a_{j-i}^2 = 1 \quad \text{for } i = 0, 1, \ldots, 2^n-1.$$

Considering the pairs of equations ($\delta_i$ and $\delta_{i+1}$), we get
$$a_{-i} = a_{2^n-i} \quad \text{for } i = 1, \ldots, 2^n-1.$$
Therefore $A$ is indeed a circulant matrix, and hence from Lemma 6.9, $A$ cannot be MDS.
Remark 34. Although $2^n \times 2^n$ Toeplitz MDS matrices cannot be orthogonal, orthogonal Toeplitz matrices of other orders may exist. For example, let the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$ be the constructing polynomial of $\mathbb{F}_{2^8}$; then the $3 \times 3$ matrix $Toep(\alpha, 1+\alpha^2+\alpha^3+\alpha^4+\alpha^6, \alpha+\alpha^2+\alpha^3+\alpha^4+\alpha^6;\ \alpha+\alpha^2+\alpha^3+\alpha^4+\alpha^6, 1+\alpha^2+\alpha^3+\alpha^4+\alpha^6)$ and the $6 \times 6$ matrix $Toep(1, 1, \alpha, 1+\alpha^2+\alpha^3+\alpha^5+\alpha^6+\alpha^7, \alpha+\alpha^5, \alpha^2+\alpha^3+\alpha^6+\alpha^7;\ \alpha^2+\alpha^3+\alpha^6+\alpha^7, \alpha+\alpha^5, 1+\alpha^2+\alpha^3+\alpha^5+\alpha^6+\alpha^7, \alpha, 1)$ are orthogonal.
Now we introduce Hankel matrices, which are closely related to Toeplitz matrices and in which each ascending skew diagonal from left to right is constant.
Definition 7.4. The $n \times n$ matrix
$$H = \begin{pmatrix} a_0 & a_1 & a_2 & \ldots & a_{n-2} & a_{n-1} \\ a_1 & a_2 & a_3 & \ldots & a_{n-1} & a_n \\ a_2 & a_3 & a_4 & \ldots & a_n & a_{n+1} \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ a_{n-1} & a_n & a_{n+1} & \ldots & a_{2n-3} & a_{2n-2} \end{pmatrix}$$
is called a Hankel matrix.


Note that a left-circulant matrix is a special case of a Hankel matrix. A Hankel matrix is symmetric and is defined by its first row and last column. For instance, $\{a_0, a_1, \ldots, a_{n-1}, a_n, a_{n+1}, \ldots, a_{2n-2}\}$ defines the Hankel matrix $H$ of Definition 7.4, and in this paper we will use the notation $Hank(a_0, a_1, \ldots, a_{n-1};\ a_n, a_{n+1}, \ldots, a_{2n-2})$ to describe $H$. Also it can be checked that $(H)_{i,j} = a_{i+j}$. As a Hankel matrix is symmetric, an involutory (orthogonal) Hankel matrix is orthogonal (involutory).
Remark 35. From Lemma 4, we know that if $T$ is an MDS matrix then for any permutation matrix $P$, $PT$ is also an MDS matrix. Also from Remark 34, we know that there may exist a Toeplitz MDS matrix $T = Toep(a_0, a_1, \ldots, a_{n-2}, a_{n-1};\ a_{-1}, a_{-2}, \ldots, a_{-(n-1)})$ over $\mathbb{F}_{2^r}$ which is orthogonal, where $n$ is not a power of 2. Consider the permutation matrix
$$P = \begin{pmatrix} 0 & 0 & 0 & \ldots & 0 & 0 & 1 \\ 0 & 0 & 0 & \ldots & 0 & 1 & 0 \\ 0 & 0 & 0 & \ldots & 1 & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\ 1 & 0 & 0 & \ldots & 0 & 0 & 0 \end{pmatrix}.$$

Now it is easy to check that $PT = H$, where $H = Hank(a_{-(n-1)}, a_{-(n-2)}, \ldots, a_{-1}, a_0;\ a_1, a_2, \ldots, a_{n-1})$. The MDS property and orthogonality of $T$ are not disturbed by the multiplication with $P$. Therefore the matrix $H = Hank(a_{-(n-1)}, a_{-(n-2)}, \ldots, a_{-1}, a_0;\ a_1, a_2, \ldots, a_{n-1})$ will be an orthogonal MDS matrix, and so an involutory MDS matrix, which is preferable for reducing the hardware cost since the same circuit can be used both for encryption and decryption. We know that a Toeplitz matrix of order $n \ge 3$ is not involutory MDS and that an orthogonal $2^n \times 2^n$ Toeplitz matrix is not MDS. Similarly, we prove that an involutory (orthogonal) $2^n \times 2^n$ Hankel matrix is not MDS. You may recall Remark 30 and Theorem 6.19 for the analogous results relating circulant and left-circulant matrices. A small sketch illustrating the permutation $P$ appears below.
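A structural sketch (not from the paper) for Remark 35: reversing the row order of a Toeplitz matrix, i.e. multiplying by the permutation $P$ above, yields a Hankel matrix. Entries are symbolic strings indexed as in the text, $(T)_{i,j} = a_{j-i}$ and $(H)_{i,j} = h_{i+j}$; the helper names are ours.

```python
def toeplitz(n, a):                    # a maps offsets -(n-1)..(n-1) to symbols
    return [[a[j - i] for j in range(n)] for i in range(n)]

def hankel(n, h):                      # h maps 0..2n-2 to symbols
    return [[h[i + j] for j in range(n)] for i in range(n)]

n = 4
a = {k: f"a{k}" for k in range(-(n - 1), n)}
T = toeplitz(n, a)

PT = T[::-1]                           # pre-multiplying by the reversal permutation P

# PT should be Hank(a_{-(n-1)}, ..., a_{-1}, a_0; a_1, ..., a_{n-1})
h = {m: a[m - (n - 1)] for m in range(2 * n - 1)}
assert PT == hankel(n, h)
print("P * Toep(...) equals Hank(a_{-3}, a_{-2}, a_{-1}, a_0; a_1, a_2, a_3)")
```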

Theorem 7.5. For $n \ge 2$, if $H$ is a $2^n \times 2^n$ Hankel MDS matrix over $\mathbb{F}_{2^r}$, then $H$ is not involutory (orthogonal).

Proof. Assume that $H = Hank(a_0, a_1, \ldots, a_{2^n-1};\ a_{2^n}, a_{2^n+1}, \ldots, a_{2^{n+1}-2})$ is an involutory MDS matrix over $\mathbb{F}_{2^r}$. Then $H$ is an orthogonal MDS matrix. It is easy to check that $PH = T$, where $P$ is the permutation matrix defined in Remark 35 and $T = Toep(a_{2^n-1}, a_{2^n}, a_{2^n+1}, \ldots, a_{2^{n+1}-2};\ a_{2^n-2}, a_{2^n-3}, \ldots, a_1, a_0)$. Since $P$ is a permutation matrix and $H$ is an orthogonal MDS matrix, $T = PH$ is an orthogonal MDS matrix. Therefore $T$ is an orthogonal MDS matrix of order $2^n$, which contradicts Theorem 7.3. Hence $H$ cannot be involutory.

Remark 36. Although $2^n \times 2^n$ Hankel MDS matrices are not involutory, Hankel MDS matrices of other orders may be involutory. For example, over $\mathbb{F}_{2^8}$ with constructing polynomial $x^8 + x^6 + x^5 + x^2 + 1$, the $5 \times 5$ matrix $Hank(1, \alpha, \alpha^7+\alpha^5+\alpha^4+\alpha+1, \alpha^7+\alpha^5+\alpha^4+\alpha^3+\alpha+1, \alpha^3+\alpha;\ 1, \alpha, \alpha^7+\alpha^5+\alpha^4+\alpha+1, \alpha^7+\alpha^5+\alpha^4+\alpha^3+\alpha+1)$ is involutory and the $6 \times 6$ matrix $Hank(1, 1, \alpha^7+\alpha^5+\alpha^4+\alpha+1, \alpha^5+\alpha^3+\alpha^2, \alpha^2, \alpha^7+\alpha^4+\alpha^3+\alpha;\ 1, 1, \alpha^7+\alpha^5+\alpha^4+\alpha+1, \alpha^5+\alpha^3+\alpha^2, \alpha^2)$ is involutory.
Till now we have discussed nonrecursive constructions by direct methods as well as by search methods. From the next section onward we discuss recursive constructions by direct methods as well as by search methods. We close this section by providing


Table 2, which summarizes the involutory and orthogonal properties of circulant, circulant-like, left-circulant, Toeplitz and Hankel matrices.

Table 2. Several results on circulant, circulant-like, left-circulant, Toeplitz and Hankel matrices over a finite field

Type            Dimension                        Involutory MDS            Orthogonal MDS
Circulant       $2^n \times 2^n$                 do not exist              do not exist
                $2n \times 2n$                   do not exist              may exist (Remark 24)
                $(2n+1) \times (2n+1)$           do not exist              may exist (Remark 24)
Type-I          $2n \times 2n$                   do not exist              do not exist
                $(2n+1) \times (2n+1)$           do not exist              do not exist
Type-II         $2(2n) \times 2(2n)$             do not exist              do not exist
                $2(2n+1) \times 2(2n+1)$         may exist (Example 9)     may exist
left-Circulant  $2^n \times 2^n$                 do not exist              do not exist
                $2n \times 2n$                   may exist (Remark 31)     may exist (Remark 31)
                $(2n+1) \times (2n+1)$           may exist (Remark 31)     may exist (Remark 31)
Toeplitz        $2^n \times 2^n$                 do not exist              do not exist
                $2n \times 2n$                   do not exist              may exist (Remark 34)
                $(2n+1) \times (2n+1)$           do not exist              may exist (Remark 34)
Hankel          $2^n \times 2^n$                 do not exist              do not exist
                $2n \times 2n$                   may exist (Remark 36)     may exist (Remark 36)
                $(2n+1) \times (2n+1)$           may exist (Remark 36)     may exist (Remark 36)

Remark 37. There was an error in [24, Table 1], where it was stated that Type-II circulant-like orthogonal MDS matrices of order $2(2n) \times 2(2n)$ may exist. We have corrected this here in Table 2.

8. Recursive MDS matrices

We recall some definitions and notation for the sake of completeness. Let $\mathbb{F}_q$ denote the field containing $q$ elements for some prime power $q$, and let $\mathbb{F}_q[x]$ denote the polynomial ring over $\mathbb{F}_q$ in the indeterminate $x$. We denote the algebraic closure of $\mathbb{F}_q$ by $\bar{\mathbb{F}}_q$ and the multiplicative group by $\mathbb{F}_q^*$. Let the characteristic of $\mathbb{F}_q$ be $char(\mathbb{F}_q) = p$ for some prime $p$, which means $q = p^s$ for some positive integer $s$. Let $g(x) = a_0 + a_1 x + \ldots + a_{k-1}x^{k-1} + a_k x^k \in \mathbb{F}_q[x]$ with $a_k \neq 0$. Then the degree of $g$ is $k$, and we denote it by $\deg(g)$. The polynomial $g$ is said to be monic if $a_k = 1$. The weight of a polynomial is the number of its nonzero coefficients. The order of a polynomial $g(x)$ (with $g(0) \neq 0$) is the least positive integer $n$ such that $g(x)$ divides $x^n - 1$, and we denote it by $ord(g)$.

Definition 8.1. Let $\gamma$ be an element in some extension of $\mathbb{F}_q$. The minimal polynomial of $\gamma$ over $\mathbb{F}_q$, denoted by $Min_{\mathbb{F}_q}(\gamma)$, is the lowest degree monic polynomial $\mu(x) \in \mathbb{F}_q[x]$ such that $\mu(\gamma) = 0$.

Let $M_{k \times n}(\mathbb{F}_q)$ denote the set of all matrices of size $k \times n$ over $\mathbb{F}_q$. For simplicity, we use $M_k(\mathbb{F}_q)$ to denote the ring of all $k \times k$ matrices (square matrices of order $k$) over $\mathbb{F}_q$. Let $I_k$ denote the identity matrix of $M_k(\mathbb{F}_q)$. The determinant of a matrix $A \in M_k(\mathbb{F}_q)$ is denoted by $\det(A)$. A square matrix $A$ is said to be nonsingular if $\det(A) \neq 0$ or, equivalently, if the rows (columns) of $A$ are linearly independent over $\mathbb{F}_q$. Let us now present a few definitions.

Definition 8.2. Let $g(x) = a_0 + a_1 x + \ldots + a_{k-1}x^{k-1} + x^k \in \mathbb{F}_q[x]$ be a monic polynomial of degree $k$. The companion matrix $C_g \in M_k(\mathbb{F}_q)$ associated to the

polynomial $g$ is given by
$$C_g = \begin{pmatrix} 0 & 1 & 0 & \ldots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \ldots & \ldots & 1 \\ -a_0 & -a_1 & \ldots & \ldots & -a_{k-1} \end{pmatrix}.$$

We sometimes use the notation $Companion(-a_0, -a_1, \ldots, -a_{k-1})$ to represent the companion matrix $C_g$. Observe that if $a_0 \neq 0$ then the matrix $C_g$ is nonsingular and its inverse is given by
$$C_g^{-1} = \begin{pmatrix} \frac{-a_1}{a_0} & \frac{-a_2}{a_0} & \ldots & \frac{-a_{k-1}}{a_0} & \frac{-1}{a_0} \\ 1 & 0 & \ldots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \ldots & 1 & 0 \end{pmatrix}.$$

Observe that if $a_0$ is equal to 1 then the elements of the last row of $C_g$ and the elements of the first row of $C_g^{-1}$ are the same. In fact, in this case we have

(7)  $$C_g^{-1} = P C_g P,$$
where
$$P = \begin{pmatrix} 0 & 0 & \ldots & 0 & 1 \\ 1 & 0 & \ldots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \ldots & 1 & 0 \end{pmatrix}$$
is a permutation matrix. Let $V(\mathbf{x}, T)$ be the generalized Vandermonde matrix given by
$$V(\mathbf{x}, T) = \begin{pmatrix} x_1^{t_1} & x_1^{t_2} & \ldots & x_1^{t_k} \\ x_2^{t_1} & x_2^{t_2} & \ldots & x_2^{t_k} \\ \vdots & \vdots & & \vdots \\ x_k^{t_1} & x_k^{t_2} & \ldots & x_k^{t_k} \end{pmatrix},$$
for $\mathbf{x} = (x_1, \ldots, x_k) \in \mathbb{F}_q^k$ and $T = \{t_1, t_2, \ldots, t_k\} \subset \mathbb{Z}$ with $0 \le t_1 < t_2 < \ldots < t_k$. Note that the matrix $V(\mathbf{x}, T)$ is a Vandermonde matrix if $T = \{0, 1, \ldots, k-1\}$, and it is well known that
$$\det(V(\mathbf{x}, \{0, 1, \ldots, k-1\})) = \prod_{1 \le i < j \le k} (x_j - x_i).$$

Evidently, the matrix V (x, {0, 1, . . . , k − 1}) is nonsingular if and only if xi, 1 ≤ i ≤ k, are distinct.

Lemma 8.3. [38, Lemma 1] Let $T = \{0, 1, \ldots, k-1\}$ and $I = \{0, 1, \ldots, k-2, k\}$. Then
$$\det(V(\mathbf{x}, I)) = \det(V(\mathbf{x}, T)) \cdot \Big(\sum_{i=1}^{k} x_i\Big).$$
Next we present some useful structural results for cyclic codes and MDS codes.


Cyclic codes. We now recall some concepts from coding theory; for more details refer to [45]. A linear code $\Gamma$ of length $n$ and dimension $\ell$ over $\mathbb{F}_q$ is denoted as an $[n, \ell]_q$ code. If the minimum distance of $\Gamma$ is equal to $d$ then we denote it as an $[n, \ell, d]_q$ code. If $\Gamma$ is an $[n, \ell, d]_q$ code, then $n - \ell \ge d - 1$, and this bound is known as the Singleton bound. We say that a generator matrix $G$ of an $[n, \ell]_q$ code $\Gamma$, which is of size $\ell \times n$, is in systematic form when it contains (usually in the left most positions) the $\ell \times \ell$ identity matrix $I_\ell$. The redundancy part of $G$ is the $\ell \times (n-\ell)$ matrix next to $I_\ell$. For convenience, abusing the conventional notation, we place the identity matrix on the right side in our discussion.
An $[n, \ell]_q$ code is said to be cyclic if a cyclic shift of any element of the code remains in the code. In algebraic terms, cyclic codes can be seen as ideals of $\mathbb{F}_q[x]/(x^n-1)$. That is, each cyclic code $\Gamma$ of length $n$ can be seen as $\Gamma = \langle g(x) \rangle$, generated by some monic polynomial $g(x) \in \mathbb{F}_q[x]$. The polynomial $g(x)$ divides $x^n-1$ and is the unique monic polynomial of minimal degree in $\Gamma$. The polynomial $g(x)$ is said to be the generator polynomial of $\Gamma$, and the elements of the code $\Gamma$ are given by the multiples of $g(x)$ of degree less than $n$, i.e. the polynomials $f(x) \in \mathbb{F}_q[x]/(x^n-1)$ such that $g$ divides $f$. The code $\Gamma$ defined by $g$ then has dimension $\ell = n - \deg(g)$. A generator matrix of the code $\Gamma$ can be given by
$$G_1 = \underbrace{\begin{pmatrix} g(x) \\ xg(x) \\ \vdots \\ x^{n-\deg(g)-1}g(x) \end{pmatrix}}_{\text{size } n},$$
where the polynomials $x^i g(x)$ are treated as vectors of length $n$ formed by their coefficients (in increasing order of exponents). By the division algorithm, $x^i = q(x)g(x) + (x^i \bmod g(x))$. Therefore $x^i - (x^i \bmod g(x))$ is divisible by $g(x)$ and thus is a codeword. Let
(8)  $$G = \begin{pmatrix} -x^{\deg(g)} \bmod g(x) & 1 & 0 & 0 & \ldots & 0 \\ -x^{\deg(g)+1} \bmod g(x) & 0 & 1 & 0 & \ldots & 0 \\ \vdots & & & \ddots & & \\ -x^{n-1} \bmod g(x) & 0 & 0 & 0 & \ldots & 1 \end{pmatrix},$$
where each $-x^i \bmod g(x)$ is written as a coefficient row vector of length $\deg(g)$ and the right block is the identity matrix of size $n - \deg(g)$. One can see that $G$ is also a generator matrix of the code $\Gamma$ because its rows are linearly independent codewords of the code $\Gamma$.
Remark 38. If $\gcd(q, n) = 1$, then $x^n-1$ and its derivative $nx^{n-1}$ are relatively prime, and thus $x^n-1$ has no repeated roots. Therefore any polynomial $g(x)$ which divides $x^n-1$ must have distinct roots if $\gcd(q, n) = 1$.
We also have the following result, which is useful in the next subsections.

Lemma 8.4. [42, Theorem 9.42] Let $g(x) \in \mathbb{F}_q[x]$ be a monic polynomial of degree $k$ with $ord(g) = n \ge 2$. Suppose that $g$ has distinct roots, say $\lambda_1, \ldots, \lambda_k \in \bar{\mathbb{F}}_q$. Then $f(x) = \sum_{i=0}^{n-1} f_i x^i \in \mathbb{F}_q[x]/(x^n-1)$ is a codeword of $\Gamma = \langle g(x) \rangle$ if and only if the coefficient vector $(f_0, f_1, \ldots, f_{n-1})$ of $f$ is in the null space of the matrix
$$H = \begin{pmatrix} 1 & \lambda_1 & \lambda_1^2 & \ldots & \lambda_1^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \lambda_k & \lambda_k^2 & \ldots & \lambda_k^{n-1} \end{pmatrix},$$

or, in other words, $H$ is a parity check matrix of $\Gamma$.

Proof. If $f(x)$ is a codeword, then $g(x)$ divides $f(x)$. Therefore $f(\lambda_i) = 0$, that is, $f_0 + f_1\lambda_i + f_2\lambda_i^2 + \ldots + f_{n-1}\lambda_i^{n-1} = 0$ for $1 \le i \le k$. Thus $H \cdot [f_0\ f_1\ \ldots\ f_{n-1}]^T = 0$.
Conversely, let $f(x) = q(x)g(x) + r(x)$ where $\deg(r) < \deg(g) = k$. Since $f(\lambda_i) = g(\lambda_i) = 0$, we have $r(\lambda_i) = 0$ for $1 \le i \le k$. As $\deg(r) < k$, it cannot have $k$ roots unless it is the zero polynomial. Thus $r(x) = 0$ and $g(x)$ divides $f(x)$, which implies that $f(x)$ is a codeword.
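The following small sketch (not from the paper) illustrates Lemma 8.4 for the binary cyclic code generated by $g(x) = x^3 + x + 1$, which divides $x^7 - 1$: the roots of $g$ are $\alpha, \alpha^2, \alpha^4$ in GF($2^3$) (with $\alpha$ a root of $g$), and a length-7 binary vector is a codeword iff it is annihilated by $H = (\lambda_i^j)$. The helper names are ours.

```python
POLY, DEG = 0b1011, 3                         # x^3 + x + 1 defines GF(2^3)

def gmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << DEG):
            a ^= POLY
    return r

def gpow(a, e):
    r = 1
    for _ in range(e):
        r = gmul(r, a)
    return r

def poly_mul_mod2(f, g):                      # polynomials as GF(2) coefficient lists
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] ^= fi & gj
    return out

def evaluate(vec, lam):                       # f(lambda) computed in GF(2^3)
    acc = 0
    for j, c in enumerate(vec):
        if c:
            acc ^= gpow(lam, j)
    return acc

def in_null_space(vec, lambdas):
    return all(evaluate(vec, lam) == 0 for lam in lambdas)

alpha = 0b010
lambdas = [alpha, gpow(alpha, 2), gpow(alpha, 4)]     # the distinct roots of g
g = [1, 1, 0, 1]                                      # 1 + x + x^3

# every multiple q(x)g(x) with deg < 7 is a codeword, hence in the null space of H
for q in range(16):                                   # all q(x) of degree <= 3
    qcoeffs = [(q >> i) & 1 for i in range(4)]
    f = (poly_mul_mod2(qcoeffs, g) + [0] * 7)[:7]
    assert in_null_space(f, lambdas)

# a word that is not a multiple of g fails the check
assert not in_null_space([1, 0, 0, 0, 0, 0, 0], lambdas)
print("null space of H = cyclic code generated by g(x) = x^3 + x + 1 (length 7)")
```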

MDS matrices. The major application of MDS matrices in cryptography is in the design of linear diffusion layers in block ciphers and hash functions; they provide maximal diffusion. The input and output of the diffusion layer in those applications are generally of the same size, so we are interested only in square matrices which are MDS. Thus our interest is to construct $[n = 2k, \ell = k, d = k+1]_q$ codes for some positive integer $k$, from which we can get an MDS matrix of size $k \times k$ over $\mathbb{F}_q$.

8.1. Characterization of polynomials that yield recursive MDS matrices. A recursive MDS matrix is an MDS matrix which can be expressed as a power of some companion matrix, i.e., an MDS matrix $M = C_g^m$ for some monic polynomial $g(x) \in \mathbb{F}_q[x]$ of degree $k$ and an integer $m \ge k$. If such an integer $m$ exists for a polynomial $g(x) \in \mathbb{F}_q[x]$ then we say that $g(x)$ yields a recursive MDS matrix. This is of importance only if the size of the diffusion (MDS) matrix is greater than 1, and so we always assume that the polynomials $g(x) \in \mathbb{F}_q[x]$ considered for obtaining recursive MDS matrices are of degree $k = \deg(g) \ge 2$. Note that the companion matrix $C_g$ can be interpreted as
$$C_g = \underbrace{\begin{pmatrix} x \\ x^2 \\ \vdots \\ x^{k-1} \\ x^k \bmod g(x) \end{pmatrix}}_{\text{size } k}.$$
Now one can see that
$$C_g^2 = \begin{pmatrix} x^2 \\ x^3 \\ \vdots \\ x^k \bmod g(x) \\ x^{k+1} \bmod g(x) \end{pmatrix}, \quad \ldots, \quad C_g^m = \underbrace{\begin{pmatrix} x^m \bmod g(x) \\ x^{m+1} \bmod g(x) \\ \vdots \\ x^{m+k-2} \bmod g(x) \\ x^{m+k-1} \bmod g(x) \end{pmatrix}}_{\text{size } k}.$$
By Remark 13, the matrix $C_g^m$ is MDS if and only if any $k$ columns of the matrix $\bar{G} = [C_g^m \mid I]$ are linearly independent over $\mathbb{F}_q$. Equivalently, the matrix $C_g^m$ is MDS if and only if any $k$ columns of the matrix $G' = [-C_g^m \mid I]$ are linearly independent over $\mathbb{F}_q$. We can interpret the matrix $G'$ as follows:
$$G' = \begin{pmatrix} -x^m \bmod g(x) & 1 & 0 & 0 & \ldots & 0 \\ -x^{m+1} \bmod g(x) & 0 & 1 & 0 & \ldots & 0 \\ \vdots & & & \ddots & & \\ -x^{m+k-1} \bmod g(x) & 0 & 0 & 0 & \ldots & 1 \end{pmatrix},$$
where the left block has $\deg(g)$ columns and the right block is the identity matrix of size $k$.


We now prove the folklore result
$$C_g^m = \begin{pmatrix} x^m \bmod g(x) \\ x^{m+1} \bmod g(x) \\ \vdots \\ x^{m+k-2} \bmod g(x) \\ x^{m+k-1} \bmod g(x) \end{pmatrix}.$$
We prove it by induction. For $m = 1$,
$$C_g = \begin{pmatrix} x \\ x^2 \\ \vdots \\ x^{k-1} \\ x^k \bmod g(x) \end{pmatrix} = \begin{pmatrix} x \bmod g(x) \\ x^2 \bmod g(x) \\ \vdots \\ x^{k-1} \bmod g(x) \\ x^k \bmod g(x) \end{pmatrix}.$$
Assume it is true for $m = l \ge 1$. Now we show that it is true for $m = l+1$:
$$C_g^{l+1} = C_g C_g^l = \begin{pmatrix} 0 & 1 & 0 & \ldots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \ldots & \ldots & 1 \\ -a_0 & -a_1 & \ldots & \ldots & -a_{k-1} \end{pmatrix} \begin{pmatrix} x^l \bmod g(x) \\ x^{l+1} \bmod g(x) \\ \vdots \\ x^{l+k-2} \bmod g(x) \\ x^{l+k-1} \bmod g(x) \end{pmatrix} = \begin{pmatrix} x^{l+1} \bmod g(x) \\ x^{l+2} \bmod g(x) \\ \vdots \\ x^{l+k-1} \bmod g(x) \\ \sum_{i=0}^{k-1} -a_i x^{l+i} \bmod g(x) \end{pmatrix}.$$
Now we show that $\sum_{i=0}^{k-1} -a_i x^{l+i} \bmod g(x) = x^{l+k} \bmod g(x)$:
$$\sum_{i=0}^{k-1} -a_i x^{l+i} \bmod g(x) = x^l \Big(\sum_{i=0}^{k-1} -a_i x^i\Big) \bmod g(x) = (x^l x^k) \bmod g(x) = x^{l+k} \bmod g(x).$$
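A quick sketch (not from the paper) of the folklore result over GF(2): the rows of $C_g^m$ are the coefficient vectors of $x^m, x^{m+1}, \ldots, x^{m+k-1}$ reduced modulo $g(x)$. Polynomials over GF(2) are stored as bitmasks (bit $i$ is the coefficient of $x^i$); the polynomial $g$, the exponent $m$ and all helper names are our own choices.

```python
def poly_mod(f, g):
    """Reduce the GF(2) polynomial f modulo g (both as bitmasks)."""
    dg = g.bit_length() - 1
    while f and f.bit_length() - 1 >= dg:
        f ^= g << (f.bit_length() - 1 - dg)
    return f

def companion(g):
    """Companion matrix of the monic GF(2) polynomial g, as 0/1 rows."""
    k = g.bit_length() - 1
    rows = [[1 if j == i + 1 else 0 for j in range(k)] for i in range(k - 1)]
    rows.append([(g >> j) & 1 for j in range(k)])     # last row: coefficients a_0..a_{k-1}
    return rows

def matmul_gf2(A, B):
    n = len(A)
    return [[sum(A[i][t] & B[t][j] for t in range(n)) & 1 for j in range(n)]
            for i in range(n)]

g, k = 0b11001, 4                # g(x) = x^4 + x^3 + 1
C = companion(g)
M = [[1 if i == j else 0 for j in range(k)] for i in range(k)]

m = 6
for _ in range(m):
    M = matmul_gf2(M, C)         # M = C_g^m

for i in range(k):
    expected = poly_mod(1 << (m + i), g)                  # x^{m+i} mod g(x)
    assert M[i] == [(expected >> j) & 1 for j in range(k)]
print("rows of C_g^6 are x^6, x^7, x^8, x^9 reduced mod g(x) = x^4 + x^3 + 1")
```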

$C_g^k$ is an MDS matrix if $g(x)$ yields an MDS code and $ord(g) = 2k$. Assume $g(x) \in \mathbb{F}_{2^s}[x]$ has no repeated roots (Subsection 8.4 deals with the case when $g(x)$ has repeated roots). If $g(x)$ divides $x^{2k}-1$, then $g(x)$ divides $x^k-1$ and hence $ord(g) \neq 2k$, a contradiction. Therefore, in a field of characteristic 2, if the polynomial $g(x)$ has no repeated roots, then it cannot divide $x^{2k}-1$ for $\deg(g) = k \ge 2$, which means the length of the code cannot be equal to $2k$. But $g(x)$ will divide $x^{2k+z}-1$ for some $z > 0$. In such cases, if $g(x)$ yields a $[2k+z, k+z, d]_q$ MDS code, then the distance $d$ will be $2k + z - (k+z) + 1 = k+1$. To obtain a $[2k, k, k+1]_q$ code, the code can be shortened at $z$ positions in such a manner that the recursive structure does not get disturbed. The same idea was proposed by Augot et al. [2]: they constructed recursive MDS matrices using shortened BCH codes, which we are going to discuss in the next subsection.

8.2. Construction of recursive MDS matrices using shortened BCH codes.


Definition 8.5. [26, Definition 6][45, pages 29, 194, 592] Given an [n, `, d]q code Γ, and a set R of z indices {i1, . . . , iz}, the shortened code ΓR is the set of words from Γ which are zero at positions i1, . . . , iz and whose zero coordinates are deleted, thus effectively shortening these words by z positions.

The shortened code $\Gamma_R$ has length $n - z$, dimension $\geq \ell - z$ and minimum distance $\geq d$. Observe that the dimension can be greater than $\ell - z$. Consider the code $\Gamma = \{0000, 1011, 0101, 1110\}$. If the code is shortened at $z = 2$ positions, 0 and 2 (indices start from 0), the shortened code is $\Gamma_R = \{00, 11\}$. Note that the length, dimension and distance of $\Gamma_R$ are $n - z = 4 - 2 = 2$, $1 \geq \ell - z = 2 - 2 = 0$ and $2 \geq d = 2$ respectively.

Let $g(x) \in \mathbb{F}_q[x]$ be a monic polynomial of degree $k$ with $\mathrm{ord}(g) = n \geq 2k$. Let $S$ be the code with generator matrix $G' = [-C_g^m \mid I]$. Note that $C_g^m$ is MDS if and only if $-C_g^m$ is MDS. We can check from (8) that $G'$ can be identified as a submatrix of the generator matrix $G$ of the cyclic code $\Gamma = \langle g(x) \rangle$; namely, the submatrix of $G$ formed by the $k$ rows with indices $\{m-k, m-k+1, \ldots, m-1\}$ and the $2k$ columns with indices $\{0, 1, \ldots, k-1, m, m+1, \ldots, m+k-1\}$ (indices start from 0). Thus the code $S$ can be seen as a shortened code of the cyclic code $\Gamma$ (shortening on the $n - 2k$ positions $\{k, k+1, \ldots, m-1, m+k, m+k+1, \ldots, n-1\}$, taking $G$ as a generator matrix of $\Gamma$). It should be noted that $S$ need not be a cyclic code.

Now we present an overview of the technique proposed in [2] for the construction of recursive MDS matrices by shortening suitable BCH codes appropriately. A BCH code is a special type of cyclic code with a guaranteed minimum distance (see the definition below). For more details, refer to [2, Section 3]; for related concepts in coding theory refer to [45].
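A toy script makes Definition 8.5 concrete on the 4-bit code just discussed; the helper name is ours, and the example is the one from the text.

```python
# Shortening at a set R of positions keeps the codewords that are zero on R
# and then deletes those coordinates (Definition 8.5).

def shorten(code, positions):
    """Shorten a block code (a set of equal-length tuples) at the given positions."""
    kept = [w for w in code if all(w[i] == 0 for i in positions)]
    return {tuple(b for i, b in enumerate(w) if i not in positions) for w in kept}

if __name__ == "__main__":
    gamma = {(0, 0, 0, 0), (1, 0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 0)}
    print(shorten(gamma, {0, 2}))   # expected: {(0, 0), (1, 1)}
```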

Definition 8.6. [26, Definition 7][45, page 202] A BCH code over $\mathbb{F}_q$ is defined using an element $\beta$ in some extension field of $\mathbb{F}_q$. Suppose that $\mathrm{ord}(\beta) = n$. First, pick integers $\ell$ and $d$ and take the $(d-1)$ consecutive powers $\beta^{\ell}, \beta^{\ell+1}, \ldots, \beta^{\ell+d-2}$ of $\beta$, then compute
$$ g(x) = \mathrm{lcm}\left( \mathrm{Min}_{\mathbb{F}_q}(\beta^{\ell}), \ldots, \mathrm{Min}_{\mathbb{F}_q}(\beta^{\ell+d-2}) \right), $$
where $\mathrm{Min}_{\mathbb{F}_q}(\gamma)$ is the minimal polynomial of $\gamma$ over $\mathbb{F}_q$. The cyclic code (over $\mathbb{F}_q$) defined by $g(x)$ is called a BCH code, and it has length $n$, dimension $n - \deg(g)$ and minimum distance at least $d$.

Note that if all the conjugates of the elements in $\{\beta^{\ell+i} : i = 0, 1, \ldots, d-2\}$ are in that set itself, then the degree of the polynomial $g(x)$ is equal to $(d-1)$. That is, $g(x)$ has no roots other than $\beta^{\ell}, \ldots, \beta^{\ell+d-2}$. In that case the BCH code defined by $g(x)$ becomes MDS. We state this result in the following lemma.

Lemma 8.7. A BCH code $\Gamma$ over $\mathbb{F}_q$ defined by the $k$ roots $[\beta^{\ell}, \ldots, \beta^{\ell+k-1}]$ with actual distance $k+1$ is MDS if and only if $P(x) = \prod_{j=0}^{k-1}(x - \beta^{\ell+j})$ is in $\mathbb{F}_q[x]$. In this case, $g(x) = \mathrm{lcm}\left( \mathrm{Min}_{\mathbb{F}_q}(\beta^{\ell}), \ldots, \mathrm{Min}_{\mathbb{F}_q}(\beta^{\ell+k-1}) \right)$ is equal to $P(x)$.

Proof. If part: Let $P(x) = \prod_{j=0}^{k-1}(x - \beta^{\ell+j})$. If $P(x) \in \mathbb{F}_q[x]$, then $P(x)$ contains all conjugates of its roots and so $g(x) = P(x)$. The degree of $g(x)$ is $k$, the dimension of the code is $n - k$ and the actual distance is $\geq k+1$. But from the Singleton bound, the actual distance is $\leq n - (n-k) + 1 = k+1$. Therefore, the actual distance is $k+1$ and achieves the Singleton bound. Hence it is MDS.

Only if part: Let
$$ g(x) = \mathrm{lcm}\left( \mathrm{Min}_{\mathbb{F}_q}(\beta^{\ell}), \ldots, \mathrm{Min}_{\mathbb{F}_q}(\beta^{\ell+k-1}) \right) $$

generate the MDS BCH code. The dimension of the code is $n - \deg(g)$ and the actual distance is $k+1$, which is equal to $n - (n - \deg(g)) + 1$ (from the Singleton bound). Therefore $k = \deg(g)$, which implies that the set of $k$ roots $[\beta^{\ell}, \ldots, \beta^{\ell+k-1}]$ contains all its conjugates. Thus $P(x) = \prod_{j=0}^{k-1}(x - \beta^{\ell+j}) \in \mathbb{F}_q[x]$.

In Lemma 8.7, the BCH code $\Gamma$ must satisfy the condition that the actual distance is $k+1$; otherwise it may not satisfy the sufficient condition. We show this by an example. Consider $q = 2^3$. Take $\beta \in \mathbb{F}_{q^2}$ such that $\mathrm{ord}(\beta) = 9$. Let
$$ g(x) = \mathrm{lcm}\left( \mathrm{Min}_{\mathbb{F}_q}(\beta^3), \mathrm{Min}_{\mathbb{F}_q}(\beta^4), \mathrm{Min}_{\mathbb{F}_q}(\beta^5) \right). $$
The cyclotomic cosets mod $n$ over $\mathbb{F}_q$ are
$$ C_0 = \{0\},\quad C_1 = \{1, 8\},\quad C_2 = \{2, 7\},\quad C_3 = \{3, 6\},\quad C_4 = \{4, 5\}. $$
Therefore, $g(x) = \mathrm{Min}_{\mathbb{F}_q}(\beta^3) \cdot \mathrm{Min}_{\mathbb{F}_q}(\beta^4)$. Let
$$ g'(x) = \mathrm{lcm}\left( \mathrm{Min}_{\mathbb{F}_q}(\beta^3), \mathrm{Min}_{\mathbb{F}_q}(\beta^4), \mathrm{Min}_{\mathbb{F}_q}(\beta^5), \mathrm{Min}_{\mathbb{F}_q}(\beta^6) \right). $$
The degree of $g(x)$ is 4, which yields an $[n = 9, \ell = 5, d \geq 5]_8$ code. The distance is $\geq 5$ because $g(x)$ and $g'(x)$ yield the same code and the designed distance of $g'(x)$ is 5. By the Singleton bound, $d \leq n - \ell + 1 = 9 - 5 + 1 = 5$. Thus $g(x)$ yields a $[9, 5, 5]_8$ code which is MDS. But $P(x) = \prod_{j=3}^{5}(x - \beta^j)$ is not in $\mathbb{F}_q[x]$, because the actual distance is 5, not $k + 1 = 4$.

The example above demonstrates the necessity of the actual distance $k+1$ in Lemma 8.7. The lemma in [2, Lemma 1] does not make any assumption on the distance of the BCH code and thus suffers from a gap in its statement. We revisited that lemma and provide the corrected statement in Lemma 8.7.

In a BCH code, the roots of the generating polynomial may not be expressible as consecutive powers of a single element. For example, consider the BCH code generated by the polynomial $g(x) \in \mathbb{F}_q[x]$, where $q = 2^3$, defined by
$$ g(x) = \mathrm{lcm}\left( \mathrm{Min}_{\mathbb{F}_q}(\beta^2), \mathrm{Min}_{\mathbb{F}_q}(\beta^3) \right). $$
The roots of $g(x)$ are $\beta^2, \beta^3, \beta^6, \beta^7$, which are not consecutive powers of $\beta$. In [2, 26], the authors considered a particular kind of BCH code, called a c-BCH code, where all the roots of the generating polynomial are consecutive powers of some element in some field. It is worth pointing out that the authors of [2] also used c-BCH codes without mentioning it explicitly.
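The cyclotomic cosets used in the counter-example above are easy to recompute; the sketch below (an illustrative helper, using the standard definition $C_i = \{i q^j \bmod n\}$) reproduces the partition for $q = 2^3$, $n = 9$.

```python
# q-cyclotomic cosets mod n: C_i = { i * q^j mod n : j >= 0 }.

def cyclotomic_cosets(q: int, n: int):
    seen, cosets = set(), []
    for i in range(n):
        if i in seen:
            continue
        coset, x = set(), i
        while x not in coset:
            coset.add(x)
            x = (x * q) % n
        cosets.append(sorted(coset))
        seen |= coset
    return cosets

if __name__ == "__main__":
    print(cyclotomic_cosets(8, 9))
    # expected: [[0], [1, 8], [2, 7], [3, 6], [4, 5]], matching C_0, ..., C_4 above
```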

Definition 8.8. [26, Definition 8] We define a c-BCH code over $\mathbb{F}_q$ to be a BCH code where the roots of its generating polynomial can be expressed as consecutive powers of an element $\beta$ in some extension field of $\mathbb{F}_q$. By Lemma 8.7, one can see that a c-BCH code over $\mathbb{F}_q$ is also an MDS code. For this reason we call such codes MDS c-BCH codes.

The MDS c-BCH code over $\mathbb{F}_q$ defined by $g(x)$ has length $n = \mathrm{ord}(g)$ and dimension $k = n - \deg(g)$. So the corresponding MDS matrix will be of size $k \times \deg(g)$, which may not be suitable for use as a diffusion layer unless $\deg(g) = k$: the input and output of a diffusion layer generally have the same size, which forces $n = 2k$. But we cannot have $\mathrm{ord}(\beta) = 2k$, since there are no elements of even order in extensions of $\mathbb{F}_2$. To overcome this problem, it is suggested in [2, Section 3.2] to use a shortened MDS c-BCH code (see Definition 8.5) instead of a full-length ($n = \mathrm{ord}(\beta) > 2k$) MDS c-BCH code. From the discussion above, we can see that the generating polynomial of an MDS c-BCH code yields a recursive MDS matrix.


We now discuss the technique proposed in [2] to find recursive MDS matrices of size $k$ from MDS c-BCH codes by shortening them appropriately. The idea is to look for $[n = 2k+z,\ m = k+z,\ d = k+1]_q$ MDS c-BCH codes and shorten them at $z$ positions to obtain the required $[2k, k, k+1]_q$ MDS codes, for some odd integer $z$. So the first step is to construct a c-BCH code of length $n = 2k + z$ ($\leq q + 1$) for some odd integer $z$. The upper bound on $n$ comes from the assumption that the MDS conjecture holds (see Fact 3). We pick a $\beta$ of order $n$ (in some extension field of $\mathbb{F}_q$) and $\ell$, $0 \leq \ell < n$, and compute $P(x) = \prod_{j=0}^{k-1}(x - \beta^{\ell+j})$. Lemma 8.7 gives a condition under which such a polynomial $P(x)$ generates an MDS c-BCH code. The second step is to check whether the condition is satisfied; if so, we can obtain a recursive MDS matrix from the generating polynomial. Thus essentially one needs to verify the condition:

$$ (9) \qquad P(x) = \prod_{j=0}^{k-1}(x - \beta^{\ell+j}) \in \mathbb{F}_q[x]; $$
if this holds, then our choice of $n$, $\beta$ and $\ell$ yields an MDS c-BCH code and its generating polynomial is equal to $P(x)$. Later we will observe (in the paragraph after Theorem 8.11) that, for some choice of $n$, if there exist some $\beta$ and $\ell$, $0 \leq \ell < n$, which yield an MDS c-BCH code (of length $n$ and dimension $n-k$), then for such a choice of $n$, $\ell$ and for any choice of $\beta^i$ with $\gcd(i, n) = 1$, we get an MDS c-BCH code. Thus, if for some choice of $n$ there exists some $\ell$, $0 \leq \ell < n$, such that Equation (9) holds, then we say $n$ is a successful choice, and similarly we say the pair $(n, \ell)$ is a successful choice. To find all MDS c-BCH codes over $\mathbb{F}_q$ that can be obtained in this way, the algorithm of Augot et al. (see [2, Section 4.2]) verifies the condition by computing the polynomial $P(x)$ for all candidates in the ranges given by $n = 2k + z \leq q + 1$ where $z$ is odd, $\beta$ has order $n$, and $0 \leq \ell < n - 1$.
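Condition (9) can be tested without any field arithmetic: since the roots $\beta^{\ell+j}$ are distinct, $P(x) \in \mathbb{F}_q[x]$ exactly when the exponent set $\{\ell, \ldots, \ell+k-1\} \bmod n$ is closed under multiplication by $q$ modulo $n$ (closure under the Frobenius map). The sketch below uses this criterion to scan for successful pairs $(n, \ell)$; the parameters are illustrative, and the scan only mimics, rather than reproduces, the algorithm of [2].

```python
# Scan for successful (n, l) pairs via the Frobenius-closure criterion for (9).
# Parameters q and k are toy choices.

def is_successful(q: int, n: int, k: int, l: int) -> bool:
    """Condition (9): {l, ..., l+k-1} mod n must be a union of q-cyclotomic cosets mod n."""
    exponents = {(l + j) % n for j in range(k)}
    return all((q * e) % n in exponents for e in exponents)

def successful_pairs(q: int, k: int):
    """Enumerate candidate lengths n = 2k + z (z odd, n <= q + 1) and all l."""
    found = []
    for n in range(2 * k + 1, q + 2, 2):       # n = 2k + z with z odd
        for l in range(n):
            if is_successful(q, n, k, l):
                found.append((n, l))
    return found

if __name__ == "__main__":
    q, k = 2 ** 4, 4                           # toy parameters over F_16
    pairs = successful_pairs(q, k)
    print("successful lengths n:", sorted({n for n, _ in pairs}))
    print("successful (n, l) pairs:", pairs)
    # for these parameters only n with q = +/-1 (mod n) should survive,
    # consistent with Theorem 8.9 below
```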

An efficient method for finding all MDS c-BCH codes. The major drawback of the Augot et al. algorithm is that there can be many unsuccessful choices of $n$ and $\ell$ in the aforementioned ranges. In fact it may happen (see the discussion after Theorem 8.9) that, for some choices of $n$ and for any $\ell$, $0 \leq \ell < n-1$, there cannot be an MDS c-BCH code (of length $n = 2k+z$ and dimension $n - k = k + z$ over $\mathbb{F}_q$). Such choices of $n$ and $\ell$ do not yield MDS c-BCH codes, and so the computation/verification of $P(x)$ for such choices is unnecessary. Moreover, for such unsuccessful choices of $n$, the computation has to be done in extension fields of $\mathbb{F}_q$. So it is better if we can confine the computation only to the values of $n$ and $\ell$ for which the condition $P(x) \in \mathbb{F}_q[x]$ holds, i.e. compute only the polynomials which are generating polynomials of MDS c-BCH codes. Another drawback of this algorithm is that, for some successful choices of $n$, the same polynomials are computed twice.

In [26], the authors presented results on the values of $n$ and the corresponding values of $\ell$ for which the constructed polynomials generate MDS c-BCH codes. The most significant result is Theorem 8.9, which gives a nice relation between $n$ and $q$. With it, a lot of unnecessary choices of $n$ can be omitted, and we are left with only those values of $n$ which definitely yield MDS c-BCH codes. This theorem is significant not only because it directly gives the possible values of $n$, but also because the corresponding value of $\ell$ can be determined at the same time. Obtaining all possible values of $n$ and $\ell$ is a remarkable improvement, as it drastically reduces the running time of finding all MDS c-BCH codes of length $n$ over $\mathbb{F}_q$, which was not possible with the algorithm proposed by Augot et al. in [2].

We assume that $k \geq 2$ and $n = 2k + z$ ($\leq q + 1$) for some odd integer $z$ (see Fact 3).

Theorem 8.9. [26, Theorem 2] Let $k$ and $n$ be integers with $k \geq 2$ and $n > 2k$. Then there exists an MDS c-BCH code of length $n$ and of dimension $(n-k)$ over $\mathbb{F}_q$ if and only if $q \equiv \pm 1 \pmod n$.

In [2], the algorithm searches for all candidate values of $n$ by choosing $z$ from 1 to $q + 1 - 2k$. As Theorem 8.9 shows, many of these values of $z$ are definitely wrong choices, and they increase the running time of the algorithm. Moreover, one can obtain the exact values of $\ell$, depending upon whether $n \mid q - 1$ or $n \mid q + 1$, from the proof of Theorem 8.9. When $n \mid q - 1$, $\ell$ is any value with $0 \leq \ell \leq n - 1$. When $n \mid q + 1$, $\ell = (n - k + 1)/2$ if $k$ is even, and otherwise $\ell = n - \frac{k-1}{2}$. As a consequence, a formula for the number of such MDS c-BCH codes is obtained.

Theorem 8.10. [26, Theorem 3] Let $n \mid (q-1)$. Then the number of MDS c-BCH codes of length $n$ and of dimension $(n-k)$ over $\mathbb{F}_q$ is equal to $n \cdot \frac{\phi(n)}{2}$.

Theorem 8.11. [26, Theorem 4] Let $n \mid (q+1)$. Then the number of MDS c-BCH codes of length $n$ and of dimension $(n-k)$ over $\mathbb{F}_q$ is $\frac{\phi(n)}{2}$.

For proofs of Theorems 8.10 and 8.11, see [26]. Nevertheless, we present a brief discussion on the number of MDS c-BCH codes obtained in these theorems. In Theorem 8.10, the number of MDS c-BCH codes is $n \cdot \frac{\phi(n)}{2}$. The factor $n$ appears due to the choices of $\ell$, which varies from 0 to $n-1$, i.e. $n$ choices, whereas the factor $\phi(n)/2$ appears because of the number of choices of $\beta$ (see Equation (9)) whose order must be exactly $n$. There are $\phi(n)$ choices of such $\beta$, but $\beta$ and $\beta^{-1}$ yield the same code, so the number of choices of $\beta$ becomes $\phi(n)/2$. In Theorem 8.11, there is only one choice of $\ell$ (see the paragraph after Theorem 8.9) and $\phi(n)/2$ choices of $\beta$ (by a similar argument as for Theorem 8.10).
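A short script makes the pruning effect of Theorem 8.9 visible: it keeps only the candidate lengths $n$ with $q \equiv \pm 1 \pmod n$ and reports the counts given by Theorems 8.10 and 8.11. The parameters below are illustrative.

```python
# Enumerate the lengths n = 2k + z (z odd, n <= q + 1) allowed by Theorem 8.9
# and report the code counts from Theorems 8.10 / 8.11 for each admissible n.
from math import gcd

def euler_phi(n: int) -> int:
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)

def admissible_lengths(q: int, k: int):
    """Candidate n with n > 2k, n <= q + 1, n odd, and q = +/-1 (mod n)."""
    return [n for n in range(2 * k + 1, q + 2, 2) if q % n in (1, n - 1)]

if __name__ == "__main__":
    q, k = 2 ** 8, 4                               # toy parameters
    for n in admissible_lengths(q, k):
        if (q - 1) % n == 0:                       # Theorem 8.10
            count = n * euler_phi(n) // 2
        else:                                      # n | q + 1, Theorem 8.11
            count = euler_phi(n) // 2
        print(f"n = {n:3d}: {count} MDS c-BCH codes of dimension n - {k}")
```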

8.3. Recursive MDS matrices using the parity check matrix.

Lemma 8.12. [27, Lemma 4] Let $g(x) \in \mathbb{F}_q[x]$ be a monic polynomial of degree $k$ with $\mathrm{ord}(g) = n$. Suppose that $g$ has $k$ distinct roots, say $\lambda_1, \ldots, \lambda_k \in \bar{\mathbb{F}}_q$. Let $S$ be the code with generator matrix $G' = [-C_g^m \mid I]$ for some $m$, $k \leq m \leq n-k$. Then a vector $(f_0, f_1, \ldots, f_{k-1}, f_m, f_{m+1}, \ldots, f_{m+k-1}) \in \mathbb{F}_q^{2k}$ is an element of $S$ if and only if it is in the null space of the matrix $H'$ given by
$$ (10) \qquad H' = \begin{pmatrix} 1 & \lambda_1 & \cdots & \lambda_1^{k-1} & \lambda_1^m & \lambda_1^{m+1} & \cdots & \lambda_1^{m+k-1} \\ \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 1 & \lambda_k & \cdots & \lambda_k^{k-1} & \lambda_k^m & \lambda_k^{m+1} & \cdots & \lambda_k^{m+k-1} \end{pmatrix}. $$

Proof. The code generated by the matrix $G'$ is the shortened code $S$ of the code $\Gamma = \langle g(x) \rangle$ shortened at positions $R = \{k, k+1, \ldots, m-1, m+k, m+k+1, \ldots, n-1\}$. Therefore $(f_0, f_1, \ldots, f_{k-1}, f_m, f_{m+1}, \ldots, f_{m+k-1})$ is an element of $S$ if and only if $(f_0, f_1, f_2, \ldots, f_{n-1})$ is in $\Gamma$, where $(f_k, f_{k+1}, \ldots, f_{m-1}, f_{m+k}, f_{m+k+1}, \ldots, f_{n-1}) = (0, 0, \ldots, 0)$. From Lemma 8.4, $(f_0, f_1, f_2, \ldots, f_{n-1})$ is a codeword of


$\Gamma = \langle g(x) \rangle$ if and only if $\sum_{j=0}^{n-1} f_j \lambda_i^j = 0$ for $1 \leq i \leq k$. Suppose $(f_0, f_1, f_2, \ldots, f_{n-1})$ is shortened at positions $R$. Let $I = \{0, 1, \ldots, n-1\}$. Then $\sum_{j \in I \setminus R} f_j \lambda_i^j = 0$ if and only if $\sum_{j=0}^{n-1} f_j \lambda_i^j = 0$ for $1 \leq i \leq k$. Hence $(f_0, f_1, \ldots, f_{k-1}, f_m, f_{m+1}, \ldots, f_{m+k-1})$ is in the null space of the matrix $H'$ if and only if $(f_0, f_1, f_2, \ldots, f_{n-1})$ is in $\Gamma$, i.e. if and only if $(f_0, f_1, \ldots, f_{k-1}, f_m, f_{m+1}, \ldots, f_{m+k-1})$ is in $S$.

Let $S$ and $H'$ be as defined in the above lemma. Note that the code $S$ has dimension $k$. It is easy to see that the code $S$ is MDS (i.e. $S$ has minimum distance $k+1$) if and only if the null space of the matrix $H'$ in $\mathbb{F}_q^{2k}$ does not contain a nonzero vector of weight $k$ or less; in other words, any $k$ columns of the matrix $H'$ are linearly independent over $\mathbb{F}_q$ [45]. For convenience, we state this result as follows.

Theorem 8.13. [27, Theorem 2] Let $g(x) \in \mathbb{F}_q[x]$ be a monic polynomial of degree $k$ and $\mathrm{ord}(g) = n$. Suppose that $g$ has $k$ distinct roots, say $\lambda_1, \ldots, \lambda_k \in \bar{\mathbb{F}}_q$. Let $m$ be an integer with $k \leq m \leq n-k$. Then the matrix $M = C_g^m$ is MDS if and only if any $k$ columns of the matrix $H'$ given in (10) are linearly independent over $\mathbb{F}_q$.

Theorem 8.13 can be proved alternatively as shown in [25]. Suppose $g(x)$ has $k$ distinct roots $\lambda_1, \ldots, \lambda_k$. The idea is to use the fact that $C_g = V D V^{-1}$, or $C_g^T = (V^T)^{-1} D V^T$, where $V = \mathrm{vand}[\lambda_1, \lambda_2, \ldots, \lambda_k]$ and $D = \mathrm{diag}[\lambda_1, \lambda_2, \ldots, \lambda_k]$. If $C_g^m$ is MDS, then $(C_g^m)^T = (C_g^T)^m$ is MDS and thus any $k$ columns of $[I \mid (C_g^T)^m]$ are linearly independent. Now,
$$ [I \mid (C_g^T)^m] = [I \mid (V^T)^{-1} D^m V^T] = (V^T)^{-1} [V^T \mid D^m V^T] = (V^T)^{-1} H', $$
where $H' = [V^T \mid D^m V^T]$. As the $\lambda_i$'s are distinct, $(V^T)^{-1}$ is nonsingular. As a result, $(C_g^T)^m$, and hence $C_g^m$, is MDS if and only if any $k$ columns of $H'$ are linearly independent.

Lemma 8.14. [27, Corollary 1] If $g(x) = x^k + a_{k-1} x^{k-1} + \ldots + a_1 x + a_0 \in \mathbb{F}_q[x]$ (with $a_0 \neq 0$) yields a recursive MDS matrix, then its (monic) reciprocal polynomial
$$ g^*(x) = \frac{x^k}{a_0}\, g\!\left(\frac{1}{x}\right) = x^k + \frac{a_1}{a_0} x^{k-1} + \ldots + \frac{a_{k-1}}{a_0} x + \frac{1}{a_0} $$
also yields a recursive MDS matrix.

Proof. The matrix $C_{g^*} = R (C_g)^{-1} R$, where
$$ R = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & 0 \\ \vdots & & & & \vdots \\ 0 & 1 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \end{pmatrix} $$
and $R^2 = I_k$. The matrix $C_{g^*}^m = R (C_g^{-1})^m R$ is MDS if and only if $(C_g^{-1})^m$ is MDS, and this holds because $(C_g^{-1})^m$ is MDS if and only if $C_g^m$ is MDS. Observe that if $\lambda$ is a root of $g$, then $\lambda^{-1}$ is a root of $g^*$, provided $\lambda \neq 0$.

Lemma 8.15. [27, Corollary 2] If $g(x) = \prod_{i=1}^{k}(x - \lambda_i) \in \mathbb{F}_q[x]$ yields a recursive MDS matrix, then for any $c \in \mathbb{F}_q^*$ the polynomial $c^k g\!\left(\frac{x}{c}\right) = \prod_{i=1}^{k}(x - c\lambda_i)$ also yields a recursive MDS matrix.


Proof. Let $g^*(x) = c^k g\!\left(\frac{x}{c}\right)$. The matrix $C_{g^*} = c\, D C_g D^{-1}$, where
$$ D = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & c & 0 & \cdots & 0 & 0 \\ 0 & 0 & c^2 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & c^{k-2} & 0 \\ 0 & 0 & 0 & \cdots & 0 & c^{k-1} \end{pmatrix}. $$

The matrix $C_{g^*}^m = c^m D C_g^m D^{-1}$ is MDS if and only if $C_g^m$ is MDS.
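The coefficient manipulations behind Lemmas 8.14 and 8.15 are easy to script. The sketch below works over a prime field purely for convenience (the survey's examples live in $\mathbb{F}_{2^8}$, which would only change the arithmetic helpers); the modulus and sample polynomial are illustrative.

```python
# Coefficient-level versions of Lemma 8.14 (monic reciprocal polynomial) and
# Lemma 8.15 (scaling every root by c), over a prime field F_p.

P = 257  # illustrative prime modulus

def reciprocal(coeffs):
    """(a_0, ..., a_{k-1}) of monic g  ->  coefficients of g*(x) = (x^k / a_0) g(1/x)."""
    k = len(coeffs)
    inv_a0 = pow(coeffs[0], -1, P)
    # g*(x) = x^k + (a_1/a_0) x^{k-1} + ... + (a_{k-1}/a_0) x + 1/a_0
    return [inv_a0] + [(coeffs[k - j] * inv_a0) % P for j in range(1, k)]

def scale_roots(coeffs, c):
    """(a_0, ..., a_{k-1}) of monic g  ->  coefficients of c^k g(x/c): a_i -> a_i c^(k-i)."""
    k = len(coeffs)
    return [(a * pow(c, k - i, P)) % P for i, a in enumerate(coeffs)]

def eval_monic(coeffs, x):
    """Evaluate x^k + sum a_i x^i at x (mod P)."""
    k = len(coeffs)
    return (pow(x, k, P) + sum(a * pow(x, i, P) for i, a in enumerate(coeffs))) % P

if __name__ == "__main__":
    # g(x) = (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
    g = [(-6) % P, 11, (-6) % P]
    g_star = reciprocal(g)            # roots should become 1, 1/2, 1/3
    g_scaled = scale_roots(g, 2)      # roots should become 2, 4, 6
    print(all(eval_monic(g_star, pow(r, -1, P)) == 0 for r in (1, 2, 3)))
    print(all(eval_monic(g_scaled, r) == 0 for r in (2, 4, 6)))
```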

By using the two previous lemmas, from a polynomial that yields a recursive MDS matrix one can obtain more such polynomials. It would be interesting to find other techniques that produce more polynomials yielding recursive MDS matrices from one such polynomial.

Now we present five methods for the construction of polynomials that yield recursive MDS matrices. The polynomials constructed using these methods have distinct roots. The main tool in these methods is Theorem 8.13: we suitably choose $\lambda_i$, $1 \leq i \leq k$, and verify that the polynomial $g(x) = \prod_{i=1}^{k}(x - \lambda_i) \in \mathbb{F}_q[x]$ satisfies the condition of Theorem 8.13. For this purpose, we will show that any $k$-column submatrix of $H'$ corresponding to the $\lambda_i$'s as given in (10) is nonsingular. In essence, we will show that, for an integer $m \geq k$, the determinant of the matrix
$$ (11) \qquad H'[R] = \begin{pmatrix} \lambda_1^{r_1} & \lambda_1^{r_2} & \cdots & \lambda_1^{r_k} \\ \lambda_2^{r_1} & \lambda_2^{r_2} & \cdots & \lambda_2^{r_k} \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_k^{r_1} & \lambda_k^{r_2} & \cdots & \lambda_k^{r_k} \end{pmatrix} $$
is nonzero for any subset $R = \{r_1, r_2, \ldots, r_k\} \subset E = \{0, 1, \ldots, k-1, m, m+1, \ldots, m+k-1\}$ of size $k$.

8.3.1. Construction I(a). In this method, we use consecutive powers of an element, or a fixed multiple of them, to get a polynomial from which we can obtain a recursive MDS matrix. Later we will see that this has similarities with BCH codes, but in this way we can get more recursive MDS matrices than the method which uses shortened BCH codes [2].

Theorem 8.16. [27, Theorem 3] Let $\lambda_i = \theta^{i-1}$, $1 \leq i \leq k$, for some $\theta \in \mathbb{F}_q$. Let $g(x) = \prod_{i=1}^{k}(x - \lambda_i)$. Then for an integer $m \geq k$, the matrix $C_g^m$ is MDS if and only if $\theta^i \neq \theta^j$ for all $i, j \in E = \{0, 1, \ldots, k-1, m, m+1, \ldots, m+k-1\}$ with $i \neq j$.

Proof. As discussed above, the matrix $C_g^m$ is MDS if and only if $\det(H'[R])$ is nonzero for all subsets $R = \{r_1, r_2, \ldots, r_k\} \subset E$. We have $\lambda_i = \theta^{i-1}$ for $1 \leq i \leq k$, and so we get
$$ H'[R] = \begin{pmatrix} 1 & (\theta)^{r_1} & \cdots & (\theta^{k-1})^{r_1} \\ 1 & (\theta)^{r_2} & \cdots & (\theta^{k-1})^{r_2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & (\theta)^{r_k} & \cdots & (\theta^{k-1})^{r_k} \end{pmatrix} = \begin{pmatrix} 1 & (\theta^{r_1}) & \cdots & (\theta^{r_1})^{k-1} \\ 1 & (\theta^{r_2}) & \cdots & (\theta^{r_2})^{k-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & (\theta^{r_k}) & \cdots & (\theta^{r_k})^{k-1} \end{pmatrix}. $$


Let $y_{r_i} = \theta^{r_i}$ for $1 \leq i \leq k$. Therefore we have $\det(H'[R]) = \det(V)$, where
$$ V = \begin{pmatrix} 1 & y_{r_1} & y_{r_1}^2 & \cdots & y_{r_1}^{k-1} \\ 1 & y_{r_2} & y_{r_2}^2 & \cdots & y_{r_2}^{k-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & y_{r_k} & y_{r_k}^2 & \cdots & y_{r_k}^{k-1} \end{pmatrix}. $$
We have $\det(H'[R]) \neq 0$ if and only if $\det(V) \neq 0$. Observe that the matrix $V$ is a Vandermonde matrix, and $\det(V) \neq 0$ if and only if $y_{r_i} = \theta^{r_i}$, $1 \leq i \leq k$, are distinct and nonzero. Therefore the matrix $C_g^m$ is MDS if and only if $\theta^{r_i}$, $1 \leq i \leq k$, are distinct for all $R = \{r_1, r_2, \ldots, r_k\} \subset E$. The latter condition is equivalent to $\theta^i \neq \theta^j$ for all $i, j \in E = \{0, 1, \ldots, k-1, m, m+1, \ldots, m+k-1\}$ with $i \neq j$, and hence the theorem.

Observe that if $\theta$ satisfies the condition in Theorem 8.16, then it is necessary that the order of $\theta$ is $\geq 2k$, and this is also sufficient when $m = k$.

Example 10. Let the field $\mathbb{F}_{2^8} = \mathbb{F}_2[x]/\langle \mu(x) \rangle$, where $\mu(x) = x^8 + x^4 + x^3 + x^2 + 1$ is a primitive polynomial over $\mathbb{F}_2$. Let $\alpha \in \mathbb{F}_{2^8}$ be a root of $\mu(x)$, i.e. a primitive element of $\mathbb{F}_{2^8}$. Let us consider $\theta = \alpha$ in Theorem 8.16; then $\lambda_1 = 1$, $\lambda_2 = \alpha$, $\lambda_3 = \alpha^2$, and $\lambda_4 = \alpha^3$. We get $g_1(x) = (x+1)(x+\alpha)(x+\alpha^2)(x+\alpha^3) = x^4 + \alpha^{75} x^3 + \alpha^{249} x^2 + \alpha^{78} x + \alpha^6$. Then the companion matrix of $g_1$ is $C_{g_1} = \mathrm{Companion}(\alpha^6, \alpha^{78}, \alpha^{249}, \alpha^{75})$ and
$$ C_{g_1}^4 = \begin{pmatrix} \alpha^{6} & \alpha^{78} & \alpha^{249} & \alpha^{75} \\ \alpha^{81} & \alpha^{59} & \alpha^{189} & \alpha^{163} \\ \alpha^{169} & \alpha^{162} & \alpha^{198} & \alpha^{131} \\ \alpha^{137} & \alpha^{253} & \alpha^{49} & \alpha^{143} \end{pmatrix} $$
is an MDS matrix.

We can obtain many more polynomials using Theorem 8.16 and Lemma 8.15. In the previous example, if we multiply $c = \alpha$ with $\lambda_i$ for $1 \leq i \leq 4$, we get $\lambda_1 = \alpha$, $\lambda_2 = \alpha^2$, $\lambda_3 = \alpha^3$ and $\lambda_4 = \alpha^4$. Then we get $g_1(x) = (x+\alpha)(x+\alpha^2)(x+\alpha^3)(x+\alpha^4) = x^4 + \alpha^{76} x^3 + \alpha^{251} x^2 + \alpha^{81} x + \alpha^{10}$. Then the companion matrix of $g_1$ is $C_{g_1} = \mathrm{Companion}(\alpha^{10}, \alpha^{81}, \alpha^{251}, \alpha^{76})$ and
$$ C_{g_1}^4 = \begin{pmatrix} \alpha^{10} & \alpha^{81} & \alpha^{251} & \alpha^{76} \\ \alpha^{86} & \alpha^{63} & \alpha^{192} & \alpha^{165} \\ \alpha^{175} & \alpha^{167} & \alpha^{202} & \alpha^{134} \\ \alpha^{144} & \alpha^{4} & \alpha^{54} & \alpha^{147} \end{pmatrix}, $$
which is again an MDS matrix.
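The field computations in Example 10 can be re-derived with a few lines of code. The sketch below (an illustrative script, not taken from [27]) implements multiplication in $\mathbb{F}_{2^8}$ modulo $\mu(x) = x^8 + x^4 + x^3 + x^2 + 1$ (bitmask 0x11D), expands $g_1(x) = (x+1)(x+\alpha)(x+\alpha^2)(x+\alpha^3)$, and prints each coefficient as a power of $\alpha$, so the exponents quoted above can be checked rather than taken on faith.

```python
# Rebuild the coefficients of g1(x) from its roots in F_{2^8} = F_2[x]/<0x11D>.

MOD = 0x11D          # x^8 + x^4 + x^3 + x^2 + 1 (primitive)
ALPHA = 0x02         # alpha is a root of the modulus

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiplication in F_{2^8} modulo MOD."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= MOD
    return r

# discrete-log table: log[alpha^e] = e (valid because the modulus is primitive)
log, acc = {}, 1
for e in range(255):
    log[acc] = e
    acc = gf_mul(acc, ALPHA)

def poly_mul(p, q):
    """Multiply polynomials with coefficients in F_{2^8} (lists, low degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf_mul(a, b)
    return out

if __name__ == "__main__":
    a2 = gf_mul(ALPHA, ALPHA)
    roots = [1, ALPHA, a2, gf_mul(a2, ALPHA)]        # 1, alpha, alpha^2, alpha^3
    g1 = [1]
    for r in roots:                                  # (x + r); in char 2, -r = r
        g1 = poly_mul(g1, [r, 1])
    # coefficients of 1, x, x^2, x^3 as powers of alpha (the x^4 coefficient is 1);
    # expected from Example 10: alpha^6, alpha^78, alpha^249, alpha^75
    print([f"alpha^{log[c]}" for c in g1[:-1]])
```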

Below we see that the following matrices over $\mathbb{F}_{2^4}$, used in the design of the PHOTON family of hash functions, can be derived from this construction (see [19, page 232]). The constructing polynomial of the field $\mathbb{F}_{2^4}$ is $x^4 + x + 1$ and $\alpha$ is a root of it.
1. The polynomial $f_5(x) = x^5 + \alpha x^4 + (\alpha^3 + 1)x^3 + (\alpha^3 + 1)x^2 + \alpha x + 1$ yields a $5 \times 5$ recursive MDS matrix over $\mathbb{F}_{2^4}$. One can check that the roots of $f_5$ are the consecutive powers $\{\beta^{13}, \beta^{14}, \beta^0, \beta^1, \beta^2\}$, where $\beta = \alpha^4$. We can get $f_5(x)$ by taking $\theta = \beta$ and $c = \beta^{13}$ in Theorem 8.16.
2. The polynomial $f_6(x) = x^6 + \alpha x^5 + \alpha^3 x^4 + (\alpha^2 + 1)x^3 + \alpha^3 x^2 + \alpha x + 1$ yields a $6 \times 6$ recursive MDS matrix over $\mathbb{F}_{2^4}$. We have verified that the roots of $f_6$ are the consecutive powers $\{\gamma^6, \gamma^7, \gamma^8, \gamma^9, \gamma^{10}, \gamma^{11}\}$, where $\gamma = (\beta + 1)^{45}$, $\beta$ is a root of the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$, $\alpha = \beta^7 + \beta^6 + \beta^5 + 1$,


and the order of $\gamma$ is equal to 17. We can get $f_6(x)$ by taking $\theta = \gamma$ and $c = \gamma^6$ in Theorem 8.16.
3. The polynomial $f_7(x) = x^7 + \alpha^2 x^6 + (\alpha^2 + \alpha)x^5 + x^4 + x^3 + (\alpha^2 + \alpha)x^2 + \alpha^2 x + 1$ yields a $7 \times 7$ recursive MDS matrix over $\mathbb{F}_{2^4}$. One can check that the roots of $f_7$ are the consecutive powers $\{\alpha^{12}, \alpha^{13}, \alpha^{14}, \alpha^0, \alpha^1, \alpha^2, \alpha^3\}$. We can get $f_7(x)$ by taking $\theta = \alpha$ and $c = \alpha^{12}$ in Theorem 8.16.
4. The polynomial $f_8(x) = x^8 + (\alpha^2 + \alpha)x^7 + (\alpha^2 + 1)x^6 + \alpha^3 x^5 + \alpha x^4 + (\alpha^3 + \alpha + 1)x^3 + \alpha x^2 + \alpha^2 x + \alpha$ yields an $8 \times 8$ recursive MDS matrix over $\mathbb{F}_{2^4}$. We have verified that the roots $\lambda_i$ of $f_8$ are in $\mathbb{F}_{2^8}$ and of the form $\lambda_i = \theta^{i-1} c$, $1 \leq i \leq 8$, with $\theta = (\beta + 1)^{15}$ and $c = (\beta + 1)^{109}$, where $\beta$ is a root of the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$, $\alpha = \beta^7 + \beta^6 + \beta^5 + 1$, and the order of $\theta$ is equal to 17. We can get $f_8(x)$ with this choice of $\lambda_i$'s in Theorem 8.16.

Relationship with [2]: Augot et al. proposed a method for the construction of recursive MDS matrices using shortened BCH codes. In this method, first a polynomial $g(x)$ is computed by appropriately fixing an element $\beta$ in some extension field of $\mathbb{F}_q$, where all the roots of the polynomial $g(x)$ are consecutive powers of $\beta$, say $\beta^i, \beta^{i+1}, \ldots, \beta^{i+k-1}$ for some integer $i$. If the set of roots of $g(x)$ is closed under conjugation, then $g(x) \in \mathbb{F}_q[x]$. This ensures that the BCH code with generator polynomial $g(x)$ becomes MDS. In that case the polynomial $g(x)$ yields a recursive MDS matrix. Now, by taking $\theta = \beta$ and $c = \beta^i$ in Theorem 8.16, we can also see that the polynomial $g(x)$ yields a recursive MDS matrix.

Remark 39. Note that if $\lambda_1 = c$ is not a power of $\theta$ in Theorem 8.16, then the polynomial obtained may not be a generator polynomial of a BCH code. For example, the polynomial $f_8(x)$ in Item 4 above is of that form. With the notation used there, the roots of $f_8(x)$ can be given by $\lambda_i = \alpha^2 \theta^{4+i}$ for $1 \leq i \leq 8$. It can be verified that the $\lambda_i$'s cannot be expressed as consecutive powers of an element in $\mathbb{F}_{2^8}$. So this method gives a larger set of polynomials yielding recursive MDS matrices than the method which uses shortened BCH codes. These additional polynomials can be seen as those obtained by applying Lemma 8.15.

Next we provide two more methods for the construction of polynomials that yield recursive MDS matrices. These two constructions appear to be similar (but not identical) to the first construction, and so we denote them by I(b) and I(c).

8.3.2. Construction I(b).

Theorem 8.17. [28, Theorem 3] Let $\lambda_i = \theta^{i-1}$ for $1 \leq i \leq k-1$ and $\lambda_k = \theta^k$, for some $\theta \in \mathbb{F}_q$. Let $g(x) = \prod_{i=1}^{k}(x - \lambda_i)$. Then for an integer $m \geq k$, the matrix $C_g^m$ is MDS if and only if $\theta^r \neq \theta^{r'}$ for $r, r' \in E$ with $r \neq r'$ and $\sum_{i=1}^{k} \theta^{r_i} \neq 0$ for all $R = \{r_1, r_2, \ldots, r_k\} \subset E$, where $E = \{0, 1, \ldots, k-1, m, m+1, \ldots, m+k-1\}$.

Proof. From (11) we can see that the matrix $C_g^m$ is MDS if and only if $\det(H'[R])$ is nonzero for all subsets $R = \{r_1, r_2, \ldots, r_k\} \subset E$. We have $\lambda_i = \theta^{i-1}$ for $1 \leq i \leq k-1$ and $\lambda_k = \theta^k$. So we have
$$ H'[R] = \begin{pmatrix} 1 & \theta^{r_1} & \cdots & (\theta^{k-2})^{r_1} & (\theta^{k})^{r_1} \\ 1 & \theta^{r_2} & \cdots & (\theta^{k-2})^{r_2} & (\theta^{k})^{r_2} \\ \vdots & \vdots & & \vdots & \vdots \\ 1 & \theta^{r_k} & \cdots & (\theta^{k-2})^{r_k} & (\theta^{k})^{r_k} \end{pmatrix} = \begin{pmatrix} 1 & \theta^{r_1} & \cdots & (\theta^{r_1})^{k-2} & (\theta^{r_1})^{k} \\ 1 & \theta^{r_2} & \cdots & (\theta^{r_2})^{k-2} & (\theta^{r_2})^{k} \\ \vdots & \vdots & & \vdots & \vdots \\ 1 & \theta^{r_k} & \cdots & (\theta^{r_k})^{k-2} & (\theta^{r_k})^{k} \end{pmatrix}. $$


ri 0 0 Let yri = θ for 1 ≤ i ≤ k. Therefore we have det(H [R]) = det(V ), where  1 y y2 . . . yk−2 yk  r1 r1 r1 r1 2 k−2 k  1 yr2 yr . . . yr yr  V 0 =  2 2 2   . . . . .   . . . . .  1 y y2 . . . yk−2 yk rk rk rk rk We have det(H0[R]) 6= 0 if and only if det(V 0) 6= 0. Observe that the matrix V 0 is a generalized Vandermonde matrix. Also from Lemma 8.3 we have det(V 0) 6= 0 if Pk and only if yri are distinct and i=1 yri 6= 0. Hence the proof. Remark 40. We can see that the condition on θ in Theorem 8.17 is applicable even i−1 k ∗ if we take λi = θ c, 1 ≤ i ≤ k − 1, and λk = θ c for some c ∈ Fq . By considering the roots in this way the polynomials that we get are same as those obtained by applying Lemma 8.15. 8.3.3. Construction I(c). i Corollary 13. [28, Corollary 1] Let λ1 = 1, and λi = θ , 2 ≤ i ≤ k, for some ∗ Qk m θ ∈ Fq . Let g(x) = i=1(x − λi). Then for an integer m ≥ k, the matrix Cg 0 r r 0 Pk −ri is MDS if and only if θ 6= θ for r, r ∈ E and i=1 θ 6= 0 for all B = {r1, r2, . . . , rk} ⊂ E, where E = {0, 1, . . . , k − 1, m, m + 1, . . . , m + k − 1}. −1 i−1 −1 k Proof. Consider γi = λk−i+1 = (θ ) c, 1 ≤ i ≤ k − 1 and γk = λ1 = (θ ) c for k m c = θ . Then by Theorem 8.17 and the above remark, the matrix Cg is MDS if and −ri Pk −ri only if θ , 1 ≤ i ≤ k, are distinct and i=1 θ 6= 0 for all B = {r1, r2, . . . , rk} ⊂ E. Hence the proof. It is also not difficult to a see a proof of the above corollary in a similar way as in the proof of Theorem 8.17. Remark 41. As a consequence of the above results, we have the following infinite class of polynomials that yield recursive MDS matrices. Let s ≥ 2k and α be a ¯∗ root of an irreducible polynomial of degree s over F2. Then for any c ∈ F2s , the k Qk−2 i m polynomial g(x) = (x − cα ) · i=0 (x − cα ) yields recursive MDS matrices Cg for m ∈ {k, . . . , s − k}. 8 7 6 In the examples below the constructing polynomial of F28 is x + x + x + x + 1 and α is a root of it. Example 11. We have β = α15 is a primitive 17th and the degree of its minimal polynomial is 8. Then from the remark above the polynomial g(x) = (x − 1)(x − β)(x − β2)(x − β4) yields a recursive MDS matrix of size 4 × 4. One m can see that Cg is MDS for 4 ≤ m ≤ 13. But the same polynomial can also be obtained from the construction II(b). Example 12. We have β = α15 is a primitive 17th root unity. Consider the polynomial g(x) = (x − 1)(x − β)(x − β2)(x − β3)(x − β5). We have verified that this polynomial satisfies the condition in Theorem 8.17 and so it yields a recursive 5 MDS matrix of size 5×5. One can check that Cg is an MDS matrix and can also be verified that this polynomial cannot be obtained by the other known constructions I(a), II(a) and II(b). Next we provide two more methods for the construction of polynomials that yield recursive MDS matrices. These two constructions appear to be similar (but not same) and so we denote them by II(a) and II(b).


8.3.4. Construction II(a). The recursive MDS matrices obtained with the method which uses Gabidulin codes [7, Section 3.4] can also be obtained by the following method. Let $\mathbb{F}_{q_1}$ be a subfield of $\mathbb{F}_q$, i.e. $q_1 = p^{s'}$ for some $s' \mid s$.

Theorem 8.18. [27, Theorem 4] Let $\lambda_i = \theta^{q_1^{i-1}}$, $1 \leq i \leq k$, for some $\theta \in \mathbb{F}_q$. Let $g(x) = \prod_{i=1}^{k}(x - \lambda_i)$. Let $E = \{0, 1, \ldots, k-1, m, m+1, \ldots, m+k-1\}$ for some integer $m \geq k$. Then the matrix $C_g^m$ is MDS if and only if any subset $B$ of $\{\theta^i : i \in E\}$ with $|B| = k$ is linearly independent over $\mathbb{F}_{q_1}$.

Proof. As discussed above, the matrix $C_g^m$ is MDS if and only if $\det(H'[R])$ of the matrix $H'[R]$ given in (11) is nonzero for all subsets $R = \{r_1, r_2, \ldots, r_k\} \subset E$. We have $\lambda_i = \theta^{q_1^{i-1}}$ for $1 \leq i \leq k$, and so we get
$$ H'[R] = \begin{pmatrix} \theta^{r_1} & (\theta^{r_1})^{q_1} & (\theta^{r_1})^{q_1^2} & \cdots & (\theta^{r_1})^{q_1^{k-1}} \\ \theta^{r_2} & (\theta^{r_2})^{q_1} & (\theta^{r_2})^{q_1^2} & \cdots & (\theta^{r_2})^{q_1^{k-1}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \theta^{r_k} & (\theta^{r_k})^{q_1} & (\theta^{r_k})^{q_1^2} & \cdots & (\theta^{r_k})^{q_1^{k-1}} \end{pmatrix} = \begin{pmatrix} y_{r_1} & y_{r_1}^{q_1} & y_{r_1}^{q_1^2} & \cdots & y_{r_1}^{q_1^{k-1}} \\ y_{r_2} & y_{r_2}^{q_1} & y_{r_2}^{q_1^2} & \cdots & y_{r_2}^{q_1^{k-1}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ y_{r_k} & y_{r_k}^{q_1} & y_{r_k}^{q_1^2} & \cdots & y_{r_k}^{q_1^{k-1}} \end{pmatrix}, $$

where $y_{r_i} = \theta^{r_i}$, $1 \leq i \leq k$. Now by [42, Lemma 3.51] we have $\det(H'[R]) \neq 0$ if and only if $y_{r_i} = \theta^{r_i}$, $1 \leq i \leq k$, are linearly independent over $\mathbb{F}_{q_1}$. We need $\det(H'[R]) \neq 0$ for all subsets $R = \{r_1, r_2, \ldots, r_k\} \subset E$. Hence the theorem.

Remark 42. We can see that the condition on $\theta$ in Theorem 8.18 is applicable even if we take $\lambda_i = \theta^{q_1^{i-1}} c$ for some $c \in \mathbb{F}_q^*$. By considering the roots in this way, the polynomials that we get are the same as those obtained by applying Lemma 8.15. We can also see that, as in Remark 39, it is not always possible to express $\lambda_i = \theta^{q_1^{i-1}} c$, $1 \leq i \leq k$, in the form $\theta'^{\,q_1^{j-1}}$, $1 \leq j \leq k$, for some $\theta'$. Thus we can get a larger set of polynomials than the method which uses Gabidulin codes [7, Section 3.4].

Example 13. Let the field $\mathbb{F}_{2^8} = \mathbb{F}_2[x]/\langle \mu(x) \rangle$, where $\mu(x) = x^8 + x^4 + x^3 + x^2 + 1$ is a primitive polynomial over $\mathbb{F}_2$. Let $\alpha \in \mathbb{F}_{2^8}$ be a root of $\mu(x)$, i.e. a primitive element of $\mathbb{F}_{2^8}$. Let us consider $q_1 = 2$, $m = k = 4$ and $\theta = \alpha$ in Theorem 8.18; then $\lambda_1 = \alpha$, $\lambda_2 = \alpha^2$, $\lambda_3 = \alpha^4$ and $\lambda_4 = \alpha^8$. Then we get $g_2(x) = (x+\alpha)(x+\alpha^2)(x+\alpha^4)(x+\alpha^8) = x^4 + \alpha^{238} x^3 + \alpha^{235} x^2 + \alpha^{168} x + \alpha^{15}$. Thus the companion matrix $C_{g_2} = \mathrm{Companion}(\alpha^{15}, \alpha^{168}, \alpha^{235}, \alpha^{238})$ and
$$ C_{g_2}^4 = \begin{pmatrix} \alpha^{15} & \alpha^{168} & \alpha^{235} & \alpha^{238} \\ \alpha^{253} & \alpha^{49} & \alpha^{170} & \alpha^{190} \\ \alpha^{205} & \alpha^{246} & \alpha^{92} & \alpha^{138} \\ \alpha^{153} & \alpha^{252} & \alpha^{3} & \alpha^{18} \end{pmatrix} $$
is an MDS matrix.

Relationship with [7]: It was observed in [7] that MDS matrices can be constructed using Gabidulin codes [8] by appropriately choosing certain parameters. It was then established that such an MDS matrix is in fact a recursive MDS matrix if a polynomial basis is chosen. We will next see that Berger's method is a particular case of Theorem 8.18. Let $q = 2^s$ and $s = 2k$. The generator matrix described in [7,


Section 3.4] is of the form $G = [V \mid \hat{V}] =$
$$ \begin{pmatrix} 1 & \alpha & \cdots & \alpha^{k-1} & \alpha^{k} & \alpha^{k+1} & \cdots & \alpha^{2k-1} \\ 1 & \alpha^2 & \cdots & \alpha^{2(k-1)} & \alpha^{2k} & \alpha^{2(k+1)} & \cdots & \alpha^{2(2k-1)} \\ 1 & \alpha^4 & \cdots & \alpha^{4(k-1)} & \alpha^{4k} & \alpha^{4(k+1)} & \cdots & \alpha^{4(2k-1)} \\ \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 1 & \alpha^{2^{k-1}} & \cdots & \alpha^{2^{k-1}(k-1)} & \alpha^{2^{k-1}k} & \alpha^{2^{k-1}(k+1)} & \cdots & \alpha^{2^{k-1}(2k-1)} \end{pmatrix}, $$

where $\{1, \alpha, \alpha^2, \ldots, \alpha^{2k-1}\}$ is a polynomial basis of $\mathbb{F}_q$. Then the matrix $G$ can be reduced to systematic form $[I \mid A]$ by applying elementary row operations: $V^{-1} G = V^{-1}[V \mid \hat{V}] = [I \mid A]$. It was shown that the first column of the matrix $A = V^{-1}\hat{V}$ gives the coefficients of the polynomial $g$ such that $C_g^k = A$ is an MDS matrix.

Observe that for $m \geq k$, if $\deg(\mathrm{Min}_{\mathbb{F}_{q_1}}(\theta)) \geq m + k$, then $\theta^{r_i}$, $1 \leq i \leq k$, are linearly independent over $\mathbb{F}_{q_1}$ for all subsets $R = \{r_1, r_2, \ldots, r_k\} \subset E = \{0, 1, 2, \ldots, k-1, m, m+1, \ldots, m+k-1\}$. Thus, by taking $\theta$ to be a primitive element in $\mathbb{F}_{2^s}$ with $2k \leq s$, $m = k$ and $q_1 = 2$, we can see that the condition of Theorem 8.18 is satisfied. The polynomials obtained with these choices in Theorem 8.18 are the same as those obtained by the method discussed in [7, Section 3.4]. This way we can get an infinite class of polynomials that yield recursive MDS matrices. The code corresponding to a matrix of this type has an additional property: maximum distance. As discussed in Remark 42, we can fit many other choices in Theorem 8.18 and thus we get a larger set of polynomials than the method discussed in [7, Section 3.4]. Note that the condition $\deg(\mathrm{Min}_{\mathbb{F}_{q_1}}(\theta)) \geq m + k$ is not necessary. We see below an example where an element $\theta$ satisfies the condition in Theorem 8.18 but $\deg(\mathrm{Min}_{\mathbb{F}_{q_1}}(\theta)) < m + k$.

Example 14. Let the field $\mathbb{F}_{2^6} = \mathbb{F}_2[x]/\langle \mu(x) \rangle$, where $\mu(x) = x^6 + x^4 + x^3 + x + 1$ is a primitive polynomial over $\mathbb{F}_2$. Let $\alpha \in \mathbb{F}_{2^6}$ be a root of $\mu(x)$, i.e. a primitive element of $\mathbb{F}_{2^6}$. Let us consider $q_1 = 2$, $m = k = 4$, $\theta = \alpha$ in Theorem 8.18; then $\lambda_1 = \alpha$, $\lambda_2 = \alpha^2$, $\lambda_3 = \alpha^4$ and $\lambda_4 = \alpha^8$. Then we get $g_3(x) = (x+\alpha)(x+\alpha^2)(x+\alpha^4)(x+\alpha^8) = x^4 + \alpha^{30} x^3 + \alpha^{43} x^2 + \alpha^{51} x + \alpha^{15}$. Thus the companion matrix $C_{g_3} = \mathrm{Companion}(\alpha^{15}, \alpha^{51}, \alpha^{43}, \alpha^{30})$ and
$$ C_{g_3}^4 = \begin{pmatrix} \alpha^{15} & \alpha^{51} & \alpha^{43} & \alpha^{30} \\ \alpha^{45} & \alpha^{28} & \alpha^{34} & \alpha^{19} \\ \alpha^{34} & \alpha^{40} & \alpha^{43} & \alpha^{5} \\ \alpha^{20} & \alpha^{17} & \alpha^{47} & \alpha^{42} \end{pmatrix} $$
is an MDS matrix. Note that $\deg(\mathrm{Min}_{\mathbb{F}_2}(\alpha)) = 6 < m + k = 8$.

8.3.5. Construction II(b). Now we present another method which appears to be similar to the previous one, but the difference is that the roots of the polynomial $g(x)$ considered below may not fit the form described in Theorem 8.18 (see also Remark 43). With this method we get a new infinite class of polynomials that yield recursive MDS matrices. Let $\mathbb{F}_{q_1}$ be a subfield of $\mathbb{F}_q$, i.e. $q_1 = p^{s'}$ for some $s' \mid s$.

Theorem 8.19. [27, Theorem 5] Let $\lambda_1 = 1$ and $\lambda_i = \theta^{q_1^{i-2}}$, $2 \leq i \leq k$, for some $\theta \in \mathbb{F}_q$. Let $g(x) = \prod_{i=1}^{k}(x - \lambda_i)$. Let $E = \{0, 1, \ldots, k-1, m, m+1, \ldots, m+k-1\}$ for some integer $m \geq k$. Then the matrix $C_g^m$ is MDS if and only if for any subset $B = \{b_1, b_2, \ldots, b_k\}$ of $\{\theta^i : i \in E\}$ with $|B| = k$, there does not exist a nontrivial


linear combination over $\mathbb{F}_{q_1}$ satisfying
$$ \sum_{j=1}^{k} \alpha_j b_j = 0 \quad \text{and} \quad \sum_{j=1}^{k} \alpha_j = 0. $$

Proof. As discussed above, the matrix $C_g^m$ is MDS if and only if $\det(H'[R])$ of the matrix $H'[R]$ given in (11) is nonzero for all subsets $R = \{r_1, r_2, \ldots, r_k\} \subset E$. We have $\lambda_1 = 1$ and $\lambda_i = \theta^{q_1^{i-2}}$ for $2 \leq i \leq k$, and so we get
$$ H'[R] = \begin{pmatrix} 1 & (\theta)^{r_1} & (\theta^{q_1})^{r_1} & \cdots & (\theta^{q_1^{k-2}})^{r_1} \\ 1 & (\theta)^{r_2} & (\theta^{q_1})^{r_2} & \cdots & (\theta^{q_1^{k-2}})^{r_2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & (\theta)^{r_k} & (\theta^{q_1})^{r_k} & \cdots & (\theta^{q_1^{k-2}})^{r_k} \end{pmatrix} = \begin{pmatrix} 1 & y_{r_1} & y_{r_1}^{q_1} & \cdots & y_{r_1}^{q_1^{k-2}} \\ 1 & y_{r_2} & y_{r_2}^{q_1} & \cdots & y_{r_2}^{q_1^{k-2}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & y_{r_k} & y_{r_k}^{q_1} & \cdots & y_{r_k}^{q_1^{k-2}} \end{pmatrix}, $$

where $y_{r_i} = \theta^{r_i}$ for $1 \leq i \leq k$. By subtracting the first row from the other rows, we see that
$$ \det(H'[R]) = \begin{vmatrix} z_{r_2} & z_{r_2}^{q_1} & z_{r_2}^{q_1^2} & \cdots & z_{r_2}^{q_1^{k-2}} \\ z_{r_3} & z_{r_3}^{q_1} & z_{r_3}^{q_1^2} & \cdots & z_{r_3}^{q_1^{k-2}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ z_{r_k} & z_{r_k}^{q_1} & z_{r_k}^{q_1^2} & \cdots & z_{r_k}^{q_1^{k-2}} \end{vmatrix}, $$
where $z_{r_i} = y_{r_i} - y_{r_1}$ for $2 \leq i \leq k$. Now by [42, Lemma 3.51] we get $\det(H'[R]) \neq 0$ if and only if $z_{r_i}$, $2 \leq i \leq k$, are linearly independent over $\mathbb{F}_{q_1}$. Consider a linear combination of the $z_{r_i}$, $2 \leq i \leq k$, over $\mathbb{F}_{q_1}$:
$$ \sum_{i=2}^{k} \alpha_i z_{r_i} = \sum_{i=2}^{k} \alpha_i (\theta^{r_i} - \theta^{r_1}) = \sum_{i=2}^{k} \alpha_i \theta^{r_i} - \left( \sum_{i=2}^{k} \alpha_i \right) \theta^{r_1}, $$
where $\alpha_i \in \mathbb{F}_{q_1}$ for $2 \leq i \leq k$. By setting $\alpha_1 = -\sum_{i=2}^{k} \alpha_i$, we can see that $\det(H'[R]) \neq 0$ if and only if there does not exist a nontrivial linear combination of $B = \{\theta^{r_i} : r_i \in R\}$ over $\mathbb{F}_{q_1}$ as specified. We need this condition to be satisfied for all subsets $R = \{r_1, r_2, \ldots, r_k\} \subset E$. Hence the proof.

Remark 43. We can also see that the condition on $\theta$ in Theorem 8.19 is applicable even if we take $\lambda_1 = c$ and $\lambda_i = \theta^{q_1^{i-2}} c$, $2 \leq i \leq k$, for some $c \in \mathbb{F}_q^*$. By considering the roots in this way, the polynomials that we get are the same as those obtained by applying Lemma 8.15.

Example 15. Let the field $\mathbb{F}_{2^8} = \mathbb{F}_2[x]/\langle \mu(x) \rangle$, where $\mu(x) = x^8 + x^4 + x^3 + x^2 + 1$ is a primitive polynomial over $\mathbb{F}_2$. Let $\alpha \in \mathbb{F}_{2^8}$ be a root of $\mu(x)$, i.e. a primitive element of $\mathbb{F}_{2^8}$. Let us consider $q_1 = 2$, $m = k = 4$, $\theta = \alpha$ in Theorem 8.19; then $\lambda_1 = 1$, $\lambda_2 = \alpha$, $\lambda_3 = \alpha^2$ and $\lambda_4 = \alpha^4$. Then we get $g_4(x) = (x+1)(x+\alpha)(x+\alpha^2)(x+\alpha^4) = x^4 + \alpha^{129} x^3 + \alpha^{167} x^2 + \alpha^{11} x + \alpha^7$. Thus the companion matrix $C_{g_4} = \mathrm{Companion}(\alpha^7, \alpha^{11}, \alpha^{167}, \alpha^{129})$ and
$$ C_{g_4}^4 = \begin{pmatrix} \alpha^{7} & \alpha^{11} & \alpha^{167} & \alpha^{129} \\ \alpha^{136} & \alpha^{2} & \alpha^{77} & \alpha^{121} \\ \alpha^{128} & \alpha^{232} & \alpha^{47} & \alpha^{54} \\ \alpha^{61} & \alpha^{120} & \alpha^{211} & \alpha^{81} \end{pmatrix} $$
is an MDS matrix.


Remark 44. We have verified that the roots $\lambda_1 = 1$, $\lambda_2 = \alpha$, $\lambda_3 = \alpha^2$ and $\lambda_4 = \alpha^4$ of $g_4(x)$ considered in the above example cannot be expressed as $\theta'^{\,q_1^{i-1}} c$ for any $\theta', c \in \mathbb{F}_{2^8}$ (see Remark 42). Therefore the polynomial $g_4(x)$ cannot be obtained by Theorem 8.18 and Lemma 8.15.

Observe that if an element $\theta \in \bar{\mathbb{F}}_q^*$ satisfies the condition in Theorem 8.18, then it also satisfies the condition in Theorem 8.19. Therefore, if $g(x) = \prod_{i=1}^{k}(x - c\theta^{q_1^{i-1}})$ is a polynomial obtained by Theorem 8.18 and Remark 42, then the polynomial $g'(x) = (x - c) \prod_{i=2}^{k}(x - c\theta^{q_1^{i-2}})$ also yields a recursive MDS matrix, and it can be obtained by Theorem 8.19 and Remark 43. For this reason, with the choices as in [7, Section 3.4], we get a new infinite class of recursive MDS matrices from Theorem 8.19.

8.4. Repeated-root cyclic codes. We were interested in finding $g(x)$ which yields an MDS code with $\mathrm{ord}(g) = 2k$ (recall Subsection 8.1), so that $C_g^k$ becomes an MDS matrix. For the case when $g(x)$ has no repeated root in a field of characteristic 2, the idea was to generate a $[2k+z, k+z, d]_q$ MDS code and then shorten it at $z$ positions to get a $[2k, k, d]_q$ code. The reason for shortening was the fact that $g(x)$ cannot have order $2k$ in a field of characteristic 2. But $g(x)$ may have order $2k$ when it has repeated roots. We illustrate this with an example. Suppose $q = 2$ and $k = 7$. Consider $g(x) = (x^3 + x^2 + 1)^2 (x+1)$. It is easy to check that $g(x)$ divides $x^{14} + 1$, because $x^7 + 1 = (x+1)(x^3+x^2+1)(x^3+x+1)$. Now we show that $g(x)$ does not divide $x^i + 1$ for $1 \leq i \leq 13$. For $1 \leq i \leq 6$ it is obvious, because $\deg(g) = 7$. Moreover $g(x)$ does not divide $x^7 + 1$ (look at the factors). Now, if $g(x)$ divided $x^i + 1$ for some $8 \leq i \leq 13$, then $g(x)$ would divide $x^{14-i} + 1$ (since $\gcd(g(x), x) = 1$), a contradiction. Thus $\mathrm{ord}(g) = 14 = 2k$. Note that $g(x)$ has repeated roots; otherwise this would not be possible.

This subsection discusses the case when $g(x)$ has multiple roots. One of the most important results of this subsection is Corollary 14, which states that if $g(x) \in \mathbb{F}_2[x]$ has any repeated root and $\deg(g) \geq 2$, it is not possible to get a recursive MDS matrix $M = C_g^m$ for any $m \geq 0$. As a result, it can be concluded that there does not exist any involutory recursive MDS matrix of order $k \geq 2$ over a field of characteristic 2.

Now we define some basic notions and discuss some existing results which will be used in this subsection. We present only those results on repeated-root cyclic codes which are relevant to this paper. For more details, please see [10, 43].

Definition 8.20. [27, Definition 1] Let $g(x) = \sum_{i=0}^{k} a_i x^i \in \mathbb{F}_q[x]$. Then for an integer $d \geq 0$, the $d$th Hasse derivative $g^{[d]}(x)$ of $g$ is defined by
$$ g^{[d]}(x) = \sum_{i=0}^{k} \binom{i}{d} a_i x^{i-d}, $$
where we assume $\binom{i}{d} = 0$ if $i < d$. We have the following useful lemma.

Lemma 8.21. [10, Section II] Let $\mu(x) \in \mathbb{F}_q[x]$ be an irreducible polynomial and $e$ be a positive integer. Then $\mu(x)^e$ divides $g(x) \in \mathbb{F}_q[x]$ if and only if $\mu(x)$ divides $g(x)$ and its first $e-1$ Hasse derivatives, i.e. $\mu(x)$ divides $g^{[d]}(x)$ for all $d$, $0 \leq d \leq e-1$.
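For a linear factor $\mu(x) = x - \lambda$, Lemma 8.21 says that the multiplicity of $\lambda$ as a root of $g$ equals the number of consecutive Hasse derivatives $g^{[0]}, g^{[1]}, \ldots$ vanishing at $\lambda$. The sketch below checks this for the repeated-root polynomial $(x-1)^3$ over $\mathbb{F}_7$ that reappears in Example 16; the field and the polynomial are illustrative choices.

```python
# Hasse derivatives (Definition 8.20) and a multiplicity test via Lemma 8.21
# for the degree-1 factor mu(x) = x - lam.
from math import comb

P = 7  # prime field F_7

def hasse_derivative(coeffs, d):
    """d-th Hasse derivative of sum a_i x^i, coefficients low degree first."""
    return [(comb(i, d) * a) % P for i, a in enumerate(coeffs) if i >= d]

def evaluate(coeffs, x):
    return sum(a * pow(x, i, P) for i, a in enumerate(coeffs)) % P

def root_multiplicity(coeffs, lam):
    """Largest e with (x - lam)^e dividing the (nonzero monic) polynomial."""
    e = 0
    while e < len(coeffs) and evaluate(hasse_derivative(coeffs, e), lam) == 0:
        e += 1
    return e

if __name__ == "__main__":
    # g(x) = (x - 1)^3 = x^3 - 3x^2 + 3x - 1 over F_7
    g = [(-1) % P, 3, (-3) % P, 1]
    print(root_multiplicity(g, 1))   # expected: 3
```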


In the case where the generator polynomial $g(x) \in \mathbb{F}_q[x]$ has a root with multiplicity greater than 1 (i.e. a repeated root), the code $\Gamma = \langle g(x) \rangle$ generated by $g$ is said to be a repeated-root cyclic code. A general theory of such codes was presented in [10, 43]. From Lemma 8.21, one can see that an analogue of the above lemma in this case is as stated below (see also [10, Section II]).

Lemma 8.22. [27, Lemma 3] Let $g(x) \in \mathbb{F}_q[x]$ be a monic polynomial of degree $k$ with $\mathrm{ord}(g) = n \geq 2$. Suppose that $g$ has $t$ distinct roots, say $\lambda_1, \ldots, \lambda_t \in \bar{\mathbb{F}}_q$, with multiplicities $e_1, e_2, \ldots, e_t$ respectively. Then $f(x) = \sum_{i=0}^{n-1} f_i x^i \in \mathbb{F}_q[x]/(x^n - 1)$ is a codeword of $\Gamma = \langle g(x) \rangle$ if and only if the coefficient vector $(f_0, f_1, \ldots, f_{n-1})$ of $f$ is in the null space of the matrix
$$ H = \begin{pmatrix} H_{e_1}(\lambda_1) \\ \vdots \\ H_{e_t}(\lambda_t) \end{pmatrix}, $$
where the matrix $H_{e_i}(\lambda_i)$ is of size $e_i \times n$ and its rows are the $n$-tuples
$$ (12) \qquad \left( \binom{0}{j},\ \binom{1}{j}\lambda_i^{1-j},\ \binom{2}{j}\lambda_i^{2-j},\ \ldots,\ \binom{n-1}{j}\lambda_i^{n-j-1} \right) $$
for $0 \leq j \leq e_i - 1$. In other words, $H$ is a parity check matrix of $\Gamma$.

The $n$-tuples mentioned in the above lemma are the $j$th Hasse derivatives of the vector $(1, \lambda_i, \lambda_i^2, \ldots, \lambda_i^{n-1})$, treating $\lambda_i$ as a variable (see Definition 8.20).

Lemma 8.23. [27, Lemma 4] Let $g(x) \in \mathbb{F}_q[x]$ be a monic polynomial of degree $k$ with $\mathrm{ord}(g) = n$. Suppose that $g$ has $t$ distinct roots, say $\lambda_1, \ldots, \lambda_t \in \bar{\mathbb{F}}_q$, with multiplicities $e_1, e_2, \ldots, e_t$ respectively. Let $S$ be the code with generator matrix $G' = [-C_g^m \mid I]$ for some $m$, $k \leq m \leq n-k$. Then a vector $(f_0, f_1, \ldots, f_{k-1}, f_m, f_{m+1}, \ldots, f_{m+k-1}) \in \mathbb{F}_q^{2k}$ is an element of $S$ if and only if it is in the null space of the matrix $H'$ given by
$$ (13) \qquad H' = \begin{pmatrix} H'_{e_1}(\lambda_1) \\ \vdots \\ H'_{e_t}(\lambda_t) \end{pmatrix}, $$
where the matrix $H'_{e_i}(\lambda_i)$ is of size $e_i \times 2k$ and its rows are the $2k$-tuples
$$ \left( \binom{0}{j},\ \binom{1}{j}\lambda_i^{1-j},\ \ldots,\ \binom{k-1}{j}\lambda_i^{k-1-j},\ \binom{m}{j}\lambda_i^{m-j},\ \binom{m+1}{j}\lambda_i^{m+1-j},\ \ldots,\ \binom{m+k-1}{j}\lambda_i^{m+k-j-1} \right) $$
for $0 \leq j < e_i$. In particular, if the polynomial $g(x)$ has no repeated roots, i.e. $e_i = 1$ for $1 \leq i \leq t = k$, then the matrix $H'$ is given by
$$ (14) \qquad H' = \begin{pmatrix} 1 & \lambda_1 & \cdots & \lambda_1^{k-1} & \lambda_1^m & \lambda_1^{m+1} & \cdots & \lambda_1^{m+k-1} \\ \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 1 & \lambda_k & \cdots & \lambda_k^{k-1} & \lambda_k^m & \lambda_k^{m+1} & \cdots & \lambda_k^{m+k-1} \end{pmatrix}. $$

Let $S$ and $H'$ be as defined in the above lemma. Note that the code $S$ has dimension $k$. From Lemma 8.23, it is easy to see that the code $S$ is MDS (i.e. $S$ has minimum distance $k+1$) if and only if the null space of the matrix $H'$ in $\mathbb{F}_q^{2k}$ does not contain a nonzero vector of weight $k$ or less; in other words, any $k$ columns of the matrix $H'$ are linearly independent over $\mathbb{F}_q$. For convenience, we state this result as follows.


Theorem 8.24. [27, Theorem 2] Let $g(x) \in \mathbb{F}_q[x]$ be a monic polynomial of degree $k$ with $\mathrm{ord}(g) = n$. Suppose that $g$ has $t$ distinct roots, say $\lambda_1, \ldots, \lambda_t \in \bar{\mathbb{F}}_q$, with multiplicities $e_1, e_2, \ldots, e_t$ respectively. Let $m$ be an integer with $k \leq m \leq n-k$. Then the matrix $M = C_g^m$ is MDS if and only if any $k$ columns of the matrix $H'$ given in (13) are linearly independent over $\mathbb{F}_q$.

Corollary 14. [27, Corollary 3] Let $\mathrm{char}(\mathbb{F}_q) = p$ and $g(x) \in \mathbb{F}_q[x]$ be a monic polynomial of degree $k$ with $g(0) \neq 0$ and $\mathrm{ord}(g) = n$. If the polynomial $g(x)$ has a root $\lambda \in \bar{\mathbb{F}}_q$ with multiplicity $e \geq p$, then the matrix $M = C_g^m$ is not MDS for any non-negative integer $m$.

Proof. Since $\mathrm{ord}(g) = n$, one can easily see that it is enough to verify the result for $k \leq m \leq n-k$. From the discussion above and by Theorem 8.24, it is enough to show that the $k \times 2k$ matrix $H'$ corresponding to $g$ given in (13) contains a singular $k \times k$ submatrix. For this purpose we show that a row of $H'$ contains at least $k$ zeros. Now consider the following $n$-tuple as given in (12), corresponding to $\lambda$ and $j = p-1$:
$$ \left( \binom{0}{p-1},\ \ldots,\ \binom{k-1}{p-1}\lambda^{k-p},\ \ldots,\ \binom{m}{p-1}\lambda^{m-p+1},\ \ldots,\ \binom{m+k-1}{p-1}\lambda^{m+k-p},\ \ldots,\ \binom{n-1}{p-1}\lambda^{n-p} \right). $$
Observe that for any non-negative integer $i$ we have $\binom{i}{p-1} \equiv 0 \pmod p$ if $i \not\equiv -1 \pmod p$, and otherwise $\binom{i}{p-1} \not\equiv 0 \pmod p$. Therefore, in the above $n$-tuple, nonzero elements appear only at positions with indices (index starts from 1) that are multiples of $p$. It is given that the polynomial $g$ has a root $\lambda$ with multiplicity $\geq p$, and so the matrix $H'$ contains a row formed by the first $k$ elements and the $k$ consecutive elements starting from the $m$th element of the above $n$-tuple. From the above observation it is evident that this row of $H'$ contains at least $k$ zeros. Hence the proof.
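The binomial-coefficient observation at the heart of this proof is easy to sanity-check numerically; the sketch below (small primes chosen arbitrarily) verifies that $\binom{i}{p-1} \equiv 0 \pmod p$ precisely when $i \not\equiv -1 \pmod p$.

```python
# Numerical check: C(i, p-1) = 0 (mod p) exactly when i is not -1 mod p.
from math import comb

for p in (2, 3, 5, 7):
    for i in range(200):
        vanishes = comb(i, p - 1) % p == 0
        assert vanishes == (i % p != p - 1), (p, i)
print("binomial observation verified for p in {2, 3, 5, 7}, i < 200")
```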

Remark 45. As a consequence of the above corollary, if $\mathrm{char}(\mathbb{F}_q) = 2$ then a polynomial $g(x) \in \mathbb{F}_q[x]$ which has a repeated root cannot yield a recursive MDS matrix. Thus the recursive MDS matrices over fields of characteristic 2, which are of importance in cryptographic applications, can only be obtained from polynomials without repeated roots. But if $\mathrm{char}(\mathbb{F}_q) = p$ is odd, then there exist polynomials $g(x) \in \mathbb{F}_q[x]$ with repeated roots that yield recursive MDS matrices. We give an example below.

Example 16. Let $q = p = 7$. Consider the polynomial $g(x) = (x-1)^3 = x^3 - 3x^2 - 4x - 1 \in \mathbb{F}_7[x]$. The matrix $C_g^3$ obtained from the companion matrix $C_g = \mathrm{Companion}(1, 4, 3)$ of $g$ is MDS:
$$ C_g = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 4 & 3 \end{pmatrix} \quad \text{and} \quad C_g^3 = \begin{pmatrix} 1 & 4 & 3 \\ 3 & 6 & 6 \\ 6 & 6 & 3 \end{pmatrix}. $$

Involutory MDS matrices. Recall that a square matrix $M$ is said to be involutory if $M^2 = I$. An involutory recursive MDS matrix $M$ is an MDS matrix which is involutory and equal to $C_g^m$ for some companion matrix $C_g$ and some $m \geq k$.

Theorem 8.25. [28, Theorem 2] There cannot be an involutory recursive MDS matrix $M$ over fields of characteristic 2, except the trivial case $M = [1]$.

Proof. Let $M = C_g^m$ be a recursive MDS matrix for some polynomial $g$ over $\mathbb{F}_q$ of degree $k$ and $m \geq k$. Let $l$ be the smallest positive integer such that $C_g^l = I_k$. Then,

from the discussion in the above section, if $M$ is MDS then $g$ cannot have a repeated root. In that case $l = \mathrm{ord}(g)$ is odd, since $\mathrm{char}(\mathbb{F}_q) = 2$ (see the first paragraph of Subsection 8.4). Suppose $M^2 = C_g^{2m} = I$; then $l$ divides $2m$ and, being odd, $l$ divides $m$. Therefore $M = C_g^m = I$, which is a contradiction, as $I_k$ cannot be MDS unless $I_k = [1]$.

8.5. Search vs. direct construction method. In the year 2011, in order to reduce the hardware area of the proposed hash function family PHOTON [19], Guo et al. came up with the idea of reducing the diffusion layer area by choosing a sparse non-MDS matrix and then applying it recursively to finally obtain an MDS one. The proposed matrix was a companion matrix. The PHOTON family has companion matrices of different sizes, ranging from $4 \times 4$ up to $8 \times 8$. These matrices were obtained through exhaustive search, and the search was possible because the order of the matrix was limited to 8 and the field size was limited to $2^8$.

The authors of [19] defined $\mathrm{Serial}(z_0, \ldots, z_{k-1})$ ($= \mathrm{Companion}(z_0, \ldots, z_{k-1})$), which is the companion matrix of $z_0 + z_1 x + z_2 x^2 + \ldots + z_{k-1} x^{k-1} + x^k$. Their objective was to find suitable candidates so that $\mathrm{Companion}(z_0, \ldots, z_{k-1})^k$ is an MDS matrix. In [19], the authors proposed an MDS matrix $\mathrm{Companion}(1, \alpha, 1, \alpha^2)^4$ over $\mathbb{F}_{2^8}$, where $\alpha$ is a root of the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$, for the AES MixColumn operation, which has a compact and improved hardware footprint [19]. The proper choice of $z_0, z_1, z_2$ and $z_3$ (preferably of low Hamming weight) improves the hardware implementation of the AES MixColumn transformation. It may be noted that the MixColumn operation in [19] is composed of $k$ ($k = 4$ for AES) applications of the matrix $\mathrm{Companion}(z_0, \ldots, z_{k-1})$ to the input column vector. More formally, let $W = (w_0, \ldots, w_{k-1})^T$ be the input column vector of MixColumn and $Y = (y_0, \ldots, y_{k-1})^T$ be the corresponding output. Then $Y = A^k \times W = A \times (A \times (\cdots \times (A \times W)))$, where $A = \mathrm{Companion}(z_0, \ldots, z_{k-1})$ is applied $k$ times. So the hardware circuitry depends on the companion matrix $A$ and not on the MDS matrix $A^k$. Note that the authors of [19] used MAGMA [9] to test all possible values of $z_0, z_1, z_2$ and $z_3$ and found $\mathrm{Companion}(1, \alpha, 1, \alpha^2)$ to be the right candidate, which raised to the power 4 gives an MDS matrix. The authors of [54, 67] proposed new diffusion layers ($k \times k$ MDS matrices) based on companion matrices for small values of $k$.

To meet the challenge of obtaining recursive MDS matrices of larger size, a more concrete theory was required, and it came from the realm of coding theory. Though it seems an obvious choice, the initial works following that of Guo et al. lacked this direction. In 2013, Berger [7] showed how to use Gabidulin codes [8] to obtain an infinite class of recursive MDS matrices. The next year, in 2014, Augot et al. [2] showed how to use shortened BCH codes to obtain recursive MDS matrices. These two methods were enough to provide recursive MDS matrices of any order; nevertheless, a more generic construction criterion appeared in [27], using the roots of the characteristic polynomial of the companion matrix. Now we have many more construction methods beyond those using shortened BCH and Gabidulin codes. In fact, any recursive MDS matrix obtained using shortened BCH or Gabidulin codes can also be obtained from the generic construction method discussed in [27].
Now we have enough construction methods for larger matrix sizes as well. Though the generic construction methods make it feasible to obtain recursive MDS matrices of any order, there is no guarantee of getting a matrix of

the optimal hardware area; even for small sizes, this is not guaranteed. The only known method, so far, which can produce the optimal matrix in terms of area is exhaustive search, which is feasible when the matrix size is small (say, up to 8) and the field size is not too large (say, up to $\mathbb{F}_{2^8}$). Of course, exhaustive search is never a solution for matrices of larger order or bigger field size.

Broadly, there are two different ways to do exhaustive search: (a) try all possible field elements, or (b) try only a subset of the field elements which require less hardware area. Search method (a) is naive and obvious and works only for matrices of small order (say, up to 8), but it always produces the optimal matrices. The second method (b) may work for matrices of larger order, but its feasibility depends upon the cardinality and choice of the subset of field elements used for searching. The smaller the cardinality, the lower the chance of obtaining the required matrices, even though they may exist in that field. The main reason is that there may not exist any recursive MDS matrix with entries chosen from the subset, although one could exist if constructed from elements outside the subset. A lot of work has been done in this direction. We are not going to discuss the search methods for constructing recursive MDS matrices in this paper; however, they are particularly interesting in the context of lightweight cryptography. The reader may look, for example, at [19, 22, 54, 64, 67]. Recently, a new kind of matrix, known as the DSI matrix, was proposed to obtain MDS matrices recursively.

8.5.1. DSI matrix.

Definition 8.26. [64, Definition 5] Let $a = [a_1\ a_2\ \ldots\ a_k]$ and $b = [b_1\ b_2\ \ldots\ b_{k-1}]$, where $a_i, b_j \in \mathbb{F}_{2^r}$ for $1 \leq i \leq k$ and $1 \leq j \leq k-1$. A Diagonal-Serial-Invertible (DSI) matrix $D = (d_{i,j}) \in M_k(\mathbb{F}_{2^r})$ is determined by the two vectors $a$ and $b$ as follows:
$$ d_{i,j} = \begin{cases} a_1 & i = 1,\ j = k, \\ a_i & i = j + 1, \\ b_i & i = j \leq k - 1, \\ 0 & \text{otherwise.} \end{cases} $$
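The placement pattern in Definition 8.26 is easy to mechanize. The sketch below builds a DSI matrix from the two defining vectors; entries are kept as symbolic labels since only the structure is being illustrated, and the helper name is ours, not from [64].

```python
# Build a DSI matrix from a = [a_1..a_k] and b = [b_1..b_{k-1}] per Definition 8.26.

def dsi_matrix(a, b):
    """Return the k x k DSI matrix determined by the vectors a and b."""
    k = len(a)
    assert len(b) == k - 1
    D = [["0"] * k for _ in range(k)]
    D[0][k - 1] = a[0]                 # d_{1,k} = a_1
    for i in range(1, k):              # d_{i+1,i} = a_{i+1} (sub-diagonal)
        D[i][i - 1] = a[i]
    for i in range(k - 1):             # d_{i,i} = b_i for i <= k-1 (diagonal)
        D[i][i] = b[i]
    return D

if __name__ == "__main__":
    # the example that follows in the text: a = [1, 1, 1, alpha], b = [1, 0, alpha^3 + 1]
    a = ["1", "1", "1", "alpha"]
    b = ["1", "0", "alpha^3+1"]
    for row in dsi_matrix(a, b):
        print(row)
```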

For example, let $\alpha$ be a root of the polynomial $x^4 + x + 1 \in \mathbb{F}_2[x]$, $a = [1\ 1\ 1\ \alpha]$ and $b = [1\ 0\ \alpha^3 + 1]$. Then
$$ D = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & \alpha^3 + 1 & 0 \\ 0 & 0 & \alpha & 0 \end{pmatrix} $$
is a DSI matrix of order 4 which gives a recursive MDS matrix when raised to the power 4, i.e.
$$ D^4 = \begin{pmatrix} \alpha + 1 & \alpha + 1 & \alpha^3 + \alpha & 1 \\ 1 & \alpha & \alpha + 1 & 1 \\ \alpha^2 + 1 & \alpha^3 + \alpha^2 + \alpha + 1 & \alpha^3 + \alpha^2 & \alpha^3 \\ \alpha + 1 & \alpha^3 + 1 & \alpha^3 + \alpha^2 + 1 & \alpha \end{pmatrix} $$
is an MDS matrix.

Like companion matrices, these matrices are sparse and are useful for lightweight constructions of diffusion layers. In [64], the authors claimed that the DSI

matrix $D$ mentioned above has a lower hardware area than that of the best companion matrix obtained so far for order 4. Though it is a serious potential candidate for providing lightweight recursive MDS matrices, it faces two big challenges compared to companion matrices:
1. Not much theory has been developed so far, as opposed to companion matrices. The reason is obvious: DSI matrices were proposed only recently. We now have many direct constructions of recursive MDS matrices from companion matrices for any order of the matrix. But in the case of DSI matrices, the best known method is the search method (discussed in Subsection 8.5), and that is why their construction is limited to small order matrices (say, up to 8).
2. The other important parameter is the power $m$ such that $C_g^m$ or $D^m$ becomes MDS. In general, it is preferred to take $m$ equal to the order of the matrix itself, as it cannot be less than this value. But the existence of such an $m$ is not guaranteed for an arbitrary companion or DSI matrix. In the case of companion matrices, when using a direct construction method, $C_g$ can be constructed after fixing the order of the matrix and the power $m$, whereas there is no such luxury in the case of DSI matrices. The only way for a DSI matrix is to try different values of $m$. In general, $m$ is fixed to be equal to the order of the matrix, then several $D$'s are chosen randomly and it is checked whether $D^m$ is MDS or not. This is also feasible only for small order matrices.
In the future, DSI matrices may be very promising for higher order matrices as well, but the role of companion matrices for such orders is still unchallenged because of their rich mathematical theory. However, for small order matrices where the search method is feasible, DSI matrices have given better results than companion ones in terms of low-cost hardware area.

9. Conclusion

In this paper, we have studied the properties of MDS matrices and their various constructions from Cauchy, Vandermonde, circulant, left-circulant, Toeplitz, Hankel and companion matrices. We find a nontrivial equivalence between the Cauchy based constructions and the corresponding Vandermonde based constructions. We also observe the interconnection that a left-circulant matrix is nothing but a row-permuted circulant matrix, and a similar connection between Hankel and Toeplitz matrices. Using these interconnections, we provide an alternative proof that left-circulant and Hankel matrices of order $2^n$ cannot be both MDS and involutory. We do not discuss efficiency issues, but the theory accumulated and discussed here should provide a guide towards efficiency. The reader may look at [6] and the references mentioned therein for efficient implementations of MDS matrices. We revisit the results discussed in [2] and find a gap in one of the lemmas of that paper. In Subsection 8.2, we show the existence of the gap by providing an example and then give the proof of the lemma after rephrasing it correctly. In Subsection 8.3, we present five methods for the construction of polynomials that yield recursive MDS matrices. The main tools in these methods are Theorem 8.13 and the determinant of the matrix defined in (11). We believe that more constructions are possible using Theorem 8.13 and (11), and this may be taken up as future research. No direct construction from DSI matrices is available in the literature. It would be worthwhile to explore the algebraic properties of DSI matrices and provide direct recursive constructions from them.


Acknowledgments We are thankful to the anonymous reviewers for their valuable comments. We also wish to thank Prof. Rana Barua and Dr. Sanjay Bhattacherjee for providing several useful and valuable suggestions.

References

[1] D. Augot and M. Finiasz, Exhaustive search for small dimension recursive MDS diffusion layers for block ciphers and hash functions, In Proc. of the 2013 IEEE International Symposium on Information Theory, IEEE, 2013, 1551–1555.
[2] D. Augot and M. Finiasz, Direct construction of recursive MDS diffusion layers using shortened BCH codes, In FSE 2014, LNCS, Springer, 8540 (2015), 3–17. Also available at http://eprint.iacr.org/2014/566.pdf.
[3] P. S. L. M. Barreto and V. Rijmen, The Khazad Legacy-Level Block Cipher, Submission to the NESSIE Project, 2000. Available at https://www.cosic.esat.kuleuven.be/nessie/workshop/submissions.html.
[4] P. S. L. M. Barreto and V. Rijmen, The Anubis block cipher, Submission to the NESSIE Project, 2000. Available at https://www.cosic.esat.kuleuven.be/nessie/workshop/submissions.html.
[5] P. S. L. M. Barreto and V. Rijmen, The Whirlpool hashing function, In Proceedings of the 1st NESSIE Workshop, 15 pages, 2000. Available at https://www.cosic.esat.kuleuven.be/nessie/workshop/submissions.html.
[6] C. Beierle, T. Kranz and G. Leander, Lightweight multiplication in GF(2^n) with applications to MDS matrices, Advances in Cryptology – CRYPTO 2016, Part I, 625–653, Lecture Notes in Comput. Sci., 9814, Springer, Berlin, 2016.
[7] T. P. Berger, Construction of recursive MDS diffusion layers from Gabidulin codes, In INDOCRYPT 2013, LNCS, Springer, 8250 (2013), 274–285.
[8] T. P. Berger and A. Ourivski, Construction of new MDS codes from Gabidulin codes, In Proceedings of ACCT 2009, Kranevo, Bulgaria, 40–47, June, 2004.
[9] W. Bosma, J. Cannon and C. Playoust, The Magma algebra system I: The user language, J. Symbolic Comput., 24 (1997), 235–265, Computational algebra and number theory (London, 1993).
[10] G. Castagnoli, J. L. Massey, P. A. Schoeller and N. von Seeman, On repeated-root cyclic codes, In IEEE Transactions on Inform. Theory, 37 (1991), 337–342.
[11] V. Cauchois and P. Loidreau, About circulant involutory MDS matrices, Des. Codes Cryptogr., 87 (2019), 249–260.
[12] J. Choy, H. Yap, K. Khoo, J. Guo, T. Peyrin, A. Poschmann and C. H. Tan, SPN-hash: Improving the provable resistance against differential collision attacks, Progress in Cryptology – AFRICACRYPT 2012, 270–286, Lecture Notes in Comput. Sci., 7374, Springer, Heidelberg, 2012.
[13] T. Cui, C. Jin and Z. Kong, On compact Cauchy matrices for substitution-permutation networks, In IEEE Transactions on Computers, 64 (2015), 2098–2102.
[14] J. Daemen, L. R. Knudsen and V. Rijmen, The block cipher SQUARE, In 4th Fast Software Encryption Workshop, LNCS, 1267 (1997), 149–165, Springer-Verlag.
[15] J. Daemen and V. Rijmen, The Design of Rijndael: AES - The Advanced Encryption Standard, Springer-Verlag, 2002.
[16] G. D. Filho, P. Barreto and V. Rijmen, The Maelstrom-0 Hash Function, In Proceedings of the 6th Brazilian Symposium on Information and Computer Systems Security, 2006.
[17] P. Gauravaram, L. R. Knudsen, K. Matusiewicz, F. Mendel, C. Rechberger, M. Schlaffer and S. Thomsen, Grøstl, a SHA-3 Candidate, Submission to NIST, 2008. Available at http://www.groestl.info/.
[18] R. M. Gray, Toeplitz and Circulant Matrices: A Review, Foundations and Trends in Communications and Information Theory, NOW, 2005.
[19] J. Guo, T. Peyrin and A. Poschmann, The PHOTON family of lightweight hash functions, In CRYPTO, Springer, 2011 (2011), 222–239.
[20] J. Guo, T. Peyrin, A. Poschmann and M. J. B. Robshaw, The LED block cipher, In CHES 2011, LNCS, 6917 (2011), 326–341, Springer.


[21] K. C. Gupta and I. G. Ray, On constructions of involutory MDS matrices, Progress in Cryptology–AFRICACRYPT 2013, LNCS, 7918 (2013), 43–60.
[22] K. C. Gupta and I. G. Ray, On constructions of MDS matrices from companion matrices for lightweight cryptography, In CD-ARES 2013 Workshops: MoCrySEn, Springer, 8128 (2013), 29–43.
[23] K. C. Gupta and I. G. Ray, On constructions of circulant MDS matrices for lightweight cryptography, In ISPEC 2014, Springer, 2014, 564–576.
[24] K. C. Gupta and I. G. Ray, Cryptographically significant MDS matrices based on circulant and circulant-like matrices for lightweight applications, Cryptography and Communications, 7 (2015), 257–287.
[25] K. C. Gupta, S. K. Pandey and A. Venkateswarlu, Towards a general construction of recursive MDS diffusion layers, In WCC 2015, https://hal.inria.fr/hal-01276436.
[26] K. C. Gupta, S. K. Pandey and A. Venkateswarlu, On the direct construction of recursive MDS matrices, In Des. Codes Cryptogr., 82 (2017), 77–94.
[27] K. C. Gupta, S. K. Pandey and A. Venkateswarlu, Towards a general construction of recursive MDS diffusion layers, In Des. Codes Cryptogr., 82 (2017), 179–195.
[28] K. C. Gupta, S. K. Pandey and A. Venkateswarlu, Almost involutory recursive MDS diffusion layers, Des. Codes Cryptogr., 87 (2019), 609–626.
[29] H. Han and H. Zhang, The research on the maximum branch number of P-permutations, 2010 2nd International Workshop on Intelligent Systems and Applications, Wuhan, 2010, 1–4.
[30] H. M. Heys and S. E. Tavares, The design of substitution-permutation networks resistant to differential and linear cryptanalysis, Proceedings of the 2nd ACM Conference on Computer and Communications Security, Fairfax, Virginia, 1994, 148–155.
[31] H. M. Heys and S. E. Tavares, Avalanche characteristics of substitution-permutation encryption networks, IEEE Trans. Comp., 44 (1995), 1131–1139.
[32] H. M. Heys and S. E. Tavares, The design of product ciphers resistant to differential and linear cryptanalysis, Journal of Cryptology, 9 (1996), 1–19.
[33] J. W. P. Hirschfeld, The main conjecture for MDS codes, Cryptography and Coding (Cirencester, 1995), Lecture Notes in Comput. Sci., 1025 (1995), 44–52.
[34] P. Junod and S. Vaudenay, Perfect diffusion primitives for block ciphers: Building efficient MDS matrices, Selected Areas in Cryptography, 84–99, Lecture Notes in Comput. Sci., 3357, Springer, Berlin, 2005.
[35] P. Junod and S. Vaudenay, FOX: A new family of block ciphers, Selected Areas in Cryptography, 114–129, Lecture Notes in Comput. Sci., 3357, Springer, Berlin, 2005.
[36] P. Junod and M. Macchetti, Revisiting the IDEA philosophy, FSE 2009: Fast Software Encryption, Lecture Notes in Computer Science, 5665 (2009), 277–295.
[37] K. Khoo, T. Peyrin, A. Y. Poschmann and H. Yap, FOAM: Searching for hardware-optimal SPN structures and components with a fair comparison, In L. Batina and M. Robshaw (eds.), Cryptographic Hardware and Embedded Systems–CHES 2014, Lecture Notes in Computer Science, 8731 (2014), 433–450, Springer, Berlin, Heidelberg.
[38] N. Kolokotronis, K. Limniotis and N. Kalouptsidis, Factorization of over finite fields and application in stream ciphers, In Cryptogr. Commun., 1 (2009), 175–205.
[39] I. Kra and S. R. Santiago, On circulant matrices, Notices of the AMS, 59 (2012), 368–377.
[40] J. Lacan and J. Fimes, A construction of matrices with no singular square submatrices, Finite Fields and Applications, 2948 (2004), 145–147.
[41] J. Lacan and J. Fimes, Systematic MDS erasure codes based on Vandermonde matrices, IEEE Commun. Lett., 8 (2004), 570–572.
[42] R. Lidl and H. Niederreiter, Finite Fields, Cambridge University Press, Cambridge, 2nd edition, 1997.
[43] J. H. van Lint, Repeated-Root Cyclic Codes, IEEE Transactions on Inform. Theory, 37 (1991), 343–345.
[44] M. Liu and S. M. Sim, Lightweight MDS generalized circulant matrices, International Conference on Fast Software Encryption, Lecture Notes in Computer Science, 9783 (2016), 101–120, Springer, Berlin, Heidelberg.
[45] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error Correcting Codes, North-Holland Publishing Co., Amsterdam-New York-Oxford, 1977.


[46] F. Mattoussi, V. Roca and B. Sayadi, Complexity comparison of the use of Vandermonde versus Hankel matrices to build systematic MDS Reed-Solomon codes, 2012 IEEE 13th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Cesme, 2012, 344–348.
[47] J. Nakahara and E. Abrahao, A New Involutory MDS Matrix for the AES, International Journal of Network Security, 9 (2009), 109–116.
[48] M. K. Pehlivanoğlu, M. T. Sakalli, S. Akleylek, N. Duru and V. Rijmen, Generalisation of Hadamard matrix to generate involutory MDS matrices for lightweight cryptography, In IET Information Security, 12 (2018), 348–355.
[49] A. R. Rao and P. Bhimasankaram, Linear Algebra, Second Edition, Hindustan Book Agency.
[50] V. Rijmen, J. Daemen, B. Preneel, A. Bosselaers and E. D. Win, The cipher SHARK, In 3rd Fast Software Encryption Workshop, LNCS, 1039 (1996), 99–111, Springer-Verlag.
[51] R. M. Roth and G. Seroussi, On generator matrices of MDS codes, In IEEE Transactions on Information Theory, 31 (1985), 826–830.
[52] R. M. Roth and A. Lempel, On MDS codes via Cauchy matrices, In IEEE Transactions on Information Theory, 35 (1989), 1314–1319.
[53] M. Sajadieh, M. Dakhilalian, H. Mala and B. Omoomi, On construction of involutory MDS matrices from Vandermonde matrices in GF(2^q), Des. Codes Cryptogr., 64 (2012), 287–308.
[54] M. Sajadieh, M. Dakhilalian, H. Mala and P. Sepehrdad, Recursive diffusion layers for block ciphers and hash functions, FSE 2012, LNCS, 7549 (2012), 385–401.
[55] S. Sarkar and H. Syed, Lightweight diffusion layer: Importance of Toeplitz matrices, IACR Trans. Symmetric Cryptol., 2016 (2016), 95–113.
[56] S. Sarkar and H. Syed, Analysis of Toeplitz MDS matrices, ACISP 2017, LNCS, 10343 (2017), 3–18.
[57] B. Schneier, J. Kelsey, D. Whiting, D. Wagner, C. Hall and N. Ferguson, Twofish: A 128-bit block cipher, In the First AES Candidate Conference, National Institute of Standards and Technology, 1998.
[58] B. Schneier, J. Kelsey, D. Whiting, D. Wagner, C. Hall and N. Ferguson, The Twofish Encryption Algorithm, Wiley, 1999.
[59] C. Schnorr and S. Vaudenay, Black box cryptanalysis of hash networks based on multipermutations, Advances in Cryptology–EUROCRYPT '94 (Perugia), Proceedings, LNCS, Springer-Verlag, 950 (1995), 47–57.
[60] C. E. Shannon, Communication theory of secrecy systems, Bell Syst. Technical J., 28 (1949), 656–715.
[61] T. Shirai and K. Shibutani, On the diffusion matrix employed in the Whirlpool hashing function, NESSIE public report, 2003. Available at https://www.cosic.esat.kuleuven.be/nessie/reports/phase2/whirlpool-20030311.pdf.
[62] T. Shirai, K. Shibutani, T. Akishita, S. Moriai and T. Iwata, The 128-Bit blockcipher CLEFIA (Extended Abstract), International Workshop on Fast Software Encryption, Lecture Notes in Computer Science, 4593 (2007), 181–195, Springer, Berlin, Heidelberg.
[63] S. M. Sim, K. Khoo, F. Oggier and T. Peyrin, Lightweight MDS Involution Matrices, FSE 2015: Fast Software Encryption, Lecture Notes in Computer Science, 9054 (2015), 471–493, Springer, Berlin, Heidelberg.
[64] D. Toh, J. Teo, K. Khoo and S. Sim, Lightweight MDS Serial-Type Matrices with Minimal Fixed XOR Count, In AFRICACRYPT 2018, LNCS, 10831 (2018), 51–71.
[65] S. Vaudenay, On the need for multipermutations: Cryptanalysis of MD4 and SAFER, Fast Software Encryption, Proceedings, LNCS, Springer-Verlag, 1008 (1995), 286–297.
[66] D. Watanabe, S. Furuya, H. Yoshida, K. Takaragi and B. Preneel, A new keystream generator MUGI, FSE 2002: Fast Software Encryption, Springer, Berlin/Heidelberg, 2365 (2002), 179–194.
[67] S. Wu, M. Wang and W. Wu, Recursive diffusion layers for (lightweight) block ciphers and hash functions, SAC 2012: Selected Areas in Cryptography, LNCS, Springer-Verlag, Berlin, Heidelberg, 7707 (2013), 355–371.
[68] A. M. Youssef, S. E. Tavares and H. M. Heys, A New Class of Substitution Permutation Networks, Workshop on Selected Areas in Cryptography, SAC '96, Workshop Record, (1996), 132–147.


[69] A. M. Youssef, S. Mister and S. E. Tavares, On the Design of Linear Transformations for Substitution Permutation Encryption Networks, In Workshop on Selected Areas in Cryptography, SAC '97, (1997), 40–48.

Received for publication December 2018.
