Balanced Sparsest Generator Matrices for MDS Codes

Son Hoang Dau∗, Wentu Song†, Zheng Dong‡, Chau Yuen§
Singapore University of Technology and Design, Singapore
Emails: {∗sonhoang_dau, †wentu_song, ‡dong_zheng, §yuenchau}@sutd.edu.sg
arXiv:1301.5108v1 [cs.IT] 22 Jan 2013

Abstract: We show that given $n$ and $k$, for $q$ sufficiently large, there always exists an $[n, k]_q$ MDS code that has a generator matrix $G$ satisfying the following two conditions:
(C1) Sparsest: each row of $G$ has Hamming weight $n - k + 1$;
(C2) Balanced: Hamming weights of the columns of $G$ differ from each other by at most one.

I. INTRODUCTION

We study the existence, and provide a construction, of a sparsest and balanced generator matrix of Maximum Distance Separable (MDS) codes. A generator matrix is sparsest if it contains the least number of nonzero entries among all generator matrices of the same MDS code. A generator matrix is balanced if every column contains approximately the same number of nonzero entries. More specifically, we require that the numbers of nonzero entries in any two columns differ by at most one.

Apart from being of theoretical interest, our study of balanced sparsest generator matrices for MDS codes was motivated by its application in error correction for sensor networks. Suppose $n$ sensors, $S_1, \ldots, S_n$, collectively measure $k$ conditions $x_1, \ldots, x_k$, such as temperature, pressure, light intensity, etc. Let $x = (x_1, \ldots, x_k)$, where $x_i \in \mathbb{F}_q$ for each $i = 1, \ldots, k$ ($\mathbb{F}_q$ is a finite field of $q$ elements). These sensors transmit the information they collect to a base station, which is a data collector. Each sensor encodes the information it has before transmitting it to the base station, in the following way. Let $G$ be a $k \times n$ generator matrix of an $[n, k, d]_q$ error-correcting code. Sensor $S_i$ transmits the scalar product of $x$ and column $i$ of $G$ to the base station. It is well known in classical coding theory that this coding scheme allows the base station to retrieve $x$ when at most $\lfloor \frac{d-1}{2} \rfloor$ sensors transmit wrong information. Moreover, the base station can also identify the malfunctioning sensors.

For each sensor $S_i$, only those conditions corresponding to nonzero entries of column $i$ of $G$ are involved in the encoding. So it is sufficient for $S_i$ to measure only such conditions. Thus, if $G$ is sparse then, on average, each sensor only needs to measure a few of the $k$ conditions in order to achieve the desired error-correction capability. On top of that, if the columns of $G$ have approximately the same number of nonzero entries, then the sensors are required to measure approximately the same number of conditions. This balance guarantees an even distribution of workload among sensors, which is an important criterion for sensor networks where energy saving is a critical issue.

In fact, any error-correcting code can be used in the aforementioned scheme for sensor networks. We choose to study MDS codes first because their structure, especially their weight distribution, is well studied (see, for instance, [1, Ch. 11]). Moreover, they have optimal error-correcting capability, given the length and the dimension. We prove that over a sufficiently large field, there always exists an MDS code that has a balanced and sparsest generator matrix, which is ideally suited to the above encoding scheme for sensor networks.

Necessary notation and definitions are provided in Section II. We state and prove our main result in Section III.

II. PRELIMINARIES

We denote by $\mathbb{F}_q$ the finite field with $q$ elements. Let $[n]$ denote the set $\{1, 2, \ldots, n\}$. The support of a vector $u = (u_1, \ldots, u_n) \in \mathbb{F}_q^n$ is defined by $\mathrm{supp}(u) = \{i \in [n] : u_i \ne 0\}$. The (Hamming) weight of $u$ is $|\mathrm{supp}(u)|$. We can also define the weight and support of a row or a column of a matrix over some finite field, by regarding them as vectors over that field. Apart from Hamming weight, we also use other standard notions from coding theory such as minimum distance, linear $[n, k]_q$ and $[n, k, d]_q$ codes, MDS codes, and generator matrices (for instance, see [1]).

For a matrix $G = (g_{i,j}) \in \mathbb{F}_q^{k \times n}$, the support matrix of $G$, denoted $\mathrm{supp}(G)$, is a $k \times n$ binary matrix $M = (m_{i,j})$ where $m_{i,j} = 0$ if $g_{i,j} = 0$ and $m_{i,j} = 1$ if $g_{i,j} \ne 0$.

Let $M = (m_{i,j})$ be a $k \times n$ binary matrix. We denote by $\mathrm{var}(M) = (v_{i,j})$ the matrix obtained from $M$ by replacing every nonzero entry $m_{i,j} = 1$ by $\xi_{i,j}$, where the $\xi_{i,j}$'s are indeterminates. More formally, $v_{i,j} = 0$ if $m_{i,j} = 0$ and $v_{i,j} = \xi_{i,j}$ if $m_{i,j} = 1$. We also denote by $\mathrm{gr}(M)$ the bipartite graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ defined as follows. The vertex set $\mathcal{V}$ can be partitioned into two parts, namely, the left part $L = \{\ell_1, \ldots, \ell_k\}$ and the right part $R = \{r_1, \ldots, r_n\}$. The edge set is

$\mathcal{E} = \{(\ell_i, r_j) : i \in [k],\ j \in [n],\ m_{i,j} \ne 0\}$.

For any $k \times n$ matrix $N$, we define $f(N) = \prod_{P} \det(P)$, where the product is taken over all $\binom{n}{k}$ submatrices $P$ of order $k$ of $N$.

III. MAIN RESULT

A sparsest generator matrix of an $[n, k]_q$ MDS code would have precisely $n - k + 1$ nonzero entries in every row. Moreover, if it is balanced, then each column contains either $\lfloor \frac{k(n-k+1)}{n} \rfloor$ or $\lceil \frac{k(n-k+1)}{n} \rceil$ nonzero entries. Hereafter, we often use $R_i$, $i \in [k]$, and $C_j$, $j \in [n]$, to denote the supports of row $i$ and column $j$, respectively, of a $k \times n$ binary matrix $M$. Note that $R_i \subseteq [n]$ and $C_j \subseteq [k]$.

Lemma 1. Let $M = (m_{i,j})$ be a $k \times n$ binary matrix. Suppose that each row of $M$ has weight $n - k + 1$. Then $M$ is the support matrix of a generator matrix of some $[n, k]_q$ MDS code over a sufficiently large field $\mathbb{F}_q$ ($q > \binom{n-1}{k-1}$) if and only if $f(\mathrm{var}(M)) \not\equiv 0$.

Proof: Suppose $M = \mathrm{supp}(G)$, where $G = (g_{i,j})$ is a generator matrix of some $[n, k]_q$ MDS code. Due to a well-known property of MDS codes (see [1, p. 319]), every submatrix of order $k$ of $G$ has nonzero determinant. Therefore, $f(G) \ne 0$. Note that $f(\mathrm{var}(M))$ can be regarded as a multivariable polynomial in $\mathbb{F}_q[\ldots, \xi_{i,j}, \ldots]$. Moreover, since $M = \mathrm{supp}(G)$, we deduce that $f(G)$ can be obtained from $f(\mathrm{var}(M))$ by substituting $g_{i,j}$ for $\xi_{i,j}$ for all $i, j$ where $g_{i,j} \ne 0$. As $f(G) \ne 0$, we conclude that $f(\mathrm{var}(M)) \not\equiv 0$.

Now suppose that $f(\mathrm{var}(M)) \not\equiv 0$. Note that each column of $\mathrm{var}(M)$ belongs to precisely $\binom{n-1}{k-1}$ submatrices of order $k$ of $\mathrm{var}(M)$. Hence the exponent of each $\xi_{i,j}$ in $f(\mathrm{var}(M))$ is at most $\binom{n-1}{k-1}$. Since $f(\mathrm{var}(M)) \not\equiv 0$, by [2, Lemma 4], if $q > \binom{n-1}{k-1}$ then there exist $g_{i,j} \in \mathbb{F}_q$ (for $i, j$ where $m_{i,j} = 1$) so that $f(\mathrm{var}(M))(\ldots, g_{i,j}, \ldots) \ne 0$. Let $G = (g_{i,j})$ (for $i, j$ where $m_{i,j} = 0$ we set $g_{i,j} = 0$). Since $f(G) = f(\mathrm{var}(M))(\ldots, g_{i,j}, \ldots) \ne 0$, again by [1, p. 319], we deduce that $G$ is a generator matrix of an $[n, k]_q$ MDS code. Therefore, each row of $G$ has weight at least $n - k + 1$, due to the Singleton Bound (see [1, p. 33]). Since each row of $M$ also has weight $n - k + 1$, we deduce that $g_{i,j} \ne 0$ whenever $m_{i,j} = 1$. Therefore, $M = \mathrm{supp}(G)$.

Lemma 2. Let $M = (m_{i,j})$ be a $k \times n$ binary matrix. Then $f(\mathrm{var}(M)) \not\equiv 0$ if and only if every bipartite subgraph induced by the $k$ left-vertices and some $k$ right-vertices in $\mathrm{gr}(M)$ has a perfect matching.

[...]

$|\cup_{i \in I} R_i| \ge n - k + |I|$, for every subset $\emptyset \ne I \subseteq [k]$. (2)

Proof: Suppose that (1) holds and that there exists a nonempty set $I \subseteq [k]$ satisfying

$|\cup_{i \in I} R_i| \le n - k + |I| - 1$. (3)

We aim to obtain a contradiction. Writing $\overline{R_i} = [n] \setminus R_i$ and $\overline{C_j} = [k] \setminus C_j$ for the complements, the condition (3) is equivalent to

$|\cap_{i \in I} \overline{R_i}| \ge k - |I| + 1$. (4)

Hence there exists a set $J$ of $k - |I| + 1$ columns of $M$ that satisfies $|\cap_{j \in J} \overline{C_j}| \ge |I|$. Equivalently, we have

$|\cup_{j \in J} C_j| \le k - |I| < k - |I| + 1 = |J|$. (5)

We obtain a contradiction between (1) and (5). The "only if" direction can be proved in a similar manner.

Lemma 5. Let $M = (m_{i,j})$ be a $k \times n$ binary matrix. Suppose that each row of $M$ has weight $n - k + 1$. Then $M$ is the support matrix of a generator matrix of some $[n, k]_q$ MDS code over a sufficiently large field $\mathbb{F}_q$ ($q > \binom{n-1}{k-1}$) if and only if (2) holds.

Proof: The proof follows from Lemmas 1-4.

We present below our main result.

Theorem 6 (Main Theorem). Suppose $1 \le k \le n$ and $q > \binom{n-1}{k-1}$. Then there always exists an $[n, k]_q$ MDS code that has a generator matrix $G$ satisfying the following two conditions.
(C1) Sparsest: each row of $G$ has weight $n - k + 1$.
(C2) Balanced: column weights of $G$ differ from each other by at most one.

By Lemma 5, to prove Theorem 6, we need to show that there always exists a $k \times n$ binary matrix $M$ satisfying the following properties:
(P1) each row of $M$ has weight $n - k + 1$,
(P2) column weights of $M$ differ from each other by at most one, [...]
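For small parameters, the statements above can be checked numerically. The following Python sketch is only an illustration under stated assumptions, not the construction used in the paper: the helper names (`cyclic_support`, `satisfies_condition_2`, `is_mds_generator`, `random_generator`) are hypothetical, the cyclic placement of ones is just one convenient candidate that satisfies (P1) and (P2) by design, and condition (2) is verified by brute force for this one instance rather than proved in general. The field is assumed to be a prime field $\mathbb{F}_p$ with $p > \binom{n-1}{k-1}$, and the nonzero entries are found by random trial, in the spirit of the substitution argument in the proof of Lemma 1.

```python
import random
from itertools import combinations
from math import comb


def cyclic_support(n, k):
    """Candidate k x n binary matrix: row i gets n-k+1 consecutive ones, wrapping mod n."""
    w = n - k + 1
    M = [[0] * n for _ in range(k)]
    for i in range(k):
        for t in range(w):
            M[i][(i * w + t) % n] = 1
    return M


def satisfies_condition_2(M):
    """Brute-force check of (2): |union of R_i over I| >= n - k + |I| for all nonempty I."""
    k, n = len(M), len(M[0])
    rows = [{j for j in range(n) if M[i][j]} for i in range(k)]
    for size in range(1, k + 1):
        for I in combinations(range(k), size):
            if len(set().union(*(rows[i] for i in I))) < n - k + size:
                return False
    return True


def det(A):
    """Integer determinant by cofactor expansion along the first row (small k only)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j + 1:] for r in A[1:]])
               for j in range(len(A)) if A[0][j])


def is_mds_generator(G, p):
    """True if every k x k submatrix of G is nonsingular mod p, i.e. f(G) != 0 in F_p."""
    k, n = len(G), len(G[0])
    return all(det([[row[j] for j in cols] for row in G]) % p != 0
               for cols in combinations(range(n), k))


def random_generator(M, p, rng, max_trials=10000):
    """Try random nonzero values on the support of M until all k x k minors are nonzero."""
    for _ in range(max_trials):
        G = [[rng.randrange(1, p) if m else 0 for m in row] for row in M]
        if is_mds_generator(G, p):
            return G
    return None  # no suitable assignment found within max_trials


# Small instance: n = 5, k = 3, so each row has weight 3 and q must exceed C(4, 2) = 6.
n, k, p = 5, 3, 7
assert p > comb(n - 1, k - 1)
M = cyclic_support(n, k)
column_weights = [sum(col) for col in zip(*M)]
G = random_generator(M, p, random.Random(0))
print("support matrix:", M)
print("column weights:", column_weights)
print("generator matrix over F_7:", G)
```

The exhaustive loops over subsets and over $k \times k$ submatrices grow exponentially, so this is meant for sanity checks on toy parameters only.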
