
This document is downloaded from DR‑NTU (https://dr.ntu.edu.sg) Nanyang Technological University, Singapore.

On the variance of average distance of subsets in the Hamming space

Fu, Fang‑Wei; Ling, San; Xing, Chaoping

2004

Fu, F. W., Ling, S., & Xing, C. (2004). On the variance of average distance of subsets in the Hamming space. Discrete Applied Mathematics, 145(3), 465‑478. https://hdl.handle.net/10356/96425 https://doi.org/10.1016/j.dam.2004.08.004

© 2004 Elsevier B.V. This is the author created version of a work that has been peer reviewed and accepted for publication by Discrete Applied Mathematics, Elsevier B.V. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [http://dx.doi.org/10.1016/j.dam.2004.08.004].

Discrete Applied Mathematics 145 (2005) 465–478
www.elsevier.com/locate/dam

Fang-Wei Fu^{a,1}, San Ling^{b}, Chaoping Xing^{b}

^a Temasek Laboratories, National University of Singapore, 5 Sports Drive 2, Singapore 117508, Singapore
^b Department of Mathematics, National University of Singapore, 2 Science Drive 2, Singapore 117543, Singapore

Received 13 September 2002; received in revised form 23 August 2004; accepted 31 August 2004

Abstract

Let $V$ be a finite set with $q$ distinct elements. For a subset $C$ of $V^n$, denote by $\operatorname{var}(C)$ the variance of the average Hamming distance of $C$. Let $T(n, M; q)$ and $R(n, M; q)$ denote the minimum and maximum variance of the average Hamming distance of subsets of $V^n$ with cardinality $M$, respectively. In this paper, we study $T(n, M; q)$ and $R(n, M; q)$ for general $q$. Using methods from coding theory, we derive upper and lower bounds on $\operatorname{var}(C)$, which generalize and unify the bounds for the case $q = 2$. These bounds enable us to determine the exact value of $T(n, M; q)$ and $R(n, M; q)$ in several cases. © 2004 Elsevier B.V. All rights reserved.

Keywords: Hamming space; Subsets; Average distance; Variance; ; Distance distribution

1. Introduction

Let $V = \{v_1, v_2, \ldots, v_q\}$ be a finite set with $q$ distinct elements, where $q$ is a positive integer. Let $V^n$ be the set of ordered $n$-tuples over $V$. The Hamming distance between two vectors $a$ and $b$ is the number of components where they differ, and is denoted by $d_H(a, b)$. Let $C$ be a subset of $V^n$ with size $|C| = M$. The average Hamming distance of $C$ is defined by

$$\bar{d}(C) = \frac{1}{M^2} \sum_{a \in C} \sum_{b \in C} d_H(a, b). \tag{1.1}$$

The variance of the average distance of $C$ is defined by

$$\operatorname{var}(C) = \frac{1}{M^2} \sum_{a \in C} \sum_{b \in C} [d_H(a, b) - \bar{d}(C)]^2. \tag{1.2}$$

It is easy to check that

$$\operatorname{var}(C) = \frac{1}{M^2} \sum_{a \in C} \sum_{b \in C} [d_H(a, b)]^2 - [\bar{d}(C)]^2. \tag{1.3}$$
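The definitions (1.1)–(1.3) can be checked numerically. The following is a minimal sketch, assuming vectors are represented as Python tuples over $\{0, \ldots, q-1\}$; the function names and the sample subset are illustrative choices, not part of the paper.

```python
from fractions import Fraction

def hamming(a, b):
    """Number of coordinates in which the tuples a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def avg_distance(C):
    """Average Hamming distance (1.1): sum over ordered pairs, divided by M^2."""
    M = len(C)
    return Fraction(sum(hamming(a, b) for a in C for b in C), M * M)

def variance_def(C):
    """Variance in the form (1.2): mean squared deviation from the average."""
    M, mean = len(C), avg_distance(C)
    return sum((hamming(a, b) - mean) ** 2 for a in C for b in C) / (M * M)

def variance_moment(C):
    """Variance in the form (1.3): second moment minus squared mean."""
    M = len(C)
    second = Fraction(sum(hamming(a, b) ** 2 for a in C for b in C), M * M)
    return second - avg_distance(C) ** 2

# Illustrative subset of Z_3^3 (an arbitrary choice, not taken from the paper).
C = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (2, 2, 0)]
print(avg_distance(C), variance_def(C), variance_moment(C))  # 7/4 19/16 19/16
```

Exact rational arithmetic is used so that the agreement between (1.2) and (1.3) is an equality rather than a floating-point approximation.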

^1 On leave from the Department of Mathematics, Nankai University, Tianjin 300071, P. R. China.

0166-218X/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.dam.2004.08.004 466 F.-W. Fu et al. / Discrete Applied Mathematics 145 (2005) 465–478

The minimum and maximum average Hamming distance of a subset of $V^n$ with size $M$ are defined by

$$\beta(n, M; q) = \min\{\bar{d}(C) \mid C \text{ is a subset of } V^n \text{ with size } |C| = M\},$$

$$\gamma(n, M; q) = \max\{\bar{d}(C) \mid C \text{ is a subset of } V^n \text{ with size } |C| = M\}.$$

The minimum and maximum variance of the average distance of a subset of $V^n$ with size $M$ are defined by

$$T(n, M; q) = \min\{\operatorname{var}(C) \mid C \text{ is a subset of } V^n \text{ with size } |C| = M\},$$

$$R(n, M; q) = \max\{\operatorname{var}(C) \mid C \text{ is a subset of } V^n \text{ with size } |C| = M\}.$$
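For very small parameters, $T(n, M; q)$ and $R(n, M; q)$ can be found by exhaustive search. The sketch below assumes $V = \{0, \ldots, q-1\}$ and simply enumerates all size-$M$ subsets; the helper name `extremal_variances` is hypothetical and the approach is only feasible for tiny $n$ and $q$.

```python
from fractions import Fraction
from itertools import combinations, product

def variance(C):
    """var(C) computed via (1.3) with exact rational arithmetic."""
    M = len(C)
    dists = [sum(x != y for x, y in zip(a, b)) for a in C for b in C]
    mean = Fraction(sum(dists), M * M)
    return Fraction(sum(d * d for d in dists), M * M) - mean ** 2

def extremal_variances(n, M, q):
    """Return (T, R): the minimum and maximum of var(C) over all size-M subsets."""
    space = list(product(range(q), repeat=n))
    values = [variance(C) for C in combinations(space, M)]
    return min(values), max(values)

# Two binary vectors of length 2 are at distance 1 or 2, giving var = 1/4 or 1.
print(extremal_variances(2, 2, 2))  # (Fraction(1, 4), Fraction(1, 1))
```

The number of subsets grows as $\binom{q^n}{M}$, which is why closed-form bounds are needed beyond toy cases.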

Ahlswede and Katona [2] first posed the problem of determining $\beta(n, M; q)$ in the extremal combinatorics of Hamming space. A number of papers (see [1–4,8–13,15,16]) have dealt with this topic since, and some exact values of $\beta(n, M; q)$ have been determined. It is still an open problem to determine $\beta(n, M; q)$ for general $n$, $q$ and $1 \le M \le q^n$. Ahlswede and Althöfer [1] observed that this problem also occurs in the construction of good codes for write-efficient memories, introduced by Ahlswede and Zhang [3] as a model for storing and updating information on a rewritable medium with cost constraints. Kündgen [12] observed that this problem is equivalent to a covering problem in graph theory. Ahlswede and Katona [2] first mentioned the problem of determining $\gamma(n, M; q)$ for $q = 2$ and gave a simple solution. Fu and Xing [11] gave a complete solution to the problem of determining $\gamma(n, M; q)$ for general $q$.

Since the variance is an important numerical characteristic of the average distance, Fu and Shen [9] first posed the problem of determining $T(n, M; 2)$. For a subset $C$ of $\{0, 1\}^n$, Fu and Shen [9] presented a lower bound and an upper bound on $\operatorname{var}(C)$. Moreover, they determined the exact value of $T(n, 2^{n-1}; 2)$. If the size $|C|$ is odd, Xia and Fu [15] improved the lower and upper bounds of Fu and Shen on $\operatorname{var}(C)$. Furthermore, they determined the exact values of $T(n, 2^n - 1; 2)$ and $T(n, 2^{n-1} \pm 1; 2)$.

In this paper, we study $T(n, M; q)$ and $R(n, M; q)$ for general $q$. Using methods from coding theory, we derive upper and lower bounds on $\operatorname{var}(C)$, which generalize and unify the bounds for the case $q = 2$. These bounds enable us to determine the exact value of $T(n, M; q)$ and $R(n, M; q)$ in several cases.

Without loss of generality, below we assume that $V = \mathbb{Z}_q = \{0, 1, \ldots, q - 1\}$, the abelian group under addition modulo $q$, since we only deal with the Hamming distance in the Hamming space $V^n$. Furthermore, if $q$ is a prime power, we can assume that $V = \mathbb{F}_q$, the finite field of $q$ elements. The Hamming weight $w_H(a)$ of a vector $a$ in $\mathbb{Z}_q^n$ or $\mathbb{F}_q^n$ is the number of nonzero coordinates in $a$. Obviously, for $a, b \in \mathbb{Z}_q^n$ or $\mathbb{F}_q^n$,

$$d_H(a, b) = w_H(a - b).$$

If $q$ is a prime power, denote

$$\Omega = \{(c_1, c_2, c_3, \ldots, c_n) \mid c_i \in \mathbb{F}_q \text{ and } c_1 + c_2 = 0\}, \quad \Omega^+ = \Omega \cup \{(0, 1, 0, \ldots, 0)\}, \quad \Omega^- = \Omega \setminus \{(0, 0, 0, \ldots, 0)\}.$$

If $q$ is a positive integer and $q \ge 2$, denote

$$\Lambda = \mathbb{Z}_q^{n-1} \times \{0\}, \quad \Lambda^+ = \Lambda \cup \{(0, \ldots, 0, 1)\}, \quad \Lambda^- = \Lambda \setminus \{(0, \ldots, 0, 0)\}.$$

Our main results in this paper are given as follows.

Theorem 1. For $2 \le q \le 4$, we have

$$R(n, q^{n-1}; q) = \operatorname{var}(\Omega), \tag{1.4}$$

$$R(n, q^{n-1} + 1; q) = \operatorname{var}(\Omega^+), \tag{1.5}$$

$$R(n, q^{n-1} - 1; q) = \operatorname{var}(\Omega^-). \tag{1.6}$$

Theorem 2. If $q \ge 2$, we have

$$T(n, q^{n-1}; q) = \operatorname{var}(\Lambda), \tag{1.7}$$

$$T(n, q^{n-1} - 1; q) = \operatorname{var}(\Lambda^-). \tag{1.8}$$

If $n \ge 3$ or $2 \le q \le 4$, we have

$$T(n, q^{n-1} + 1; q) = \operatorname{var}(\Lambda^+). \tag{1.9}$$

The exact values of $\operatorname{var}(\Omega)$, $\operatorname{var}(\Omega^+)$, $\operatorname{var}(\Omega^-)$, $\operatorname{var}(\Lambda)$, $\operatorname{var}(\Lambda^+)$ and $\operatorname{var}(\Lambda^-)$ will be computed in Section 3. It seems to be difficult to determine $T(n, M; q)$ and $R(n, M; q)$ in general. In particular, it is interesting to know whether Theorem 1 is still true when $q$ is a prime power and $q \ge 5$.

This paper is organized as follows. In Section 2, in order to establish our results, we review some basic properties of distance distributions of codes. In Section 3, we compute $\operatorname{var}(C)$ for some subsets. In Section 4, we derive an upper bound on $\operatorname{var}(C)$ for $2 \le q \le 4$. Theorem 1 is proved by showing that this upper bound is tight in some cases. In Section 5, we derive a lower bound on $\operatorname{var}(C)$ for general $q$. Theorem 2 is proved by showing that this lower bound is tight in some cases.

2. Preliminaries

In this section, we review some basic properties of distance distributions of codes.

For a subset $C$ of $V^n$ with size $|C| = M$, we call $C$ an $(n, M; q)$ code in coding theory. The distance distribution of $C$ is defined by

$$A_i = \frac{1}{M} |\{(a, b) \mid a, b \in C,\ d_H(a, b) = i\}|, \quad i = 0, 1, \ldots, n. \tag{2.1}$$

The dual distance distribution of $C$ is defined by

$$B_k = \frac{1}{M} \sum_{j=0}^{n} K_k(j; q) A_j, \quad k = 0, 1, \ldots, n, \tag{2.2}$$

where $K_k(j; q)$ are the $q$-ary Krawtchouk numbers defined by

$$K_k(j; q) = \sum_{i=0}^{k} (-1)^i (q - 1)^{k-i} \binom{j}{i} \binom{n - j}{k - i}. \tag{2.3}$$

The distance enumerator of $C$ is defined as

$$W_C(x) = \sum_{i=0}^{n} A_i x^i$$

and the dual distance enumerator of $C$ is defined as

$$\hat{W}_C(x) = \sum_{i=0}^{n} B_i x^i.$$

The MacWilliams–Delsarte identity (see [14]) gives the relationship between $W_C(x)$ and $\hat{W}_C(x)$:

$$\hat{W}_C(x) = \frac{1}{M} [1 + (q - 1)x]^n \, W_C\!\left(\frac{1 - x}{1 + (q - 1)x}\right), \tag{2.4}$$

$$W_C(x) = \frac{M}{q^n} [1 + (q - 1)x]^n \, \hat{W}_C\!\left(\frac{1 - x}{1 + (q - 1)x}\right). \tag{2.5}$$

It is easy to see that

$$W_C(0) = A_0 = 1, \quad W_C(1) = \sum_{i=0}^{n} A_i = M. \tag{2.6}$$

By (1.1) and (1.3), the average Hamming distance of $C$ is given by

$$\bar{d}(C) = \frac{1}{M} \sum_{i=1}^{n} i A_i \tag{2.7}$$

and the variance of $\bar{d}(C)$ is given by

$$\operatorname{var}(C) = \frac{1}{M} \sum_{i=1}^{n} i^2 A_i - [\bar{d}(C)]^2. \tag{2.8}$$

Delsarte (see [6,7]) showed that

$$B_k \ge 0, \quad k = 0, 1, \ldots, n. \tag{2.9}$$

Setting $x = 0$ and $x = 1$ in (2.4), respectively, we obtain by (2.6) the following lemma.

Lemma 1.

$$B_0 = 1, \quad \sum_{i=0}^{n} B_i = \frac{q^n}{M}. \tag{2.10}$$

Lemma 2.

$$\bar{d}(C) = \frac{n(q - 1)}{q} - \frac{B_1}{q}, \tag{2.11}$$

$$\operatorname{var}(C) = \frac{n(q - 1)}{q^2} + \frac{q - 2}{q^2} B_1 - \frac{1}{q^2} (B_1)^2 + \frac{2}{q^2} B_2. \tag{2.12}$$

Proof. Eq. (2.11) is obtained by differentiating (2.5), putting $x = 1$ and combining with (2.7). Eq. (2.12) is obtained by differentiating (2.5) twice, putting $x = 1$ and using (2.8) and (2.11). □
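Lemmas 1 and 2 can be sanity-checked on a concrete code by computing the distance distribution (2.1) and its Krawtchouk transform (2.2) directly. This is a minimal sketch under the assumption that codewords are tuples over $\{0, \ldots, q-1\}$; the sample code and all function names are illustrative.

```python
from fractions import Fraction
from math import comb

def krawtchouk(k, j, n, q):
    """q-ary Krawtchouk number K_k(j; q) as in (2.3)."""
    return sum((-1) ** i * (q - 1) ** (k - i) * comb(j, i) * comb(n - j, k - i)
               for i in range(k + 1))

def distance_distribution(C, n):
    """A_0, ..., A_n as in (2.1)."""
    M = len(C)
    A = [Fraction(0)] * (n + 1)
    for a in C:
        for b in C:
            A[sum(x != y for x, y in zip(a, b))] += Fraction(1, M)
    return A

def dual_distribution(A, n, q):
    """B_0, ..., B_n as in (2.2); by (2.6) the A_i sum to M."""
    M = sum(A)
    return [sum(krawtchouk(k, j, n, q) * A[j] for j in range(n + 1)) / M
            for k in range(n + 1)]

n, q = 3, 3
C = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (2, 2, 0)]  # arbitrary example code
M = len(C)
A = distance_distribution(C, n)
B = dual_distribution(A, n, q)
mean = sum(i * A[i] for i in range(n + 1)) / M                  # (2.7)
var = sum(i * i * A[i] for i in range(n + 1)) / M - mean ** 2   # (2.8)
lemma1 = B[0] == 1 and sum(B) == Fraction(q ** n, M)            # (2.10)
lemma2_mean = Fraction(n * (q - 1), q) - B[1] / q               # (2.11)
lemma2_var = (Fraction(n * (q - 1), q * q) + Fraction(q - 2, q * q) * B[1]
              - B[1] ** 2 / (q * q) + 2 * B[2] / (q * q))       # (2.12)
print(lemma1, mean == lemma2_mean, var == lemma2_var)  # True True True
```

For this example $A = (1, 0, 2, 1)$ and $B = (1, 3/4, 9/4, 11/4)$, so both sides of (2.11) equal $7/4$ and both sides of (2.12) equal $19/16$.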

Ashikhmin and Simonis [5] showed that

Lemma 3. Let $C$ be an $(n, M; q)$ code with dual distance distribution $B_0, B_1, \ldots, B_n$. If $M \equiv \delta \pmod{q}$, then

$$B_k \ge \frac{1}{M^2}\, \delta (q - \delta)(q - 1)^{k-1} \binom{n}{k}, \quad k = 1, 2, \ldots, n. \tag{2.13}$$
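The inequality of Lemma 3 can be checked exhaustively for tiny parameters. The sketch below assumes the residue of $M$ modulo $q$ is taken in $\{0, \ldots, q-1\}$ and tests every size-$M$ subset of $\{0, \ldots, q-1\}^n$; the function names are hypothetical and the search is only practical for very small $n$, $q$, $M$.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def dual_distribution(C, n, q):
    """Dual distance distribution B_0..B_n of the code C, from (2.1)-(2.3)."""
    M = len(C)
    pairs = [0] * (n + 1)            # pairs[i] = M * A_i (ordered pairs at distance i)
    for a in C:
        for b in C:
            pairs[sum(x != y for x, y in zip(a, b))] += 1

    def K(k, j):
        return sum((-1) ** i * (q - 1) ** (k - i) * comb(j, i) * comb(n - j, k - i)
                   for i in range(k + 1))

    return [Fraction(sum(K(k, j) * pairs[j] for j in range(n + 1)), M * M)
            for k in range(n + 1)]

def lemma3_holds(C, n, q):
    """Check the lower bound on B_k for k = 1..n, with residue = M mod q."""
    M = len(C)
    residue = M % q
    B = dual_distribution(C, n, q)
    return all(
        B[k] >= Fraction(residue * (q - residue) * (q - 1) ** (k - 1) * comb(n, k), M * M)
        for k in range(1, n + 1))

def check_all_subsets(n, M, q):
    space = list(product(range(q), repeat=n))
    return all(lemma3_holds(C, n, q) for C in combinations(space, M))

print(check_all_subsets(3, 3, 2), check_all_subsets(2, 4, 3))
```

When $M$ is a multiple of $q$ the right-hand side vanishes and the statement reduces to Delsarte's inequality (2.9).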

If $q$ is a prime power and $V = \mathbb{F}_q$, the finite field with $q$ elements, linear codes over $\mathbb{F}_q$ can be introduced. A $q$-ary code $C$ is called a $q$-ary $[n, k]$ linear code if $C$ is a $k$-dimensional subspace of $\mathbb{F}_q^n$. For two vectors $a = (a_1, a_2, \ldots, a_n) \in \mathbb{F}_q^n$ and $b = (b_1, b_2, \ldots, b_n) \in \mathbb{F}_q^n$, the scalar product of $a$ and $b$ is defined as

$$a \cdot b = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n.$$

Let $C$ be a $q$-ary $[n, k]$ linear code. The set

$$C^\perp = \{x \in \mathbb{F}_q^n : x \cdot c = 0 \text{ for all } c \in C\}$$

is called the dual code of $C$. Let $A_i$ be the number of codewords in $C$ of Hamming weight $i$. The sequence of numbers $A_0, A_1, \ldots, A_n$ is called the weight distribution of $C$. It is well known in coding theory that, for a linear code $C$, the distance distribution of $C$ is equal to the weight distribution of $C$, and the dual distance distribution of $C$ is equal to the weight distribution of the dual code $C^\perp$. Using these facts and Lemma 2, it is sometimes more convenient to compute $\bar{d}(C)$ and $\operatorname{var}(C)$ when $C$ is a linear code.

3. Computation of var(C) for some subsets

In this section, we compute $\operatorname{var}(C)$ for some subsets. In general, we cannot find a formula to compute $\operatorname{var}(C \cup \{v\})$ from $\operatorname{var}(C)$ or vice versa, but in some cases we can use (1.1), (1.3), and the following proposition to compute $\operatorname{var}(C \cup \{v\})$ from $\operatorname{var}(C)$.

Proposition 1. Let $C$ be a nonempty subset of $\mathbb{Z}_q^n$ and $v \in \mathbb{Z}_q^n \setminus C$. Then

$$\sum_{a \in C \cup \{v\}} \sum_{b \in C \cup \{v\}} d_H(a, b) = \sum_{a \in C} \sum_{b \in C} d_H(a, b) + 2 \sum_{a \in C} w_H(a - v), \tag{3.1}$$

$$\sum_{a \in C \cup \{v\}} \sum_{b \in C \cup \{v\}} d_H^2(a, b) = \sum_{a \in C} \sum_{b \in C} d_H^2(a, b) + 2 \sum_{a \in C} w_H^2(a - v). \tag{3.2}$$

In this section, we will use the following results:

$$\sum_{b \in \mathbb{Z}_q^m} w_H(b) = \sum_{j=0}^{m} j \binom{m}{j} (q - 1)^j = m(q - 1) q^{m-1}, \tag{3.3}$$

$$\sum_{b \in \mathbb{Z}_q^m} w_H^2(b) = \sum_{j=0}^{m} j^2 \binom{m}{j} (q - 1)^j = m(q - 1)[1 + m(q - 1)] q^{m-2}. \tag{3.4}$$
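Both the update rule of Proposition 1 and the weight sums (3.3) and (3.4) are easy to confirm by enumeration. The following sketch assumes arithmetic modulo $q$ on tuples; the helper names and the sample subset are illustrative only.

```python
from itertools import product

def weight(a):
    """Hamming weight: number of nonzero coordinates."""
    return sum(x != 0 for x in a)

def weight_sums(m, q):
    """Brute-force sums of w_H(b) and w_H(b)^2 over all b in Z_q^m."""
    vecs = list(product(range(q), repeat=m))
    return sum(weight(b) for b in vecs), sum(weight(b) ** 2 for b in vecs)

def closed_forms(m, q):
    """Right-hand sides of (3.3) and (3.4); integer arithmetic assumes m >= 2."""
    return (m * (q - 1) * q ** (m - 1),
            m * (q - 1) * (1 + m * (q - 1)) * q ** (m - 2))

def pair_sum(C, power):
    """Sum of d_H(a, b)^power over all ordered pairs (a, b) in C x C."""
    return sum(sum(x != y for x, y in zip(a, b)) ** power for a in C for b in C)

def prop1_holds(C, v, q, power):
    """Check (3.1) for power=1 and (3.2) for power=2, for v not in C."""
    extra = sum(weight(tuple((x - y) % q for x, y in zip(a, v))) ** power for a in C)
    return pair_sum(C + [v], power) == pair_sum(C, power) + 2 * extra

C = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (2, 2, 0)]   # arbitrary subset of Z_3^3
print(weight_sums(3, 3), closed_forms(3, 3))        # (54, 126) (54, 126)
print(prop1_holds(C, (1, 1, 1), 3, 1), prop1_holds(C, (1, 1, 1), 3, 2))  # True True
```

The factor 2 in (3.1) and (3.2) reflects that each new pair $(a, v)$ is counted in both orders, while the pair $(v, v)$ contributes zero.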

Firstly, we obtain the following results by using Proposition 1.

Proposition 2. For any $v \in \mathbb{Z}_q^n$, denote $C = \mathbb{Z}_q^n \setminus \{v\}$. Then

$$\bar{d}(C) = \frac{(q - 1)n}{q} - \frac{(q - 1)n}{q(q^n - 1)^2}, \tag{3.5}$$

$$\operatorname{var}(C) = \frac{(q - 1)n}{q^2} + \frac{(q - 1)n[(q - 1)n - 1]}{q^2 (q^n - 1)^2} - \frac{(q - 1)^2 n^2}{q^2 (q^n - 1)^4}. \tag{3.6}$$

In particular, we obtain

$$T(n, q^n - 1; q) = R(n, q^n - 1; q) = \frac{(q - 1)n}{q^2} + \frac{(q - 1)n[(q - 1)n - 1]}{q^2 (q^n - 1)^2} - \frac{(q - 1)^2 n^2}{q^2 (q^n - 1)^4}.$$
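As a quick numerical check of (3.5) and (3.6), one can delete a single vector from the full space and compare the brute-force mean and variance with the closed forms. This is a sketch with hypothetical helper names, using exact fractions.

```python
from fractions import Fraction
from itertools import product

def mean_and_variance(C):
    """(average distance, variance) of C via (1.1) and (1.3)."""
    M = len(C)
    dists = [sum(x != y for x, y in zip(a, b)) for a in C for b in C]
    mean = Fraction(sum(dists), M * M)
    return mean, Fraction(sum(d * d for d in dists), M * M) - mean ** 2

def prop2_formulas(n, q):
    """Right-hand sides of (3.5) and (3.6)."""
    N = q ** n - 1
    u = (q - 1) * n
    mean = Fraction(u, q) - Fraction(u, q * N ** 2)
    var = (Fraction(u, q * q) + Fraction(u * (u - 1), q * q * N ** 2)
           - Fraction(u * u, q * q * N ** 4))
    return mean, var

def punctured_space(n, q, v):
    """The full space Z_q^n with the single vector v removed."""
    return [x for x in product(range(q), repeat=n) if x != v]

print(mean_and_variance(punctured_space(2, 2, (0, 0))))  # (Fraction(8, 9), Fraction(44, 81))
print(prop2_formulas(2, 2))                              # (Fraction(8, 9), Fraction(44, 81))
```

Because every subset of size $q^n - 1$ has this form and the result does not depend on $v$, the minimum and maximum variance coincide for $M = q^n - 1$.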

Proof. By (3.3) and (3.4), we have

$$\sum_{a \in \mathbb{Z}_q^n} \sum_{b \in \mathbb{Z}_q^n} d_H(a, b) = \sum_{a \in \mathbb{Z}_q^n} \sum_{b \in \mathbb{Z}_q^n} w_H(b - a) = q^n \sum_{x \in \mathbb{Z}_q^n} w_H(x) = n(q - 1) q^{2n-1}, \tag{3.7}$$

$$\sum_{a \in \mathbb{Z}_q^n} \sum_{b \in \mathbb{Z}_q^n} d_H^2(a, b) = q^n \sum_{x \in \mathbb{Z}_q^n} w_H^2(x) = n(q - 1)[1 + n(q - 1)] q^{2(n-1)}. \tag{3.8}$$

Note that $C = \mathbb{Z}_q^n \setminus \{v\}$. By (3.3) and (3.4), we have

$$\sum_{a \in C} w_H(a - v) = \sum_{b \in \mathbb{Z}_q^n \setminus \{0\}} w_H(b) = \sum_{b \in \mathbb{Z}_q^n} w_H(b) = n(q - 1) q^{n-1}, \tag{3.9}$$

$$\sum_{a \in C} w_H^2(a - v) = \sum_{b \in \mathbb{Z}_q^n \setminus \{0\}} w_H^2(b) = \sum_{b \in \mathbb{Z}_q^n} w_H^2(b) = n(q - 1)[1 + n(q - 1)] q^{n-2}. \tag{3.10}$$

Hence, by Proposition 1 and (3.7)–(3.10), we have

$$\sum_{a \in C} \sum_{b \in C} d_H(a, b) = \frac{n(q - 1)}{q} \left[(q^n - 1)^2 - 1\right], \tag{3.11}$$

$$\sum_{a \in C} \sum_{b \in C} d_H^2(a, b) = n(q - 1)[1 + n(q - 1)] q^{n-2} (q^n - 2). \tag{3.12}$$

Since $|C| = q^n - 1$, we obtain (3.5) from (1.1) and (3.11). It is easy to see from (3.5) that $\bar{d}(C)$ can also be written as

$$\bar{d}(C) = \frac{(q - 1) n q^n (q^n - 2)}{q (q^n - 1)^2}. \tag{3.13}$$

Hence, by (1.3), (3.12) and (3.13),

$$\begin{aligned}
\operatorname{var}(C) &= \frac{1}{(q^n - 1)^2} \sum_{a \in C} \sum_{b \in C} [d_H(a, b)]^2 - [\bar{d}(C)]^2 \\
&= \frac{n(q - 1)[1 + n(q - 1)] q^{n-2} (q^n - 2)}{(q^n - 1)^2} - \frac{(q - 1)^2 n^2 q^{2n} (q^n - 2)^2}{q^2 (q^n - 1)^4} \\
&= \frac{(q - 1)n[(q^n - 1)^2 - 1]}{q^2 (q^n - 1)^2} \left[1 + \frac{(q - 1)n}{(q^n - 1)^2}\right] \\
&= \frac{(q - 1)n}{q^2} + \frac{(q - 1)n[(q - 1)n - 1]}{q^2 (q^n - 1)^2} - \frac{(q - 1)^2 n^2}{q^2 (q^n - 1)^4}.
\end{aligned}$$

This completes the proof. □

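Secondly, we compute $\operatorname{var}(C)$ for $C = \Omega, \Omega^+, \Omega^-, \Lambda, \Lambda^+, \Lambda^-$, which are defined in Section 1.

Proposition 3. Let $\Omega$, $\Omega^+$, $\Omega^-$, $\Lambda$, $\Lambda^+$ and $\Lambda^-$ be the sets defined in Section 1. Then

$$\operatorname{var}(\Omega) = \frac{(q - 1)(n + 2)}{q^2}, \tag{3.14}$$

$$\operatorname{var}(\Omega^+) = \frac{(q - 1)(n + 2)}{q^2} - \frac{4q^n + (q - 1)(n + 2) - (q - 1)^2 n^2}{q^2 (q^{n-1} + 1)^2} - \frac{(q - 1)^2 n^2}{q^2 (q^{n-1} + 1)^4}, \tag{3.15}$$

$$\operatorname{var}(\Omega^-) = \frac{(q - 1)(n + 2)}{q^2} + \frac{(q - 1)^2 n^2 - (q - 1)(n + 2)}{q^2 (q^{n-1} - 1)^2} - \frac{(q - 1)^2 n^2}{q^2 (q^{n-1} - 1)^4}, \tag{3.16}$$

$$\operatorname{var}(\Lambda) = \frac{(q - 1)(n - 1)}{q^2}, \tag{3.17}$$

$$\operatorname{var}(\Lambda^+) = \frac{(q - 1)(n - 1)}{q^2} + \frac{2q^{n+1} + (q - 1)(n - 1)[(q - 1)(n - 1) - 1]}{q^2 (q^{n-1} + 1)^2} - \frac{[2q^n - (q - 1)(n - 1)]^2}{q^2 (q^{n-1} + 1)^4}, \tag{3.18}$$

$$\operatorname{var}(\Lambda^-) = \frac{(q - 1)(n - 1)}{q^2} + \frac{(q - 1)(n - 1)[(q - 1)(n - 1) - 1]}{q^2 (q^{n-1} - 1)^2} - \frac{(q - 1)^2 (n - 1)^2}{q^2 (q^{n-1} - 1)^4}. \tag{3.19}$$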
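The six closed forms (3.14)–(3.19) can be compared against brute-force variances for small parameters. The sketch below makes one assumption worth flagging: the set defined by $c_1 + c_2 = 0$ is built with arithmetic modulo $q$, which coincides with the finite-field definition only when $q$ is prime. The dictionary keys and function names are hypothetical labels for the two families of sets from Section 1.

```python
from fractions import Fraction
from itertools import product

def variance(C):
    M = len(C)
    dists = [sum(x != y for x, y in zip(a, b)) for a in C for b in C]
    mean = Fraction(sum(dists), M * M)
    return Fraction(sum(d * d for d in dists), M * M) - mean ** 2

def formulas(n, q):
    """Right-hand sides of (3.14)-(3.19)."""
    u, w = (q - 1) * (n + 2), (q - 1) * n     # shorthand for the c1 + c2 = 0 family
    t = (q - 1) * (n - 1)                     # shorthand for the Z_q^{n-1} x {0} family
    P, Mn, q2 = q ** (n - 1) + 1, q ** (n - 1) - 1, q * q
    return {
        "first": Fraction(u, q2),
        "first+": (Fraction(u, q2) - Fraction(4 * q ** n + u - w * w, q2 * P ** 2)
                   - Fraction(w * w, q2 * P ** 4)),
        "first-": (Fraction(u, q2) + Fraction(w * w - u, q2 * Mn ** 2)
                   - Fraction(w * w, q2 * Mn ** 4)),
        "second": Fraction(t, q2),
        "second+": (Fraction(t, q2) + Fraction(2 * q ** (n + 1) + t * (t - 1), q2 * P ** 2)
                    - Fraction((2 * q ** n - t) ** 2, q2 * P ** 4)),
        "second-": (Fraction(t, q2) + Fraction(t * (t - 1), q2 * Mn ** 2)
                    - Fraction(t * t, q2 * Mn ** 4)),
    }

def brute_force(n, q):
    """Variances of the six sets, computed directly (q prime, n >= 2)."""
    first = [c for c in product(range(q), repeat=n) if (c[0] + c[1]) % q == 0]
    second = [c + (0,) for c in product(range(q), repeat=n - 1)]
    e2 = (0, 1) + (0,) * (n - 2)              # the vector (0, 1, 0, ..., 0)
    en = (0,) * (n - 1) + (1,)                # the vector (0, ..., 0, 1)
    return {
        "first": variance(first),
        "first+": variance(first + [e2]),
        "first-": variance([c for c in first if any(c)]),
        "second": variance(second),
        "second+": variance(second + [en]),
        "second-": variance([c for c in second if any(c)]),
    }

print(brute_force(3, 2) == formulas(3, 2), brute_force(3, 3) == formulas(3, 3))  # True True
```

For $n = 3$, $q = 2$ the values are $5/4$, $604/625$, $4/3$ for the first family and $1/2$, $476/625$, $44/81$ for the second.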

Proof. It is easy to see from the definition of $\Omega$ that $\Omega$ is an $[n, n - 1]$ linear code over $\mathbb{F}_q$ and $|\Omega| = q^{n-1}$. The dual code of $\Omega$ is given by

$$\Omega^\perp = \{(a, a, 0, 0, \ldots, 0) \in \mathbb{F}_q^n \mid a \in \mathbb{F}_q\}.$$

The dual distance distribution of $\Omega$ is equal to the weight distribution of $\Omega^\perp$, that is,

$$B_0 = 1, \quad B_1 = 0, \quad B_2 = q - 1, \quad B_i = 0, \ i \ge 3.$$

Hence, by Lemma 2,

$$\bar{d}(\Omega) = \frac{(q - 1)n}{q}, \quad \operatorname{var}(\Omega) = \frac{(q - 1)(n + 2)}{q^2}. \tag{3.20}$$

Furthermore, by (1.1), (1.3) and (3.20), we have

$$\sum_{a \in \Omega} \sum_{b \in \Omega} d_H(a, b) = (q - 1) n q^{2n-3}, \tag{3.21}$$

$$\sum_{a \in \Omega} \sum_{b \in \Omega} d_H^2(a, b) = q^{2(n-2)} [(q - 1)(n + 2) + (q - 1)^2 n^2]. \tag{3.22}$$

It follows from the definition of $\Omega$ and (3.3) that

$$\begin{aligned}
\sum_{a \in \Omega} w_H(a - (0, 1, 0, \ldots, 0))
&= \sum_{(a_1, a_2, a_3, \ldots, a_n) \in \Omega} [w_H((a_1, a_2 - 1)) + w_H((a_3, \ldots, a_n))] \\
&= \sum_{\substack{(a_1, a_2) = (0, 0),\, (-1, 1) \\ (a_3, \ldots, a_n) \in \mathbb{F}_q^{n-2}}} [1 + w_H((a_3, \ldots, a_n))]
 + \sum_{\substack{a_2 \ne 0, 1,\ a_1 + a_2 = 0 \\ (a_3, \ldots, a_n) \in \mathbb{F}_q^{n-2}}} [2 + w_H((a_3, \ldots, a_n))] \\
&= 2 \sum_{b \in \mathbb{F}_q^{n-2}} [1 + w_H(b)] + (q - 2) \sum_{b \in \mathbb{F}_q^{n-2}} [2 + w_H(b)] \\
&= 2(q - 1) q^{n-2} + q \sum_{b \in \mathbb{F}_q^{n-2}} w_H(b) \\
&= n(q - 1) q^{n-2}.
\end{aligned} \tag{3.23}$$

In the same way, by (3.3) and (3.4), we have

$$\begin{aligned}
\sum_{a \in \Omega} w_H^2(a - (0, 1, 0, \ldots, 0))
&= 2 \sum_{b \in \mathbb{F}_q^{n-2}} [1 + w_H(b)]^2 + (q - 2) \sum_{b \in \mathbb{F}_q^{n-2}} [2 + w_H(b)]^2 \\
&= [(q - 1)^2 n^2 + (q - 1)n - 2] q^{n-3}.
\end{aligned} \tag{3.24}$$

Hence, by the definition of $\Omega^+$, (3.1), (3.20) and (3.23),

$$\bar{d}(\Omega^+) = \frac{q^{2(n-1)}}{(q^{n-1} + 1)^2} \bar{d}(\Omega) + \frac{2n(q - 1) q^{n-2}}{(q^{n-1} + 1)^2} = \frac{n(q - 1) q^{n-2} (q^{n-1} + 2)}{(q^{n-1} + 1)^2}. \tag{3.25}$$

By the definition of $\Omega^+$, (3.2), (3.22) and (3.24),

$$\sum_{a \in \Omega^+} \sum_{b \in \Omega^+} d_H^2(a, b) = [(q - 1)(n + 2) + (q - 1)^2 n^2] q^{2(n-2)} + 2[(q - 1)^2 n^2 + (q - 1)n - 2] q^{n-3}. \tag{3.26}$$

From (1.3), (3.25) and (3.26), we have

$$\operatorname{var}(\Omega^+) = \frac{[(q - 1)(n + 2) + (q - 1)^2 n^2] q^{2(n-2)} + 2[(q - 1)^2 n^2 + (q - 1)n - 2] q^{n-3}}{(q^{n-1} + 1)^2} - \frac{n^2 (q - 1)^2 q^{2(n-2)} (q^{n-1} + 2)^2}{(q^{n-1} + 1)^4}. \tag{3.27}$$

Note that

$$[(q - 1)(n + 2) + (q - 1)^2 n^2] q^{2(n-2)} + 2[(q - 1)^2 n^2 + (q - 1)n - 2] q^{n-3} = q^{-2} \left( [(q - 1)(n + 2) + (q - 1)^2 n^2][(q^{n-1} + 1)^2 - 1] - 4q^n \right), \tag{3.28}$$

$$n^2 (q - 1)^2 q^{2(n-2)} (q^{n-1} + 2)^2 = n^2 (q - 1)^2 q^{-2} [(q^{n-1} + 1)^4 - 2(q^{n-1} + 1)^2 + 1]. \tag{3.29}$$

Hence, we obtain (3.15) from (3.27)–(3.29).

Now we compute $\operatorname{var}(\Omega^-)$. Note that $\Omega$ is a linear code over $\mathbb{F}_q$. By the definition of $\Omega^-$,

$$\sum_{a \in \Omega} \sum_{b \in \Omega} d_H(a, b) = \sum_{a \in \Omega} \sum_{b \in \Omega} w_H(b - a) = q^{n-1} \sum_{c \in \Omega} w_H(c) = q^{n-1} \sum_{c \in \Omega^-} w_H(c). \tag{3.30}$$

In the same way, we have

$$\sum_{a \in \Omega} \sum_{b \in \Omega} d_H^2(a, b) = q^{n-1} \sum_{c \in \Omega^-} w_H^2(c). \tag{3.31}$$

By (3.21), (3.22), (3.30) and (3.31),

$$\sum_{c \in \Omega^-} w_H(c) = (q - 1) n q^{n-2}, \tag{3.32}$$

$$\sum_{c \in \Omega^-} w_H^2(c) = q^{n-3} [(q - 1)(n + 2) + (q - 1)^2 n^2]. \tag{3.33}$$

By (3.1), (3.2), (3.21), (3.22), (3.32) and (3.33),

$$\sum_{a \in \Omega^-} \sum_{b \in \Omega^-} d_H(a, b) = (q - 1) n q^{n-2} (q^{n-1} - 2), \tag{3.34}$$

$$\sum_{a \in \Omega^-} \sum_{b \in \Omega^-} d_H^2(a, b) = [(q - 1)(n + 2) + (q - 1)^2 n^2] q^{n-3} (q^{n-1} - 2). \tag{3.35}$$

It follows from (1.1) and (3.34) that

$$\bar{d}(\Omega^-) = \frac{(q - 1) n q^{n-2} (q^{n-1} - 2)}{(q^{n-1} - 1)^2}. \tag{3.36}$$

Hence, by (1.3), (3.35) and (3.36),

$$\operatorname{var}(\Omega^-) = \frac{[(q - 1)(n + 2) + (q - 1)^2 n^2] q^{n-3} (q^{n-1} - 2)}{(q^{n-1} - 1)^2} - \frac{(q - 1)^2 n^2 q^{2(n-2)} (q^{n-1} - 2)^2}{(q^{n-1} - 1)^4}. \tag{3.37}$$

Note that

$$q^{n-3} (q^{n-1} - 2) = q^{-2} [(q^{n-1} - 1)^2 - 1], \tag{3.38}$$

$$q^{2(n-2)} (q^{n-1} - 2)^2 = q^{-2} [(q^{n-1} - 1)^4 - 2(q^{n-1} - 1)^2 + 1]. \tag{3.39}$$

Hence, we obtain (3.16) from (3.37)–(3.39).

Now we compute $\operatorname{var}(\Lambda)$. From the definition of $\Lambda$, (3.7) and (3.8), we have

$$\sum_{a \in \Lambda} \sum_{b \in \Lambda} d_H(a, b) = \sum_{x \in \mathbb{Z}_q^{n-1}} \sum_{y \in \mathbb{Z}_q^{n-1}} d_H(x, y) = (q - 1)(n - 1) q^{2n-3}, \tag{3.40}$$

$$\sum_{a \in \Lambda} \sum_{b \in \Lambda} d_H^2(a, b) = \sum_{x \in \mathbb{Z}_q^{n-1}} \sum_{y \in \mathbb{Z}_q^{n-1}} d_H^2(x, y) = (q - 1)(n - 1)[1 + (q - 1)(n - 1)] q^{2n-4}. \tag{3.41}$$

Hence, by (1.1) and (3.40), we have

$$\bar{d}(\Lambda) = \frac{(q - 1)(n - 1)}{q}. \tag{3.42}$$

By (1.3), (3.41) and (3.42),

$$\operatorname{var}(\Lambda) = \frac{(q - 1)(n - 1)}{q^2}.$$

Now we compute $\operatorname{var}(\Lambda^+)$. From the definition of $\Lambda$ and (3.3), we have

$$\sum_{a \in \Lambda} w_H(a - (0, \ldots, 0, 1)) = \sum_{x \in \mathbb{Z}_q^{n-1}} [w_H(x) + 1] = q^{n-1} + (q - 1)(n - 1) q^{n-2}. \tag{3.43}$$

In the same way, by (3.3) and (3.4),

$$\sum_{a \in \Lambda} w_H^2(a - (0, \ldots, 0, 1)) = \sum_{x \in \mathbb{Z}_q^{n-1}} [w_H(x) + 1]^2 = q^{n-1} + 2(q - 1)(n - 1) q^{n-2} + (q - 1)(n - 1)[1 + (q - 1)(n - 1)] q^{n-3}. \tag{3.44}$$

Hence, by the definition of $\Lambda^+$, Proposition 1, (3.40), (3.41), (3.43) and (3.44), we have

$$\begin{aligned}
\sum_{a \in \Lambda^+} \sum_{b \in \Lambda^+} d_H(a, b) &= (q - 1)(n - 1) q^{2n-3} + 2q^{n-1} + 2(q - 1)(n - 1) q^{n-2} \\
&= \frac{(q - 1)(n - 1)}{q} (q^{n-1} + 1)^2 + \frac{2q^n - (q - 1)(n - 1)}{q},
\end{aligned} \tag{3.45}$$

$$\begin{aligned}
\sum_{a \in \Lambda^+} \sum_{b \in \Lambda^+} d_H^2(a, b) &= (q - 1)(n - 1)[1 + (q - 1)(n - 1)] q^{2n-4} + 2q^{n-1} \\
&\quad + 4(q - 1)(n - 1) q^{n-2} + 2(q - 1)(n - 1)[1 + (q - 1)(n - 1)] q^{n-3} \\
&= \frac{(q - 1)(n - 1)[1 + (q - 1)(n - 1)]}{q^2} (q^{n-1} + 1)^2 \\
&\quad + \frac{2q^{n+1} + 4(q - 1)(n - 1) q^n - (q - 1)(n - 1)[1 + (q - 1)(n - 1)]}{q^2}.
\end{aligned} \tag{3.46}$$

It follows from (1.1) and (3.45) that

$$\bar{d}(\Lambda^+) = \frac{(q - 1)(n - 1)}{q} + \frac{2q^n - (q - 1)(n - 1)}{q (q^{n-1} + 1)^2}. \tag{3.47}$$

Hence, by (1.3), (3.46) and (3.47), we have

$$\begin{aligned}
\operatorname{var}(\Lambda^+) &= \frac{1}{(q^{n-1} + 1)^2} \sum_{a \in \Lambda^+} \sum_{b \in \Lambda^+} d_H^2(a, b) - [\bar{d}(\Lambda^+)]^2 \\
&= \frac{(q - 1)(n - 1)}{q^2} + \frac{2q^{n+1} + (q - 1)(n - 1)[(q - 1)(n - 1) - 1]}{q^2 (q^{n-1} + 1)^2} - \frac{[2q^n - (q - 1)(n - 1)]^2}{q^2 (q^{n-1} + 1)^4}.
\end{aligned}$$

Now we compute $\operatorname{var}(\Lambda^-)$. From the definitions of $\Lambda$ and $\Lambda^-$ and (3.3), we have

$$\sum_{a \in \Lambda^-} w_H(a) = \sum_{a \in \Lambda} w_H(a) = \sum_{x \in \mathbb{Z}_q^{n-1}} w_H(x) = (q - 1)(n - 1) q^{n-2}. \tag{3.48}$$

In the same way, by (3.4),

$$\sum_{a \in \Lambda^-} w_H^2(a) = \sum_{x \in \mathbb{Z}_q^{n-1}} w_H^2(x) = (q - 1)(n - 1)[1 + (q - 1)(n - 1)] q^{n-3}. \tag{3.49}$$

Hence, by the definition of $\Lambda^-$, Proposition 1, (3.40), (3.41), (3.48) and (3.49), we have

$$\sum_{a \in \Lambda^-} \sum_{b \in \Lambda^-} d_H(a, b) = \frac{(q - 1)(n - 1)}{q} [(q^{n-1} - 1)^2 - 1], \tag{3.50}$$

$$\sum_{a \in \Lambda^-} \sum_{b \in \Lambda^-} d_H^2(a, b) = \frac{(q - 1)(n - 1)[1 + (q - 1)(n - 1)]}{q^2} [(q^{n-1} - 1)^2 - 1]. \tag{3.51}$$

It follows from (1.1) and (3.50) that

$$\bar{d}(\Lambda^-) = \frac{(q - 1)(n - 1)}{q} - \frac{(q - 1)(n - 1)}{q (q^{n-1} - 1)^2}. \tag{3.52}$$

Hence, by (1.3), (3.51) and (3.52), we have

$$\begin{aligned}
\operatorname{var}(\Lambda^-) &= \frac{1}{(q^{n-1} - 1)^2} \sum_{a \in \Lambda^-} \sum_{b \in \Lambda^-} d_H^2(a, b) - [\bar{d}(\Lambda^-)]^2 \\
&= \frac{(q - 1)(n - 1)}{q^2} + \frac{(q - 1)(n - 1)[(q - 1)(n - 1) - 1]}{q^2 (q^{n-1} - 1)^2} - \frac{(q - 1)^2 (n - 1)^2}{q^2 (q^{n-1} - 1)^4}.
\end{aligned}$$

This completes the proof. □

Remark 1. There is a typographical error for $T(n, 2^{n-1} + 1; 2)$ in [15]. It follows from Theorem 2 and Proposition 3 that

$$\begin{aligned}
T(n, 2^{n-1} + 1; 2) &= \frac{n - 1}{4} + \frac{2^{n+2} + (n - 1)(n - 2)}{4(2^{n-1} + 1)^2} - \frac{[2^{n+1} - n + 1]^2}{4(2^{n-1} + 1)^4} \\
&= \frac{n - 1}{4} + \frac{n^2 + n + 2}{4(2^{n-1} + 1)^2} + \frac{(2^n - n - 1)(2^{2n} + 2^n + n + 1)}{4(2^{n-1} + 1)^4}.
\end{aligned}$$
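The two expressions in Remark 1 are algebraically equal, which can be confirmed with exact arithmetic for a range of $n$. A small sketch (function names are illustrative):

```python
from fractions import Fraction

def remark_first(n):
    """First expression for T(n, 2^(n-1) + 1; 2) in Remark 1."""
    D = 2 ** (n - 1) + 1
    return (Fraction(n - 1, 4)
            + Fraction(2 ** (n + 2) + (n - 1) * (n - 2), 4 * D ** 2)
            - Fraction((2 ** (n + 1) - n + 1) ** 2, 4 * D ** 4))

def remark_second(n):
    """Second expression for T(n, 2^(n-1) + 1; 2) in Remark 1."""
    D = 2 ** (n - 1) + 1
    return (Fraction(n - 1, 4)
            + Fraction(n * n + n + 2, 4 * D ** 2)
            + Fraction((2 ** n - n - 1) * (2 ** (2 * n) + 2 ** n + n + 1), 4 * D ** 4))

print(remark_first(3), remark_second(3))  # 476/625 476/625
```

The equality rests on the identity $4(2^n - n)(2^{n-1} + 1)^2 - (2^{n+1} - n + 1)^2 = (2^n - n - 1)(2^{2n} + 2^n + n + 1)$.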

4. Proof of Theorem 1

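In this section, we first derive a general upper bound on $\operatorname{var}(C)$ for $2 \le q \le 4$, then we prove Theorem 1 by observing that this upper bound is tight for $\Omega$, $\Omega^+$ and $\Omega^-$.

Theorem 3. Let $C$ be an $(n, M; q)$ code. If $2 \le q \le 4$ and $M \equiv \delta \pmod{q}$, then

$$\operatorname{var}(C) \le \frac{(q - 1)n - 2}{q^2} + \frac{2q^{n-2}}{M} - \frac{\delta^2 (q - \delta)^2 n^2}{q^2 M^4} - \frac{\delta (q - \delta)}{q^2 (q - 1) M^2} [2q^n - 2 - n(q - 1)(q - 2) - n(n - 1)(q - 1)^2]. \tag{4.1}$$

Proof. The dual distance distribution of $C$ is given by $B_0, B_1, \ldots, B_n$. By Lemmas 1 and 3, we have

$$\begin{aligned}
B_2 &= \frac{q^n}{M} - 1 - B_1 - \sum_{k=3}^{n} B_k \\
&\le \frac{q^n}{M} - 1 - B_1 - \frac{\delta (q - \delta)}{M^2} \sum_{k=3}^{n} (q - 1)^{k-1} \binom{n}{k} \\
&= \frac{q^n}{M} - 1 - B_1 - \frac{\delta (q - \delta)}{(q - 1) M^2} \left[ q^n - 1 - (q - 1)n - \frac{n(n - 1)(q - 1)^2}{2} \right].
\end{aligned} \tag{4.2}$$

Since $B_1 \ge \delta (q - \delta) n / M^2$, by Lemmas 2 and 3 and (4.2), we have for $2 \le q \le 4$,

$$\begin{aligned}
\operatorname{var}(C) &\le \frac{(q - 1)n - 2}{q^2} + \frac{2q^{n-2}}{M} + \frac{q - 4}{q^2} B_1 - \frac{1}{q^2} (B_1)^2 \\
&\quad - \frac{\delta (q - \delta)}{(q - 1) q^2 M^2} [2q^n - 2 - 2(q - 1)n - n(n - 1)(q - 1)^2] \\
&\le \frac{(q - 1)n - 2}{q^2} + \frac{2q^{n-2}}{M} - \frac{\delta^2 (q - \delta)^2 n^2}{q^2 M^4} \\
&\quad - \frac{\delta (q - \delta)}{q^2 (q - 1) M^2} [2q^n - 2 - n(q - 1)(q - 2) - n(n - 1)(q - 1)^2].
\end{aligned}$$

This completes the proof. □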
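The upper bound of Theorem 3 can be evaluated exactly and compared with every subset of a very small Hamming space. The sketch below assumes the residue of $M$ modulo $q$ is taken in $\{0, \ldots, q-1\}$ and that $n \ge 2$; the function names are hypothetical, and the exhaustive check is feasible only for tiny $n$ and $q$.

```python
from fractions import Fraction
from itertools import combinations, product

def variance(C):
    M = len(C)
    dists = [sum(x != y for x, y in zip(a, b)) for a in C for b in C]
    mean = Fraction(sum(dists), M * M)
    return Fraction(sum(d * d for d in dists), M * M) - mean ** 2

def upper_bound(n, M, q):
    """Right-hand side of (4.1); intended for 2 <= q <= 4 and n >= 2."""
    residue = M % q
    g = residue * (q - residue)
    bracket = 2 * q ** n - 2 - n * (q - 1) * (q - 2) - n * (n - 1) * (q - 1) ** 2
    return (Fraction((q - 1) * n - 2, q * q) + Fraction(2 * q ** (n - 2), M)
            - Fraction(g * g * n * n, q * q * M ** 4)
            - Fraction(g * bracket, q * q * (q - 1) * M * M))

def bound_holds(n, q):
    """Check var(C) <= bound for every nonempty subset C of Z_q^n."""
    space = list(product(range(q), repeat=n))
    return all(variance(C) <= upper_bound(n, M, q)
               for M in range(1, len(space) + 1)
               for C in combinations(space, M))

print(upper_bound(3, 4, 2))                  # 5/4
print(bound_holds(3, 2), bound_holds(2, 3))  # True True
```

For $M = q^{n-1}$ the residue is zero and the bound collapses to $(q-1)(n+2)/q^2$, the value appearing in (3.14).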

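Proof of Theorem 1. Note that

$$|\Omega| = q^{n-1}, \quad |\Omega^+| = q^{n-1} + 1, \quad |\Omega^-| = q^{n-1} - 1.$$

By Theorem 3 and Proposition 3, we have for $2 \le q \le 4$,

$$\operatorname{var}(\Omega) \le R(n, q^{n-1}; q) \le \frac{(q - 1)(n + 2)}{q^2} = \operatorname{var}(\Omega),$$

$$\begin{aligned}
\operatorname{var}(\Omega^+) \le R(n, q^{n-1} + 1; q) &\le \frac{(q - 1)(n + 2)}{q^2} - \frac{4q^n + (q - 1)(n + 2) - (q - 1)^2 n^2}{q^2 (q^{n-1} + 1)^2} - \frac{(q - 1)^2 n^2}{q^2 (q^{n-1} + 1)^4} \\
&= \operatorname{var}(\Omega^+),
\end{aligned}$$

$$\begin{aligned}
\operatorname{var}(\Omega^-) \le R(n, q^{n-1} - 1; q) &\le \frac{(q - 1)(n + 2)}{q^2} + \frac{(q - 1)^2 n^2 - (q - 1)(n + 2)}{q^2 (q^{n-1} - 1)^2} - \frac{(q - 1)^2 n^2}{q^2 (q^{n-1} - 1)^4} \\
&= \operatorname{var}(\Omega^-).
\end{aligned}$$

Hence, Theorem 1 follows from combining these assertions. □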
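For the smallest cases, the statement of Theorem 1 can be confirmed by comparing the variance of the three extremal sets with an exhaustive maximum over all subsets of the same size. This is a sketch assuming prime $q$ (so that arithmetic modulo $q$ realizes the field) and $n \ge 2$; the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

def variance(C):
    M = len(C)
    dists = [sum(x != y for x, y in zip(a, b)) for a in C for b in C]
    mean = Fraction(sum(dists), M * M)
    return Fraction(sum(d * d for d in dists), M * M) - mean ** 2

def max_variance(n, M, q):
    """R(n, M; q) by exhaustive search (tiny parameters only)."""
    space = list(product(range(q), repeat=n))
    return max(variance(C) for C in combinations(space, M))

def theorem1_sets(n, q):
    """The set {c : c1 + c2 = 0}, the set with (0, 1, 0, ..., 0) added, and the set without 0."""
    base = [c for c in product(range(q), repeat=n) if (c[0] + c[1]) % q == 0]
    plus = base + [(0, 1) + (0,) * (n - 2)]
    minus = [c for c in base if any(c)]
    return base, plus, minus

def theorem1_holds(n, q):
    return all(max_variance(n, len(S), q) == variance(S) for S in theorem1_sets(n, q))

print(theorem1_holds(3, 2), theorem1_holds(2, 3))  # True True
```

For $n = 2$, $q = 3$ the three maxima are $8/9$, $11/16$ and $1$ for $M = 3, 4, 2$ respectively.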

5. Proof of Theorem 2

In this section, we first derive a general lower bound on $\operatorname{var}(C)$ for all $q$, then we prove Theorem 2 by showing that this lower bound is tight in some cases. Let $C$ be an $(n, M; q)$ code where $M \equiv \delta \pmod{q}$. Denote

$$s_q(n, M) = \frac{\delta (q - \delta) n}{q M^2},$$

$$t_q(n, M) = \frac{q^{n-1}}{M} - \frac{1}{q} - \frac{\delta (q - \delta)[q^n - 1 - (q - 1)n]}{q (q - 1) M^2}$$

and define

$$F(x) = \frac{q - 2}{q} x - x^2, \quad x > 0,$$

$$r_q(n, M) = \min\{F(s_q(n, M)),\ F(t_q(n, M))\}.$$

Theorem 4. Let $C$ be an $(n, M; q)$ code with $M \equiv \delta \pmod{q}$. Then

$$\operatorname{var}(C) \ge \frac{(q - 1)n}{q^2} + \frac{\delta (q - \delta)(q - 1) n (n - 1)}{q^2 M^2} + r_q(n, M). \tag{5.1}$$

Moreover, if $t_q(n, M) \ge \frac{q - 2}{q}$, then $r_q(n, M) = F(t_q(n, M))$.
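The lower bound of Theorem 4 is straightforward to evaluate with exact fractions and to compare against all subsets of a tiny space. The sketch below takes the residue of $M$ modulo $q$ in $\{0, \ldots, q-1\}$; the function names are hypothetical, and the exhaustive comparison is only meant for very small $n$ and $q$.

```python
from fractions import Fraction
from itertools import combinations, product

def variance(C):
    M = len(C)
    dists = [sum(x != y for x, y in zip(a, b)) for a in C for b in C]
    mean = Fraction(sum(dists), M * M)
    return Fraction(sum(d * d for d in dists), M * M) - mean ** 2

def lower_bound(n, M, q):
    """Right-hand side of (5.1), built from s_q, t_q, F and r_q."""
    residue = M % q
    g = residue * (q - residue)
    s = Fraction(g * n, q * M * M)
    t = (Fraction(q ** (n - 1), M) - Fraction(1, q)
         - Fraction(g * (q ** n - 1 - (q - 1) * n), q * (q - 1) * M * M))

    def F(x):
        return Fraction(q - 2, q) * x - x * x

    r = min(F(s), F(t))
    return (Fraction((q - 1) * n, q * q)
            + Fraction(g * (q - 1) * n * (n - 1), q * q * M * M) + r)

def bound_holds(n, q):
    """Check var(C) >= bound for every nonempty subset C of Z_q^n."""
    space = list(product(range(q), repeat=n))
    return all(variance(C) >= lower_bound(n, M, q)
               for M in range(1, len(space) + 1)
               for C in combinations(space, M))

print(lower_bound(3, 4, 2))                  # 1/2
print(bound_holds(3, 2), bound_holds(2, 3))  # True True
```

For $M = q^{n-1}$ the bound equals $(q-1)(n-1)/q^2$, the value appearing in (3.17).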

Proof. The dual distance distribution of $C$ is given by $B_0, B_1, \ldots, B_n$. Denote $x_0 = \frac{B_1}{q}$. By Lemmas 2 and 3, we have

$$\begin{aligned}
\operatorname{var}(C) &= \frac{(q - 1)n}{q^2} + \frac{2}{q^2} B_2 + F(x_0) \\
&\ge \frac{(q - 1)n}{q^2} + \frac{\delta (q - \delta)(q - 1) n (n - 1)}{q^2 M^2} + F(x_0).
\end{aligned} \tag{5.2}$$

It follows from Lemma 3 that $x_0 \ge s_q(n, M)$. By Lemmas 1 and 3, we have

$$\begin{aligned}
B_1 &= \frac{q^n}{M} - 1 - \sum_{k=2}^{n} B_k \\
&\le \frac{q^n}{M} - 1 - \frac{\delta (q - \delta)}{M^2} \sum_{k=2}^{n} (q - 1)^{k-1} \binom{n}{k} \\
&= \frac{q^n}{M} - 1 - \frac{\delta (q - \delta)[q^n - 1 - (q - 1)n]}{(q - 1) M^2}.
\end{aligned}$$

Hence $x_0 \le t_q(n, M)$. Since on every interval $F(x)$ achieves its minimum at one of the endpoints, we obtain (5.1). Moreover, since $s_q(n, M) \ge 0$, it follows that $F(s_q(n, M)) \ge F(t_q(n, M))$ when $t_q(n, M) \ge \frac{q - 2}{q}$. This completes the proof. □

Proof of Theorem 2. Note that

$$|\Lambda| = q^{n-1}, \quad |\Lambda^+| = q^{n-1} + 1, \quad |\Lambda^-| = q^{n-1} - 1.$$

For $M = q^{n-1}$, we have $\delta = 0$ and

$$t_q(n, q^{n-1}) = \frac{q - 1}{q} \ge \frac{q - 2}{q}.$$

Hence, we have

$$r_q(n, q^{n-1}) = F\!\left(\frac{q - 1}{q}\right) = -\frac{q - 1}{q^2}.$$

Thus, by Theorem 4 and Proposition 3,

$$\operatorname{var}(\Lambda) \ge T(n, q^{n-1}; q) \ge \frac{(q - 1)(n - 1)}{q^2} = \operatorname{var}(\Lambda),$$

which implies (1.7).

For $M = q^{n-1} - 1$, we have $\delta = q - 1$ and

$$t_q(n, q^{n-1} - 1) = \frac{q - 1}{q} + \frac{(q - 1)(n - 1)}{q (q^{n-1} - 1)^2}.$$

It is easy to see that $t_q(n, q^{n-1} - 1) \ge \frac{q - 2}{q}$. Hence, by Theorem 4 and Proposition 3, we have

$$\begin{aligned}
\operatorname{var}(\Lambda^-) \ge T(n, q^{n-1} - 1; q) &\ge \frac{(q - 1)(n - 1)}{q^2} + \frac{(q - 1)(n - 1)[(q - 1)(n - 1) - 1]}{q^2 (q^{n-1} - 1)^2} - \frac{(q - 1)^2 (n - 1)^2}{q^2 (q^{n-1} - 1)^4} \\
&= \operatorname{var}(\Lambda^-),
\end{aligned}$$

which implies (1.8).

For $M = q^{n-1} + 1$, we have $\delta = 1$ and

$$t_q(n, q^{n-1} + 1) = \frac{q - 1}{q} - \frac{2q^n - (q - 1)(n - 1)}{q (q^{n-1} + 1)^2}.$$

It is easy to check that $t_q(n, q^{n-1} + 1) \ge \frac{q - 2}{q}$ for $n \ge 3$. If $n = 2$, then $M = q + 1$ and

$$t_q(2, q + 1) = \frac{q - 1}{q} - \frac{2q^2 - q + 1}{q (q + 1)^2}, \quad s_q(2, q + 1) = \frac{2(q - 1)}{q (q + 1)^2}.$$

It is easy to check that for $n = 2$ and $q = 2, 3$,

$$t_q(2, q + 1) \ge \frac{q - 2}{q},$$

and for $n = 2$ and $q = 4$, $t_4(2, 5) = 0.46$ and $s_4(2, 5) = 0.06$, so that $r_4(2, 5) = F(t_4(2, 5))$. Hence, by Theorem 4 and Proposition 3, we have for $n \ge 3$ or $2 \le q \le 4$,

$$\begin{aligned}
\operatorname{var}(\Lambda^+) \ge T(n, q^{n-1} + 1; q) &\ge \frac{(q - 1)(n - 1)}{q^2} + \frac{2q^{n+1} + (q - 1)(n - 1)[(q - 1)(n - 1) - 1]}{q^2 (q^{n-1} + 1)^2} - \frac{[2q^n - (q - 1)(n - 1)]^2}{q^2 (q^{n-1} + 1)^4} \\
&= \operatorname{var}(\Lambda^+),
\end{aligned}$$

which implies (1.9). This completes the proof. □

Acknowledgements

The authors would like to thank the anonymous reviewers and the Editor-in-Chief Professor Peter L. Hammer for their valuable suggestions and comments that helped to improve this paper. This research work is supported in part by the DSTA Research Grant R-394-000-011-422, the National Natural Science Foundation of China under the Grant 60172060, the Trans-Century Training Program Foundation for the Talents by the Education Ministry of China, and the Foundation for University Key Teacher by the Education Ministry of China, and the 100-Talents program of the Chinese Academy of Science.

References

[1] R. Ahlswede, I. Althöfer, The asymptotic behaviour of diameters in the average, J. Combin. Theory Ser. B 61 (1994) 167–177.
[2] R. Ahlswede, G. Katona, Contributions to the geometry of Hamming spaces, Discrete Math. 17 (1977) 1–22.
[3] R. Ahlswede, Z. Zhang, Coding for write-efficient memory, Information and Computation 83 (1994) 80–97.
[4] I. Althöfer, T. Sillke, An average distance inequality for large subsets of the cube, J. Combin. Theory Ser. B 56 (1992) 296–301.
[5] A. Ashikhmin, J. Simonis, On the Delsarte inequalities, Linear Algebra Appl. 269 (1998) 197–217.
[6] P. Delsarte, An algebraic approach to the association schemes of coding theory, Philips Res. Rep. Suppl. 10 (1973).
[7] P. Delsarte, Bounds for unrestricted codes, by linear programming, Philips Res. Rep. 27 (1972) 272–289.
[8] F.-W. Fu, On the improvements of Althöfer-Sillke inequality, J. Math. Res. Exposition 16 (3) (1996) 325–328.
[9] F.-W. Fu, S.-Y. Shen, On the expectation and variance of Hamming distance between two binary i.i.d. random vectors, Acta Math. Appl. Sinica 13 (1997) 243–250.
[10] F.-W. Fu, V.K. Wei, R.W. Yeung, On the minimum average distance for binary codes: linear programming approach, Discrete Appl. Math. 111 (2001) 263–281.
[11] F.-W. Fu, C. Xing, On the minimum average distance for nonbinary codes, in: Proceedings of Fourth Shanghai International Conference on Combinatorics, Shanghai, China, May 24–28, 2002.
[12] A. Kündgen, Covering cliques with spanning bicliques, J. Graph Theory 27 (1998) 223–227.
[13] A. Kündgen, Minimum average distance subsets in the Hamming cube, Discrete Math. 249 (2002) 149–165.
[14] F.J. MacWilliams, N.J.A. Sloane, The Theory of Error-Correcting Codes, North-Holland, New York, 1985.
[15] S.-T. Xia, F.-W. Fu, On the average Hamming distance for binary codes, Discrete Appl. Math. 89 (1998) 269–276.
[16] Z.Z. Zhang, A relation between the average Hamming distance and the average Hamming weight of binary codes, J. Statist. Plann. Inference 94 (2001) 413–419.