Quick viewing(Text Mode)

The Communication Complexity of the Hamming Distance Problem

The Communication Complexity of the Hamming Distance Problem

Information Processing Letters 99 (2006) 149–153 www.elsevier.com/locate/ipl

The communication complexity of the Hamming problem

Wei Huang a,1, Yaoyun Shi a,1, Shengyu Zhang b,2,YufanZhua,∗,1

a Department of Electrical Engineering and Computer Science, University of Michigan, 1301 Beal Avenue, Ann Arbor, MI 48109-2122, USA b Computer Science Department, Princeton University, NJ 08544, USA Received 20 February 2005; received in revised form 28 September 2005; accepted 9 January 2006

Communicated by P.M.B. Vitányi

Abstract

We investigate the randomized and quantum communication complexity of the HAMMING DISTANCE problem, which is to determine if the Hamming distance between two n-bit strings is no less than a threshold d. We prove a quantum lower bound of (d) qubits in the general interactive model with shared prior entanglement. We also construct a classical protocol of O(d log d) bits in the restricted Simultaneous Message Passing model with public random coins, improving previous protocols of O(d2) bits [A.C.-C. Yao, On the power of quantum fingerprinting, in: Proceedings of the 35th Annual ACM Symposium on Theory of Computing, 2003, pp. 77–81], and O(d log n) bits [D. Gavinsky, J. Kempe, R. de Wolf, Quantum communication cannot simulate a public coin, quant-ph/0411051, 2004]. © 2006 Published by Elsevier B.V.

Keywords: Computational complexity; Communication complexity; Hamming distance

1. Introduction We recall some basic concepts below. Let n be an in- teger and X = Y ={0, 1}n.Letf : X × Y →{0, 1} be a Communication complexity was introduced by Yao Boolean function. Consider the scenario where two par- [17] and has been extensively studied afterward not only ties, Alice and Bob, who know only x ∈ X and y ∈ Y , for its own intriguing problems, but also for its many respectively, communicate interactively with each other applications ranging from circuit lower bounds to data to compute f(x,y).Thedeterministic communication streaming algorithms. We refer the reader to the mono- complexity of f , denoted by D(f ), is defined to be the graph [12] for an excellent survey. minimum integer k such that there is a protocol for com- puting f using no more than k bits of communication on any pair of inputs. The randomized communication complexity of f , denoted by Rpub(f ), is similarly de- * Corresponding author. fined, with the exception that Alice and Bob can use E-mail addresses: [email protected] (W. Huang), publicly announced random bits and that they are re- [email protected] (Y. Shi), [email protected] quired to compute f(x,y) correctly with probability (S. Zhang), [email protected] (Y. Zhu). 1 Supported in part by NSF grants 0347078 and 0323555. at least 2/3. One of the central themes on the classi- 2 Supported in part by NSF grants CCR-0310466 and CCF- cal communication complexity studies is to understand 0426582. how randomness helps in saving the communication

0020-0190/$ – see front matter © 2006 Published by Elsevier B.V. doi:10.1016/j.ipl.2006.01.014 150 W. Huang et al. / Information Processing Letters 99 (2006) 149–153 cost. A basic finding of Yao [17] is that there are func- exponential separation is generalized by Yao [19], tions f such that R(f ) = O(log D(f )). One example is showing that R,pub(f ) = constant implies Q(f ) = the EQUALITY problem, which simply checks whether O(log n). As an application, Yao considered the HAM- x = y. MING DISTANCE problem defined below. For any Later results show that different ways of using ran- x,y ∈{0, 1}n, the of x, denoted by domness result in quite subtle changes on communica- |x|, is the number of 1’s in x, and the Hamming dis- tion complexity. A basic finding in this regard, due to tance of x and y is |x ⊕ y|, with “⊕” being bit-wise Newman [13], is that public-coin protocols can save at XOR. most O(log n) bits over protocols in which Alice and   Bob toss private (and independent) coins. The situation Definition 1.1. For 1 d n,thed-HAMMING DIS- is, however, dramatically different in the Simultaneous TANCE problem is to compute the following Bool- { }n ×{ }n →{ } Message Passing (SMP) model, also introduced by Yao ean function HAMn,d : 0, 1 0, 1 0, 1 , with = | ⊕ | [17], where Alice and Bob each send a message to a HAM(x, y) 1 if and only if x y >d. third person, who then outputs the outcome of the pro- Lemma 1.2. (Yao [19].) R,pub(HAM ) = O(d2). tocol. Apparently, this is a more restricted model and n,d for any function, the communication complexity in this In a recent paper [10], Gavinsky et al. gave another model is at least that in the general interactive commu- classical protocol, which is an improvement over Yao’s nication model. Denote by R(f ) and R,pub(f ) the when d  log n. communication complexities in the SMP model with private and public random coins, respectively. It is in- ,pub =  Lemma 1.3. (Gavinsky et al. [10].) R (HAMn,d ) teresting to note that R ,pub(EQUALITY) = O(1) but  √ O(d log n). R (EQUALITY) = ( n) [2,14,5]. Yao also initiated the study of quantum communica- In this paper, we observe a lower bound for ∗ tion complexity [18], where Alice and Bob are equipped Q (HAMn,d ), which is also a lower bound for ,pub with quantum computational power and exchange quan- R (HAMn,d ) according to Eq. (1). tum bits. Allowing an error probability of no more than Notice that HAM(x, y) = n − HAM(x, y)¯ , where def 1/3 in the interactive model, the resulting communi- y¯ = 11 ···1 ⊕ y. Therefore cation complexity is the quantum communication com- ∗ = ∗ plexity of f , denoted by Q(f ). If the two parties are Q (HAMn,d ) Q (HAMn,n−d ), allowed to share prior quantum entanglement, the quan- and we need only consider the case d  n/2. tum analogy of randomness, the communication com- ∗ ∗ plexity is denoted by Q (f ). Similarly, the quantum Proposition 1.4. For any d  n/2, Q (HAMn,d ) = communication complexities in the SMP model are de- (d). noted by Q and Q,∗, depending on whether prior en- tanglement is shared. The following relations among the We then construct a public-coin randomized SMP measures are easy to observe. protocol that almost matches the lower bound and im- proves both of the above protocols. pub ∗  R (f )  ,pub Q (f ) ,∗ R (f ). (1) ,pub Q (f ) Theorem 1.5. R (HAMn,d ) = O(d log d).

Two very interesting problems in both communi- We shall prove the above two results in the follow- cation models are the power of quantumness, i.e., de- ing sections. Finally we discuss open problems and a termining the biggest gap between quantum and ran- plausible approach for closing the gap. domized communication complexities, and the power of shared entanglement, i.e., determining the biggest Other related work. Ambainis et al. [3] considered gap between quantum communication complexities the error-free communication complexity, and proved with and without shared entanglement. An impor- that any error-free quantum protocol for the Hamming tant result for the first problem by Buhrman et al. Distance problem requires at least n − 2 qubits of com-  [7] is Q (EQUALITY) = O(log n), an exponential sav- munication in the interactive model, for any d  n − 1, ing compared to the randomized counterpart result Feigenbaum et al. [9] started the secure multiparty ap-  √ R (EQUALITY) = ( n) mentioned above. This proximate computation of the Hamming distance. W. Huang et al. / Information Processing Letters 99 (2006) 149–153 151

2. Lower bound of the quantum communication d  3n /8 and d = (d). Employing the result of the ∗ complexity of the Hamming distance problem case that d  3n/8, we have Q (HAMn ,k ,d ) = (d ). ∗ ∗ Thus Q (HAMn,d )  Q (HAMn ,k ,d ) = (d ) = 2 For proving the lower bound, we restrict HAMn,d (d). on those pairs of inputs with equal Hamming distance. More specifically, for an integer k,1 k  n, define 3. Upper bound of the classical communication def n complexity of the Hamming distance problem Xk = Yk ={x: x ∈{0, 1} , |x|=k}. Let HAMn,k,d : Xk × Yk →{0, 1} be the restriction of HAMn,d on To prove Theorem 1.5, we reduce the HAMn,d prob- Xk × Yk. Before proving Proposition 1.4, we briefly intro- lem to HAM16d2,d problem by the following lemma. duce some related results. Let x,y ∈{0, 1}n.The Lemma 3.1. DISJOINTNESS problem is to compute the following   n n   Boolean function DISJn : {0, 1} ×{0, 1} →{0, 1}, ,pub = ,pub R (HAMn,d ) O R (HAM16d2,d ) . DISJn(x, y) = 1 if and only if there exists an inte- ger i,1 i  n, so that xi = yi = 1. It is known that ∗ √ Note that Theorem 1.5 immediately follows from R(DISJ ) = (n) [11,15], and Q (DISJ ) = ( n) ,pub n n Lemma 3.1 because by Lemma 1.3, R (HAMn,d ) = [16,1,4]. ,pub = 2 = O(d log n), thus R (HAM16d2,d ) O(d log d ) We shall use an important lemma in Razborov [16], O(d log d). Now by Lemma 3.1, we have which is more general than his remarkable lower bound ,pub = on quantum communication complexity of DISJOINT- R (HAMn,d ) O(d log d). NESS. Here we may abuse the notation by viewing So in what follows, we shall prove Lemma 3.1. De- ∈{ }n { ∈[ ] = } x 0, 1 as the set i n : xi 1 . fine a partial function HAMn,d|2d (x, y) with domain {(x, y): x,y ∈{0, 1}n, |x ⊕ y| is either less than d or Lemma 2.1. (Razborov [16].) Suppose k  n/4 and at least 2d} as follows:  [ ]→{ }  l k/4. Let D : k 0, 1 be any Boolean predi-  = − × → 0 if HAM(x, y) d, cate such that D(l) D(l 1). Let fn,k,D : Xk Yk HAMn,d|2d (x, y) = (2) def 1 if HAM(x, y) > 2d. {0, 1} be such that√ fn,k,D(x, y) = D(|x ∩ y|). Then ∗ Then Q (fn,k,D) = ( kl).

Proof of Proposition 1.4. Consider D in Lemma 2.1 Lemma 3.2. = ∈ ,pub such that D(t) 1 if and only if t d. Therefore, D( x y ) 1 if and protocol for HAMn,d|2d . Assume the Hamming distance only if HAM(x, y) > d. This implies that fn,k,D and between x and y is k. Alice and Bob share some ran- HAMn,k,d are actually the same function, and thus dom public string, which consists of a sequence of γn ∗ ∗ Q (fn,k,D) = Q (HAMn,k,d ). (γ is some constant to be determined later) random To use Lemma 2.1, the following two constraints on bits, each of which is generated independently with k and l need to be satisfied: k  n/4 and l  k/4. When probability p = 1/(2d) of being 1. Denote this string d  3n/8, let k = 2d/3  n/4, then l = 2d/3 − d/2 = by z1,z2,...,zγ , each of length n.PartyA sends the d/6  n/16. Both requirements for k and l are satis- string a = a1a2 ...aγ to the referee, where ai = x · ∗ fied. So applying√ Lemma 2.1, we get Q (HAMn,k,d ) = zi (mod 2).PartyB sends the string b = b1b2 ...bγ to ∗ Q (fn,k,D) = ( kl) = (d). the referee, where bi = y · zi (mod 2). The referee an- For 3n/8

Lemma 3.3. Assume that the Hamming distance be- when she runs P1 and P2, respectively. So does Bob tween x and y is k. Given c as defined above, each ci is send the concatenation of his two corresponding mes- an independent random variable with probability αk of sages mB,1 and mB,2. The referee then runs protocol Pi = − − k being 1, where αk 1/2 1/2(1 1/d) . on (mA,i,mB,i) and gets the results ri . The referee now announces |x ⊕ y|  d if and only if both r1 and r2 say Since αk is an increasing function over k, to separate |x ⊕ y|  d. k  d from k>2d, it would be sufficient to discrim- It is easy to see that the protocol is correct. If |x ⊕ y| = = inate the two cases that k d and k 2d.LetNk be  d, then both protocols announces so with probability a random variable denoting the number of 1’s in c, at least 7/8, and thus P says so with probability at least and E(Nk) and σ(Nk) denote corresponding expecta- 3/4. If |x ⊕ y| >d, then one of the protocols gets the tion and standard deviation, respectively. Then we have correct range of |x ⊕ y| with probability at least 7/8, =  1/2 − E(Nk) αkγ , and σ(Nk) (αkγ) . Thus E(N2d ) and thus P announces |x ⊕ y| >d with probability at E(N ) = γ(α − α ) = 1 γ( − 1 )d ( − ( − 1 )d )  d 2d d 2 1 d 1 1 d least 7/8 too. 1 = −  8 γ .Letγ 20000, then E(N2d ) E(Nd ) 2500, Now it remains to design a protocol of 1 1/2 = ,pub while σ(Nd ), σ (N2d )<(2 γ) 100. The cutoff O(R (HAM16d2,d )) communication to distinguish point in the protocol is the middle of E(Nd ) and |x ⊕ y|  d and d<|x ⊕ y|  2d. First we assume E(N2d ). By Chebyshev Inequility, with probability of that n is divisible by 16d2, otherwise we pad some at most 1/100, |Nd − E(Nd )| > 10σ(Nd ) = 1000. So 0’s to the end of x and y. Using the public ran- does N2d . Thus with probability of at least 49/50, the dom bits, Alice divides x randomly into 16d2 parts number of 1’s in c being more than cutoff point implies evenly, Bob also divides y correspondingly. Let Ai,Bi k>2d and vice versa. Therefore, O(γ ) communication (1  i  16d2) denote corresponding parts of x,y.By is sufficient to discriminate the case HAM(x, y) > 2d Fact 1, with probability at least 7/8, each pair A ,B  i i and HAM(x, y) d with error probability of at most would contain at most one bit on which x and y dif- 1/50. 2 fer. Therefore, the Hamming distance of Ai and Bi would be either 0 or 1, i.e, the Hamming distance The following fact is also useful of Ai and Bi equals the parity of Ai ⊕ Bi , which is further equal to PARITY(A ) ⊕ PARITY(B ).Leta Fact 1. If 2d balls are randomly thrown into 16d2 buck- i i i denote the parity of Ai , bi denote the parity bit of ets, then with probability of at least 7/8, each bucket = = has at most one ball. Bi , and let a a1a2 ...a16d2 , b b1b2 ...b16d2 . Then HAM 2 (a, b) = HAM (x, y) with probability at   16d ,d n,d 2d least 7/8. So we run the best protocol for HAM16d2,d Proof. There are 2 pairs of balls. The probability of one specific pair of balls falling into the same bucket is on the input (a, b), and use the answer to distinguish | ⊕ |  | ⊕ |  2 1 · 1 ·16d2 = 1 . Thus the probability of having x y d and d< x y 2d. 16d2 16d2 16d2 a pair of balls in the same bucket is upper bounded by 1 · 2d < 1/8. Thus Fact 1 holds. 2 16d2 2 4. Discussion Now we are ready to prove Lemma 3.1. We conjecture that our quantum lower bound in Proof of Lemma 3.1. If 16d2  n, the lemma is obvi- Lemma 1.4 is tight. It seems plausible to remove the ously true by appending 0’s to x and y. O(log d) factor in our upper bound. Recently, Aaron- 2 son and Ambainis [1] sharpened the upper bound of the If 16d 2d with error probability at most 1/8. Now and the connection between quantum query and com- the whole protocol P is as follows. Alice sends the con- munication [8]. Methods similar to [1] might help to catenation of mA,1 and mA,2, which are her messages remove the O(log d) factor in this upper bound. W. Huang et al. / Information Processing Letters 99 (2006) 149–153 153

Acknowledgement 30th Annual ACM Symposium on Theory of Computing, 1998, pp. 63–68. We thank Alexei Kitaev for suggesting the relation of [9] J. Feigenbaum, Y. Ishai, T. Malkin, K. Nissim, M.J. Strauss, the HAMMING DISTANCE problem and the DISJOINT- R.N. Wright, Secure multiparty computation of approximations, in: Proceedings of 28th International Colloquium on Automata, NESS problem, Ronald de Wolf for pointing out a mis- Languages and Programming, 2001. take in our earlier draft, Jialin Zhang for suggesting to [10] D. Gavinsky, J. Kempe, R. de Wolf, Quantum communication remove the O(d log d) term in Lemma 3.1 in our earlier cannot simulate a public coin, quant-ph/0411051, 2004. draft and anonymous reviewers for helping us improve [11] B. Kalyanasundaram, G. Schnitger, The probabilistic communi- the presentation of this paper. cation complexity of set intersection, SIAM Journal on Discrete Mathematics 5 (4) (1992) 545–557. References [12] E. Kushilevitz, N. Nisan, Communication Complexity, Cam- bridge University Press, Cambridge, 1997. [1] S. Aaronson, A. Ambainis, Quantum search of spatial regions, [13] I. Newman, Private vs. common random bits in communication in: Proceedings of the 44th Annual IEEE Symposium on Foun- complexity, Information Processing Letters 39 (2) (1991) 67– dations of Computer Science, 2003, pp. 200–209. 71. [2] A. Ambainis, Communication complexity in a 3-computer mo- [14] I. Newman, M. Szegedy, Public vs. private coin flips in one del, Algorithmica 16 (3) (1996) 298–301. round communication games, in: Proceedings of the 28th An- [3] A. Ambainis, W. Gasarch, A. Srinavasan, A. Utis, Lower bounds nual ACM Symposium on Theory of Computing, 1996, pp. 561– on the deterministic and quantum communication complexity of 570. hamming distance, cs.CC/0411076, 2004. [15] A.A. Razborov, Applications of matrix methods to the the- [4] Z. Bar-Yossef, T.S. Jayram, Ravi Kumar, D. Sivakumar, Infor- ory of lower bounds in computational complexity, Combinator- mation statistics approach to data stream and communication ica 10 (1) (1990) 81–93. complexity, Journal of Computer and System Sciences 68 (4) [16] A.A. Razborov, Quantum communication complexity of sym- (2004) 702–732. predicates, Izvestiya Math. 67 (1) (2003) 145–159 (Eng- [5] L. Babai, P.G. Kimmel, Randomized simultaneous messages: so- lish version); also in: quant-ph/0204025. lution of a problem of Yao in communication complexity, in: [17] A.C.-C. Yao, Some complexity questions related to distributive Proceedings of the 12th Annual IEEE Conference on Compu- computing, in: Proceedings of the 11th Annual ACM Sympo- tational Complexity, 1997, pp. 239–246. sium on Theory of Computing, 1979, pp. 209–213. [6] G. Brassard, P. Høyer, A. Tapp, Quantum counting, in: Proceed- ings of 25th International Colloquium on Automata, Languages [18] A.C.-C. Yao, Quantum circuit complexity, in: Proceedings of the and Programming, 1998. 34th Annual IEEE Symposium on Foundations of Computer Sci- [7] H. Buhrman, R. Cleve, J. Watrous, R. de Wolf, Quantum finger- ence, 1993, pp. 352–361. printing, Physical Review Letters 87 (16) (2001). [19] A.C.-C. Yao, On the power of quantum fingerprinting, in: Pro- [8] H. Buhrman, R. Cleve, A. Wigderson, Quantum vs. classi- ceedings of the 35th Annual ACM Symposium on Theory of cal communication and computation, in: Proceedings of the Computing, 2003, pp. 77–81.