The Communication Complexity of the Hamming Distance Problem

Information Processing Letters 99 (2006) 149–153 www.elsevier.com/locate/ipl The communication complexity of the Hamming distance problem Wei Huang a,1, Yaoyun Shi a,1, Shengyu Zhang b,2,YufanZhua,∗,1 a Department of Electrical Engineering and Computer Science, University of Michigan, 1301 Beal Avenue, Ann Arbor, MI 48109-2122, USA b Computer Science Department, Princeton University, NJ 08544, USA Received 20 February 2005; received in revised form 28 September 2005; accepted 9 January 2006 Communicated by P.M.B. Vitányi Abstract We investigate the randomized and quantum communication complexity of the HAMMING DISTANCE problem, which is to determine if the Hamming distance between two n-bit strings is no less than a threshold d. We prove a quantum lower bound of (d) qubits in the general interactive model with shared prior entanglement. We also construct a classical protocol of O(d log d) bits in the restricted Simultaneous Message Passing model with public random coins, improving previous protocols of O(d2) bits [A.C.-C. Yao, On the power of quantum fingerprinting, in: Proceedings of the 35th Annual ACM Symposium on Theory of Computing, 2003, pp. 77–81], and O(d log n) bits [D. Gavinsky, J. Kempe, R. de Wolf, Quantum communication cannot simulate a public coin, quant-ph/0411051, 2004]. © 2006 Published by Elsevier B.V. Keywords: Computational complexity; Communication complexity; Hamming distance 1. Introduction We recall some basic concepts below. Let n be an integer and X = Y ={0, 1}n.Letf : X × Y →{0, 1} be a Communication complexity was introduced by Yao Boolean function. Consider the scenario where two par- [17] and has been extensively studied afterward not only ties, Alice and Bob, who know only x ∈ X and y ∈ Y , for its own intriguing problems, but also for its many respectively, communicate interactively with each other applications ranging from circuit lower bounds to data to compute f(x,y).Thedeterministic communication streaming algorithms. We refer the reader to the mono- complexity of f , denoted by D(f ), is defined to be the graph [12] for an excellent survey. minimum integer k such that there is a protocol for computing f using no more than k bits of communication on any pair of inputs. The randomized communication complexity of f , denoted by Rpub(f ), is similarly de- * Corresponding author. fined, with the exception that Alice and Bob can use E-mail addresses: [email protected] (W. Huang), publicly announced random bits and that they are re- [email protected] (Y. Shi), [email protected] quired to compute f(x,y) correctly with probability (S. Zhang), [email protected] (Y. Zhu). 1 Supported in part by NSF grants 0347078 and 0323555. at least 2/3. One of the central themes on the classi- 2 Supported in part by NSF grants CCR-0310466 and CCF- cal communication complexity studies is to understand 0426582. how randomness helps in saving the communication 0020-0190/$ – see front matter © 2006 Published by Elsevier B.V. doi:10.1016/j.ipl.2006.01.014 150 W. Huang et al. / Information Processing Letters 99 (2006) 149–153 cost. A basic finding of Yao [17] is that there are func- exponential separation is generalized by Yao [19], tions f such that R(f ) = O(log D(f )). One example is showing that R,pub(f ) = constant implies Q(f ) = the EQUALITY problem, which simply checks whether O(log n). As an application, Yao considered the HAM- x = y. MING DISTANCE problem defined below. For any Later results show that different ways of using ran- x,y ∈{0, 1}n, the Hamming weight of x, denoted by domness result in quite subtle changes on communica- |x|, is the number of 1’s in x, and the Hamming dis- tion complexity. A basic finding in this regard, due to tance of x and y is |x ⊕ y|, with “⊕” being bit-wise Newman [13], is that public-coin protocols can save at XOR. most O(log n) bits over protocols in which Alice and Bob toss private (and independent) coins. The situation Definition 1.1. For 1 d n,thed-HAMMING DIS- is, however, dramatically different in the Simultaneous TANCE problem is to compute the following Bool- { }n ×{ }n →{ } Message Passing (SMP) model, also introduced by Yao ean function HAMn,d : 0, 1 0, 1 0, 1 , with = | ⊕ | [17], where Alice and Bob each send a message to a HAM(x, y) 1 if and only if x y >d. third person, who then outputs the outcome of the pro- Lemma 1.2. (Yao [19].) R,pub(HAM ) = O(d2). tocol. Apparently, this is a more restricted model and n,d for any function, the communication complexity in this In a recent paper [10], Gavinsky et al. gave another model is at least that in the general interactive commu- classical protocol, which is an improvement over Yao’s nication model. Denote by R(f ) and R,pub(f ) the when d log n. communication complexities in the SMP model with private and public random coins, respectively. It is in- ,pub = Lemma 1.3. (Gavinsky et al. [10].) R (HAMn,d ) teresting to note that R ,pub(EQUALITY) = O(1) but √ O(d log n). R (EQUALITY) = ( n) [2,14,5]. Yao also initiated the study of quantum communica- In this paper, we observe a lower bound for ∗ tion complexity [18], where Alice and Bob are equipped Q (HAMn,d ), which is also a lower bound for ,pub with quantum computational power and exchange quan- R (HAMn,d ) according to Eq. (1). tum bits. Allowing an error probability of no more than Notice that HAM(x, y) = n − HAM(x, y)¯ , where def 1/3 in the interactive model, the resulting communi- y¯ = 11 ···1 ⊕ y. Therefore cation complexity is the quantum communication com- ∗ = ∗ plexity of f , denoted by Q(f ). If the two parties are Q (HAMn,d ) Q (HAMn,n−d ), allowed to share prior quantum entanglement, the quan- and we need only consider the case d n/2. tum analogy of randomness, the communication com- ∗ ∗ plexity is denoted by Q (f ). Similarly, the quantum Proposition 1.4. For any d n/2, Q (HAMn,d ) = communication complexities in the SMP model are de- (d). noted by Q and Q,∗, depending on whether prior entanglement is shared. The following relations among the We then construct a public-coin randomized SMP measures are easy to observe. protocol that almost matches the lower bound and im- proves both of the above protocols. pub ∗ R (f ) ,pub Q (f ) ,∗ R (f ). (1) ,pub Q (f ) Theorem 1.5. R (HAMn,d ) = O(d log d). Two very interesting problems in both communi- We shall prove the above two results in the follow- cation models are the power of quantumness, i.e., de- ing sections. Finally we discuss open problems and a termining the biggest gap between quantum and ran- plausible approach for closing the gap. domized communication complexities, and the power of shared entanglement, i.e., determining the biggest Other related work. Ambainis et al. [3] considered gap between quantum communication complexities the error-free communication complexity, and proved with and without shared entanglement. An impor- that any error-free quantum protocol for the Hamming tant result for the first problem by Buhrman et al. Distance problem requires at least n − 2 qubits of com- [7] is Q (EQUALITY) = O(log n), an exponential sav- munication in the interactive model, for any d n − 1, ing compared to the randomized counterpart result Feigenbaum et al. [9] started the secure multiparty ap- √ R (EQUALITY) = ( n) mentioned above. This proximate computation of the Hamming distance. W. Huang et al. / Information Processing Letters 99 (2006) 149–153 151 2. Lower bound of the quantum communication d 3n /8 and d = (d). Employing the result of the ∗ complexity of the Hamming distance problem case that d 3n/8, we have Q (HAMn ,k ,d ) = (d ). ∗ ∗ Thus Q (HAMn,d ) Q (HAMn ,k ,d ) = (d ) = 2 For proving the lower bound, we restrict HAMn,d (d). on those pairs of inputs with equal Hamming distance. More specifically, for an integer k,1 k n, define 3. Upper bound of the classical communication def n complexity of the Hamming distance problem Xk = Yk ={x: x ∈{0, 1} , |x|=k}. Let HAMn,k,d : Xk × Yk →{0, 1} be the restriction of HAMn,d on To prove Theorem 1.5, we reduce the HAMn,d prob- Xk × Yk. Before proving Proposition 1.4, we briefly intro- lem to HAM16d2,d problem by the following lemma. duce some related results. Let x,y ∈{0, 1}n.The Lemma 3.1. DISJOINTNESS problem is to compute the following n n Boolean function DISJn : {0, 1} ×{0, 1} →{0, 1}, ,pub = ,pub R (HAMn,d ) O R (HAM16d2,d ) . DISJn(x, y) = 1 if and only if there exists an integer i,1 i n, so that xi = yi = 1. It is known that ∗ √ Note that Theorem 1.5 immediately follows from R(DISJ ) = (n) [11,15], and Q (DISJ ) = ( n) ,pub n n Lemma 3.1 because by Lemma 1.3, R (HAMn,d ) = [16,1,4]. ,pub = 2 = O(d log n), thus R (HAM16d2,d ) O(d log d ) We shall use an important lemma in Razborov [16], O(d log d). Now by Lemma 3.1, we have which is more general than his remarkable lower bound ,pub = on quantum communication complexity of DISJOINT- R (HAMn,d ) O(d log d). NESS. Here we may abuse the notation by viewing So in what follows, we shall prove Lemma 3.1.

The Communication Complexity of the Hamming Distance Problem

The One-Way Communication Complexity of Hamming Distance

Fault Tolerant Computing CS 530 Information Redundancy: Coding Theory

Sequence Distance Embeddings

Dictionary Look-Up Within Small Edit Distance

Searching All Seeds of Strings with Hamming Distance Using Finite Automata

B-Bit Sketch Trie: Scalable Similarity Search on Integer Sketches

Large Scale Hamming Distance Query Processing

Parity Check Code Repetition Code 10010011 11100001

Bit-Parallel Approximate String Matching Under Hamming Distance

Calculating Hamming Distance with the IBM Q Experience José Manuel Bravo [email protected] Prof

Low Distortion Embedding from Edit to Hamming Distance Using Coupling

Chapter 10 Error Detection and Correction