Document Type

Article

Publication Date

10-25-2012

Publication Title

SIAM Journal on Computing

Department

Department of Computer Science

Abstract

We prove an optimal Ω(n) lower bound on the randomized communication complexity of the much-studied gap-hamming-distance problem. As a consequence, we obtain essentially optimal multipass space lower bounds in the data stream model for a number of fundamental problems, including the estimation of frequency moments. The gap-hamming-distance problem is a communication problem, wherein Alice and Bob receive n-bit strings x and y, respectively. They are promised that the Hamming distance between x and y is either at least n/2 + √n or at most n/2 − √n, and their goal is to decide which of these is the case. Since the formal presentation of the problem by Indyk and Woodruff [Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, 2003, pp. 283–289], it had been conjectured that the naïve protocol, which uses n bits of communication, is asymptotically optimal. The conjecture was shown to be true in several special cases, e.g., when the communication is deterministic or when the number of rounds of communication is limited. The proof of our aforementioned result, which settles this conjecture fully, is based on a new geometric statement regarding correlations in Gaussian space, related to a result of Borell [Z. Wahrsch. Verw. Gebiete, 70 (1985), pp. 1–13]. To prove this geometric statement, we show that random projections of not-too-small sets in Gaussian space are close to a mixture of translated normal variables.
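
For illustration only, the promise and the naïve n-bit protocol described in the abstract can be sketched in Python. This is not code from the article: the function names (hamming_distance, naive_ghd_protocol), the "far"/"close" labels, and the choice to return None when the promise is violated are all assumptions made for this sketch.

```python
import math

def hamming_distance(x, y):
    """Number of positions at which two equal-length bit sequences differ."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return sum(a != b for a, b in zip(x, y))

def naive_ghd_protocol(x, y):
    """Hypothetical sketch of the naive protocol: Alice sends all n bits of x,
    and Bob computes the Hamming distance locally.

    Returns (answer, bits_communicated), where answer is
    "far"   if the distance is at least n/2 + sqrt(n),
    "close" if the distance is at most  n/2 - sqrt(n),
    None    if the promise is violated (a choice made for this sketch).
    """
    n = len(x)
    message = list(x)              # Alice's message: her entire input, n bits
    d = hamming_distance(message, y)
    gap = math.sqrt(n)
    if d >= n / 2 + gap:
        answer = "far"
    elif d <= n / 2 - gap:
        answer = "close"
    else:
        answer = None              # input falls inside the gap
    return answer, len(message)
```

The sketch communicates exactly n bits regardless of the input; the article's Ω(n) lower bound says that no randomized protocol can do asymptotically better.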

DOI

10.1137/120861072
