Optimal Data-dependent Hashing For Approximate Near Neighbors
Andoni Alexandr, Razenshteyn Ilya. Arxiv 2015
[Paper]
ARXIV
Independent
LSH
We show an optimal data-dependent hashing scheme for the approximate near
neighbor problem. For an -point data set in a -dimensional space our data
structure achieves query time and space \(O(n^{1+\rho+o(1)}
- dn)\), where for the Euclidean space and
approximation . For the Hamming space, we obtain an exponent of
.
Our result completes the direction set forth in [AINR14] who gave a
proof-of-concept that data-dependent hashing can outperform classical Locality
Sensitive Hashing (LSH). In contrast to [AINR14], the new bound is not only
optimal, but in fact improves over the best (optimal) LSH data structures
[IM98,AI06] for all approximation factors .
From the technical perspective, we proceed by decomposing an arbitrary
dataset into several subsets that are, in a certain sense, pseudo-random.
Similar Work