A Framework For Similarity Search With Space-time Tradeoffs Using Locality-sensitive Filtering
Christiani Tobias. Arxiv 2016
[Paper]
ARXIV
Independent
LSH
We present a framework for similarity search based on Locality-Sensitive
Filtering (LSF), generalizing the Indyk-Motwani (STOC 1998) Locality-Sensitive
Hashing (LSH) framework to support space-time tradeoffs. Given a family of
filters, defined as a distribution over pairs of subsets of space with certain
locality-sensitivity properties, we can solve the approximate near neighbor
problem in \(d\)-dimensional space for an \(n\)-point data set with query time
\(dn^{\rho_q+o(1)}\), update time \(dn^{\rho_u+o(1)}\), and space usage \(dn + n^{1
- \rho_u + o(1)}\). The space-time tradeoff is tied to the tradeoff between
query time and update time, controlled by the exponents \(\rho_q, \rho_u\) that
are determined by the filter family. Locality-sensitive filtering was
introduced by Becker et al. (SODA 2016) together with a framework yielding a
single, balanced, tradeoff between query time and space, further relying on the
assumption of an efficient oracle for the filter evaluation algorithm. We
extend the LSF framework to support space-time tradeoffs and through a
combination of existing techniques we remove the oracle assumption.
Building on a filter family for the unit sphere by Laarhoven (arXiv 2015) we
use a kernel embedding technique by Rahimi & Recht (NIPS 2007) to show a
solution to the \((r,cr)\)-near neighbor problem in \(\ell_s^d\)-space for \(0 < s
\leq 2\) with query and update exponents
\(\rho_q=\frac{c^s(1+\lambda)^2}{(c^s+\lambda)^2}\) and
\(\rho_u=\frac{c^s(1-\lambda)^2}{(c^s+\lambda)^2}\) where \(\lambda\in[-1,1]\) is a
tradeoff parameter. This result improves upon the space-time tradeoff of
Kapralov (PODS 2015) and is shown to be optimal in the case of a balanced
tradeoff. Finally, we show a lower bound for the space-time tradeoff on the
unit sphere that matches Laarhoven’s and our own upper bound in the case of
random data.
Similar Work