Graph-based Time-space Trade-offs For Approximate Near Neighbors

Thijs Laarhoven. 2017

We take a first step towards a rigorous asymptotic analysis of graph-based approaches for finding (approximate) nearest neighbors in high-dimensional spaces, by analyzing the complexity of (randomized) greedy walks on the approximate near neighbor graph. For random data sets of size $n = 2^{o(d)}$ on the $d$-dimensional Euclidean unit sphere, using near neighbor graphs we can provably solve the approximate nearest neighbor problem with approximation factor $c > 1$ in query time $n^{\rho_q + o(1)}$ and space $n^{1 + \rho_s + o(1)}$, for arbitrary $\rho_q, \rho_s \geq 0$ satisfying $$(2c^2 - 1)\,\rho_q + 2c^2 (c^2 - 1) \sqrt{\rho_s (1 - \rho_s)} \geq c^4.$$ Graph-based near neighbor searching is especially competitive with hash-based methods for small $c$ and near-linear memory, and in this regime the asymptotic scaling of a greedy graph-based search matches the recent optimal hash-based trade-offs of Andoni-Laarhoven-Razenshteyn-Waingarten [SODA’17]. We further study how the trade-offs scale when the data set is of size $n = 2^{\Theta(d)}$, and analyze asymptotic complexities when applying these results to lattice sieving.
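To make the trade-off concrete, the inequality can be solved for the smallest admissible query exponent at a given space exponent. The following is a minimal sketch rather than code from the paper: the function name `min_query_exponent` is hypothetical, and it assumes $0 \leq \rho_s \leq 1$ so that the square root is real (the abstract only states $\rho_s \geq 0$).

```python
import math


def min_query_exponent(c: float, rho_s: float) -> float:
    """Smallest rho_q satisfying
    (2c^2 - 1) rho_q + 2c^2 (c^2 - 1) sqrt(rho_s (1 - rho_s)) >= c^4.

    Hypothetical helper; assumes c > 1 and 0 <= rho_s <= 1.
    """
    if c <= 1:
        raise ValueError("approximation factor c must exceed 1")
    if not 0.0 <= rho_s <= 1.0:
        raise ValueError("rho_s must lie in [0, 1] for the square root to be real")
    # Rearranged inequality: isolate rho_q on the left-hand side.
    space_term = 2 * c**2 * (c**2 - 1) * math.sqrt(rho_s * (1 - rho_s))
    return (c**4 - space_term) / (2 * c**2 - 1)


if __name__ == "__main__":
    for rho_s in (0.0, 0.1, 0.25, 0.5):
        print(f"c=1.5, rho_s={rho_s:.2f} -> rho_q >= {min_query_exponent(1.5, rho_s):.4f}")
```

Because $\sqrt{\rho_s (1 - \rho_s)}$ peaks at $\rho_s = 1/2$, the bound on $\rho_q$ is smallest there, where it simplifies to $c^2 / (2c^2 - 1)$.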
