Near Neighbor Who Is The Fairest Of Them All | Awesome Learning to Hash Add your paper to Learning2Hash

Near Neighbor Who Is The Fairest Of Them All

Sariel Har-peled, Sepideh Mahabadi. Neural Information Processing Systems 2019

[Paper]    
Independent LSH NEURIPS

In this work we study a “fair” variant of the near neighbor problem. Namely, given a set of n points P and a parameter r, the goal is to preprocess the points, such that given a query point q, any point in the r-neighborhood of the query, i.e., B(q,r), have the same probability of being reported as the near neighbor.

We show that LSH based algorithms can be made fair, without a significant loss in efficiency. Specifically, we show an algorithm that reports a point p in the r-neighborhood of a query q with almost uniform probability. The time to report such a point is proportional to O(\dns(q.r)Q(n,c)), and its space is O(S(n,c)), where Q(n,c) and S(n,c) are the query time and space of an LSH algorithm for c-approximate near neighbor, and \dns(q,r) is a function of the local density around q.

Our approach works more generally for sampling uniformly from a sub-collection of sets of a given collection and can be used in a few other applications. Finally, we run experiments to show performance of our approach on real data.

Similar Work