On The Complexity Of Inner Product Similarity Join | Awesome Learning to Hash Add your paper to Learning2Hash

On The Complexity Of Inner Product Similarity Join

Ahle Thomas D., Pagh Rasmus, Razenshteyn Ilya, Silvestri Francesco. Arxiv 2015

[Paper]    
ARXIV LSH Supervised

A number of tasks in classification, information retrieval, recommendation systems, and record linkage reduce to the core problem of inner product similarity join (IPS join): identifying pairs of vectors in a collection that have a sufficiently large inner product. IPS join is well understood when vectors are normalized and some approximation of inner products is allowed. However, the general case where vectors may have any length appears much more challenging. Recently, new upper bounds based on asymmetric locality-sensitive hashing (ALSH) and asymmetric embeddings have emerged, but little has been known on the lower bound side. In this paper we initiate a systematic study of inner product similarity join, showing new lower and upper bounds. Our main results are:

Similar Work