Efficient Approximate Search For Sets Of Vectors
Leybovich Michael, Shmueli Oded. Arxiv 2021
[Paper]
ARXIV
Graph
Independent
LSH
Quantisation
We consider a similarity measure between two sets and of vectors,
that balances the average and maximum cosine distance between pairs of vectors,
one from set and one from set . As a motivation for this measure, we
present lineage tracking in a database. To practically realize this measure, we
need an approximate search algorithm that given a set of vectors and sets
of vectors , the algorithm quickly locates the set that
maximizes the similarity measure. For the case where all sets are singleton
sets, essentially each is a single vector, there are known efficient
approximate search algorithms, e.g., approximated versions of tree search
algorithms, locality-sensitive hashing (LSH), vector quantization (VQ) and
proximity graph algorithms. In this work, we present approximate search
algorithms for the general case. The underlying idea in these algorithms is
encoding a set of vectors via a “long” single vector. The proposed approximate
approach achieves significant performance gains over an optimized, exact search
on vector sets.
Similar Work