Locality-sensitive Hashing Scheme Based On Longest Circular Co-substring
Lei Yifan, Huang Qiang, Kankanhalli Mohan, Tung Anthony K. H.. Arxiv 2020
[Paper]
ARXIV
Independent
LSH
Locality-Sensitive Hashing (LSH) is one of the most popular methods for
-Approximate Nearest Neighbor Search (-ANNS) in high-dimensional spaces.
In this paper, we propose a novel LSH scheme based on the Longest Circular
Co-Substring (LCCS) search framework (LCCS-LSH) with a theoretical guarantee.
We introduce a novel concept of LCCS and a new data structure named Circular
Shift Array (CSA) for -LCCS search. The insight of LCCS search framework is
that close data objects will have a longer LCCS than the far-apart ones with
high probability. LCCS-LSH is LSH-family-independent, and it supports
-ANNS with different kinds of distance metrics. We also introduce a
multi-probe version of LCCS-LSH and conduct extensive experiments over five
real-life datasets. The experimental results demonstrate that LCCS-LSH
outperforms state-of-the-art LSH schemes.
Similar Work