Faster Binary Embeddings For Preserving Euclidean Distances
Zhang Jinjie, Saab Rayan. Arxiv 2020
[Paper]
ARXIV
Independent
Quantisation
We propose a fast, distance-preserving, binary embedding algorithm to
transform a high-dimensional dataset into
binary sequences in the cube . When consists of
well-spread (i.e., non-sparse) vectors, our embedding method applies a stable
noise-shaping quantization scheme to where
is a sparse Gaussian random matrix. This contrasts with most binary embedding
methods, which usually use for the embedding.
Moreover, we show that Euclidean distances among the elements of
are approximated by the norm on the images of under a
fast linear transformation. This again contrasts with standard methods, where
the Hamming distance is used instead. Our method is both fast and memory
efficient, with time complexity and space complexity . Further, we
prove that the method is accurate and its associated error is comparable to
that of a continuous valued Johnson-Lindenstrauss embedding plus a quantization
error that admits a polynomial decay as the embedding dimension increases.
Thus the length of the binary codes required to achieve a desired accuracy is
quite small, and we show it can even be compressed further without compromising
the accuracy. To illustrate our results, we test the proposed method on natural
images and show that it achieves strong performance.
Similar Work