Johnson-Lindenstrauss Embeddings For Noisy Vectors -- Taking Advantage Of The Noise
Shao Zhen. ArXiv 2022
[Paper]
This paper investigates theoretical properties of subsampling and hashing as
tools for approximate Euclidean norm-preserving embeddings of vectors with
(unknown) additive Gaussian noise. Such embeddings are often called
Johnson-Lindenstrauss embeddings, after the celebrated lemma. Previous work
shows that, as sparse embeddings, the success of subsampling and hashing
depends closely on the ℓ∞-to-ℓ2 norm ratio of the vector to be mapped. This
paper shows that the presence of noise removes this constraint in high
dimensions: sparse embeddings such as subsampling and hashing, with embedding
dimensions comparable to those of dense embeddings, achieve similar approximate
norm-preserving dimensionality-reduction properties. The key insight is that
the noise should be treated as information to be exploited, not simply as
something to be removed. Theoretical bounds are derived for subsampling and
hashing to recover the approximate norm of a high-dimensional vector in the
presence of noise, with numerical illustrations showing that better performance
is achieved in the presence of noise.
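
To make the phenomenon concrete, below is a minimal numerical sketch (not from
the paper) comparing norm estimates from subsampling and from hashing (a
CountSketch-style sparse embedding) on a spiky vector, with and without
additive Gaussian noise. The specific construction (a few-spike vector, the
noise scale, and the embedding dimension m) is an illustrative assumption
rather than the paper's experimental setup; the intuition is that noise spreads
energy across coordinates, lowering the ℓ∞-to-ℓ2 ratio on which the accuracy of
sparse embeddings depends.

```python
import numpy as np

rng = np.random.default_rng(0)


def subsample_embed(x, m, rng):
    """Subsampling embedding: keep m uniformly chosen coordinates,
    rescaled by sqrt(n/m) so the squared norm is unbiased."""
    n = x.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    return np.sqrt(n / m) * x[idx]


def hash_embed(x, m, rng):
    """Hashing (CountSketch-style) embedding: each coordinate lands
    in one of m buckets with a random sign; squared norm is unbiased."""
    n = x.shape[0]
    buckets = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    y = np.zeros(m)
    np.add.at(y, buckets, signs * x)  # accumulate signed coordinates per bucket
    return y


# Illustrative setup (assumed, not the paper's experiments): a vector with a
# few large coordinates, which is adversarial for subsampling in particular,
# plus Gaussian noise whose total energy is comparable to the signal's, which
# flattens the vector in high dimensions.
n, m, trials = 10_000, 200, 500
x = np.zeros(n)
x[:4] = 1.0
noise = rng.normal(0.0, 2.0 / np.sqrt(n), n)  # ||noise|| is roughly 2

for label, v in [("noiseless", x), ("noisy", x + noise)]:
    true_norm = np.linalg.norm(v)
    sub = [np.linalg.norm(subsample_embed(v, m, rng)) for _ in range(trials)]
    hsh = [np.linalg.norm(hash_embed(v, m, rng)) for _ in range(trials)]
    print(f"{label:9s} true={true_norm:.3f} "
          f"subsample={np.mean(sub):.3f}+/-{np.std(sub):.3f} "
          f"hash={np.mean(hsh):.3f}+/-{np.std(hsh):.3f}")
```

On a typical run, the subsampling estimate is wildly variable on the noiseless
spiky vector (it usually misses all the large coordinates) but concentrates far
closer to the true norm once noise is added, matching the qualitative claim
that noise improves the behaviour of sparse embeddings.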
Similar Work