[Paper]
Tensor Core Units (TCUs) are hardware accelerators developed for deep neural networks that efficiently support the multiplication of two dense \(\sqrt{m}\times\sqrt{m}\) matrices, where \(m\) is a given hardware parameter. In this paper, we show that TCUs can speed up similarity search problems as well. We propose algorithms for the Johnson-Lindenstrauss dimensionality reduction and for similarity join that, by leveraging TCUs, achieve a \(\sqrt{m}\) speedup with respect to traditional approaches.
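
To make the claimed speedup concrete, the following minimal sketch expresses a Johnson-Lindenstrauss projection as a matrix product decomposed into \(\sqrt{m}\times\sqrt{m}\) tile products. It is an illustration under stated assumptions, not the algorithm of the paper: `tile_product` is a hypothetical software stand-in for one TCU call, the projection matrix uses random signs (one common JL construction), and the function names and the tile side `s` are chosen here for illustration only.

```python
import math
import random


def tile_product(A, B):
    """Hypothetical stand-in for one TCU call: product of two s x s tiles."""
    s = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(s)) for j in range(s)]
            for i in range(s)]


def pad(M, rows, cols):
    """Zero-pad matrix M to shape rows x cols."""
    padded = [row + [0.0] * (cols - len(row)) for row in M]
    padded += [[0.0] * cols for _ in range(rows - len(M))]
    return padded


def tiled_matmul(A, B, s):
    """Multiply A (n x d) by B (d x k) using only s x s tile products.

    Returns the product and the number of tile products issued.
    """
    n, d, k = len(A), len(B), len(B[0])
    # Round every dimension up to a multiple of the tile side s = sqrt(m).
    N, D, K = (math.ceil(x / s) * s for x in (n, d, k))
    A, B = pad(A, N, D), pad(B, D, K)
    C = [[0.0] * K for _ in range(N)]
    calls = 0
    for i in range(0, N, s):
        for j in range(0, K, s):
            for t in range(0, D, s):
                a_tile = [row[t:t + s] for row in A[i:i + s]]
                b_tile = [row[j:j + s] for row in B[t:t + s]]
                P = tile_product(a_tile, b_tile)
                calls += 1
                # Accumulate the partial product into the output tile.
                for a in range(s):
                    for b in range(s):
                        C[i + a][j + b] += P[a][b]
    return [row[:k] for row in C[:n]], calls


def jl_matrix(d, k, rng):
    """Random-sign projection matrix with entries +-1/sqrt(k) (assumed construction)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.choice((-scale, scale)) for _ in range(k)] for _ in range(d)]


if __name__ == "__main__":
    rng = random.Random(0)
    n, d, k, s = 8, 16, 4, 4  # s = sqrt(m), so m = 16 in this toy setting
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    Y, calls = tiled_matmul(X, jl_matrix(d, k, rng), s)
    print("tile products:", calls, "scalar multiplications:", n * d * k)
```

With these toy sizes the projection needs \(n d k = 512\) scalar multiplications but only \((n/s)(d/s)(k/s) = 8\) tile products. If one tile product is charged \(O(m) = O(s^2)\) time, as is assumed here for the TCU cost model, the total is \(O(ndk/\sqrt{m})\), which matches the \(\sqrt{m}\) factor stated above.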