Fully Understanding The Hashing Trick
Casper Benjamin Freksen, Lior Kamma, Kasper Green Larsen. arXiv 2018
Feature hashing, also known as {\em the hashing trick}, introduced by
Weinberger et al. (2009), is one of the key techniques used in scaling-up
machine learning algorithms. Loosely speaking, feature hashing uses a random
sparse projection matrix \(A \in \mathbb{R}^{m \times n}\) (where \(m \ll n\))
in order to reduce the dimension of the data from \(n\) to \(m\) while
approximately preserving the Euclidean norm. Every column of \(A\) contains
exactly one non-zero entry, equal to either \(-1\) or \(1\).
Weinberger et al. showed tail bounds on \(\|Ax\|_2^2\). Specifically, they
showed that for every \(\epsilon, \delta\), if \(\|x\|_{\infty} / \|x\|_2\) is
sufficiently small and \(m\) is sufficiently large, then
\(\Pr\left[\,\left|\|Ax\|_2^2 - \|x\|_2^2\right| > \epsilon \|x\|_2^2\,\right] < \delta\).
The paper gives tight asymptotic bounds on the exact trade-off between the
central parameters \(\|x\|_{\infty} / \|x\|_2\), \(m\), \(\epsilon\), and
\(\delta\), and additionally provides empirical results showing that the true
trade-off is close to the asymptotic bounds, thus further
illustrating the tightness of the presented bounds in practice.
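The sparse projection described in the abstract admits a short sketch. The following is a minimal NumPy illustration (not the authors' code): the arrays `h` and `s` stand in for the random bucket and sign hash functions, so each coordinate \(x_i\) is added to bucket \(h(i)\) with sign \(s(i)\). A "spread out" vector (small \(\|x\|_{\infty} / \|x\|_2\)) has its squared norm preserved well, as the tail bound predicts.

```python
import numpy as np

def feature_hash(x, m, seed=0):
    """Hashing trick: send coordinate i to a random bucket h(i) in [m]
    with a random sign s(i) in {-1, +1}. Equivalent to y = A x where
    every column of A has exactly one non-zero entry, +1 or -1."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    h = rng.integers(0, m, size=n)        # bucket for each coordinate
    s = rng.choice([-1.0, 1.0], size=n)   # sign for each coordinate
    y = np.zeros(m)
    np.add.at(y, h, s * x)                # y[h[i]] += s[i] * x[i]
    return y

# A flat vector: ||x||_2 = 1 and ||x||_inf / ||x||_2 = 1/sqrt(n),
# i.e. well within the regime where the norm is preserved.
n, m = 10_000, 500
x = np.ones(n) / np.sqrt(n)
y = feature_hash(x, m)
print("||x||^2 =", np.linalg.norm(x) ** 2)
print("||Ax||^2 =", np.linalg.norm(y) ** 2)
```

Note that the map is linear: hashing \(2x\) with the same seed yields exactly twice the hash of \(x\), which is why a single random matrix \(A\) can be shared across all data points.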
Similar Work